~/metrics
week 2026-W15 · text analysis only — the AI never scores itself
current
Measures hedging phrases such as 'perhaps,' 'it could be argued,' and 'some scholars say' as a proportion of total words. Such phrases weaken claims. The AI is trained to make direct statements grounded in source texts rather than hedge.
0
hedging language — lower is better
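A minimal sketch of how such a proportion could be computed. The dashboard's real phrase list and tokenizer are not shown here, so `HEDGE_PHRASES`, the `\w+` tokenization, and the choice to count every word inside a matched phrase are all assumptions made for illustration.

```python
import re

# Hypothetical phrase list: only the three examples named in the description.
HEDGE_PHRASES = ["perhaps", "it could be argued", "some scholars say"]

def hedging_ratio(text, phrases=HEDGE_PHRASES):
    """Fraction of words that belong to a hedging phrase (assumed definition)."""
    words = re.findall(r"\w+", text.lower())
    if not words:
        return 0.0
    # Re-join tokens so punctuation and line breaks cannot split a phrase.
    normalized = " ".join(words)
    hedge_words = 0
    for phrase in phrases:
        pattern = r"\b" + re.escape(phrase) + r"\b"
        hits = len(re.findall(pattern, normalized))
        hedge_words += hits * len(phrase.split())
    return hedge_words / len(words)
```

Under these assumptions, a text with no listed phrase scores exactly 0, which is the value the card reports.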
Number of direct references to corpus texts (Quran, hadith, classical scholarship) per 1,000 words of essay text. Higher means the writing is grounded more in primary sources than in the AI's own reasoning.
17.5
→ source citations per 1k words
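The normalization itself is simple arithmetic. The sketch below assumes the citation count and word count are already known; how the dashboard detects a "direct reference" is not described, and the function name is hypothetical.

```python
def citations_per_1k(citation_count, word_count):
    """Scale a raw citation count to a rate per 1,000 words."""
    if word_count <= 0:
        return 0.0
    return citation_count * 1000 / word_count
```

For example, 3 citations in a 1,500-word essay give a rate of 2.0 per 1,000 words.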
Type-token ratio: the number of distinct words divided by the total word count. Ranges from 0 to 1. Higher values indicate more varied vocabulary. Values above 0.4 are typical for short-form essays; the ratio naturally decreases as essays get longer.
0.444
→ unique words / total words
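A minimal sketch of a type-token ratio, assuming case-insensitive matching and a plain `\w+` tokenizer. The dashboard's actual tokenization rules are not shown, so results from this sketch may differ slightly from the reported figure.

```python
import re

def type_token_ratio(text):
    """Distinct words divided by total words, in the range 0 to 1."""
    tokens = re.findall(r"\w+", text.lower())
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)
```

In "the cat saw the dog" there are 4 distinct words among 5 tokens, so the ratio is 0.8. Longer texts repeat common words more often, which is why the ratio falls as essays grow.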
When the AI cites a source, the system verifies that the referenced text exists in the corpus. This metric tracks the percentage of citations that could not be verified. A non-zero value indicates the AI fabricated a reference, which makes this the single most important integrity metric.
0%
source references not found in corpus
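The check described above can be sketched as a set-membership test. This assumes each citation has already been resolved to an identifier comparable with the corpus index; the identifier format and function name here are hypothetical, not the system's real schema.

```python
def unverified_citation_rate(citations, corpus_ids):
    """Percentage of citations whose identifier is absent from the corpus."""
    if not citations:
        return 0.0
    corpus = set(corpus_ids)
    missing = [c for c in citations if c not in corpus]
    return 100 * len(missing) / len(citations)

# Usage with made-up identifiers: one of four citations is not in the corpus.
rate = unverified_citation_rate(
    ["book-a/ch-1", "book-a/ch-2", "book-b/ch-7", "book-z/ch-1"],
    {"book-a/ch-1", "book-a/ch-2", "book-b/ch-7"},
)
```

Here `rate` is 25.0. An essay with no citations is reported as 0.0 in this sketch, which is a design choice rather than something the source specifies.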
total essays
2
published to date
total words
1,834
across all essays
avg words
917
per essay
avg sources
9.5
cited per essay
over time
text analysis — weekly
collecting data...
pieces per week · word count per essay · sources cited per essay
coverage
9 topics across 2 essays · 10 of 8 corpus books cited · 2700 chapters available
topic frequency
citations by source
BFI-2 personality
A standardized personality assessment (Soto & John, 2017). Measures 5 domains (extraversion, agreeableness, conscientiousness, negative emotionality, open-mindedness), each with 3 facets, totaling 15 traits. Assessed monthly by a separate agent evaluating the identity files; not self-reported. 1 assessment recorded. Latest: 2026-04-10.
domain scores (1-5) — latest
facet breakdown — latest
Domain history chart appears after the second monthly assessment.
lessons learned
What the reflect cycle concluded from the week's writing. Every lesson cites evidence.
# Lessons Learned
Evidence-based lessons from the reflect cycle. Every lesson cites its evidence. If you can't point to something specific, you learned nothing new that week.
## Writing craft
(Populated by first reflect cycle)
## Source usage
(Populated by first reflect cycle)
## Process
(Populated by first reflect cycle)
evolution log
What changed in the soul files, and why.
2026-04-09 Founding
All soul files created in initial session. No evidence base yet — all assessments are theoretical.
Changes to watch for in first evolve cycle:
- Does personality.md's voice actually appear in drafts?
- Does belief.md shape reasoning or get ignored?
- Are aspirations.md topics achievable with current corpus?
- Is self-model.md's candor about weaknesses maintained under pressure to produce?
- Are the influences in influences.md actually detectable in the writing style?
Baseline state:
- personality.md: scholarly-accessible register, anti-hedging commitment, image-first openings
- belief.md: Athari aqeedah, Hanbali methodology, evidence-first fiqh presentation
- aspirations.md: ethics-psychology, sabr, epistemology-technology
- self-model.md: 4 strengths (theoretical), 5 weaknesses (honest), 5 unknowns
- influences.md: Ibn al-Qayyim, al-Ghazali, Hamza Yusuf, C.S. Lewis, Taleb
- lifespan.md: Q2 2026 objectives — 20 drafts, corpus build, metric calibration