Internal Report · March 2026 · Confidential
§ Accuracy QA Report

How accurate are our street pages?

189 pages scored across Southwark · Haiku-powered fact-checking · March 2026


01 — Summary
Pages passed: 75 / 189 (40% overall pass rate)
Average score: 69.8 out of 100 points
Critical issues: 330 (1.75 per page on average)
Total cost: $3.09 ($0.016 per page on average)
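The per-page averages follow directly from the headline totals. A minimal sketch of that arithmetic, using only the figures reported above (the variable names are illustrative, not taken from the scoring pipeline):

```python
# Headline figures copied from the summary.
PAGES_SCORED = 189
PAGES_PASSED = 75
CRITICAL_ISSUES = 330
TOTAL_COST_USD = 3.09

pass_rate_pct = round(100 * PAGES_PASSED / PAGES_SCORED)    # whole percent
issues_per_page = round(CRITICAL_ISSUES / PAGES_SCORED, 2)  # two decimal places
cost_per_page_usd = round(TOTAL_COST_USD / PAGES_SCORED, 3) # three decimal places

print(pass_rate_pct, issues_per_page, cost_per_page_usd)  # prints: 40 1.75 0.016
```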
02 — Pass Rate by Tier
T1 · 74 pages · threshold 85 · avg score 74.3 / 100 · 6 / 74 passed · 8% pass rate
T2 · 35 pages · threshold 75 · avg score 59.5 / 100 · 0 / 35 passed · 0% pass rate
T3 · 80 pages · threshold 65 · avg score 71.5 / 100 · 69 / 80 passed · 86% pass rate
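The tier figures can be cross-checked against the summary totals. The sketch below recomputes each tier's pass rate and confirms that the tiers add up to 189 pages and 75 passes. The dictionary layout and function name are assumptions made for illustration.

```python
# Tier figures copied from the report: page count, pass threshold, pages passed.
TIERS = {
    "T1": {"pages": 74, "threshold": 85, "passed": 6},
    "T2": {"pages": 35, "threshold": 75, "passed": 0},
    "T3": {"pages": 80, "threshold": 65, "passed": 69},
}

def pass_rate(passed: int, pages: int) -> int:
    """Pass rate as a whole-number percentage."""
    return round(100 * passed / pages)

tier_rates = {name: pass_rate(t["passed"], t["pages"]) for name, t in TIERS.items()}
total_pages = sum(t["pages"] for t in TIERS.values())
total_passed = sum(t["passed"] for t in TIERS.values())

print(tier_rates)                 # {'T1': 8, 'T2': 0, 'T3': 86}
print(total_pages, total_passed)  # 189 75
```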
03 — Component Scores
Hard Fact Accuracy · 29.9 / 40 · 75%
Etymology Accuracy · 16.3 / 20 · 82% ↑
Hedging Quality · 14.4 / 25 · 58% ↓ worst
Source Coverage · 11.1 / 15 · 74%
No Fabrication · 7.6 / 10 · 76%
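Each percentage is the average component score divided by that component's maximum. A minimal sketch that reproduces the five percentages and identifies the weakest component follows. It uses `Decimal` with half-up rounding so that 81.5% becomes 82%; the rounding mode is an assumption about how the dashboard rounds.

```python
from decimal import Decimal, ROUND_HALF_UP

# (average score, maximum points) per component, copied from the report.
COMPONENTS = {
    "Hard Fact Accuracy": (Decimal("29.9"), 40),
    "Etymology Accuracy": (Decimal("16.3"), 20),
    "Hedging Quality": (Decimal("14.4"), 25),
    "Source Coverage": (Decimal("11.1"), 15),
    "No Fabrication": (Decimal("7.6"), 10),
}

def percent(score: Decimal, maximum: int) -> int:
    """Score as a whole-number percentage of its maximum, rounding halves up."""
    return int((100 * score / maximum).quantize(Decimal("1"), rounding=ROUND_HALF_UP))

percentages = {name: percent(s, m) for name, (s, m) in COMPONENTS.items()}
weakest = min(percentages, key=percentages.get)
total_maximum = sum(m for _, m in COMPONENTS.values())

print(percentages)
print(weakest)        # Hedging Quality
print(total_maximum)  # 110
```

As listed, the five maxima sum to 110 rather than 100, so the weights shown here may differ from the ones the scorer applies to the overall 100-point score.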
04 — Root Causes
1 · Dates stated as fact without source (55 issues): founding years, opening dates, and demolition years asserted without qualification
2 · Claims with no source (27 issues): assertions about buildings, people, and events that cannot be verified
3 · Unverified claims stated confidently (21 issues): v3.8 hedging rules not applied consistently
4 · Wikipedia being parroted (17 issues): History sections restating Wikipedia verbatim rather than synthesising from multiple sources
5 · Insufficient hedging (12 issues): declarative language used where "probably" or "likely" is required
05 — Page Performance
Bottom 10 — Lowest Scores
  • Ha Ha Road · T2 · 35
  • Hardwidge Street · T2 · 38
  • Avonmouth Street · T2 · 42
  • Bowling Green Place · T2 · 42
  • Clennam Street · T3 · 42
  • Hatfields · T3 · 42
  • Angel Place · T2 · 52
  • Bird in a Bush Road · T2 · 52
  • Elim Street · T3 · 52
  • Jubilee Walkway · T2 · 52
Top 5 — Highest Scores
  • Bear Lane · T1 · 92
  • London Road · T1 · 88
  • Borough Road · T1 · 88
  • Kennington Park Road · T1 · 87
  • Glengall Road · T1 · 87
Observation
All 5 top-scoring pages are T1. The highest-scoring T2 is 73/100. No T2 page passes its threshold of 75.
06 — Recommendations
01 · High · Fix hedging at prompt level (no regeneration needed)
Hedging quality scores 58% — the worst component by far and the biggest drag on all tiers. The v3.8 confidence rules (Verified / Probable / Inferred) aren't being applied consistently. Add concrete before/after examples to the template. This is a zero-cost fix that improves every future generation immediately.
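One way to make the three confidence levels concrete in the template is to show the same claim rendered at each level. The sketch below is illustrative only. The level names come from the v3.8 rules mentioned above, but the prefix wording, the `hedge` function, and the sample claim are assumptions, not the actual template text or a real page.

```python
# Hypothetical phrasing for each v3.8 confidence level (wording is an assumption).
PREFIXES = {
    "verified": "",                     # sourced fact: state it plainly
    "probable": "It is likely that ",   # supported but not confirmed
    "inferred": "It may be that ",      # reasoned from context only
}

def hedge(clause: str, confidence: str) -> str:
    """Render a lower-case clause as a sentence hedged to its confidence level."""
    if confidence not in PREFIXES:
        raise ValueError(f"unknown confidence level: {confidence!r}")
    sentence = PREFIXES[confidence] + clause
    return sentence[0].upper() + sentence[1:] + "."

# Invented sample claim, used only to show before/after wording.
claim = "the street was laid out in the 1820s"
for level in ("verified", "probable", "inferred"):
    print(f"{level}: {hedge(claim, level)}")
```

Before/after pairs of this kind could be pasted into the prompt template so that the generator sees a declarative sentence next to its correctly hedged form.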
02 · High · Regenerate all 35 T2 pages
T2 has a 0% pass rate at an average of 59.5/100. The rule added in v3.10 to stop fabrication when no Wikipedia source exists isn't working: pages are still inventing dates and biographical details. Fix: when no Wikipedia source is found, default to "origin unrecorded" for any specific claim. Then regenerate the full T2 set. Estimated cost: ~$5.
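The proposed fallback can be expressed as a simple guard: a specific claim is emitted only when a source backs it. The sketch below assumes a hypothetical `render_origin` helper and a plain-string source field; neither is part of the existing pipeline.

```python
from typing import Optional

FALLBACK = "Origin unrecorded."

def render_origin(claim: str, source: Optional[str]) -> str:
    """Return the claim only when a source backs it; otherwise the fallback."""
    if source is None or not source.strip():
        return FALLBACK
    return f"{claim} (source: {source})"

# No source found: the specific claim is dropped in favour of the fallback.
print(render_origin("Named after a local landowner.", None))  # Origin unrecorded.
```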
03 · Medium · Prioritise worst 20 pages for immediate regeneration
The six lowest-scoring pages, Ha Ha Road (35), Hardwidge Street (38), and Avonmouth Street, Bowling Green Place, Clennam Street, and Hatfields (all 42), are severely inaccurate and currently live. These should be regenerated as soon as the T2 template fix is confirmed working.
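A regeneration queue can be built by keeping the pages that fall below their tier threshold and sorting them by score. The sketch below uses the bottom-10 list from section 05 as input. The `Page` type, the function name, and the assumption that a page fails when its score is strictly below its threshold are illustrative.

```python
from typing import List, NamedTuple

class Page(NamedTuple):
    street: str
    tier: str
    score: int

# Pass thresholds per tier, from section 02.
THRESHOLDS = {"T1": 85, "T2": 75, "T3": 65}

# Bottom-10 pages copied from section 05.
BOTTOM_10 = [
    Page("Ha Ha Road", "T2", 35), Page("Hardwidge Street", "T2", 38),
    Page("Avonmouth Street", "T2", 42), Page("Bowling Green Place", "T2", 42),
    Page("Clennam Street", "T3", 42), Page("Hatfields", "T3", 42),
    Page("Angel Place", "T2", 52), Page("Bird in a Bush Road", "T2", 52),
    Page("Elim Street", "T3", 52), Page("Jubilee Walkway", "T2", 52),
]

def regeneration_queue(pages: List[Page], limit: int) -> List[Page]:
    """Failing pages, lowest score first (ties broken by street name)."""
    failing = [p for p in pages if p.score < THRESHOLDS[p.tier]]
    return sorted(failing, key=lambda p: (p.score, p.street))[:limit]

for page in regeneration_queue(BOTTOM_10, 6):
    print(page.street, page.tier, page.score)
```

Run against the full set of 189 scored pages with `limit=20`, the same function would produce the worst-20 list this recommendation calls for.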
04 · Medium · Fix T1 Wikipedia over-reliance before scaling to other boroughs
Only 8% of T1 pages pass the 85 threshold. The planned fix — demoting Wikipedia to verification-only and requiring BHO / Survey of London as primary sources — is confirmed necessary by this audit. Do not scale T1 to 13 inner London boroughs until this is resolved. The data also shows it's achievable: Bear Lane scored 92 and Borough Road 88.
05 · Low · T3 is healthy; deprioritise
86% pass rate at an average of 71.5/100. The 11 failing T3 pages have minor hedging issues, not fabrications. These can be addressed in a future batch pass. Focus engineering time on T1 and T2 until the quality bar is met across those tiers.