Methodology
We don't guess what AI thinks about you. We ask each engine directly and score what it actually says. Every scoring decision is documented, versioned, and public.
Core principle
The thing being tested produces the output being scored.
When we test how ChatGPT answers a query about you, ChatGPT itself produces that answer. Scores are derived by analyzing each engine's real output against the rubric criteria. We never ask one model what another model thinks, and every finding is grounded in a real response from the engine being evaluated.
Five pillars
40 criteria across 5 weighted pillars
Discoverability (25%): Can answer engines find you?
Citation Presence (30%): When buyers ask, do you get cited?
Answer Quality (20%): When cited, is the description accurate?
Source Authority (15%): What sources do engines cite when mentioning you?
Content Architecture (10%): How well is your site structured for AI consumption?
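To make the weighting concrete, here is a minimal sketch of the five pillar weights as a configuration constant. The dictionary name and keys are hypothetical identifiers used for illustration; only the percentages come from the list above.

```python
# Illustrative only: hypothetical constant mirroring the published pillar weights.
PILLAR_WEIGHTS = {
    "discoverability": 0.25,
    "citation_presence": 0.30,
    "answer_quality": 0.20,
    "source_authority": 0.15,
    "content_architecture": 0.10,
}

# The five weights cover 100% of the overall score.
assert abs(sum(PILLAR_WEIGHTS.values()) - 1.0) < 1e-9
```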
Scoring pipeline
From query to playbook in 6 steps
Evidence gathering
A query panel sends 50 prompts to each of the 5 AI engines. Each engine responds as itself. Playwright crawls your site for technical signals.
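As an illustration of the crawl half of this step, the sketch below uses Playwright's Python API to load one page and pull two technical signals: JSON-LD blocks and the meta description. The function name and the choice of signals are assumptions made for illustration, not the production crawler.

```python
from playwright.sync_api import sync_playwright

def crawl_technical_signals(url: str) -> dict:
    """Illustrative crawl: collect a few AI-consumption signals from one page."""
    with sync_playwright() as pw:
        browser = pw.chromium.launch()
        page = browser.new_page()
        page.goto(url, wait_until="networkidle")
        # Structured data blocks (schema.org JSON-LD) that answer engines can parse.
        json_ld = page.eval_on_selector_all(
            'script[type="application/ld+json"]',
            "nodes => nodes.map(n => n.textContent)",
        )
        # Meta description: a baseline signal for how the page summarizes itself.
        meta_description = page.evaluate(
            "() => document.querySelector('meta[name=description]')?.content ?? null"
        )
        browser.close()
    return {"url": url, "json_ld_blocks": json_ld, "meta_description": meta_description}
```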
Response analysis
Parse each engine's raw response into structured evidence: citations found, claims made, sources referenced, factual statements extracted.
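One plausible shape for that structured evidence is sketched below as a Python dataclass. The field names are assumptions taken from the four categories above, not GoatEO's actual schema.

```python
from dataclasses import dataclass, field

@dataclass
class EngineEvidence:
    """One engine's parsed response to one panel query (illustrative schema)."""
    engine: str                       # e.g. "chatgpt"
    query: str                        # the panel prompt that was sent
    raw_response: str                 # untouched engine output, kept for audit
    citations: list[str] = field(default_factory=list)           # URLs the engine cited
    claims: list[str] = field(default_factory=list)              # claims made about the brand
    sources: list[str] = field(default_factory=list)             # sources referenced by name
    factual_statements: list[str] = field(default_factory=list)  # extracted factual statements
```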
Rubric scoring
Structured evidence is scored against criterion definitions: deterministic criteria are scored programmatically, while judgment criteria are scored against anchored descriptors.
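A minimal sketch of the two scoring modes, using hypothetical criterion names: a deterministic criterion reduces to a rule over the evidence, while a judgment criterion is graded by matching the response to anchored descriptors.

```python
# Hypothetical deterministic criterion: a programmatic rule over the structured evidence.
def score_citation_present(evidence: dict) -> float:
    """1.0 if the parsed evidence contains at least one citation, else 0.0."""
    return 1.0 if evidence.get("citations") else 0.0

# Hypothetical anchored descriptors for a judgment criterion ("answer accuracy").
# The grader compares the engine's real response to each anchor and records the closest level.
ACCURACY_ANCHORS = {
    0.0: "description is wrong, or about a different company",
    0.5: "partially accurate, with outdated or missing details",
    1.0: "accurate, current description of the product and its positioning",
}
```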
Aggregation
Weighted sum across criteria → pillar score → overall GOAT Score (0-100).
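A minimal sketch of the aggregation arithmetic, reusing the hypothetical PILLAR_WEIGHTS constant from the earlier sketch and assuming, for illustration only, that criteria within a pillar are weighted equally.

```python
def pillar_score(criterion_scores: dict[str, float]) -> float:
    """Average of a pillar's criterion scores on a 0-1 scale (equal weights assumed here)."""
    return sum(criterion_scores.values()) / len(criterion_scores)

def goat_score(pillar_scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted sum of pillar scores, scaled to 0-100."""
    return round(100 * sum(weights[p] * s for p, s in pillar_scores.items()), 1)

# Worked example with made-up pillar scores:
example = {
    "discoverability": 0.8, "citation_presence": 0.5, "answer_quality": 0.9,
    "source_authority": 0.6, "content_architecture": 0.7,
}
# 100 * (0.25*0.8 + 0.30*0.5 + 0.20*0.9 + 0.15*0.6 + 0.10*0.7) = 69.0
```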
Playbook generation
For every criterion below threshold, generate engine-specific remediation playbooks grounded in the real findings.
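A sketch of the selection logic, with a hypothetical per-criterion threshold and playbook structure; the real thresholds and playbook contents are not specified here.

```python
THRESHOLD = 0.7  # hypothetical pass threshold per criterion

def build_playbooks(criterion_scores: dict[str, float],
                    findings: dict[str, list[str]]) -> list[dict]:
    """For each criterion below threshold, emit a remediation playbook stub tied to real findings."""
    playbooks = []
    for criterion, score in criterion_scores.items():
        if score >= THRESHOLD:
            continue
        playbooks.append({
            "criterion": criterion,
            "score": score,
            "evidence": findings.get(criterion, []),  # the engine responses that exposed the gap
            "actions": [],                            # engine-specific remediation steps go here
        })
    return playbooks
```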
Versioning
Every run is versioned and methodology changes are auditable. The audit_type field determines which rubric applies.
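One way a run could be made self-describing is sketched below. Apart from audit_type, which the methodology names, all field names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AuditRun:
    """Immutable record tying a run to the exact rubric that scored it (illustrative)."""
    run_id: str
    audit_type: str        # selects which rubric applies, per the methodology
    rubric_version: str    # e.g. "2024-06-01"; changes stay auditable across runs
    panel_version: str     # version of the 50-query panel used for this run
```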
Disclosures
Synthetic panel ≠ real user queries
GoatEO runs a curated panel of 50 queries through AI engine APIs. These approximate what real buyers ask, but they are not real buyer sessions. The panel composition is published and versioned.
API responses ≠ consumer UX
API responses may differ from web and mobile app experiences. UI-specific features (like citation cards or image results) are not captured. Google AI Overviews are captured via a SERP proxy to approximate the consumer experience.