A technical checklist of the schema markup, content structure, and crawler access requirements that make a site eligible to be cited by ChatGPT, Perplexity, Gemini, and Grok.
These six requirements (four schema types, plus content structure and crawler access) are the foundation of a citation-eligible site. Prioritize in order.
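JSON-LD Organization schema with name, url, description, foundingDate, sameAs (LinkedIn, Crunchbase, G2, Wikipedia if applicable), and knowsAbout. To describe your industry, use a classification property such as naics or isicV4, since schema.org defines industry for JobPosting rather than Organization. This is the primary entity signal AI platforms use to understand what your brand is.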
Validate with Google's Rich Results Test and confirm sameAs includes at least 3 external profiles.
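As a minimal sketch of the Organization markup described above, the Python below builds the JSON-LD object and applies the "at least 3 external profiles" check. The function names, the company, and every URL are hypothetical placeholders, not part of any Amplerank tooling. The industry field is left out for the reason noted above: schema.org appears to offer naics or isicV4 for that purpose on Organization.

```python
import json


def build_organization_jsonld(name, url, description, founding_date,
                              same_as, knows_about):
    """Return a schema.org Organization object as a JSON-LD-ready dict."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "description": description,
        "foundingDate": founding_date,  # ISO 8601 date string
        "sameAs": list(same_as),        # external profile URLs
        "knowsAbout": list(knows_about),
    }


def has_enough_profiles(org, minimum=3):
    """True if sameAs lists at least `minimum` distinct profile URLs."""
    return len(set(org.get("sameAs", []))) >= minimum


# Placeholder values for illustration only.
org = build_organization_jsonld(
    name="Example Co",
    url="https://www.example.com",
    description="Example Co makes invoicing software for freelancers.",
    founding_date="2019-04-01",
    same_as=[
        "https://www.linkedin.com/company/example-co",
        "https://www.crunchbase.com/organization/example-co",
        "https://www.g2.com/products/example-co",
    ],
    knows_about=["invoicing", "freelance accounting"],
)

# Embed the result in the page head inside a JSON-LD script tag.
snippet = ('<script type="application/ld+json">\n'
           + json.dumps(org, indent=2)
           + "\n</script>")
print(snippet)
```

Generating the markup from one function keeps the homepage and any other template that repeats the Organization block in sync.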
Every FAQ section on your site should be marked up with FAQPage JSON-LD. This allows AI platforms to extract specific question-answer pairs and cite them in response to matching prompts.
Aim for 6–10 questions per page, written as natural-language queries (not keyword phrases).
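One way to produce FAQPage markup is to generate it from the same question-answer pairs that render on the page. The sketch below assumes the pairs are available as (question, answer) tuples. The helper names and the sample questions are illustrative, and the 6 to 10 range simply mirrors the tip above.

```python
import json


def build_faqpage_jsonld(qa_pairs):
    """Turn (question, answer) tuples into a schema.org FAQPage dict."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in qa_pairs
        ],
    }


def question_count_in_range(faq, low=6, high=10):
    """True if the page carries between `low` and `high` questions."""
    return low <= len(faq["mainEntity"]) <= high


# Two sample pairs, phrased as natural-language queries.
faq = build_faqpage_jsonld([
    ("How do I check whether AI crawlers can access my site?",
     "Open yourdomain.com/robots.txt and look for Disallow rules "
     "that apply to AI crawler user agents."),
    ("What schema should a homepage include?",
     "Start with Organization markup that names the brand and links "
     "its external profiles."),
])

print(json.dumps(faq, indent=2))
print("Within 6-10 questions:", question_count_in_range(faq))
```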
Every key landing page should answer its primary query in the first 1–2 sentences. AI systems extract 'answer paragraphs' and favor pages that lead with the answer rather than bury it in body copy.
Test each page by running its target query in Perplexity. If the response doesn't cite your page, the answer structure is a likely candidate for revision.
Ensure GPTBot, PerplexityBot, ClaudeBot, Googlebot, and Bingbot are not blocked in your robots.txt. Many sites accidentally block AI crawlers during security or bot-blocking updates.
Use Amplerank's robots.txt checker to verify all AI bots have access to your key pages.
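For a scripted check, Python's standard-library urllib.robotparser can evaluate robots.txt rules per user agent. This is a rough sketch, not how Amplerank's checker works: it parses a sample robots.txt held in a string so it runs offline, and the sample rules and page URL are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Crawler names taken from the checklist item above.
AI_CRAWLERS = ["GPTBot", "PerplexityBot", "ClaudeBot", "Googlebot", "Bingbot"]


def check_crawler_access(robots_txt, page_url, agents=AI_CRAWLERS):
    """Map each user agent to True (allowed) or False (blocked)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, page_url) for agent in agents}


# Hypothetical robots.txt that blocks one AI crawler by name.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

result = check_crawler_access(
    SAMPLE_ROBOTS_TXT, "https://www.example.com/pricing"
)
for agent, allowed in result.items():
    print(f"{agent}: {'allowed' if allowed else 'BLOCKED'}")
```

To test a live site, RobotFileParser also offers set_url() and read() to fetch the file over the network before calling can_fetch().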
Any page with step-by-step instructions should include HowTo JSON-LD schema. How-to queries are among the highest-volume AI search categories, and structured markup can improve citation rates for this query type by making each step explicit.
Include meaningful HowToStep descriptions (not just titles) — AI platforms use the step text, not just the step name.
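The sketch below builds HowTo markup from (title, text) pairs and flags steps whose text merely repeats the title or is very short. The 8-word threshold is an arbitrary assumption chosen for illustration, and the function names and sample steps are hypothetical.

```python
import json


def build_howto_jsonld(name, steps):
    """Build a schema.org HowTo dict from (title, text) step tuples."""
    return {
        "@context": "https://schema.org",
        "@type": "HowTo",
        "name": name,
        "step": [
            {
                "@type": "HowToStep",
                "position": position,
                "name": title,
                "text": text,
            }
            for position, (title, text) in enumerate(steps, start=1)
        ],
    }


def thin_steps(howto, min_words=8):
    """Return names of steps whose text repeats the title or is too short."""
    thin = []
    for step in howto["step"]:
        text = step.get("text", "").strip()
        if text == step["name"].strip() or len(text.split()) < min_words:
            thin.append(step["name"])
    return thin


howto = build_howto_jsonld(
    "How to review your robots.txt file",
    [
        ("Open robots.txt",
         "Open the robots.txt file at the root of your site in a text editor."),
        ("Save", "Save"),  # title-only step: flagged below
    ],
)

print(json.dumps(howto, indent=2))
print("Steps needing a fuller description:", thin_steps(howto))
```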
Blog posts, guides, and reports should include Article JSON-LD with headline, description, author (Person + Organization), datePublished, dateModified, and publisher. This enables AI models to attribute editorial content to your brand with proper authorship signals.
Include the author's name as a Person type alongside the Organization — both signals matter for E-E-A-T.
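A minimal sketch of the Article markup described above, with author given as both a Person and an Organization. All names, URLs, and dates are placeholders, and the helper is hypothetical.

```python
import json
from datetime import date


def build_article_jsonld(headline, description, author_name,
                         org_name, org_url, published, modified):
    """Build a schema.org Article dict with Person and Organization authors."""
    def organization():
        # Fresh dict each time so author and publisher stay independent.
        return {"@type": "Organization", "name": org_name, "url": org_url}

    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "description": description,
        "author": [
            {"@type": "Person", "name": author_name},
            organization(),
        ],
        "datePublished": published.isoformat(),
        "dateModified": modified.isoformat(),
        "publisher": organization(),
    }


article = build_article_jsonld(
    headline="How Freelancers Can Automate Invoicing",
    description="A step-by-step guide to setting up recurring invoices.",
    author_name="Jane Doe",
    org_name="Example Co",
    org_url="https://www.example.com",
    published=date(2024, 1, 15),
    modified=date(2024, 3, 2),
)

print(json.dumps(article, indent=2))
```

Update dateModified whenever the content changes substantively, so the markup matches what is on the page.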
AI platforms can only cite content they can crawl and index. Run through this list quarterly.
Schema makes you discoverable. Content structure determines whether AI extracts and cites you.
Amplerank checks your schema, crawler access, content signals, and citation rates in one place — then shows you exactly what to fix first.
AI citation eligibility is a technical state, not just a content quality judgment. A site can have excellent content but be ineligible for AI citations due to blocked crawlers, missing schema, or content buried behind JavaScript rendering. The minimum viable set of requirements for citation eligibility: Organization schema on the homepage, FAQPage schema on FAQ content, no AI bots blocked in robots.txt, a sitemap.xml, and direct-answer content structure on key pages. Meeting these requirements doesn't guarantee citations, but failing any of them systematically suppresses them across all AI platforms simultaneously.
Can I check whether AI crawlers are blocked on my site?
Yes. Use Amplerank's robots.txt checker tool at amplerank.ai/tools/robots-txt-checker. Enter your domain and it will identify whether GPTBot, PerplexityBot, ClaudeBot, Bingbot, and Googlebot are allowed or blocked. You can also check manually by visiting yourdomain.com/robots.txt in a browser.
What does 'direct-answer content structure' mean technically?
Direct-answer content structure means your page's opening text contains a clear, complete answer to the primary query the page targets — before any background information, feature lists, or context-setting. Technically, the answer should appear in the first paragraph element of the page's main content area, ideally within the first 150–200 words. AI platforms use the opening content of a page as their primary extraction target for synthesized answers.
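To see what that opening paragraph looks like for a given page, the sketch below uses Python's standard-library html.parser to pull the first paragraph inside the main element and count its words. It assumes the page marks its content area with a main tag, which not every site does. It only surfaces the text for review and cannot judge whether the paragraph actually answers the query. The class and function names and the sample HTML are hypothetical.

```python
from html.parser import HTMLParser


class FirstParagraphParser(HTMLParser):
    """Collect the text of the first <p> found inside <main>."""

    def __init__(self):
        super().__init__()
        self.in_main = False
        self.in_p = False
        self.done = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "main":
            self.in_main = True
        elif tag == "p" and self.in_main and not self.done:
            self.in_p = True

    def handle_endtag(self, tag):
        if tag == "p" and self.in_p:
            self.in_p = False
            self.done = True  # ignore later paragraphs
        elif tag == "main":
            self.in_main = False

    def handle_data(self, data):
        if self.in_p:
            self.chunks.append(data)


def first_paragraph(html):
    """Return the first main-content paragraph with whitespace collapsed."""
    parser = FirstParagraphParser()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.chunks).split())


SAMPLE_HTML = """
<html><body>
<nav><p>Home | Pricing</p></nav>
<main>
  <h1>What is schema markup?</h1>
  <p>Schema markup is <b>structured data</b> that describes a page.</p>
  <p>It was introduced by a group of search engines.</p>
</main>
</body></html>
"""

opening = first_paragraph(SAMPLE_HTML)
word_count = len(opening.split())
print(opening)
print("Words in opening paragraph:", word_count)
```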