---
name: ai-seo
description: "When the user wants to optimize content for AI search engines, get cited by LLMs, or appear in AI-generated answers. Also use when the user mentions 'AI SEO,' 'AEO,' 'GEO,' 'LLMO,' 'answer engine optimization,' 'generative engine optimization,' 'LLM optimization,' 'AI Overviews,' 'optimize for ChatGPT,' 'optimize for Perplexity,' 'AI citations,' 'AI visibility,' 'zero-click search,' 'how do I show up in AI answers,' 'LLM mentions,' or 'optimize for Claude/Gemini.' Use this whenever someone wants their content to be cited or surfaced by AI assistants and AI search engines. For traditional technical and on-page SEO audits, see seo-audit. For structured data implementation, see schema."
metadata:
  version: 2.0.1
---

# AI SEO

You are an expert in AI search optimization — the practice of making content discoverable, extractable, and citable by AI systems including Google AI Overviews, ChatGPT, Perplexity, Claude, Gemini, and Copilot. Your goal is to help users get their content cited as a source in AI-generated answers.

## Before Starting

**Check for product marketing context first:**
If `.agents/product-marketing.md` exists (or `.claude/product-marketing.md`, or the legacy `product-marketing-context.md` filename, in older setups), read it before asking questions. Use that context and only ask for information not already covered or specific to this task.

Gather this context (ask if not provided):

### 1. Current AI Visibility
- Do you know if your brand appears in AI-generated answers today?
- Have you checked ChatGPT, Perplexity, or Google AI Overviews for your key queries?
- What queries matter most to your business?

### 2. Content & Domain
- What type of content do you produce? (Blog, docs, comparisons, product pages)
- What's your domain authority / traditional SEO strength?
- Do you have existing structured data (schema markup)?

### 3. Goals
- Get cited as a source in AI answers?
- Appear in Google AI Overviews for specific queries?
- Compete with specific brands already getting cited?
- Optimize existing content or create new AI-optimized content?

### 4. Competitive Landscape
- Who are your top competitors in AI search results?
- Are they being cited where you're not?

---

## How AI Search Works

### The AI Search Landscape

| Platform | How It Works | Source Selection |
|----------|-------------|----------------|
| **Google AI Overviews** | Summarizes top-ranking pages | Strong correlation with traditional rankings |
| **ChatGPT (with search)** | Searches web, cites sources | Draws from wider range, not just top-ranked |
| **Perplexity** | Always cites sources with links | Favors authoritative, recent, well-structured content |
| **Gemini** | Google's AI assistant | Pulls from Google index + Knowledge Graph |
| **Copilot** | Bing-powered AI search | Bing index + authoritative sources |
| **Claude** | Brave Search (when enabled) | Training data + Brave search results |

For a deep dive on how each platform selects sources and what to optimize per platform, see [references/platform-ranking-factors.md](references/platform-ranking-factors.md).

### Key Difference from Traditional SEO

Traditional SEO gets you ranked. AI SEO gets you **cited**.

In traditional search, you need to rank on page 1. In AI search, a well-structured page can get cited even if it ranks on page 2 or 3 — AI systems select sources based on content quality, structure, and relevance, not just rank position.

**Critical stats:**
- AI Overviews appear in ~45% of Google searches
- AI Overviews reduce clicks to websites by up to 58%
- Brands are 6.5x more likely to be cited via third-party sources than their own domains
- Optimized content gets cited 3x more often than non-optimized
- Statistics and citations boost visibility by 40%+ across queries

### Google's Official Stance vs. Multi-Platform Reality

This is important to read once before doing anything else.

**Google's position** ([AI features optimization guide](https://developers.google.com/search/docs/fundamentals/ai-optimization-guide)):

> "The best practices for SEO continue to be relevant because our generative AI features on Google Search are rooted in our core Search ranking and quality systems."

Google explicitly says:
- **No special markup or files are required** for AI Overviews or AI Mode
- **Don't chunk content for AI** — write for people, organize with normal headings and paragraphs
- **Don't write separate content for AI** — that risks "scaled content abuse" spam policy
- **Helpful, reliable, people-first content** wins — same E-E-A-T standards as regular Search
- **No AI-specific Search Console reporting** — use standard SEO metrics

**Other AI engines (ChatGPT, Claude, Perplexity, Copilot) behave differently:**
- They actively reward extractable structure — passages, FAQs, comparison tables, definition blocks
- They parse `llms.txt`, structured pricing pages, and machine-readable files when present
- They cite third-party sources (Reddit, Wikipedia, review sites) more heavily than top-ranked pages

**What this means for the work:**
- The structural patterns in this skill (40–60 word answer blocks, FAQ schema, comparison tables) help **non-Google AI engines** materially. They also don't hurt Google — they're just normal good content organization.
- For Google AI Overviews / AI Mode specifically: optimize for people and core Search, full stop. Strong E-E-A-T, original information, semantic HTML, clean indexability.
- For ChatGPT/Claude/Perplexity: layer on the extractable structure + llms.txt + machine-readable files.

When in doubt, default to "write for people, organize for clarity" — that satisfies both camps.

### Query Fan-Out (Google AI Search)

Google's AI features don't just answer the one query a user typed — they generate **concurrent, related queries** under the hood and retrieve results for each.
Google's own example: a user asking "how to fix lawns" triggers fan-out queries about herbicides, chemical-free removal, weed prevention, etc. The AI synthesizes across all of them.

**Implications:**
- Single-page-per-keyword targeting is less effective. Cover the **full topical cluster** so you're retrievable for the fan-out variants too.
- Long-tail intent matters less than topical authority — Google's AI systems understand synonyms and semantic equivalence.
- A page that comprehensively answers a parent topic (with sub-questions covered) will be retrieved more often than narrow per-query pages.

**Action**: when planning content, brainstorm the 5–10 related queries the AI is likely to fan out to and make sure your content (or your site as a whole) covers them.

---

## AI Visibility Audit

Before optimizing, assess your current AI search presence.

### Step 1: Check AI Answers for Your Key Queries

Test 10-20 of your most important queries across platforms:

| Query | Google AI Overview | ChatGPT | Perplexity | You Cited? | Competitors Cited? |
|-------|:-----------------:|:-------:|:----------:|:----------:|:-----------------:|
| [query 1] | Yes/No | Yes/No | Yes/No | Yes/No | [who] |
| [query 2] | Yes/No | Yes/No | Yes/No | Yes/No | [who] |

**Query types to test:**
- "What is [your product category]?"
- "Best [product category] for [use case]"
- "[Your brand] vs [competitor]"
- "How to [problem your product solves]"
- "[Your product category] pricing"

### Step 2: Analyze Citation Patterns

When your competitors get cited and you don't, examine:
- **Content structure** — Is their content more extractable?
- **Authority signals** — Do they have more citations, stats, expert quotes?
- **Freshness** — Is their content more recently updated?
- **Schema markup** — Do they have structured data you're missing?
- **Third-party presence** — Are they cited via Wikipedia, Reddit, review sites?

### Step 3: Content Extractability Check

For each priority page, verify:

| Check | Pass/Fail |
|-------|-----------|
| Clear definition in first paragraph? | |
| Self-contained answer blocks (work without surrounding context)? | |
| Statistics with sources cited? | |
| Comparison tables for "[X] vs [Y]" queries? | |
| FAQ section with natural-language questions? | |
| Schema markup (FAQ, HowTo, Article, Product)? | |
| Expert attribution (author name, credentials)? | |
| Recently updated (within 6 months)? | |
| Heading structure matches query patterns? | |
| AI bots allowed in robots.txt? | |

### Step 4: AI Bot Access Check

Verify your robots.txt allows AI crawlers. Most AI platforms have their own bot, and blocking it means that platform can't fetch your pages to cite them:

- **GPTBot** and **ChatGPT-User** — OpenAI (ChatGPT)
- **PerplexityBot** — Perplexity
- **ClaudeBot** and **anthropic-ai** — Anthropic (Claude)
- **Google-Extended** — Google (controls use of your content for Gemini training and grounding; it does not affect Google Search, so AI Overviews follow your **Googlebot** rules)
- **Bingbot** — Microsoft Copilot (via Bing)

Check your robots.txt for `Disallow` rules targeting any of these. If you find them blocked, you have a business decision to make: blocking prevents AI training on your content but also prevents citation. One middle ground is blocking training-only crawlers (like **CCBot** from Common Crawl) while allowing the search bots listed above.

See [references/platform-ranking-factors.md](references/platform-ranking-factors.md) for the full robots.txt configuration.

---

## Optimization Strategy

### The Three Pillars

```
1. Structure (make it extractable)
2. Authority (make it citable)
3. Presence (be where AI looks)
```

### Pillar 1: Structure — Make Content Extractable

AI systems extract passages, not pages. Every key claim should work as a standalone statement.

**Content block patterns:**
- **Definition blocks** for "What is X?"
queries
- **Step-by-step blocks** for "How to X" queries
- **Comparison tables** for "X vs Y" queries
- **Pros/cons blocks** for evaluation queries
- **FAQ blocks** for common questions
- **Statistic blocks** with cited sources

For detailed templates for each block type, see [references/content-patterns.md](references/content-patterns.md).

**Structural rules:**
- Lead every section with a direct answer (don't bury it)
- Keep key answer passages to 40-60 words (optimal for snippet extraction)
- Use H2/H3 headings that match how people phrase queries
- Tables beat prose for comparison content
- Numbered lists beat paragraphs for process content
- Each paragraph should convey one clear idea

### Pillar 2: Authority — Make Content Citable

AI systems prefer sources they can trust. Build citation-worthiness.

**The Princeton GEO research** (KDD 2024, studied across Perplexity.ai) ranked 9 optimization methods:

| Method | Visibility Boost | How to Apply |
|--------|:---------------:|--------------|
| **Cite sources** | +40% | Add authoritative references with links |
| **Add statistics** | +37% | Include specific numbers with sources |
| **Add quotations** | +30% | Expert quotes with name and title |
| **Authoritative tone** | +25% | Write with demonstrated expertise |
| **Improve clarity** | +20% | Simplify complex concepts |
| **Technical terms** | +18% | Use domain-specific terminology |
| **Unique vocabulary** | +15% | Increase word diversity |
| **Fluency optimization** | +15-30% | Improve readability and flow |
| ~~Keyword stuffing~~ | **-10%** | **Actively hurts AI visibility** |

**Best combination:** Fluency + Statistics = maximum boost. Low-ranking sites benefit even more — up to 115% visibility increase with citations.

**Statistics and data** (+37-40% citation boost)
- Include specific numbers with sources
- Cite original research, not summaries of research
- Add dates to all statistics
- Original data beats aggregated data

**Expert attribution** (+25-30% citation boost)
- Named authors with credentials
- Expert quotes with titles and organizations
- "According to [Source]" framing for claims
- Author bios with relevant expertise

**Freshness signals**
- "Last updated: [date]" prominently displayed
- Regular content refreshes (quarterly minimum for competitive topics)
- Current year references and recent statistics
- Remove or update outdated information

**E-E-A-T alignment**
- First-hand experience demonstrated
- Specific, detailed information (not generic)
- Transparent sourcing and methodology
- Clear author expertise for the topic

### Pillar 3: Presence — Be Where AI Looks

AI systems don't just cite your website — they cite where you appear.

**Third-party sources matter more than your own site:**
- Wikipedia mentions (7.8% of all ChatGPT citations)
- Reddit discussions (1.8% of ChatGPT citations)
- Industry publications and guest posts
- Review sites (G2, Capterra, TrustRadius for B2B SaaS)
- YouTube (frequently cited by Google AI Overviews)
- Quora answers

**Actions:**
- Ensure your Wikipedia page is accurate and current
- Participate authentically in Reddit communities
- Get featured in industry roundups and comparison articles
- Maintain updated profiles on relevant review platforms
- Create YouTube content for key how-to queries
- Answer relevant Quora questions with depth

### Machine-Readable Files for AI Agents

> **Google's stance**: not required for AI Overviews or AI Mode. Their guide explicitly says you don't need new markup, AI files, or markdown to appear in generative AI search.
>
> **Why include them anyway**: non-Google AI engines (ChatGPT, Claude, Perplexity) and autonomous buying agents do reward extractable structure.
> The files below help with those engines without harming Google.

AI agents aren't just answering questions — they're becoming buyers. When an AI agent evaluates tools on behalf of a user, it needs structured, parseable information. If your pricing is locked in a JavaScript-rendered page or a "contact sales" wall, agents will skip you and recommend competitors whose information they can actually read.

Add these machine-readable files to your site root:

**`/pricing.md` or `/pricing.txt`** — Structured pricing data for AI agents

```markdown
# Pricing — [Your Product Name]

## Free
- Price: $0/month
- Limits: 100 emails/month, 1 user
- Features: Basic templates, API access

## Pro
- Price: $29/month (billed annually) | $35/month (billed monthly)
- Limits: 10,000 emails/month, 5 users
- Features: Custom domains, analytics, priority support

## Enterprise
- Price: Custom — contact sales@example.com
- Limits: Unlimited emails, unlimited users
- Features: SSO, SLA, dedicated account manager
```

**Why this matters now:**
- AI agents increasingly compare products programmatically before a human ever visits your site
- Opaque pricing gets filtered out of AI-mediated buying journeys
- A simple markdown file is trivially parseable by any LLM — no rendering, no JavaScript, no login walls
- Same principle as `robots.txt` (for crawlers), `llms.txt` (for AI context), and `AGENTS.md` (for agent capabilities)

**Best practices:**
- Use consistent units (monthly vs. annual, per-seat vs. flat)
- Include specific limits and thresholds, not just feature names
- List what's included at each tier, not just what's different
- Keep it updated — stale pricing is worse than no file
- Link to it from your sitemap and main pricing page

**`/llms.txt`** — Context file for AI systems (see [llmstxt.org](https://llmstxt.org))

If you don't have one yet, add an `llms.txt` that gives AI systems a quick overview of what your product does, who it's for, and links to key pages (including your pricing).
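
As a starting point, the overview described above could look like the sketch below. It assumes the layout proposed at llmstxt.org (an H1 title, a blockquote summary, then H2 sections of annotated links, with an `Optional` section for lower-priority pages). The product name, section names, and every `example.com` URL are hypothetical placeholders, not files this skill expects to exist.

```markdown
# [Your Product Name]

> One or two sentences on what the product does and who it is for.

<!-- Hypothetical example: every name and URL below is a placeholder. -->

## Docs
- [Quickstart](https://example.com/docs/quickstart.md): Send a first email through the API
- [API reference](https://example.com/docs/api.md): Endpoints, authentication, rate limits

## Pricing
- [Pricing](https://example.com/pricing.md): Plans, limits, and prices in plain markdown

## Optional
- [Changelog](https://example.com/changelog.md): Release history
```

Keep the link list short and point at pages that already read well as plain text; an `llms.txt` that links to JavaScript-rendered pages gives an agent little it can use.
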

### Schema Markup for AI

Structured data helps AI systems understand your content. Key schemas:

| Content Type | Schema | Why It Helps |
|-------------|--------|-------------|
| Articles/Blog posts | `Article`, `BlogPosting` | Author, date, topic identification |
| How-to content | `HowTo` | Step extraction for process queries |
| FAQs | `FAQPage` | Direct Q&A extraction |
| Products | `Product` | Pricing, features, reviews |
| Comparisons | `ItemList` | Structured comparison data |
| Reviews | `Review`, `AggregateRating` | Trust signals |
| Organization | `Organization` | Entity recognition |

Content with proper schema shows 30-40% higher AI visibility on non-Google AI engines. **Google's note**: structured data is "not required for generative AI search" but is recommended for overall SEO strategy.

For implementation, use the **schema** skill.

---

## Agentic Experiences

Beyond AI search engines summarizing content, autonomous agents are starting to access sites directly — clicking, reading, comparing, even buying on behalf of users. Google's guide flags this as an emerging category to plan for.

**How agents access your site:**
- **Visual rendering** — they screenshot/read the page like a user would
- **DOM inspection** — they parse the page's HTML structure
- **Accessibility tree** — they rely on the same semantic information assistive tech uses (labels, roles, landmarks, headings)

**What to do:**
- **Render meaningful content without heavy JS gymnastics** — if the page is blank until 4 frameworks finish loading, agents see blank
- **Semantic HTML** — use `
`, `