AI Search Optimization Glossary

All important terms about AI search engines and website optimization clearly explained

A

Accessibility

Accessibility

The degree to which a website can be used by everyone, including people with disabilities, and read by AI bots. Important elements include alt texts for images, correct heading structure, ARIA labels and semantic HTML. AI search engines tend to handle accessible websites better because they are easier to crawl and understand.

SEO

AI readiness

The degree to which a website is optimized for AI search engines like ChatGPT, Claude and Perplexity. Includes structured data, semantic HTML, clear metadata and crawlable content. An AI-ready website is better understood by AI bots and cited more frequently in AI-generated answers.

Technical

Alt text

Alternative text that describes an image. Screen readers read it aloud, and it helps AI bots understand image content. Format: <img alt="Image description">. Important for accessibility and AI readiness, as many bots do not interpret the image itself.
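
A quick way to find images without alt text is to scan the HTML for <img> tags. The following is a minimal sketch using Python's standard html.parser; the class name, function name and sample markup are made up for illustration. It treats an empty alt as missing, which is a simplification, since purely decorative images may legitimately use alt="".

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collects the src of every <img> without a non-empty alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        alt = attributes.get("alt")
        # Simplification: an empty alt is reported as missing as well.
        if alt is None or not alt.strip():
            self.missing.append(attributes.get("src", "(no src)"))


def images_missing_alt(html):
    """Return the src values of all images that lack alt text."""
    finder = MissingAltFinder()
    finder.feed(html)
    return finder.missing


# Hypothetical sample markup: the second image has no alt attribute.
sample = '<img src="team.jpg" alt="Our team at the office"><img src="chart.png">'
print(images_missing_alt(sample))
```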

Technical

ARIA labels

ARIA stands for Accessible Rich Internet Applications. ARIA attributes are HTML attributes that provide additional information for screen readers and AI bots. Examples: aria-label="Search field", role="navigation". They improve semantic structure and AI comprehensibility.

B

SEO

Bot / Crawler

Automated programs that crawl and index websites. Examples: Googlebot (Google), GPTBot (OpenAI), ClaudeBot (Anthropic). Bots follow links, read content and collect data for search engines or AI models. The robots.txt file tells bots which areas they may crawl.

Content

Breadcrumbs

Navigation path showing page structure (e.g. Home > Products > Category). Helps users and bots understand website hierarchy. Can be structured with Schema.org BreadcrumbList for better AI comprehensibility.
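
A BreadcrumbList can be generated from the navigation path. The sketch below builds the JSON-LD for the example path Home > Products > Category; the function name and the example.com URLs are assumptions for illustration only.

```python
import json


def breadcrumb_jsonld(trail):
    """Build a Schema.org BreadcrumbList from (name, url) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            # Positions start at 1 and follow the order of the trail.
            {"@type": "ListItem", "position": position, "name": name, "item": url}
            for position, (name, url) in enumerate(trail, start=1)
        ],
    }


# Hypothetical URLs for the path Home > Products > Category.
trail = [
    ("Home", "https://example.com/"),
    ("Products", "https://example.com/products/"),
    ("Category", "https://example.com/products/category/"),
]
print(json.dumps(breadcrumb_jsonld(trail), indent=2))
```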

C

AI

ChatGPT

AI chatbot by OpenAI that crawls websites via GPTBot. Uses crawled content to answer questions. Websites can block or allow GPTBot in robots.txt. Important for AI readiness: structured data and clear content.

AI

Claude

AI assistant by Anthropic that crawls websites with ClaudeBot. Analyzes content for context and answers. Prefers well-structured, accessible websites with clear metadata and Schema.org markup.

SEO

Content quality

Measure of website content quality. Factors: text length, text-code ratio, internal linking, heading structure, lists and tables. AI bots prefer substantial, well-structured content over thin pages.

SEO

Crawlability

The ability of bots to crawl a website. Influenced by robots.txt, meta tags (noindex/nofollow), sitemaps, and internal linking. Good crawlability is essential for AI readiness.

Technical

Canonical Tag

HTML element defining the preferred URL of a page. Format: <link rel="canonical" href="https://example.com/page/">. Prevents duplicate content issues and helps bots index the canonical version.
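
To verify which canonical URL a page declares, the <link rel="canonical"> element can be read from the HTML. This is a minimal sketch with Python's standard html.parser; the names and the sample markup are illustrative assumptions.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> element."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def find_canonical(html):
    """Return the declared canonical URL, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


sample = '<head><link rel="canonical" href="https://example.com/page/"></head>'
print(find_canonical(sample))
```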

AI

ClaudeBot

Official crawler from Anthropic for Claude. User-Agent: "ClaudeBot". Can be blocked in robots.txt: "User-agent: ClaudeBot" + "Disallow: /". Crawls websites for training and updating the Claude AI model.

SEO

Core Web Vitals

Three Google metrics: LCP (loading performance), INP (responsiveness) and CLS (visual stability). Part of Google's page experience ranking signals since 2021. AI crawlers may abandon slow pages sooner than Googlebot does.

Technical

Crawl Budget

The number of pages a bot crawls per unit of time. Noindex pages, duplicates and slow servers reduce effective crawl budget for important content.

D

Technical

Disallow

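Instruction in robots.txt that excludes bots from crawling certain areas. A Disallow rule applies only to the bots named in the preceding User-agent line. Example: “Disallow: /” blocks the entire site for those bots (for all bots if it follows “User-agent: *”). “Disallow: /admin/” blocks only the admin area. Critical for AI readiness: accidental blocking prevents crawling and, with it, indexing.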
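
How a Disallow rule affects a particular bot can be checked with Python's standard urllib.robotparser module. The sketch below is an illustration only: the robots.txt content and the example.com URLs are made up, and real crawlers may interpret edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: GPTBot is blocked entirely,
# all other bots are only kept out of /admin/.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /admin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(user_agent, url):
    """Return True if the parsed rules allow user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


for bot in ("GPTBot", "ClaudeBot"):
    for url in ("https://example.com/page", "https://example.com/admin/users"):
        print(bot, url, is_allowed(bot, url))
```

In this setup GPTBot is refused everywhere, while ClaudeBot falls back to the “User-agent: *” group and is only refused under /admin/.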

E

SEO

E-E-A-T

Experience, Expertise, Authoritativeness, Trustworthiness: Google's quality criteria for content. E-E-A-T is not a numeric score, but AI models tend to prefer sources that show strong E-E-A-T signals when citing in generated answers.

F

SEO

FAQ Schema

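Schema.org markup for question-and-answer content (FAQPage). Can enable FAQ rich results in Google, although since 2023 Google shows these mainly for well-known government and health websites. AI models tend to favor FAQ structures for direct answers to user questions.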

SEO

Featured Snippet

Highlighted excerpt at the top of Google search results (position 0). Pages that earn featured snippets are often also cited by AI search engines, as they tend to be treated as authoritative sources.

G

SEO

GEO (Generative Engine Optimization)

The optimization of content for AI-powered search engines and chatbots such as ChatGPT, Claude, and Perplexity. GEO goes beyond classic SEO and focuses on structured data, natural language, context, and semantic markup in order to be cited in AI-generated responses. Synonymous with AI readiness and AI visibility.

AI

Google Gemini / Bard

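Google's AI chatbot, formerly called Bard and integrated into Google Search. Google-Extended is not a separate crawler but a robots.txt user-agent token: crawling is done by Google's existing crawlers, and websites can use “User-agent: Google-Extended” in robots.txt to control whether their content may be used for Gemini.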

AI

GPTBot

Official crawler from OpenAI for ChatGPT. User agent: “GPTBot”. Can be blocked in robots.txt: “User-agent: GPTBot” + “Disallow: /”. Crawls websites to collect training data and current information.

H

Technical

H1 tag

Main heading of a page (Heading 1). Best practice: exactly one H1 per page, describing the main topic. Important for SEO and AI readiness, as bots use H1 as the primary content indicator.

Accessibility

Heading hierarchy

Logical structure of headings (H1 → H2 → H3). Do not skip levels! Incorrect: H1 → H3. Correct: H1 → H2 → H3. Helps bots and screen readers understand the content structure.
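
Skipped heading levels can be detected automatically. The following sketch collects the heading tags of a page with Python's standard html.parser and reports every jump of more than one level; the names are illustrative, and a page that does not start with an H1 is reported as a jump from level 0.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Records the level (1-6) of every heading in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def skipped_heading_levels(html):
    """Return (previous_level, level) pairs where a level was skipped."""
    collector = HeadingCollector()
    collector.feed(html)
    problems = []
    previous = 0  # 0 means "no heading seen yet"
    for level in collector.levels:
        # Going deeper by more than one level skips a heading level.
        if level > previous + 1:
            problems.append((previous, level))
        previous = level
    return problems


print(skipped_heading_levels("<h1>Title</h1><h3>Detail</h3>"))
print(skipped_heading_levels("<h1>Title</h1><h2>Part</h2><h3>Detail</h3>"))
```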

Technical

HTML sitemap

Human-readable overview of all pages on a website (usually /sitemap.html). Complements the XML sitemap and improves navigation for users and bots.

I

SEO

Indexing

The process by which a search engine adds a page to its database. Required for appearing in search results and AI answers. Can be prevented by noindex tags, robots.txt or missing internal links.

Technical

INP (Interaction to Next Paint)

Core Web Vital measuring how quickly a page responds to user input. Replaced FID in March 2024. Good: under 200ms. Needs improvement: 200–500ms. Poor: over 500ms. High INP often indicates too much JavaScript running on the main thread.

J

Structured Data

JSON-LD

Google's recommended format for structured data. Embedded in the <head> or <body> as <script type="application/ld+json">. Easier to implement than Microdata and does not affect visible HTML.
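
The script element can be generated from a plain data structure. The sketch below is one possible approach, not a required one: the function name and the Organization data are made up for illustration. It escapes "</" so that text inside the data cannot close the script element early.

```python
import json


def jsonld_script_tag(data):
    """Serialize data as JSON-LD and wrap it in a script element."""
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    # "\/" is a valid JSON escape for "/", so the data stays unchanged
    # while "</script>" can no longer appear inside the payload.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Hypothetical organization data.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Inc.",
    "url": "https://example.com/",
}
print(jsonld_script_tag(organization))
```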

K

SEO

AI visibility

Synonymous with AI readiness. Describes how well a website is found, understood, and cited by AI search engines such as ChatGPT, Claude, and Perplexity. Factors: structured data (Schema.org), semantic HTML, crawlability (robots.txt), content quality, and accessibility. Its German equivalent, “KI-Sichtbarkeit”, is the established term in the German market for optimization for AI bots.

AI

AI search engine

Search engine that uses AI to generate answers instead of just showing links. Examples: ChatGPT, Claude, Perplexity, Google Gemini. Uses web content to create natural language answers.

L

Technical

Lang attribute

HTML attribute that defines the language of a page. Format: <html lang="en">. Helps bots and screen readers recognize the correct language. Important for international SEO.

Technical

LCP (Largest Contentful Paint)

Core Web Vital measuring load time of the largest visible element. Good: under 2.5s. Needs improvement: 2.5–4s. Poor: over 4s. Common causes: unoptimized images, slow server.

SEO

Lighthouse Score

Google's automated audit tool for Performance, Accessibility, Best Practices and SEO. Scores range from 0 to 100; 90 or above is considered good. Available in Chrome DevTools and PageSpeed Insights.

Technical

llms.txt

A simple text file in Markdown format located in the root directory of a website, accessible at yourdomain.com/llms.txt. It provides AI systems like ChatGPT, Claude and Perplexity with structured information about the website: description, offering, target audience and important pages. Similar to robots.txt for crawlers, but as a content description rather than a ruleset. Proposed as a community convention in 2024 and reportedly supported by Anthropic and Perplexity; it is not a formal standard, and AI systems are not guaranteed to read it.
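
An llms.txt file can be assembled from a few fields. The sketch below assumes the layout commonly used for this file: a title heading, a short summary as a block quote, then sections with link lists. The function name, section names and URLs are hypothetical.

```python
def build_llms_txt(name, summary, sections):
    """Assemble llms.txt content in Markdown.

    sections maps a section heading to a list of (title, url, note) tuples.
    """
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    # End the file with exactly one newline.
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical website data.
content = build_llms_txt(
    "Example Shop",
    "Online shop for handmade ceramics, aimed at private customers.",
    {
        "Important pages": [
            ("Products", "https://example.com/products/", "Full catalog"),
            ("FAQ", "https://example.com/faq/", "Shipping and returns"),
        ]
    },
)
print(content)
```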

M

SEO

Meta description

Short description of a page in the HTML head. Displayed in search results. Optimal: 150-160 characters. Important for click-through rate; it also helps AI bots understand the page content.

Technical

Meta tag

HTML element in the <head> that provides meta information. Examples: description, keywords, robots, viewport. The robots meta tag in particular controls how bots treat the page (index/do not index).

N

Technical

NOINDEX

Meta tag that instructs search engines NOT to index a page. Format: <meta name="robots" content="noindex">. CRITICAL: Prevents the page from appearing in search results. Common mistake in AI readiness!

Technical

NOFOLLOW

Instruction to bots not to follow links on a page. As a meta tag: <meta name="robots" content="nofollow"> or link attribute: <a rel="nofollow">. Prevents link authority from being passed on.
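
Because an accidental noindex or nofollow is easy to overlook, it can help to read the robots meta tag programmatically. The following is a minimal sketch with Python's standard html.parser; the names and sample markup are illustrative. It only looks at name="robots", not at bot-specific meta tags or HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects the directives of all <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            # Directives are comma-separated and case-insensitive.
            for part in content.split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def robots_directives(html):
    """Return the set of robots meta directives found in html."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.directives


sample = '<head><meta name="robots" content="noindex, nofollow"></head>'
found = robots_directives(sample)
print("noindex" in found, "nofollow" in found)
```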

O

SEO

Open Graph

Meta tags for social media sharing (Facebook, LinkedIn). Format: og:title, og:description, og:image. Controls how links are displayed on social media. Improves presentation when sharing.

P

AI

Perplexity

AI search engine that crawls websites with PerplexityBot. Specializes in fact-based answers with source references. Often cites well-structured websites with clear data.

Technical

PageSpeed

Website loading speed. AI crawlers from ChatGPT, Perplexity and Claude are reported to use shorter timeouts than Googlebot. Pages with high TTFB may not be crawled by AI bots at all.

AI

PerplexityBot

Official crawler from Perplexity AI. User-Agent: "PerplexityBot". Can be controlled via robots.txt. Perplexity specializes in fact-based answers with source references.

R

Technical

robots.txt

Text file in the root directory (/robots.txt) that defines crawling rules for bots. Controls which areas may be crawled. Critical for AI readiness: “Disallow: /” under “User-agent: *” blocks ALL bots!

Content

RSS feed

XML format for content updates (Really Simple Syndication). Enables automatic content distribution. Helps bots discover new content quickly. MIME types: application/rss+xml for RSS, application/atom+xml for the related Atom format.

SEO

Rich Snippets

Enhanced search results with additional information such as ratings, prices or FAQs. Enabled by structured data (Schema.org). Increase click-through rates in search results and signal high content quality to AI systems.

S

SEO

Schema.org

Vocabulary for structured data on the web. Defines types such as Article, Product, Organization. Embedded in JSON-LD, Microdata, or RDFa. Essential for AI readiness—helps bots understand content semantically.

SEO

SEO (Search Engine Optimization)

Optimization of a website for search engines. Includes on-page (content, meta tags, structure) and off-page (backlinks) measures. Basis for AI readiness, but AI optimization goes beyond this (structured data, semantic HTML).

Technical

Sitemap

XML file that lists all URLs of a website (usually /sitemap.xml). Helps search engines find and index pages. Referenced in robots.txt or submitted via Google Search Console. Important for complete indexing.

Structured Data

Structured Data

Machine-readable data in the code that describes content semantically. Formats: JSON-LD (preferred), microdata, RDFa. Uses Schema.org vocabulary. Enables rich snippets in search engines and better AI understanding.

T

SEO

Title tag

Title of a page in the HTML head. Displayed in browser tabs and search results. Optimal: 50-60 characters. One of the most important on-page SEO factors. Should contain the main keyword and encourage clicks.
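
The length guidelines in this glossary (50-60 characters for titles, 150-160 for meta descriptions) are easy to check in bulk. The helper below is a simple sketch; its name and the sample texts are made up, and the limits are rules of thumb rather than hard cut-offs.

```python
def length_status(text, minimum, maximum):
    """Classify text length against a recommended character range."""
    length = len(text.strip())
    if length < minimum:
        return "too short"
    if length > maximum:
        return "too long"
    return "ok"


# Hypothetical page data.
title = "AI Search Optimization Glossary: Key Terms Explained"
description = "Short glossary of AI search terms."

print("title:", len(title), length_status(title, 50, 60))
print("description:", len(description), length_status(description, 150, 160))
```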

Technical

TTFB (Time to First Byte)

Time until the browser receives the first byte from the server. Good: under 800ms. Needs improvement: 800ms–1.8s. Poor: over 1.8s. High TTFB can cause AI crawlers to abort crawling.
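
The thresholds given in this glossary for INP, LCP and TTFB can be turned into a small rating function. This is a sketch under the assumption that all values are measured in milliseconds; the names are illustrative, and values exactly on a boundary are rated according to the glossary wording ("under" for good, "over" for poor).

```python
# (upper limit for "good", upper limit for "needs improvement"), in ms,
# taken from the glossary entries for INP, LCP and TTFB.
THRESHOLDS_MS = {
    "INP": (200, 500),
    "LCP": (2500, 4000),
    "TTFB": (800, 1800),
}


def rate(metric, value_ms):
    """Rate a measurement as good, needs improvement or poor."""
    good_limit, poor_limit = THRESHOLDS_MS[metric]
    if value_ms < good_limit:
        return "good"
    if value_ms <= poor_limit:
        return "needs improvement"
    return "poor"


# Hypothetical measurements.
for metric, value in (("INP", 150), ("LCP", 3000), ("TTFB", 2000)):
    print(metric, value, rate(metric, value))
```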

SEO

Twitter Card

Meta tags for displaying links on X (Twitter). Types: summary, summary_large_image. Format: <meta name="twitter:card" content="summary_large_image">. Improves visibility when sharing on social media.

U

Technical

User agent

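Identifier sent by a bot or browser. Examples: “Googlebot”, “GPTBot”, “ClaudeBot”. Used in robots.txt to control specific bots. Important: a bot follows only the most specific group that matches it, so define a separate group in robots.txt for each bot you want to treat differently!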

V

Technical

Viewport Meta Tag

HTML tag controlling viewport behavior on mobile devices. Standard: <meta name="viewport" content="width=device-width, initial-scale=1.0">. Without it, the page does not display correctly on smartphones, which matters for mobile-first indexing.

W

Technical

Wildcards (*)

Placeholders in robots.txt for multiple URLs. User-agent: * = all bots. Disallow: /*.pdf = all URLs containing “.pdf”; Disallow: /*.pdf$ = only URLs ending in “.pdf”, since “$” marks the end of the URL. Useful for efficient crawling rules.
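
Wildcard matching can be illustrated by translating a robots.txt path pattern into a regular expression. The sketch below assumes the common interpretation in which “*” matches any sequence of characters and a trailing “$” anchors the end of the URL; the function names are hypothetical, and individual crawlers may differ in details.

```python
import re


def robots_pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal parts and let each "*" match any character sequence.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def matches(pattern, path):
    """Return True if path is covered by the robots.txt pattern."""
    return robots_pattern_to_regex(pattern).match(path) is not None


for pattern in ("/*.pdf", "/*.pdf$"):
    for path in ("/files/report.pdf", "/files/report.pdf?download=1", "/page.html"):
        print(pattern, path, matches(pattern, path))
```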

X

Technical

XML sitemap

Machine-readable sitemap format for search engines (/sitemap.xml). Contains URLs and, optionally, the last modification date and a priority. Helps bots find all pages. Should be linked in robots.txt: “Sitemap: https://example.com/sitemap.xml”.
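
A small XML sitemap can be generated with Python's standard xml.etree.ElementTree. The sketch below writes only the loc and lastmod fields; the function name, URLs and dates are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemap protocol.
NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, last_modified_date) pairs."""
    urlset = ET.Element("urlset", xmlns=NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical pages with ISO dates.
sitemap_xml = build_sitemap(
    [
        ("https://example.com/", "2024-05-01"),
        ("https://example.com/glossary/", "2024-05-10"),
    ]
)
print(sitemap_xml)
```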