SEO - AI Discoverability
qoliber SEO AI Discoverability is a stand-alone commercial product, currently available only to licensees of qoliber extensions.
General Information
In 2024 and 2025, the web's largest crawlers stopped being search engines. ChatGPT, Claude, Perplexity, Gemini, Copilot — every major AI assistant now runs a fleet of bots that read public web pages, train on the content, and surface answers without sending the user back to your site. Some merchants want this exposure (it's still distribution, even without a click). Others see it as their content being commercially repurposed without consent.
Either position is defensible — but the choice was, until now, technically painful to express on a Magento store. There's no Magento admin field for "block GPTBot but allow ClaudeBot for chat answers". Most stores ship a static robots.txt written years before any of these bots existed.
The Qoliber SEO AI Discoverability extension gives merchants a single, opt-in admin surface for telling AI crawlers exactly how to treat the store. Three independently-togglable signals — robots.txt per-bot rules, X-Robots-Tag HTTP headers, and an /llms.txt manifest — let you take the policy you want and ship it without writing a line of code.
What Does This Extension Do?
- Per-bot `robots.txt` rules — append `User-agent: <bot>` / `Disallow: /` entries for any AI crawler. The default list is curated to cover GPTBot, ChatGPT-User, OAI-SearchBot (OpenAI), ClaudeBot + anthropic-ai (Anthropic), CCBot (Common Crawl), PerplexityBot, Google-Extended (the Gemini training opt-out that does not affect Search ranking), FacebookBot, Bytespider, Amazonbot.
- `X-Robots-Tag` HTTP header on every frontend response — defaults to `noai, noimageai` (the emerging convention for opting your content out of AI training without affecting search-engine visibility).
- `/llms.txt` endpoint per the llmstxt.org draft spec — an admin-authored, plain-text manifest telling LLMs which pages on your store actually represent your brand. The new convention for "your product page is here, your PR page is here, ignore the affiliate mirror".
- Master switch + per-feature gating — every surface ships disabled. You opt in to each independently after reviewing the defaults.
- Admin-editable bot list and directives — edit the default list in admin as crawler policies evolve. New bot? Add a row. Anthropic renames their crawler? Update one field.
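The per-bot rules above boil down to a simple template applied over the bot list. A minimal sketch of that rendering, assuming the default bots listed above — `robots_blocks` and `AI_BOTS` are illustrative names, not the module's actual API:

```python
# Default AI-crawler list as documented above (illustrative constant).
AI_BOTS = [
    "GPTBot", "ChatGPT-User", "OAI-SearchBot",   # OpenAI
    "ClaudeBot", "anthropic-ai",                 # Anthropic
    "CCBot",                                     # Common Crawl
    "PerplexityBot",
    "Google-Extended",                           # Gemini training opt-out only
    "FacebookBot", "Bytespider", "Amazonbot",
]

def robots_blocks(bots, path="/"):
    """Render one User-agent/Disallow pair per AI crawler."""
    return "\n\n".join(f"User-agent: {bot}\nDisallow: {path}" for bot in bots)

print(robots_blocks(AI_BOTS[:2]))
```

Each bot gets its own `User-agent` group, so removing or adding a crawler later touches only one entry.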
Why Is This a Game Changer?
- Genuinely first to market on Magento. No commercial Magento SEO suite ships AI-crawler controls today. This is the modern equivalent of "we have a robots.txt editor" — and we're the first to ship it for the post-2024 crawler landscape.
- Standards-based, not screen-scraping. We emit the signals OpenAI, Anthropic and the rest of the industry have publicly committed to honour. This isn't a WAF and it isn't user-agent blocking — it's the cooperative-protocol layer.
- Three signals, one decision. A merchant who wants to opt out of AI training flips one master switch and gets `robots.txt`, `X-Robots-Tag` and `/llms.txt` aligned. No "we set the header but forgot the manifest" drift.
- Search-safe. `Google-Extended` (the Gemini-training-only opt-out) is on the default list precisely because blocking it does not affect Google Search ranking. Other modules conflate the two.
How Does It Work?
Three independent surfaces, each gated by its own admin toggle:
- `robots.txt` — a plugin appends per-bot blocks to the file Magento generates. Master switch off → no changes to Magento's default output.
- `X-Robots-Tag` header — a frontend HTTP plugin sets the directive on every storefront response. Disabled → Magento's default header is untouched.
- `/llms.txt` — a custom controller answers the route (Magento's standard router can't serve a `.txt` path with a dot in the segment). Disabled → returns 404, exactly as if the module weren't installed.
This extension does not block traffic at the request layer. It emits opt-out signals — the cooperative side of the protocol. AI crawlers that ignore robots.txt and X-Robots-Tag are unaffected; for active blocking you'll want a WAF (Cloudflare, Fastly, etc.) in addition.
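Before deploying the generated rules, you can confirm they mean what you think using standard robots.txt semantics. A verification sketch with Python's standard-library `urllib.robotparser` (this is a pre-deploy check, not part of the module):

```python
# Check that a per-bot block actually excludes the intended crawler
# while leaving other user agents unaffected.
from urllib import robotparser

ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

rp = robotparser.RobotFileParser()
rp.parse(ROBOTS.splitlines())

print(rp.can_fetch("GPTBot", "https://example.com/any-page"))     # → False (blocked)
print(rp.can_fetch("Googlebot", "https://example.com/any-page"))  # → True (unaffected)
```

Note this check models only the cooperative protocol: a crawler that ignores robots.txt will also ignore this parser's verdict, which is why the paragraph above recommends a WAF for active blocking.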
Example Output
robots.txt (excerpt with the module enabled)
User-agent: GPTBot
Disallow: /
User-agent: ClaudeBot
Disallow: /
User-agent: PerplexityBot
Disallow: /
User-agent: Google-Extended
Disallow: /
X-Robots-Tag HTTP header
X-Robots-Tag: noai, noimageai
/llms.txt (admin-authored)
# MyStore
> Premium yoga apparel and accessories. Family-owned since 2014.
## Pages we recommend
- /about
- /shipping-and-returns
- /our-fabrics
## Pages to ignore
- /search
- /customer/account
- /checkout
Configuration at a glance
Find the module under Stores → Configuration → Qoliber → SEO: AI Discoverability. Master switch on, then opt in to each surface (robots, header, llms.txt) and edit the default content as your policy requires.
What's New in 2.1
First public release of the module. Ships with:
- A default bot list curated against current OpenAI / Anthropic / Common Crawl / Perplexity / Google / Meta / TikTok / Amazon documentation.
- 13 integration tests across three files locking in master/feature gating, whitespace deduplication in `robots.txt`, header overwrite behaviour, and the 404 response when `/llms.txt` is disabled.
- Editorial choice: every surface ships off. Operators must explicitly enable each one after reviewing the defaults.
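The master/feature gating those tests lock in amounts to a two-level AND. A hypothetical sketch of that rule, not the module's actual code — `surface_active` is an illustrative name:

```python
# Illustrative gating logic (assumption: mirrors the behaviour the
# integration tests describe, not the module's implementation).
def surface_active(master_switch: bool, feature_toggle: bool) -> bool:
    """A surface emits its signal only when the master switch
    AND its own feature toggle are both enabled."""
    return master_switch and feature_toggle

# Every surface ships disabled: both flags default to off.
print(surface_active(False, True))   # → False: master off gates everything
print(surface_active(True, False))   # → False: surface not opted in
print(surface_active(True, True))    # → True: explicitly enabled
```

The practical consequence is the one the tests assert: with the master switch off, robots.txt, the header, and `/llms.txt` all behave exactly as if the module were not installed.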