AI-Only Pages
Description
AI-Only Pages gives you granular control over which search engine bots can index each page on your WordPress site — while simultaneously making those pages more discoverable and useful for AI crawlers like ChatGPT, Claude, and Perplexity.
The core idea: you have content that is perfect for AI training pipelines and retrieval-augmented generation (RAG) systems, but you do not want that content competing for rankings in Google, Bing, or Yahoo. AI-Only Pages lets you mark those pages as AI-only: they disappear from traditional search engine indexes while becoming first-class citizens in the AI ecosystem.
What it does
- Per-bot noindex: Block individual bots (Googlebot, Bingbot, YandexBot, etc.) with a checkbox per bot per page. Checking one bot blocks it; the others still index normally.
- “Block All” master toggle: One click blocks all 10 supported search engine bots simultaneously.
- <meta> tags and HTTP headers: Both <meta name="googlebot" content="noindex, nofollow"> HTML meta tags and X-Robots-Tag HTTP headers are emitted, covering all crawling contexts. Works correctly on all public post types including Pages and custom post types.
- SEO plugin integration: Suppresses the global <meta name="robots"> tag from Yoast SEO, WP Core, and RankMath on AI-only pages so there is no conflict between the global tag and your per-bot tags.
- Sitemap exclusion: AI-only pages are automatically removed from the XML sitemaps generated by Yoast SEO and WP Core.
- /llms-index.txt: A plain-text AI discovery file served at yoursite.com/llms-index.txt listing all AI-only pages with their titles and last-modified dates. AI crawlers can use this file to find your AI-optimised content directly. Can be toggled on/off from the settings page.
- Token Diet (clean AI output): When an AI crawler visits an AI-only page, the plugin serves a cleaned version of the HTML with navigation, sidebars, footers, cookie banners, inline styles, SVGs, and iframes stripped out. AI models receive pure content with minimal noise.
- Global Settings Page: A top-level “AI-Only Pages” menu in the WordPress admin sidebar lets you configure Token Diet and LLM Index behaviour globally, without touching code.
- Caching plugin notice: If WP Rocket, LiteSpeed Cache, or another full-page caching plugin is detected, an admin notice explains how to configure it to work alongside this plugin.
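The per-bot meta tags and X-Robots-Tag headers described in the list above could be assembled roughly as follows. This is a minimal sketch, not the plugin's own code: the function name aionly_example_build_noindex_output() is hypothetical, and the bot names passed in are only sample input.

```php
<?php
/**
 * Hypothetical helper: build per-bot noindex output for a list of blocked bots.
 * The function name is an illustrative assumption, not part of the plugin.
 */
function aionly_example_build_noindex_output( array $blocked_bots ) {
    $meta_tags = array();
    $headers   = array();

    foreach ( $blocked_bots as $bot ) {
        // Lowercase the name and drop anything outside a-z, 0-9 and "-",
        // so the value is safe to place in an attribute or a header line.
        $token = preg_replace( '/[^a-z0-9-]/', '', strtolower( $bot ) );
        if ( '' === $token ) {
            continue;
        }

        $meta_tags[] = sprintf( '<meta name="%s" content="noindex, nofollow">', $token );

        // X-Robots-Tag accepts an optional "<user-agent>: " prefix before the directives.
        $headers[] = sprintf( 'X-Robots-Tag: %s: noindex, nofollow', $token );
    }

    return array(
        'meta'    => $meta_tags,
        'headers' => $headers,
    );
}

$output = aionly_example_build_noindex_output( array( 'Googlebot', 'Bingbot' ) );
echo implode( "\n", $output['meta'] ), "\n";
echo implode( "\n", $output['headers'] ), "\n";
```

In a real WordPress context the meta strings would typically be echoed on wp_head and the header strings sent with header() before any output.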
The Settings Page
A full settings page is available under AI-Only Pages in the WordPress admin sidebar. It provides:
Section 1 — Instructions & Status: A “How It Works” guide covering the meta box, Token Diet, and LLM Index. A live, clickable URL to your /llms-index.txt file with a green/red status indicator showing whether the index is active.
Section 2 — LLM Index Settings: A toggle to enable or disable /llms-index.txt globally. When disabled, the endpoint returns a 404.
Section 3 — Token Diet Master Control: A master toggle to enable or disable Token Diet entirely. When off, AI bots receive raw, full HTML — identical to what human visitors see.
Section 4 — Granular Token Diet Stripping: Individual toggles for each category of content stripped:
- Strip structural layout (headers, footers, sidebars, navigation, cookie banners)
- Strip <style> tags and embedded CSS
- Strip <svg> elements (major token bloaters)
- Strip <iframe> elements (maps, embeds, social widgets)
- Strip <form> elements (Warning: removes WooCommerce Add to Cart buttons)
- Strip <script> tags (Note: application/ld+json schema is always preserved)
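To make the tag-stripping categories above concrete, here is a rough sketch of XPath-driven removal with PHP's DOM extension. It is an assumption-laden illustration rather than the plugin's implementation: the function name and every query except //form (which appears in the plugin's own examples) are hypothetical.

```php
<?php
/**
 * Illustrative sketch only: remove elements matching XPath queries from an HTML string.
 * The function name and the query list used below are assumptions.
 */
function aionly_example_strip_tags( $html, array $queries ) {
    $dom = new DOMDocument();

    // Collect libxml warnings (e.g. for <svg>) instead of printing them.
    $previous = libxml_use_internal_errors( true );
    $dom->loadHTML( '<?xml encoding="UTF-8">' . $html );
    libxml_clear_errors();
    libxml_use_internal_errors( $previous );

    $xpath = new DOMXPath( $dom );

    foreach ( $queries as $query ) {
        $nodes = $xpath->query( $query );
        if ( false === $nodes ) {
            continue; // Skip malformed XPath expressions.
        }
        // Snapshot the node list before removing anything from the tree.
        foreach ( iterator_to_array( $nodes ) as $node ) {
            if ( $node->parentNode ) {
                $node->parentNode->removeChild( $node );
            }
        }
    }

    // Return only the contents of <body>, without the wrapper libxml adds.
    $body = $dom->getElementsByTagName( 'body' )->item( 0 );
    $out  = '';
    if ( $body ) {
        foreach ( $body->childNodes as $child ) {
            $out .= $dom->saveHTML( $child );
        }
    }
    return $out;
}

$queries = array(
    '//style',
    '//svg',
    '//iframe',
    '//form',
    '//script[not(@type="application/ld+json")]', // Keep structured data.
);

$html  = '<div><style>p{color:red}</style><p>Hello</p><svg><circle/></svg>';
$html .= '<script>var a=1;</script>';
$html .= '<script type="application/ld+json">{"@type":"Article"}</script></div>';

$clean = aionly_example_strip_tags( $html, $queries );
echo $clean, "\n";
```

The script query excludes application/ld+json by predicate, which is one way to honour the "schema is always preserved" rule in a single pass.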
Supported Search Engine Bots
Googlebot (Web), Googlebot-Image, Googlebot-News, Googlebot-Video, AdsBot-Google, Bingbot, Slurp (Yahoo), DuckDuckBot, Baiduspider, YandexBot.
AI Bots Welcomed
GPTBot, ChatGPT-User, ClaudeBot, PerplexityBot, YouBot, Meta-ExternalAgent, Amazonbot, Bytespider, Diffbot, cohere-ai, anthropic-ai, AI2Bot, OAI-SearchBot, and more. These bots are detected automatically and served cleaned content when they visit an AI-only page.
Developer-Friendly
Every major behaviour is extensible via WordPress filters. See the Developer Reference section below. The Settings class hooks into filters at priority 5, leaving priorities 10 and above free for developer overrides — so your custom add_filter() calls always win.
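The priority rule can be illustrated with a tiny stand-in for the WordPress filter system. The Example_Hooks class below is hypothetical and exists only so the snippet runs outside WordPress; in a real site you would call add_filter() and the plugin would call apply_filters(). The stand-in also passes a single argument, whereas the real aionly_should_clean_output filter also receives the post.

```php
<?php
/**
 * Minimal stand-in for WordPress's filter system, for illustration only.
 */
final class Example_Hooks {
    private static $filters = array();

    public static function add( $hook, $callback, $priority = 10 ) {
        self::$filters[ $hook ][ $priority ][] = $callback;
    }

    public static function apply( $hook, $value ) {
        if ( empty( self::$filters[ $hook ] ) ) {
            return $value;
        }
        $by_priority = self::$filters[ $hook ];
        ksort( $by_priority ); // Lower priority numbers run first.
        foreach ( $by_priority as $callbacks ) {
            foreach ( $callbacks as $callback ) {
                $value = call_user_func( $callback, $value );
            }
        }
        return $value;
    }
}

// Settings page (priority 5) switches Token Diet off...
Example_Hooks::add( 'aionly_should_clean_output', function ( $enabled ) {
    return false;
}, 5 );

// ...and a developer override at the default priority 10 switches it back on.
Example_Hooks::add( 'aionly_should_clean_output', function ( $enabled ) {
    return true;
} );

$result = Example_Hooks::apply( 'aionly_should_clean_output', true );
var_dump( $result ); // bool(true): the priority-10 callback ran last.
```

Because each callback receives the previous callback's return value, whichever one runs last decides the final result. An override registered at a priority below 5 would therefore be overwritten by the settings callback.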
Using the Plugin
Per-page control
- Open any post or page in the WordPress editor.
- Find the AI-Only Pages meta box in the right sidebar.
- Check individual bots to block them, or use Block from ALL search engine bots to check all at once.
- Click Publish or Update to save. The noindex tags take effect immediately.
- Visit yoursite.com/llms-index.txt to confirm your page appears in the AI content index.
Note: The master toggle requires JavaScript. The individual checkboxes always work regardless of JS state.
Global settings
- Go to AI-Only Pages in the WordPress admin sidebar.
- Review the “How It Works” section and confirm your /llms-index.txt URL is live.
- Use the LLM Index Settings card to enable or disable the discovery file.
- Use the Token Diet — Master Control card to enable or disable all output cleaning.
- Use the Token Diet — Granular Stripping card to select exactly which HTML elements are stripped from AI output.
- Click Save Settings.
Developer Reference
All filters are applied inside AIOnly\Pages\Plugin. The Settings class hooks at priority 5; standard developer priority is 10+.
aionly_ai_crawler_signatures
Array of User-Agent substrings used for Layer 1 bot detection.
@param string[] $signatures
@return string[]
aionly_strip_selectors
CSS-style selector strings passed to Pass 1 of Token Diet (structural removal).
Supports element tag, #id, and .class (one class, no combinators).
@param string[] $selectors
@return string[]
aionly_strip_token_bloat_tags
XPath query strings passed to Pass 2 of Token Diet (tag removal).
@param string[] $queries
@return string[]
aionly_allowed_attributes
HTML attribute names kept on every element by Pass 3 of Token Diet.
Everything else is stripped.
@param string[] $attributes
@return string[]
aionly_should_clean_output
Boolean. Return false to disable Token Diet entirely for a specific post.
@param bool $enabled Default: true.
@param \WP_Post $post
@return bool
aionly_enable_xrobots_headers
Boolean. Return false to suppress X-Robots-Tag HTTP headers.
@param bool $enabled Default: true.
@param \WP_Post $post
@return bool
aionly_cache_ttl
Filter the transient TTL in seconds.
@param int $ttl Default: 600 (10 minutes).
@return int
aionly_llms_index_lines
Filter the array of text lines that make up llms-index.txt before output.
@param string[] $lines Array of lines (including comment lines).
@param int[] $active_ids Post IDs included in the index.
@return string[]
aionly_supported_post_types
Array of public post type slugs the plugin should support.
@param string[] $post_types
@return string[]
aionly_use_heuristic_bot_detection
Boolean. Return false to disable Layer 2 heuristic bot detection.
@param bool $enabled Default: true.
@return bool
Code Examples
Disable heuristic bot detection (uptime monitors):
add_filter( 'aionly_use_heuristic_bot_detection', '__return_false' );
Preserve WooCommerce forms (developer override — wins over settings page):
add_filter( 'aionly_strip_token_bloat_tags', function( $queries ) {
return array_filter( $queries, function( $q ) {
return $q !== '//form';
} );
} );
Add a custom strip selector:
add_filter( 'aionly_strip_selectors', function( $selectors ) {
$selectors[] = '.advertisement';
$selectors[] = '#newsletter-popup';
return $selectors;
} );
Keep class attributes in AI output:
add_filter( 'aionly_allowed_attributes', function( $attrs ) {
$attrs[] = 'class';
return $attrs;
} );
Add a custom AI crawler signature:
add_filter( 'aionly_ai_crawler_signatures', function( $sigs ) {
$sigs[] = 'FutureBot';
return $sigs;
} );
Restrict to specific post types:
add_filter( 'aionly_supported_post_types', function( $types ) {
return [ 'post', 'page' ]; // Only posts and pages.
} );
Disable Token Diet on a specific post (always wins, priority 10 > settings priority 5):
add_filter( 'aionly_should_clean_output', function( $enabled, $post ) {
if ( 42 === $post->ID ) {
return false; // Post 42 serves full HTML to AI bots.
}
return $enabled;
}, 10, 2 );
Read a single setting value in custom code:
$token_diet_on = '1' === \AIOnly\Pages\Settings::get( 'token_diet_enabled' );
$all_settings = \AIOnly\Pages\Settings::get_settings(); // Full array.
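Three filters from the Developer Reference have no example above: aionly_cache_ttl, aionly_llms_index_lines, and aionly_enable_xrobots_headers. The sketch below shows one plausible callback for each. The function names, the 30-minute TTL, post ID 42, and the assumption that comment lines in llms-index.txt start with "#" are all illustrative, not confirmed by the plugin.

```php
<?php
/**
 * Hypothetical callbacks for three filters listed in the Developer Reference.
 * All function names and values here are illustrative assumptions.
 */

// Cache for 30 minutes instead of the documented default of 600 seconds.
function my_aionly_cache_ttl( $ttl ) {
    return 30 * 60;
}

// Append a comment line to llms-index.txt.
// Assumption: comment lines in the file start with "#".
function my_aionly_llms_index_lines( $lines, $active_ids ) {
    $lines[] = '# Pages listed: ' . count( $active_ids );
    return $lines;
}

// Suppress X-Robots-Tag headers for one post only.
function my_aionly_enable_xrobots_headers( $enabled, $post ) {
    if ( isset( $post->ID ) && 42 === (int) $post->ID ) {
        return false;
    }
    return $enabled;
}

// Register only when WordPress is loaded, so the file also runs standalone.
if ( function_exists( 'add_filter' ) ) {
    add_filter( 'aionly_cache_ttl', 'my_aionly_cache_ttl' );
    add_filter( 'aionly_llms_index_lines', 'my_aionly_llms_index_lines', 10, 2 );
    add_filter( 'aionly_enable_xrobots_headers', 'my_aionly_enable_xrobots_headers', 10, 2 );
}
```

The fourth argument to add_filter() is the number of arguments the callback accepts. It has to be 2 for the two filters that also receive $active_ids or $post.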
Installation
Automatic installation
- In your WordPress admin, go to Plugins → Add New.
- Search for “AI-Only Pages”.
- Click Install Now, then Activate.
- After activation, go to Settings → Permalinks and click Save Changes to flush rewrite rules so /llms-index.txt begins working immediately.
- Visit AI-Only Pages in the sidebar to configure global settings.
Manual installation
- Download the plugin zip file.
- Upload the ai-only-pages folder to /wp-content/plugins/.
- Activate the plugin from the Plugins menu.
- Go to Settings → Permalinks and click Save Changes.
Plugin folder structure
After installation the plugin occupies exactly this structure inside
/wp-content/plugins/ai-only-pages/:
ai-only-pages/
├── ai-only-pages.php          Root loader. Contains the plugin header WordPress
│                              reads for name/version. Performs PHP and WP version
│                              gates. Registers activation/deactivation hooks.
│                              Contains zero modern PHP syntax so it is safe to
│                              parse on PHP 5.x without fatal errors.
│
├── includes/
│   ├── class-plugin.php       The core plugin class. All bot detection, meta
│   │                          boxes, output buffering, Token Diet, LLM Index,
│   │                          and SEO plugin overrides live here.
│   │
│   └── class-settings.php     The Settings class. Registers the top-level admin
│                              menu, the settings page, and all WordPress Settings
│                              API fields. Hooks into core plugin filters at
│                              priority 5 to alter behaviour dynamically from
│                              saved options.
│
├── assets/
│   └── js/
│       └── admin.js           Vanilla JavaScript for the meta box. Handles the
│                              "Block from ALL" master toggle and keeps it in
│                              sync with individual bot checkboxes. No jQuery.
│                              Enqueued only on post.php and post-new.php.
│
├── uninstall.php              Clean removal. Deletes all plugin options and
│                              post meta when the plugin is deleted via the
│                              WordPress admin Plugins screen.
│
└── readme.txt                 This file.
Why this structure?
The split between ai-only-pages.php and class-plugin.php is intentional and critical. PHP must parse the whole root plugin file before it can execute any of it, including the version gate. If the root file used modern PHP syntax and the site ran PHP 7.0, PHP would raise a fatal parse error before the version gate could display a helpful admin notice. Keeping the root file at PHP 5.4 syntax means the gate always runs and users always see a readable error instead of a white screen.
Both class-plugin.php and class-settings.php use PHP 7.4+ syntax and are both loaded by the root loader after the version gates have passed. No manual wiring is required.
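A version gate of the kind described above might look like the following sketch. It deliberately avoids newer syntax (no typed properties, no arrow functions) so that an old PHP version can still parse it. The function names and notice wording are hypothetical; only the 7.4 minimum comes from the text above.

```php
<?php
/**
 * Illustrative version gate written in PHP 5.x-compatible syntax.
 * Function names and the notice text are assumptions for this sketch.
 */
function aionly_example_php_is_supported( $current_version, $minimum_version ) {
    return version_compare( $current_version, $minimum_version, '>=' );
}

function aionly_example_php_notice() {
    echo '<div class="notice notice-error"><p>AI-Only Pages requires PHP 7.4 or newer.</p></div>';
}

function aionly_example_bootstrap( $current_version ) {
    if ( ! aionly_example_php_is_supported( $current_version, '7.4' ) ) {
        // Show a readable admin notice instead of loading files that would not parse.
        if ( function_exists( 'add_action' ) ) {
            add_action( 'admin_notices', 'aionly_example_php_notice' );
        }
        return false;
    }

    // In a real loader, the require_once calls for the includes/ files would go here.
    return true;
}

aionly_example_bootstrap( PHP_VERSION );
```

The important property is the early return. The files that use PHP 7.4+ syntax are never included on an unsupported version, so PHP never attempts to parse them.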
FAQ
If a page you marked AI-only still appears in Google, Google may have crawled and cached it before you activated the plugin. It can take days or weeks for Google to re-crawl the page and respect the new noindex directive. If you need faster removal, submit the URL to the Removals tool in Google Search Console.
Also verify that the noindex tag is actually appearing on the page: view source and search for <meta name="googlebot".
If noindex tags are missing on WordPress Pages, you are probably running a version older than 1.3.1. In those versions, both output_noindex_tags() and output_xrobots_headers() incorrectly checked publicly_queryable to decide whether to proceed. WordPress’s built-in page post type has publicly_queryable = false, so both methods silently bailed out without writing any tags. Update to 1.3.1 or later to resolve this.
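The difference between the two flags can be shown with stand-in objects. In WordPress the real values come from get_post_type_object(); the arrays below only mirror the two relevant flags for the built-in post and page types. The function names are hypothetical. The "new" check reflects the 1.3.1 changelog entry, which says the fix was to check public instead.

```php
<?php
// Stand-ins for the objects get_post_type_object() returns in WordPress.
// Only the two flags relevant to the 1.3.1 fix are modelled.
$post_types = array(
    'post' => (object) array( 'public' => true, 'publicly_queryable' => true ),
    'page' => (object) array( 'public' => true, 'publicly_queryable' => false ),
);

// Pre-1.3.1 behaviour as described: gate on publicly_queryable.
function aionly_example_old_check( $type ) {
    return ! empty( $type->publicly_queryable );
}

// 1.3.1 behaviour as described in the changelog: gate on public.
function aionly_example_new_check( $type ) {
    return ! empty( $type->public );
}

foreach ( $post_types as $slug => $type ) {
    printf(
        "%s: old check %s, new check %s\n",
        $slug,
        aionly_example_old_check( $type ) ? 'passes' : 'bails out',
        aionly_example_new_check( $type ) ? 'passes' : 'bails out'
    );
}
```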
The plugin works alongside Yoast SEO and RankMath. It suppresses the global <meta name="robots"> tag that these plugins output on AI-only pages. Without this override, Yoast might output <meta name="robots" content="index, follow">, which would conflict with the per-bot tags. On AI-only pages, only the per-bot tags remain.
If /llms-index.txt returns a 404, first check that LLM Index is enabled on the AI-Only Pages settings page.
If it is enabled, go to Settings → Permalinks and click Save Changes without changing anything. This flushes WordPress’s rewrite rules and registers the /llms-index.txt URL pattern.
This flush happens automatically on plugin activation, but some server configurations (particularly Nginx without try_files) may need a manual flush or a server config update.
If AI crawlers still receive uncleaned HTML, a full-page cache may be the cause. Full-page caching plugins (WP Rocket, LiteSpeed, W3 Total Cache, etc.) serve responses from a disk cache before WordPress runs, so the plugin’s output buffer never fires on cached pages.
To fix this, configure your caching plugin to exclude AI-Only page URLs from its cache:
WP Rocket: Settings → Cache → Never Cache URL(s). Add the slug of each AI-only page.
LiteSpeed Cache: LiteSpeed Cache → Cache → Do not cache URIs.
W3 Total Cache: Performance → Page Cache → Never cache the following pages.
Alternatively, add a custom rule to exclude pages with the _aionly_active cookie, or contact your host’s support team — managed WordPress hosts often expose this setting in their dashboard.
If an uptime monitor or CLI tool receives cleaned output, heuristic detection is the likely cause. The plugin uses two-layer bot detection. Layer 1 matches known AI crawler signatures. Layer 2 (heuristic) flags requests that have no browser engine string in the User-Agent AND no Accept-Language header. Real browsers send both, but many CLI tools and monitoring services send neither.
The simplest fix is to configure your monitoring tool to send an Accept-Language header. Alternatively, disable heuristic detection entirely:
add_filter( 'aionly_use_heuristic_bot_detection', '__return_false' );
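The two layers can be sketched as a single function. This is an approximation under stated assumptions: the function name, the short signature list, and the browser-engine markers are illustrative choices, not the plugin's actual lists.

```php
<?php
/**
 * Illustrative two-layer check. The signature list and browser-engine markers
 * below are assumptions for the sketch, not the plugin's actual lists.
 */
function aionly_example_is_ai_crawler( $user_agent, $accept_language, $use_heuristics = true ) {
    // Layer 1: case-insensitive substring match against known crawler signatures.
    $signatures = array( 'GPTBot', 'ClaudeBot', 'PerplexityBot' );
    foreach ( $signatures as $signature ) {
        if ( false !== stripos( $user_agent, $signature ) ) {
            return true;
        }
    }

    if ( ! $use_heuristics ) {
        return false;
    }

    // Layer 2: flag only when there is no browser engine marker
    // AND no Accept-Language header.
    $engine_markers = array( 'Mozilla', 'AppleWebKit', 'Gecko', 'Chrome', 'Safari' );
    foreach ( $engine_markers as $marker ) {
        if ( false !== stripos( $user_agent, $marker ) ) {
            return false;
        }
    }

    return '' === trim( (string) $accept_language );
}

// A monitoring tool that sends Accept-Language is not flagged.
var_dump( aionly_example_is_ai_crawler( 'curl/8.4.0', 'en-US' ) ); // bool(false)
```

Passing false as the third argument mirrors what returning false from aionly_use_heuristic_bot_detection does: only the signature match remains.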
Token Diet removes WooCommerce add-to-cart buttons if “Strip <form> elements” is enabled in the settings (it is by default), because WooCommerce renders those buttons inside <form> elements. AI crawlers cannot interact with forms anyway; they only read content. If you want AI crawlers to see your product CTAs, turn off “Strip <form> elements” on the AI-Only Pages settings page, or add a developer filter:
add_filter( 'aionly_strip_token_bloat_tags', function( $queries ) {
return array_filter( $queries, function( $q ) {
return $q !== '//form';
} );
} );
Disabling a Token Diet toggle does not break anything. It simply passes more of the original HTML through to the AI crawler, so the page may contain more noise that uses up the crawler’s context window. The defaults are optimised for maximum signal-to-noise ratio.
Changing settings does not require a permalink flush. Settings only affect the output buffer and filter callbacks and have no impact on WordPress rewrite rules. Changes take effect on the very next AI crawler request.
All settings are stored in a single wp_options row with the key aionly_pages_settings as a serialised array. You can inspect or export it like any other WordPress option.
class-settings.php uses the same public add_filter() hooks that the core plugin exposes to developers. Specifically:
- aionly_should_clean_output: used to disable Token Diet when the master toggle is off.
- aionly_strip_token_bloat_tags: used to build a dynamic XPath query array from the granular toggles.
- aionly_strip_selectors: used to empty the structural selector list when layout stripping is off.
- template_redirect at priority 0: used to return a 404 for /llms-index.txt when the LLM Index is disabled.
The three filter callbacks run at priority 5, which means developer overrides at priority 10 (the WordPress default) run later and take precedence.
Changelog
1.3.3 — 2026-03-11
- Fixed: Missing assets/js/admin.js. The meta box “Block from ALL” master toggle was non-functional in 1.3.2 because the JavaScript file was omitted from the release package.
- Added: uninstall.php for clean removal of all plugin data (aionly_pages_settings option and all _aionly_* post meta) when the plugin is deleted via the WordPress admin.
- Added: LICENSE file (GPLv2 full text).
- Updated: Tested up to bumped to WordPress 6.9.2.
1.3.2 — 2026-03-01
- Fixed: Output buffer opened by Token Diet (ob_start()) is now explicitly closed via a shutdown hook, preventing potential buffer-stack conflicts with other plugins. Addresses WordPress.org Plugin Review Team feedback.
1.3.1 — 2026-02-20
- Fixed: Noindex <meta> tags and X-Robots-Tag headers were not emitted on WordPress Pages and non-post custom post types. Both methods incorrectly checked publicly_queryable; WordPress’s built-in page post type has this set to false, causing both to silently return without writing any tags. Fixed by checking public instead.
- Fixed: Settings page CSS was not loading on some WordPress setups. wp_add_inline_style() was attached to the wp-admin handle, which is not guaranteed to be registered in the required state. Fixed by registering a dedicated aionly-settings-ui handle and attaching inline CSS to that.
- Fixed: Settings admin menu was not appearing because class-settings.php was missing its require_once in the root loader.
- Fixed: Removed placeholder Plugin URI header pointing to a non-existent URL, which produced a broken “Visit plugin site” link in the Plugins list.
- Cleaned: Removed dead add_settings_section() and add_settings_field() calls that had no effect since do_settings_sections() is never called.
- Security: $_SERVER['HTTP_USER_AGENT'] and $_SERVER['HTTP_ACCEPT_LANGUAGE'] now passed through sanitize_text_field( wp_unslash() ) before use.
- i18n: Added load_plugin_textdomain() so translations load correctly.
1.3.0 — 2026-02-19
- New: includes/class-settings.php, a full WordPress Settings API integration adding a top-level “AI-Only Pages” admin menu with four visual cards.
- New: LLM Index toggle to enable/disable /llms-index.txt globally. When disabled, the endpoint returns a 404.
- New: Token Diet master toggle to enable/disable all AI output cleaning globally without touching code.
- New: Six granular Token Diet toggles to independently control stripping of structural layout, <style> tags, <svg> elements, <iframe> elements, <form> elements, and <script> tags (with the application/ld+json schema preservation guarantee always enforced).
- New: Live /llms-index.txt URL displayed on the settings page with a green/red status badge.
- New: “How It Works” explainer built into the settings page, so there is no need to consult the readme for basic orientation.
- Architecture: Settings class hooks into core plugin filters at priority 5, ensuring developer add_filter() calls at priority 10+ always override settings-page values.
- Architecture: All settings stored in one wp_options row (aionly_pages_settings) to minimise database overhead.
1.2.1 — 2026-02-19
- Fixed: Restored Yoast SEO, WP Core, and RankMath global robots override filters that were inadvertently removed in v1.2.0. Without these, Yoast’s global <meta name="robots" content="index, follow"> tag overrode per-bot noindex tags, so the core feature was broken for sites using Yoast or RankMath.
- Fixed: Double-encoding bug in the “AI-optimized & listed” status badge. esc_html_e() was applied to a string already containing &amp;, producing &amp;amp;, which rendered as a literal “&amp;” in the browser.
- Fixed: save_meta_data() now always syncs the _aionly_active derived flag on every valid save, not only when individual bot values change. This self-heals any flag desync caused by direct DB edits, imports, or third-party plugins.
- Fixed: Pro upsell link now includes rel="noopener noreferrer" on target="_blank" to prevent reverse tabnapping.
- Improved: admin.js now listens to the change event instead of click. The change event is the semantically correct event for checkbox state and handles keyboard (Space bar) and programmatic changes correctly.
- Improved: Added function_exists() guards to all global functions in the root file to prevent “Cannot redeclare function” fatal errors if the file is somehow processed more than once.
1.2.0 — 2026-02-19
- Fixed: Asset path bug. PHP enqueued assets/js/admin.js but the file was located at assets/admin.js. The JS file 404’d and the “Block from ALL” button was dead on arrival.
- Fixed: DOMContentLoaded wrapper removed from admin.js. Scripts enqueued with in_footer=true execute after that event fires; the callback was never running.
- Fixed: admin.js now uses classList API instead of fragile className.indexOf() string matching for class detection.
- Fixed: Restored missing before_delete_post cache-clearing logic that was inadvertently merged with transition_post_status into a single variadic function.
- Fixed: Heuristic bot detection (Layer 2) restored after it was silently removed in a prior refactor.
- Fixed: junk_queries loop now uses iterator_to_array() to snapshot the live DOMNodeList before iterating. Iterating a live list while removing nodes caused silent skips.
- Improved: All inline attribute iteration now collects attributes into an array before removal, preventing NamedNodeMap reindexing skips.
- Improved: Post ID resolved explicitly from $_GET['post'] / $_POST['post_ID'] in enqueue_admin_assets(), removing reliance on the implicit global $post.
1.1.6 — 2026-02-19
- Major stability pass following an external AI code review that missed three deployment-blocking bugs while reporting “zero bugs found.”
- Fixed JS path (assets/js/ subfolder), DOMContentLoaded timing, and class name mismatch between PHP and JS.
1.1.3 — 2026-02-19
- Fixed plugin_dir_url() path calculation using dirname(dirname(__FILE__)).
- Fixed XPath injection via unsanitized filter values in aionly_strip_token_bloat_tags.
- Fixed get_post() fragility; the post ID is now resolved explicitly from request superglobals.
1.1.2 — 2026-02-19
- Introduced Token Diet V2 with three-pass HTML cleaning (structural, bloat, attributes).
- Added master “Block All” toggle with JavaScript event delegation.
- Added X-Robots-Tag HTTP headers alongside HTML meta tags.
- Added Yoast SEO, WP Core, and RankMath robots override filters.
- Added heuristic bot detection layer (no browser UA markers + no Accept-Language).
1.1.0
- Added /llms-index.txt discovery file with transient caching.
- Added caching plugin compatibility notice.
- Added sitemap exclusion for Yoast SEO and WP Core sitemaps.
1.0.3
- Fixed nonce verification — nonces are now post-specific to prevent cross-post replay.
- Fixed capability check using post type’s own capability type.
- Added transition_post_status hook to clear transient cache on post status changes.
1.0.1
- Initial public release.
- Per-bot noindex meta box with 10 supported search engine bots.
- Transient-cached active post ID query.
- Activation/deactivation rewrite rule management.