Markdown for AI Agents
Description
Markdown for AI Agents is a lightweight WordPress plugin that enables HTTP content negotiation for your site’s content. When a client (like an AI agent or a custom script) requests a page with the Accept: text/markdown header, the plugin intercepts the request and returns a clean, structured Markdown representation of the post or page content.
This is ideal for AI crawlers, RAG (Retrieval-Augmented Generation) systems, and non-browser clients that prefer machine-friendly text over complex HTML.
Important note: This plugin is primarily a developer/integration tool. Human visitors browsing your site will never see any difference — the Markdown output is only served when explicitly requested via the Accept: text/markdown HTTP header. Normal browser requests always receive the standard HTML page.
Key Features:
- Automatically detects Accept: text/markdown headers.
- Converts HTML content to clean Markdown using the League HTMLToMarkdown library.
- Strips away theme layout, navigation, headers, footers, and sidebars — serving only the main content.
- Adds useful HTTP response headers: Content-Type: text/markdown, Vary: Accept, and X-Markdown-Word-Count.
- Respects WordPress visibility rules and filters.
- No configuration required — works out of the box for posts, pages, and custom post types.
How It Works
This plugin uses a standard web technique called HTTP content negotiation. The same URL on your site can serve different representations of the same content depending on what the client asks for:
- A regular browser sends Accept: text/html and receives your normal HTML page.
- An AI agent sends Accept: text/markdown and receives a clean Markdown version of the same page.
No extra URLs, no duplicate content, no configuration needed. The plugin hooks into WordPress’s template_redirect action, detects the Accept header, captures the rendered HTML, converts it to Markdown, and returns it with appropriate headers.
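The detection step can be illustrated with a minimal Python sketch. This is not the plugin's PHP code: the function names are hypothetical, and the matching rule (an exact text/markdown media type with a nonzero q-value) is an assumption, since the description above does not say how the plugin parses the Accept header.

```python
def wants_markdown(accept_header: str) -> bool:
    """Return True if the Accept header lists text/markdown with a nonzero q-value.

    Assumption: only an explicit text/markdown entry counts; wildcards such
    as */* do not trigger the Markdown representation.
    """
    for part in accept_header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if media_type != "text/markdown":
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        return q > 0
    return False


def choose_representation(accept_header: str) -> str:
    """Pick which representation of the same URL to serve."""
    return "markdown" if wants_markdown(accept_header) else "html"
```

Under these assumptions, a typical browser header such as text/html,application/xhtml+xml,*/*;q=0.8 selects the HTML representation, while a bare text/markdown selects Markdown.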
Why Markdown for AI Agents?
When building RAG (Retrieval-Augmented Generation) applications or AI pipelines that ingest web content, HTML is extremely noisy. A typical WordPress page contains thousands of tokens worth of HTML tags, inline styles, navigation menus, scripts, and layout markup — none of which carries meaning for an AI model.
Serving clean Markdown instead can reduce token consumption by up to 60%, which means:
- Lower API costs — fewer tokens ingested when loading pages into vector stores or LLM pipelines.
- Faster processing — less text for the model to parse, filter, and discard.
- Better retrieval accuracy — higher signal-to-noise ratio improves the quality of RAG results.
- Simpler pipelines — no need for custom HTML stripping logic on the client side; the plugin handles it server-side.
Any AI agent, crawler, or ingestion script that sends Accept: text/markdown in its request header will automatically receive the clean Markdown version — no extra URLs, no separate endpoints, no changes to your content workflow.
Installation
- Upload the markdown-for-ai-agents folder to the /wp-content/plugins/ directory, or install it directly via the WordPress Plugins screen.
- Activate the plugin through the Plugins menu in WordPress.
- The plugin works immediately with no settings to configure.
Verifying it works — Option 1: Using curl (recommended for developers)
Open your terminal and run the following command, replacing the URL with any post or page on your site:
curl -H "Accept: text/markdown" https://your-site.com/sample-page/
You should see plain Markdown text returned instead of HTML. For example:
curl -H "Accept: text/markdown" https://your-site.com/hello-world/
A successful response will begin with the post title as a Markdown heading (e.g. # Hello World) followed by the post content in Markdown format, with no HTML tags, navigation, or sidebar content.
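For scripted checks, the same request can be made from Python's standard library. The sketch below is one possible client, not part of the plugin; the helper names are hypothetical and https://your-site.com/ is the same placeholder used in the curl examples.

```python
import urllib.request


def build_markdown_request(url: str) -> urllib.request.Request:
    """Build a GET request that asks for the Markdown representation."""
    return urllib.request.Request(url, headers={"Accept": "text/markdown"})


def fetch_markdown(url: str) -> str:
    """Fetch a page as Markdown, failing loudly if HTML comes back instead."""
    with urllib.request.urlopen(build_markdown_request(url), timeout=10) as resp:
        content_type = resp.headers.get("Content-Type", "")
        if not content_type.startswith("text/markdown"):
            raise ValueError(f"expected text/markdown, got {content_type!r}")
        return resp.read().decode("utf-8")
```

Calling fetch_markdown("https://your-site.com/hello-world/") performs a real network request, so replace the placeholder with a URL from your own site before running it.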
Verifying it works — Option 2: Using a browser extension (no terminal required)
- Install the ModHeader extension for Chrome or Firefox (free, available in their respective extension stores).
- Open ModHeader and add a new request header: Name = Accept, Value = text/markdown.
- Enable the header and visit any post or page on your WordPress site.
- Your browser will display the raw Markdown text of that page instead of the styled HTML version.
- Disable or remove the header in ModHeader when you are done to return to normal browsing.
Verifying it works — Option 3: Using an online HTTP client
Tools like Hoppscotch (free, browser-based) allow you to make HTTP requests with custom headers without installing anything:
- Go to https://hoppscotch.io
- Enter any post or page URL from your site.
- Under Headers, add Accept = text/markdown.
- Click Send and you will see the Markdown response in the response panel.
FAQ
Human visitors are not affected. Standard browser requests always receive the normal HTML version of your pages. The Markdown output is only served when a client explicitly includes Accept: text/markdown in its HTTP request header. No regular browser sends this header by default.
All singular content types are supported: standard Posts, Pages, and any registered Custom Post Types. Archive pages, category pages, and the homepage (if set to a blog feed) are not served as Markdown.
To confirm the plugin is working, use any of the three verification methods described in the Installation section above. The quickest check is: curl -H "Accept: text/markdown" https://your-site.com/sample-page/. A working response will return plain text starting with your post title as a Markdown # heading.
The Markdown output contains the full rendered post or page content — the title and body — converted to Markdown. Navigation menus, sidebars, footers, and <script> and <style> tags are automatically stripped out to provide a clean, token-efficient result for AI consumption.
The response includes:
- Content-Type: text/markdown; charset=utf-8 (tells the client the content format).
- Vary: Accept (informs caches that the response varies based on the Accept header, preventing cached HTML from being served to Markdown clients).
- X-Markdown-Generator: Markdown for AI Agents (identifies the plugin).
- X-Markdown-Word-Count: [number] (the word count of the Markdown content).
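As a rough illustration, the header set for a Markdown response could be assembled like this in Python. The function name is hypothetical, and the word count here is a plain whitespace split, which is an assumption: the plugin's actual counting rule is not documented above and may differ (for example, in whether a leading # counts as a word).

```python
from typing import Dict


def markdown_response_headers(markdown: str) -> Dict[str, str]:
    """Build the response headers listed above for a Markdown body."""
    # Assumption: words are whitespace-separated tokens, Markdown markers included.
    word_count = len(markdown.split())
    return {
        "Content-Type": "text/markdown; charset=utf-8",
        "Vary": "Accept",
        "X-Markdown-Generator": "Markdown for AI Agents",
        "X-Markdown-Word-Count": str(word_count),
    }
```

A client can read X-Markdown-Word-Count to estimate document size before deciding whether to ingest the body.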
SEO and caching are not adversely affected. The Vary: Accept header is set on Markdown responses so that HTTP caches (including CDNs) correctly cache HTML and Markdown separately. Search engine crawlers do not send Accept: text/markdown headers, so they will always receive and index the normal HTML version of your pages.
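The effect of Vary: Accept on a cache can be sketched with a toy in-memory cache in Python. This is a simplified model, not how any particular CDN works: the class name is hypothetical, and it collapses the Accept header into two variants, whereas real caches typically key on the header value itself.

```python
from typing import Dict, Optional, Tuple


class VaryAwareCache:
    """Toy cache keyed by URL plus the negotiated representation."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, str], str] = {}

    @staticmethod
    def _variant(accept_header: str) -> str:
        # Simplification: any mention of text/markdown selects the Markdown variant.
        return "markdown" if "text/markdown" in accept_header.lower() else "html"

    def get(self, url: str, accept_header: str) -> Optional[str]:
        """Return the cached body for this URL and variant, or None on a miss."""
        return self._store.get((url, self._variant(accept_header)))

    def put(self, url: str, accept_header: str, body: str) -> None:
        """Store a body under this URL and the variant implied by the Accept header."""
        self._store[(url, self._variant(accept_header))] = body
```

Because the variant is part of the key, an HTML page cached for a browser is never returned to a client that asked for Markdown, and vice versa.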
The League HTMLToMarkdown library is bundled inside the plugin under includes/lib/HtmlToMarkdown/. No additional installation steps are required.
Page builders are supported. Because the plugin captures the fully rendered HTML output (after WordPress and any theme or plugin has finished building the page), it works regardless of whether your content is built with the Classic Editor, Gutenberg blocks, Elementor, or any other page builder.
Changelog
1.0.0
- Initial release.
