Feeding markdown to LLMs is already the standard.
With compression and precise content targeting, you can cut
token usage dramatically.
https://compress.new/https://example.com
HTML was built for browsers. Markdown was built for reading.
When you're feeding content to an LLM, the format matters more than you think.
The same content in different formats consumes wildly different amounts of your context window. HTML is bloated with tags, attributes, and boilerplate that carry zero semantic value for an LLM.
Typical results for a 3,000-word article page. Actual savings vary by content.
Markdown preserves document structure — headings, lists, links, emphasis — with minimal overhead. No closing tags, no attributes, no class names. LLMs already speak it natively, so you get more content per request and more accurate parsing.
Whether you're building agents, running research, or processing content at scale — clean markdown is the starting point.
Give your agent the ability to read any webpage. Connect via MCP or API and let it fetch, compress, and reason over web content autonomously.
Pull articles, documentation, or reports into your LLM context. Strip the noise, keep the signal — analyze multiple sources in a single prompt.
Ingest web content into your retrieval-augmented generation system. Clean markdown chunks better, embeds better, and retrieves more accurately.
Track changes on competitor pages, pricing tables, or documentation. Markdown diffs are clean and meaningful — no noise from layout shifts.
Target specific page sections with CSS selectors. Pull structured data from schema.org tags. Get exactly what you need — nothing more.
Convert technical docs, API references, or help pages into a format your AI assistant can reason over. Feed entire doc sites into your context.
Every feature is opt-in via query parameters.
Combine them to build the exact output
your pipeline needs.
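As a sketch of how those opt-in query parameters compose, here is a small Python helper that builds a compress.new request URL. The helper name and structure are illustrative; only the parameter names (compress, main_only, selector) come from this page.

```python
from urllib.parse import urlencode

def build_compress_url(target: str, **params) -> str:
    """Build a compress.new URL for `target` with opt-in query parameters.

    Parameters documented on this page include compress, main_only,
    and selector; combine any of them as keyword arguments.
    """
    base = "https://compress.new/" + target
    if not params:
        return base
    return base + "?" + urlencode(params)

url = build_compress_url(
    "https://example.com/post",
    main_only="true",
    compress="high",
    selector=".article",
)
```

Each flag simply appends to the query string, so a pipeline can toggle features per request without changing anything else about the call.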
Scans markdown for repeated substrings and replaces them with short tokens like §1,
§2. Presets control aggressiveness: low, med, high, true.
compress (no value) defaults to med.
Repetitive content wastes context window. Compression is only returned when the final output (including note + dictionary) saves at least 5% tokens; otherwise the original markdown is returned.
Removes structural chrome — headers, footers, navigation, sidebars — by stripping common landmark elements and IDs, then isolates the primary content area of the page before selector filtering.
Most webpages are 70%+ boilerplate. Stripping it means your LLM reads the article, not the cookie banner.
Extracts only the DOM nodes matching a CSS selector before conversion. Supports any valid CSS
selector — classes, IDs, attribute selectors, combinators. When combined with main_only=true,
matching happens after main-content cleanup.
When you know exactly where the content lives, a targeted selector gives you a clean extraction with zero noise.
Extracts the page title, description, URL, Open Graph type, and canonical image from meta tags, then prepends them as YAML front matter at the top of the markdown.
Gives your LLM structured context about the source — what the page is, where it came from — without parsing the content itself.
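Downstream code can split that front matter back off before handing the body to a model. This sketch assumes a simple one-line-per-field YAML layout; the field names (title, description, url, type, image) come from this page, but the exact output layout is an assumption.

```python
# Sample converter output with YAML front matter prepended, as described
# above. The document content is invented for illustration.
sample = """---
title: Example Domain
description: An example page
url: https://example.com
type: website
image: https://example.com/og.png
---

# Example Domain
Body text here.
"""

def split_front_matter(markdown: str) -> tuple[dict, str]:
    """Return (metadata, body) from markdown with simple YAML front matter."""
    if not markdown.startswith("---\n"):
        return {}, markdown
    header, _, body = markdown[4:].partition("\n---\n")
    meta = {}
    for line in header.splitlines():
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta, body.lstrip("\n")

meta, body = split_front_matter(sample)
```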
Extracts <script type="application/ld+json"> blocks containing schema.org
data before they are stripped, then appends them as fenced JSON blocks at the end of the output.
Schema.org carries rich semantic data — recipes, products, articles, events — that is invisible in rendered HTML but highly valuable for LLM reasoning.
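Those appended blocks can be pulled back out for structured processing. The sketch below assumes the payloads arrive as standard fenced JSON blocks, as described above; the recipe payload itself is invented.

```python
import json
import re

FENCE = "`" * 3  # triple backticks, built here to keep the example self-contained

# Illustrative converter output: markdown with a schema.org payload
# appended as a fenced JSON block at the end.
output = (
    "# Pancakes\n\nMix and fry.\n\n"
    + FENCE + "json\n"
    + '{"@type": "Recipe", "name": "Pancakes"}\n'
    + FENCE + "\n"
)

def extract_json_ld(markdown: str) -> list[dict]:
    """Return every schema.org payload appended as a fenced JSON block."""
    pattern = re.compile(
        re.escape(FENCE) + r"json\n(.*?)\n" + re.escape(FENCE),
        re.DOTALL,
    )
    return [json.loads(block) for block in pattern.findall(markdown)]

payloads = extract_json_ld(output)
```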
By default, images are stripped from the output to save tokens. This flag preserves them as standard markdown image syntax with alt text and title attributes.
Useful when images carry meaning — diagrams, charts, screenshots — and your downstream model supports vision or you need the URLs.
The body is the converted markdown text. Two custom headers report token counts using o200k_base encoding (GPT-5 compatible). Compression is only kept when total token savings are at least 5%:
Content-Type: text/markdown; charset=utf-8
x-markdown-tokens: 1247 ← always present
x-markdown-tokens-compressed: 312 ← with a valid compress preset; tokens saved vs. uncompressed (0 if savings < 5%)
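A minimal sketch of consuming those headers, assuming a plain dict of header values; the header names and semantics (tokens saved vs. uncompressed) are taken from this page, the function itself is illustrative.

```python
def token_savings(headers: dict) -> float:
    """Percentage of tokens saved, from the x-markdown-* response headers.

    x-markdown-tokens is always present; x-markdown-tokens-compressed
    reports tokens saved vs. uncompressed (0 when savings fell below 5%).
    """
    total = int(headers["x-markdown-tokens"])
    saved = int(headers.get("x-markdown-tokens-compressed", 0))
    return 100.0 * saved / total if total else 0.0

# Header values from the example above: 312 of 1247 tokens saved.
savings = token_savings({
    "x-markdown-tokens": "1247",
    "x-markdown-tokens-compressed": "312",
})
```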
All errors return JSON with a human-readable message:
Content-Type: application/json; charset=utf-8
{"error": "reason the request failed"}
For example: the target site's robots.txt disallows automated access.
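A small sketch of handling both response shapes, assuming you have the Content-Type and body in hand; the error text is the example from above, and the function itself is illustrative.

```python
import json

def parse_response(content_type: str, body: str) -> str:
    """Return markdown on success; raise with the reported reason on error."""
    if content_type.startswith("application/json"):
        # Errors arrive as {"error": "reason the request failed"}
        raise RuntimeError(json.loads(body)["error"])
    return body  # text/markdown; charset=utf-8

ok = parse_response("text/markdown; charset=utf-8", "# Title\n\nBody.")
```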
Connect your AI agent directly to compress.new.
Convert any URL to markdown without leaving your workflow.
The Model Context Protocol is an open standard that lets AI agents call external tools. The compress.new MCP server exposes a fetch_url tool that converts any URL to clean, token-optimized markdown — with full support for all query parameters like compress, main_only, and selector.
Instead of copy-pasting content into your prompt, your agent fetches and compresses it automatically — keeping your context window lean.
Add the server to your agent's MCP configuration. No API key required:
{
"mcpServers": {
"compress-new": {
"url": "https://mcp.compress.new"
}
}
}
Where to place the config depends on your agent:
Claude Code: claude mcp add compress-new https://mcp.compress.new or add to .claude/settings.json
Claude Desktop: claude_desktop_config.json under mcpServers
Cursor: .cursor/mcp.json in your project root
Windsurf: ~/.codeium/windsurf/mcp_config.json
Once connected, your agent can call the tool directly:
// Agent prompt: "Read this article and summarize the key points"
// The agent calls the MCP tool behind the scenes,
// fetches clean markdown, and reasons over it.
Prepend compress.new/ to any URL and you get a markdown version of the page — stripped of navigation, ads, and boilerplate — ready for AI context windows.
Main-content extraction (main_only=true) strips headers, footers, and sidebars to keep only the primary content; CSS selector targeting (selector=".article") lets you extract just the DOM nodes you need (after main-content cleanup when both flags are used); and compression (compress or compress=high) deduplicates repeated text with preset-based aggressiveness — applied only when it actually saves at least 5% of tokens. Together these features can reduce context window usage by up to 90%.
Prepend compress.new/ to any URL in your browser's address bar — for example, compress.new/https://example.com. You can also use it as an API with curl or fetch. Add query parameters like ?main_only=true or ?compress to customize output. Compression presets are low, med, high, and true; omitting the value defaults to med. Compression is returned only when it reduces total tokens by at least 5%.
Compression scans the markdown for repeated substrings and replaces them with short tokens like §1, §2. A compact dictionary is prepended so the model can expand tokens during reasoning. Compression is kept only when it saves at least 5% tokens overall; otherwise compress.new returns the original markdown untouched. The compress param defaults to med when no value is given; use compress=low, compress=med, compress=high, or compress=true to control aggressiveness.
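To make the dictionary mechanics concrete, here is a toy expansion step in Python. The dictionary format (token-to-phrase pairs) and the sample text are assumptions for illustration; only the §1, §2 token style comes from this page.

```python
def expand(compressed: str, dictionary: dict) -> str:
    """Replace §-tokens with their phrases, longest token first so
    a token like §10 is not clobbered by §1."""
    for token in sorted(dictionary, key=len, reverse=True):
        compressed = compressed.replace(token, dictionary[token])
    return compressed

# Hypothetical compressed snippet and its prepended dictionary.
dictionary = {"§1": "retrieval-augmented generation", "§2": "context window"}
text = "§1 pipelines keep the §2 lean; §1 quality depends on clean input."
expanded = expand(text, dictionary)
```

A model reading the compressed output performs this expansion implicitly while reasoning; the dictionary just has to be in context ahead of the tokens it defines.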
compress.new fetches pages with the user agent compress.new/1.0. You can block it by adding the following to your robots.txt:

User-agent: compress.new
Disallow: /

The crawler respects robots.txt directives. You can also block specific paths instead of your entire site by adjusting the Disallow rules.