See what AI crawlers are *allowed* to fetch, and whether your pages are easy to extract into clean text.
Who this is for
SEOs, content teams, and marketers who want to understand how AI crawlers might see a page, and which rules (robots.txt, HTTP headers, WAF blocks) may prevent discovery or clean extraction.
What this tool checks
- robots.txt access for common AI crawlers (Allowed / Blocked / Unknown)
- WAF / bot-block signals (403/429/503 patterns, common block fingerprints)
- llms.txt presence (optional preference file; helpful for some tools)
- Sitemap discovery (common sitemap endpoints + reachability)
- Indexing + snippet directives (meta robots + X-Robots-Tag)
- Page extraction signals (content length, link density, JS render risk, structured data)
- robots.txt builder to generate/copy/download a clean baseline with per-bot overrides
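The first check in the list, per-bot robots.txt access, can be sketched with Python's standard-library `robotparser`. The bot names and the sample robots.txt below are illustrative, not the tool's actual crawler list:

```python
from urllib import robotparser

# Illustrative robots.txt: block one AI bot, allow everyone else
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

# A few commonly cited AI crawler user agents (assumed list)
AI_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "CCBot"]

def crawler_access(robots_txt: str, url: str) -> dict:
    """Return Allowed/Blocked per bot for a given URL, per robots.txt rules."""
    rp = robotparser.RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return {bot: ("Allowed" if rp.can_fetch(bot, url) else "Blocked")
            for bot in AI_BOTS}

# GPTBot is blocked by its own rule; the rest fall through to "User-agent: *"
print(crawler_access(ROBOTS_TXT, "https://example.com/page"))
```

Note that a real check also has to handle the "Unknown" case: robots.txt that is unreachable, returns an error status, or cannot be parsed.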
What it does NOT do
This does not check “rankings” or whether a model will cite you, and it cannot guarantee how any crawler will actually behave. It is a fast, practical diagnostic based on what can be fetched and read.
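Among the checks it does run, the indexing + snippet directives one amounts to scanning meta robots and `X-Robots-Tag` values for directives that limit indexing or snippets. A minimal sketch (the directive set is illustrative, and bot-scoped forms like `googlebot: noindex` are deliberately ignored here):

```python
# Directives that limit indexing or snippet display (assumed subset)
BLOCKING_DIRECTIVES = {"noindex", "none", "nosnippet"}

def blocking_directives(meta_robots=None, x_robots_tag=None):
    """Collect blocking directives found in either the meta robots tag
    or the X-Robots-Tag response header."""
    found = set()
    for value in (meta_robots, x_robots_tag):
        if not value:
            continue
        for token in value.split(","):
            token = token.strip().lower()
            if token in BLOCKING_DIRECTIVES:
                found.add(token)
    return found

print(blocking_directives(meta_robots="noindex, follow"))
```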
How to use it
Paste a page URL → Analyze → review the crawler access table, WAF signals, indexing directives, and extraction signals. To change access rules, use the robots.txt builder to generate a file you can deploy.
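A baseline of the kind the builder produces might look like the following; the bot name, path, and sitemap URL are placeholders, not recommendations:

```
# Default: allow all crawlers
User-agent: *
Allow: /

# Per-bot override (illustrative): keep GPTBot out of a drafts folder
User-agent: GPTBot
Disallow: /drafts/

Sitemap: https://example.com/sitemap.xml
```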