CCBot User-Agent & Blocking Rules
Operated by Common Crawl
Category: Scrapers. Safety badge: Caution.
What is CCBot?
CCBot is the web crawler operated by Common Crawl, a non-profit that crawls the web to publish open datasets. The data is useful for research, but the crawler can consume significant bandwidth on the sites it visits.
Is CCBot Safe?
Treat CCBot with caution.
The bot has legitimate uses, but it can be resource-intensive and may be used for competitive intelligence.
User-Agent String used by CCBot
CCBot
How to Block CCBot?
Add the following standard robots.txt rule to prevent CCBot from accessing your site.
User-agent: CCBot
Disallow: /
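To confirm the rule behaves as intended before deploying it, one option is to check it locally. The following is a minimal sketch using Python's standard-library `urllib.robotparser`. The helper name `is_allowed`, the `ROBOTS_TXT` constant, and the `example.com` URL are illustrative assumptions and not part of any CCBot tooling. It only shows how a standards-following parser reads the rule; robots.txt is advisory, so this does not prove what any crawler actually does.

```python
from urllib.robotparser import RobotFileParser

# The rule from this page, written with one directive per line.
ROBOTS_TXT = """\
User-agent: CCBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    Hypothetical helper for illustration: parses the robots.txt text
    in memory instead of fetching it from a live site.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder; substitute a URL from your own site.
    print(is_allowed(ROBOTS_TXT, "CCBot", "https://example.com/page"))      # False
    print(is_allowed(ROBOTS_TXT, "Googlebot", "https://example.com/page"))  # True
```

Because the rule names only CCBot, other user agents remain allowed. To check a live site instead, `RobotFileParser.set_url()` followed by `read()` would fetch the deployed file.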