CCBot User-Agent & Blocking Rules

Operated by Common Crawl

Category: Scrapers
Safety badge: Caution

What is CCBot?

CCBot is the web crawler operated by Common Crawl, a non-profit organization that crawls the web and publishes the results as open datasets. The data is useful for research, but the crawl can consume significant bandwidth on the sites it visits.

Owner: Common Crawl

Is CCBot Safe?

Treat CCBot with caution.

CCBot has legitimate uses, but its crawling can be resource-intensive, and the data it collects can be used for competitive intelligence.


User-Agent String used by CCBot

CCBot
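
The token above can be matched against the User-Agent header of incoming requests, for example when filtering server logs. The following is a minimal sketch, assuming a case-insensitive substring match is sufficient; the function name `is_ccbot` and the sample header strings are illustrative assumptions, and the full header sent by the crawler may include a version and URL in addition to the token. A User-Agent header can be spoofed, so a match only indicates that the request claims to be CCBot.

```python
from typing import Optional


def is_ccbot(user_agent_header: Optional[str]) -> bool:
    """Return True if the User-Agent header contains the CCBot token.

    Hypothetical helper: matches the token case-insensitively anywhere
    in the header, since the full string may carry extra details.
    """
    if not user_agent_header:
        return False
    return "ccbot" in user_agent_header.lower()


if __name__ == "__main__":
    # Sample headers are assumptions for illustration only.
    for header in ("CCBot/2.0 (https://commoncrawl.org/faq/)", "Mozilla/5.0"):
        print(header, "->", is_ccbot(header))
```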

How to Block CCBot?

To block CCBot, add the following rule to the robots.txt file at the root of your site. It tells CCBot not to crawl any path on the site.

User-agent: CCBot
Disallow: /
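
One way to check that the rule behaves as intended is to parse it with Python's standard `urllib.robotparser` module. The sketch below is not part of any official tooling; the helper name `is_allowed` and the `example.com` URLs are assumptions for illustration. It parses the rule from a string, so no network request is made.

```python
from urllib.robotparser import RobotFileParser

# The blocking rule from this page, held in a string for local testing.
ROBOTS_TXT = """\
User-agent: CCBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder domain, not a real target.
    print(is_allowed(ROBOTS_TXT, "CCBot", "https://example.com/page"))
    print(is_allowed(ROBOTS_TXT, "Googlebot", "https://example.com/page"))
```

With this rule, the check reports CCBot as disallowed for every path, while crawlers with no matching group remain allowed. To test a live site instead, `RobotFileParser` also offers `set_url()` and `read()` to fetch a robots.txt file over the network.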