Statistics of Common Crawl Monthly Archives

Statistics of Common Crawl's web archives, released on a monthly basis:

size of the crawls - number of pages, unique URLs, hosts, domains, top-level domains (public suffixes), cumulative growth of crawled data over time
top-level domains - distribution and comparison
top-500 registered domains
crawler-related metrics - fetch status, etc.
overlaps between monthly crawls
distribution of

All metrics presented here are generated from Common Crawl's URL index data using the code of the cc-crawl-statistics project. The statistics are inspired by Sebastian Spiegler's Statistics of the Common Crawl Corpus 2012.
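
To give a rough idea of how size metrics such as pages, unique URLs and hosts can be derived from URL index records, here is a minimal Python sketch. It is not the code of the cc-crawl-statistics project. It assumes index lines in the CDXJ-style layout `<SURT key> <timestamp> <JSON>` with `url` and `status` fields in the JSON part; the function name `summarize_index_lines` and the sample records are hypothetical. For simplicity it takes the last host label as the top-level domain, which is not the same as a public suffix, and it uses exact in-memory sets, which would not scale to a full crawl.

```python
import json
from collections import Counter
from urllib.parse import urlsplit


def summarize_index_lines(lines):
    """Count pages, unique URLs, hosts, TLDs and fetch statuses.

    Assumes each line looks like: '<SURT key> <timestamp> <JSON record>'.
    Lines that do not match this layout are skipped.
    """
    pages = 0
    urls = set()
    hosts = set()
    tlds = Counter()
    statuses = Counter()

    for line in lines:
        parts = line.strip().split(" ", 2)
        if len(parts) != 3:
            continue  # not a '<key> <timestamp> <JSON>' line
        record = json.loads(parts[2])
        url = record.get("url")
        host = urlsplit(url).hostname if url else None
        if not host:
            continue  # no usable URL in this record

        pages += 1                 # every capture counts as a page
        urls.add(url)              # exact set; fine for small samples only
        hosts.add(host)
        # Simplification: last host label, not a public suffix lookup.
        tlds[host.rsplit(".", 1)[-1]] += 1
        statuses[str(record.get("status", "unknown"))] += 1

    return {
        "pages": pages,
        "unique_urls": len(urls),
        "hosts": len(hosts),
        "tlds": dict(tlds),
        "statuses": dict(statuses),
    }


# Hypothetical sample records, for illustration only.
SAMPLE_LINES = [
    'com,example)/ 20240101000000 {"url": "http://example.com/", "status": "200"}',
    'com,example)/ 20240102000000 {"url": "http://example.com/", "status": "200"}',
    'com,example,www)/about 20240101000500 {"url": "https://www.example.com/about", "status": "301"}',
    'org,example)/ 20240101001000 {"url": "https://example.org/", "status": "404"}',
]

stats = summarize_index_lines(SAMPLE_LINES)
print(stats)
```

In this sample the same URL is captured twice, so the page count is higher than the unique URL count; that difference is what separates the "pages" and "unique URLs" metrics in the list above.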