The Web Laboratory: GetCrawls Tool
This tool's documentation is available here.
The main use of this tool is to see which web crawls are available in the database and which identification number (CrawlID) each was assigned.
| CrawlID | CrawlName | Time Period | Total Files per Internet Archive | Total ARC Size | Total DAT Size | Total Crawl Size | Total Files per Crawl as Recorded | Download Order | Completed Date |
|---|---|---|---|---|---|---|---|---|---|
| 3 | DJ | Jan-April 2002 | 161,734 | 9.2 | 0.602 | 9.8 | 204,560 | 2nd | 06/11/2006 |
| 4 | DV | Jan-April 2004 | 126,266 | 23.7 | 1.9 | 25.6 | 515,260 | 4th | unknown |
Legend:
- Total Files per Internet Archive - The number of files in the crawl, as reported by the Internet Archive
- Total ARC Size - ARC size of the crawl as per the database
- Total DAT Size - DAT size of the crawl as per the database
- Total Crawl Size - equals Total ARC Size + Total DAT Size
- Total Files per Crawl as Recorded - The total number of files in the crawl, as recorded in the WebLibraryTracking Database
- Download Order - The order in which crawls have to be downloaded
  - 1st - Crawls in 2005
  - 2nd - Crawls in 2002
  - 3rd - Crawls in 2003
  - 4th - Crawls in 2004
  - 5th - Crawls in 2001
- Completed Date - The date on which the crawl download was completed
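
The legend's two rules (Total Crawl Size = Total ARC Size + Total DAT Size, and the year-based download order) can be checked against the table rows with a short script. This is only a sketch: the `Crawl` class, its field names, the `download_sequence` helper, and the rounding tolerance are assumptions made for illustration and are not part of the GetCrawls tool. The size values are copied from the table as-is, since the source does not state their unit.

```python
from dataclasses import dataclass

# Download priority by crawl year, copied from the legend (1 = first).
DOWNLOAD_ORDER = {2005: 1, 2002: 2, 2003: 3, 2004: 4, 2001: 5}


@dataclass
class Crawl:
    """One table row (hypothetical structure, not the tool's own)."""
    crawl_id: int
    name: str
    year: int
    arc_size: float    # Total ARC Size, unit as given in the table
    dat_size: float    # Total DAT Size, same unit
    total_size: float  # Total Crawl Size, same unit

    def size_is_consistent(self, tolerance: float = 0.05) -> bool:
        # The table rounds the total to one decimal place, so allow
        # a small difference (assumed tolerance) when comparing.
        return abs(self.arc_size + self.dat_size - self.total_size) <= tolerance


def download_sequence(crawls):
    """Return the crawls sorted by the legend's download order."""
    return sorted(crawls, key=lambda c: DOWNLOAD_ORDER[c.year])


# The two rows shown in the table above.
crawls = [
    Crawl(4, "DV", 2004, 23.7, 1.9, 25.6),
    Crawl(3, "DJ", 2002, 9.2, 0.602, 9.8),
]

for crawl in download_sequence(crawls):
    print(crawl.crawl_id, crawl.name, crawl.size_is_consistent())
```

Sorting by the legend's order puts the 2002 crawl (DJ) ahead of the 2004 crawl (DV), which matches the "2nd" and "4th" values in the table's Download Order column.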
