Talk:Heritrix

Heritrix and archive.org
Is archive.org using Heritrix to compile its archive? Is there any relation between Heritrix and the Wayback Machine? — Preceding unsigned comment added by 4.238.210.3 (talk) 06:18, 4 October 2007 (UTC)

Arc file size
"An Arc file stores multiple archived resources in a single file in order to avoid managing a large number of small files. The file consists of a sequence of URL records, each with a header containing metadata about how the resource was requested followed by the HTTP header and the response. Arc files range between 100 to 600 MB."

That seems like a very arbitrary range. I have ARC files that are only a few KB, and I imagine they can get much larger than 600 MB. —Preceding unsigned comment added by 99.240.219.118 (talk) 17:00, 7 April 2009 (UTC)

External links modified
Hello fellow Wikipedians,

I have just modified one external link on Heritrix. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:
 * Added archive https://web.archive.org/web/20060111160619/http://wiki.lib.umn.edu/DI2/HowToCrawl to http://wiki.lib.umn.edu/DI2/HowToCrawl

When you have finished reviewing my changes, you may follow the instructions on the template below to fix any issues with the URLs.

Cheers.— InternetArchiveBot (Report bug) 23:24, 2 November 2017 (UTC)