Gretchen Nadasky IMLS Grant Project Pratt Institute MSLIS LIS-698 Dr. Tula Giannini
THE TOOLS – The Internet Archive crawls the web at varying intervals using the Heritrix crawler, which takes snapshots of individual pages. The captured pages can be searched by URL through the Wayback Machine interface. A customizable version called Archive-It lets users enter “seed” URLs that the software will access and harvest. The crawler also captures URLs that are associated with the seed site. Archive-It includes a reporting tool that users can rely on for quality assurance, as well as an interface that provides access to the harvested materials.
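To illustrate what searching by URL looks like in practice, the following is a minimal sketch in Python that builds two Wayback Machine addresses: one that points at a snapshot of a page near a given date, and one that queries the Internet Archive's public availability endpoint for the closest capture. The function names `snapshot_url` and `availability_query` and the sample address `example.com` are assumptions made for this illustration and do not come from the project itself. The code only builds the addresses and does not contact the archive.

```python
from urllib.parse import urlencode

# Public address patterns used by the Wayback Machine.
WAYBACK_PREFIX = "https://web.archive.org/web"
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def snapshot_url(page_url: str, timestamp: str) -> str:
    """Build the address of an archived snapshot (hypothetical helper).

    timestamp is written as YYYYMMDD (optionally followed by hhmmss).
    The Wayback Machine generally resolves to the capture closest to it.
    """
    return f"{WAYBACK_PREFIX}/{timestamp}/{page_url}"


def availability_query(page_url: str, timestamp: str) -> str:
    """Build a query asking which capture is closest to the timestamp
    (hypothetical helper; only the address is constructed here)."""
    params = urlencode({"url": page_url, "timestamp": timestamp})
    return f"{AVAILABILITY_ENDPOINT}?{params}"


if __name__ == "__main__":
    # example.com is a placeholder, not one of the project's seed sites.
    print(snapshot_url("http://example.com/", "20120101"))
    print(availability_query("example.com", "20120101"))
```

Opening the first address in a browser should display the archived page, while the second is expected to return a short JSON description of the nearest capture, if one exists.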