Gretchen Nadasky IMLS Grant Project Pratt Institute MSLIS LIS-698 Dr. Tula Giannini
The raw data is included in the attached spreadsheet. auctionwebsiteanalysisdata
Auction catalogs are an essential tool for the sale of art and a foundational text for art researchers. The Frick Art Reference Library (FARL) in New York has one of the largest collections of sales catalogs and is a recognized leader in art scholarship globally. Web-publishing of auction catalogs is both an opportunity and a preservation concern for art libraries. The “Reframing Collections for a Digital Age” project seeks to identify how born-digital auction catalogs can be preserved, archived, described and made accessible to researchers.
This phase of the project focused on prioritizing auction catalog websites for preservation using Archive-It. Web-archiving is not an automated process and can be costly and time-consuming. Identifying specific websites to be preserved can be viewed as a collection development process. However, to avoid duplicating preservation work already done by the Internet Archive, an experiment was conducted on the Wayback Machine to determine whether a link to catalogs in the Wayback Machine could be added to FARL records instead of running a customized archiving program.
Three studies were done on 137 auction house websites:
Concurrent with this research was the continuation of a live web-archiving pilot project. The final product will be a priority list of websites to archive. (The priority list will not be made public.)
Auction Websites Profile Summary: The study found that the primary language of 75.9% of the websites is English or German and that 76% of the websites contain at least some ‘archive-friendly’ HTML catalog content.
Print Acquisition: The survey revealed that FARL continues to receive print catalogs from 72% of the auction houses whose websites we examined. A brief review of acquisition costs was also undertaken. Prices for auction catalog subscriptions range from $25 per catalog to $4,000 a year.
|Auction House|Print Status|Auction House|Print Status|Auction House|Print Status|
|---|---|---|---|---|---|
|Galerie Hassfurther|No longer receive|Aste di antiquariato Boetto|On request|Zürichsee Auktionen|Discontinued|
|AB Stockholms Auktionsverket|No longer receive|Goteborgs Auktionsverk|On request|Finarte Casa d’Aste|Discontinued|
|Castells & Castells|No longer receive|Bukowskis|On request|Auction of illustration|On-line only|
|ALIS Auction House|No longer receive|James R Bakker|On request|Nagyházi Galéria|With payment|
|Important American fine art|No longer receive|Auktion Schneider-Henn|On request|Farsetti arte|With payment|
|Leonard Joel|No longer receive|Jeschke Meinke & Hauff|On request|Hampel|With payment|
|Mü-Terem Galéria|No longer receive|Lawson’s|On request|Bloomsbury Auctions|With payment|
|Hagelstam|No longer receive|Hodgins Art Auctions Ltd|On request|Kieselbach|With payment|
|Auktionshaus Stahl|No longer receive|Shinwa Art Auction|On request| | |
|Cheffins Fine art auctioneers|No longer receive|Schmidt Kunstauktionen|On request| | |
|Galerie Národní 25|No longer receive|Auktionshaus Ineichen|On request| | |
|Bay East Auctions|No longer receive|Casa d’aste Babuino|On request| | |
|Castellana, Subastas de Arte|No longer receive| | | | |
|Semenzato Casa d’aste|No longer receive| | | | |
|Fernando Duran, Subastas de Arte|No longer receive| | | | |
Wayback Machine Preservation: The investigation showed that the catalogs of 27% of the auction house websites are being “automatically” (though usually only partially) captured by the Wayback Machine. The chart below shows some of the technical issues that can arise in web-archiving. The top three issues are: the Wayback Machine opens the live website rather than an archived copy, the website is archived but its catalogs are not, and links are broken.
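A check of this kind can be partly scripted. The following is a minimal sketch, not the procedure used in the study: it assumes the Internet Archive's public availability endpoint (`https://archive.org/wayback/available`) and its JSON response layout, and the function names `availability_query` and `closest_snapshot` are hypothetical.

```python
import json
from urllib.parse import urlencode

# Assumption: the Internet Archive's public availability endpoint.
WAYBACK_API = "https://archive.org/wayback/available"


def availability_query(url, timestamp=None):
    """Build the request URL that asks whether `url` has any capture."""
    params = {"url": url}
    if timestamp:
        # Optional YYYYMMDD hint; the service returns the capture closest to it.
        params["timestamp"] = timestamp
    return WAYBACK_API + "?" + urlencode(params)


def closest_snapshot(response_text):
    """Return the URL of the closest capture, or None if nothing is archived."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


# To query the live service (requires network access):
#   from urllib.request import urlopen
#   body = urlopen(availability_query("example.com/catalogs")).read().decode()
#   print(closest_snapshot(body))
```

A positive answer only indicates that some capture of the page exists. As the issues listed above show, the capture still has to be opened and inspected, since the site may be archived while its catalogs are not.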
Initiating and maintaining a web-crawling schedule can be complex and requires time and attention to detail. Once the candidate, or ‘seed’, is chosen, a test crawl is run to determine which URLs, if any, can be harvested. The results of the test crawl reveal whether the site is blocked by robots.txt, whether an excess of unusable pages is being collected, and whether the crawl can capture a large portion of the site. If the test crawl looks successful, the seed is added to the regular schedule. Once the crawl is completed, several reports must be examined to see whether adjustments are needed. At this point a user can also examine the archived materials for completeness. One advantage is that the archived site has been indexed by the program and can be keyword-searched.
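The robots.txt part of this check can be done before a test crawl is scheduled. The sketch below uses Python's standard `urllib.robotparser` module. It is an illustration rather than part of the Archive-It workflow: the function name `seed_allowed` is hypothetical, and the default user agent `archive.org_bot` is an assumption about how the crawler identifies itself.

```python
from urllib.robotparser import RobotFileParser


def seed_allowed(robots_txt, seed_url, user_agent="archive.org_bot"):
    """Report whether robots.txt rules permit `user_agent` to fetch `seed_url`.

    `robots_txt` is the text of the site's robots.txt file. The default
    user agent is an assumption; substitute the crawler's actual name.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, seed_url)


# Illustrative rules that block a catalog directory for every crawler.
example_rules = """\
User-agent: *
Disallow: /catalogs/
"""

print(seed_allowed(example_rules, "https://example.com/catalogs/sale.pdf"))
print(seed_allowed(example_rules, "https://example.com/about"))
```

With these illustrative rules the catalog URL is reported as blocked and the other page as allowed, so a seed pointing at the catalog directory would need permission from the site owner or a different approach.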
The crawls that were run from September 2012-2013 had varying results and levels of success. Some issues were:
Some of these issues could be addressed by changing the crawl scope, but overall the process was cumbersome. The best results came from smaller sites that used mostly HTML and those that posted PDF versions of the catalog on pages close to the home page.
The results suggest that auction catalogs made available on-line can become an important resource for FARL, and preserving these materials is a priority of the library. Although most auction houses still send print catalogs, the trend is starting to change, so the shift to collecting web-based materials will need to accelerate. FARL and other institutions cannot rely on large-scale web-archiving initiatives like the Internet Archive to preserve auction catalogs. The difficulty of accessing catalogs underscores that web crawlers are designed to capture web pages as opposed to documents posted to the web.
Executing web crawls using Archive-It indicates that even archive-friendly materials can be hard to capture. Web-archiving initiatives require an investment of time to customize collection policy and crawl scope. Therefore, priorities must be set on which websites can and should be archived. The attached data provides criteria that can be sorted to design an efficient web-archiving strategy. The priority list delivered to FARL is confidential.
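As an illustration of how such criteria might be sorted, the sketch below ranks sites on three yes/no criteria taken from the studies described above. It is not the method used to build the confidential priority list: the field names, the order of the criteria, and the sample entries are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Site:
    # Hypothetical fields; the real spreadsheet columns may differ.
    name: str
    receives_print: bool   # FARL still receives print catalogs
    in_wayback: bool       # catalogs already captured by the Wayback Machine
    html_catalogs: bool    # site has 'archive-friendly' HTML catalog content


def priority_key(site):
    """Sort key: False sorts before True, so sites with no print copy,
    no Wayback capture, and archive-friendly content come first."""
    return (site.receives_print, site.in_wayback, not site.html_catalogs)


def prioritize(sites):
    """Return the sites ordered from highest to lowest archiving priority."""
    return sorted(sites, key=priority_key)


# Placeholder entries for illustration only; not data from the study.
sample = [
    Site("House A", True, True, True),
    Site("House B", False, False, True),
    Site("House C", False, False, False),
    Site("House D", False, True, True),
]

for site in prioritize(sample):
    print(site.name)
```

Ordering the criteria this way assumes that the loss of a print copy matters most, then the absence of any existing capture, then the likelihood of a successful crawl. A different collection policy would reorder or weight them differently.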
The next phase of the “Reframing Collections for a Digital Age” project might include using the priority list developed here to establish a formal preservation and access program for born-digital art resources at FARL. It would also involve the next step in the workflow: adding item-level metadata to the archived materials and creating a record in FRESCO. Regular quality assurance checks are necessary to make sure the links work and that the crawls continue to gather the appropriate materials. In an ideal situation, there would be a dedicated member of the FARL staff to undertake these steps while also engaging with collaborators globally in the art community.