Curation of the End-of-Term Web Archive

The Classification of the End-of-Term Archive research project at the University of North Texas Libraries is investigating the feasibility of machine-generated classification of websites in the 16-terabyte End-of-Term (EOT) Web Archive. The research is being conducted concurrently in two areas: Archive Classification and Web Archive Metrics.

A set of 1,151 URLs within the EOT Archive was analyzed using link analysis methods to identify related groupings or clusters. Investigations into visualization of the underlying relationships among the URLs were also conducted. Subject Matter Experts (SMEs) in the classification of government information manually classified the same set of URLs using the Superintendent of Documents (SuDocs) Classification Numbering System, a hierarchical scheme that groups government publications by federal agency. The SME classification will serve as the criterion for evaluating the effectiveness of the link analysis.
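The abstract does not say which link analysis method was used or how agreement with the SME classification was scored. As a minimal sketch only, the example below assumes the simplest possible variant: URLs are grouped into connected components of an undirected hyperlink graph, and the clusters are scored by purity against SME-assigned SuDocs class stems. The function names, the placeholder URLs, and the toy links and labels are all hypothetical and are not data from the project.

```python
from collections import Counter, defaultdict


def link_clusters(urls, links):
    """Group URLs into connected components of the undirected link graph.

    Assumption: a hyperlink in either direction means "related".
    """
    neighbors = defaultdict(set)
    for src, dst in links:
        if src in urls and dst in urls:
            neighbors[src].add(dst)
            neighbors[dst].add(src)

    seen, clusters = set(), []
    for url in sorted(urls):
        if url in seen:
            continue
        # Depth-first walk collecting everything reachable from this URL.
        stack, component = [url], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(neighbors[node] - seen)
        clusters.append(component)
    return clusters


def purity(clusters, sme_labels):
    """Fraction of URLs that carry the majority SME label of their cluster."""
    total = sum(len(c) for c in clusters)
    agree = sum(
        Counter(sme_labels[u] for u in c).most_common(1)[0][1] for c in clusters
    )
    return agree / total


# Hypothetical toy input: placeholder URLs, made-up links, made-up SME labels.
# "A" and "I" stand for the SuDocs stems for Agriculture and Interior.
urls = {"u1.example", "u2.example", "u3.example", "u4.example", "u5.example"}
links = [
    ("u1.example", "u2.example"),
    ("u2.example", "u3.example"),
    ("u4.example", "u5.example"),
]
sme_labels = {
    "u1.example": "A",
    "u2.example": "A",
    "u3.example": "I",
    "u4.example": "I",
    "u5.example": "I",
}

clusters = link_clusters(urls, links)
score = purity(clusters, sme_labels)
print(len(clusters), score)
```

Purity is only one of several possible agreement measures, and it rewards many small clusters. A real evaluation would likely need a measure that also penalizes splitting one agency across clusters.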

In a parallel work area of the project, metrics for Web archives were discussed in a focus group with the SMEs, who identified key criteria libraries would likely employ in acquiring materials from Web archives. Participants also identified two service models libraries will need from Web archive service providers: acquisition and access models. A subsequent survey of Federal Depository Libraries measured the demand for each of these models, as well as libraries' perceived capabilities to support long-term preservation and local hosting of materials from Web archives. It appears that some existing library metrics and, more importantly, standard usage statistics will be essential metrics.

Document Type: Research Article

Publication date: January 1, 2011

More about this publication?
  • The IS&T (digital) Archiving Conference offers a unique opportunity for imaging scientists and those working in the cultural heritage community (curators, archivists, librarians, photographers, etc.) from around the world to come together to discuss the most pressing issues related to the digital preservation and stewardship of hardcopy and other cultural heritage documents and objects. Authors come from museums, archives, libraries, government institutions, industry, and academia. Cutting-edge topics related to multispectral and 3D imaging, as well as best practices for workflow, sharing, standards, and asset/collection management and dissemination, are explored in papers presented at this annual, international event.

    Please note: For purposes of its Digital Library content, IS&T defines Open Access as papers that will be downloadable in their entirety for free in perpetuity. Copyright restrictions on papers vary; see individual paper for details.
