Where current and emerging technology trends meet.
TecTrends | Information Sources, Inc.
Article

Title: A crisis for Web preservation: Fugitive documents published...

Author: Olsen, Florence
Article Type: Product Analysis
Source: Federal Computer Week, v18 n20 p60(2)
Publication Date: Jun 21, 2004
ISSN: 0893-052X
Illustrations: Charts
URL of Publication: http://www.fcw.com

Daniel Greenstein, head of the California Digital Library, says information is disappearing from government Web sites at an alarming rate because the Federal Depository Library Program cannot keep up with the work of cataloging and preserving access to government documents published only on the Web. As a result, public access to such publications is uneven or worse. The Government Printing Office (GPO) runs the depository program, and its officials are trying to address the problem of these fugitive documents. To capture such publications automatically, GPO officials have considered using Web harvesting technologies. A notice published in May 2004 asked vendors for information about Web-crawler and data mining technologies that could help locate fugitive government publications. Greenstein says such technologies, although effective at capturing documents from the surface Web, are less effective at capturing information from the Deep Web, where databases and dynamic Web pages reside. More government documents are now published on the Web than in print, and many online publications are uncataloged and unavailable at depository libraries because federal officials fail to notify GPO that the publications exist. The GPO needs technology that can locate and capture government information on the Web in any format; examine file content and the metadata associated with each file; follow rules for capturing government information and avoid capturing information that does not conform to those rules; tolerate rule changes; and automatically compare newly captured information with stored information to avoid duplication.
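
The requirements listed at the end of the abstract (rule-based capture, tolerance for rule changes, and automated comparison against stored material) can be illustrated with a small sketch. This is not GPO's system or any vendor's product: the `Document` and `Harvester` classes, the two example rules, and the use of a SHA-256 content digest for duplicate detection are all assumptions made for illustration. A real harvester would also need fetching, Deep Web access, and near-duplicate detection, none of which is shown here.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, Dict, List
from urllib.parse import urlparse


@dataclass
class Document:
    """A fetched file plus whatever metadata came with it (hypothetical type)."""
    url: str
    content: bytes
    metadata: Dict[str, str] = field(default_factory=dict)


# A rule is any function that returns True when a document is in scope.
Rule = Callable[[Document], bool]


def is_gov_host(doc: Document) -> bool:
    """Illustrative rule: accept only hosts ending in .gov."""
    host = urlparse(doc.url).hostname or ""
    return host.endswith(".gov")


def has_title(doc: Document) -> bool:
    """Illustrative rule: require a non-empty title in the metadata."""
    return bool(doc.metadata.get("title", "").strip())


class Harvester:
    def __init__(self, rules: List[Rule]):
        # Rules live in a plain list so they can be swapped without code changes.
        self.rules = list(rules)
        # Stored documents, keyed by the SHA-256 digest of their content.
        self.store: Dict[str, Document] = {}

    def capture(self, doc: Document) -> str:
        """Return 'rejected', 'duplicate', or 'captured' for one document."""
        if not all(rule(doc) for rule in self.rules):
            return "rejected"
        digest = hashlib.sha256(doc.content).hexdigest()
        if digest in self.store:
            # Byte-identical content is already stored, so skip it.
            return "duplicate"
        self.store[digest] = doc
        return "captured"
```

Under these assumptions, a second copy of the same file found at a different URL is reported as a duplicate, and assigning a new list to `harvester.rules` changes what is captured from that point on. An exact content hash only catches byte-identical copies; reformatted or lightly revised versions of a publication would need a fuzzier comparison.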

Special Features: Charts

Products:
Digital Libraries

Copyright 2004-2008 Information Sources Inc.
 



TecTrends | P.O. Box 8120 | Berkeley CA 94707 | (510) 525-6220 | Email: tectrends@tectrends.com