Bulletin of the American Society for Information Science and Technology, Vol. 28, No. 3, February/March 2002

Annual Meeting Coverage

ASIST 2001 Keynote: Brewster Kahle

Providing Universal Access to Human Knowledge

by Steve Hardin

Brewster Kahle is president of Alexa Internet, a company he founded in 1996 and recently sold to amazon.com. He can be reached by e-mail at brewster@archive.org. Steve Hardin, associate librarian at Indiana State University, can be reached by e-mail at libhard@cmi.indstate.edu.

Brewster Kahle, perhaps best known as the developer of the Wide Area Information Servers (WAIS) system, presented the opening plenary address at the 2001 ASIST Annual Meeting on Sunday, November 4, in Washington, DC.

The concept of providing universal access to human knowledge overwhelms many of the people who think about it. But that doesn't mean it's not a worthwhile goal. Kahle says he's striving to do just that. In the process, he's created the Internet Archive (www.archive.org), a collection taking up more electronic space than the Library of Congress.

Kahle says the last people to attempt to provide universal access to human knowledge were the ancient Greeks, with their concept of the encyclopedia and their library at Alexandria. Experts say the Greeks managed to provide access to about half their world's knowledge. Today, we're continually adding information to our archives, but we're losing ground. Still, now that digital technology is more widespread, we're in a position to address the goal again. Kahle notes that many people today are concerned about the legal issues that accompany such a goal. But what is our civic responsibility in this matter? He says legislators all want libraries; they just don't know how libraries fit in now.

Some critics say publishers should do all the archiving. In Kahle's view, this approach has some problems. Some of the roles of libraries as third-party checks would not be served. For example, he downloaded the Adobe eBook version of Project Gutenberg's copy of Alice in Wonderland. But the rules say the eBook cannot be copied, printed, lent or read aloud. Having publishers keep the only copy of an item just won't work, he believes.

How can the people archiving the Web deal with the rights issues? What does it mean to "lend" digital materials? What implications does that have for interlibrary loan? Every time someone reads something, is it copied? When the collecting began in 1996, observers predicted attorneys would descend on the project. So far, Kahle says, that hasn't happened. He says the head of the Copyright Office told him that while people can preemptively sue someone for copyright violations, they usually send a "cease and desist" letter first. An earlier Web system, DejaNews, adopted a "post and purge the complainers" strategy: it built a search interface to materials that had been publicly distributed for free in the past and gave the original posters the ability to opt out. This is the same system the search engines use. Some sites, including the New York Times, withhold parts of their content. Kahle said, "The newspaper of record refuses to be recorded."

Kahle's Alexa (a name he took to honor the great library at Alexandria) Web collection now comprises more than 100 terabytes. There are 16 million sites, and more than 10 billion pages have been added over the past five years. Alexa has more text than the Library of Congress, and the digital storage costs only $300,000. While not everything saved is of the highest value, you can find real gems in the collection if you have the proper search tools.

The free Alexa service shows which organization originally hosted a Web page and when. There's subject indexing, too, bringing users to related and competitive links built by using path and link analysis. So far, there are 80 million "catalog entries" to places on the Web served by the Alexa service.

In 1996, the Internet Archive cooperated with the Smithsonian Institution to collect information on that year's election. In 2000, the Library of Congress commissioned the Archive to create an archive of the presidential election. The election collection runs between two and three terabytes.

And just this fall, the Internet Archive put up the Wayback Machine (the title taken from the old Peabody and Sherman cartoons) to archive out-of-print Web pages. Kahle showed his audience the way the Yahoo home page looked in December 1996. He also visited the White House during the same time period and brought up President Clinton's remarks from September 10, 1996, on efforts to combat terrorism. In some ways, things haven't changed that much. "If you don't have a memory, society really loses something," Kahle says.

"Help people remember, learn and create," Kahle says. "What a great thing to wake up and do every morning!" The ideal, according to Kahle, is for anyone to be able to walk into any library and gain access to the world's collections. He told the audience to imagine a shoeless, HIV-positive child in Uganda who walks a day to get to a library and can then access the latest medical information. The idea is profound, he says, adding that we can do this only if ASIST members and others help.

The traditional approach to library borrowing is for a library to buy a copy of a book and lend it to its patrons. Video streaming represents a new variation on that theme. The TelevisionArchive (www.tvarchive.org) is making available copies of TV news broadcasts. For the events of September 11, there's a collection of television coverage from around the world. Kahle says they started archiving all Web sites and 20 TV channels for a one-week period from September 11 through September 18. The archive is a way to get world reaction: were there really people cheering in the streets about the terrorist attacks? This way, you can look at TV programs from Islamic nations instead of just watching American reporting.

Kahle concluded by noting the goal is to provide "universal access to human knowledge, one page at a time, one patron at a time." But he can't do it all; the job must be done by lots of people trained in how to do it. It's up to us as a community to provide access to these materials.

For Further Reading

Kahle, B., Prelinger, R. & Jackson, M.E. (2001). Public access to digital material. D-Lib Magazine, [Online], 7. Available: http://www.dlib.org/dlib/october01/kahle/10kahle.html

Copyright 2001, American Society for Information Science and Technology