The Stanford Digital Library Project

The Stanford Integrated Digital Library Project will develop enabling technologies for an integrated "virtual" library to provide an array of new services and uniform access to networked information collections. The Integrated Digital Library will create a shared environment linking everything from personal information collections, to collections of conventional libraries, to large data collections shared by scientists.

[Figure 1: Conceptual view of the Information Bus architecture]

Today, users of different computer communications networks are able to communicate effectively because network protocols have become more integrated. The Digital Library project aims for similar integration at a much higher level than the transport-oriented exchange protocols available for inter-network communications. High-level concepts and protocols will allow users and developers to interact with diverse library resources in a unified manner. To shield users from unimportant details of accessing diverse sources, an abstraction layer called the "Information Bus" mediates communication between different clients, sources, and services. The Information Bus protocols will allow users and applications to express their goals in a language that conveniently describes information management tasks and objects. Using the protocols, users can navigate and manage the "information space" in a consistent and unified way.

Figure 1 provides a conceptual view of the Information Bus architecture. The Information Bus itself is depicted (though not physically realized) as a central conduit of information. Each component interacts with the Bus, possibly through an intermediate protocol machine handling translations between native commands and commands recognized in Information Bus protocols. The interface clients manage user interaction. The information sources can be provided by any number of independent parties.
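The role of the intermediate protocol machines can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the names BusRequest, ProtocolMachine, and InformationBus, the two toy sources, and their native command formats are hypothetical and are not the project's actual protocols. The sketch only shows the idea that a client issues one request in a common vocabulary and each source receives it translated into its own native form.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class BusRequest:
    """A source-independent request in a hypothetical bus vocabulary."""
    action: str  # e.g. "search"
    terms: str   # free-text query


class ProtocolMachine:
    """Hypothetical translator between bus requests and one source's native commands."""

    def __init__(self, name: str,
                 to_native: Callable[[BusRequest], Any],
                 native_source: Callable[[Any], List[str]]) -> None:
        self.name = name
        self._to_native = to_native
        self._native_source = native_source

    def handle(self, request: BusRequest) -> List[str]:
        # Translate the bus request, then let the source answer in its own terms.
        return self._native_source(self._to_native(request))


class InformationBus:
    """Conceptual conduit: forwards each request to every attached component."""

    def __init__(self) -> None:
        self._machines: Dict[str, ProtocolMachine] = {}

    def attach(self, machine: ProtocolMachine) -> None:
        self._machines[machine.name] = machine

    def broadcast(self, request: BusRequest) -> Dict[str, List[str]]:
        return {name: m.handle(request) for name, m in self._machines.items()}


# Two toy sources with different native command formats (both invented here).
def catalog_source(command: str) -> List[str]:
    titles = ["Digital Libraries", "Network Protocols"]
    verb, _, terms = command.partition(" ")
    if verb != "FIND":
        return []
    return [t for t in titles if terms.lower() in t.lower()]


def archive_source(command: Dict[str, str]) -> List[str]:
    documents = ["Library Services Report", "Agent Programming"]
    return [d for d in documents if command["q"].lower() in d.lower()]


bus = InformationBus()
bus.attach(ProtocolMachine("catalog", lambda r: f"FIND {r.terms}", catalog_source))
bus.attach(ProtocolMachine("archive", lambda r: {"q": r.terms}, archive_source))

results = bus.broadcast(BusRequest(action="search", terms="librar"))
print(results)
```

In this sketch the client never sees the "FIND ..." string or the dictionary form; adding a new source only requires attaching another translator, which is one plausible reading of how independent parties could join the Bus.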

The final element of Figure 1 is the collection of library services, the real contribution to the network. For example, search programs can cut across different servers and formats to find relevant items using a uniform base of "meta-information," similar to a card catalog entry, containing dates, languages, size, and cost. Search programs using this meta-information could find, filter, and visually represent items by their source, purpose, history, and intended audience. The consistent protocols of the Information Bus enable persistent subscription services that perform information search, filtering, and notification based on users' needs. For example, the search might be modified according to the time and budget allotted, the user's expertise level, or the user's tradeoffs between quantity and relevance of results.
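As a rough illustration of searching over uniform meta-information, the sketch below filters records drawn from several servers by language, date, and cost, and then trims the result to a spending budget. The MetaInfo fields follow the attributes named above (dates, languages, size, cost); the record values, the function names search and within_budget, and the cheapest-first budget rule are all assumptions chosen for the example, not a description of the project's services.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class MetaInfo:
    """Hypothetical card-catalog-style entry shared by all sources."""
    title: str
    source: str
    year: int
    language: str
    size_kb: int
    cost: float


def search(records: Iterable[MetaInfo],
           language: Optional[str] = None,
           since_year: Optional[int] = None,
           max_cost: Optional[float] = None) -> List[MetaInfo]:
    """Filter on meta-information only; results are ordered cheapest first."""
    hits = [
        r for r in records
        if (language is None or r.language == language)
        and (since_year is None or r.year >= since_year)
        and (max_cost is None or r.cost <= max_cost)
    ]
    return sorted(hits, key=lambda r: (r.cost, r.title))


def within_budget(hits: Iterable[MetaInfo], budget: float) -> List[MetaInfo]:
    """Assumed budget rule: take the cheapest items until the budget runs out."""
    chosen: List[MetaInfo] = []
    remaining = budget
    for r in sorted(hits, key=lambda r: (r.cost, r.title)):
        if r.cost <= remaining:
            chosen.append(r)
            remaining -= r.cost
    return chosen


# Invented sample records spanning three servers.
records = [
    MetaInfo("Agents Overview", "server-a", 1994, "en", 120, 0.0),
    MetaInfo("Bus Protocols", "server-b", 1995, "en", 300, 2.0),
    MetaInfo("Bibliotheques", "server-c", 1993, "fr", 80, 1.0),
    MetaInfo("Catalog Design", "server-b", 1992, "en", 200, 1.5),
]

english_recent = search(records, language="en", since_year=1993)
affordable = within_budget(records, budget=2.5)
print([r.title for r in english_recent])
print([r.title for r in affordable])
```

A persistent subscription service could, under the same assumptions, rerun such a query periodically and notify the user of new hits; a different tradeoff between quantity and relevance would simply replace the cheapest-first rule.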

On the information-providing side, a "publisher" can offer users a selected set of items to browse and access in a unified "publication," even though the items themselves are drawn from diverse servers and system types. Other services will catalogue, index, maintain consistency, archive, or collect revenues for information providers. For both providers and users of information, these services can be a mix of interactive applications (e.g., browsers) and automated agents that users program to perform tasks independently.

We have selected computing literature as the initial domain in which to demonstrate integration and develop a test suite of services that exhibit the system's potential. Our partner organizations have large, diverse collections of these materials and an extensive user community to provide evaluation.

For more information see:
http://www-diglib.stanford.edu/diglib or send electronic mail to diglib@diglib.stanford.edu

Principal Investigators: Hector Garcia-Molina, Yoav Shoham, Terry Winograd
Partner Organizations: Association for Computing Machinery, Bell Communications Research, Enterprise Integration Technologies, Hewlett-Packard Labs, Hughes Research Labs, Interconnect Technologies Corp., Interval Research Corp., Knight-Ridder Information Services, NASA/Ames Research Center, NASA/Ames Library, O'Reilly and Associates, Stanford University Libraries, WAIS Inc., and Xerox PARC

The Stanford Digital Libraries Group is a partnership of Stanford faculty, staff, and students with commercial and non-profit ventures. It spans research groups, divisions, and departments within Stanford University and unites companies with different expertise, seeking to bring together theoreticians and practitioners in the development of concepts and systems to revolutionize access to information.