Beyond Browsing: Shared Comments, SOAPs, Trails, and On-line Communities

Martin Röscheisen

Christian Mogensen

Terry Winograd

Computer Science Department
Stanford University
Stanford, CA 94305, U.S.A.


This paper describes a system we have implemented that enables people to share structured in-place annotations attached to material in arbitrary documents on the WWW. The basic conceptual decisions are laid out, and a prototypical example of the client-server interaction is given. We then explain the usage perspective, describe our experience with using the system, and discuss other experimental usages of our prototype implementation, such as collaborative filtering, seals of approval, and value-added trails. We show how this is a specific instantiation of a more general "virtual document" architecture in which, with the help of light-weight distributed meta information, viewed documents can incorporate material that is dynamically integrated from multiple distributed sources. Development of that architecture is part of a larger project on Digital Libraries that we are engaged in.

Virtual Documents, Meta Information, World-Wide Web, Group Annotations, SOAPs, Collaborative Filtering, Shared Workspaces, CSCW, Workgroups.

1 Introduction

There are many different reasons why people want to communicate with each other about specific things they find as networked information resources. These include comments and annotations of the kind a workgroup would share about their common area of interest, the ability to have newsgroup-like fora associated with specific items on the "net", value-added trails that link together items that someone considers to be connected under a particular view, systematic critique and review information, "Seals of Approval" (SOAPs), or filters that help people make sense of whatever information is presented to them.

All of these applications have in common two properties that are not associated with the standard mechanisms for hypertext (e.g., HTML):

For example, consider a set of "consumer reports" annotations provided by a review organization and attached to product catalogs on the web, or the private (within the group) comments made by our local research group as part of their joint reading of the WWW conference proceedings. In each case, the annotations themselves need to be kept separate from the annotated documents and access to them handled in a uniform way for appropriate subscribers.

We have developed a general mechanism for shared annotations, which enables people to annotate arbitrary documents at any position in-place, share comments/pointers with other people (either publicly or privately), and create shared "landmark" reference points in the information space. The framework represents a further step towards giving people a presence on the Web, laying the foundation for on-line communities.

This paper describes a general meta-information architecture ("BRIO") and an implementation of this architecture, called "ComMentor," which we have developed for annotating pages on the World-Wide Web and which realizes such usage scenarios in a particular instantiation. The implementation includes a basic client-server protocol, a meta-information description language ("PRDM"), a server system (currently based on an NCSA httpd 1.3 server with CGI scripts written in PERL), and a remodeled NCSA xMosaic 2.4 browser with interface augmentations to provide access to our extended functionality [CHI95, SLIDES].

The idea of a system enabling a multiplicity of independent individuals to create light-weight value-added "trails" through a document space was envisaged most prominently by Vannevar Bush as early as 1945 [BUSH]. The ComMentor system can be thought of as a tool which such "trail blazers" use to add value to contents, and which other people use to find guidance based on these human-created super-structures.

There are a number of existing systems that incorporate mechanisms related to our current architecture. These include the annotations facility in Lotus Notes (which requires making available "hooks" for annotation attachment in a given document), annotations in Acrobat (which are in-place but not shared), and various other in-place facilities which are based on a shared file system to allow multiple people to share comments (these include ForComment(TM) and some versions of MS Word, neither of which is incremental; that is, only one user can write comments at a given time, and must then pass on this right). In the World-Wide Web arena, there has been ongoing discussion about the appropriate mechanisms for group annotations, but these have generally assumed whole-document attachment (rather than attachment in place) and universal (rather than access-controlled) distribution. They have not been developed into widely used facilities. There are a few experimental systems dealing with annotations of various kinds (mentioned in the references), including Ubique [UBIQUE], which uses a proprietary architecture geared towards synchronous user-user communication rather than value-added structures.

In the first section of this paper, we describe the overall architecture including a description of the user view and a note on how documents are synthesized. In the second section, we present sample scenarios for the client-server interaction. In the remainder of the paper, we discuss our experience using the facility and describe a variety of usage scenarios that we are currently aware of.

Further technical detail is available in the extended technical report [CSDTR95], from the Stanford Digital Library Project.

2 System Architecture

In this section, we describe the basic system structure, the meta-information server design, the user view of the annotation system, and the current interface design of the browser. We also include a brief technical note on how documents are dynamically synthesized.

2.1 Basic Structure

The basic system architecture is depicted in Figure 1:

Users interact with a browser to retrieve documents from various document servers. In addition, there are meta-information servers from which relevant information can be retrieved according to the protocol defined in this paper. Meta-information "items" like annotations are organized into "sets" to which members of "groups" have access. For example, depending on the context set by the user, the browser can decide to retrieve annotations from certain meta information servers for every document the user looks at, and display a version of the document in which annotations to various segments are shown at their appropriate position in the text.

Conceptually, a "browser" can be understood as consisting of

Note that there are different ways to implement such a structure: For better interactivity, the complete dashed box can be realized in one address space (which amounts to "augmenting Mosaic", for instance); alternatively, the document synthesis module can be factored out, and the renderer controlled by an independent application (which amounts to a "remote controlled Mosaic plus a proxy server"). For example, instead of reloading a possibly large document after adding or deleting an annotation, a browser that incorporates the document synthesis function can provide the same visual feedback by just adding or deleting the annotation internally. The main disadvantage of incorporating the document synthesis function is that people cannot use their unaugmented browsers for this functionality. We have designed the code structure in a way that allows us to experiment with the potential trade-offs involved in this spectrum.

2.2 Server Design

The following depiction shows the internal structure of a meta-information server, with annotation information as an example of meta-information items.

The figure shows the basic entities: members, who form groups, which access annotation sets that contain annotations (from left to right).

The architecture is based on "annotation sets". Every annotation belongs to a particular set and annotates a particular page at some specific location. Every set is associated with a particular server ("annotation server") and identity (like a URL). The server holds the index (but not necessarily the contents) of all the annotations in the set, and will in general be distinct from the server which provides the annotated documents (and will not necessarily be under the control or resource allocation of the original writer). Each annotation set can be thought of as a "distributed document," stored in one place but consisting of a set of annotations that are associated with pages anywhere on the Web.
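The relationship between annotations, sets, and annotated pages can be illustrated with a small data model. This is only a sketch under our own assumptions: the class names, field names, and example URLs are hypothetical and are not taken from the PRDM format or the ComMentor server. It shows a set that holds an index of its annotations (keyed by annotated page) while the annotation bodies may live elsewhere.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Annotation:
    set_url: str      # identity of the annotation set this item belongs to
    page_url: str     # the annotated page, which may live on any server
    position: str     # location descriptor within the page (format assumed)
    content_url: str  # where the annotation body is stored (may be elsewhere)


@dataclass
class AnnotationSet:
    url: str                                   # identity of the set, like a URL
    index: dict = field(default_factory=dict)  # page_url -> list of Annotation

    def add(self, page_url, position, content_url):
        """Record a new annotation in the set's index and return it."""
        item = Annotation(self.url, page_url, position, content_url)
        self.index.setdefault(page_url, []).append(item)
        return item

    def for_page(self, page_url):
        """Return the annotations this set holds for one page."""
        return list(self.index.get(page_url, []))

    def annotated_pages(self):
        """Return all pages touched by this 'distributed document'."""
        return sorted(self.index)
```

Keying the index by page URL reflects the lookup the browser needs most often: given the page just loaded, which items in this set apply to it.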

Access control is managed per annotation set. Some obvious examples are

Sets are particular information objects to which authenticated people stand in a certain authorization relation; groups are the human-organizational objects. There may well be more than one annotation set associated with a group, each used for different purposes.

The mechanism for authenticating access to meta-information sets (by virtue of being a member of a group with appropriate access authorization) is an issue which is kept orthogonal in the design: any authentication mechanism (private keys, public keys, etc.) can be used. This allows us to separate out group membership (with a possibly very secure approval mechanism) and multiple sets of annotations (e.g. on different topics) which anyone in a specific group can easily access. Users can have either read or read/write access to a set by virtue of their group membership. Since there is no net-wide standard for individual identity and access control group membership, we have implemented a simple mechanism for establishing individual identities (with associated passwords) and group membership. We expect in the future that these will be generic network functions, and we will use the standard mechanisms that emerge.

For illustration, consider the equivalent in the UNIX file system: before being able to use a workstation, one needs an account ("group membership"), and once this is established, the user needs to authenticate herself. Then, once "logged in", users can create files and directories ("Sets") in whatever way is authorized by their group memberships for a particular location (e.g. write access for the group 'users' in the home directory). This distinction allows for differently strong authentication methods to be used without inhibiting the ease of creating directories/sets.
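The separation between group membership and per-set authorization can be sketched as follows. This is a minimal illustration, not the server's actual implementation: the `AccessTable` class and its method names are hypothetical, and authentication of the user's identity is assumed to have happened beforehand by whatever mechanism is in use.

```python
from enum import Enum


class Access(Enum):
    NONE = 0
    READ = 1
    READ_WRITE = 2


class AccessTable:
    """Hypothetical table linking members to groups and groups to sets."""

    def __init__(self):
        self.members = {}  # group name -> set of user names
        self.grants = {}   # (group name, set name) -> Access

    def add_member(self, group, user):
        self.members.setdefault(group, set()).add(user)

    def grant(self, group, set_name, access):
        self.grants[(group, set_name)] = access

    def access_for(self, user, set_name):
        """Return the strongest access the user holds through any group."""
        best = Access.NONE
        for (group, name), access in self.grants.items():
            if name == set_name and user in self.members.get(group, set()):
                if access.value > best.value:
                    best = access
        return best

    def can_read(self, user, set_name):
        return self.access_for(user, set_name) is not Access.NONE

    def can_write(self, user, set_name):
        return self.access_for(user, set_name) is Access.READ_WRITE
```

Because rights are attached to groups rather than to individuals, creating a new set only requires adding grants for existing groups; no membership approval is repeated.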

2.3 User view of the annotation system

The following points summarize some of the basic design decisions for supporting annotation meta-information for the user:
  1. The browser provides a simple mechanism to identify a place on a currently visible page and write a comment to it within a set you specify (assuming, of course, that you have write privileges to the set). At any time you will have a default set for writing. The browser makes it possible to have comments from any of the open sets show up in place on the page whenever you view a page that has a comment in that set (i.e., you do not have to know in advance what is annotated in which set).

  2. The mechanism for specifying locations within a document includes redundant information and update information that makes it possible to determine whether a document was modified. In making an annotation, the user simply selects a region in the document being annotated, and the browser stores redundant information about it. When a page changes, an attempt is made to relocate the attachment point for the comment based on mechanisms such as embedded tags and/or string match. If that cannot be done, it is placed at the end of the document with a notation that it has been displaced, along with the (human-readable) context it was attached to.

  3. Users can browse for annotation sets, and put together their personal selection. The selected sets are stored as part of the user's profile for their browser, which is built from the information about the user on the annotation server. Among these sets, users can designate specific ones as "active". If a set is active, then the comments in this set for a document being retrieved are retrieved from the comment server for this set. Information integration is supported both at server-side and at browser-side.

  4. Annotations can be indicated in an interface in a number of ways, including marginal markings (as in LaTeX), format-marking of annotated text (as in WWW browsers with underlined anchors), in-line presentation of the annotation text, and in-place annotation indicators. We are experimenting with several of these. The current browser uses in-place markers, with character-size in-lined images marking annotation points, and optional highlighting of the annotated text element. The type of marker image can be user-determined, depending on the perspective a user wants to gain on the ensemble of annotations: if interested in authorship, the images can show each author's face or individual icon; if the identity of the group sharing the annotations is more central, then the images can show a small icon representing the group or the specific annotation set. Although the images are kept small in order not to interfere with document content (often as small as 16x22 pixels), informal experiments have suggested they have sufficient discriminability and identifiability even for faces.

  5. The browser interface includes mechanisms which support different styles of reading annotations. The annotation icons themselves are "hot" links: if selected with the left mouse button (which is the Mosaic convention), a full document view of the comment is displayed. This view can then contain other images of the author (in larger size) as well as further links, for instance, to longer elaborations; it may also contain further annotations or follow-up comments (see below).

    Annotations can also be examined with a light-weight viewer which pops up a small window (which looks like a PostIt(TM)) when the icon is selected with the middle button and removes it when the button is released. This tool is a generic meta viewer which can be used in a variety of contexts to get "preview" information faster than is possible with a full document-view window. It is useful as a general interface augmentation for Mosaic browsers in contexts such as deciding whether or not to follow a hyperlink which might lead to an expensive document.

    Annotations can be filtered according to different selections; a typical usage is to filter review information for category labels to see only documents with a certain rating. Note that queries for particular items from the meta information server are supported by its database backend.

    Since annotations are documents too, they can be recursively annotated.

  6. Annotations are considered write-once. They cannot be edited (but of course additional ones can be made). People with appropriate privileges can remove them. Annotation sets can have different policies with regard to deleting.

  7. A tour of the annotations in a set that have been created more recently than a given time can be queried from the servers. Such a tour is a list of links, each of which will bring the user to where the comment in the annotated page is located. This greatly extends the value of the annotations, since it enables users to find annotations that are distributed over a number of documents, without needing to check all the pages of those documents for changes. The fact that the list is limited to a specific set makes it more useful than a general "what's new" list.
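The tour query in point 7 can be sketched as a simple selection over a set's items. The `Item` fields and the `page#anchor` link form below are assumptions made for this illustration only; the actual protocol and link format are specified in the technical report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    page_url: str   # the annotated page
    anchor: str     # hypothetical locator for the comment within the page
    created: float  # creation time, in seconds since the epoch
    author: str


def tour(items, since):
    """Return links to annotations created after `since`, newest first."""
    recent = [item for item in items if item.created > since]
    recent.sort(key=lambda item: item.created, reverse=True)
    return [f"{item.page_url}#{item.anchor}" for item in recent]
```

For example, calling `tour(items, since=last_visit)` with the time of the user's previous session would yield only the comments added to that set in the meantime, across all annotated documents.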

2.4 Dynamically Synthesizing Documents: The Merge Library

The merge library is the set of procedures that actually synthesize the document that is then rendered from the documents and annotations that are relevant in a given context. It contains procedures that take a document and a PRDM list of comment items, and return a document where the comments are in-lined and given an appropriate rendering. The procedures are specific for the content type of the document.

The method of attachment for HTML and plain text is currently based on string position trees (Patricia trees for positions; cf. [KNUTH]) applied to a canonical representation of the document. Each comment object has associated with it the highlighted text string as well as the position identifier string.

Position identifier strings are useful in this context because they are by definition the smallest internal identifying string, and therefore are likely to be robust against modifications of the underlying document. They also allow for changes to be detected. Beyond the minimal string, there is some redundantly stored information to give a context for reattachment in case the document was modified and the minimal string itself does not uniquely identify a position any more.

If the position cannot be recovered, the browser appends the corresponding item marked as "Unassigned" to the end of the document.
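The idea of a minimal identifying string and of reattachment after a change can be illustrated with a brute-force sketch. Note that this deliberately swaps in plain substring search for the Patricia-tree structure mentioned above, and the function names and the form of the stored context are our own assumptions. The context is assumed to be a longer string that starts at the same attachment point.

```python
def is_unique(text, s):
    """True if s occurs exactly once in text (overlapping matches count)."""
    first = text.find(s)
    return first != -1 and first == text.rfind(s)


def position_identifier(text, pos):
    """Shortest substring starting at pos that occurs only once in text."""
    for end in range(pos + 1, len(text) + 1):
        if is_unique(text, text[pos:end]):
            return text[pos:end]
    return None  # even the full suffix is repeated elsewhere


def relocate(text, identifier, context):
    """Find the attachment offset in a possibly modified text.

    Tries the minimal identifier first, then the longer redundant context.
    Returns None when the item would have to be shown as "Unassigned".
    """
    if is_unique(text, identifier):
        return text.find(identifier)
    if is_unique(text, context):
        return text.find(context)
    return None
```

In the sentence "The cat sat on the mat. The dog sat on the log.", the identifier for the second "sat" is "sat on the l": it still locates the position after text is inserted before it, and the stored context "sat on the log" resolves the ambiguity if a later edit adds "sat on the lawn".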

For a flexible system with maximum interactivity and generality, it is quite useful to have incremental insertion mechanisms; that is, not all relevant meta-information pertaining to a document in a given context has to be known at the outset. For example, a user may activate a new set, one set may be returned by a slower server, or another server may already have pre-merged the meta-information from some of the sets while the browser merges the rest. In each case we want to be able to merge in the additional meta-information from the relevant sets incrementally. Note that the requirement of incremental insertion constrains what the merge algorithm can look like. For example, global position identifiers such as simple position counts would not work in a straightforward way, since they are affected by previous insertions. This raises the need for a canonical representation of a text in a given format which embodies all the features that are significant for attachment, while still allowing text transformations that leave the canonical form unchanged. (For example, inserting an HTML comment into an HTML file should preferably not affect the position identifiers.)
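A small sketch can make the role of the canonical form concrete. It rests on strong simplifying assumptions: the canonical form here is just the raw text with HTML comments removed, and inserted markers are themselves written as HTML comments (a stand-in for the in-lined icons), so that earlier insertions are invisible to later lookups. The function names and marker syntax are hypothetical.

```python
import re

COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def canonical_with_map(raw):
    """Return the canonical text and a map from canonical to raw offsets."""
    chars, index_map = [], []
    last = 0
    for match in COMMENT.finditer(raw):
        for i in range(last, match.start()):
            chars.append(raw[i])
            index_map.append(i)
        last = match.end()  # skip the comment entirely
    for i in range(last, len(raw)):
        chars.append(raw[i])
        index_map.append(i)
    return "".join(chars), index_map


def insert_marker(raw, identifier, marker_id):
    """Insert a marker in front of the unique occurrence of identifier.

    The lookup runs on the canonical text, so markers inserted earlier
    do not shift or hide the position of later ones.
    """
    canon, index_map = canonical_with_map(raw)
    at = canon.find(identifier)
    if at == -1 or at != canon.rfind(identifier):
        return raw + f"<!--unassigned:{marker_id}-->"
    raw_at = index_map[at]
    return raw[:raw_at] + f"<!--ann:{marker_id}-->" + raw[raw_at:]
```

With this scheme, merging two sets in either order produces the same document, which a scheme based on character counts into the already-modified text would not guarantee.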

Procedures for other content types, especially images (and possibly external viewers), have not yet been examined. If the type already supports a concept of attached annotations (e.g., PDF in the Acrobat viewer), then we could make use of that directly in identifying the position of annotations which are stored on a group server.

3 Client-Server Interaction: Sample Scenarios

In this section, we give two sample scenarios of how browser and server interact. For a technical specification of the client-server protocol, please refer to the appendix in [CSDTR95].

3.1 Retrieving Document and Annotations

A document and the annotations of the activated annotation sets are retrieved concurrently from document server and annotation servers, respectively.

Fetching a document

[Load URL into browser]

Retrieving the base document is the standard document browsing ("GET") interaction.

Requesting Annotations

[Request annotations from annotation server]

Whenever Joe accesses a new page, the browser sends out a request for annotations related to the URL just loaded to the annotation server; this request includes:

Server Returns Annotations

[Get annotations from server]

The server sends back a string of meta-information which the client uses to merge with the original document. The document is then rendered with the annotations inside it.

Combine Annotations with Document

[Merge annotations with text]

The merging of the document and the annotations is done by a set of procedures based on redundant information and minimal length tree descriptors which uniquely identify the position in a document and which are designed to be maximally robust against change of the underlying document. (See also the section on the "merge library" for synthesizing documents from other documents and accompanying meta information.)
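The four steps of this scenario can be summarized in a short sketch. It is an illustration under stated assumptions, not the browser's actual code: the fetch and merge functions are passed in as hypothetical callables (so no particular request format is implied), and a thread pool stands in for whatever concurrency mechanism the browser uses.

```python
from concurrent.futures import ThreadPoolExecutor


def load_page(url, fetch_document, annotation_fetchers, merge):
    """Fetch a document and its annotations concurrently, then merge them.

    fetch_document(url) returns the base document; each callable in
    annotation_fetchers(url) returns a list of meta-information items
    from one annotation server; merge(document, items) synthesizes the
    document that is finally rendered.
    """
    with ThreadPoolExecutor() as pool:
        doc_future = pool.submit(fetch_document, url)
        ann_futures = [pool.submit(fetch, url) for fetch in annotation_fetchers]
        document = doc_future.result()
        items = []
        for future in ann_futures:
            items.extend(future.result())
    return merge(document, items)
```

Passing the merge step as a parameter mirrors the design choice discussed in Section 2.1: the same flow applies whether document synthesis happens inside the browser or in a separate proxy.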

3.2 Joining a Group

Before we can read comments in a set, we must belong to a group with access to that annotation set.

Fetching group meta-information

[Group information sent from annotation server]

Setting up an annotation server requires augmenting a conventional http server with the appropriate scripts. The groups which are available at such a server are described in conventional HTML pages which are augmented by meta-information as to who is the owner, whether the group is public, and other properties. These pages can be browsed in the usual way. We have set up an initial "What's New"-type list which currently contains only our own servers, but others are likely to follow, and people can use such lists to browse for groups they want to join in the same way they browse for other documents (which also means that such information defines the interface for search "agents" looking for new groups of relevance).

Group: User Interface

[JOIN REQ Sent to annotation server]

The annotation server sends back a page containing meta-information about the group. The browser extracts this information, displays the page, and indicates in its user interface that the current group is eligible for a request for membership.

Example: The Join Group button is made active.

A new user 'Joe' checks out the server and discovers a few interesting annotation sets there. Unfortunately, they are all closed to non-members, so Joe applies to join a group named Outsiders which is described as:

This group is intended for interested parties outside the lab. It will allow you to access the following annotation sets and contribute to the discussion of our research by adding your own annotations. The documents of interest tend to reside on this server. Check each annotation set for details.

Join-Group Request and Approval

[JOIN REQ Sent to annotation server]

The group administrator receives e-mail notification of the request and approves the request by sending a reply back to the annotation server's mail gateway. In other words, joining a group is much like joining a mailing list. However, a group is merely an access authorization unit. Groups have nothing to do with the actual creation of annotations; they are dealt with fairly infrequently (like setting up a new UNIX account for someone).

Adding an Annotation Set

[PRDMitemSet sent from annotation server]

After receiving an e-mail notification of the approval, Joe decides to go back to the annotation server and select a few annotation sets. On viewing a page describing an annotation set, his browser detects the embedded meta-information and enables the "Add Annotation Set" button in the user interface.

Joe adds the annotation set "HTML Standardization" to his list of subscribed sets. His browser pulls out the embedded meta-information and adds the information to the list of stored annotation sets. The meta-information includes the set's name and location as well as some caching data such as which documents have been annotated.

Joe then activates the set. Annotations for all active sets are always requested from annotation servers by the browser--conceptually for every browsed document (but a caching mechanism by which only those document servers are queried which are known to be annotated leads to an efficient implementation).
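The caching idea mentioned above can be sketched as a filter applied before any annotation request is sent. The `Subscription` class, its fields, and the example host names are hypothetical; the sketch assumes that the cached data for each subscribed set is simply the list of document-server hosts known to carry annotated pages.

```python
from urllib.parse import urlsplit


class Subscription:
    """A subscribed annotation set as stored in the user's profile."""

    def __init__(self, name, server, annotated_hosts):
        self.name = name
        self.server = server                         # annotation server for the set
        self.annotated_hosts = set(annotated_hosts)  # cached: hosts with annotations
        self.active = False


def servers_to_query(subscriptions, page_url):
    """Return the annotation servers worth asking about page_url."""
    host = urlsplit(page_url).hostname
    return [
        sub.server
        for sub in subscriptions
        if sub.active and host in sub.annotated_hosts
    ]
```

Pages on hosts that no active set has annotated then cost no extra requests, while the conceptual model of "ask for every browsed document" is preserved.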

4 Usage Scenarios

In this section, we present some usage scenarios which follow naturally from the described architecture.

Shared WWW Comments

This is the basic annotation scenario described above with annotations organized into sets accessed by declared groups. It is possible to share comments with undeclared users by declaring the pseudo-person anyone to have read access to the annotation set.

A collection of annotations is usually rendered in-line as small icons. Each icon is a link to a dynamically generated HTML page containing the annotation text. Such a page can then be annotated recursively, or a follow-up comment can be made (which is usually rendered linearly), providing a threaded stream of comments.

Note that the actual visual presentation of comments is independent of the meta information transmitted; different browser designers can experiment with different renderings without affecting the underlying functionality.

A useful feature when using this sort of set is the ability to query the set for recent additions. If one is interested only in annotations which are less than a day old, or written by other members of a particular group, then the meta information server can perform these selections as queries over the meta information database, and make this information available as links to the relevant locations. This makes it easy to track new annotations to a given document or a particularly interesting discussion.

Personal Annotations

Anyone can create an annotation set which does not give access to anyone else except oneself. This is the case of personal annotations stored on a server. By storing annotations remotely (instead of in the local file system), the same generic searching and cataloguing functions as on any other set are available.

Annotation sets can replace the notion of a hotlist: the user simply marks interesting pages with respect to a particular set as she visits them. Marking a page adds a piece of meta-information to the chosen set. A straightforward extension is the use of multiple hotlists, like one for each topic: different sets. Such hotlists would then be retrieved by asking the server for a list of all documents in the set. An additional benefit is that the hotlist can be shared by several clients simultaneously.

Another use of personal sets is the possibility of creating very informal shared sets that exist outside the formal group structure. In other words, if someone wants to share information with his office mate, it is possible to create a set to which both have read and write access.

Landmarks, Tours, and Trails

In order to navigate a large distributed web of documents, it is useful to have "landmarks", that is, places which people are familiar with and which they choose as reference points. The annotation mechanism can be used to generalize existing concepts used on the WWW such as the notion of "hotlists". Leaving a mark on interesting documents and then querying the annotation server for a list of these annotations in fact provides a "hotlist" which can be shared. Because this is embedded in a generic mechanism (with all other searching and cataloguing functionality available) and is more dynamic (for instance, "most recent" queries are possible which give only the recently modified part of a structure), it is even more useful. Landmarks are particularly useful in combination with a tour mechanism which we have implemented: taken together, they realize a form of collaborative filtering.

The tour mechanism is currently implemented simply as a (specifically typed) list of links for each comment thread and a two-pane browsing interface such that the tour context is always maintained in one view and the tour focus is rendered in the other view (e.g. the annotated document with the comment included). Each such tour link will bring the user to where the comment in the annotated page is located. This greatly extends the value of the annotations, since it enables users to find annotations that are distributed over a number of documents without needing to check all the pages of those documents for changes. We are investigating the advantages of more sophisticated graphical visualizations, such as maps.

Trails can be applied to implement multiple guided tours through the same content without confusing the users. For illustration, consider a set of paintings and their descriptions such as can be found in the "Web Louvre". It is now possible to create annotation sets "impressionist painters", "chronological tour", and others, and add trail marks (a special annotation type) to each set which form a linked tour according to the set semantics. Then, if people wish to look at the paintings in chronological order, they can activate the corresponding set and follow along a tour by clicking on the trail mark icons. If they wish a different tour, they can turn on another set. In any case, they will not be confused by multiple signs on a given page, because the annotation architecture allows for underlying contents and connecting super-structures to be separated (in the static representation) and dynamically synthesized together based on a chosen user context.
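The separation of trails from content can be illustrated with a minimal sketch. The `TrailMark` structure, the set names, and the page URLs are hypothetical; the sketch assumes each trail mark simply records the page it sits on and the next page of the tour in its set.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TrailMark:
    set_name: str            # annotation set the mark belongs to
    page_url: str            # page on which the mark is shown
    next_url: Optional[str]  # next stop on this set's tour, if any


def visible_marks(marks, active_sets, page_url):
    """Marks to render on a page, given the sets the user has activated."""
    return [
        mark for mark in marks
        if mark.set_name in active_sets and mark.page_url == page_url
    ]


def follow(marks, set_name, start_url):
    """Walk one set's trail from start_url, stopping at the end or a loop."""
    next_of = {m.page_url: m.next_url for m in marks if m.set_name == set_name}
    tour, seen, url = [], set(), start_url
    while url is not None and url not in seen:
        tour.append(url)
        seen.add(url)
        url = next_of.get(url)
    return tour
```

Two tours can pass through the same page in different orders, yet only the marks of the activated set are shown there.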

Seals of Approval (SOAPs)

A seal of approval (SOAP) is an idea which originated in the Interpedia Project [IP93] and was later assimilated into the URC draft [DM94]. A seal of approval is meta-information containing a rating describing another document. In our decentralized model, there may be several SOAP authorities (that is, servers containing SOAPs), and several dimensions along which ratings are made. The semantics of the various dimensions can also be described using meta-information.

As a hypothetical example, consider the case that someone has a high opinion of the quality ratings which a certain academy issues about documents, and wants to be advised by these ratings while browsing for documents. Such an academy could create a Guidance SOAP; this is just an annotation set where the access authorizations are chosen such that anyone can read it, but only fellows of the academy can write new comments. Interested users can then enable the academy's annotation set, and when they browse documents, they can quickly get an idea of how valuable it is likely to be to spend time on a given document by glancing at the seals. Note that with the tour mechanism it is moreover possible to look explicitly at those documents for which a very positive rating exists.

SOAPs can be implemented in the given system in a straight-forward way; they are just annotations whose content follows some category system. Browsers can then exploit this systematicity by associating some action model to the various categories; such actions can range from popup hints to not retrieving a particular page.
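An action model of this kind can be sketched as a mapping from rating categories to browser actions. The category labels, the three action names, and the policy of letting the most restrictive action win are all assumptions made for this illustration; the paper does not prescribe a particular category system.

```python
# Actions ordered from least to most restrictive (an assumed ordering).
SEVERITY = ["show", "hint", "skip"]


def advise(ratings, action_model, default="show"):
    """Choose a browser action for a page from its SOAP category labels.

    ratings is the list of category labels attached to the page by the
    SOAP sets the user has enabled; action_model maps a label to one of
    the actions in SEVERITY. Unknown labels fall back to the default,
    and the most restrictive applicable action is returned.
    """
    chosen = default
    for label in ratings:
        action = action_model.get(label, default)
        if SEVERITY.index(action) > SEVERITY.index(chosen):
            chosen = action
    return chosen
```

Here "hint" would correspond to a popup hint and "skip" to not retrieving the page at all, the two ends of the range of actions mentioned above.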

5 Future Work

We are currently testing the various usages as part of our internal project communication in the context of the Stanford Integrated Digital Library Project [DL94]. That project is developing a general architecture, called an InfoBus, for integrating "library services" of all kinds for the production, dissemination, maintenance, search and access of information objects. The kinds of services described in the above scenarios are among the services that would form the overall capacities of a digital library. The InfoBus architecture is based on a CORBA object model, and our implementations will be modified to work within that model. We are gathering experimental data to be able to evaluate the design of specific annotation-based services, in simple pilot applications used by participants in the digital libraries project. These include extensions to materials that are not in HTML (e.g., Adobe Acrobat), and to using the mechanisms as a basis for "virtual place" in on-line communities.

Members of the Project on People, Computers, and Design at Stanford University, in particular Michelle Baldonado and Steve Cousins, have provided valuable feedback throughout the design and development of the system.


Vannevar Bush (1945). As we may think. The Atlantic Monthly, July. URL:
Martin Röscheisen, Christian Mogensen, and Terry Winograd (1995). Interaction Design for Shared WWW Comments. Short Paper, CHI95.
DEC, HP, HyperDesk, NCR, ObjectDesign, and SunSoft (1993). The Common Object Request Broker: Architecture and Specification. OMG Document Number 93.xx.yy. December 1993.
Martin Röscheisen, Christian Mogensen, and Terry Winograd (1995). Beyond Browsing. Forthcoming Technical Report. [Extended version of this paper. Technical appendices.] Computer Science Department, Stanford University. See
Hector Garcia-Molina, Yoav Shoham, and Terry Winograd (1994). Stanford Integrated Digital Library Project. Computer Science Department, Stanford University. (NSF/ARPA/NASA proposal). URL:
Ron Daniel, and Michael Mealling (1994). URC Scenarios and Requirements. Draft, Internet Engineering Task Force, November 21.
Daniel LaLiberte (1994). HyperNews. URL:
David R. Woolley (1994). Conferencing on the Web. URL:
Interpedia Project (1993). URL: news:comp.infosystems.interpedia.
Jim Davis (1994). CONOTE: small group annotation experiment. Jim Davis and Dan Huttenlocher. URL:
Francis Heylighen (1994). The Principia Cybernetica Web. URL:
Gramlich (1994). Public annotation systems. URL:
Knuth, D. (1973). The Art of Computer Programming. Vol. 3. Addison-Wesley.
Nathaniel Borenstein, and N. Freed (1993). Multipurpose Internet Mail Extensions. Draft, Internet Engineering Task Force.
Mosaic Design Team (Sept 1993). Group Annotations in NCSA Mosaic. URL:
John Mallery (1995). Openmeet. URL:
Martin Röscheisen, Christian Mogensen, and Terry Winograd (1995). Slides for Presentation. Third International WWW Conference, Darmstadt, Germany. URL:
Virtual Places Architecture, the Doors Server, and the Sesame Navigator. URL: