Martin Röscheisen, Christian Mogensen, and Terry Winograd
Computer Science Department
Stanford University, Stanford, CA 94305, U.S.A.
{rmr,mogens,winograd}@cs.stanford.edu
Shared annotations associated with networked information resources allow people to communicate about what they see and read. Potential applications include workgroup interaction, newsgroup-like fora, and personal information management. Shared annotations can also provide a platform for reviewed information such as Consumer Reports attached to product descriptions, "Seals of Approval" (SOAPs), or filters that help people make sense of whatever information is presented to them. A general annotation facility would enable people to annotate arbitrary documents in place at any position, to share comments and pointers with other people (either publicly or privately), and to create shared "landmark" reference points in the information space.
Previous systems include the annotations facility in Lotus Notes (which requires making available "hooks" for annotation attachment in a given document), annotations in Acrobat (which are in-place but not shared), and various other in-place facilities which rely on a shared file system to allow multiple people to share comments. Examples of the latter include ForComment and some versions of MS Word; neither is incremental, that is, only one user can write comments at a given time and must then pass on this right. There is also a range of work on conferencing systems, which can be seen as commenting systems, usually with some dialogue model.
Since its beginning, the World-Wide Web (WWW) has been one of the obvious platforms where group commenting might flourish. However, apart from an early experiment in NCSA Mosaic with a group annotation facility [2] (which was neither in-place nor designed in a way that scaled with the volume of annotations), there are only personal annotations in most currently available browsers, and there is little indication that these are widely used.
In [1], we have described an architecture and the corresponding protocols for a generic group commenting system which has been prototyped for the World-Wide Web. The architecture is based on "annotation sets". Every annotation belongs to a particular set and annotates a particular page at some specific location. Every set is associated with a particular server ("annotation server") and has an identity (like a URL). The server holds the index (but not necessarily the contents) of all the annotations in the set, and will in general be distinct from the server which provides the annotated document. Access control is managed per annotation set; examples include private, workgroup, or public annotations.
In this paper, we survey some of the interaction design issues encountered, in particular the visual rendering of annotations and the interaction models.
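The annotation-set model can be illustrated with a minimal sketch. This is not the protocol or data format of [1]; the class names, the fields, the use of a character offset as the "location", and the reader-list form of access control are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass(frozen=True)
class AnnotationEntry:
    """One index entry held by the annotation server (hypothetical layout)."""
    document_url: str  # the annotated page, usually served elsewhere
    position: int      # assumed: character offset of the annotation point
    author: str
    content_url: str   # the contents need not live on the annotation server


@dataclass
class AnnotationSet:
    """An annotation set, identified by a URL, with per-set access control."""
    set_url: str
    readers: Optional[Set[str]] = None  # assumed convention: None means public
    index: List[AnnotationEntry] = field(default_factory=list)

    def can_read(self, user: str) -> bool:
        return self.readers is None or user in self.readers

    def add(self, entry: AnnotationEntry) -> None:
        self.index.append(entry)

    def lookup(self, user: str, document_url: str) -> List[AnnotationEntry]:
        """Return the entries for one page, in document order, if permitted."""
        if not self.can_read(user):
            return []
        hits = [e for e in self.index if e.document_url == document_url]
        return sorted(hits, key=lambda e: e.position)
```

In this sketch a browser would call `lookup` once for each annotation set the user subscribes to and merge the results before rendering the page.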
Annotations can be indicated in the interface in a number of ways, including marginal markings (as in LaTeX), format-marking of annotated text (as in WWW browsers with underlined anchors), in-line presentation of the annotation text, and in-place annotation indicators. We are experimenting with several of these. The current browser uses in-place markers, with character-size in-lined images marking annotation points, and optional highlighting of the annotated text element. The choice of marker image can be user-determined, depending on the perspective a user wants to gain on the ensemble of annotations: if interested in the author, the image can show the author's face or individual icon; if the identity of the group sharing the annotations is more central, then the images can be chosen to contain a small icon representing the annotation set. Note that although the images are sometimes only 16x22 pixels, informal experiments confirmed sufficient discriminability and identifiability.
The annotation icons themselves are "hot" links: if clicked with the left mouse button (which is the Mosaic convention), a full document view of the comment is displayed. This view may contain other images of the author (in larger size) as well as further links, for instance, to longer elaborations; it may also contain further annotations or follow-up comments (see below).
Annotations can also be examined with a lightweight viewer which pops up a small window (which looks like a PostIt) when the icon is selected with the middle button and removes it when the button is released. This tool is a generic meta viewer which can be used in a variety of contexts to get "preview" information faster than is possible with a full document-view window. It is useful as a general interface augmentation for Mosaic browsers, for example when deciding whether or not to follow a hyperlink which might lead to a document that is expensive to retrieve.
We have identified a number of different interaction models, each with its own specific structure and corresponding interface affordances:
In the initial browser design, the interface reflected the generality of the annotation mechanism in that it was possible to annotate basically anything anywhere in any order. This turned out to be confusing for the most common usages. For example, there are many conceivable ways in which one could reply to an annotation made by someone else, but if there is no structural guidance for a comment of type "reply", the result will easily become confusing for subsequent readers.
This led us to rephrase the problem of "annotating a document" as that of "commenting to someone" (constrained by certain discourse structures). To support one common usage, we restricted the affordances which the interface primarily suggests to a structure with basically two phases in a specific annotation dialogue. The first comment (type: annotation) pinpoints a particular segment in some document which someone considers worth a comment. This annotation then opens a discourse thread of type dialogue (enabling affordances for reply, follow-up, etc.). In other words, all future comments on this point are treated as follow-ups and are rendered in the order of their submission, giving the functionality of newsgroup-like discussions. We are considering extending the current implementation with communication features more in the spirit of MUDs, in which commenting can be done synchronously among on-line users.
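The two-phase dialogue structure can be sketched as follows. This is an illustrative model only, not the prototype's implementation; the names `Comment`, `Thread`, `open_thread`, and `reply` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Comment:
    author: str
    text: str
    kind: str  # "annotation" for the opening comment, "follow-up" afterwards


@dataclass
class Thread:
    document_url: str
    segment: str  # the document segment the opening annotation pinpoints
    comments: List[Comment] = field(default_factory=list)


def open_thread(document_url: str, segment: str, author: str, text: str) -> Thread:
    """Phase one: an annotation pinpoints a segment and opens a thread."""
    thread = Thread(document_url, segment)
    thread.comments.append(Comment(author, text, "annotation"))
    return thread


def reply(thread: Thread, author: str, text: str) -> Comment:
    """Phase two: every later comment on this point is a follow-up."""
    comment = Comment(author, text, "follow-up")
    thread.comments.append(comment)  # list order is submission order
    return comment
```

Because follow-ups can only be appended, rendering `thread.comments` in list order reproduces the sequence of submission described above.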
In order to navigate a large distributed web of documents, it is useful to have "landmarks", that is, places which people are familiar with and which they choose as reference points. The annotation mechanism can be used to generalize existing concepts used on the WWW such as the notion of "hotlists". Leaving a mark on interesting documents and then querying the annotation server for a list of these annotations in effect provides a "hotlist" which can be shared. It is even more useful because it is embedded in a generic mechanism (with all other searching and cataloguing functionality available) and because it is more dynamic (for instance, "most recent" queries are possible which return only the recently modified part of a structure). Landmarks are particularly useful in combination with a tour mechanism which we have implemented.
One of the problems with comments about distributed documents is that they disappear from a user's world unless the annotated document happens to be looked at. The overall design in [1] allows us to provide users with "tour" facilities which guide them through all annotations which fulfill certain filtering properties. One obvious use is "What's Up"-type queries: at a given time, a user might want to know what has happened since the last time they looked. Another example is a tour through all of the comments which are replies to a particular group member's original comments.
The tour mechanism is currently implemented simply as a (specifically typed) list of links for each comment thread and a two-pane browsing interface such that the tour context is always maintained in one view and the tour focus is rendered in the other view (e.g. the annotated document with the comment included). Each such tour link brings the user to where the comment is located in the annotated page. This greatly extends the value of the annotations, since it enables users to find annotations that are distributed over a number of documents without needing to check all the pages of those documents for changes. We are investigating the advantages of more sophisticated graphical visualizations, such as maps.
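A tour as a filtered list of links can be sketched as below. The record layout, the integer timestamps, the `#id` link format, and the two filter helpers are assumptions for illustration; the prototype's typed link lists are not specified here.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass(frozen=True)
class Note:
    note_id: str
    document_url: str
    author: str
    submitted: int                  # assumed: seconds since some epoch
    reply_to: Optional[str] = None  # id of the note this one replies to


Filter = Callable[[Note], bool]


def build_tour(notes: Sequence[Note], keep: Filter) -> List[str]:
    """Return one link per matching note, oldest first.

    The fragment-style link is a hypothetical way of addressing the
    place in the annotated page where the comment is located.
    """
    selected = sorted((n for n in notes if keep(n)), key=lambda n: n.submitted)
    return [f"{n.document_url}#{n.note_id}" for n in selected]


def whats_up(last_visit: int) -> Filter:
    """A "What's Up" filter: notes submitted after the user's last visit."""
    return lambda n: n.submitted > last_visit


def replies_to(author: str, notes: Sequence[Note]) -> Filter:
    """Filter for replies to one group member's comments."""
    own_ids = {n.note_id for n in notes if n.author == author}
    return lambda n: n.reply_to in own_ids
```

In a two-pane interface of the kind described, the returned list would populate the tour-context view, and following a link would load the tour focus in the other view.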
The same underlying mechanism used to store and provide annotations is used to maintain individual user profiles [1]. This allows all of the user-specific information to be stored remotely in a profile, so that people can use browsers on different machines with distinct file systems and load their profile from a server. This profile is an extensible model of the user's context which also includes all the information necessary to hide authentication processes from the user.
The design gives people a presence in the virtual document space: users can make themselves present at certain document locations (e.g. by putting their face icon at the top of the page), and with user profiles stored on a persistent server "base station", they can also be contacted there at any time. This provides a foundation for experiments with on-line communities.
Augmenting widely used browsers with facilities that allow people not only to read documents but also to communicate about them with others opens up the possibility of a uniform interface to a variety of navigational, retrieval, and communication tasks.