CS 377 Final

14. Dec. 1994

Christian mogensen@cs.stanford.edu http://www-pcd.stanford.edu/mogens/377/final.html

Disclaimer: Feel free to copy this assignment and share it with others.

1: Course Organization
On the whole I quite liked the way the material was organized. I would like to suggest a few changes too:

Move the memory papers (Lindsay and Norman; Searleman and Herrmann; Cohen) in the main reader to just after the Wright and Lickorish paper Remembering while mousing.
This should put the low-level cognitive processes together in one place. First the cognitive model human processor, then GOMS analysis, followed by the three memory papers, finally the cognitive costs of mouse clicks paper ties these ideas together.
Put the Xerox memory prosthesis paper (or an excerpt from it) into the reader after the memory papers, as an example of applied theory. i.e. here is a principle of cognitive science that has been exploited and explicitly used in the development of a device.
Move the second Robson paper on analysing qualitative data to after the first one. This has the advantage of lumping the two horrors together - get the eyesores out of the way. I wonder if it would be possible to use summaries of the two papers instead? The papers need reorganizing and a new layout.
Placing these two together does pour a lot of cold water on the enthusiasm for qualitative ethnographic analysis that is built up in the initial papers on user studies. Hence, it might be worth interspersing them, or placing the qualitative analysis paper before all of the ethnographic study papers.
Of course, this reorganization pushes the larger projects into the latter half of the quarter, which may be a problem - it's a no-win situation.
The topics covered pretty much what I expected. If I could add one topic it would have to be usability studies, which don't seem to fit in anywhere else in the curriculum. You can't take a course on 'testing' it seems. So usability and other methodical testing methods get short shrift in academia.

2: Course Reader
The course reader was very well organized. The most interesting paper we read was probably the Twinkling Lights and Nested Loops paper by Nardi and Miller. The least appropriate paper was Storyboard Prototyping by Madsen and Aiken.
Twinkling Lights and Nested Loops
The paper gives you a taste of what ethnographic studies can achieve. Reading this after Blomberg's paper on ethnographic field methods gives a real idea of the result that the Ethnographic field methods produces. It is looking at the process from a different angle.
Blomberg walks through the process of setting up and doing a study, after which you end up with a collection of results. Robson's chapter six from Real world research will tell you what to look out for as you analyze the results, but he does not give too many examples of what the results would actually look like.
By demonstrating what ethnographic results look like, and illuminating the process by which the results were obtained, the Twinkling Lights paper makes ethnography seem a real possibility for studying users. The paper seems to say 'look - this wasn't so hard'.
Storyboard Prototyping
This paper is a relative waste of time - the Gould, Boies and Lewis paper Making usable, useful ... applications is much better at covering roughly the same topic. They spend several pages rehashing basics from the Hypercard manual and basic prototyping ideas. Granted, it is a CACM paper, but there must be something more significant to talk about than the fact that hypercard is a prototyping tool.
The one good idea in the paper is the use of playback of a user's interaction in debriefing a user after a test. Unfortunately there is no analysis or discussion about the implications of this technique, such as the possible post-facto rationalizations by the user.
The lack of quantitative data is not encouraging either, there appear to be no results worth mentioning - nor is there any information on the user sample used. College students are hardly a representative sample if what you wanted to test is the usability of a VCR interface.
The Gould et al. paper is a much better analysis of the same topic that Madsen and Aiken are trying to cover. They cover much of the same ground, and several other topics. The idea of high-level interface components is trated much more sensibly and explained much more clearly. Implementation is not just some Hypercard hack, but in a larger structure of ideas about user interface components and how they might be used. Gould et al. have thought through the consequences of their ideas - it's hard to say the same for Madsen and Aiken.
Other comments
I really enjoyed Making Usable, Useful...Applications by Gould, Boies and Lewis. An interesting paper which covers lots of ground.
I did not like: the Robson papers - they are verbose and poorly laid out, making for a very hard read of what should be simple material. The first Robson paper had something interesting to say - the second was less worthwhile.

3: New Paper Sketch
The reader needs a paper to tie the psychological principles together with the user interfaces - i.e. how can these tools be applied to user interfaces today. So the topic is applying cognitive principles to user interfaces we see today. This fits in with the last section, where we take a step back and look at the main course in context of other perspectives.
The paper is applying cognitive principles to well known iterfaces, analysing their flaws and suggesting alternatives. The inteded audience is well-read software developers: non-psychology readers. This would fit as an article in a software journal, such as Dr.Dobbs or Software Development or it could be a chapter in a larger look at HCI and cognitive psychology. For example: Shneiderman's introductory book on user interfaces "Designing the User Interface" could use this chapter.
Ending the reader on this note would form a nice closure when set against Don Norman's chapter from the Psychology of Everyday Things. Norman's paper looks at the psychology behind everyday things. This paper looks at the psychology behind human-computer interfaces.
Topics to be covered include:

Visual affordances in icons and buttons. Why does the user click on that arrow all the time? It's just pointing to the menu... Visual differentiation, foreground versus background.
Focus of attention, or, why the hourglass cursor is a good idea... The need for feedback where it is immediately visible and the idea that humans tend to focus at a particular point at a time.
Icon bars in word processors - how are they different from using the pull-down menus? A case of applied GOMS analysis. Pull downs require an extra step. Also note the need to search a menu for the correct item. Vice versa: note need to recognize an icon and map it to an action. "Scissor" implies "Cut".
Cut and paste - how could we improve it? Making things visible. What is that 'clipboard' they keep talking about? and why can't I see it?
The need for shortcuts - looking at the different needs of beginners and advanced users. Drool proof interfaces tend to frustrate experts.
Some case studies.
The last point can be quite interesting, since the paper could cover a lot of territory here. For example, it would be interesting to look at

Grandma and Me: a purely mouse-driven interface. How does the user know that clicking on a bush will do something? Color cues and subtle affordances for clicking.
Structured e-mail: the old Coordinator conversation-theory mail package by Winograd and Flores. Alternatively a modern variant on this using a forms based interface for standard mail requests.
Icon bar in MS-Word 6: why does Microsoft need mouse-dropping help balloons to explain what their icons mean?

4: Conceptual Design

4a: Concept
The concept I want to apply is one relating to the model cognitive processor and memory. The idea of moving processing of large amounts of data from a cognitive task to a perceptive task. The inspiration behind this was the NASA monitoring screens shown in Woods' paper on Behind Human Error.
The domain I want to apply this to is logfile analysis. A webserver generates a log detailing all of the file requests that have been made. Analysis of this log yields information about frequency of access, patterns of use and possible relationships between materials. Currently there exist a few summary tools for making sense of this data, but there is little in the way of advanced analysis tools.
I see the analyser presenting summary information graphically, rather than textually in tables. Tables are a cognitive burden for pattern recognition, while graphs can make pattern recognition simple. The trick then is to determine what sort of graphs are likely to be useful.
The tools should allow summaries of total number of accesses per time period and per file hierarchy. Combining these two gives us a 2 dimensional matrix of values to graph. Other features include:

The ability to compute an 'average' usage pattern, and graph 'significant' deviations from it, so that comparing values becomes perceptive. This should help catch errors quickly, before they become serious.

The ability to list particular kinds of accesses. Again, this is a case of embedding knowledge in artifacts. The kinds of accesses that are considered 'interesting' are embedded in a filter, rather than being remembered by the user.

Similarly, we want to filter out uninteresting accesses - i.e. fetching a page and all its related elements is uniteresting, while fetching a GIF image without fetching a page beforehand is more significant.

4b: Investigation and Interviews
The investigation method is to look at existing praxis and question users about their current uses of logs and what they currently look for in logs. We can try to make it easier to find those things they look for now.
Interviews were based around a set of questions to determine what they use logs for now. The questions are based on current significant areas of activity - i.e. the existing praxis of webmasters. I am a part time webmaster, so my questions are based on my inside understanding of common praxis, gathered through communication with other webmasters on mailing lists and personal experience.
Questions
Comments on the questions are in italics.

What sort of information do you try to get out of the logs?
Determine existing praxis.
How often do you check the logs and how much activity is there?
Is analysis ad-hoc or done regularly?
What tools do you use to analyse the data?
Is there a tool that fits needs already in use?
Is there some special activty that you look for? Is there activity that is important or unusual?
Exceptions stick in our memory - application of some critical incident analysis
On the other hand: is there activity that is common and can safely be ignored? How is it different from "abnormal" activty?
Filtering background noise reduces load on user
If I were trying to break in to your server, how might I show up in the access log?
Provoking a response by using a specific exceptional case.
How do you think people use your server? Is there a particular route through your server?
Is there higher level information about the contents that we can use?

Results
The informants responded informally. The informants are two experienced web masters, one beginner whose web site has only been active for a month or so. Responses were quite varied to the questions - each webmaster had particular concerns about accesses to their site, depending on what was stored on it. One query was conducted by e-mail, the others were face to face interviews using note taking recording rather than tape.

The current pratice is to check for known, "interesting" accesses.
Logs are checked frequently (daily) for interesting accesses, less frequently for general checks and analysis, partly due to the time it takes for the analysis to run.
Very simple tools are used - calculate totals and averages. No tools at all is also common.
Special activty is vaguely defined - it is an internalized metric - the informants have not put the metric into a tool or macro. This special activity is part of what is considered "interesting".
Similarly common ignored activity is not encoded in any object, but there is a distinct notion of what is ignorable.
It would show up as accesses to non-existant URLs or password rejects. No-one worried about alerting the webmaster or counting these accesses.
The respondents are definitely aware of a few usage patterns, but have no idea how common they are or if they still persist.

The praxis for analysing the logs is quite informal - there are a few tools to help analyse the log, but they are quite low level, and hence are not seen as valuable tools.
Given the ad-hoc nature of log analysis, it appears that log analysis is not seen as a valuable monitoring tool for much beyond the crudest of data. People prefer to work with raw data in the logs and focus on particular fragments rather than letting a tool abstract the information to too high a level.
The proposed analysis tool would have to be quite quick, and would have a grephical or menu driven interface. One complaint about current analyser was that there were far too many obscure command line options needed for it to do any useful work.
4c: Relevant papers
Beyond the Interface: Encountering Artifacts in Use, L. Bannon and S. Bødker. In J.M. Carroll (Ed.), Designing interaction: Psychology at the human-computer interface. (pp. 227-253). New York: Cambridge University Press (1991)
This paper is very relevant to the research, since the focus of the study is to discover praxis and leverage it in a new tool. The paper makes two powerful suggestions:

Work can only be understood from within praxis. In other words, it is important to understand what it is that webmasters do before you can provide new tools for their use.
This means doing user studies before you can even start designing a system, and soliciting ideas from the user community.
Artifacts embody a community's praxis. Our tools are shaped by the work we do, and tools evolve to fit a new style of work. In the real world, this is clear: a hammer mediates pounding things. In the world of the webmaster, the tools are more abstract. The closest we come to shape or heft is a user-interface experience.
With a suitable user interface, the tool fits into the work the webmaster is doing. So if the webmaster tends to work from a command line, then the tool should fit into that paradigm. A tool that insisted on spawning a huge GUI before being usable would not fit that style of work. It would fit a Mac user's style of work.

Behind human error: Cognitive systems, computers and hindsight Woods, Johannesen, Cook and Sarter (draft, pp 87-102, 105-109), CSERIAC (1993)
The point that Woods is making is that inappropriate presentation of the information contributes to misunderstanding and clumsiness. The raw data in a system log is clearly an inappropriate presentation for searching for patterns in the data.
Another problem is that with the entire log, you can only see a tiny part of it at a time, making long range patterns of use difficult to see. This is an instance of the keyhole property that Woods mentions.
The Psychology of Everyday Things, D. Norman, Ch 1, New York: Basic Books (1988).
Simply because of Norman's ever-useful rules for designing interfaces still hold true: make things visible, make relationships evident.