Human Input Channels Affect Debugging
It is a well-known principle in learning theory that the more sensory channels a human can engage in a learning experience, the greater and faster the learning. This means that the cost of building a fairly slick GUI solely for the purpose of debugging can often pay off handsomely in overall productivity. But why stop at the visual channel?
With the ready availability of high-quality sound cards, it becomes possible to employ the auditory channel in the learning experience. The synergy achieved can create an effective bandwidth that exceeds the bandwidth of the individual sensory channels taken separately.
A study done at MIT, reported in the book The Eye-Voice Span [Levin79], shows that by getting both of these important human input channels involved, a 10:1 increase in learning performance is possible.
How does this relate to the act of debugging? By judiciously placing voice responses throughout the system, reporting events such as objects exchanging messages, thread activation or suspension, changes in variables, and the like, the software can communicate with the programmer as it executes. Concepts such as focus of attention and perceptual delay limit the rate at which the visual channel alone can respond. Even a simple event, such as noticing when a variable takes on a "suspicious" value, requires finding the variable on the screen, causing it to display if it is not already displayed, examining the displayed value, and reacting to it; all of this could be sped up tremendously if, at the breakpoint, the computer simply said "x is 10." So even at the lowest level of debugging in the traditional sense, the auditory channel allows a higher rate of interaction. But how can we move this technique to a higher architectural level, to help us view our software at the object level at which we designed it?
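As a concrete illustration, the "x is 10" scenario can be sketched with Python's sys.settrace: a watchpoint that "speaks" a variable's new value whenever it changes, instead of making the programmer hunt for it on screen. The speak function here is a stand-in (an assumption); a real version would hand the text to a text-to-speech engine rather than recording it.

```python
import sys

spoken = []

def speak(text):
    # Stand-in for a text-to-speech call; here we just record the utterance.
    spoken.append(text)

def make_tracer(varname):
    """Build a trace function that announces varname each time its value
    changes in the traced frame. Illustrative sketch, not a debugger API."""
    last = {}
    def tracer(frame, event, arg):
        if event == "line" and varname in frame.f_locals:
            value = frame.f_locals[varname]
            if last.get("value") != value:
                last["value"] = value
                speak(f"{varname} is {value}")
        return tracer
    return tracer

def demo():
    x = 1
    x = x + 9
    return x

sys.settrace(make_tracer("x"))   # arm the auditory watchpoint
result = demo()
sys.settrace(None)               # disarm it
```

With a real speech engine behind speak, the programmer would simply hear "x is 1" and then "x is 10" as the function executed, with no breakpoint and no visual search at all.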
As an example, let us treat threads of control as high-level objects. In many modern systems, the existence of multiple threads of execution makes it difficult to establish cognitive contact with the software's behavior. Often, this is because events such as thread activations occur far faster than humans can perceptually switch contexts. Setting breakpoints solely for the purpose of slowing these events to a human time scale introduces gratuitous complexity into the debugging task. In addition, we have to form a mental model of this behavior through careful analysis via a single perceptual channel.
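Thread activations and suspensions can be narrated in just this spirit. Below is a minimal sketch in Python: a Thread subclass (NarratedThread is a hypothetical name, not a real debugger feature) that announces its own start and finish through the same stand-in speak function.

```python
import threading

utterances = []

def speak(text):
    # Stand-in for text-to-speech output on the sound card.
    utterances.append(text)

class NarratedThread(threading.Thread):
    """A Thread subclass that announces its own activation and
    completion. Illustrative sketch only."""
    def run(self):
        speak(f"thread {self.name} starting")
        try:
            super().run()          # run the thread's normal target
        finally:
            speak(f"thread {self.name} finished")

def work():
    pass  # the real work of the thread would go here

t = NarratedThread(target=work, name="worker")
t.start()
t.join()
```

Heard aloud, the stream of "worker starting ... worker finished" announcements gives the programmer a running account of thread activity without a single breakpoint.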
If, instead, program execution were slowed to a human-perceptual time scale by the introduction of spoken descriptions of the behavior, we would most likely find that we could quickly do away with many of the actual "breakpoints" and more quickly form an accurate model of the dynamic behavior of the program. We could also spot anomalies. Imagine how useful it would be in debugging a packet protocol if we could hear the transactions spoken to us, including, perhaps, an occasional "Oops!" as a packet failed a checksum. A client-server system could be debugged by listening to the client on the left channel and the server on the right channel (because that's how we drew the architecture, with the client on the left!). While we can certainly reverse-engineer this behavior from a set of debug printouts appearing in two windows, or interlaced in a single window, that is often the equivalent of the octal memory dump of days gone by, rather than the interactive model we can now support.
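The left-channel/right-channel idea amounts to a routing table from architectural component to stereo position. The sketch below assumes a narrate function and a CHANNEL table (both hypothetical); a real implementation would synthesize speech and pan it to the chosen channel with a sound API.

```python
# Route each component's narration to the stereo channel that matches
# its position in the architecture diagram: client drawn on the left,
# server on the right.
CHANNEL = {"client": "left", "server": "right"}

log = []

def narrate(component, message):
    channel = CHANNEL.get(component, "center")
    # In practice: synthesize `message` as speech, panned to `channel`.
    log.append((channel, f"{component}: {message}"))

narrate("client", "sent request 17")
narrate("server", "packet 17 failed checksum, oops")
```

Because the spatial layout of the sound matches the spatial layout of the diagram, the programmer's ear can attribute each event to the right component without reading anything.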
The cognitive structures we deal with as humans are multilevel. The experienced software developer carries a complex, multilevel body of knowledge, both application-specific and general, in long-term memory. Some of that knowledge concerns general concepts important for programming but independent of any particular programming language or application. Some is language-specific, some is domain-specific, and all of it is integrated into the models we use to represent the application. As we build systems out of object components, we apply the same cognitive models to the construction of these systems; but when we debug, we often lose sight of the overall object models because we are forced to focus on minutiae.
With a combination of simple animation display and spoken output, high-level object-related concepts such as methods, message flow, and data transformation can be presented as an integrated whole. If this is the domain we are debugging in, the presentational model will exactly correspond to our mental model, and anomalies will be readily seen.
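Message flow at the object level can be narrated with a simple wrapper that announces each message send before forwarding it to the real receiver. Narrated and Account below are hypothetical names for illustration, and speak again stands in for a text-to-speech call.

```python
announcements = []

def speak(text):
    # Stand-in for spoken output; a real version would use TTS.
    announcements.append(text)

class Narrated:
    """Wrap any object so that each method call is announced at the
    object level before being forwarded. Illustrative sketch only."""
    def __init__(self, name, target):
        self._name = name
        self._target = target

    def __getattr__(self, attr):
        value = getattr(self._target, attr)
        if not callable(value):
            return value            # plain attributes pass through silently
        def forward(*args, **kwargs):
            speak(f"message {attr} to {self._name}")
            return value(*args, **kwargs)
        return forward

class Account:
    def __init__(self):
        self.balance = 0
    def deposit(self, amount):
        self.balance += amount
        return self.balance

acct = Narrated("account", Account())
acct.deposit(10)
```

Here the narration happens at exactly the granularity of the design model, messages between objects, so what the programmer hears corresponds directly to the diagrams drawn during design.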