
Interfacing with Computer Narratives: Literary Possibilities for Interactive Fiction

Part III
Methods of Interaction

6. Gesturing with the Mouse

While co-authoring a fiction can be an enjoyable activity, guessing the vocabulary and grammar-parsing ability of a computer is seldom as enjoyable in practice as the idea of co-authorship suggests. Most commercial interactive fictions today use a simpler mouse-driven interface.

These graphical interfaces fall into two basic categories. The first is a browsing interface for a hypertext or hypermedia document, such as the interactive fictions Afternoon and Myst. The second is a more complex interface for interactive fictions in which the interactor can manipulate his or her environment, either directly or through a character he or she controls, in a variety of ways that go beyond the type of discovery that a hypertext or hypermedia document allows.

The mouse-based graphical interface is ideal for hypertext, "composed of blocks of text ... and the links that join them" (Landow 4), and its multimedia cousin, hypermedia. A hypermedia document presents the interactor with a space of connected works: text fragments, images, animation, sounds. Since the interactor's role in this world is exploration, there is only one fundamental action, "following a link," which the binary language of the mouse needs to communicate. (Note 11) "Following a link" is the process of clicking on an "active," "hot," or "linked" area in one of a hypermedia document's media fragments and moving to another fragment that is connected to the current one. A mouse interface to a hypermedia document is uniform - even across different media - because a click always means "follow a link," whether the interactor clicks on an image, a running animation or a piece of text. It is consistent, since a gesture is always appropriate to indicate the direction in which to travel. Since the interactor has a fixed number of choices and can interact only at certain points (after the text fragment or other media fragment has been displayed), this interface is not continuous or highly interactive.
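The structure described here, fragments joined by links with "following a link" as the single action, can be pictured as a small graph. What follows is only a minimal sketch in Python, not the design of any actual hypertext system; the `Fragment` class, the `follow_link` function, and the three-fragment document are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Fragment:
    """One media fragment: text, image, animation, or sound."""
    name: str
    medium: str
    # Maps the label of a "hot" region to the name of the fragment it leads to.
    links: dict[str, str] = field(default_factory=dict)


def follow_link(document: dict[str, Fragment], current: str, region: str) -> str:
    """Return the name of the fragment reached by clicking `region`.

    A click outside any linked region leaves the interactor where they are.
    """
    return document[current].links.get(region, current)


# Hypothetical document; fragment and region names are illustrative only.
document = {
    "opening": Fragment("opening", "text", {"window": "garden"}),
    "garden": Fragment("garden", "image", {"gate": "street", "door": "opening"}),
    "street": Fragment("street", "animation"),
}

here = "opening"
for click in ["window", "gate"]:
    # The same action applies whatever the medium of the current fragment.
    here = follow_link(document, here, click)
print(here)
```

The sketch never consults the `medium` field when a click arrives, which is one way of seeing why such an interface is uniform across media.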

While not as difficult to input as a textual command, a gesture with a mouse is also not as articulate a means of communication. Well-suited to browsing through an environment, the mouse does not play as clear a role when the interactor's ability to interface with the fictional world goes beyond looking around. Cursor movement and a click are well-defined to the computer, but some interfaces leave the interactor to wonder about the semantics of a particular mouse click. When an interactor clicks on an "active" region of the screen, this click (unambiguous in a hypermedia document) usually means little more in other types of interactive fiction than "perform an action on this." The action could range from eating a food item to walking to a different location. Often, the interactor is surprised at the result of a mouse click - not because of a twist in the narrative, but because an unanticipated and unintended action resulted.

The Legend of Kyrandia presents a good example. This is an interactive fiction similar to those of the King's Quest series. The interactor controls a small animated adventurer who can move about on a stationary image, which is a picture of a three-dimensional area; if the character walks off the edge of the image, the screen will be refreshed with an image of the area into which he just wandered and he will appear on the opposite edge of this area.

In Kyrandia, interactors can select a particular object to use and then a region of the screen on which to use the object. The ability to select objects and "perform actions" with them allows a sharper definition of what action would be performed than would a simple arrow on the screen - "do something to this part of the environment using this object" versus simply "do something to this part of the environment." By selecting objects, the Kyrandia interactor is able to change the pointer into an image of the selected object and then "use" the object. Other interfaces that allow the interactor to select different pointers with different semantics include King's Quest VI, which offers various icons (e.g. an eye with which the interactor can examine things, a mouth with which the interactor can speak to other characters) from which the interactor can choose. After the interactor selects an icon, the pointer assumes that shape and the interactor can examine a specific object on the screen, talk to a specific person, etc., by pointing and clicking on the area of interest.

This addition of semantics does not clear up ambiguity, however, as the result of "using" the selected object in Kyrandia, for instance, varies widely from object to object and place to place. In one forest area, selecting a set of emeralds (in the inventory of the character) and then clicking on the ground causes the character to hide behind a bush and throw the emeralds one by one on the ground. As each emerald falls to the ground, a gnome runs out to retrieve it and the character attempts to catch the gnome. Finally, on the third try, the character succeeds in snaring the gnome, whom he then interrogates. This entire elaborate sequence of actions is supposedly communicated by the interactor to the computer by his or her simply (A) selecting an object and then (B) clicking on the ground. Yet at no other point in this interactive fiction can the interactor hide behind something and throw an object out onto the ground, no matter what object he or she selects or on how many areas he or she clicks.

Another way of adding semantics to the mouse pointer is used in Myst and The Seventh Guest. In these fictions, the pointer assumes a different shape depending upon the context in which it appears, that is, into what region of the screen it has been moved. In The Seventh Guest, for example, the environment is laced with puzzles (literal puzzles, like chess problems, scrambled phrases and maze games) which the interactor must solve in order to unlock the house's mysteries. When the interactor moves the pointer over an object that is such a puzzle, the pointer becomes a skull with an opened top and a visible, throbbing brain, indicating that a click will display the puzzle. If the pointer is moved to the edges of the screen or to a doorway, the pointer becomes a beckoning hand to indicate that a mouse click will take the interactor to a new area. The total uniformity of the fundamental hypermedia interface is lost, but the interactor is always aware of what a mouse click means because the pointer conveys this information. A wide variety of computer applications, including some hypertext systems, adjust the pointer's appearance in this way to indicate what a mouse click in a certain place means.

Typical screen from Myst

In 'Myst,' the pointer (lower middle) changes shape to show its different meanings.
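A context-sensitive pointer of this kind amounts to a lookup from screen position to pointer shape. The following is a minimal sketch in Python, assuming rectangular regions on a hypothetical 640 by 480 screen; the `Region` class, the `pointer_at` function, and all coordinates are invented for illustration and are not taken from Myst or The Seventh Guest.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A rectangular screen area and the pointer shape shown over it."""
    left: int
    top: int
    right: int
    bottom: int
    pointer: str

    def contains(self, x: int, y: int) -> bool:
        return self.left <= x < self.right and self.top <= y < self.bottom


def pointer_at(regions: list[Region], x: int, y: int, default: str = "arrow") -> str:
    """Return the pointer shape for the first region containing (x, y)."""
    for region in regions:
        if region.contains(x, y):
            return region.pointer
    return default


# Hypothetical layout; the coordinates are assumptions, not real game data.
regions = [
    Region(200, 150, 300, 250, "skull"),          # an object that is a puzzle
    Region(600, 0, 640, 480, "beckoning hand"),   # right edge of the screen
]

print(pointer_at(regions, 250, 200))
print(pointer_at(regions, 620, 100))
print(pointer_at(regions, 10, 10))
```

Because the shape is chosen before any click occurs, the interactor learns what a click will mean simply by moving the mouse.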

More complex types of mouse interfaces try to capture more of the semantic possibilities of a text parser. An extreme example is the "action interface" of Return to Zork. The pointer in this interactive fiction is context-sensitive. After the interactor selects an inventory item and an object with which to interact, the interface presents him or her with a diamond-shaped array of further choices, so the interactor can specify what particular interaction is desired, e.g. tie the things together, hit one with the other, insert one into the other. The action interface presents these options as a group of animated icons (e.g. a hand tying a rope, swinging an object, pushing an object into a slot); the meaning of each can be clarified by placing the pointer over the animation for a moment, eliciting the name of the action. Although all possible options are made explicit by this complex graphical interface, the array of vaguely gesturing hands that appears after each mouse click can leave the interactor just as bewildered as a series of uncomprehending messages from a text parser.
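The selection sequence just described (an inventory item, then a target, then one action from an explicit array) can be sketched as a table lookup. This Python sketch is only an illustration under stated assumptions: the `ACTIONS` table, the item and target names, and the functions `offered_actions` and `perform` are hypothetical and do not reproduce the contents of Return to Zork.

```python
# Hypothetical table mapping (inventory item, target) to the actions offered.
ACTIONS: dict[tuple[str, str], list[str]] = {
    ("rope", "post"): ["tie", "hit"],
    ("key", "lock"): ["insert", "hit"],
}


def offered_actions(item: str, target: str) -> list[str]:
    """Return every action the interface would display for this pairing."""
    return ACTIONS.get((item, target), [])


def perform(item: str, target: str, choice: str) -> str:
    """Carry out a chosen action, rejecting anything not on display."""
    if choice not in offered_actions(item, target):
        raise ValueError(f"{choice!r} is not offered for {item} and {target}")
    return f"{choice}: {item} with {target}"


print(offered_actions("rope", "post"))
print(perform("key", "lock", "insert"))
```

Anything absent from the table cannot even be attempted, so the full extent of what the interactor may do is visible at every step.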

Using a more complex interface can be just as confusing as using a text interface and has the added disadvantage that the possible interactions with the world may be made explicit at all times - and thus the limits on the interactor's influence are also explicit. In a good text interface, some of the interactor's "futile" attempts can be permitted and appropriate responses can be generated so as to maintain narrative continuity. In a graphical interface in which all the possible interactions are displayed, whether in a menu or an "action interface" diamond, what interactors can do is explicitly limited, diminishing the interactor's "perceived range of freedom of action," as Laurel calls it. She observes that in good computer applications (including good interactive fiction) "constraints should limit, not what we can do, but what we are likely to think of doing." (105) In other words, it is less intrusive on the interactor's experience if the interface channels his or her actions subtly toward input meaningful to the narrative, rather than forcing the interactor to enter only applicable commands and making the limitations of his or her interaction obvious.

While mouse interfaces can be simple, as the ease of use of hypermedia interfaces demonstrates, they can also be difficult to use if the attempt is made to give them greater semantic power. Using a simple mouse interface relegates the interactor to the role of uncovering things in the environment of the interactive fiction. The simplicity of the existing well-tried interface for hypertext has served this type of computer narrative well, but the interactor in these works acts as a discoverer. If the interactor is to be a co-author, an interface that can more easily convey larger amounts of meaning is needed.


_____________ Nick Montfort