My Generation about Talking

Nick Montfort

Software Studies Workshop - UCSD - 21 May 2008

This talk was written to accompany the execution of yes_voices.py, a program which has been translated to Brazilian Portuguese by Cicero Inacio da Silva: vozes_sim.py

Recently, both in my writing practice and as I teach, I've been interested in very simple computer programs that engage with language. Today I'll be sharing fifteen non-interactive text generators that each do the same essential thing: they affirm, by producing language that says yes.

We have heard and will hear more discussion about how important it is that at least some of us who study digital media, if not all of us, be able to read and understand code. There are analogies to learning human languages that suggest we should learn to write code as well.

I think the "literacy" metaphor and the connection between human language and programming language are actually overrated. Nevertheless, programming ability is valuable for digital media scholars. We don't learn to read mathematical proofs without constructing proofs ourselves. Scholars of printing learn a great deal by composing and setting type themselves.

However widely accepted "Software Studies" is today - and with a book of that title out from The MIT Press, and Ars Electronica and SLSA events on code, it is certainly gaining acceptance - it's more widely recognized now that programming is not just a detail. It is a foundation of creative computing.

Leaving aside for now the issue of the programming language to be used, what programs shall we, scholars and students of digital media, write? First-person shooters? Enterprise solutions? File systems? Fibonacci and prime number generators? Celsius to Fahrenheit converters? Hello world?

If our interest is in how language and computing intersect - and that's where my interest is - it makes sense to start with programs that generate language. Specifically, why not write simple programs that engage language in an interesting way, going beyond the "Hello world" programs Mark Marino discussed earlier?

Instead of printing a fixed phrase, it's possible to have the computer pick one of several options. That's one thing the computer can easily, perhaps "naturally" do, the way that Burroughs's and Gysin's scissors could naturally cut up a text on a sheet of paper.

These generators, or voices, produce sentences based on random numbers - numbers as random as the computer can provide. This compels the author of a voice to write in a different way - writing not a sentence, or a sequence of sentences, but a distribution over sentences.
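To make the idea of writing a distribution concrete, here is a minimal sketch of such a voice. The phrases, the weights, and the function name affirm are invented for illustration and are not taken from yes_voices.py, which may well be organized differently.

```python
import random

# Hypothetical phrases and weights, invented for this sketch.
OPENINGS = ["Yes", "Oh yes", "Yes, yes", "Certainly"]
WEIGHTS = [4, 2, 1, 1]  # "Yes" is four times as likely as "Certainly"
CLOSINGS = [".", ", of course.", ", absolutely.", "!"]

def affirm(rng=random):
    """Return one sentence, drawn independently of any earlier sentence."""
    opening = rng.choices(OPENINGS, weights=WEIGHTS, k=1)[0]  # weighted pick
    closing = rng.choice(CLOSINGS)                            # uniform pick
    return opening + closing

if __name__ == "__main__":
    for _ in range(5):
        print(affirm())
```

Changing a weight, or adding a phrase to one of the lists, rewrites the whole distribution at once. That is the different way of writing that a voice asks of its author.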

In these simple examples, each sentence that is shown is drawn independently from a distribution. A more advanced exercise would involve conditioning the current sentence on the last sentence, or the last few, allowing progression or regression and perhaps connecting to Beckett's use of repetitive language even more strongly.
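The more advanced exercise could be sketched as follows, assuming the simplest kind of conditioning, in which each sentence depends only on the one before it. The table of followers and the function name speak are hypothetical, made up for this example.

```python
import random

# Hypothetical table: each sentence maps to the sentences allowed to follow it.
FOLLOWERS = {
    "Yes.": ["Yes.", "Yes, yes.", "Yes, I said."],
    "Yes, yes.": ["Yes.", "Yes, yes, yes."],
    "Yes, yes, yes.": ["Yes, yes.", "Yes, I said."],
    "Yes, I said.": ["Yes."],
}

def speak(count, start="Yes.", rng=random):
    """Return `count` sentences (count >= 1), each chosen given only the last one."""
    sentences = [start]
    while len(sentences) < count:
        previous = sentences[-1]
        sentences.append(rng.choice(FOLLOWERS[previous]))
    return sentences

if __name__ == "__main__":
    print(" ".join(speak(10)))
```

Here the repetition can build up and fall away, something that independent draws cannot do on purpose.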

But I find that even sentences generated independently at random and placed one after another can still be enough to scaffold the imagination and allow us to envision a character and a scenario - without being particularly lyrical, without progression, without explicitly setting the scene, without emplotment.

It is possible to create a non-random language generator, for instance, one that uses the time as input and acts as a sort of clock, as seen here. Another non-interactive, deterministic program might generate language based on the contents of a directory, or a text file, or a URL.
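A clock-like voice of this kind might look something like the following sketch. The wording of the sentences and the function name clock_voice are assumptions made for this example, not a reproduction of the voice shown in the talk.

```python
import datetime

# Hour names indexed by hour modulo 12, so that 0 and 12 both read "twelve".
HOURS = ["twelve", "one", "two", "three", "four", "five",
         "six", "seven", "eight", "nine", "ten", "eleven"]

def clock_voice(now):
    """Return a sentence determined entirely by the given datetime."""
    hour = HOURS[now.hour % 12]
    if now.minute == 0:
        return f"Yes, it is {hour} o'clock."
    if now.minute <= 30:
        return f"Yes, it is past {hour}."
    next_hour = HOURS[(now.hour + 1) % 12]
    return f"Yes, it is nearly {next_hour}."

if __name__ == "__main__":
    print(clock_voice(datetime.datetime.now()))
```

The same input always yields the same sentence, so the only variation comes from the passing of time.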

Of course, the manipulation of language is not the only possibility. The more retinally inclined beginning digital media programmer could write code to manipulate images, implementing different sorts of transformations, color adjustments, and filtering. Sound is fair game, too.

I've worked with text, and I've also chosen to write these voices in Python, a language that I like and that I think is suitable for those in the humanities. These generators could have been written in something else, though - JavaScript, Perl, Pascal, or even Microsoft BASIC from the good old days.

Programming is a general skill, but the particular language - which functions as part of the platform - does have its influence on how a program can be created, revised, shared, and discussed. Whether a language is provided by default (as Python is on the Mac) is another important platform-level issue.

I see learning to program, and concern about the code level, as distinct from the lowest level in the five levels that Ian Bogost has discussed, that of platform. The understanding of digital media requires that we understand the software that makes up platforms (BASIC interpreters, Java runtimes) and the software that makes up artworks and other creative programs.

At the platform level, we could discuss why it might be harder to write a program like this on a Windows Vista machine as opposed to a Commodore 64. We could discuss what might make Python better than BASIC for projects like this today, technically and in terms of the embedding of these languages in society.

At the code level, we could consider why I made the specific choices that I did, given that Python allows variables to be named arbitrarily and things to be done in several ways. The levels are adjacent and are strongly related, but it's still possible to usefully focus on one or the other.

These voices are simple but, in my experience, very useful investigations into what the computer can do with language. I invite you to take a look at the program, which is available for download from nickm.com and grandtextauto.org, and modify it or write one of your own to create your own voice.

Generating sentences independently at random is a legitimate capability of the computer, but hardly the only one. Before we even consider complex operations such as parsing, it's worth remembering that the computer can sort, reverse, and transform according to rules as easily as scissors can cut paper.
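Each of those three operations takes only a line or so of Python. The sketch below works on whitespace-separated words; the sample sentence, the substitution rule, and the function names are all hypothetical.

```python
def sort_words(sentence):
    """Put the words in alphabetical order, ignoring case."""
    return " ".join(sorted(sentence.split(), key=str.lower))

def reverse_words(sentence):
    """Put the words in the opposite order."""
    return " ".join(reversed(sentence.split()))

def transform(sentence, rules):
    """Replace each word that has a rule; leave every other word alone."""
    return " ".join(rules.get(word, word) for word in sentence.split())

if __name__ == "__main__":
    sample = "yes I will say yes again"
    print(sort_words(sample))
    print(reverse_words(sample))
    print(transform(sample, {"yes": "no"}))
```

None of this involves parsing or any model of grammar. The computer treats the sentence as a list of strings, much as the scissors treat it as paper.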

Writing a text generator of a few lines is experimental in the sense that Burroughs used the term - it is something to do, something offering insight into computing and language. So, I suggest to everyone who has tried running with scissors: Try executing with a computer. See what code can do to language.