The Wisdom of Oz™

Using natural language processing for fun.

A while back (3½ years, hard to believe), I did a couple of blog posts about the language in the Land of Oz books by L. Frank Baum. One post was about visualizing the language in the books, while the other was a bit of a linguistic analysis of one aspect.

Those posts used just the words of the texts, but with other natural language processing tools, we can look for structures as well. Just for fun, I decided to look for suggestions in order to create something like Brian Eno’s Oblique Strategies. I looked for commands to serve as the suggestions, and I also threw in some aphorism-like phrases for good measure, using some work I had done previously, also for fun.

It is quite fascinating what kinds of commands and aphorisms exist in these children’s novels from over 100 years ago, though I’m not sure I’ll use them for “lateral thinking” à la the Oblique Strategies.

So, take a look at the Wisdom of Oz™, and see what you think (of).


Technical Notes

To do the analysis I used the spaCy parser and then wrote Python programs to process the output. Commands are actually tricky to look for in English, since there is no reliable indicator of whether a phrase is a command, unlike in many languages where the form of the verb marks a command. Exclamation points can be used for exclamations as well as commands, and we often talk elliptically and leave out subjects (“Leaving for the store now!”), just as commands (typically) don’t have explicit subjects. In addition, parsing is hard! Even something as seemingly simple as “Run, don’t walk, to the nearest store” is difficult for spaCy. This is not at all a criticism of spaCy: all parsers will have some kind of problem (and to be fair, humans sometimes fail at parsing too!).

So what I did was make a first pass to find some (but not all) potential commands. Then I went through them manually, choosing the real ones and further narrowing them down to the interesting ones. I also did some slight editing to remove extraneous context and to randomly substitute third-person pronouns for he/him and she/her. For example, in The Wonderful Wizard of Oz, the passage

“Go to the strangers and sting them to death!” commanded the Witch, …

The Wonderful Wizard of Oz (1900)

gets edited down to just:

Go to the strangers.

I tried to be nice (no stinging to death!).
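For the curious, here is roughly what a first pass of this kind could look like. This is not the code behind the post, just a minimal sketch under a few assumptions: that a candidate command is a sentence whose root is a base-form verb (Penn Treebank tag VB) with no subject attached, and that the small English model en_core_web_sm is installed. The function names looks_like_command and candidate_commands are made up for illustration. A heuristic like this will miss some commands and let in some non-commands, which is why the manual pass is still needed.

```python
# Dependency labels that mark an explicit subject (an assumed, non-exhaustive set).
SUBJECT_DEPS = {"nsubj", "nsubjpass", "expl"}


def looks_like_command(sent):
    """Return True if a parsed sentence looks like an imperative.

    `sent` is expected to behave like a spaCy Span: it has a `.root` token,
    and tokens have `.tag_`, `.dep_`, and `.children`.
    """
    root = sent.root
    # Imperatives use the bare verb form, which the tagger labels VB.
    if root.tag_ != "VB":
        return False
    # Commands (typically) have no explicit subject attached to the verb.
    return not any(child.dep_ in SUBJECT_DEPS for child in root.children)


def candidate_commands(text, model="en_core_web_sm"):
    """Parse `text` with spaCy and return sentences that look like commands."""
    import spacy  # imported here so the heuristic above works without spaCy

    nlp = spacy.load(model)
    doc = nlp(text)
    return [s.text.strip() for s in doc.sents if looks_like_command(s)]
```

Calling candidate_commands on the text of a book would give a list of sentences to review by hand. What the parser actually returns for any given sentence depends on the model, so treat the results as candidates only.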

I also took a quick look at “you should” as another source of suggestions, but there are actually only a couple of real suggestions of that kind in all of the Oz books, so I didn’t include them. Most of the instances of “you should” are part of either conditionals or counterfactuals. Here are two examples:

if you should fall asleep you are too big to be carried.

The Wonderful Wizard of Oz (1900)

I cannot understand why you should wish to leave this beautiful country …

The Wonderful Wizard of Oz (1900)
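A quick check like this can be roughed out with a regular expression. The sketch below is only an illustration, not the code I used: it looks at the word right before “you should” and sets the sentence aside if that word is on a small list of cue words. The cue list and the name classify_you_should are assumptions of mine, and the test is crude enough that the results would still need a manual look.

```python
import re

# Words that, right before "you should", usually signal a conditional or an
# embedded clause rather than advice (an assumed, deliberately short list).
NOT_ADVICE_CUES = {"if", "why", "that", "lest", "unless"}

# Capture the word just before "you should", if there is one.
YOU_SHOULD = re.compile(r"(?:(\w+)\s+)?\byou should\b", re.IGNORECASE)


def classify_you_should(sentence):
    """Label the first "you should" in a sentence, or return None if absent."""
    match = YOU_SHOULD.search(sentence)
    if match is None:
        return None
    previous = (match.group(1) or "").lower()
    if previous in NOT_ADVICE_CUES:
        return "not advice"
    return "possible suggestion"
```

With this sketch, both of the examples above come back as "not advice", since they are introduced by “if” and “why”.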

For aphorisms I followed a process similar to the one for commands: I used parsing to find possibilities and then manually selected and edited those initial results.