Inspired by the following video of Kurt Vonnegut, I decided to graph the general mood in well-known stories.
I wrote a simple parser which evaluates a given text based on two collections of keywords, one collection of words indicating a positive, up-beat tone and the other collection indicating a negative, dramatic tone. I then graphed the results.
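For anyone curious what such a parser might look like, here is a minimal sketch in Python. It is not the script used for these graphs: the two tiny word sets, the function name `mood_series`, and the `chunk_size` parameter are all assumptions made for illustration. It splits the text into chunks of a fixed number of words and counts keyword hits from each collection per chunk.

```python
import re

# Hypothetical, deliberately tiny keyword collections (the real lists are not shown in this post).
POSITIVE = {"happy", "joy", "love", "good", "kind"}
NEGATIVE = {"poor", "hungry", "afraid", "wicked", "dead"}


def mood_series(text, chunk_size=100):
    """Return (start_word_index, positive_hits, negative_hits) for each chunk of words."""
    # Lowercase the text and keep runs of ASCII letters and straight apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    series = []
    for start in range(0, len(words), chunk_size):
        chunk = words[start:start + chunk_size]
        positive = sum(1 for word in chunk if word in POSITIVE)
        negative = sum(1 for word in chunk if word in NEGATIVE)
        series.append((start, positive, negative))
    return series
```

Each tuple gives one point for the positive series and one for the negative series. A phrase like “ate her up” contributes nothing under these lists, because none of its words is a keyword.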
Let’s start by looking at fairy tales:
Hansel and Gretel by the brothers Grimm
As you can see, a negative mood prevails at the start of the story. It’s a poor household and a terrible plan is hatched. Hansel and Gretel are abandoned, twice, but Hansel is optimistic and pragmatic. Things don’t look too bad. However, they grow increasingly hungry and tired, and if that weren’t bad enough, they meet an old woman who prefers to eat children rather than candy. In the end, though, everyone lives happily ever after. Except the witch. And the mother.
Little Red Riding Hood – the version by the brothers Grimm
This graph doesn’t work as well, since it is a very short story, but it is fairly accurate. The negativity doesn’t appear until the end because Little Red Riding Hood is happily on her way to her grandmother for most of the story. The only overt drama before that is when the wolf “ate her up”, which does not trigger my parser. Only towards the end do the horrors come out in the open.
Let’s take a quick detour to larger stories before revisiting fairy tales in a comparison of Disney and the original versions.
Pride and Prejudice by Jane Austen
Pride and Prejudice appears to be entirely up-beat! Well, let’s have another look, now making the positive series transparent.
That looks better (i.e. worse).
A Tale of Two Cities by Charles Dickens
Accurate? I haven’t read the novel, but the plot is far darker than this graph indicates. Perhaps Dickens uses a vocabulary which is not well covered by the collections of words I’ve used.
Now, let’s move from Dickens to Disney.
The Little Mermaid (transcript of the Disney animation, 1989)
Something terrible peeking up in the middle there. Now let’s compare this to the original, which is supposed to be rather more brutal.
The Little Mermaid by Hans Christian Andersen
Strikingly similar, really. Can anyone tell me what the negative event halfway through is?
Snow White (transcript of the Disney animation, 1937)
Disney’s version has a very clear and simple story arc much like Vonnegut’s first graph.
Snow White (by the brothers Grimm)
A significant difference. The Grimms’ version is more balanced and, well, more grim than the Disney script, though it still leads to a happy ending.




Just noticed I made a mistake. The y-axes do not represent line numbers, but the number of words. I will correct this next week.
If you have any suggestions for looking at particular stories in this or other ways, let me know.
Some ideas I will look at are:
1. Comparing results from large versus smaller but carefully selected collections of words.
2. Using more specific categories than positive or negative (e.g. Love, Conflict, Good Fortune, Ill Fortune).
3. Pitting the New Testament and the Old Testament against each other, and against other literature from which one would expect interesting results.
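As a rough illustration of the second idea, the two collections could be generalised to any number of named categories. The sketch below is only an assumption about how that might be done: the category names come from the list above, but the word lists and the function name `category_counts` are made up for the example.

```python
import re

# Hypothetical category word lists, for illustration only.
CATEGORIES = {
    "love": {"love", "kiss", "beloved"},
    "conflict": {"fight", "quarrel", "enemy"},
    "good fortune": {"gold", "treasure", "lucky"},
    "ill fortune": {"poor", "hungry", "lost"},
}


def category_counts(text):
    """Count how many words of the text fall into each named category."""
    words = re.findall(r"[a-z']+", text.lower())
    return {
        name: sum(1 for word in words if word in keywords)
        for name, keywords in CATEGORIES.items()
    }
```

Running this per chunk of text rather than on the whole story would give one series per category to graph.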
Like this!
An idea: appify it so users can (a) specify/upload lists of words in two categories, (b) upload a text story file, and (c) graph their own results.
Google App Engine + Highcharts should do it.
That is an excellent idea. That route hadn’t occurred to me. I will definitely have a play around with it on Google App Engine.
Thanks Paul.
[...] I didn’t want anyone to unintentionally convey more than I wanted to know, but remembered the text parsing I had done and decided to run the script on the Game of Thrones book. I haven’t read it, but the show is [...]
I’m reading A Tale of Two Cities right now… and so far… I’d say it’s not nearly as happy as the graph says. Although everything is going pretty well so far for all the “good” characters… the evil or selfish ones are doing horrible things. The situation of the commoners is pretty awful too. I wouldn’t say the plot is this positive.