Friday, October 5, 2012

Complexity Analysis of Tolkien's Works

Tolkien and The Lord of the Rings are now household names thanks to the huge success of Jackson's film adaptation of the trilogy. The Hobbit will soon follow, as Jackson finishes another film trilogy adapting it. However, The Silmarillion, the book that provides the backdrop for the stories of The Lord of the Rings and The Hobbit, remains out of reach for many people, even among Tolkien fans. One of the main reasons given is that The Silmarillion is a difficult book to read. I agree with that claim, although I would add that the enjoyment I got from The Silmarillion surpasses what I got from The Lord of the Rings (LotR).

I read LotR before The Silmarillion. I almost gave up on LotR... twice. But once I got to the Mines of Moria chapter, I couldn't put the book down. I was already a Tolkien fan when I started reading The Silmarillion, so perhaps that eased the effort of reading a "difficult book" a bit. But the question remains: what makes The Silmarillion a difficult book?

A blogger tried to answer this question through a statistical analysis of the text of the three Tolkien books mentioned above. The post is entitled Visualizing Tolkien. I will only describe and comment on the first part of the post, in which the blogger analyzes the texts of the books statistically. Please read that portion first before continuing with this article (the blogger also made visual representations and word clouds, but I won't comment on those).

Now that you've read the first part of the post (I hope), here are my comments.

Looking back at my experience of reading The Silmarillion, what really overwhelmed me was the number of new names, places and things that come up as you read the book, especially in the first chapters. On top of that, many of these are given in an unfamiliar language (e.g. Quenya Elven, Sindarin Elven, Dwarvish, Black Speech). I think a textual complexity analysis should be centered on this. An ordinary textual analysis of all words will not mean much since, as the blogger's own data shows, only the most common English words and the most frequently appearing characters in the novels figure in the statistics. This explains why even the blogger's invented "originality index" (number of unique words divided by total number of words) did not produce the results he expected. The Hobbit, which was supposed to be the easiest read, got the highest originality index value, meaning it should be the most difficult read, which is the complete reverse of what one would expect.
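For reference, the blogger's originality index is simple to compute; it is more commonly known as the type-token ratio. Below is a minimal Python sketch of it. The tokenizer (lowercased runs of letters) and the function names are my own assumptions, not the blogger's actual code. Keep in mind that this ratio tends to drop as a text gets longer, so it is hard to compare books of very different lengths with it.

```python
import re


def tokenize(text):
    """Split text into lowercased runs of letters (assumed tokenizer).

    The pattern keeps accented letters together, so a name such as
    "Eärendil" stays one token. Digits and punctuation are dropped.
    """
    return re.findall(r"[^\W\d_]+", text.lower())


def originality_index(text):
    """Number of unique words divided by total number of words."""
    words = tokenize(text)
    if not words:
        return 0.0  # avoid dividing by zero on empty input
    return len(set(words)) / len(words)


# Toy input: 6 words in total, 4 of them distinct.
print(originality_index("one ring and one more ring"))
```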

So here is my suggestion for how to go about a textual complexity analysis of the three books.

1. Focus on Tolkien-specific words.
By these I mean words and terms that are attributable only to Tolkien's novels. Examples are Gandalf, Elrond, hobbit, orc, ent, lembas, Rivendell, Argonath. Ordinary English words that nevertheless carry a Tolkien-specific meaning can also be included here, such as men, elf, goblin, Merry, Sam.

2. Establish an ease of understanding index, a scale for measuring the complexity of each Tolkien-specific word.

The following factors could be taken into account:

a. The number of occurrences of the word.
b. The spacing between successive occurrences of the word.

For example:

The first occurrence of the word could be given a constant rating.
The rating of the second occurrence could depend on the gap since the first occurrence.
The rating of the third occurrence could depend on the gap since the second occurrence, or on the average gap among the three occurrences.

3. Calculate this ease of understanding index for all Tolkien-specific words in each book.  

4. Get the average of this index for each book.

5. Compare the average index of the 3 books.
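The five steps above could be sketched in Python roughly as follows. Everything specific here is an assumption of mine for illustration: the constant FIRST_RATING, the GAP_SCALE value, the formula 1 / (1 + gap / GAP_SCALE) that turns a gap into a rating, and the hand-made lexicon of Tolkien-specific words. Gaps are measured in words, and multi-word names are not handled.

```python
import re
from collections import defaultdict

FIRST_RATING = 0.0  # assumed constant rating for a word's first occurrence
GAP_SCALE = 1000.0  # assumed gap (in words) at which the rating falls to 0.5


def tokenize(text):
    """Split text into lowercased runs of letters (assumed tokenizer)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def word_ease(positions):
    """Step 2: average the ratings of all occurrences of one word.

    The first occurrence gets FIRST_RATING. Each later occurrence is
    rated by the gap since the previous one: the smaller the gap, the
    closer the rating is to 1 (the word is still fresh in memory).
    """
    ratings = [FIRST_RATING]
    for prev, cur in zip(positions, positions[1:]):
        gap = cur - prev
        ratings.append(1.0 / (1.0 + gap / GAP_SCALE))
    return sum(ratings) / len(ratings)


def book_ease(text, lexicon):
    """Steps 1, 3 and 4: average word_ease over the lexicon words found."""
    positions = defaultdict(list)
    for index, token in enumerate(tokenize(text)):
        if token in lexicon:  # step 1: keep Tolkien-specific words only
            positions[token].append(index)
    if not positions:
        return None  # no Tolkien-specific word appears in this text
    scores = [word_ease(p) for p in positions.values()]  # step 3
    return sum(scores) / len(scores)  # step 4


# Step 5: compare texts. These are toy strings, not the real books.
lexicon = {"gandalf", "lembas"}
texts = {
    "dense": "gandalf spoke and gandalf laughed",
    "sparse": "gandalf spoke of lembas",
}
for name, text in texts.items():
    print(name, book_ease(text, lexicon))
```

In the toy comparison, the text that repeats a name soon after introducing it scores higher (easier) than the text that introduces two names once each. The real work would be in building the lexicon and in choosing a rating formula that matches how readers actually remember names.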

I think running a program that follows the above process will give The Silmarillion the lowest average value, establishing it as the most complex of the three novels.

Of course, this is still speculation. Hopefully someone will be able to validate it.
