Back To CourseACT Prep: Help and Review
42 chapters | 401 lessons
Benjamin has his master's degree in literature and has taught writing in and out of academia.
When you read a novel or poem, how do you analyze it? What kind of information do you look for, and how do you decide what it means? What can a British novel written in 1855 tell you about the society and people's lives back then? What can it tell you about what's being written today?
For most students and teachers, understanding literature involves close reading. Close reading is done by carefully reading and reflecting on a piece of literature, be it a poem, novel, or essay. You might pay special attention to characterization, to the pace of the plot, or to the symbolism and imagery found throughout a work. This has traditionally led to many interesting and complex theories about literature, but what if there is another way?
That's just the question that Stanford professor Franco Moretti is seeking answers to. Moretti has pioneered a new practice called distant reading, which is the opposite of close reading. Instead of carefully reading and analyzing a single work (or a group of works), distant reading takes thousands of pieces of literature and feeds them into a computer for analysis.
Distant reading attempts to uncover the patterns and unspoken rules behind literature from a very technical perspective. Where close reading relies on subjective analysis of what a single piece of literature means, distant reading compiles objective data about many, many works.
This idea is born from the fact that there are simply too many books for anyone to read and study seriously. If you're studying Victorian literature, Moretti says, there are roughly two hundred novels in the canon to read. A literary canon is the list of important works that make up most of the literature that is studied. Comparing and analyzing two hundred books is a very large task, but Moretti sees it as still too limited.
You see, the canon represents a very small selection of what is written. There are tens of thousands of other books written during the Victorian era that are rarely (if ever) read any more. How could you possibly read through them all to make sure you had the 'whole picture' of what writing was like then? And what about all the other periods of history, and all the other places where things were written?
The simple answer is that you can't effectively study it all. No person can read and carefully analyze so much information, but a computer can. To this end, Moretti founded the Stanford Literary Lab, a part of Stanford University dedicated to using computers to analyze literature.
With the help of other academics and data specialists, Moretti has developed a system of using computers to analyze novels as raw data. While computers can't 'read' and understand a novel in the way people can, they are very good at searching for specific information you give them and finding patterns. They can measure sentence length, structure, and lexicon, and they can give a scholar patterns of data to analyze. In this way, distant reading is more of a practice than a literary theory.
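To make the idea of measuring a text concrete, here is a minimal sketch in Python of two such measurements: average sentence length and vocabulary size. It is not the Literary Lab's actual software. The function name `text_stats`, the naive sentence splitter, and the sample passage are all assumptions made up for illustration.

```python
import re

def text_stats(text):
    """Return simple measurements for a text (hypothetical helper)."""
    # Naive sentence split: break after ., !, or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    # Lowercased word tokens; apostrophes are kept inside words.
    words = re.findall(r"[a-z']+", text.lower())
    avg_len = len(words) / len(sentences) if sentences else 0.0
    return {
        "sentences": len(sentences),
        "words": len(words),
        "avg_sentence_length": avg_len,
        "vocabulary_size": len(set(words)),  # number of distinct words
    }

# Invented sample passage, not taken from any real novel.
sample = "It was a dark night. The wind howled. Nobody in the village slept that night."
stats = text_stats(sample)
print(stats)
```

Run over thousands of novels instead of one short passage, numbers like these become the patterns of data that a scholar then has to interpret.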
For example, Moretti put the titles of 7,000 British novels published between 1740 and 1850 into his computer. He then had the computer count the words in each title and compare the averages over time. The computer found that as time went on, the titles grew shorter. In 1740, the average title ran to twenty-five words. By 1850, that had shrunk to a much more manageable eight words.
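The counting step described above might look something like the following Python sketch. It groups titles by decade and averages the number of words per title. The function name, the choice to group by decade, and the four titles are assumptions invented for this example; they are not Moretti's data or code.

```python
from collections import defaultdict

def average_title_length_by_decade(records):
    """records: iterable of (year, title). Returns {decade: mean words per title}."""
    totals = defaultdict(lambda: [0, 0])  # decade -> [total words, number of titles]
    for year, title in records:
        decade = (year // 10) * 10
        totals[decade][0] += len(title.split())  # words = whitespace-separated tokens
        totals[decade][1] += 1
    return {decade: words / count for decade, (words, count) in sorted(totals.items())}

# Made-up titles and years, for illustration only.
records = [
    (1741, "The Life and Strange Adventures of a Gentleman of Quality"),
    (1748, "A True History of Two Sisters"),
    (1843, "The Orphan"),
    (1849, "Two Sisters Abroad"),
]
averages = average_title_length_by_decade(records)
print(averages)
```

With only four invented titles the result means nothing, but the same loop applied to thousands of real titles would produce the kind of trend line the lesson describes.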
It's natural to wonder 'so what?' when you hear that. On its own, that information is little more than a piece of trivia. However, Moretti took this information and looked into what else changed at the same rate as titles, and he found an answer: the market for books.
Before 1740, book production was a very small market in Britain, with five or ten new books published each year. As time went on, more and more books were published. At the same time, authors started writing shorter titles.
Why might this be? What is it about having lots of competition that makes a writer need to come up with a shorter title? What is the appeal of a shorter title? What does it say about writers or readers that those titles flourish as more and more books are produced? These are all interesting questions, and they're questions we might never think to ask without distant reading.
This text-as-data approach also allows computer programs to learn how to spot what genre a story belongs to without being told. Interestingly, the programs use very different measures for this than a human reader would.
For example, let's consider Gothic literature. You might recognize an example of this style by the horror-oriented villain, whether it be a supernatural menace like Dracula or the more mundane terror of institutions like the Spanish Inquisition in The Pit and the Pendulum. You might look for recurring themes of dark and dreary places, emphasis on death and melancholy, or many other elements.
By comparison, one of Moretti's programs would recognize a piece of Gothic literature by checking how frequently the word 'the' appears in the story, taking advantage of certain patterns of phrasing in that genre to determine how likely it is to be Gothic. Patterns like this elude human readers, both because we tend not to notice small details and because we don't have libraries of books in our heads, on demand, to rapidly compare in the same way.
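As a rough illustration of the kind of measurement involved, this Python sketch computes how often a word such as 'the' appears per 1,000 words of a text. It only calculates the measurement; how the real programs turn such numbers into a genre judgment is not described in this lesson, so no classification step is shown. The function name `word_rate` and the sample sentence are invented for the example.

```python
import re

def word_rate(text, target="the", per=1000):
    """Occurrences of `target` per `per` words of text (hypothetical helper)."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return per * words.count(target) / len(words)

# Invented sentence, not a quotation from any Gothic novel.
passage = "The castle stood above the village, and the moon lit the ruined tower."
rate = word_rate(passage)
print(round(rate, 1))
```

A program comparing genres would calculate rates like this for many words across many books and then look for values that reliably separate one genre from another.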
The rate at which 'the' appears in Gothic novels may not mean much in itself. However, it raises the question: What other patterns do we overlook in the things we read? What unspoken, unconsidered rules govern our favorite (and least favorite) forms of writing? Why do we seem to follow these rules that we're not even aware of?
This approach has the potential to illuminate what is happening outside of the literary world as well. Moretti's Literary Lab is currently examining World Bank reports from 1946 to 2010. They are charting the frequency with which terms like 'poverty reduction,' 'governance,' and others appear in those reports. What might the rise and fall of certain terms tell us about how the World Bank talks about economies and the state of the world?
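Charting term frequency over time can be sketched in a few lines of Python. The example below counts how often each term appears in the reports for each year. The function name, the text normalization, and the two short "reports" are assumptions invented for illustration; they are not real World Bank text or the Lab's method.

```python
import re
from collections import Counter, defaultdict

def term_counts_by_year(reports, terms):
    """reports: iterable of (year, text). Returns {year: Counter of term -> count}."""
    counts = defaultdict(Counter)
    for year, text in reports:
        # Lowercase and strip punctuation so multi-word terms match across it.
        normalized = " ".join(re.findall(r"[a-z]+", text.lower()))
        for term in terms:
            pattern = r"\b" + re.escape(term.lower()) + r"\b"
            counts[year][term] += len(re.findall(pattern, normalized))
    return dict(counts)

# Invented report snippets, for illustration only.
reports = [
    (1950, "Loans support reconstruction and development."),
    (2005, "Poverty reduction requires good governance. "
           "Governance reforms support poverty reduction."),
]
terms = ["poverty reduction", "governance"]
result = term_counts_by_year(reports, terms)
for year in sorted(result):
    print(year, dict(result[year]))
```

Plotting these yearly counts for each term would give the rise-and-fall curves that prompt the question above.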
Franco Moretti developed distant reading, a technique that uses computers to analyze literature as data. This allows for enormous amounts of information to be processed quickly. Distant reading emphasizes looking for patterns and rules that a human reader would probably miss and doing so over a larger collection of works than a person could realistically read.