- Sierra Eckert
- Allison Chaney
In a novel, time rarely moves evenly or linearly: a single paragraph in García Márquez’s One Hundred Years of Solitude jumps several decades, while in Proust’s In Search of Lost Time, 15 pages are devoted to a single moment of eating a madeleine. In this project, we are interested in the kind of language that is used to talk about time, and in the shape and tempo of this language in a given text. Tracking what we call the “time signature” of a text, we use explicit references to time passing to divide up a text, and then use probabilistic methods to analyze which kinds of words cluster around changes in novel time and how these temporal patterns are distributed in a text. We hope to explore how “time signature” offers a lens into the location and duration of temporal language in a novel, and how such “time signatures” might be used to predict other aspects of a text such as author, genre, format, and place of publication.
In the first phase, we have plotted our potential time anchors, parsing them out by magnitude: year, month, day, hour. These two plots include samples of 18th- and 19th-century novels, and one visualization of the corpus of a nineteenth-century author, George Eliot. Building on this, we will use these words denoting time passing to chunk the text in four different ways, and run our model on each of these four “chunkings” of the text.
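As a rough illustration of anchoring and chunking, the sketch below locates time words by magnitude and splits a text at each occurrence. It is a minimal sketch under stated assumptions, not our actual pipeline: the anchor vocabulary in `ANCHOR_PATTERNS`, the function names `find_anchors` and `chunk_by_magnitude`, and the choice to start a new chunk exactly at each anchor word are all hypothetical.

```python
import re

# Hypothetical anchor vocabulary: one simple pattern per magnitude.
# The project's real list of time anchors is assumed to be richer than this.
ANCHOR_PATTERNS = {
    "year": re.compile(r"\byears?\b", re.IGNORECASE),
    "month": re.compile(r"\bmonths?\b", re.IGNORECASE),
    "day": re.compile(r"\bdays?\b", re.IGNORECASE),
    "hour": re.compile(r"\bhours?\b", re.IGNORECASE),
}


def find_anchors(text):
    """Return {magnitude: [character offset of each anchor match]}."""
    return {
        magnitude: [match.start() for match in pattern.finditer(text)]
        for magnitude, pattern in ANCHOR_PATTERNS.items()
    }


def chunk_by_magnitude(text, magnitude):
    """Split text so that every chunk after the first begins at an anchor.

    Assumption: a chunk boundary falls at the anchor word itself, not at
    the enclosing sentence or paragraph.
    """
    starts = find_anchors(text)[magnitude]
    bounds = [0] + [s for s in starts if s > 0] + [len(text)]
    return [text[a:b] for a, b in zip(bounds, bounds[1:]) if text[a:b].strip()]


def all_chunkings(text):
    """Produce the four chunkings, one per magnitude."""
    return {m: chunk_by_magnitude(text, m) for m in ANCHOR_PATTERNS}
```

Each of the four chunkings could then be treated as a separate set of documents for a topic model such as LDA, which makes it possible to compare the words that cluster around year-scale jumps with those around hour-scale ones.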
- Cleaned corpus and developed preliminary stopwords list for initial LDA model (October 2015)
- Compiled a test corpus for the time-signature project (August 2015)