Erik Lewis and Dr. Dale J. Pratt, Spanish and Portuguese
This report chronicles my experiences while performing my research and explains the conclusions that I subsequently reached. At its inception, this project was to cover three novels by Benito Pérez Galdós, one of Spain’s most prolific authors. However, after I consulted with Dr. Pratt, we decided to enlarge the electronic database to seven novels.
The initial leg of my project consisted of obtaining the seven Galdós novels in the form of an electronic database. I then encoded the database with symbols that would enable WCView©, a word cruncher program, to interpret, display, and crunch my written corpus. This process is known as formatting the text.
After marking the entire corpus, I began scanning the text for words, combinations of words, and select phrases. Due to the limitations of the software, I was only able to search for two words at a time. However, the software was very helpful in finding truncated forms of both words in a variety of contexts. Disappointingly, I had little luck searching for two-word combinations that occurred within several words or lines of each other, or even on the same page.
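A two-word proximity search of the kind described above can be sketched in a few lines of Python. This is only an illustration of the general technique, not a description of how WCView© works internally; the function names `tokenize` and `find_near`, the `window` parameter, and the sample sentence are all assumptions made for the example.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into words, keeping accented letters."""
    return re.findall(r"[^\W\d_]+", text.lower())

def find_near(tokens, first, second, window=5):
    """Return (i, j) index pairs where `first` and `second` occur
    within `window` tokens of each other, in either order."""
    first_positions = [i for i, tok in enumerate(tokens) if tok == first]
    second_positions = [j for j, tok in enumerate(tokens) if tok == second]
    return [(i, j) for i in first_positions for j in second_positions
            if i != j and abs(i - j) <= window]

# Invented sample sentence (hypothetical, not a quotation from the novels).
sample = "La ciencia y la iglesia discuten; la ciencia calla y el tiempo pasa."
tokens = tokenize(sample)
hits = find_near(tokens, "ciencia", "iglesia", window=3)
```

Each pair in `hits` gives the token positions of one co-occurrence, which can then be used to pull out the surrounding passage for reading in context.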
One of the most useful facets of WCView© is the alphabetical interface that accompanies the word search. This interface allowed me to search for number and gender variations of many words. For example, when I searched for the frequency and location of the word “futuro”, the word appeared in an alphabetical list that made finding “futuros”, “futura”, and “futuras”, as well as the superlatives “futurísimos/as”, very easy.
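The benefit of an alphabetical word list is that all forms sharing a stem sit next to one another. A minimal sketch of that idea follows, assuming a plain-text corpus; the names `build_index` and `forms_with_prefix` and the sample text are hypothetical, and WCView© itself may organize its index differently.

```python
import re
from bisect import bisect_left
from collections import Counter

def build_index(text):
    """Return an alphabetically sorted word list and a frequency table."""
    words = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter(words)
    return sorted(counts), counts

def forms_with_prefix(sorted_words, prefix):
    """Collect the contiguous run of words that begin with `prefix`."""
    start = bisect_left(sorted_words, prefix)
    matches = []
    for word in sorted_words[start:]:
        if not word.startswith(prefix):
            break
        matches.append(word)
    return matches

# Invented sample text (hypothetical, not taken from the novels).
sample = ("El futuro es incierto. Los futuros planes, la futura casa "
          "y las futuras obras dependen del futuro.")
sorted_words, counts = build_index(sample)
forms = forms_with_prefix(sorted_words, "futur")
frequencies = {form: counts[form] for form in forms}
```

Looking up the stem "futur" returns every number and gender variant in one pass, together with the frequency of each form.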
Computer-aided analysis of these novels proved invaluable in my topic-oriented research. Before I used WCView© to analyze these novels, my insight into their thematics was clouded by the moderately complicated plots and the dozens of characters that are the hallmark of Galdósian narrative. WCView© allowed me to analyze themes independent of their context. To have attempted such research without the aid of a word cruncher program would have been futile.
Contextual analysis proved more time consuming and tedious than any other aspect. Since word crunchers in general are very poor at, if not totally incapable of, distinguishing context, I was obligated to read much of the text surrounding each search result. This obligation was enjoyable, but the limitations of the software proved frustrating when searching for nouns such as “Poder” and “General”, whose use as a verb or an adjective is far more common than their use as nouns. The words that I used in the search were based on themes that I had noted from my experience in reading Galdósian novels, namely, science, social economics, the church, the government, and time. These topics represent the most prominent facets of life in 19th-century Spain.
In many instances it was interesting to see the contrast between the frequency of a masculine adjective and its feminine counterpart. Is it surprising that “Preso” is used twice as often as “Presa”?
Why are the masculine forms of viejo, anciano, caduco, grosero, and perverso much more common than their feminine counterparts? Can we understand Galdós’ psyche better by knowing that he consistently uses the feminine forms hermosa, bonita, and guapa more often than their masculine equivalents? By asking myself questions like these, I was able, albeit subjectively, to derive answers and insights into the meaning of the novels and, ultimately, the life of Galdós himself.
One striking insight into these novels (and probably the Spanish language as a whole) was the frequency of adjectival or noun forms of words in relation to their connotation. Nouns with positive connotations, such as placer, deseo, and gloria, were abundant. While each of these words appeared several hundred times, their adjectival counterparts were employed an average of only 25 times. Conversely, adjectives with negative connotations, such as malo, pobre, hipócrita, and ciego, appeared several hundred times, while the noun forms maldad and pobreza appeared very seldom. Hipocresía and ceguedad, the noun forms of hipócrita and ciego, were not used at all. This trend pervades these novels and holds, almost without exception, for every noun-adjective pair employed by Galdós.
In addition to using the WCView© software, I had also planned to use a supercomputer at the National Security Agency (NSA) to do categorical research on the text, known as n-gram analysis. Since n-gram analysis is a language-independent means of categorizing text, I was initially very excited about using it to gain additional insight into the thematics, stylistics, and topical structure of the novels. This method seemed to hold a lot of promise and potential for revolutionary research in literature, regardless of the language. While we did send a portion of the written corpus to Marc Damashek of the NSA for testing, the results of his research were largely disappointing. After consulting with Marc and Dr. Pratt, we decided that n-gram analysis is of little value in analyzing narrative fiction.
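The general idea behind n-gram categorization can be shown with a small Python sketch: each text is reduced to counts of overlapping character sequences of length n, and two texts are compared by the cosine of the angle between their count vectors. This is a common textbook formulation offered as an assumption; the report does not give the details of the procedure actually run at the NSA, and the function names and sample strings below are hypothetical.

```python
import math
from collections import Counter

def ngram_profile(text, n=3):
    """Count overlapping character n-grams after normalizing whitespace."""
    cleaned = " ".join(text.lower().split())
    return Counter(cleaned[i:i + n] for i in range(len(cleaned) - n + 1))

def cosine_similarity(a, b):
    """Cosine of the angle between two n-gram count vectors (0.0 to 1.0)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Invented sample passages (hypothetical, not taken from the novels).
passage_a = ngram_profile("la ciencia y el progreso de la ciencia")
passage_b = ngram_profile("la iglesia y el gobierno de la iglesia")
score = cosine_similarity(passage_a, passage_b)
```

Because the profiles are built from raw character sequences rather than dictionary words, the same code applies unchanged to any language, which is the sense in which the method is language-independent.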
Word crunch analysis is also helpful in understanding the time period in which the text was written. Nineteenth-century Madrid appears very different from the modern-day city. References to alcoholism, drug abuse, or any substance addiction are totally absent. The universe and planets are referred to in vague and general terms, with little or no feeling of knowledge or intimacy with them. However, references to human nature are very similar to what one would expect from a modern-day novelist.
Finally, Galdós seems to paint a self-portrait in the character of Pablo, a blind man. His words seem to represent Galdós’ own feelings: “I feel in my heart such joy! … It seems as though the Universe, the Sciences, History, Philosophy, Nature, and all that I have learned is bundled up inside me and spinning around … it’s like a procession.”1 This procession of which Galdós speaks is chronicled in his novels. For those who wish to go beyond reading his novels for diversion or entertainment, word crunch programs like WCView© possess the power to provide invaluable insight into textual thematics and the life of the author.2
1. Translations by Erik Lewis
2. The aid of Marc Damashek of the National Security Agency is gratefully acknowledged.