William Wilson and Dr. Deryle Lonsdale, Linguistics Department
Main Text
Accurate and reliable second language (L2) testing is something that many entities rely on, including educational institutions, businesses, and the government. Over the last several decades, researchers have proposed new testing methods, and a particularly promising avenue is Elicited Imitation (EI). Simply put, an EI test is a sentence imitation test. Test-takers hear a series of recorded utterances in the target language, repeat each utterance as exactly as possible, and then receive scores based on the accuracy of their repetitions.
Essentially, my ORCA project was straightforward: construct an EI test for the Korean language. However, to maximize the usefulness of the project’s outcome, I also designed the test to answer an important question: Does the gender of the interlocutor (the speaker being imitated) affect participants’ test outcomes? This required engineering an EI test that would 1) use both male and female interlocutors, and 2) control for other factors that might cause variance in results. I believed that interlocutor gender would affect test outcomes, but I could only speculate about how much.
I hypothesized that in a Korean EI test, the difference in response accuracy would exceed 10% when test items administered by a male interlocutor were compared to those administered by a female interlocutor. This threshold was chosen because a 10% difference corresponds to one letter grade in many academic grading scales.
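The hypothesis can be expressed as a simple check on mean accuracy per interlocutor gender. The following is a minimal sketch, not the procedure used in the study: the function names are hypothetical, accuracy is assumed to be a fraction between 0 and 1, and the example scores are invented placeholders rather than data from this project.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def interlocutor_gap(scores):
    """Return mean female-voiced accuracy minus mean male-voiced accuracy.

    `scores` is a list of (interlocutor_gender, accuracy) pairs, where
    accuracy is assumed to be a fraction between 0 and 1.
    """
    male = [acc for gender, acc in scores if gender == "male"]
    female = [acc for gender, acc in scores if gender == "female"]
    return mean(female) - mean(male)


def exceeds_threshold(scores, threshold=0.10):
    """True if the absolute male/female gap is larger than the threshold."""
    return abs(interlocutor_gap(scores)) > threshold


# Invented placeholder scores for illustration only (not study data).
example_scores = [
    ("male", 0.70), ("female", 0.75),
    ("male", 0.80), ("female", 0.85),
]
gap = interlocutor_gap(example_scores)
hypothesis_supported = exceeds_threshold(example_scores)
```

Taking the absolute value of the gap reflects the wording of the hypothesis, which predicts a difference without specifying which interlocutor would yield higher scores.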
The first task was procuring Korean sentences to use in the test. Professor Jang Seokbae, a visiting scholar at Brigham Young University specializing in corpus linguistics, provided access to “Handoumi,” a concordancer for the Korean language, which was used to search for sentences in a corpus of 30 Korean textbooks ranging from beginning to advanced levels.
In total, 25 sentences were included on the EI test. They were organized according to syllable length: 7 sentences ranging from 1-10 syllables constituted the “short sentences” group; 9 sentences ranging from 11-20 syllables formed the “mid sentences” group; and 9 sentences exceeding 20 syllables were included in the “long sentences” group. As a whole, the syllable length of the 25 sentences steadily increased from 5 to 43 syllables. This made it possible to analyze both the entire EI test and each sentence group individually.
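The grouping by syllable length can be sketched as follows. This is an illustration under stated assumptions, not the tooling used in the project: the function names are hypothetical, and syllables are counted as precomposed Hangul syllable blocks (Unicode range U+AC00 to U+D7A3), so digits or Latin characters in a sentence would not be counted.

```python
def count_syllables(sentence):
    """Count precomposed Hangul syllable blocks (U+AC00 to U+D7A3)."""
    return sum(1 for ch in sentence if "\uac00" <= ch <= "\ud7a3")


def length_group(syllables):
    """Map a syllable count to the sentence groups described in the text."""
    if syllables <= 10:
        return "short"   # 1-10 syllables
    if syllables <= 20:
        return "mid"     # 11-20 syllables
    return "long"        # more than 20 syllables


def group_counts(syllable_counts):
    """Tally how many sentences fall into each length group."""
    counts = {"short": 0, "mid": 0, "long": 0}
    for n in syllable_counts:
        counts[length_group(n)] += 1
    return counts


# A common greeting, used only as an example: five syllable blocks.
greeting_length = count_syllables("안녕하세요")
greeting_group = length_group(greeting_length)
```

Counting syllable blocks directly works for Korean because each written block corresponds to one spoken syllable, which makes syllable length easy to compute from text.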
A male interlocutor and a female interlocutor with very similar demographics were chosen for the study. Each interlocutor recorded the entire test (25 sentences) in a quiet, noise-free environment using Praat, a speech analysis program. They were specifically told not to slow their speech for ease of understanding; rather, they were asked to record their utterances at a natural pace, as if speaking with a fellow Korean.
This resulted in one male and one female recording for each of the 25 test sentences, for a total of 50 recordings. These recordings were then used to create Test Form A and Test Form B. In Test Form A, odd-numbered sentences used the male voice and even-numbered sentences used the female voice. Test Form B was the opposite: odd-numbered sentences used the female voice and even-numbered sentences used the male voice. This ensured that the overall comparison between male and female interlocutors would not be affected by variation in participants’ test-taking skills.
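The alternating assignment of voices to the two forms can be sketched as follows. The function name and data layout are hypothetical and serve only to illustrate the design described above.

```python
def build_form(n_items, odd_voice, even_voice):
    """Map each sentence number (1-based) to the voice used on this form."""
    return {
        item: (odd_voice if item % 2 == 1 else even_voice)
        for item in range(1, n_items + 1)
    }


# Form A: odd-numbered sentences male, even-numbered sentences female.
form_a = build_form(25, odd_voice="male", even_voice="female")
# Form B: the reverse assignment.
form_b = build_form(25, odd_voice="female", even_voice="male")

# Every sentence is heard in one voice on Form A and the other on Form B.
fully_crossed = all(form_a[i] != form_b[i] for i in form_a)
```

Because there are 25 sentences, each form contains 13 items in one voice and 12 in the other; across the two forms together, every sentence appears once in each voice.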
Because the pool of available L2 Korean speakers was very small (and I would need to request their assistance in the future), I chose to invite only 4 male L2 Korean speakers to participate in this pilot study. As a result, the findings are not statistically significant and should be read only as indicators.
The 4 participants were all majoring in the Korean language and were taking 400-level Korean language courses. Furthermore, each participant showed a strong command of the language, even compared with other 400-level students.
The test forms and their accompanying recordings were used to create an Elicited Imitation testing program that operated on a Macintosh computer. The program instructed the participant to imitate the sentences he heard as exactly as possible. After listening to two sample sentences, participants pressed START and began imitating Korean utterances.
Two of the participants used Form A, and two used Form B. The program randomly ordered the 25 sentences in each form. After an interlocutor’s utterance was played, the participant was given a number of seconds, based on the length of the sentence, to record his imitation. Each test took about 5 minutes.
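The randomized presentation and length-dependent response window can be sketched as follows. This is only an illustration: the function names are hypothetical, and the timing formula (a fixed base plus a per-syllable allowance) and its values are assumptions, since the actual timing rule used by the testing program is not stated here.

```python
import random


def presentation_order(n_items=25, seed=None):
    """Return the sentence numbers 1..n_items in a random order."""
    order = list(range(1, n_items + 1))
    random.Random(seed).shuffle(order)
    return order


def response_window(syllables, base_seconds=3.0, seconds_per_syllable=0.5):
    """Seconds allowed for the imitation; longer sentences get more time.

    The base and per-syllable values are placeholder assumptions.
    """
    return base_seconds + seconds_per_syllable * syllables


# A seeded order is reproducible, which helps when re-checking a session.
order = presentation_order(25, seed=1)
```

Passing a seed makes a given participant's item order reproducible, while leaving it unset gives a fresh random order for each session.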
After all testing was completed, the participants’ recordings were scored for accuracy. In doing this, I followed grading procedures that are currently used in English EI tests [1].
Ultimately, the difference between male and female interlocutor results did not meet my established threshold of 10%. However, female interlocutor averages were consistently higher than male interlocutor averages, which reflects a common belief among males in the university’s Korean department: that female Koreans are more easily understood than male Koreans. It is possible that the higher pitch of the female Korean interlocutor made her utterances sound slightly more “clear” than the deeper voice of the male Korean interlocutor.
What is perhaps most exciting about this ORCA project is the foundation that it has laid. Now that a Korean EI pilot test has been established, the project has been handed over to a fellow researcher who specializes in computer science. He will integrate a Korean Automatic Speech Recognition (ASR) program into the EI test for automatic grading. With such a tool, institutions will be able to save a significant amount of time and money in measuring the L2 Korean skills of future students and employees.