Jarrett Lever and Dr. Deryle Lonsdale, Linguistics Department
An elicited imitation (EI) test is a simple, indirect method of assessing language ability and proficiency. In an EI test, subjects hear a predetermined item (a sentence) and repeat it back as closely as possible to what they heard. The theory of EI proposes that to hear a sentence, process its meaning, and produce an imitation exactly the same in meaning as the original requires the subject (test taker) to have a level of proficiency in the language being tested equal to that which the item is examining (Bley-Vroman and Chaudron 1994). Though EI has been successfully applied to evaluating second language acquisition (SLA) in languages such as English (C. R. Graham 2008), French (Millard 2011), Spanish (Thompson 2013), Japanese (Matsushita and Lonsdale 2012), and Mandarin Chinese (Wu and Ortega 2013), no work has been recorded in the literature for Portuguese.
Can I develop a Brazilian Portuguese (BP) elicited imitation (EI) test whose scores show a significant correlation with the oral proficiency interview (OPI) ratings for learners of the language?
After receiving approval from BYU’s Institutional Review Board, I developed and selected EI test items (i.e. sentences) according to the criteria of grammar difficulty, length in syllables, and lexical difficulty. I based the grammar on a modified form of the LS Grammar Grid—Spanish from the ILR Handbook of Oral Interview Testing for Spanish (Lowe 1982) that I translated and adjusted for use in BP.
I chose over 100 sentences from the Corpus do Português (Davies and Ferreira 2006-) whose most salient grammatical features matched specific criteria of the grammar grid. I shortened the original sentences to approximate short, medium, and long lengths and removed all proper nouns. I recorded the best 84 items with both a male and a female elicitor using high-quality recording equipment in room 1141 H of the Humanities Learning Resource Center (HLR) of BYU. Of the 168 recordings, I chose 42 from the male elicitor and 42 from the female elicitor so that each of the 84 items was represented by one recording.
I divided the items into syllables using BYU’s WebCLIPS grammar tutorial Divisão Silábica (Bateman 2005). I next analyzed the lexical difficulty of the items using the lexical frequencies of the 5,000 most frequent words in BP as found in a Portuguese frequency dictionary (Davies and Preto-Bay 2011). Excluding function words, I assigned each word a lexical frequency number (LFN) and assigned each item a mean LFN.
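The mean LFN calculation described above can be sketched as follows. This is a minimal illustration, not the procedure actually used in the study: it assumes that an LFN is a word's rank in the frequency list (1 = most frequent), and the names `FREQUENCY_RANK`, `FUNCTION_WORDS`, and `out_of_list_rank` are hypothetical, as are all of the rank values.

```python
from statistics import mean

# Hypothetical ranks (1 = most frequent); NOT values from the frequency dictionary.
FREQUENCY_RANK = {"comprou": 420, "casa": 150, "nova": 210, "ontem": 900}

# Hypothetical function-word list; the study does not list the words it excluded.
FUNCTION_WORDS = {"a", "o", "de", "que", "e", "um", "uma", "ela", "ele"}


def mean_lfn(sentence, ranks=FREQUENCY_RANK, function_words=FUNCTION_WORDS,
             out_of_list_rank=5001):
    """Return the mean LFN of the content words in a sentence.

    Words missing from the rank table receive `out_of_list_rank`, an assumed
    placeholder for words outside the 5,000 most frequent.
    Returns None if the sentence has no content words.
    """
    # Lowercase each token and strip surrounding punctuation.
    words = [w.strip(".,;:!?").lower() for w in sentence.split()]
    # Drop empty tokens and function words.
    content = [w for w in words if w and w not in function_words]
    if not content:
        return None
    return mean([ranks.get(w, out_of_list_rank) for w in content])


item = "Ela comprou uma casa nova ontem."
print(mean_lfn(item))  # mean of 420, 150, 210, and 900
```

A real implementation would likely need to lemmatize inflected forms before the lookup, since frequency dictionaries are usually organized by lemma; the sketch looks up surface forms only.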
Because little work has been done to combine (a) item length in syllables, (b) frequency-based lexical difficulty, and (c) grammar difficulty into a single measure of item difficulty, I developed a procedure for selecting items that appropriately stresses item length in syllables (C. R. Graham 2010) and combines it with three tiers of lexical difficulty and one tier of grammar difficulty. I followed this procedure until 51 items were selected.
I recruited a total of 42 volunteer participants from (a) individuals who completed the OPI during the Fall 2013 semester through BYU’s Center for Language Studies (CLS), (b) students in 100 to 600 level Portuguese courses at BYU, (c) native BP speakers, and (d) non-speakers of BP (i.e. persons who had no training in BP and as little training as possible in Spanish and other foreign languages, especially Romance languages). All participants were BYU students at the time.
Either a fellow research assistant or I proctored the EI test on campus; the independent third party Language Testing International® (LTI) administered the OPI to participants. Participants’ EI responses were then scored twice (by two different human graders) for the percentage of syllables that were correctly repeated back in all items. LTI evaluated the participants’ OPI recordings and returned their OPI ratings.
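The syllable-based score described above can be sketched as follows. This is a minimal illustration that assumes a grader has already recorded, for each item, how many syllables were repeated correctly; the function name `ei_score` and the example counts are hypothetical.

```python
def ei_score(items):
    """Return the percentage of syllables repeated correctly across all items.

    `items` is a list of (correct_syllables, total_syllables) pairs,
    one pair per EI item, as judged by a single grader.
    """
    total = sum(t for _, t in items)
    if total == 0:
        raise ValueError("no syllables to score")
    correct = sum(c for c, _ in items)
    return 100.0 * correct / total


# Hypothetical judgments for two items from one grader.
print(ei_score([(8, 10), (5, 10)]))  # 13 of 20 syllables correct
```

Pooling syllables across items, as here, weights longer items more heavily than averaging per-item percentages would; the text does not say which of the two was used.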
Results and Discussion
I analyzed the EI and OPI scoring data using correlational analysis and Item Response Theory (IRT). The correlational analyses between the OPI ratings and the human-graded EI scores from round 1 (r = 0.93 and R² = 0.8729) and round 2 (r = 0.92 and R² = 0.8534) of grading indicate a significant degree of correlation and predictive power. The strong correlation between the two rounds of human-graded EI scores (r = 0.98 and R² = 0.9638) indicates that the human graders were very consistent in their grading.
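For readers who want to reproduce this kind of analysis, the following is a minimal sketch of a Pearson correlation and the corresponding R² for a simple linear fit (where R² is the square of r). The data are hypothetical, and the numeric coding of OPI ratings is an assumption; the report does not say how ratings were converted to numbers.

```python
import math


def pearson_r(x, y):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: OPI ratings coded as integers, EI scores in percent.
opi = [1, 2, 3, 4, 5, 6]
ei = [12.0, 30.5, 41.0, 63.0, 70.5, 88.0]

r = pearson_r(opi, ei)
print(round(r, 2), round(r ** 2, 4))
```

Squaring a rounded r will not reproduce the reported R² values exactly: 0.93² is 0.8649, while R² = 0.8729 corresponds to an unrounded r of about 0.934.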
Unlike the classical test analyses described above, IRT analysis evaluates the performance and reliability of the individual EI items. Of the many measurements produced by IRT analysis, the most significant results are the person reliability value and the item reliability value. The person reliability value of 0.98 indicates that if participants took these items twice (without remembering the first test experience), there is a 98% probability that they would perform similarly on a second set of test items. In other words, the test is very consistent internally. The item reliability value of 0.98 indicates that the 51 EI items had a wide difficulty range and that the participant sample was large.
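The report does not state which IRT model or software produced these reliability values. As an illustration only, the sketch below assumes the separation reliability commonly reported in Rasch analysis: the share of observed variance in the estimated measures that is not attributable to measurement error. The function name and all input values are hypothetical.

```python
from statistics import mean, pvariance


def separation_reliability(measures, standard_errors):
    """Return (observed variance - mean error variance) / observed variance.

    `measures` are estimated person abilities (or item difficulties) in logits;
    `standard_errors` are the standard errors of those estimates.
    """
    observed_var = pvariance(measures)
    if observed_var == 0:
        raise ValueError("measures have no variance")
    # Mean squared standard error estimates the error variance.
    error_var = mean([se ** 2 for se in standard_errors])
    # Clamp at zero so noisy data cannot yield a negative reliability.
    true_var = max(observed_var - error_var, 0.0)
    return true_var / observed_var


# Hypothetical logit measures and standard errors, not data from this study.
measures = [-2, -1, 0, 1, 2]
errors = [0.2, 0.2, 0.2, 0.2, 0.2]
print(round(separation_reliability(measures, errors), 2))
```

Under this definition, reliability rises when the measures are widely spread or when each measure is estimated precisely, which fits the observation above that item reliability reflects a wide difficulty range and a large sample.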
In this study, I endeavored to determine whether an oral proficiency test for Brazilian Portuguese (BP) that is based on elicited imitation (EI) theory could have a significant, even predictive, relationship with the oral proficiency interview (OPI). I thus developed an EI test with 51 items, recruited participants from the complete range of BP oral proficiency, administered the EI, had participants complete the OPI, and used both correlational analysis and Item Response Theory (IRT) to analyze the results. The results indicate that an EI test can be developed for BP that correlates significantly with OPI ratings. Further work needs to explore the degree to which sentence length in syllables, average frequency-based lexical difficulty, and grammar difficulty individually contribute to item difficulty.
- Bateman, B. (2005). Divisão Silábica. Retrieved May 15, 2013, from BYU WebCLIPS: http://webclips.byu.edu/
- Bley-Vroman, R., & Chaudron, C. (1994). Elicited Imitation as a measure of second language competence. Research methodology in second-language acquisition, 245-261.
- Davies, M., & Ferreira, M. J. (2006-). Corpus do Português. Retrieved January 1, 2012, from Corpus do Português: 45 million words, 1300s-1900s: http://www.corpusdoportugues.org/
- Davies, M., & Preto-Bay, A. R. (2011). A Frequency Dictionary of Portuguese. Routledge: Taylor & Francis Group.
- Graham, C. R. (2008, May). Elicited Imitation as an Oral Proficiency Measure with ASR Scoring. LREC.
- Graham, C. R. (2010). The Role of Lexical Choice in Elicited Imitation Item Difficulty. (M. T. al., Ed.) Selected proceedings of the 2008 Second Language Research Forum: Exploring SLA perspectives, positions, and practices, 57-72.
- Lowe, P. (1982). ILR Handbook on Oral Interview Testing.
- Matsushita, H., & Lonsdale, D. (2012). How to use simulated speech to assess learner Japanese oral proficiency. Georgetown University Round Table on Languages and Linguistics (GURT) 2012. Washington, D.C.
- Millard, B. (2011). Oral Proficiency Assessment of French Using an Elicited Imitation Test and Automatic Speech Recognition. Provo: Brigham Young University.
- Thompson, C. A. (2013). The Development and Validation of a Spanish Elicited Imitation Test of Oral Language Proficiency for the Missionary Training Center. PhD Thesis, Brigham Young University.
- Wu, S.-L., & Ortega, L. (2013). Measuring Global Oral Proficiency in SLA Research: A New Elicited Imitation Test of L2 Chinese. Foreign Language Annals, 1-25.