Analysis of the Vocabulary of L2 Scripts Through Automatic Lexical Analyzers


José Lema Alarcón


When grading second language (L2) scripts, teachers make approximate judgments about lexical choices that may indicate the overall quality of the texts. This study compared several measures of lexical proficiency in scripts written by English language students. The corpus was analyzed to determine the correlation between teacher judgments and lexical items in L2 written assignments. Using the Text Inspector online tool, the study first attempted to identify the lexical items corresponding to levels (e.g., A1/2, B1/2, and C1/2) of the Common European Framework of Reference for Languages (CEFR). In addition to verifying vocabulary levels, the study used the Tool for the Automatic Analysis of Lexical Sophistication (TAALES) to analyze the advanced words and phrases used in each script. The first part of the study, based on Text Inspector, showed that the grades assigned to each script correlated with the CEFR word lists. Similarly, the grades correlated with twenty-two indices of lexical sophistication in the L2 scripts (e.g., academic word frequency, range, and n-gram proportion frequency).
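The kind of analysis summarized in the abstract can be illustrated with a minimal Python sketch. It is not the study's procedure: the word list `TOY_CEFR`, the helper names `advanced_share` and `pearson_r`, and the three scripts and grades are all hypothetical, invented only to show how a CEFR-based lexical measure might be correlated with teacher grades. The actual study relied on Text Inspector and TAALES, whose word lists and indices are not reproduced here.

```python
from math import sqrt

# Hypothetical mini word list mapping words to CEFR levels (illustration only;
# this is NOT the list used by Text Inspector or the English Vocabulary Profile).
TOY_CEFR = {
    "house": "A1", "good": "A1", "because": "A2", "travel": "A2",
    "environment": "B1", "although": "B1", "significant": "B2",
    "nevertheless": "C1",
}
ADVANCED = {"B2", "C1", "C2"}


def advanced_share(text: str) -> float:
    """Proportion of listed tokens whose toy CEFR level is B2 or above."""
    tokens = [t.strip(".,;:!?").lower() for t in text.split()]
    levels = [TOY_CEFR[t] for t in tokens if t in TOY_CEFR]
    if not levels:
        return 0.0
    return sum(level in ADVANCED for level in levels) / len(levels)


def pearson_r(xs, ys) -> float:
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Made-up scripts and teacher grades, for demonstration only.
scripts = [
    "The house is good because I travel.",
    "Although the environment is good, travel is significant.",
    "Nevertheless, the environment is significant.",
]
grades = [5.0, 7.0, 9.0]

shares = [advanced_share(s) for s in scripts]
r = pearson_r(shares, grades)
print(f"advanced-word shares: {shares}")
print(f"Pearson r between shares and grades: {r:.3f}")
```

With real data, each script would be profiled by the external tools, and a correlation of this kind would be computed for every index against the assigned grades.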



Article details

How to cite
Análisis del vocabulario de alfabetos L2. A través de analizadores léxicos automáticos. (2023). Revista Vínculos ESPE, 8(1), 41-60.
Research Article
Author biography

José Lema Alarcón, Universidad de las Fuerzas Armadas – ESPE

Academic background: he studied in England, where he earned his Ph.D. and a master's degree in Applied Research at the University of Exeter, as well as a master's degree in English language teaching (TESOL) from Canterbury Christ Church University. His work focuses on corpus linguistics research that enables large-scale text analysis. To that end, Dr. Lema uses data mining tools, Big Data, and statistical prediction models.

Dr. Lema also conducts research on second language learning through the use of technology in education (e.g., blended learning, academic platforms, artificial intelligence).


