Analysing Academic Texts with Computational Linguistics Tools
https://doi.org/10.31992/0869-3617-2020-29-7-89-103
Abstract
Writing academic texts in English poses difficulties associated with translating Russian sentences that have pronounced stylistic peculiarities, especially for young researchers who are just starting to publish. A genre can hardly be studied without analysing examples of its discourse, which makes computational linguistics tools valuable: they automate many language and text processing operations and produce relatively accurate quantitative results. The present study considers the application of the AntConc and Coh-Metrix toolkits to analysing master's students' abstracts of research papers written for international English-language journals or conference proceedings (Learner Corpus), in comparison with international researchers' abstracts published in high-impact journals (Reference Corpus). The analysis carried out with these tools revealed the strengths and weaknesses of the master's students' texts, made it possible to characterise them at the word, sentence and discourse levels, and outlined the potential of the tools for teaching academic writing skills.
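As an illustration of the kind of word-level comparison between a learner corpus and a reference corpus described above, the following is a minimal Python sketch. It does not use or reproduce AntConc or Coh-Metrix; the function names (`tokenize`, `word_profile`, `per_thousand`) and the two one-sentence sample corpora are hypothetical and chosen only for demonstration. It computes a type-token ratio and a normalised word frequency, two simple measures of the sort such tools report.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def word_profile(texts):
    """Return (token count, type-token ratio, frequency Counter) for a corpus."""
    tokens = [tok for text in texts for tok in tokenize(text)]
    counts = Counter(tokens)
    ttr = len(counts) / len(tokens) if tokens else 0.0
    return len(tokens), ttr, counts


def per_thousand(counts, total, word):
    """Frequency of `word` normalised per 1,000 tokens."""
    return 1000 * counts[word] / total if total else 0.0


# Hypothetical sample sentences, not taken from the study's corpora.
learner_corpus = ["In this paper we consider the problem of the analysis of the data."]
reference_corpus = ["We analyse the data and report two findings."]

for name, corpus in [("Learner", learner_corpus), ("Reference", reference_corpus)]:
    total, ttr, counts = word_profile(corpus)
    print(f"{name}: {total} tokens, TTR = {ttr:.2f}, "
          f"'the' per 1,000 tokens = {per_thousand(counts, total, 'the'):.0f}")
```

Normalising frequencies per 1,000 tokens is one common way to make corpora of different sizes comparable; the type-token ratio is sensitive to text length, so it is meaningful only for samples of similar size.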
About the Authors
E. I. Shpit
Russian Federation
Elena I. Shpit - Senior language instructor
Address: 40, Lenin Prospect, Tomsk, 634050
V. N. Kurovskii
Russian Federation
Vassily N. Kurovskii - Dr. Sci. (Education), Prof.
Address: 60, Kievskaya str., Tomsk, 634061