Readings, Tools, and Useful Links for Corpus Analysis

October 27, 2020 | The Savvy Linguist | Starting Your Career

This post originally appeared on the blog In My Own Terms and is republished with permission.

The following list is the result of a collaboration among participants in Lancaster’s recent MOOC on Corpus Linguistics. It is a selection of the links that I considered most relevant for those who want to start exploring this field. If you would like to share other links, feel free to add a comment or send me a message and I will add them here. I will keep you posted on the next CL course by Lancaster University. This post complements previous posts on corpora lists, GraphColl, and AntConc.

Readings

An Introduction to Corpus Linguistics – G. Bennett

Corpus Linguistics: What It Is and How It Can Be Applied to Teaching – D. Krieger

Corpus Linguistics 2015: Abstract book – F. Formato and A. Hardie (Lancaster: UCREL)

Corpus annotation – R. Garside, G. Leech, T. McEnery

A critical look at software tools in corpus linguistics – L. Anthony

Corpora and Language Teaching: Just a fling or wedding bells? – C. Gabrielatos

Books

Sociolinguistics and Corpus Linguistics – P. Baker

Using Corpora in Discourse Analysis

Google book: Corpus-based Translation Studies – S. Laviosa

Google book: Corpus-based translation studies: Research and Applications – A. Kruger, K. Wallmach, J. Munday

Tools

SkELL is a free, stripped-down online version of the Sketch Engine corpus query software. It allows very simple word searches that produce a word sketch showing the grammatical and collocational behavior of the word. It also produces a list of similar words and regular concordance lines. One of our tutors in Lancaster’s MOOC, Keith Barrs, wrote an article on how to use this tool (from page 6).

WebCorp lets you concordance the web in real time.

Wmatrix is a software tool for corpus analysis and comparison.
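
If you are curious about what a concordancer does at its simplest, the sketch below builds keyword-in-context (KWIC) lines and counts nearby words for a search term. It is only an illustration in plain Python and does not reflect how SkELL, WebCorp, or Wmatrix are actually implemented. The function names (`tokenize`, `concordance`, `collocates`), the sample sentence, and the very simple tokenization rule are all my own assumptions.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z']+", text.lower())


def concordance(tokens, node, width=3):
    """Return KWIC lines: up to `width` tokens of context on each side of `node`."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines


def collocates(tokens, node, window=2):
    """Count the words that occur within `window` tokens of each hit for `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            span = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            counts.update(span)
    return counts


# Hypothetical sample text, invented for this illustration.
sample = (
    "Strong tea is popular here. She drinks strong tea every morning, "
    "but he prefers weak tea with milk."
)
tokens = tokenize(sample)

for line in concordance(tokens, "tea"):
    print(line)

# Raw co-occurrence counts only; real tools rank collocates with
# association measures rather than plain frequency.
print(collocates(tokens, "tea")["strong"])
```

In this tiny sample, "strong" turns up next to "tea" more often than any other word, which is the kind of pattern a word sketch surfaces at scale.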

Corpora

The SILS Learner Corpus of English is a collection of essays by students at SILS, the School of International Liberal Studies at Waseda University.

Translational English Corpus (TEC) is a corpus of contemporary translational English. It consists of written texts translated into English from a variety of source languages, both European and non-European.

The Collins Corpus is an analytical database of English with over 4.5 billion words. It contains written material from websites, newspapers, magazines and books published around the world, and spoken material from radio, TV and everyday conversations.

OPUS, the Open Parallel Corpus, is a growing collection of translated texts from the web.

Natural Language Toolkit (NLTK) is a leading platform for building Python programs to work with human language data. It provides easy-to-use interfaces to over 50 corpora and lexical resources such as WordNet, along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning, wrappers for industrial-strength NLP libraries, and an active discussion forum.

WordBanks Online is an online corpus service offering you the chance to tap into the unique resources of the Collins Word Web, on which the highly successful range of Collins dictionaries is based.

Lancaster Corpus of Children’s Project Writing is a digitized collection of project work produced by children aged between 9 and 11.

For corpora in other languages, visit Corpus Linguistics and Morphology at Humboldt-Universität zu Berlin, and for lemmatization lists in several languages, see lexiconista.com.

Articles in Spanish

Lingüística de corpus: una introducción al ámbito – G. Parodi

Lingüística de corpus y lingüística del español – G. Rojo

Introducción al análisis de estructuras lingüísticas en corpus – M. Alcántara Plá

Hacia una definición del concepto de colocación: de J. R. Firth a I. A. Mel’čuk – M. Alonso Ramos

Diseño de corpus textuales y orales – J. Torruella and J. Llisterri

Sobre la construcción de diccionarios basados en corpus – G. Rojo

Compilación de un corpus ad hoc para la enseñanza de la traducción inversa especializada – G. Corpas

El corpus lingüístico en la didáctica del léxico en el aula – E. Alonso

For corpora in Spanish, visit my page TermFinder (Corpora EN+ES section).

Author bio

Patricia Brenes is the owner of the blog http://inmyownterms.com/. Originally from Costa Rica, she moved to Washington in 2000 to work for an international organization. She obtained her Master’s Degree in Specialized Translation at the Universitat de Vic in Barcelona and is a Certified Terminology Manager (ECQA-TermNet). Her blog collects useful information on theory and practice, as well as infographics, biographies, interviews, tools, and much more.
