CASE Project

The CASE project

Compiling a corpus of informal English as a Lingua Franca (ELF) conversations

The CASE project was started in 2012 at Saarland University with the aim of collecting video-mediated conversations in an international English-language context and thus creating a dataset, or "corpus", that allows research on this particular communication type. By 2018, teams of researchers from Germany, Bulgaria, Spain, Italy, Sweden, Finland, France, Belgium, the UK and the US had compiled more than 250 hours of video-mediated conversations (via Skype). The conversations are first encounters between two participants from different countries and last between 30 and 60 minutes.

Of particular interest to us are pragmatics and discourse in a video-mediated communication setting, cultural and intercultural negotiation, issues of identity, the role of plurilingual resources, and the influence of the communication medium on issues such as rapport and cooperation in an international setting. 

Conversations are transcribed according to pragmatic transcription guidelines, with the aim of allowing for a wide range of applications and with a particular focus on spoken language features, multimodality, and the use of plurilingual resources. Our team of researchers has published several papers on various aspects of the project, some of which are available online. A detailed description of the issues related to the analysis of spoken data, with extensive examples, can be found in:

Brunner, Marie-Louise; Stefan Diemer; and Selina Schmidt. 2017. “... okay so good luck with that ((laughing))?” - Managing rich data in a corpus of Skype conversations. Studies in Variation, Contacts and Change in English 19 [Big and Rich Data in English Corpus Linguistics: Methods and explorations, ed. by Turo Hiltunen; Joe McVeigh; and Tanja Säily]. Helsinki: Varieng. Full text: [http://www.helsinki.fi/varieng/series/volumes/19/brunner_diemer_schmidt/].

ViMELF: Corpus of Video-Mediated English as a Lingua Franca Conversations

ViMELF contains 20 Skype conversations between 40 speakers from Germany (20 speakers), Spain (5), Italy (5), Finland (5), and Bulgaria (5), totaling 744.5 minutes (ca. 12.5 hours), with an average conversation length of 37.23 minutes. The corpus comprises 113,677 words in the plain text version and 152,467 items in the annotated version (preliminary numbers).
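The summary figures above can be cross-checked with a few lines of arithmetic. The following is only a minimal sketch: the constant and function names are made up for illustration and are not part of any CASE or ViMELF tooling, and it assumes the average is rounded half-up to two decimals.

```python
from decimal import Decimal, ROUND_HALF_UP

# Figures quoted in the ViMELF description (names are illustrative only).
TOTAL_MINUTES = Decimal("744.5")
CONVERSATIONS = 20
SPEAKERS_BY_COUNTRY = {
    "Germany": 20,
    "Spain": 5,
    "Italy": 5,
    "Finland": 5,
    "Bulgaria": 5,
}


def average_minutes(total_minutes: Decimal, conversations: int) -> Decimal:
    """Mean conversation length, rounded half-up to two decimal places."""
    mean = total_minutes / Decimal(conversations)
    return mean.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    print("Speakers:", sum(SPEAKERS_BY_COUNTRY.values()))
    print("Average length (min):", average_minutes(TOTAL_MINUTES, CONVERSATIONS))
```

Decimal arithmetic is used here instead of floats so that the half-up rounding of 37.225 is exact.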

The transcripts are available as .docx and .txt files; the videos in MPEG4 format. Several versions are available: the fully annotated pragmatic version as text and XML, a lexical version, and a POS-tagged version (auto-tagged with CLAWS).

TaCoCASE - Transatlantic Component of the CASE project 

Sub-Corpus of the CASE project

  • International video-mediated communication
  • Skype conversations between native speakers (NS) and non-native speakers (NNS) of English

Description:

  • 15 conversations
  • Total conversation length: 650 minutes (= ca. 10.8 hours)
  • Average conversation length: 43 minutes (= 9,483 words)
  • Words / Tokens: 140,003
  • Participants: 26 [8 SB (Germany), 10 BI (Great Britain), 8 BO (USA)]
  • Medium: Video both sides (13x video, 2x audio)
  • Including sociolinguistic background data

The transcripts are available as .docx and .txt files; the videos in MPEG4 format. Several versions are available: the fully annotated pragmatic version as text and a lexical version. XML and POS-tagged versions (auto-tagged with CLAWS) are in preparation.

Datasets

CASE project recordings were completed in 2018-23, with a total of more than 250 hours of data. The raw data has been used for various qualitative studies.

  • CASE. 2018. Corpus of Academic Spoken English – Recordings. Birkenfeld: Trier University of Applied Sciences. [http://umwelt-campus.de/case].

While the CASE project is still ongoing, several preliminary datasets have been analyzed and discussed in our publications. A preliminary set of 20 conversations, BabyCASE, was compiled in 2017, and two sets of conversations about food were compiled in 2015 and 2017. Preliminary transcripts of additional single conversations are also available.

  • BabyCASE. 2017. 20 conversations from the CASE project. Birkenfeld: Trier University of Applied Sciences & Saarbrücken: Saarland University. [http://umwelt-campus.de/case].
  • FoodCASE. 2015. Conversations about food from the CASE project. Birkenfeld: Trier University of Applied Sciences & Saarbrücken: Saarland University. [http://umwelt-campus.de/case].
  • FoodCASE v2. 2017. Conversations about food from the CASE project. Birkenfeld: Trier University of Applied Sciences. [http://umwelt-campus.de/case].

In May 2018, the first finalized corpus based on data from the CASE project was released for scientific use: ViMELF.

In September 2023, the second finalized corpus was released: TaCoCASE.

Citations

Citing ViMELF - A Corpus of Video-Mediated English as a Lingua Franca Conversations:

ViMELF. 2018. Corpus of Video-Mediated English as a Lingua Franca Conversations. Birkenfeld: Trier University of Applied Sciences. [http://umwelt-campus.de/case] (date of last access). 

Citing TaCoCASE - Transatlantic Component of CASE:

TaCoCASE. 2023. Transatlantic Component of the CASE project. Birkenfeld: Trier University of Applied Sciences. Version 1.0. Compiler: Collet, Caroline. [http://umwelt-campus.de/case/TaCoCASE].

Citing the CASE project: 

Long citation:

The CASE project. 2012-2023. Stefan Diemer; Marie-Louise Brunner; Caroline Collet; and Selina Schmidt. Birkenfeld: Trier University of Applied Sciences (coordination) / Saarbrücken: Saarland University / Sofia: St Kliment Ohridski University / Forlì: University of Bologna-Forlì / Santiago: University of Santiago de Compostela / Helsinki: Helsinki University & Hanken School of Economics / Birmingham: Birmingham City University / Växjö: Linnaeus University / Lyon: Université Lumière Lyon 2 / Louvain-la-Neuve: Université catholique de Louvain / Boise: Boise State University. [http://umwelt-campus.de/case] (date of last access).

Short citation:

The CASE project. 2012-2023. Birkenfeld: Trier University of Applied Sciences. [http://umwelt-campus.de/case] (date of last access).

Citing transcripts:

Transcript 01SB00SF00. The CASE project. 2012-2023. Birkenfeld: Trier University of Applied Sciences. [http://umwelt-campus.de/case] (date of last access).
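For anyone citing many transcripts, the pattern above can be filled in programmatically. This is a minimal sketch, not an official tool: the function name is hypothetical, and rendering the "date of last access" as an ISO date (YYYY-MM-DD) is an assumption, since the project does not prescribe a date format.

```python
from datetime import date

# Project URL as given in the citation templates.
CASE_URL = "http://umwelt-campus.de/case"


def cite_transcript(transcript_id: str, accessed: date) -> str:
    """Build a transcript citation following the pattern shown above.

    The ISO rendering of the access date is an assumption of this sketch.
    """
    return (
        f"Transcript {transcript_id}. The CASE project. 2012-2023. "
        "Birkenfeld: Trier University of Applied Sciences. "
        f"[{CASE_URL}] ({accessed.isoformat()})."
    )


if __name__ == "__main__":
    # Example access date chosen arbitrarily for illustration.
    print(cite_transcript("01SB00SF00", date(2024, 1, 15)))
```

Passing the access date explicitly, rather than reading the system clock inside the function, keeps the output reproducible.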

Contact us

  • Do you have questions about the CASE project or ViMELF?
  • Would you like to get in touch about a cooperation?
  • Have you published or presented results based on ViMELF?
  • Have you developed teaching materials based on ViMELF?

Please contact us at case(at)umwelt-campus.de! We would love to hear from you.
