Extra

Other collaborations between project members

Corpus of Written Productions of Mirandese L2 Learners (PEAMirL2) and Master Classes on corpus and language resources development, with Špela Arhar Holdt and Iztok Kosem (Centre for Language Resources and Technologies, Faculty of Computer and Information Science, University of Ljubljana)

On 28 and 29 November 2024, Špela Arhar Holdt and Iztok Kosem (Centre for Language Resources and Technologies, Faculty of Computer and Information Science, University of Ljubljana) visited the Faculty of Arts and Humanities of the University of Coimbra. They took part in working meetings with the team of the Corpus of Written Productions of Mirandese L2 Learners (PEAMirL2) project, for which they act as international consultants.

The PEAMirL2 corpus project is being developed at the Research Centre for General and Applied Linguistics (CELGA-ILTEC, University of Coimbra), in partnership with the Association of Mirandese Language and Culture (ALCM). It is coordinated by Cristina Martins and involves the CELGA-ILTEC researchers Tanara Zingano Kuhn, Isabel Santos and Isabel Pereira. In addition to Špela Arhar Holdt and Iztok Kosem, Alfredo Cameirão and Ana Afonso (teachers of the Mirandese language courses offered by ALCM) serve as external consultants.

The guest researchers also delivered two Master Classes on corpus development and language resources:

  • From developmental corpus to developed applications: The journey of Šolar 3.0, by Špela Arhar Holdt.

This talk will explore the journey and lessons learned surrounding the Slovene developmental corpus Šolar 3.0, a key language resource documenting student writing in Slovene primary and secondary schools. Šolar 3.0 includes 5,485 student texts along with 36,570 teacher corrections, capturing authentic student language use and teacher feedback. These insights have enabled empirical analyses and studies, as well as the development of new resources and tools, including corpus-based teaching materials, grammar-checking tools, tools supporting the corpus-building process, and a specialised concordancer for in-depth analyses. The talk will highlight a beneficial cycle, where linguistic and didactic expertise in interdisciplinary teams is essential for building impactful language resources, which in turn advance and enrich both linguistics and education, facilitating future developments.

  • Corpus Tools and Language Resources at the University of Ljubljana: Purposes and people behind the development, by Iztok Kosem.

This presentation will explore the range of corpus tools and language resources for Slovene developed at the Centre for Language Resources and Technologies, University of Ljubljana. Key resources include the 1.3-billion-word reference corpus Gigafida, the Trendi monitor corpus (currently at 1 billion words), and the corpus data summarizer Korpusnik, which provides access to data from five distinct corpora of Slovene. I will also introduce the Digital Dictionary Database for Slovene and the various dictionaries it encompasses. The talk will emphasize the processes and collaborative efforts involved in developing these resources, highlighting the essential roles played by diverse participants. Special focus will be given to the involvement of linguists, demonstrating that while they are substantial end-users of these tools, they also play a crucial role in shaping and creating these resources through their active participation in development activities.

In presenting and demonstrating some of their projects, the speakers addressed the role of linguists in their development. Their presentations were therefore of particular interest to undergraduate, master's and doctoral students. To access the presentation slides, send an e-mail to celga-iltec@uc.pt.

The Master Classes took place in person in Auditorium 2 of the Student Hub, in the University of Coimbra Faculty of Medicine building, and were broadcast via Zoom.

This event was financed by national funds through FCT - Fundação para a Ciência e a Tecnologia, I.P., under projects UIDB/04887/2020 and UIDP/04887/2020.