The Corpus of Spoken Istrovenetian/Fiuman and Croatian (C-ORAL-IC)

Authors

  • Nada Poropat Jeletić, Filozofski fakultet, Sveučilište Jurja Dobrile u Puli
  • Gordana Hržica, Edukacijsko-rehabilitacijski fakultet, Sveučilište u Zagrebu
  • Eliana Moscarda Mirković, Filozofski fakultet, Sveučilište Jurja Dobrile u Puli

Keywords:

language sampling; spoken corpora; code-switching; bilingual discourse

Abstract

Bilingual conversational corpora are invaluable for studying genuine contact phenomena in spontaneous bilingual speech. This paper presents the Corpus of Spoken Istrovenetian/Fiuman and Croatian (C-ORAL-IC), the first corpus documenting unscripted Istrovenetian and Fiuman dialects spoken among bilinguals in the Istrian and Kvarner areas of Croatia. The region has a long history of Croatian and Italian cultural and linguistic interaction, shaping a complex sociolinguistic system with diglossic and polyglossic relations. C-ORAL-IC includes data from 87 bilingual/multilingual speakers and features over 85,000 tokens and 27,000 types. Available on TalkBank (BilingBank subsection) [https://talkbank.org, https://biling.talkbank.org/access/C-ORAL-IC.html], it includes transcribed, phonologically adapted, coded, segmented and morphologically tagged recordings. Additional participant data on language history and usage are available. C-ORAL-IC provides a rich resource for exploring spontaneous bilingual speech, offering insights into conversational features, structure, and synchronic changes in Istrovenetian/Fiuman.

Published

2025-07-31

Section

Review article
