XVIII Instituto de Lingüística

Proposed short courses

C-ORAL-ROM: BUILDING AND TAGGING REFERENCE CORPORA OF SPONTANEOUS SPEECH
Emanuela Cresti & Massimo Moneglia (Curriculum Vitae)
 
Level: Advanced
Dates: March 5 to 9
Time: 2 pm to 5 pm
 
Abstract:

The course will focus on the corpus design of the C-ORAL-ROM resource and will present the annotation strategy adopted in C-ORAL-ROM for the main unit of analysis of spontaneous speech. The validity of the C-ORAL-ROM assumptions at both levels will be supported by the generalizations obtained in early cross-linguistic studies carried out on the C-ORAL-ROM corpus. The course will be complemented by a tutorial session given by Massimo Moneglia that will introduce students to corpus annotation procedures. The course will highlight several related conclusions based on the C-ORAL-ROM annotation of the reference units of speech through terminal prosodic breaks:

a) the annotation of the utterance based on the detection of terminal breaks is linguistically relevant, because: 1) the reference units so identified are in one-to-one correspondence with speech acts identified through pragmatic criteria; 2) no chunking relation can hold between elements across terminal boundaries;

b) the annotation of the utterance based on the detection of pauses is not linguistically relevant, because it is both too weak and too strong a criterion for selecting utterance boundaries (according to the evaluation of the French C-ORAL-ROM corpus; see Cresti & Moneglia 2005);

c) the annotation of terminal breaks provides the term of reference for the statistical evaluation of the main properties of spoken language performance in the multilingual corpus;

d) the annotation of terminal breaks provides an easy and relevant criterion for a meaningful text-to-speech alignment of large corpora (to be demonstrated by Massimo Moneglia in a tutorial session).

 
Prerequisites:


Main reference: Cresti & Moneglia (eds.), C-ORAL-ROM: Integrated Reference Corpora for Spoken Romance Languages. Benjamins.

Essential references on the relation between speech acts and prosody are available online, together with materials and tools for the tutorial session.

DATES
- Conference: February 28 and March 1, 2 and 3, 2007

- Institute: February 22 to 27 and March 5 to 9, 2007