- Source: Switchboard Telephone Speech Corpus
The Switchboard Telephone Speech Corpus is a corpus of spoken English consisting of almost 260 hours of speech. It was created in 1990 by Texas Instruments under a DARPA grant and released in 1992 by NIST. The corpus contains 2,400 telephone conversations among 543 US speakers (302 male, 241 female). Participants did not know each other, and conversations were held on topics from a predetermined list.
Switchboard-2 Phase II was collected in 1999 and includes "4,472 five-minute telephone conversations involving 679 participants".
The corpus was used for the development of speech recognition algorithms.
Further reading
Calhoun, Sasha; Carletta, Jean; Brenier, Jason M.; Mayo, Neil; Jurafsky, Dan; Steedman, Mark; Beaver, David (December 2010). "The NXT-format Switchboard Corpus: a rich resource for investigating the syntax, semantics, pragmatics and prosody of dialogue" (PDF). Language Resources and Evaluation. 44 (4): 387–419. doi:10.1007/s10579-010-9120-1. S2CID 5176936. Retrieved 26 January 2024.