- Source: CMU Pronouncing Dictionary
The CMU Pronouncing Dictionary (also known as CMUdict) is an open-source pronouncing dictionary originally created by the Speech Group at Carnegie Mellon University (CMU) for use in speech recognition research.
CMUdict maps the orthographic form of English words to their phonetic transcription in North American pronunciation. It is commonly used to generate pronunciation representations for automatic speech recognition (ASR), e.g. the CMU Sphinx system, and for speech synthesis (TTS), e.g. the Festival system. CMUdict can also serve as a training corpus for statistical grapheme-to-phoneme (g2p) models, which generate pronunciations for words not yet included in the dictionary.
The most recent release is 0.7b; it contains over 134,000 entries. An interactive lookup version is available.
Database format
The database is distributed as a plain text file with one entry to a line in the format "WORD
The following is a table of phonemes used by the CMU Pronouncing Dictionary.
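Separately from the phoneme table, the one-entry-per-line layout described in this section can be read with a few lines of Python. The following is a minimal sketch, not an official loader. It assumes several conventions that are not stated in the text above: that the headword is separated from the phonemes by whitespace, that comment lines begin with `;;;`, and that alternative pronunciations carry a numeric suffix such as `(1)`. The function name `parse_cmudict` and the sample lines are hypothetical and written only for illustration.

```python
import re
from collections import defaultdict

# Assumed convention: alternative pronunciations are marked "WORD(1)", "WORD(2)", ...
_VARIANT = re.compile(r"^(?P<word>.+?)(?:\((?P<n>\d+)\))?$")


def parse_cmudict(lines):
    """Return {word: [phoneme list, ...]} from CMUdict-style text lines."""
    entries = defaultdict(list)
    for raw in lines:
        line = raw.strip()
        # Skip blank lines and (assumed) ";;;" comment lines.
        if not line or line.startswith(";;;"):
            continue
        # First whitespace-separated token is the headword, the rest are phonemes.
        head, *phones = line.split()
        if not phones:
            continue
        # Fold "WORD(1)" into the same key as "WORD".
        word = _VARIANT.match(head).group("word")
        entries[word].append(phones)
    return dict(entries)


# Sample lines written for this sketch, not copied from a release file.
SAMPLE = """\
;;; illustrative sample
HELLO  HH AH0 L OW1
HELLO(1)  HH EH0 L OW1
WORLD  W ER1 L D
"""

lexicon = parse_cmudict(SAMPLE.splitlines())
print(lexicon["HELLO"])
```

Keeping a list of pronunciations per word, rather than a single value, preserves variants, which matters for downstream uses such as ASR lexicons where every listed pronunciation may be needed.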
History
Applications
The Unifon converter is based on the CMU Pronouncing Dictionary.
The Natural Language Toolkit contains an interface to the CMU Pronouncing Dictionary.
The Carnegie Mellon Logios tool incorporates the CMU Pronouncing Dictionary.
PronunDict, a pronunciation dictionary of American English, uses the CMU Pronouncing Dictionary as its data source. Pronunciation is transcribed in IPA symbols. This dictionary also supports searching by pronunciation.
Some singing voice synthesizers, such as CeVIO Creative Studio and Synthesizer V, use a modified version of the CMU Pronouncing Dictionary to synthesize English singing voices.
Transcriber, a tool for full-text phonetic transcription, uses the CMU Pronouncing Dictionary.
15.ai, a real-time text-to-speech tool using artificial intelligence, uses the CMU Pronouncing Dictionary.
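Several of the applications above present dictionary pronunciations in IPA. As a rough illustration of how such a conversion might work, the sketch below maps the dictionary's ARPABET-style symbols to IPA. The mapping table follows the conventional ARPABET-to-IPA correspondence and is an assumption of this example, not taken from any of the listed tools. The function name `to_ipa` is hypothetical. Stress marks are placed directly before the stressed vowel, which is a simplification: standard IPA places them before the stressed syllable.

```python
# Conventional ARPABET-to-IPA correspondence (assumed for this sketch).
ARPABET_TO_IPA = {
    "AA": "ɑ", "AE": "æ", "AH": "ʌ", "AO": "ɔ", "AW": "aʊ", "AY": "aɪ",
    "B": "b", "CH": "tʃ", "D": "d", "DH": "ð", "EH": "ɛ", "ER": "ɝ",
    "EY": "eɪ", "F": "f", "G": "ɡ", "HH": "h", "IH": "ɪ", "IY": "i",
    "JH": "dʒ", "K": "k", "L": "l", "M": "m", "N": "n", "NG": "ŋ",
    "OW": "oʊ", "OY": "ɔɪ", "P": "p", "R": "ɹ", "S": "s", "SH": "ʃ",
    "T": "t", "TH": "θ", "UH": "ʊ", "UW": "u", "V": "v", "W": "w",
    "Y": "j", "Z": "z", "ZH": "ʒ",
}

# Unstressed AH and ER are usually written as reduced vowels.
UNSTRESSED = {"AH": "ə", "ER": "ɚ"}

STRESS_MARK = {"1": "ˈ", "2": "ˌ"}  # primary and secondary stress


def to_ipa(phones):
    """Convert a list like ["HH", "AH0", "L", "OW1"] to an IPA string."""
    out = []
    for phone in phones:
        # Vowels carry a trailing stress digit (0, 1 or 2); consonants do not.
        stress = phone[-1] if phone[-1].isdigit() else ""
        base = phone[:-1] if stress else phone
        symbol = ARPABET_TO_IPA[base]
        if stress == "0":
            symbol = UNSTRESSED.get(base, symbol)
        out.append(STRESS_MARK.get(stress, "") + symbol)
    return "".join(out)


print(to_ipa(["HH", "AH0", "L", "OW1"]))
```

An unknown symbol raises `KeyError` here, which makes gaps in the table visible instead of silently producing an incomplete transcription.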
See also
Moby Pronunciator, a similar project
References
External links
The current version of the dictionary is at SourceForge, although there is also a version maintained on GitHub.
Homepage – includes database search
RDF – a conversion to Resource Description Framework by the open-source Texai project.