This rhyming dictionary is part of a suite of tools for lyricists, songwriters, and rappers. If you write lyrics you should definitely check out RapPad to get feedback and share your songs with the world.
At the beginning of the file they've listed symbols and punctuation. Each symbol is followed by the English spelling of the symbol's name, with no space between them. This is then followed by the two-space divider and the ARPAbet code. Since you're only looking for rhymes, you don't have to do anything special with the symbols section: you're never going to be looking for a rhyme to ...ELLIPSIS.
Basically, if you've already found the cmudict then you've already got what you asked for: a database of words and their pronunciations. To find words that rhyme you'll have to parse the flat file into a table and run a query to find words that end with the same ARPAbet code.
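As a rough sketch of the parsing step, assuming the classic cmudict layout (a word, the two-space divider, then space-separated ARPAbet codes, with `;;;` comment lines and `WORD(1)` markers for alternate pronunciations), something like the following would load the flat file into a dictionary. The function name `parse_cmudict` and the sample lines are illustrative, not part of cmudict itself.

```python
import re

def parse_cmudict(lines):
    """Map each word to a list of pronunciations (each a list of ARPAbet codes)."""
    entries = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and comment lines (assumed to start with ";;;").
        if not line or line.startswith(";;;"):
            continue
        # The word and its phonemes are separated by the two-space divider.
        word, sep, phones = line.partition("  ")
        if not sep:
            continue  # not a dictionary entry
        # Alternate pronunciations look like WORD(1); strip the numeric suffix.
        word = re.sub(r"\(\d+\)$", "", word)
        entries.setdefault(word, []).append(phones.split())
    return entries

# Illustrative sample lines in the assumed cmudict format.
sample = [
    ";;; a comment line",
    "...ELLIPSIS  IH2 L IH1 P S IH0 S",
    "LOVE  L AH1 V",
    "DOVE  D AH1 V",
    "DOVE(1)  D OW1 V",
]
entries = parse_cmudict(sample)
print(entries["LOVE"])  # [['L', 'AH1', 'V']]
print(entries["DOVE"])  # [['D', 'AH1', 'V'], ['D', 'OW1', 'V']]
```

For the real file you would pass an open file handle instead of `sample`; the symbols section parses like any other entry and can simply be ignored when looking up rhymes.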
Once you've got the data into whatever kind of database you choose, you can then use that database to find correlations between the ARPAbet codes. You could find rhymes, consonance, assonance, and other mnemonic devices. It would go something like this:
First, look up the pronunciation of the word; for LOVE it is L AH1 V. Then, find the vowel phoneme that has primary stress. In other words, look for the number "1" in that pronunciation. The text directly to the left of the 1 is the vowel sound that has primary stress (AH). That text, and everything to the right of it, are your "rhyme phonemes" (for lack of a better term). So the rhyme phonemes for LOVE are AH1 V.
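The rule above can be sketched in a few lines of Python. The function name `rhyme_phonemes` is made up for illustration, and two choices here are assumptions rather than anything the dictionary prescribes: if a pronunciation has more than one primary stress the last one is used, and if it has none the whole pronunciation is returned.

```python
def rhyme_phonemes(phones):
    """Return the phonemes from the primary-stressed vowel to the end of the word."""
    # Scan from the end so that, if there are several "1" stresses,
    # the last one is used (an assumption, see the note above).
    for i in range(len(phones) - 1, -1, -1):
        if phones[i].endswith("1"):
            return phones[i:]
    # No primary stress marked: fall back to the whole pronunciation.
    return list(phones)

print(rhyme_phonemes("L AH1 V".split()))  # ['AH1', 'V']
```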
For a simple conceptual example, you could compute the "rhyme phonemes" of all words in the dictionary, then insert them into a "Rhymes" table whose columns are WordText and RhymePhonemes. For example, you might see records like:
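As a minimal sketch of such a table, here is one way it might look with Python's built-in `sqlite3` module and an in-memory database. The `Rhymes` table and its two columns come from the description above; the helper names, the handful of sample words, and the self-join query are illustrative assumptions.

```python
import sqlite3

def rhyme_phonemes(phones):
    """Phonemes from the (last) primary-stressed vowel to the end of the word."""
    for i in range(len(phones) - 1, -1, -1):
        if phones[i].endswith("1"):
            return phones[i:]
    return list(phones)

# Illustrative sample of word -> pronunciation pairs in cmudict style.
pronunciations = {
    "LOVE": "L AH1 V",
    "DOVE": "D AH1 V",
    "GLOVE": "G L AH1 V",
    "MOVE": "M UW1 V",
}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Rhymes (WordText TEXT, RhymePhonemes TEXT)")
conn.executemany(
    "INSERT INTO Rhymes (WordText, RhymePhonemes) VALUES (?, ?)",
    [(w, " ".join(rhyme_phonemes(p.split()))) for w, p in pronunciations.items()],
)

def find_rhymes(conn, word):
    """Return other words whose RhymePhonemes match those of `word`."""
    rows = conn.execute(
        "SELECT r2.WordText FROM Rhymes r1 "
        "JOIN Rhymes r2 ON r1.RhymePhonemes = r2.RhymePhonemes "
        "WHERE r1.WordText = ? AND r2.WordText <> ? "
        "ORDER BY r2.WordText",
        (word, word),
    ).fetchall()
    return [row[0] for row in rows]

print(find_rhymes(conn, "LOVE"))  # ['DOVE', 'GLOVE']
```

With the full dictionary loaded, an index on RhymePhonemes would likely be worth adding so the join does not scan the whole table.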
I've also had some luck storing the raw pronunciation in the database in varying "full" formats (forward and reversed strings of the pronunciation, with stress marks and without stress marks, etc.) rather than "chopped" into specific pieces like a rhyme-phoneme column.
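To make the "full" formats concrete, here is a small sketch that derives the four variants mentioned (forward and reversed, with and without stress marks) from one pronunciation string. The function and key names are hypothetical; the answer above does not specify a schema.

```python
import re

def pronunciation_variants(pronunciation):
    """Build forward/reversed strings, with and without stress digits."""
    phones = pronunciation.split()
    # Stress marks are the digits attached to vowel codes, e.g. AH1 -> AH.
    no_stress = [re.sub(r"\d", "", p) for p in phones]
    return {
        "forward": " ".join(phones),
        "reversed": " ".join(reversed(phones)),
        "forward_no_stress": " ".join(no_stress),
        "reversed_no_stress": " ".join(reversed(no_stress)),
    }

variants = pronunciation_variants("L AH1 V")
print(variants["reversed"])            # V AH1 L
print(variants["reversed_no_stress"])  # V AH L
```

One reason the reversed form can be handy: words that end in the same sounds share a prefix in their reversed strings, so a prefix match on that column finds words with a common ending.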
If you are only using a small demo subset, you could always search for a word in an existing rhyming dictionary and then put its rhyme matches into a text file. If you want a full database of words, you could hook up a dictionary to a zombieJS UI automation, screen scrape the words, and put them into your own database. This would allow you to create your own rhyme database, although to be honest, that's quite an undertaking for your original request.
These dictionaries specify the pronunciations of characters using the fǎnqiè method, giving a pair of characters indicating the onset and remainder of the syllable respectively. The later rime tables gave a significantly more precise and systematic account of the sounds of these dictionaries by tabulating syllables by their onsets, rhyme groups, tones and other properties. The phonological system inferred from these books, often interpreted using the rime tables, is known as Middle Chinese, and has been the key datum for efforts to recover the sounds of early forms of Chinese. It incorporates most of the distinctions found in modern varieties of Chinese, as well as some that are no longer distinguished. It has also been used together with other evidence in the reconstruction of the Old Chinese language (1st millennium BC).
Chinese scholars produced dictionaries to codify reading pronunciations for the correct recitation of the classics and the associated rhyme conventions of regulated verse.[3] The earliest rime dictionary was the Shenglei (lit. 'sound types') by Li Deng (李登) of the Three Kingdoms period, containing more than 11,000 characters grouped under the five notes of the ancient Chinese musical scale.[4] The book did not survive, and is known only from descriptions in later works.[5]
Each tone was divided into rhyme groups (韻 yùn), traditionally named after the first character of the group, called the yùnmù (韻目 'rhyme eye').[16] Lu Fayan's edition had 193 rhyme groups, which were expanded to 195 by Zhangsun Nayan and then to 206 by Li Zhou.[17] The following shows the beginning of the first rhyme group of the Guangyun, with first character 東 ('east'):
Each rhyme group was subdivided into homophone groups preceded by a small circle called a niǔ (紐 'button'). The entry for each character gave a brief explanation of its meaning. At the end of the entry for the first character of a homophone group was a description of its pronunciation, given by a fǎnqiè formula, a pair of characters indicating the initial (聲母 shēngmǔ) and final (韻母 yùnmǔ) respectively. For example, the pronunciation of 東 was described using the characters 德 tok and 紅 huwng indicating t + uwng = tuwng.[18][19][a] The formula was followed by the character 反 fǎn (in the Qieyun) or the character 切 qiè (in the Guangyun), followed by the number of homophonous characters.[20][21] In the above sample, this formula is followed by the number 十七, indicating that there are 17 entries, including 東, with the same pronunciation.
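The fǎnqiè formula described above can be illustrated with a short sketch: take the initial of the first speller and the final of the second. The readings and the t + uwng = tuwng example are those given in the text; the `READINGS` table, its split of each reading into initial and final, and the function name are assumptions made for the example.

```python
# Hypothetical mini-table: each speller's reading split into (initial, final).
READINGS = {
    "德": ("t", "ok"),    # tok
    "紅": ("h", "uwng"),  # huwng
}

def fanqie(first_speller, second_speller, readings=READINGS):
    """Combine the initial of the first speller with the final of the second."""
    initial, _ = readings[first_speller]
    _, final = readings[second_speller]
    return initial + final

print(fanqie("德", "紅"))  # tuwng, the reading given for 東
```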
The order of the rhyme groups within each volume does not seem to follow any rule, except that similar groups were placed together, and corresponding groups in different tones were usually placed in the same order. Where two rhyme groups were similar, there was a tendency to choose exemplary words with the same initial.[22] The table of contents of the Guangyun marks adjacent rhyme groups as tóngyòng (同用), meaning they could rhyme in regulated verse.[23] In the above sample, under the entry for the rhyme group 刪 in the last part of the table of contents (on the right page) is the notation "山同用", indicating that this group could rhyme with the following group 山.
The following are the rhyme groups of the Guangyun with their modern names, the finals they include (see next section), and the broad rhyme groups (shè 攝) they were assigned to in the rime tables. A few entries are re-ordered to place corresponding rhyme groups of different tones in the same row, and darker lines separate the tóngyòng groups:
From this we may conclude that 東, 德 and 多 must all have had the same initial. By following such chains of equivalences Chen was able to identify categories of equivalent initial spellers, and similarly for the finals. More common segments tended to have the most variants. Words with the same final would rhyme, but a rhyme group might include between one and four finals with different medial glides, as seen in the above table of rhyme groups. The inventory of initials Chen obtained resembled the 36 initials of the rime tables, but with significant differences. In particular the "light lip sounds" and "heavy lip sounds" of the rime tables were not distinguished in the fanqie, while each of the "proper tooth sounds" corresponded to two distinct fanqie initial categories.[39][40][41]
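The idea of following chains of equivalences can be sketched as grouping characters into connected sets: whenever a character's initial is spelled by another character, the two are linked, and every chain of links ends up in one category. This is only an illustration of the grouping logic using a union-find structure, not a description of Chen's actual procedure. The links below are illustrative and chosen to match the conclusion stated in the text; "X" and "Y" are placeholders for an unrelated pair.

```python
def group_by_links(links):
    """Group items into sets connected by chains of (item, speller) links."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = {}
    for item in list(parent):
        groups.setdefault(find(item), set()).add(item)
    return list(groups.values())

# Illustrative links: each pair is (character, character spelling its initial).
links = [("東", "德"), ("德", "多"), ("X", "Y")]
groups = group_by_links(links)
print(groups)
```

Here 東, 德 and 多 fall into one group because 東 is linked to 德 and 德 to 多, while the placeholder pair forms a separate group.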
Karlgren also sought to determine the phonetic values of the abstract categories yielded by the formal analysis, by comparing the categories of the Guangyun with other types of evidence, each of which presented its own problems. The Song dynasty rime tables applied a sophisticated featural analysis to the rime books, but were separated from them by centuries of sound change, and some of their categories are difficult to interpret. The so-called Sino-Xenic pronunciations, readings of Chinese loanwords in Vietnamese, Korean and Japanese, were ancient, but affected by the different phonological structures of those languages. Finally, modern varieties of Chinese provided a wealth of evidence, but often influenced each other as a result of a millennium of migration and political upheavals. After applying a variant of the comparative method in a subsidiary role to flesh out the rime dictionary evidence, Karlgren believed that he had reconstructed the speech of the Sui-Tang capital Chang'an.[48]
Assigning phonetic values to the finals has proved more difficult, as many of the distinctions reflected in the Qieyun have been lost over time. Karlgren proposed that type B finals contained a palatal medial /j/, a position that is still accepted by most scholars. However, Pulleyblank, noting the use of these syllables in the transcription of foreign words without such a medial, claims the medial developed later. A labiovelar medial /w/ is also widely accepted, with some syllables having both medials. The codas are believed to reflect those of many modern varieties, namely the glides /j/ and /w/, nasals /m/, /n/ and /ŋ/ and corresponding stops /p/, /t/ and /k/. Some authors argue that the placement of the first four rhyme groups in the Qieyun suggests that they had distinct codas, reconstructed as labiovelars /ŋʷ/ and /kʷ/. Most reconstructions posit a large number of vowels to distinguish the many Qieyun rhyme classes that occur with some codas, but the number and the values assigned vary widely.[58][59]
The Chinese linguist Li Rong published a study of the early edition of the Qieyun found in 1947, showing that the expanded dictionaries had preserved the phonological structure of the Qieyun intact, except for a merger of initials /dʐ/ and /ʐ/. For example, although the number of rhyme groups increased from 193 in the earlier dictionary to 206 in the Guangyun, the differences are limited to splitting rhyme groups based on the presence or absence of a medial glide /w/.[60][61][62]
However, the preface of the recovered Qieyun suggests that it represented a compromise between northern and southern reading pronunciations.[aa] Most linguists now believe that no single dialect contained all the distinctions recorded, but that each distinction did occur somewhere.[8][63] For example, the Qieyun distinguished three rhyme groups 支, 脂 and 之 (all pronounced zhī in modern Chinese), although 支 and 脂 were not distinguished in parts of the north, while 脂 and 之 rhymed in the south. The three groups are treated as tongyong in the Guangyun and have merged in all modern varieties.[64] Although Karlgren's identification of the Qieyun system with a Sui-Tang standard is no longer accepted, the fact that it contains more distinctions than any single contemporary form of speech means that it retains more information about earlier stages of the language, and is a major component in the reconstruction of Old Chinese phonology.[65]