The Japanese-German dictionary WaDokuJT contains accent and pronunciation data: a number for the accent, plus kana with mark-up, which together can be transformed into IPA or into phonetic or phonological transcriptions in kana, Rōmaji, etc. The dictionary data can be used under a Creative Commons license with Attribution and Share Alike. In my file, about 130,000 of 325,000 records contain accent information.
In most, or at least many, cases devoiced morae have corresponding mark-up; an example is Asakusa, where the u is devoiced to give [a˹saku̥sa] in IPA. Similar mark-up is used for the morae ga, gi, gu, ge, go when they are not nasalized within a word, which is the case in many katakana loanwords. An example is プログラム, which appears online as "puro’guramu" at http://wadoku.de/entry/view/98996. Furthermore, は pronounced wa and へ pronounced e also have diacritical mark-up.
The data is available at https://github.com/Wadoku. This file is a little old, but we are working on a new version of the dictionary file, which should be available in autumn. There are two online versions of the dictionary: wadoku.eu and wadoku.de. Accent data is displayed at wadoku.de, and a Rōmaji version can be found in a detail layout.
In my opinion too, Japanese pronunciation is important and not always as straightforward as learners of Japanese think. Unfortunately, our programmers were not interested in a decent rendering of the pronunciation details, and we couldn't pay them for improvements. So devoicing never made it to the user interface. Yes, there are rules for devoicing, but they are not always unambiguous. We also have audio data for the basic vocabulary, spoken by a former TV announcer. An example is はし with accent 1 or accent 2 at https://wadoku.eu/?query=%E3%81%AF%E3%81%97, but unfortunately the path to the audio files seems to be broken.
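To illustrate how an accent number relates to a pitch contour, here is a minimal Python sketch. It assumes the common convention that the number names the mora after which the pitch falls, with 0 meaning no fall; whether the WaDokuJT data follows exactly this convention is an assumption here. The function names split_morae and pitch_pattern are hypothetical and not part of any WaDokuJT tooling, and the devoicing and nasalization mark-up is ignored.

```python
# Small kana that attach to the preceding kana to form a single mora.
SMALL_KANA = set("ゃゅょぁぃぅぇぉャュョァィゥェォ")


def split_morae(kana):
    """Split a kana string into morae (っ, ん and ー count as their own mora)."""
    morae = []
    for ch in kana:
        if ch in SMALL_KANA and morae:
            morae[-1] += ch
        else:
            morae.append(ch)
    return morae


def pitch_pattern(kana, accent):
    """Return (mora, "H" or "L") pairs for a reading and an accent number.

    Assumed convention: accent n > 0 means the pitch falls after the
    n-th mora; accent 0 means there is no fall within the word.
    """
    morae = split_morae(kana)
    if not 0 <= accent <= len(morae):
        raise ValueError("accent number out of range for this reading")
    pattern = []
    for i in range(1, len(morae) + 1):
        if accent == 0:
            high = i > 1           # low start, then high to the end
        elif accent == 1:
            high = i == 1          # high start, then low
        else:
            high = 1 < i <= accent  # low start, high up to the accented mora
        pattern.append("H" if high else "L")
    return list(zip(morae, pattern))


print(pitch_pattern("はし", 1))
print(pitch_pattern("はし", 2))
```

Under these assumptions, はし with accent 1 comes out as high-low and with accent 2 as low-high; for accent 2 the fall only becomes audible on a following particle.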
With the new version of the dictionary data, we hope not only to fix the existing problems but also to add stroke order data, and we will check for characters in the dictionary that are missing from KanjiVG. This means that we also hope to see improvements to KanjiVG in the foreseeable future.
Best wishes
Ulrich
> On 29.06.2017 at 19:36, Tomash Brechko <tomash....@gmail.com> wrote:
>
> I was considering using accent data from OJAD for my Android app some time ago, but couldn't figure out their terms of use. The closest I found was the last two sentences in the top frame at http://www.gavo.t.u-tokyo.ac.jp/ojad/eng/pages/notes, which ruled out commercial use (which I was targeting at that point), yet left me undecided about whether it's OK to use the data in a free app, or whether it is supposed to be used only through their site. I don't mean this as any kind of suspicion or rebuke, but did you have any luck finding their explicit terms of use, or did you ask them for some kind of permission, or clarify the matter in some other way? OJAD seems to be the only source of accent data, and their audio is also a valuable resource: though some of us privately use clips from JapanesePod101, alas they can't be used even in free apps, let alone paid ones.