An application of Mecab at RomajiDesu

Hải Bùi Hoàng

Jun 19, 2014, 6:42:20 AM
to nlp-ja...@googlegroups.com

I've built an application of MeCab at http://www.romajidesu.com/translator; the tool is both a Japanese translator and a sentence analyzer.
The online tool decomposes a Japanese sentence into its parts and renders it in kana, romaji and English (using the Google Translate engine).
The target audience is, of course, beginner to intermediate learners as well as self-taught learners of Japanese.
Do you think it will be useful? What do you think I can do to improve it?

Best regards,
Hai

Jim Breen

Jun 19, 2014, 7:57:54 PM
to nlp-ja...@googlegroups.com
Just a quick comment - I don't find it particularly useful. I can get Google
Translate whenever I want, and I have MeCab on local systems and
via a simple server (http://www.edrdg.org/~jwb/mecabdemo.html).
I think things like Rikai (http://www.rikai.com/perl/Home.pl) and the
Rikaichan plugin cover this area quite well.

BTW, you have the romaji for 場合 as "bâi". You might like to work
on that  8-)

Jim



--
You received this message because you are subscribed to the Google Groups "nlp-Japanese" group.
Visit this group at http://groups.google.com/group/nlp-japanese.

--
Jim Breen
Adjunct Snr Research Fellow, Japanese Studies Centre, Monash University

Michael Wayne Goodman

Jun 20, 2014, 5:23:12 AM
to nlp-ja...@googlegroups.com
Hi Hai,

To offer a differing opinion, I think it's a fairly slick interface combining the various tools. Of course users can head off to Google Translate or run MeCab locally or on a dedicated web service (which I find useful and have made use of), but for learners of Japanese I think there is value in having them available together in a user-friendly way. There are some things you could do differently, though:

1. Provide attribution for Google Translate. Not only is it useful to know where the translation is coming from, but it's also a required term for using the service (https://developers.google.com/translate/v2/attribution). I don't even see it on the "About" page where you list other resources.

2. The ruby furigana over the kanji do not line up. It's not always obvious for learners of Japanese which kanji contribute which sounds, so making sure they align vertically (where possible) can help in this regard.

3. The mouseover information on the original sentence is not particularly useful. A dictionary lookup (perhaps with some word-sense disambiguation, at least for ordering the results) would be nice. Currently I only see pronunciation and a simple part-of-speech. E.g. for verbal morphology, it's nice to know that verb + ta makes the perfective form of the verb. I think I recall the "Perapera" pop-up dictionary tool (similar to Rikai) doing this.

Good luck!
-Michael Wayne Goodman
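
On the furigana alignment point (item 2 above), here is a minimal sketch of one way to keep the ruby text over the kanji only. It strips any kana that the surface form and the reading share at either end, so okurigana are left bare. The function names are hypothetical, and the sketch assumes the reading is already in the same kana script as the surface (a katakana reading would need converting to hiragana first). It does not split a reading across the kanji of a compound such as 場合; that needs per-kanji reading data.

```python
import html


def is_kana(ch: str) -> bool:
    # Hiragana and katakana blocks (U+3040..U+30FF).
    return "\u3040" <= ch <= "\u30ff"


def to_ruby(surface: str, reading: str) -> str:
    """Wrap only the non-kana core of `surface` in <ruby> markup."""
    limit = min(len(surface), len(reading))

    # Peel off kana shared at the end (okurigana).
    tail = 0
    while (tail < limit
           and is_kana(surface[-1 - tail])
           and surface[-1 - tail] == reading[-1 - tail]):
        tail += 1

    # Peel off kana shared at the start, without overlapping the tail.
    head = 0
    while (head < limit - tail
           and is_kana(surface[head])
           and surface[head] == reading[head]):
        head += 1

    base = surface[head:len(surface) - tail]
    kana = reading[head:len(reading) - tail]
    if not base or not kana:
        # All kana (or nothing left to annotate): no furigana needed.
        return html.escape(surface)

    return (html.escape(surface[:head])
            + "<ruby>" + html.escape(base)
            + "<rt>" + html.escape(kana) + "</rt></ruby>"
            + html.escape(surface[len(surface) - tail:]))


print(to_ruby("食べる", "たべる"))  # <ruby>食<rt>た</rt></ruby>べる
print(to_ruby("お茶", "おちゃ"))    # お<ruby>茶<rt>ちゃ</rt></ruby>
```

Emitting one `<ruby>` element per annotated span, rather than one per sentence, lets the browser place each reading directly above the characters it belongs to.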

Jim Breen

Jun 20, 2014, 7:20:18 PM
to nlp-ja...@googlegroups.com
On 20 June 2014 19:22, Michael Wayne Goodman <good...@uw.edu> wrote:

> 3. The mouseover information on the original sentence is not particularly useful. A dictionary lookup (perhaps with some word-sense disambiguation, at least for ordering the results) would be nice. Currently I only see pronunciation and a simple part-of-speech. E.g. for verbal morphology, it's nice to know that verb + ta makes the perfective form of the verb. I think I recall the "Perapera" pop-up dictionary tool (similar to Rikai) doing this.

And that is the really hard part. Take as an example 援助交際. You can throw it
into MeCab, etc. and be told it's 援助 + 交際. If your dictionary lookup uses
that split, you get:
援助: assistance; aid; support
交際: company; friendship; association; society; acquaintance

All of that is no help at all (and neither is GT's "Aid communication").
You only get the meaning of 援助交際 by looking at it in its entirety.

It gets worse with really idiomatic expressions. Consider 鼬の最後っ屁.
Looking at the components will lead you to something about the last fart
of a weasel. In fact it means a last-resort defence when you're cornered.

So the message is that for proper dictionary lookups you need to do some
sort of greedy aggregation of morphemes, possibly sorting out inflections
as you go.

Cheers

Jim
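
The greedy aggregation described above could be sketched roughly as follows. This is a minimal illustration, not how any of the tools mentioned in the thread work: the lexicon is a toy dict standing in for a real dictionary, the function name and the `max_span` parameter are made up, and the segmentation of 鼬の最後っ屁 is assumed rather than taken from actual MeCab output. The gloss for 援助交際 is supplied here for illustration and is not from the thread. Inflection handling is left out entirely.

```python
# Toy lexicon (hypothetical stand-in for a real dictionary).
LEXICON = {
    "援助": "assistance; aid; support",
    "交際": "company; friendship; association; society; acquaintance",
    "援助交際": "compensated dating",  # illustrative gloss, not from the thread
    "鼬の最後っ屁": "last-resort defence when cornered",
}


def aggregate(morphemes, lexicon, max_span=6):
    """Greedy longest match over a morpheme sequence.

    At each position, join as many consecutive morphemes as possible
    (up to max_span) into a string that is a lexicon key. If nothing
    longer matches, emit the single morpheme, with None as its gloss
    when it is not in the lexicon.
    """
    results = []
    i = 0
    while i < len(morphemes):
        # Try the longest candidate first and shrink towards one morpheme.
        for j in range(min(len(morphemes), i + max_span), i, -1):
            surface = "".join(morphemes[i:j])
            if surface in lexicon or j == i + 1:
                results.append((surface, lexicon.get(surface)))
                i = j
                break
    return results


print(aggregate(["援助", "交際"], LEXICON))
# [('援助交際', 'compensated dating')]

# Assumed segmentation, for illustration only.
print(aggregate(["鼬", "の", "最後", "っ", "屁"], LEXICON))
# [('鼬の最後っ屁', 'last-resort defence when cornered')]
```

Because the longest candidate is tried first, the compound wins over its parts whenever the lexicon has an entry for it, and the component glosses are only used as a fallback.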