Encoding problem when importing from a UTF-8 encoded text file

regi...@bazanov.net

Dec 30, 2016, 1:13:58 AM
to mnemosyne-proj-users
Hello,

I've created a UTF-8 encoded file containing transcriptions and Cyrillic text:
https://postimg.org/image/cf3b2g6o7/

Here is another program to confirm it's in UTF-8:
https://postimg.org/image/vy7wbt5fr/

Then I import that file into Mnemosyne:
https://postimg.org/image/rdlpwvlqf/

And the text is shown incorrectly:
https://postimg.org/image/wdj64trd3/

Am I doing something wrong?

The file can be found here:
http://ra.ae/anki.txt
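
For readers who want to verify the encoding from a script rather than from screenshots, here is a minimal Python 3 sketch. It uses only the standard library; `check_utf8` is a hypothetical helper name, and the sample line is made up for illustration (it is not taken from anki.txt).

```python
import codecs
import os
import tempfile


def check_utf8(path):
    """Return (is_valid_utf8, has_bom) for the file at `path`."""
    with open(path, "rb") as f:      # read raw bytes, no decoding yet
        data = f.read()
    has_bom = data.startswith(codecs.BOM_UTF8)
    try:
        data.decode("utf-8")         # strict decoding raises on invalid bytes
    except UnicodeDecodeError:
        return False, has_bom
    return True, has_bom


if __name__ == "__main__":
    # Hypothetical sample: a transcription, a tab, and a Cyrillic word.
    sample = "privet\tпривет\n"
    with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
        tmp.write(sample.encode("utf-8"))
    print(check_utf8(tmp.name))
    os.remove(tmp.name)
```

If a check like this passes, the file itself is probably fine and the problem is more likely on the importing side, which is consistent with the reply below.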

Peter Bienstman

Dec 30, 2016, 2:45:40 AM
to mnemosyne-...@googlegroups.com
Hi,

Thanks for the detailed report! I'm currently away, but I'll fix this for the next release.

Cheers,

Peter

regi...@bazanov.net

Dec 30, 2016, 9:57:11 AM
to mnemosyne-proj-users
Thank you, Peter!

Is there a workaround for me in the meantime?
(I'm a programmer so I'm OK if the workaround is non-trivial)

Kind regards,
Pavel

Peter Bienstman

Dec 31, 2016, 3:36:04 AM
to mnemosyne-...@googlegroups.com
Hi,

For the time being, you can create the cards manually.

I'll get you a private prerelease version as soon as I've fixed the bug after my holidays :-)

Cheers,

Peter

regi...@bazanov.net

Jan 4, 2017, 10:54:21 PM
to mnemosyne-proj-users
Thank you, Peter!

I will eagerly wait for your prerelease version :)

Kind regards,
Pavel

Peter Bienstman

Jan 5, 2017, 9:27:43 AM
to mnemosyne-proj-users, regi...@bazanov.net

regi...@bazanov.net

Jan 5, 2017, 7:57:52 PM
to mnemosyne-proj-users, regi...@bazanov.net
Awesome, it is working correctly now, thanks!

MarcoP

Jan 6, 2017, 12:53:35 AM
to mnemosyne-proj-users, regi...@bazanov.net
This may be a similar problem, so I'm posting it here:

When importing cards in Mnemosyne 2.4 (menu File-Import..., then select "2.x cards") using the following card sets (available from the "Sharing cards" page of the website, i.e. http://mnemosyne-proj.org/card-sets, under German or Chinese):
chinese_hsk.cards
hanziSY.cards
quick_german.cards

the following error message occurs:

"An unexpected error has occurred.
Please forward the following info to the developers:

Traceback (innermost last):
File "mnemosyne/pyqt_ui/import_dlg.py", line 76, in accept
File "mnemosyne/libmnemosyne/file_formats/mnemosyne2_cards.py", line 213, in do_import
File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/encodings/ascii.py", line 26, in decode
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe6 in position 561: ordinal not in range(128)"

(For the other files, the same UnicodeDecodeError is reported for byte 0xe6 at position 351 or at position 976; the reported position depends on the file.)
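
A short Python 3 sketch of what this error message means: the ASCII codec only accepts byte values below 128, while in UTF-8 the byte 0xe6 is the lead byte of three-byte sequences for code points U+6000 to U+6FFF (a range that includes many Chinese characters). The helper name `first_undecodable_byte` and the sample payload are hypothetical; what the card sets actually contain at the reported positions is not known from this thread.

```python
def first_undecodable_byte(data, encoding="ascii"):
    """Return (position, byte_value) of the first byte `encoding` rejects, or None."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError as err:
        # err.start is the same "position" that appears in the error message.
        return err.start, data[err.start]
    return None


# Hypothetical payload: six ASCII bytes of markup, then one Chinese character.
payload = b"<text>" + "是".encode("utf-8") + b"</text>"

print(first_undecodable_byte(payload))           # (6, 230); 230 is 0xe6
print(first_undecodable_byte(payload, "utf-8"))  # None
```

The same bytes decode cleanly as UTF-8, which suggests the data is fine and the wrong codec is being chosen when the file is read.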

Furthermore, the "Decompressing" message stays on screen and can only be dismissed by pressing ESC.
Machine: Mac OS X 10.11.6

Strangely, these card sets can be imported with Mnemosyne 2.3.6, and they are then available in Mnemosyne 2.4 as well (both Mnemosyne versions are installed in parallel; select menu Cards-Browse cards... in Mnemosyne 2.4).

==> As a workaround, you can import the file with an earlier version to make it available in v2.4.

Hope it helps... M

Peter Bienstman

Jan 6, 2017, 8:17:25 AM
to mnemosyne-...@googlegroups.com, regi...@bazanov.net, Devin Howard
Hi,

I tried importing this card set: http://mnemosyne-proj.org/cards/chinese-hsk-set

Under Windows, this proceeded without problems.

It might be a platform-specific issue, so I've committed a patch that explicitly specifies the encoding. Hopefully this helps in your case.
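
The patch itself is not quoted in this thread, so the following is only a sketch of the general technique in Python 3, not the actual Mnemosyne change. When `open()` is called in text mode without an `encoding` argument, Python falls back to a locale-dependent default (unless UTF-8 mode is enabled in newer versions), and that default can differ between a Windows and a macOS installation. `read_text` and the file name are hypothetical.

```python
import locale
import os
import tempfile


def read_text(path, encoding="utf-8"):
    """Read a text file with an explicit encoding instead of the locale default."""
    with open(path, encoding=encoding) as f:
        return f.read()


if __name__ == "__main__":
    # The platform-dependent default used when no encoding is passed to open().
    print("locale default:", locale.getpreferredencoding(False))

    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "cards.xml")  # hypothetical file name
        with open(path, "wb") as f:
            f.write("<f>是</f>".encode("utf-8"))

        # Succeeds regardless of the locale, because the encoding is explicit.
        print(ascii(read_text(path)))

        # Forcing ASCII reproduces the kind of failure shown in the traceback.
        try:
            read_text(path, encoding="ascii")
        except UnicodeDecodeError as err:
            print("ascii failed:", err.reason)
```

Passing the encoding explicitly makes the result independent of how the application was launched, which would explain why the same import can work on one platform and fail on another.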

Perhaps Devin, our OSX packager, can verify if this solves the problem.

Cheers,

Peter

MarcoP

Jan 6, 2017, 10:48:47 AM
to mnemosyne-proj-users
> I tried importing this card set: http://mnemosyne-proj.org/cards/chinese-hsk-set
> It might be a platform specific issue, so I've committed a patch to explicitly specify the encoding, so hopefully this helps in your case.

Thank you.

Note that it worked in Mnemosyne 2.3.6 (Mac version), and the vocabulary is then also available in Mnemosyne 2.4 (which seems to be a reasonable workaround).

MarcoP

abaku...@arcor.de

Jan 6, 2017, 1:44:23 PM
to mnemosyne-proj-users@googlegroups com
Hi Peter,

I have seen a similar error too; it mentioned something with "dlg" (the same file?). I am learning Turkish on Windows.

By the way, when I use Mnemosyne 2.4 on my PC, I can no longer delete any cards from the card browser. I'll send you a screenshot of the error message in a few days. Exporting the cards as a text file doesn't work either.

Greetings,

Abakus

Devin Howard

Jan 6, 2017, 6:17:40 PM
to Peter Bienstman, mnemosyne-...@googlegroups.com, regi...@bazanov.net
I replicated this error with my build of 2.4 on macOS. I'll try it with the latest pbienst branch when I next get a chance.

Devin Howard

Jan 23, 2017, 12:22:53 PM
to Peter Bienstman, mnemosyne-...@googlegroups.com, regi...@bazanov.net
I confirmed today that this problem is fixed on macOS with the latest development code.