Issue using AntConc with Russian text


Fabrice Deprez

Apr 11, 2015, 9:51:15 AM
to ant...@googlegroups.com
Hi,

I have been trying to use AntConc with a corpus of HTML files taken from a Russian website (example file attached), but have been unable to do so. In each case, the file name is displayed as "??????" and the software can't analyse the file.
I have tried switching AntConc to every Cyrillic encoding it offers, without success.
Internet Explorer tells me the encoding of those files is UTF-8, but the same problem arises when I select that encoding in AntConc.

I tried using the software with HTML pages in English, and it worked without any problems.

Is there any solution to this? The only thing I could find on the internet was confirmation that the software is supposed to be able to handle the Russian language, so I assume the issue may lie elsewhere.

Have a good day,

Fabrice
МИД России 01 10 2013 Интервью официального представителя МИД России А.К.Лукашевича «РИА Новости» в связи с официальным визитом в Россию Мини.html

Laurence Anthony

Apr 11, 2015, 10:15:08 AM
to ant...@googlegroups.com
The problem is the combination of the Russian file name and the UTF-8 content. I suggest you simplify the file names to something shorter that contains only ASCII characters (e.g. file1.html), and then everything should work fine.

Here is an example of your file (with the tags hidden):

[Inline image 1]

I hope that helps.

Laurence.
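
For a larger corpus, the renaming suggested above could be scripted rather than done by hand. What follows is only a minimal sketch in Python, not part of AntConc: the function name `rename_to_ascii`, the `file` prefix, and the `filename_map.csv` mapping file are all assumptions chosen for illustration. It renames every .html file in a folder to file1.html, file2.html, and so on, and records the original Russian titles so they are not lost.

```python
import csv
from pathlib import Path


def rename_to_ascii(folder, prefix="file", mapping_name="filename_map.csv"):
    """Rename each .html file in `folder` to <prefix><n>.html (ASCII only).

    Returns a list of (new_name, original_name) pairs and writes the same
    pairs to a UTF-8 CSV file inside `folder`.
    All names used here are hypothetical, chosen for illustration.
    """
    folder = Path(folder)
    # Sort the files so the numbering is the same on every run.
    originals = sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() == ".html"
    )

    mapping = []
    for index, path in enumerate(originals, start=1):
        new_name = f"{prefix}{index}.html"
        target = folder / new_name
        # Refuse to overwrite a different file that already has this name.
        if target.exists() and target != path:
            raise FileExistsError(f"{target} already exists")
        path.rename(target)
        mapping.append((new_name, path.name))

    # Keep a record of which short name corresponds to which original title.
    with open(folder / mapping_name, "w", encoding="utf-8", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["new_name", "original_name"])
        writer.writerows(mapping)

    return mapping


# Example call (the folder name "corpus" is hypothetical):
# rename_to_ascii("corpus")
```

Only the file names change; the UTF-8 content of each file is left untouched, so the UTF-8 encoding setting in AntConc should still be the one to select. Running the script on a copy of the corpus first is the safer choice.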



###############################################################
Laurence ANTHONY, Ph.D.
Professor
Center for English Language Education in Science and Engineering (CELESE)
Faculty of Science and Engineering
Waseda University
3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
E-mail: antho...@gmail.com
WWW: http://www.laurenceanthony.net/
###############################################################


Fabrice Deprez

Apr 11, 2015, 10:26:19 AM
to ant...@googlegroups.com
Alright, it's working perfectly. Thanks a lot for your help!

Laurence Anthony

Apr 11, 2015, 10:28:25 AM
to ant...@googlegroups.com
You're welcome!

Laurence.

