UTF-8 Unicode text and diacritics


Lonneke

Oct 23, 2010, 9:35:28 AM
to AntConc-discussion
Hi,

I have just installed AntConc for a project on ancient Greek. I am
using a text in UTF-8 Unicode. I have changed the encoding in AntConc
to UTF-8 Unicode, but it does not show letters that have diacritics in
the text. It gives blanks. Any ideas on how I could solve this?


Thanks a lot for your help,
(and thanks for providing such a nice tool, that runs on Linux as
well)

Lonneke

LAnthony

Oct 23, 2010, 9:51:26 AM
to AntConc-discussion
Hi Lonneke,

First, you need to check that the file really is UTF-8. Can I ask how
you know this? (Perhaps open the file in an Internet browser. If you go
to the "Character Encoding" page, it will tell you the encoding.)

Then, you need to set the language encoding in the Global Settings of
AntConc and then check the file in the AntConc File View tool. If this
does not work, can you send me the file? I can quickly confirm the
problem.

Laurence.
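For anyone following along: a stricter check than a browser's encoding
menu is to try decoding the file's bytes as UTF-8 and then list which
non-ASCII characters the text actually contains. Below is a minimal
Python sketch, not part of AntConc. The helper names `is_valid_utf8` and
`describe_non_ascii` are made up for illustration, and the sample word
is only a stand-in for the contents of your own text file. For polytonic
Greek, characters in the Greek Extended block (U+1F00 to U+1FFF) are the
ones a font is most likely to lack, which is one possible cause of
blanks (an assumption, not something confirmed in this thread).

```python
import unicodedata


def is_valid_utf8(data):
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def describe_non_ascii(text):
    """List each distinct non-ASCII character as 'U+XXXX NAME', in order of first use."""
    seen = []
    for ch in text:
        if ord(ch) > 127 and ch not in seen:
            seen.append(ch)
    return ["U+%04X %s" % (ord(ch), unicodedata.name(ch, "UNNAMED")) for ch in seen]


# Stand-in for open("yourfile.txt", "rb").read(); the word is illustrative only.
# \u1f04 is a precomposed alpha with breathing and accent (Greek Extended block).
sample_bytes = "\u1f04\u03bd\u03b8\u03c1\u03c9\u03c0\u03bf\u03c2".encode("utf-8")

if is_valid_utf8(sample_bytes):
    for line in describe_non_ascii(sample_bytes.decode("utf-8")):
        print(line)
else:
    print("Not valid UTF-8")
```

If the file decodes cleanly and the listing shows code points in the
U+1F00 range, the encoding itself is probably fine and the display font
or language settings are the next thing to look at.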

Lonneke

Oct 27, 2010, 11:45:46 AM
to AntConc-discussion
Thank you for your quick reply. I had not seen it until now. I will
answer your questions immediately.

When typing
file 001.txt
on my Ubuntu 8.04 - Hardy Heron machine, I get:
001.txt: Unicode text, UTF-8.

I changed the language encoding to UTF-8 in AntConc.
I tried it again, but I still have the same problem. I will send you
the file by e-mail.

Lonneke

Lonneke

Oct 28, 2010, 9:04:13 AM
to AntConc-discussion
Displaying Greek characters in AntConc works fine on my current
machine (Linux: Ubuntu 8.04 - Hardy Heron).
My problems on the other machine might have been due to language
settings.

Thanks,
Lonneke

sysvol32

Dec 22, 2010, 1:33:06 AM
to AntConc-discussion


Hello,

While I was looking for a useful concordance program that provides
Keyness, Wordlist, Clusters, and Collocation, I found AntConc 3.2.1w.
The program has many features; however, I have some problems with
Turkish characters. Texts that contain Turkish characters like
"şığüçÜĞğİÇÖö" are displayed correctly in the Concordance Results (not
in the HITS); however, I cannot type these characters into the Search
Box for concordancing. Is there any solution to this problem? Thanks.

LAnthony

Dec 23, 2010, 8:19:28 PM
to AntConc-discussion
Hi,

I have heard about this problem on some computers. In theory, you
should be able to just type the Turkish
characters directly into the search box, because AntConc just relies
on your local operating system for text input.
But, I've heard that some people have trouble when typing characters
that are not part of their local operating system.

Are you on a Turkish operating system? If not, perhaps you need to
upgrade the software that allows you to input Turkish characters.

I'm sorry there isn't a better answer. Right now I'm looking to write
AntConc in a more modern computer language that will hopefully resolve
these text input problems. This will take time though.

Best regards,
Laurence.
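As a rough illustration of why the system's character set can matter
here, the Python sketch below checks which of the Turkish letters quoted
earlier in the thread fit into a legacy single-byte code page. It says
nothing about how AntConc or Tk handle keyboard input, which this thread
does not cover. The helper name `unencodable` is hypothetical, and the
choice of Latin-1 and Windows-1254 as the code pages to compare is an
assumption made for the example.

```python
def unencodable(text, codec):
    """Return the distinct characters of `text` that `codec` cannot encode,
    in order of first appearance."""
    missing = []
    for ch in text:
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            if ch not in missing:
                missing.append(ch)
    return missing


# The sample letters quoted in the thread.
turkish = "şığüçÜĞğİÇÖö"

# Latin-1 (Western European) versus Windows-1254 (Turkish).
for codec in ("latin-1", "cp1254"):
    print(codec, unencodable(turkish, codec))
```

With these two codecs, the letters ş, ı, ğ, Ğ and İ fall outside
Latin-1, while Windows-1254 (the Turkish code page) covers the whole
sample.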

sysvol32

Dec 24, 2010, 1:28:20 AM
to AntConc-discussion
Thanks for the reply, Mr. Anthony.

I am using Windows 7 Ultimate Edition Turkish Version with Turkish
charset on my notebook, and I cannot see the characters, as I
explained in the previous message; however, my colleague is using
Windows 7 Starter Turkish Version with Turkish charset, and he can see
all of the Turkish characters without any problem.

By the way, we are trying to build the Turkish National Corpus
(www.tnc.org.tr) at Mersin University; it contains 50 million words of
both written and spoken text. This study will be pioneering work for
Turkish because we can guarantee that our corpus will be
representative and well-balanced.

For the undergraduate students, we set up a laboratory where they can
use AntConc and other software for their corpus studies. AntConc
seems to be the best among them since it gives Keyness, MI, T-Score,
and Log-likelihood. I am glad to hear that you are planning a new
version. I hope that it will bring many more features... As you
know, BNC-Web gives a few more statistical measures, like MI3, Log-Log,
etc.

Your suggestions will be evaluated carefully. Thanks for everything.

Yours sincerely,
Umut

Laurence Anthony

Dec 24, 2010, 2:48:13 AM
to ant...@googlegroups.com
Dear Umut,

So it does seem that the OS is responsible for this problem, but I
think the dated graphical toolkit I use to build the AntConc interface
(called Tk) is also a source of trouble.

AntConc was originally designed for students and so I have also tried
to avoid putting too many features in it, as they can often confuse
beginner users. However, these days, more and more researchers seem to
be using AntConc as a replacement for WordSmith Tools and other tools.
So, perhaps I need a way to switch modes giving users a simple or
advanced interface.

Let me think about this.

Regards,
Laurence.

sysvol32

Dec 24, 2010, 3:02:18 AM
to AntConc-discussion
I am glad to hear that AntConc may get advanced options for advanced
users. I am sure that we can do more with a more comprehensive edition
of AntConc. I would like to be a Beta-Tester of the software when you
compile the new beta version. Nowadays, I am trying to major on
AntConc's special features. As you know, WordSmith Tools is not
freeware, and we are not able to do research freely with it. Thanks
for everything.

Yours sincerely
Umut

Laurence Anthony

Dec 24, 2010, 3:12:03 AM
to ant...@googlegroups.com
> I would like to be a Beta-Tester of
> the software when you compile the new beta version. Nowadays, I am
> trying to major on AntConc's special features

Dear Umut,

Thank you for the offer. When I start getting something working, I can
let you know through the discussion list.

Laurence.
