Using NAWL in AntWord Profiler


Tim VANDENHOEK

Jun 12, 2022, 8:32:33 PM
to AntWordProfiler-Discussion
Hi all,

Question from a rank amateur here....

I am trying to run the NAWL baseword list in the profiler but I get the following error message: 

C:/Users/Tim/Desktop/NAWL_basewrd.txt

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc9 in position 10488: invalid continuation byte

The NGSL lists I downloaded with the NAWL seem to work fine (see screenshot)

Capture.PNG

Any help would be greatly appreciated.

Laurence Anthony

Jun 12, 2022, 8:59:28 PM
to antword...@googlegroups.com
Hi,

The error message explains the problem. The file you are giving to AntWordProfiler is not UTF-8 encoded. Save the file in the UTF-8 encoding and everything should work fine.
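For anyone who prefers to fix the file with a script rather than a text editor, here is a minimal sketch of the re-encoding step. It is not part of AntWordProfiler. The function name `reencode_to_utf8` is made up for illustration, and the source encoding is an assumption: byte 0xc9 is "É" in Windows-1252 and Latin-1, so `cp1252` is a plausible guess, but the thread does not confirm what encoding the file was actually saved in.

```python
from pathlib import Path


def reencode_to_utf8(src, dst, source_encoding="cp1252"):
    """Read `src` using `source_encoding` and write the same text to `dst` as UTF-8.

    `source_encoding` is an assumption; adjust it if the decoded text
    looks wrong. Bytes are read and written directly so that the
    original line endings are preserved.
    """
    raw = Path(src).read_bytes()
    text = raw.decode(source_encoding)  # raises UnicodeDecodeError on a bad guess
    Path(dst).write_bytes(text.encode("utf-8"))
    return text
```

A call might look like `reencode_to_utf8("NAWL_basewrd.txt", "NAWL_basewrd_utf8.txt")`, where the output file name is hypothetical. Writing to a new file keeps the original untouched in case the encoding guess turns out to be wrong.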

Where did you get the NAWL baseword list?

Regards,
Laurence.

###############################################################
Laurence ANTHONY, Ph.D.
Professor of Applied Linguistics
Faculty of Science and Engineering
Waseda University
3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
E-mail: antho...@gmail.com
WWW: http://www.laurenceanthony.net/
###############################################################



Tim VANDENHOEK

Jun 12, 2022, 10:31:18 PM
to AntWordProfiler-Discussion
Hello Laurence,

Many thanks for your efforts with the software and for the quick help. Your advice solved the issue.

I got all the lists from the AntWordProfiler homepage. Only the NAWL list file led to the error (so far as I can tell).

With thanks,

Tim

Laurence Anthony

Jun 12, 2022, 11:14:40 PM
to antword...@googlegroups.com
Hi again,

I looked at the files and they were very old, dated 2014. I suspect I received them from the author. I noticed that the line breaks were a little odd and that the NAWL_basewrd.txt file was indeed incorrectly encoded. I imagine most users of the lists loaded only the 1st, 2nd, and 3rd 1000 lists, which were correctly encoded. Also, AntWordProfiler 1x didn't do strict testing of the file encodings, so the badly encoded file would have been accepted. AntWordProfiler 2x is much stricter about encodings, which is why you saw the error.
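To illustrate the difference between strict and lenient handling in general terms, here is a small Python sketch. It is not AntWordProfiler's actual code, and the function names `find_utf8_error` and `lenient_decode` are hypothetical. A strict decode stops at the first invalid byte and reports its position, much like the error message earlier in this thread, while a lenient decode silently substitutes a replacement character and carries on.

```python
def find_utf8_error(data: bytes):
    """Return (offset, byte_value) of the first invalid UTF-8 sequence, or None."""
    try:
        data.decode("utf-8")  # strict error handling is the default
    except UnicodeDecodeError as exc:
        return exc.start, data[exc.start]
    return None


def lenient_decode(data: bytes) -> str:
    """Decode as UTF-8, replacing undecodable bytes with U+FFFD instead of failing."""
    return data.decode("utf-8", errors="replace")
```

Running `find_utf8_error` on the raw bytes of each word list before loading it is one way to spot a mis-encoded file up front. The lenient version accepts the file but can quietly corrupt any word that contains the bad byte.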

I've updated all the lists now on the website. Please try them if you have time.

Best regards,

Laurence.


Tim VANDENHOEK

Jun 12, 2022, 11:20:51 PM
to AntWordProfiler-Discussion
Hello again,

I just downloaded the updated versions and they work perfectly. Thanks very much!

Best,

Tim Vandenhoek
