lua interface for hunspell


Matt White

Mar 17, 2010, 5:32:19 PM
to scite-interest
Hello all,
For anyone interested in spell checking in SciTE, I put together a
simple Lua interface to hunspell (Windows only for now) and a spell
checking script for SciTE. Misspelled words are red-underlined;
double-clicking a misspelled word brings up a list of suggestions.
The installation is a bit involved, since SciTE Extman and hunspell
dictionary files are needed in addition to the DLL and script
available at:

http://code.google.com/p/luahunspell/

Comments? Suggestions?

Matt White
pbs...@gmail.com

romor

Mar 18, 2010, 8:19:06 PM
to scite-interest
Works perfectly for me.

Thank you :)

romor

Mar 18, 2010, 9:39:07 PM
to scite-interest
Though it can't load Cyrillic dictionaries: "Error: hunspell not
initialized. Please check dictionary path"

And if the dictionary path is wrong it can hang SciTE (the process is
running but the SciTE window is unavailable).

mozers

Mar 19, 2010, 2:50:53 AM
to romor
Friday, March 19, 2010, 4:39:07 AM, romor wrote:
> Though it can't load Cyrillic dictionaries: Error: hunspell not
> initialized. Please check dictionary path

The message
print("Error: hunspell not initialized. Please check dictionary path.")
appears if:
- the dictionary is not found (you can explicitly specify the path in hunspell.init)
- the dictionary is corrupted
- the dictionary's character encoding differs from that of the checked text
- the test call hunspell.spell("test") uses a word that is not in the dictionary (a Cyrillic dictionary does not include the word "test").
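
The first cause in the list (dictionary not found) can be ruled out before hunspell.init is called. A minimal sketch in plain Lua: file_exists, missing_dictionary_files and the example paths are hypothetical names chosen for illustration, and hunspell.init is called with the two arguments used elsewhere in this thread.

```lua
-- Hypothetical helper: returns true if the file can be opened for reading.
local function file_exists(path)
  local f = io.open(path, "rb")
  if f then
    f:close()
    return true
  end
  return false
end

-- Hypothetical helper: returns the list of dictionary files that are missing.
local function missing_dictionary_files(aff_path, dic_path)
  local missing = {}
  for _, path in ipairs({ aff_path, dic_path }) do
    if not file_exists(path) then
      missing[#missing + 1] = path
    end
  end
  return missing
end

-- Example paths (assumptions); adjust to where the dictionaries live.
local aff, dic = "C:/hunspell/en_US.aff", "C:/hunspell/en_US.dic"
local missing = missing_dictionary_files(aff, dic)
if #missing > 0 then
  print("Dictionary file(s) not found: " .. table.concat(missing, ", "))
elseif hunspell then
  -- Only reached inside SciTE with the hunspell module loaded.
  hunspell.init(aff, dic)
end
```

This only covers the missing-file case; a corrupted dictionary or an encoding mismatch would still need to be checked by hand.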

Thanks to Matt White!

--
mozers
<http://scite.net.ru>

romor

Mar 19, 2010, 8:05:08 AM
to scite-interest
I tried two dictionaries before I posted:
- one was in CP1251 (my local code page)
- the other was in UTF-8

They both failed with the same error. When I added the word
"test" (thanks for the tip) to the first one (CP1251), the error
stopped popping up, but spell checking still did not work. Adding it
to the UTF-8 dictionary had no effect at all.

On Mar 19, 7:50 am, mozers <moz...@gmail.com> wrote:
> - dictionary is not found (you can explicitly specify the path to hunspell.init)
> - dictionary corrupted
> - the dictionary's character encoding differs from that of the checked text
> - the test call hunspell.spell("test") uses a word that is not in the dictionary (a Cyrillic dictionary does not include the word "test").

- it's in the same place as the others
- it doesn't look like it; I opened both dictionaries and they seem fine,
and in the stated encoding
- I'm in that code page (CP1251), and I also tried spell checking a
UTF-8 document (with the other dictionary) without success
- this has some effect, as noted above

Additionally:
- the .aff file's first line is:
SET microsoft-cp1251

I tried changing it to:
SET CP1251
SET CP-1251

with no luck.

romor

Mar 19, 2010, 8:06:34 AM
to scite-interest
On Mar 19, 2:39 am, romor <dejan....@gmail.com> wrote:

> and if path for dictionary is wrong it can hang SciTE (process is
> running but SciTE window is unavailable)

Just for the record, SciTE hung in this situation:

- I moved dict files to "dict" subfolder and added:

hunspell.init("dict\en_US.aff", "dict\en_US.dic");

Notice the single backslashes: with these, the SciTE window won't show.

Changing it to:

hunspell.init("dict\\en_US.aff", "dict\\en_US.dic");

works as expected.
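
The reason is that a backslash inside a quoted Lua string starts an escape sequence, so "dict\en_US.aff" never contains a backslash (Lua 5.1 silently drops it, and later Lua versions reject the unknown escape). A small sketch of three spellings that do produce a usable path; the variable names are only for illustration:

```lua
local escaped = "dict\\en_US.aff"   -- doubled backslash inside a quoted string
local bracket = [[dict\en_US.aff]]  -- long-bracket string: no escapes are processed
local forward = "dict/en_US.aff"    -- forward slash, which Windows file APIs accept

-- The first two spellings are the same string.
assert(escaped == bracket)
print(escaped)
print(forward)
```

The long-bracket form is handy when a path is pasted from Explorer, since nothing needs to be doubled.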

Philippe Lhoste

Mar 19, 2010, 10:41:02 AM
to scite-i...@googlegroups.com
On 17/03/2010 22:32, Matt White wrote:
> For anyone interested in spell checking in SciTE, I put together a
> simple Lua interface to hunspell (Windows only for now) and a spell
> checking script for SciTE. Misspelled words are red-underlined;
> double-clicking a misspelled word brings up a list of suggestions.

Cool, I have wanted that for a long time, but never managed to do it myself.

For the record, there is also the SciHun / SciTE + Hunspell project
<http://sourceforge.net/projects/scihun/>
But it requires numerous deep changes to the source code, so it is harder to maintain across
versions.

> The installation is a bit involved, since SciTE Extman and hunspell
> dictionary files are needed in addition to the DLL and script
> available at:

I suggest putting some explanations on the home page or a wiki page of your project.

I suggest giving the example as:
hunspell.init("C:/hunspell/en_US.aff", "C:/hunspell/en_US.dic")
to avoid issues like romor's.


It is also my first opportunity to test extman...
And I had trouble with it!
So this message might interest Steve as well.

Following the instructions, I put extman.lua in my SciteDefaultHome directory (where I
have my properties files), and spawner-ex.dll in my SciTE directory (where I have the
binary files).
Now, each time I switch files, I have one or several cmd boxes flashing, and I get the
message: "cannot load spawner. The specified module cannot be found" (the second part of
the message is actually in French, I translated it approximately...).

I moved spawner-ex.dll to SciteDefaultHome but got the same result.
Using Sysinternals' Process Monitor, I found out that it searches for it one folder above
SciteDefaultHome!
I found no installation instructions for this DLL in extman.html (it isn't even mentioned
there) nor in documentation.html in scite-debug's package.
The answer is, of course, in extman.lua.
After some tracing, I found out the issue was that I wrote:
ext.lua.startup.script=$(SciteDefaultHome)/extman.lua
where extman expected:
ext.lua.startup.script=$(SciteDefaultHome)\extman.lua

Windows tolerates mixed paths (it was E:\Dev\PhiLhoSoft\settings\SciTE/extman.lua) but
extman didn't expect that...
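
A script that has to split such a path could normalize the separators first. A minimal sketch, not taken from extman itself; to_windows_path is a hypothetical helper name:

```lua
-- Hypothetical helper: convert every forward slash to a backslash.
local function to_windows_path(path)
  -- The extra parentheses drop gsub's second return value (the substitution count).
  return (path:gsub("/", "\\"))
end

print(to_windows_path([[E:\Dev\PhiLhoSoft\settings\SciTE/extman.lua]]))
-- prints E:\Dev\PhiLhoSoft\settings\SciTE\extman.lua
```

After this step, code that looks for the last backslash to find the containing folder sees a consistent path.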

Next issue was that hunspell.dll was expected in SciTE dir, along with SciLexer.dll, which
is more logical...
I put spawner-ex.dll there too (no need to put it with my settings) by adding:
spawner.extension.path=.
in my properties file.

I changed spellchecker.lua for the dictionary init:

local dictPath = props['SciteDefaultHome'] .. [[\dic\]]
local dict = 'en-US'
hunspell.init(dictPath .. dict .. ".aff", dictPath .. dict .. ".dic"); -- relative to SciTE folder

plus it offers a value to display in the message:
print("Error: hunspell not initialized. Please check dictionary path: " .. dictPath);

However, I don't see the Toggle spelling option on all buffers...
I am not sure why, but it appears in the Tools menu only on even buffers (buffer 0, 2,
4...). Mmm, no, it changes (sometimes it is on buffers 1, 3, 5...) when adding new files...

And it behaves curiously.
If I have b = 'false' it marks false as incorrect because of the quotes (seen as
apostrophes...).
On HTML files, it turns the whole file to style 12, except on bad words.
Highlighted words have a strange style (like -84).
Perhaps the checker should also look at the style, e.g. to check only strings and comments.
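
The style check suggested in the last line could look roughly like this. It is only a sketch: make_style_set and should_check are hypothetical helpers, the style numbers are placeholders (real values depend on the lexer in use), and the style lookup is passed in as a function so the example runs outside SciTE. Inside SciTE, that function would presumably wrap editor.StyleAt[pos].

```lua
-- Hypothetical helper: build a lookup set from a list of style numbers.
local function make_style_set(styles)
  local set = {}
  for _, s in ipairs(styles) do
    set[s] = true
  end
  return set
end

-- Hypothetical helper: should the word starting at `pos` be spell checked?
-- `style_at` is any function mapping a position to a style number.
local function should_check(pos, style_at, allowed)
  return allowed[style_at(pos)] == true
end

-- Placeholder values: pretend styles 1, 2 and 6 are comments and strings.
local allowed = make_style_set({ 1, 2, 6 })

-- Stand-in for the editor: position -> style number.
local fake_styles = { [0] = 5, [10] = 6, [20] = 1 }
local function fake_style_at(pos)
  return fake_styles[pos] or 0
end

print(should_check(0, fake_style_at, allowed))   -- false: style 5 is not allowed
print(should_check(10, fake_style_at, allowed))  -- true: style 6 is allowed
```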

There is room for improvement, but that's already a very good start, I think I will try
and improve it, if I can.

--
Philippe Lhoste
-- (near) Paris -- France
-- http://Phi.Lho.free.fr
-- -- -- -- -- -- -- -- -- -- -- -- -- --

Matt White

Mar 20, 2010, 1:12:25 PM
to scite-interest
Thanks for the feedback. I've uploaded a new version with a number of
changes.

I changed to "modern" indicators for marking misspelled words instead
of style byte indicators; this fixes the problem with HTML files.
Words in quotes, e.g., 'false' should be handled better now.
Dictionary path and name can now be set in SciTE properties file.

Philippe - I can't reproduce the problem with "Toggle Spelling" only
appearing for even or odd buffers; does the shortcut (F9) still work
for all buffers for you? I encountered the exact same problem with
the extman path because the version of extman I was initially using
did work with '/' as a path separator!

Romor - for Cyrillic, try uncommenting the "os.setlocale" line in the
new version to use code page 1251. You might also try setting the
spell_ignoreCAPS option to false. Let me know if this works. I don't
think UTF-8 is an option, since Lua doesn't seem to support Unicode.
Also, I turned the dictionary loading check off by default, so you
don't need to add "test" to the dictionary now.
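
For reference, a locale switch of this kind can be sketched as follows. This is not the line from the script itself: try_locale is a hypothetical helper, and the ".1251" name relies on the Windows C runtime accepting a bare code page, which is an assumption about the platform.

```lua
-- Hypothetical helper: try to switch the C locale; returns true on success.
local function try_locale(name)
  -- os.setlocale returns nil when the request cannot be honored.
  return os.setlocale(name) ~= nil
end

-- ".1251" is the "code page only" form understood by the Windows C runtime
-- (assumption); other systems are likely to reject it.
if not try_locale(".1251") then
  print("Could not switch to code page 1251; current locale is " .. os.setlocale())
end
```

Checking the return value matters because a failed os.setlocale call leaves the previous locale in place without raising an error.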

Matt White
pbs...@gmail.com

romor

Mar 20, 2010, 3:00:37 PM
to scite-interest

On Mar 20, 6:12 pm, Matt White <pbs...@gmail.com> wrote:

> Romor - for Cyrillic, try uncommenting the "os.setlocale" line in the
> new version to use code page 1251.

Perfect
Thanks :)

romor

Mar 20, 2010, 4:45:58 PM
to scite-interest
+ thumbs up for your To Do ;)

ADelmotte

Mar 30, 2010, 8:53:22 AM
to scite-interest
Hi!

I think I have all the components in the SciTE directory, but I'll need
detailed installation notes to be able to use the system.
I am working on Windows XP; SciTE is in
C:\Documents and Settings\Principal\Local Settings\Application Data\Apps\SciTE
(installed with the .msi file).
I also have extman, the dictionaries, hunspell.dll and the script
spellcheck.lua in the SciTE directory, along with all the properties
files, etc.

How do I get SciTE to use spell checking?

Thanks,

Alain

Philippe Lhoste

Mar 30, 2010, 11:52:11 AM
to scite-i...@googlegroups.com
On 30/03/2010 14:53, ADelmotte wrote:
> I think I have all components in the Scite directory, but I'll need a
> detailed installation note to be able to use the system.
> I am working on Windows XP, Scite is in
> C:\Documents and Settings\Principal\Local Settings\Application Data
> \Apps\SciTE

What is this new trend of putting applications in Local Settings\Application Data? (And why
this Apps?) I first saw that with Google Chrome and I was horrified: the basic idea of the
Program Files folder vs. Documents and Settings is to separate applications (which can be
installed again if needed, and uninstalled) from their settings/data.
Here they are mixed up, making it harder to do a proper backup, among other things.

Is that a new policy from Microsoft or some mistake by the packagers?

> (installation with the .msi file)
> I also have extman, dictionaries, hunspell.dll and the script
> spellcheck.lua in the Scite directory along all the properties, etc
> files.
>
> How to have Scite use spellchecking ?

I recently wrote a long message about my experience in setting all this up, with the added
difficulty of having the settings separated from the exe... Maybe it can help you get it
running?
There are also instructions that come with the spell checking package.
