Hi Arnold,
> I've just released an UDF lib to stem words using the snowball stemmer.
> It only contains 1 function stem_word. Please have a look at
> http://www.mysqludf.com/lib_mysqludf_stem.
Nice job! I was able to install and run stem_word without issue.
A few comments on miscellaneous things:
#1
"The location of the plugin dir defaults to: <mysql-home>/lib/mysql/
but can be configured to a custom location in the my.cnf."
It's not a big deal, because you properly linked to the documentation
page. However, I always suggest that people run:
SHOW VARIABLES LIKE 'plugin_dir'
so they know immediately where to move the lib.
#2
In the API section, it would be cool if you could document the return
type for the functions.
#3
I was wondering what value to use for the language argument. I think
both 'en' and 'English' should work, but is it possible to document a
list of valid languages, or better, ask the stemmer which languages
are supported?
If I input some obvious nonsense, like stem_word('~', 'bla'), I get
NULL. I am not sure what the best approach is, but I expected to
get an error informing me that '~' is not a supported language.
#4
I tried passing a non-constant for the language argument:
mysql> select stem_word(language, language) from world.countrylanguage limit 1;
+-------------------------------+
| stem_word(language, language) |
+-------------------------------+
| NULL                          |
+-------------------------------+
1 row in set (0.00 sec)
Is this because you handle the error in the row-level function rather
than in the init function?
Personally I would prefer to do this check in the init function:
/* In the init function, args->args[i] is NULL when argument i is not a constant. */
if (args->args[0] == NULL) {
    strcpy(message, "Invalid argument value: language argument must be a "
                    "constant value.");
    return 1;
}
(or something like this)
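To make the idea a bit more concrete, here is a rough sketch of what such an init function could look like. It is only an illustration, not your actual code: the struct definitions are trimmed stand-ins for the real UDF_INIT and UDF_ARGS from mysql.h (so the sketch compiles on its own), the return type is int where the real API uses my_bool, the argument-count check and its message are my own additions, and I am assuming the language is the first argument.

```c
#include <string.h>

/* Stand-ins for the MySQL UDF types, reduced to the fields used here.
   In the real library these come from <mysql.h>. */
enum Item_result { STRING_RESULT, REAL_RESULT, INT_RESULT };

typedef struct {
    unsigned int arg_count;      /* number of arguments in the call */
    enum Item_result *arg_type;  /* type of each argument */
    char **args;                 /* in init: non-NULL only for constants */
} UDF_ARGS;

typedef struct {
    int maybe_null;              /* 1 if the function can return NULL */
} UDF_INIT;

/* Returns 0 on success, 1 on error (the reason is copied into message). */
int stem_word_init(UDF_INIT *initid, UDF_ARGS *args, char *message)
{
    if (args->arg_count != 2) {
        strcpy(message, "stem_word requires exactly two arguments.");
        return 1;
    }
    /* Assumption: the language is argument 0 and must be a constant. */
    if (args->args[0] == NULL) {
        strcpy(message, "Invalid argument value: language argument must be "
                        "a constant value.");
        return 1;
    }
    initid->maybe_null = 1;
    return 0;
}
```

With a check like this, the query above would fail at statement preparation with a readable message instead of quietly returning NULL for every row.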
#5
I got the impression that stem_word coerces non-string arguments to
strings, is that correct? I was wondering if it would make more sense
to fail with an error in case a non-string is passed. (Not really
sure, just a thought)
(I tried:
mysql> select stem_word(0,10) limit 1;
+-----------------+
| stem_word(0,10) |
+-----------------+
| NULL            |
+-----------------+
)
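For what it's worth, both behaviours can be decided in the init function by looking at the arg_type array. The sketch below shows the two options side by side. Again this is only an illustration under assumptions: the struct is a trimmed stand-in for the real UDF_ARGS from mysql.h, and the helper names require_string_args and coerce_string_args are made up for this mail.

```c
#include <stdio.h>

/* Stand-ins for the MySQL UDF types, reduced to the fields used here. */
enum Item_result { STRING_RESULT, REAL_RESULT, INT_RESULT };

typedef struct {
    unsigned int arg_count;      /* number of arguments in the call */
    enum Item_result *arg_type;  /* type of each argument */
} UDF_ARGS;

/* Strict option: returns 1 and fills message if any argument is not a
   string, 0 otherwise. */
int require_string_args(const UDF_ARGS *args, char *message)
{
    unsigned int i;
    for (i = 0; i < args->arg_count; i++) {
        if (args->arg_type[i] != STRING_RESULT) {
            sprintf(message, "Argument %u must be a string.", i + 1);
            return 1;
        }
    }
    return 0;
}

/* Lenient option: setting arg_type in init asks the server to convert
   each argument to a string before the row-level function is called. */
void coerce_string_args(UDF_ARGS *args)
{
    unsigned int i;
    for (i = 0; i < args->arg_count; i++)
        args->arg_type[i] = STRING_RESULT;
}
```

Either one would be called from the init function; the strict one would turn stem_word(0,10) into an error, the lenient one would treat it as stem_word('0','10').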
#6 Finally I was wondering if you anticipate more functions will be
added to this lib. It seemed to me that it may make sense to move
stem_word to lib_mysqludf_str, but this is just a thought - maybe you
have some reasons to want to keep it in a separate lib.
>
> Also, there a 3 bugs for the sys, str and preg udf libs. Please take a
> look at http://bugs.mysqludf.com.
I added a comment to the bug for the sys lib. I hope it will be picked
up so I get the feedback.
I wasn't really aware that a bug was assigned to me; I guess I should
check the tracker on a regular basis. Would it be possible to send
automatic email notifications? If something like that is already in
place, great, I just don't recall getting an email.
(Of course an RSS/Atom feed would work just as well for me. Apologies if
that is in place already; I haven't seen it.)
kind regards,
Roland