
[ANN] ICU4R 0.1.0 - initial release


Lugovoi Nikolai

Jan 19, 2006, 3:13:37 AM
==ICU4R v.0.1.0 - initial release ==

= Abstract

ICU4R is an attempt to provide better Unicode support for Ruby, based
on the ICU library.

Project Site: http://rubyforge.org/projects/icu4r/

Download: http://rubyforge.org/frs/download.php/8116/icu4r-0.1.0.tar.gz

RDoc: http://icu4r.rubyforge.org/

= Install Notes

To build ICU4R you'll need GCC and ICU v3.4 libraries, which can be
downloaded from
http://ibm.com/software/globalization/icu/downloads.jsp

Build and install:
ruby extconf.rb && make && make check && make install

= Features

ICU4R is a Ruby C-extension binding for the ICU library.
It does NOT mirror the full ICU object hierarchy; rather, it is a set of simple
interfaces for some practically useful functionality. It provides:

- UString: String-like class with internal UTF-16 storage;
- UCA rules for UString comparisons (<=>, casecmp);
- Unicode regular expressions;
- encoding (codepage) conversion;
- Unicode normalization;
- access to resource bundles, including ICU locale data;
- transliteration, including rule-based transliteration;

A set of locale-sensitive functions:
- upcase/downcase;
- string collation;
- string search;
- iterators over text line/word/char/sentence breaks;
- message formatting (number/currency/string/time);
- date and number parsing.
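
For readers unfamiliar with the terms in the list above, here is a minimal sketch in plain Ruby. It assumes Ruby 2.4 or later, whose core String class gained Unicode-aware normalization and case mapping long after this announcement. It does not use ICU4R and says nothing about ICU4R's actual method names; the sample strings and variable names are made up for illustration.

```ruby
# Plain-Ruby illustration (assumes Ruby 2.4+); none of this is ICU4R's API.

# "ecole" with a combining acute accent after the first "e" (decomposed form).
decomposed = "e\u0301cole"

# Unicode normalization: NFC composes "e" + U+0301 into the single
# code point U+00E9; NFD splits it apart again.
nfc = decomposed.unicode_normalize(:nfc)
nfd = nfc.unicode_normalize(:nfd)

# Encoding conversion: the same text stored as UTF-16 code units
# (2 bytes per unit), which is the storage form the list mentions.
utf16 = nfc.encode("UTF-16LE")

# A code point outside the Basic Multilingual Plane needs two UTF-16
# code units (a surrogate pair).
clef = "\u{1D11E}"
clef_units = clef.encode("UTF-16LE").bytesize / 2

# Case mapping that works beyond ASCII.
upper = nfc.upcase

puts nfc.length      # 5 code points
puts nfd.length      # 6 code points
puts utf16.bytesize  # 10 bytes, i.e. 5 code units
puts clef_units      # 2 code units for 1 code point
puts upper
```

The point of the sketch is only that "length", "equality" and "uppercase" all depend on which Unicode form and encoding a string is in, which is the kind of detail a binding to ICU is meant to handle.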

== DISCLAIMER ==

The code is still slow and inefficient, and may have security holes, memory leaks,
bugs, inconsistent documentation, and an incomplete test suite. Use it at
your own risk.

Criticism, bug reports, and feature requests are welcome :)

WBR, Nikolai Lugovoi <meadow...@gmail.com>


Alex Fenton

Jan 19, 2006, 11:33:34 AM
Lugovoi Nikolai wrote:
> ==ICU4R v.0.1.0 - initial release ==
>
> ICU4R is an attempt to provide better Unicode support for Ruby, based
> on the ICU library.

Thanks, this is really interesting - not heard of the ICU library before.

There have been a few threads on Ruby + Unicode recently. Though the answer 'it's not broken' is true in that Ruby won't mess with your low-level UTF-8/16 bytes, the absence of support for the semantics of glyphs is a big hindrance when writing multilingual text-handling apps. It's things like having character classes like [:alpha:] and methods like String#upcase that actually work. Looks like ICU4R could address this.

But... I couldn't try it, as the build failed on OS X 10.3. I installed ICU to /usr/local without a hitch and ran extconf.rb without problems, but make died with:

SCIPIUS:~/installers/ruby/icu4r alex$ make
gcc -fno-common -Wall -I. -I/usr/local/lib/ruby/1.8/powerpc-darwin7.9.0 -I/usr/local/lib/ruby/1.8/powerpc-darwin7.9.0 -I. -c ustring.c
ustring.c: In function `icu_ustr_new_set':
ustring.c:169: warning: assignment discards qualifiers from pointer target type
ustring.c: In function `icu_reg_get_replacement':
ustring.c:1854: warning: passing arg 4 of `ustr_splice_units' discards qualifiers from pointer target type
ustring.c:1864: warning: passing arg 4 of `ustr_splice_units' discards qualifiers from pointer target type
ustring.c: In function `icu_ustr_substr':
ustring.c:2296: warning: unused variable `n'
g++ -fno-common -Wall -I. -I/usr/local/lib/ruby/1.8/powerpc-darwin7.9.0 -I/usr/local/lib/ruby/1.8/powerpc-darwin7.9.0 -I. -c fmt.cpp
cc -dynamic -bundle -undefined suppress -flat_namespace -licuuc -licui18n -licudata -L"/usr/local/lib" -o ustring.bundle ustring.o fmt.o -ldl -lobjc
ld: multiple definitions of symbol _rb_cUString
ustring.o definition of _rb_cUString in section (__DATA,__common)
fmt.o definition of _rb_cUString in section (__DATA,__common)
make: *** [ustring.bundle] Error 1

SCIPIUS:~/installers/ruby/icu4r alex$ gcc -v
Reading specs from /usr/libexec/gcc/darwin/ppc/3.3/specs
Thread model: posix
gcc version 3.3 20030304 (Apple Computer, Inc. build 1666)

HTH
alex

Gyoung-Yoon Noh

Jan 19, 2006, 7:44:52 PM

Great work. I'll check it out next week.

--
http://nohmad.sub-port.net


Lugovoi Nikolai

Jan 21, 2006, 4:53:49 AM
Alex, thank you for pointing out this bug.
I had no compile problems with GCC 3.4.2, GCC 4.0, or MSVC++ 7.1, so I
didn't catch it; it looks like GCC 3.3 has different default linking
options.

Could you try the 0.1.1 release?
http://rubyforge.org/frs/download.php/8168/icu4r-0.1.1.tar.gz

(Sorry for the late response)

Alex Fenton

Jan 23, 2006, 2:19:03 PM
Lugovoi Nikolai wrote:

thanks for this, it compiles fine on OS X 10.3 (see below), but segfaults when I run the ruby test with

dyld: ruby Undefined symbols:
___gxx_personality_v0
Trace/BPT trap

Let's take it off-list unless this rings any bells for anyone

alex

SCIPIUS:~/icu4r alex$ make clean; make; ruby test/test_ustring.rb

gcc -fno-common -Wall -I. -I/usr/local/lib/ruby/1.8/powerpc-darwin7.9.0 -I/usr/local/lib/ruby/1.8/powerpc-darwin7.9.0 -I. -c ustring.c

Michal Suchanek

Jan 24, 2006, 12:08:35 PM
On 1/24/06, Alex Fenton <al...@deleteme.pressure.to> wrote:
> Lugovoi Nikolai wrote:
>
> > Could you try the 0.1.1 release?
> > http://rubyforge.org/frs/download.php/8168/icu4r-0.1.1.tar.gz
>
> thanks for this, it compiles fine on OS X 10.3 (see below), but segfaults when I run the ruby test with
>
> dyld: ruby Undefined symbols:
> ___gxx_personality_v0
> Trace/BPT trap
>

This is usually caused by C++ code compiled with different versions of gcc
being linked together.

Check that the extension and all the libraries it links against are compiled
with the same gcc.

Thanks

Michal
