Approach to solve #230 - unicode in FFI calls

David Banks

Aug 26, 2012, 4:59:12 PM
to mosh-develo...@googlegroups.com
Hi, I want to try to solve this bug:
http://code.google.com/p/mosh-scheme/issues/detail?id=230

What approach would you recommend? I would normally just dig in and
try to fix it, but the cause of this one seems like it can be quite
complicated.
Just wanting some advice on what patch would be accepted for this.

Cheers,
--
David Banks <amo...@gmail.com>

okuoku

Aug 27, 2012, 8:45:44 AM
to mosh-develo...@googlegroups.com
2012/8/27 David Banks <amo...@gmail.com>:
> Hi, I want to try to solve this bug:
> http://code.google.com/p/mosh-scheme/issues/detail?id=230
>
> What approach would you recommend? I would normally just dig in and
> try to fix it, but the cause of this one seems like it can be quite
> complicated.
> Just wanting some advice on what patch would be accepted for this.

Hmm, it works for me if I convert the argument to a UTF-8 bytevector
first:

(import (rnrs)
        (nmosh pffi util) ;; string->utf8/null
        (mosh ffi))

(define str "What\x02bc;s up?")
(define str-bv (string->utf8/null str)) ;; (mosh ffi) has an equivalent IIRC

(let ((lib (open-shared-library "libc.so"))) ;; I'm using FreeBSD amd64
  (define puts (c-function lib int puts char*))

  (puts "Hello, world!")
  (puts "What\x02bc;s up?") ;; NG
  (puts str-bv))            ;; OK


It seems CStack::push lacks "real" UTF-8 support, since it uses
ascii_c_str to convert the argument.

The comment at FFI.cpp:339 says:
// String -> char* (utf-8 ascii only)

So the possible solutions are:

A) Wrap your FFI procedure and always pass char* arguments as
bytevectors

This plan doesn't need mosh itself to be fixed, and it keeps some
performance advantage (assuming almost all FFI char* arguments are
7-bit strings). It's a bit too hackish, though.

B) Fix CStack::push so that it converts the string object into a
bytevector

... as if by string->utf8.
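
To make (B) more concrete, here is a minimal sketch of the kind of
conversion it would need. This is not mosh's actual code:
std::u32string is only a stand-in for mosh's internal string type, and
to_utf8_c_string is a hypothetical helper name. It encodes each code
point as UTF-8 and appends the NUL terminator that a C char* parameter
expects.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of mosh): encode Unicode code points as
// UTF-8 and append a NUL terminator, so the buffer can be passed as char*.
// std::u32string is an assumed stand-in for mosh's internal string type.
// Note: an embedded U+0000 would end the C string early on the C side.
std::vector<char> to_utf8_c_string(const std::u32string& text) {
    std::vector<char> out;
    for (char32_t cp : text) {
        // Surrogates and out-of-range values are not valid scalar values;
        // replace them with U+FFFD (one possible policy, an assumption).
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            cp = 0xFFFD;
        }
        if (cp < 0x80) {            // 1 byte: plain 7-bit ASCII
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {    // 2 bytes
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {  // 3 bytes
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {                    // 4 bytes
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    out.push_back('\0');
    return out;
}
```

For the string from the example above, U+02BC comes out as the two
bytes 0xCA 0xBC, which is what string->utf8 would produce. In a real
patch the buffer would also have to stay alive until the foreign call
returns.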


Personally, I took approach (A), because Windows uses UTF-16
everywhere and I couldn't use mosh FFI's char* arguments at all.
But anyway, if you are going for (B), I'll absolutely appreciate your
patch :)
mosh already silently assumes that the environment's LANG setting is
UTF-8. I think UTF-8 is the dominant encoding in *NIX environments.


-- oku