Hi,
So I'm trying to work with a database written by another program. It's encoded in UTF-16LE, and I'm trying to edit data in it in Ruby on both Mac and Linux. I've been having numerous problems so far.
----
Problem #1: Incorrect encoding on queried results
Using the "sqlite3" program:
SQLite version 3.7.16.2 2013-04-12 11:52:43
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> pragma encoding;
UTF-16le
sqlite> SELECT * FROM ItemTable WHERE key = 'muted';
muted|false
Using sqlite3-ruby:
>> db = SQLite3::Database.new("file__0.localstorage")
=> #<SQLite3::Database:0x000000030906e8>
>> q = db.query("SELECT * FROM ItemTable WHERE key = 'muted'"); key, value = q.next
=> ["\u756D\u6574\x64", "f\x00a\x00l\x00s\x00e\x00"]
>> [key, value].map(&:encoding)
=> [#<Encoding:UTF-16LE>, #<Encoding:ASCII-8BIT>]
It actually has them reversed:
>> key.force_encoding("ASCII-8BIT")
=> "muted"
>> value.force_encoding("UTF-16LE")
=> "false"
It also gets the encoding wrong for values I insert myself, but in a different way:
>> db.execute("INSERT INTO ItemTable VALUES ('insertedKey', 'insertedValue')")
=> []
>> q = db.query("SELECT * FROM ItemTable WHERE key = 'insertedKey'"); key, value = q.next
=> ["\u6E69\u6573\u7472\u6465\u654B\x79", "\u6E69\u6573\u7472\u6465\u6156\u756C\x65"]
>> [key, value].map(&:encoding)
=> [#<Encoding:UTF-16LE>, #<Encoding:UTF-16LE>]
>> key.force_encoding("ASCII-8BIT")
=> "insertedKey"
>> value.force_encoding("ASCII-8BIT")
=> "insertedValue"
But sqlite3 is happy with my inserted stuff:
sqlite> select * from itemtable where key = 'insertedKey';
insertedKey|insertedValue
Note that the above occurs even if I create a DB completely from scratch, using
SQLite3::Database.new(filename.sqlite".encode("UTF-16LE"))
so it's not just that I have a weird corrupt database or something. (Also, is that filename encode trick documented anywhere? I believe I randomly stumbled across it in sample code.)
It also seems I'm not the only one to encounter this, but the author of this post just found a satisfactory workaround and went with that: http://stackoverflow.com/questions/8250171/ruby-sqlite3-encoding-problems
----
Problem #2: Binding parameters in queries only accepts UTF-8
>> q = db.query("SELECT * FROM ItemTable WHERE key = ?", ["muted".encode("US-ASCII")]); q.count
=> 0
>> q = db.query("SELECT * FROM ItemTable WHERE key = ?", ["muted".encode("UTF-16LE")]); q.count
=> 0
>> q = db.query("SELECT * FROM ItemTable WHERE key = ?", ["muted".encode("ASCII-8BIT")]); q.count
=> 0
>> q = db.query("SELECT * FROM ItemTable WHERE key = ?", ["muted".encode("UTF-8")]); q.count
=> 1
Discovered this when I tried to do queries in a source file that wasn't marked with an "enconding: utf-8" comment tag at the top. My strings were US-ASCII by default and I couldn't do any queries with bound params.
----
So, yeah. What's up with the current status of sqlite3-ruby UTF-16 support? Are these just a really unfortunate series of bugs, or is UTF-16 not supported at all and I'm basically on my own? (I'd be happier if it was at least consistently wrong, but it seems to be wrong in different ways depending on the situation.)