How to enable ActiveRecord to support CJK queries?


Dillon Peng

Jun 23, 2015, 8:02:36 AM
to rubyonra...@googlegroups.com

I asked a question on Stack Overflow: http://stackoverflow.com/q/30970840/1054800, but there is no answer yet.

After starting the Rails console, I can execute the following query:

1.9.3-p551 :001 > ActivityObject.where(:title => "kiketurpis integer aliquet")
This returns the single matching record from the database. But if I enter:

1.9.3-p551 :002 > ActivityObject.where(:title => '第一个纵纹')

(The quoted value is a Chinese string.) I get every record in the activity_objects table, which means I cannot use a Chinese string in a where predicate.

However, I can query this record directly with the Chinese string in psql:

vish_development=# select * from activity_objects where title = '第一个纵纹';

So my question is: what should I do to enable CJK strings in ActiveRecord's where and similar methods?

Frederick Cheung

Jun 23, 2015, 8:13:13 AM
to rubyonra...@googlegroups.com, pengc...@gmail.com
On Tuesday, June 23, 2015 at 3:02:36 PM UTC+3, Dillon Peng wrote:

> Also, I can directly query this record using the Chinese string under psql:
> vish_development=# select * from activity_objects where title = '第一个纵纹';
> So my question is that what should I do for enabling CJK string in ActiveRecord's where or like this?

I'm not aware of any Rails setting that controls this. However, a few things could come into play:

- What is the encoding of that string? Current versions of Ruby default to UTF-8, but I don't think 1.9 did.
- What are Postgres' collation and encoding settings? These affect how strings are compared, which matters a lot with Unicode because of the different normalisation forms and so on.
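
A quick way to check the first point from Ruby is sketched below. It is only an illustration: the names `title` and `report` are made up for this example, and it assumes the source is being read as UTF-8 (hence the magic comment, which Ruby 1.9 needs for non-ASCII literals in a file).

```ruby
# encoding: utf-8
# Sketch: inspect how Ruby labels a string and whether its bytes are valid.
# `title` and `report` are hypothetical names used only for illustration.
title = "第一个纵纹"

report = {
  encoding: title.encoding.name,             # encoding label attached to the literal
  valid:    title.valid_encoding?,           # do the bytes form valid characters?
  chars:    title.length,                    # character count under that encoding
  bytes:    title.bytesize,                  # raw byte count
  external: Encoding.default_external.name   # normally derived from the locale
}

report.each { |key, value| puts "#{key}: #{value}" }
```

If the encoding is reported as something other than UTF-8, or `valid` is false, the string reaching the database is probably not what was typed.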

Fred

Dillon Peng

Jun 23, 2015, 10:40:50 AM
to rubyonra...@googlegroups.com, pengc...@gmail.com
Hi Fred,
   Thank you very much!
   It works fine now.
   The cause was that Ruby 1.9 was not using UTF-8 as the default encoding:
   
   1.9.3-p551 :001 > __ENCODING__.name
   => "US-ASCII" 
 
  To switch to UTF-8, I set the locale before starting the console:
    
   bash-3.2$ export LANG=en_US.UTF-8
   bash-3.2$ rails c
     Loading development environment (Rails 3.2.22)
    1.9.3-p551 :001 > __ENCODING__.name
    => "UTF-8" 

  After that, I get the right record when I run the previous query: 

 1.9.3-p551 :002 > ActivityObject.where(:title => '第一个纵纹')
 
 ActivityObject Load (29.5ms)  SELECT "activity_objects".* FROM "activity_objects" WHERE "activity_objects"."title" = '第一个纵纹'
 => [#<ActivityObject id: 648, created_at: "2015-06-21 14:41:19", updated_at: "2015-06-21 14:41:19", object_type: "Excursion", like_count: 0, title: "第一个纵纹", description: "整的行不?", follower_count: 1, visit_count: 5, language: "independent", age_min: 0, age_max: 0, notified_after_draft: false, comment_count: 0, popularity: 0, download_count: 0, qscore: 500000, reviewers_qscore: nil, users_qscore: nil, ranking: 0, title_length: 1, desc_length: 1, tags_length: 1, scope: 0, avatar_file_name: nil, avatar_content_type: nil, avatar_file_size: nil, avatar_updated_at: nil, teachers_qscore: nil>]
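
For anyone who finds this later, here is a small sketch of why the encoding label matters. It only shows that the same bytes are valid under one label and invalid under another; relabeling with `force_encoding` is an assumption used to imitate a non-UTF-8 console, not a confirmed account of what happened in my session, and the variable names are made up.

```ruby
# encoding: utf-8
# Sketch: the same bytes under two different encoding labels.
utf8_title = "第一个纵纹"

# Relabel a copy of the bytes as US-ASCII (assumption: this imitates what a
# console started under a non-UTF-8 locale might end up with).
ascii_title = utf8_title.dup.force_encoding(Encoding::US_ASCII)

puts utf8_title.valid_encoding?    # the bytes are valid UTF-8
puts ascii_title.valid_encoding?   # bytes above 0x7F are not valid US-ASCII

# Relabeling back restores the original string; no bytes were changed.
restored = ascii_title.dup.force_encoding(Encoding::UTF_8)
puts restored == utf8_title
```

Besides exporting LANG, Ruby also lets you set `Encoding.default_external = Encoding::UTF_8` in code, though I have not tested that with this application.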