Re: Optional Casting

25 views
Skip to first unread message

Sam Saffron

unread,
Feb 9, 2014, 9:33:27 PM2/9/14
to rub...@googlegroups.com
@Cody, 

Whatever happened with this? 

On Saturday, April 6, 2013 8:46:17 AM UTC+11, Cody Cutrer wrote:
I was wondering if you would be open to a pull request for a feature to optionally do type casts similar to how the mysql2 gem does them for basic data types (bool, int, float, timestamps)?
Before you throw "performance!" at me, I've done some benchmarking, and it's actually *significantly* *faster* (about half the time) to do these conversions in the extension rather than in ruby-land for a query of 5000 rows with 43 columns of production-data (i.e. not specially crafted to exercise expensive conversions). With casting turned off, the overhead was about 5%. IMHO, not just the performance benefits, but the convenience of this being an option on the pg gem itself (i.e. for those not wanting to do these conversions themselves, or when it's not easy - i.e. from ActiveRecord when doing custom selects or direct queries) far outweighs the slight performance loss for those still needing the raw speed.


Thanks,

Cody Cutrer

Detailed benchmarks:

0.15.0:

>> GC.start; Benchmark.ms { conn.exec("SELECT * FROM users WHERE id<4066234 LIMIT 5000").values }
=> 74.406078
>> GC.start; Benchmark.ms { conn.exec("SELECT * FROM users WHERE id<4066234 LIMIT 5000").values }
=> 68.38324899999999
>> GC.start; Benchmark.ms { conn.exec("SELECT * FROM users WHERE id<4066234 LIMIT 5000").values }
=> 65.600595
>> GC.start; Benchmark.ms { conn.exec("SELECT * FROM users WHERE id<4066234 LIMIT 5000").values }
=> 75.67106
>> GC.start; Benchmark.ms { conn.exec("SELECT * FROM users WHERE id<4066234 LIMIT 5000").values }
=> 77.56785400000001

0.15.5 doing conversions in ruby:

# I only did the types that exist in my query
def convert(ftype, value)
  return nil if value.nil?
  case ftype
  when 16
    value == 't' ? true : false
  when 20
    value.to_i
  when 23
    value.to_i
  when 1114
     ActiveRecord::ConnectionAdapters::Column.string_to_time(value)
  else
    value
  end
end

rs.values.map { |r| nr = []; r.each_with_index { |c, i| nr << convert(rs.ftype(i), c) }; nr }

def converted(rs)
  rs.values.map { |r| nr = []; r.each_with_index { |c, i| nr << convert(rs.ftype(i), c) }; nr }
end


>> GC.start; Benchmark.ms { converted(conn.exec("SELECT * FROM users WHERE id<4066234 LIMIT 5000")) }
=> 544.848845
>> GC.start; Benchmark.ms { converted(conn.exec("SELECT * FROM users WHERE id<4066234 LIMIT 5000")) }
=> 488.41922999999997
>> GC.start; Benchmark.ms { converted(conn.exec("SELECT * FROM users WHERE id<4066234 LIMIT 5000")) }
=> 507.443384
>> GC.start; Benchmark.ms { converted(conn.exec("SELECT * FROM users WHERE id<4066234 LIMIT 5000")) }
=> 487.04740400000003
>> GC.start; Benchmark.ms { converted(conn.exec("SELECT * FROM users WHERE id<4066234 LIMIT 5000")) }
=> 501.5174999999999


with cast:

>> GC.start; Benchmark.ms { conn.exec("SELECT * FROM users WHERE id<4066234 LIMIT 5000").values }
=> 304.11587000000003
>> GC.start; Benchmark.ms { conn.exec("SELECT * FROM users WHERE id<4066234 LIMIT 5000").values }
=> 282.924267
>> GC.start; Benchmark.ms { conn.exec("SELECT * FROM users WHERE id<4066234 LIMIT 5000").values }
=> 266.892654
>> GC.start; Benchmark.ms { conn.exec("SELECT * FROM users WHERE id<4066234 LIMIT 5000").values }
=> 286.018639
>> GC.start; Benchmark.ms { conn.exec("SELECT * FROM users WHERE id<4066234 LIMIT 5000").values }
=> 292.179201

with casting disabled (but the code in there, so overhead):

>> GC.start; Benchmark.ms { conn.exec("SELECT * FROM users WHERE id<4066234 LIMIT 5000").values }
=> 81.622447
>> GC.start; Benchmark.ms { conn.exec("SELECT * FROM users WHERE id<4066234 LIMIT 5000").values }
=> 79.955042
>> GC.start; Benchmark.ms { conn.exec("SELECT * FROM users WHERE id<4066234 LIMIT 5000").values }
=> 76.548084
>> GC.start; Benchmark.ms { conn.exec("SELECT * FROM users WHERE id<4066234 LIMIT 5000").values }
=> 77.644216
>> GC.start; Benchmark.ms { conn.exec("SELECT * FROM users WHERE id<4066234 LIMIT 5000").values }
=> 79.186944

Cody Cutrer

unread,
Feb 10, 2014, 12:00:17 PM2/10/14
to rub...@googlegroups.com
Sam,

There seemed to be some resistance to this, on the basis that pg lib shouldn't know how to cast, and casting should be left to a higher level. Then I found out that ActiveRecord 4 properly reads the data types of each columns and does cast correctly, so the functionality isn't needed by me anymore, but the performance improvement would still be nice (both memory and speed wise). I haven't had time to get back to it since then.

Cody Cutrer


--
You received this message because you are subscribed to the Google Groups "ruby-pg" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ruby-pg+u...@googlegroups.com.
To post to this group, send email to rub...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/ruby-pg/80728246-c51f-406b-9aea-5aaa9889a2f3%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Reply all
Reply to author
Forward
0 new messages