Storing ruby serialized objects as text or binary?
From: Clint Pachl <clintpa...@gmail.com>
Date: Fri, 9 Nov 2012 01:03:03 -0800 (PST)
Local: Fri, Nov 9 2012 4:03 am
Subject: Storing ruby serialized objects as text or binary?

I am wondering what the advantages and disadvantages are of storing Ruby
serialized objects (i.e. the output of Marshal.dump) directly in a `bytea`
column versus Base64-encoding the byte stream and storing it in a `text`
column. Also, are there any performance concerns with either method? I am
using PostgreSQL, to be specific.

One of the advantages I found in base64 encoding first is that it doesn't
screw up my terminal when Sequel logs to STDOUT. Maybe there is a way to
escape or hide this binary data in the logger? However, it seems storing
the serialized object directly, without encoding, would be more efficient.

(this may make this post irrelevant)
Finally, I was unable to reconstitute a ruby object after an insert/select.
I keep getting the error from Marshal.load, "data too short". It wasn't
until I base64 encoded first that I got it to work. So maybe storing binary
data directly doesn't work? I found a post from 2008, How do you insert
binary data using sequel + postgresql?<https://groups.google.com/forum/#!searchin/sequel-talk/ruby$20binary/...>

Here's what I did:

ds = DB[:core__checkout_snapshots]
checkout_data = Marshal.dump(data)
ds.insert(id: checkout_id, data: checkout_data)
Marshal.load(ds[id: checkout_id][:data])

This failed with the `data` column as type `bytea` and `text`.
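
For readers following along, here is a minimal sketch of the symptom using only the standard library. It assumes (the thread does not spell this out) that the bytes coming back from the database were incomplete or altered because the Marshal output was sent as an ordinary string. The `data` hash is a hypothetical snapshot, and the truncation only simulates a lossy round trip; no database is involved.

```ruby
require "base64"

# Hypothetical snapshot; any marshalable object behaves the same way.
data = { id: 42, items: ["a", "b"], total: 19.99 }
dumped = Marshal.dump(data)

# Marshal output is a binary (ASCII-8BIT) string, not text.
binary = dumped.encoding == Encoding::ASCII_8BIT

# Simulate a store that loses bytes: keep only the first half of the dump.
truncated = dumped.byteslice(0, dumped.bytesize / 2)

error_message =
  begin
    Marshal.load(truncated)
    nil
  rescue ArgumentError => e
    e.message
  end

# Base64 output is plain ASCII, so it survives a text column unchanged.
encoded = Base64.strict_encode64(dumped)
restored = Marshal.load(Base64.strict_decode64(encoded))
```

This would explain why the Base64 variant worked in a `text` column while the raw dump did not.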


From: Jeremy Evans <jeremyeva...@gmail.com>
Date: Fri, 9 Nov 2012 08:51:24 -0800 (PST)
Local: Fri, Nov 9 2012 11:51 am
Subject: Re: Storing ruby serialized objects as text or binary?

When storing data in a bytea column, you need to mark it as a blob:

   ds.insert(id: checkout_id, data: Sequel.blob(checkout_data))

That may fix your issue.

Personally, I don't think serialization of ruby objects into a database is
a good idea in most cases, but if you have to do it with Marshal, it's
probably better to store it in bytea instead of base64 encoded text.

About your logger issue, use a custom logger that escapes the output
instead of the default Logger class. That part is unrelated to Sequel.
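
A minimal sketch of what such an escaping logger could look like, using only Ruby's standard `Logger`. The helper name `escaping_logger` and the output format are assumptions for illustration, not anything Sequel provides. Non-printable bytes are rendered as `\xNN` so they cannot garble a terminal.

```ruby
require "logger"
require "stringio"

# Hypothetical helper: builds a Logger whose formatter replaces every byte
# outside printable ASCII with a \xNN escape.
def escaping_logger(io)
  logger = Logger.new(io)
  logger.formatter = proc do |severity, _time, _progname, msg|
    text = msg.to_s.b.gsub(/[^\x20-\x7E]/n) { |byte| format("\\x%02X", byte.ord) }
    "#{severity}: #{text}\n"
  end
  logger
end

# Log to a StringIO here so the result can be inspected; use $stdout in practice.
out = StringIO.new
log = escaping_logger(out)
log.info("INSERT ... VALUES ('\x04\b\x00')")
logged = out.string
```

Such a logger can be handed to Sequel like any other Logger instance.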

Jeremy


From: Clint Pachl <clintpa...@gmail.com>
Date: Sat, 10 Nov 2012 01:49:43 -0800 (PST)
Local: Sat, Nov 10 2012 4:49 am
Subject: Re: Storing ruby serialized objects as text or binary?

On Friday, November 9, 2012 9:51:24 AM UTC-7, Jeremy Evans wrote:

> When storing data in a bytea column, you need to mark it as a blob:

>    ds.insert(id: checkout_id, data: Sequel.blob(checkout_data))

> That may fix your issue.

Yep, that fixed my issue!

I knew which Postgres function I needed to escape the binary data
(i.e. escape_bytea), but I just couldn't find the Sequel method that calls
it. I looked at the code, but there were too many abstraction layers for me
to follow. I'm guessing it is in the C code of the Postgres driver.

> Personally, I don't think serialization of ruby objects into a database is
> a good idea in most cases, but if you have to do it with Marshal, it's
> probably better to store it in bytea instead of base64 encoded text.

I am glad you offered your opinion, Jeremy. You made me think about what I
am trying to do. I do indeed want to serialize my objects because they are
short-lived snapshots of a collection of objects. Doing so just makes life
easier. However, I decided that they don't belong in the database. I
decided that I can more easily manage these snapshots using files.
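
A rough sketch of the file-based approach, assuming one file per snapshot. The file name and the `snapshot` contents are hypothetical. The point is to read and write in binary mode so the bytes are never transcoded.

```ruby
require "tmpdir"

# Hypothetical snapshot of a collection of objects.
snapshot = { checkout_id: 7, items: [{ sku: "A1", qty: 2 }] }

restored = Dir.mktmpdir do |dir|
  path = File.join(dir, "checkout-7.snapshot")

  # binwrite/binread open the file in binary mode.
  File.binwrite(path, Marshal.dump(snapshot))
  Marshal.load(File.binread(path))
end
```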

I would also agree that it is better to store a Marshal.dump directly in a
bytea column. For example, upon quick comparison of my snapshots, the
binary data is about 35% smaller than the base64 encoded text.
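
For comparison, a quick way to measure the overhead on your own data. This is only a sketch with a made-up payload. Base64 emits 4 characters for every 3 input bytes, so unwrapped Base64 makes the raw bytes roughly 25% smaller than the encoded text. Line breaks added by the encoder or other storage details may account for a larger observed difference.

```ruby
require "base64"

# Made-up payload standing in for a real snapshot.
payload = Array.new(1000) { |i| "item-#{i}" }
raw = Marshal.dump(payload)

# strict_encode64 adds no line breaks, so the size is exactly 4 * ceil(n / 3).
encoded = Base64.strict_encode64(raw)
expected_size = 4 * (raw.bytesize / 3.0).ceil

# Fraction by which the raw bytes are smaller than the Base64 text.
saving = 1.0 - raw.bytesize.to_f / encoded.bytesize
```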

> About your logger issue, use a custom logger that escapes the output
> instead of the default Logger class, that's unrelated to Sequel.

Escaping the binary data using Sequel.blob fixed the garbling of my
terminal. I am in fact using the standard Logger, so no need for a custom
logger. The Postgres escape_bytea function must make the data
terminal-friendly.
