Shouldn't it be up to the user how to encode the message body? For
example the user could pass in an already encoded message body, or
turn on some flag to have the message encoded if they wish. At the
very least it seems the user should be able to turn off the encoding,
especially if they're going to use boto to interact with some system
that's already set up and wasn't written using boto. It's like boto is
making the assumption that everyone else is using boto.
The decision to base64 encode the message bodies came as a result of
this thread:
http://developer.amazonwebservices.com/connect/thread.jspa?messageID=49680
So, I felt that it made sense to just base64 encode everything so
people didn't ever have to worry about what characters are legal and
which are not. I think for most people that's okay but clearly not if
you are interoperating with different languages and libraries around
the same queues and messages.
I think the best way to address this is to change message.py (which
needs some clean up anyway) to include a base Message class that does
not do any encoding of the message and then have a subclass that does
the base64 encoding which would be the default message class used for
queues. This would mean that you would have to remember to do a
set_message_class on the queue if you wanted it to return the plain
message instance.
Would that work for you? Any better ideas? Thanks for the feedback.
Mitch
I was too when I discovered this :)
I figured there was some reason for the encoding, and knowing it now I
agree the encoding should probably be the default.
I think one thing that's confusing is that I look at the Message class
interface, notice there's a set_body and set_body_64, I choose
set_body, but then Queue.write() sneakily uses get_body_64. I thought
choosing set_body rather than set_body_64 was me saying I don't want
it encoded. Actually, it might even be the case that choosing
between set_body_64 and set_body makes no difference to your app,
which seems a bit weird.
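To make the surprise concrete, here is a minimal sketch of the behaviour
described above, using only the standard library. SketchMessage and
wire_body are hypothetical stand-ins, not boto's code. The only things
taken from the thread are the method names and the claim that the body is
base64 encoded on write; that set_body_64 stores the decoded form is an
assumption.

```python
import base64


class SketchMessage:
    """Hypothetical stand-in for the Message class discussed above."""

    def __init__(self):
        self._body = ""

    def set_body(self, body):
        # Store the text exactly as given.
        self._body = body

    def set_body_64(self, body_b64):
        # Assumption: accept already-encoded text and store the decoded form.
        self._body = base64.b64decode(body_b64).decode("utf-8")

    def get_body_64(self):
        # Return the stored body as base64 text.
        return base64.b64encode(self._body.encode("utf-8")).decode("ascii")


def wire_body(message):
    # Stand-in for what the thread says Queue.write() does:
    # it always sends the base64 form, whichever setter was used.
    return message.get_body_64()


m1 = SketchMessage()
m1.set_body("hello world")

m2 = SketchMessage()
m2.set_body_64("aGVsbG8gd29ybGQ=")

print(wire_body(m1))  # aGVsbG8gd29ybGQ=
print(wire_body(m2))  # aGVsbG8gd29ybGQ=
```

Under these assumptions both setters put the same bytes on the wire, so a
consumer that is not using boto sees base64 text either way.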
What if you just removed get_body_64, had a single set_body/get_body
pair, and added an optional boolean parameter, either in set_body or in
the message constructor (and in queue.new_message), that indicates
whether the body is encoded and defaults to True? Something like
"def set_body(self, body, convertToBase64=True)". That way, when a user
looks at the documentation to figure out how to set a message body, they
immediately notice that it's going to be encoded unless they pass an
extra flag in.
Actually it might make more sense to have it as a Queue parameter,
since it'd be weird if you were mixing encodings within the same
Queue, and then you'd only have to deal with the parameter once.
My proposal is pretty half-baked as I haven't looked through the code
a lot and thought it through. But if I'm not missing anything it seems
simpler and more intuitive to me, at least from a user perspective.
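For what it's worth, the queue-level flag could look roughly like the
sketch below. FlagQueue and its encode_base64 parameter are made-up names
for illustration, and the list standing in for SQS is obviously not how
boto talks to the service; this only shows the idea of deciding the
encoding once per queue.

```python
import base64


class FlagQueue:
    """Hypothetical in-memory queue; not boto's Queue class."""

    def __init__(self, encode_base64=True):
        # One setting for the whole queue, defaulting to encoded.
        self.encode_base64 = encode_base64
        self.wire = []  # stands in for what would be sent to SQS

    def write(self, body):
        if self.encode_base64:
            body = base64.b64encode(body.encode("utf-8")).decode("ascii")
        self.wire.append(body)

    def read(self):
        body = self.wire.pop(0)
        if self.encode_base64:
            body = base64.b64decode(body).decode("utf-8")
        return body


encoded = FlagQueue()                  # default: base64 on the wire
encoded.write("hi")

plain = FlagQueue(encode_base64=False)  # body goes out untouched
plain.write("hi")
```

With the default, the stored wire form is "aGk=" and read() hands back
"hi"; with the flag off, "hi" is stored and returned as-is.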
However, I kind of like your suggestion of having it be a queue
parameter. Mixing encoded and non-encoded messages within a single
queue just seems wrong and doomed to cause problems. Having an
optional flag (defaults to true) to specify whether messages for a
particular queue are encoded or not seems reasonable, though. I'll
have a little think about it and try it on for size in my development
environment and if it doesn't make my butt look too big I'll check it
in.
Mitch
There are now three message classes, RawMessage, Message, and
MHMessage. The new one, RawMessage, is the base class and as the name
implies it does not do any encoding/decoding of the message body.
Whatever you set the body to is what will be written to SQS. Message
class still does the base64 encoding and is the default class used by
Queue objects. So, hopefully I haven't broken anything but it is now
at least possible to use Queue.set_message_class to set the default
message class to RawMessage and avoid the whole base64 thing.
In addition, I cleaned up the code quite a bit and added some
docstring goodness. So, pick your message class of choice, use the
set_message_class method to clue the queue in to what you want and try
not to mix and match message classes within a queue.
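As a rough illustration of the arrangement, here is a self-contained
sketch. The names RawMessage, Message and set_message_class come from the
description above, but everything else (the encode/decode hooks, to_wire,
from_wire, and InMemoryQueue) is assumed for the example and is not a copy
of the code in message.py.

```python
import base64


class RawMessage:
    """Stand-in base class: the body goes to the wire untouched."""

    def __init__(self, body=""):
        self._body = body

    def set_body(self, body):
        self._body = body

    def get_body(self):
        return self._body

    def encode(self, value):
        return value

    def decode(self, value):
        return value

    def to_wire(self):
        return self.encode(self._body)

    @classmethod
    def from_wire(cls, data):
        message = cls()
        message.set_body(message.decode(data))
        return message


class Message(RawMessage):
    """Stand-in subclass: base64 on the way out, decode on the way in."""

    def encode(self, value):
        return base64.b64encode(value.encode("utf-8")).decode("ascii")

    def decode(self, value):
        return base64.b64decode(value).decode("utf-8")


class InMemoryQueue:
    """Hypothetical queue; a list stands in for SQS."""

    def __init__(self):
        self.message_class = Message  # encoded by default
        self.wire = []

    def set_message_class(self, message_class):
        self.message_class = message_class

    def write(self, message):
        self.wire.append(message.to_wire())

    def read(self):
        return self.message_class.from_wire(self.wire.pop(0))


q = InMemoryQueue()
q.set_message_class(RawMessage)
q.wire.append('{"job": 1}')      # pretend another library wrote plain text
received = q.read().get_body()

q2 = InMemoryQueue()             # default class, so base64 on the wire
q2.write(Message("hello world"))
```

The point of the split is that the queue never needs to know about
encodings; it only needs to know which class to instantiate on read.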
Mitch
BTW this is now Issue 72 on the Boto issue tracker:
http://code.google.com/p/boto/issues/detail?id=72