I just committed a patch to Beautiful Soup HEAD which fixes pickling
of BS objects without awful hacks that break deepcopy. The problem is
described here:
http://bugs.python.org/issue1757062
The solution is simple, but inefficient: it turns a NavigableString
object into a standard Python unicode object by converting it into a
string and then into a unicode object. This was the only way I could
find to turn an instance of a unicode subclass into a unicode instance
proper. If you have a better idea I'd like to hear it.
Leonard
How about encoding and then decoding it? Still inefficient, but no
data would be lost as long as a suitable encoding is used.
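
A minimal sketch of the encode/decode round trip suggested above. It is written in Python 3 syntax, where `str` plays the role that `unicode` plays in this thread; the `NavigableString` class here is a bare stand-in, not Beautiful Soup's real class, and `to_plain_text` is a hypothetical helper name.

```python
import pickle


class NavigableString(str):
    """Hypothetical stand-in: a bare subclass of the built-in text type."""
    pass


def to_plain_text(s, encoding="utf-8"):
    # Encoding yields bytes; decoding those bytes yields an instance of the
    # built-in text type itself, not of the subclass.
    # Assumption: the chosen encoding can represent every character in s.
    return s.encode(encoding).decode(encoding)


ns = NavigableString("caf\u00e9")
plain = to_plain_text(ns)

# The plain value pickles like any ordinary string.
restored = pickle.loads(pickle.dumps(plain))
```

UTF-8 is used here only because it can represent any ordinary text; whichever encoding is chosen, the same one has to be used in both directions.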
As for a more general solution, is there any reason why
NavigableString can't wrap a unicode instance instead of being one?
Then __unicode__() could just return the underlying unicode object.
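
A rough sketch of the wrapping idea, again in Python 3 syntax (so `__str__` stands in for `__unicode__`). `WrappedString` is a hypothetical name used for illustration and says nothing about how Beautiful Soup is actually structured.

```python
class WrappedString(object):
    """Hypothetical wrapper: holds a plain string instead of subclassing it."""

    def __init__(self, text):
        # Store the underlying value as an ordinary built-in string.
        self._text = str(text)

    def __str__(self):
        # No conversion needed: hand back the wrapped object directly.
        return self._text

    def __eq__(self, other):
        # Compare by text so wrapped and plain strings can be mixed.
        return self._text == str(other)

    def __hash__(self):
        return hash(self._text)


w = WrappedString("hello")
text = str(w)
```

The trade-off is that the wrapper is no longer a string instance, so `isinstance` checks fail and any string methods callers rely on have to be delegated explicitly.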
Sorry to ask without checking; I have no time to look at the code
right now. I'll look at it tomorrow morning.
- Tal