I've finally had a chance to work on couchdb-python today, and have checked in a number of changes. In particular:
* The Database.update() has been changed as previously discussed on this list [r151]
* A number of fixes to schema.ListField
* client.Server now has a delete() method (in addition to __delitem__)
* Added a copy() method to client.Database [r155,r159]
* Added a streaming MIME multipart writer for the couchdb-dump tool [162], and fixes for the multipart parser.
I'd be grateful for any feedback and testing, in particular with respect to Database.update and the streaming couchdb-dump.
If testing/review doesn't surface any really nasty issues, there should be a release later this week (f-f-finally!)
> I've finally had a chance to work on couchdb-python today, and have checked in a number of changes. In particular:
> * The Database.update() has been changed as previously discussed on this list [r151]
The new Database.update() API looks good to me, but then I guess you'd expect me to say that ;-).
However, when the doc revs are updated (the successful ones at least) is it updating schema Documents too? I don't use schema at all but they're not dict subclasses so I don't think the new rev will be set.
> * A number of fixes to schema.ListField
> * client.Server now has a delete() method (in addition to __delitem__)
> * Added a copy() method to client.Database [r155,r159]
> * Added a streaming MIME multipart writer for the couchdb-dump tool [162], and fixes for the multipart parser.
I found a couple of problems when testing against r163. I was going to attach patches to this message but thought I might as well just commit them. Feel free to revert or change anything you don't like.
* The required self arg is missing from MultipartWriter._make_boundary. See r164.
* Missing method MultipartWriter.end() is called at the end of dump. See r165.
* End of line handling was a bit messed up (only a bit!) so I refactored it a little to ensure it's using '\r\n' everywhere. See r166.
* Documents with attachments fail. *Not* fixed yet.
Other than that, memory usage during dump and load stayed around the 10MB mark (as "measured" with Gnome's system monitor).
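To illustrate the end-of-line item in the list above, here is a minimal sketch of a part writer that ends every protocol line with '\r\n' and leaves the body bytes alone. The function names (write_part, write_close) and the argument layout are hypothetical and are not taken from the actual MultipartWriter code; the sketch also assumes the file object is opened in binary mode.

```python
import io

CRLF = b'\r\n'


def write_part(fileobj, boundary, headers, body):
    """Write one MIME part, terminating every protocol line with CRLF.

    `fileobj` must be a binary file object so that no newline
    translation happens on write. `body` is written verbatim.
    (Hypothetical helper, not the couchdb-python API.)
    """
    fileobj.write(b'--' + boundary + CRLF)
    for name, value in headers:
        fileobj.write(name + b': ' + value + CRLF)
    fileobj.write(CRLF)   # blank line separates headers from body
    fileobj.write(body)   # body bytes are not modified
    fileobj.write(CRLF)


def write_close(fileobj, boundary):
    """Write the closing boundary that ends the multipart message."""
    fileobj.write(b'--' + boundary + b'--' + CRLF)
```

Keeping the terminator in a single constant makes it harder for a stray '\n' to slip into one of the write calls.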
> I'd be grateful for any feedback and testing, in particular with respect to Database.update and the streaming couchdb-dump.
Unittests with Python 2.6 all succeed, but I see some failing doctests with Python 2.5.
> If testing/review doesn't surface any really nasty issues, there should be a release later this week (f-f-finally!)
That would be great! I'll do what I can to test and help out.
> Thanks,
Thank you!
I need to get some "real" work done now but I'll try to go through the changes later as well as test dump/load with some different databases.
Thanks for the review and the fixes. Just a quick note:
On 30.06.2009, at 13:46, Matt Goodall wrote:
> * The required self arg is missing from MultipartWriter._make_boundary. See r164.
> * Missing method MultipartWriter.end() is called at the end of dump. See r165.
I renamed end() to close() pretty late in the process, and this was a leftover. But we do still need to call close().
> * End of line handling was a bit messed up (only a bit!) so I refactored it a little to ensure it's using '\r\n' everywhere. See r166.
Hmm, we need to be careful then that the fileobj is not open in universal-newline mode ('U') for the parser. Need to look closer, later.
> * Documents with attachments fail. *Not* fixed yet.
Hmm, I'm sure I did test those :P
> On 30.06.2009, at 13:46, Matt Goodall wrote:
>> * The required self arg is missing from MultipartWriter._make_boundary. See r164.
>> * Missing method MultipartWriter.end() is called at the end of dump. See r165.
> I renamed end() to close() pretty late in the process, and this was a leftover. But we do still need to call close().
>> * End of line handling was a bit messed up (only a bit!) so I refactored it a little to ensure it's using '\r\n' everywhere. See r166.
> Hmm, we need to be careful then that the fileobj is not open in universal-newline mode ('U') for the parser. Need to look closer, later.
Okay, after some investigation I think we need to make the parsing more tolerant. Previously it expected just \n linebreaks, now it expects \r\n linebreaks, but it should really be possible to hand it either. But we still need to be careful not to mess up linebreaks in attachments. I have a change ready locally, but currently can't check anything in because svn@googlecode seems to be down. Will try later.
>> * Documents with attachments fail. *Not* fixed yet.
> Hmm, I'm sure I did test those :P
Doh, that was probably due to another leftover call to end() instead of close().
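As a rough illustration of the tolerant parsing discussed in this message, the sketch below reads header lines that end in either \n or \r\n, and then reads the body as raw bytes so that line breaks inside attachments are not altered. The helper names (read_headers, read_body) are hypothetical, and framing the body by a known byte length is an assumption made to keep the example short; it is not a description of how the couchdb-python parser works.

```python
import io


def read_headers(fileobj):
    """Read header lines up to the blank line, accepting LF or CRLF.

    `fileobj` must be a binary file object (no universal-newline
    translation). Returns a list of (name, value) byte-string pairs.
    """
    headers = []
    while True:
        line = fileobj.readline()
        if not line:
            break  # end of input before the blank separator line
        # Strip exactly one line terminator, whichever kind it is.
        if line.endswith(b'\r\n'):
            line = line[:-2]
        elif line.endswith(b'\n'):
            line = line[:-1]
        if not line:
            break  # blank line: headers are done
        name, _, value = line.partition(b':')
        headers.append((name.strip(), value.strip()))
    return headers


def read_body(fileobj, length):
    """Read `length` raw bytes; line breaks inside are left untouched."""
    return fileobj.read(length)
```

Only the header lines are normalised here. The body is never split into lines, which is what keeps attachment content byte-for-byte intact.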
On Wed, Jul 1, 2009 at 14:53, Christopher Lenz<cml...@gmx.de> wrote:
> Okay, after some investigation I think we need to make the parsing more tolerant. Previously it expected just \n linebreaks, now it expects \r\n linebreaks, but it should really be possible to hand it either. But we still need to be careful not to mess up linebreaks in attachments. I have a change ready locally, but currently can't check anything in because svn@googlecode seems to be down. Will try later.
Semi-relatedly: what about switching to h...@code.google instead?
> On 30.06.2009, at 15:05, Christopher Lenz wrote:
>> On 30.06.2009, at 13:46, Matt Goodall wrote:
>>> * The required self arg is missing from MultipartWriter._make_boundary. See r164.
>>> * Missing method MultipartWriter.end() is called at the end of dump. See r165.
>> I renamed end() to close() pretty late in the process, and this was a leftover. But we do still need to call close().
>>> * End of line handling was a bit messed up (only a bit!) so I refactored it a little to ensure it's using '\r\n' everywhere. See r166.
>> Hmm, we need to be careful then that the fileobj is not open in universal-newline mode ('U') for the parser. Need to look closer, later.
> Okay, after some investigation I think we need to make the parsing more tolerant. Previously it expected just \n linebreaks, now it expects \r\n linebreaks, but it should really be possible to hand it either. But we still need to be careful not to mess up linebreaks in attachments. I have a change ready locally, but currently can't check anything in because svn@googlecode seems to be down. Will try later.
Okay, it's checked in now. I've also added Content-MD5 headers for all the leaf parts in the dump, so that we can verify the integrity of the content on load.
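For readers unfamiliar with Content-MD5, here is a minimal sketch of how such a header can be computed on dump and checked on load. RFC 1864 defines the value as the base64 encoding of the MD5 digest of the content. The function names and the dict-based header representation are hypothetical and are not taken from the couchdb-python source.

```python
import base64
import hashlib


def content_md5(body):
    """Return the Content-MD5 header value for `body` (bytes).

    Per RFC 1864 this is the base64 encoding of the 128-bit MD5 digest.
    """
    return base64.b64encode(hashlib.md5(body).digest()).decode('ascii')


def verify_part(headers, body):
    """Check `body` against a Content-MD5 header, if one is present.

    `headers` is a dict mapping header names to string values
    (a hypothetical representation). Returns True when the header is
    absent or matches, False on a mismatch.
    """
    expected = headers.get('Content-MD5')
    if expected is None:
        return True
    return content_md5(body) == expected
```

A loader could call verify_part for each leaf part and refuse to store a document when it returns False. Whether to treat a missing header as acceptable, as done here, is a design choice.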
> Okay, it's checked in now. I've also added Content-MD5 headers for all the leaf parts in the dump, so that we can verify the integrity of the content on load.
Yep, dump/load seems to work fine now, including attachments. Thanks.
> On Wed, Jul 1, 2009 at 14:53, Christopher Lenz<cml...@gmx.de> wrote:
>> Okay, after some investigation I think we need to make the parsing more tolerant. Previously it expected just \n linebreaks, now it expects \r\n linebreaks, but it should really be possible to hand it either. But we still need to be careful not to mess up linebreaks in attachments. I have a change ready locally, but currently can't check anything in because svn@googlecode seems to be down. Will try later.
> Semi-relatedly: what about switching to h...@code.google instead?
I don't have a strong opinion either way. Might look into a switch sometime after the 0.6 release. What do others think?
On Thu, Jul 2, 2009 at 4:08 AM, Christopher Lenz <cml...@gmx.de> wrote:
> On 01.07.2009, at 15:19, Dirkjan Ochtman wrote:
> > On Wed, Jul 1, 2009 at 14:53, Christopher Lenz<cml...@gmx.de> wrote:
> >> Okay, after some investigation I think we need to make the parsing more
> >> tolerant. Previously it expected just \n linebreaks, now it expects
> >> \r\n linebreaks, but it should really be possible to hand it either.
> >> But we still need to be careful not to mess up linebreaks in attachments.
> >> I have a change ready locally, but currently can't check anything in
> >> because svn@googlecode seems to be down. Will try later.
> > Semi-relatedly: what about switching to h...@code.google instead?
> I don't have a strong opinion either way. Might look into a switch
> sometime after the 0.6 release. What do others think?
I like hg just fine. We use svn at work and hg is an easy transition with
lots of benefits.
On Thu, Jul 2, 2009 at 11:08, Christopher Lenz<cml...@gmx.de> wrote:
> I don't have a strong opinion either way. Might look into a switch sometime after the 0.6 release. What do others think?
Not much noise here, but people who responded were in favor. What do you think?
> On Thu, Jul 2, 2009 at 11:08, Christopher Lenz<cml...@gmx.de> wrote:
>> I don't have a strong opinion either way. Might look into a switch sometime after the 0.6 release. What do others think?
> Not much noise here, but people who responded were in favor. What do you think?
I'd like to make the move. Any tips on how that would go?
You might want to read up on the differences between clone- and named branches in Mercurial (use 1.3 for the best named branches support). -all has everything, from a straight hgsubversion conversion with a small author map, -trunkrel has the default branch and all release branches, but not the feature branch. -httplib is a clone feature branch off of the default branch.
Since GCode currently doesn't support multiple clones per project, you probably want to use named branches for release branches, but I'd advise the use of clone branches for feature branches. See also Python's PEP 385 for my reasoning concerning this issue for the Python repo.
Let me know if anyone wants to know more/doesn't understand something.
> You might want to read up on the differences between clone- and named branches in Mercurial (use 1.3 for the best named branches support). -all has everything, from a straight hgsubversion conversion with a small author map, -trunkrel has the default branch and all release branches, but not the feature branch. -httplib is a clone feature branch off of the default branch.
Thanks a lot. I'll investigate the migration process over the next few days.
> Since GCode currently doesn't support multiple clones per project, you probably want to use named branches for release branches, but I'd advise the use of clone branches for feature branches. See also Python's PEP 385 for my reasoning concerning this issue for the Python repo.
Just to make sure I understand... the feature branching approach you suggest we use isn't actually supported by Google Code?
> On 16.07.2009, at 16:18, Dirkjan Ochtman wrote:
>> On Thu, Jul 16, 2009 at 15:27, Dirkjan Ochtman<dirk...@ochtman.nl> wrote:
>>> I'll prepare a hg repo you can pull from and host it somewhere.
>> Okay, I've prepared three repos and put them on Bitbucket:
>> You might want to read up on the differences between clone- and named branches in Mercurial (use 1.3 for the best named branches support). -all has everything, from a straight hgsubversion conversion with a small author map, -trunkrel has the default branch and all release branches, but not the feature branch. -httplib is a clone feature branch off of the default branch.
> Thanks a lot. I'll investigate the migration process over the next few days.
So following the migration guide, I "hg convert"ed the repository, and then switched the VCS on Google Code to Mercurial. After which everything is now resulting in an internal server error :( Can't even switch back to Subversion right now.
I'm hoping this will just take a couple of hours or something, otherwise… ugh.
> So following the migration guide, I "hg convert"ed the repository, and then switched the VCS on Google Code to Mercurial. After which everything is now resulting in an internal server error :( Can't even switch back to Subversion right now.
Why didn't you use the repos I set up on Bitbucket? I used a better conversion tool to do it. :)
> On 24/07/2009 23:16, Christopher Lenz wrote:
>> So following the migration guide, I "hg convert"ed the repository, and then switched the VCS on Google Code to Mercurial. After which everything is now resulting in an internal server error :( Can't even switch back to Subversion right now.
> Why didn't you use the repos I setup on Bitbucket? I used a better conversion tool to do it. :)
The only thing I really *did* was change the dropdown on Google Code. This is obviously a pretty bad bug on their side.
Anyway, what exactly does the conversion process you used do better than hg convert? (Keeping in mind that the svn history of CouchDB-Python is pretty darn simple).
> The only thing I really *did* was change the dropdown on Google Code. This is obviously a pretty bad bug on their side.
> Anyway, what exactly does the conversion process you used do better than hg convert? (Keeping in mind that the svn history of CouchDB-Python is pretty darn simple).
It keeps the order of revisions straight and simple. I also used an author map (you can do that with convert, too, not sure if you did?).
And yeah, it's a pretty bad bug on GCode's site. Have you filed a bug with them about it? If so, where is it?
> On 25/07/2009 21:30, Christopher Lenz wrote:
>> The only thing I really *did* was change the dropdown on Google Code. This is obviously a pretty bad bug on their side.
>> Anyway, what exactly does the conversion process you used do better than hg convert? (Keeping in mind that the svn history of CouchDB-Python is pretty darn simple).
> It keeps the order of revisions straight and simple. I also used an author map (you can do that with convert, too, not sure if you did?).
No, I forgot about that. But luckily I haven't pushed anything yet :)
The order of revisions in the hg converted repos looks straight and simple to me.
Anyway, stupid question, how would I go about "using" the repos you pushed to bitbucket? Just pull and push to googlecode?
(I'm still a newbie to hg and have so far only used it for local development. I thought couchdb-python would be a good way to extend that usage and learn.)
> And yeah, it's a pretty bad bug on GCode's site. Have you filed a bug with them about it? If so, where is it?
Apparently there's at least one other project with the same problem:
> Anyway, stupid question, how would I go about "using" the repos you pushed to bitbucket? Just pull and push to googlecode?
Yes, that should do the job. You can use the hgsubversion extension (requires hg 1.3; get it from bitbucket, see also hg help extensions) to pull additional changesets since my conversion.
> (I'm still a newbie to hg and have so far only used it for local development. I thought couchdb-python would be a good way to extend that usage and learn.)
You may want to review the hgbook online. There's also a dead-tree version, called Mercurial: The Definitive Guide.