Having trouble replicating with npm repo (couchdb) - anyone tried it lately and/or seen this error?


andy

May 13, 2013, 11:16:51 PM
to nod...@googlegroups.com
Based on the awesome feedback I got from https://groups.google.com/d/msg/nodejs/sX4mbsRPwls/WtDDE-To2o4J, we tried replicating the npm repo so we could use it in an offline environment.

We're essentially following the instructions at http://clock.co.uk/tech-blogs/how-to-create-a-private-npmjs-repository but replication fails after syncing about 17k documents.

We've tried reinstalling couch (found one issue that suggested using a patched version of SpiderMonkey) but the same thing keeps happening, even after restarting replication several times.

Here's our setup:

CentOS 6.4
CouchDB 1.3
SpiderMonkey 1.8.5-7 

Replication works fine for over 17,000 documents, then we see this error and can't get past it:

[Sat, 11 May 2013 00:55:39 GMT] [error] [<0.12970.4>] Replicator: couldn't write document `bufferhelper`, revision `19-d339684ee7f5eaf4cc18d84da753832d`, to target database `registry`. Error: `unauthorized`, reason: `Please log in before writing to the db`.

Any ideas?

Thanks,

Andy

andy e

May 17, 2013, 12:22:21 AM
to nod...@googlegroups.com
Kevin,

Unfortunately, no. We tried a few of the tips mentioned here on the CouchDB list (http://mail-archives.apache.org/mod_mbox/couchdb-user/201305.mbox/%3cCAL+Y1nuP=wBwXn8eM7MBzZg2v3nKChTEVmo=BNtwHF5...@mail.gmail.com%3e) - for example, we didn't have an admin user set up, so we tried that and it looked like it would work...but we restarted replication with a new DB (we want this to be a repeatable process) and it failed after only 500 or so documents. We were still trying Couch 1.3 so we're gonna drop down to 1.2.1 and see how that goes.

So, no real idea what is wrong. If anyone has more tips on replicating with the public npm repo, or maybe wants to zip up their .couch file and put it on bittorrent, I'm all ears, haha. 

I'm really hoping StrongLoop's additions to npm work out well (see http://blog.strongloop.com/whats-new-in-strong-loop-node-beta-3-private-repositories/) and someone will create an 'enterprise' repo a la Nexus/Artifactory.

andy

On Thu, May 16, 2013 at 6:50 PM, Kevin Sawicki <kevins...@gmail.com> wrote:
Hi Andy,

I'm also seeing the exact same issue trying to replicate isaacs.iriscouch.com to another iriscouch.com database: it gets stuck at 17,286 documents (16 GB) and those errors start to appear in the log.

Have you found any more details about this issue?

Sincerely,
Kevin


Gregg Caines

May 17, 2013, 5:05:25 AM
to nod...@googlegroups.com
I ran into the same issue, and this is going to sound crazy, but I ended up writing my own replication (which is amazingly easy for anyone who's comfy with making http requests programmatically; couch spits out a page-able list of all changes in json format at http://isaacs.iriscouch.com/registry/_changes).  It might also be worth trying https://npmjs.org/package/replicate .
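
A minimal sketch of that approach, for anyone curious what paging through the feed looks like. It assumes the standard `since` and `limit` query parameters of CouchDB's `_changes` endpoint and the `results` / `last_seq` fields of its response. `getPage` is a hypothetical helper (not part of CouchDB or npm) that does the HTTP GET and returns the parsed JSON; fetching each changed document and its attachments is left out.

```javascript
// Build the URL for one page of the _changes feed.
function changesUrl(dbUrl, since, limit) {
  const base = dbUrl.replace(/\/+$/, ""); // tolerate a trailing slash
  return `${base}/_changes?since=${encodeURIComponent(since)}&limit=${limit}`;
}

// Walk the feed page by page and collect the ids of changed documents.
// `getPage(url)` is a hypothetical async helper that performs the GET
// and resolves to the parsed JSON body.
async function collectChangedIds(dbUrl, getPage, limit = 1000) {
  const ids = [];
  let since = 0;
  for (;;) {
    const page = await getPage(changesUrl(dbUrl, since, limit));
    if (page.results.length === 0) break; // nothing newer than `since`
    for (const row of page.results) ids.push(row.id);
    since = page.last_seq; // resume point for the next page
  }
  return ids;
}
```

A real copier would fetch each listed document (and its tarball attachments) and write it to the target, and would store `since` somewhere durable so an interrupted run can resume.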

I can't remember what the root cause of your problem is, but I think it's something to do with different couch versions and large attachments or something.

G

DTrejo

May 17, 2013, 1:02:35 PM
to nod...@googlegroups.com
I would try mikeal's replicate: https://github.com/mikeal/replicate

Alex Kocharin

May 18, 2013, 12:45:34 PM
to nod...@googlegroups.com

Why do you want replication at all? I was thinking about it for myself recently, but I found out that there're lots of libraries you won't ever use.

So isn't it better to write something like a proxying repository server that would host your private projects, but proxy all other requests to the npm central repository (with caching of course to avoid heavy load)?

Martin Cooper

May 18, 2013, 12:53:15 PM
to nod...@googlegroups.com
On Sat, May 18, 2013 at 9:45 AM, Alex Kocharin <al...@equenext.com> wrote:

> Why do you want replication at all? I was thinking about it for myself recently, but I found out that there're lots of libraries you won't ever use.

One reason for replicating is to protect yourself against an outage of the npmjs.org registry. It doesn't happen often, but it does happen.

> So isn't it better to write something like a proxying repository server that would host your private projects, but proxy all other requests to the npm central repository (with caching of course to avoid heavy load)?

Something like shadow-npm, for example:

https://github.com/dominictarr/shadow-npm

(Caveat - I haven't actually used this, I just know it's out there.)

--
Martin Cooper
 


Alex Kocharin

May 18, 2013, 1:20:49 PM
to nod...@googlegroups.com

A cache will protect against an outage. I mean, if you were using some module before and npmjs.org is gone, such a repository would just answer with an old version of the package, as if new versions had not been published. If you are beginning to use some new package, it would fail all right, but it's better than installing couchdb and replicating...
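
The fallback behaviour described here can be sketched in a few lines. This is only an illustration under stated assumptions: `fetchUpstream` is a hypothetical async function that asks the central registry for a package's metadata, and `cache` is any Map-like store (in memory here, though the filesystem would do just as well).

```javascript
// Serve package metadata, falling back to the last cached copy when the
// upstream registry is unreachable.
// `fetchUpstream(name)` is a hypothetical async function; it is expected
// to reject when the registry cannot be reached.
async function getPackage(name, cache, fetchUpstream) {
  try {
    const fresh = await fetchUpstream(name);
    cache.set(name, fresh); // remember the newest copy we have seen
    return fresh;
  } catch (err) {
    if (cache.has(name)) return cache.get(name); // stale, but still usable
    throw err; // never seen this package: nothing to fall back on
  }
}
```

This version always asks upstream first, so it guards against outages but does not reduce load; a load-reducing proxy would check the cache first and revalidate only occasionally.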

By the way, is it possible to track changes to a list of packages in the npmjs.org repository? I mean something like partial replication: if I want to have an update of express.js ASAP, but there is an update to some-unknown-lib.js, I wouldn't even want to spend the traffic to know about it.

I'm checking out shadow-npm, but it's 400 lines of js code and it wasn't modified in a year. Doesn't look so promising -_-    And it still uses CouchDB... for some reason I think this task could be solved without any database at all (as long as we're trying to keep things simple and don't talk about heavy loading, filesystem could serve as a database).
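
On the partial-replication question: CouchDB replication requests accept a `doc_ids` list that limits the job to the named documents (the same field appears in the `_replicate` commands later in this thread). A rough sketch of building such a request body follows; the package names and the target database name are only examples.

```javascript
// Build a one-shot replication request limited to a list of packages.
// Field names follow CouchDB's replication request body.
function partialReplicationBody(source, target, packageNames) {
  return {
    source: source,
    target: target,
    create_target: true,
    continuous: false,
    doc_ids: packageNames.slice(), // copy so the caller's array is not shared
  };
}

const request = partialReplicationBody(
  "http://isaacs.iriscouch.com/registry/",
  "my-registry", // hypothetical target database name
  ["express", "socket.io"]
);
console.log(JSON.stringify(request));
```

The resulting JSON would be POSTed to `/_replicate`. Because the job is one-shot, it would have to be re-run to pick up new versions; how well `doc_ids` combines with continuous replication on the CouchDB versions discussed here has not been verified.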

Andy Ennamorato

May 18, 2013, 6:39:40 PM
to nod...@googlegroups.com
Alex,
 Response inline. :)

On May 18, 2013, at 10:45 AM, Alex Kocharin <al...@equenext.com> wrote:

> Why do you want replication at all? I was thinking about it for myself recently, but I found out that there're lots of libraries you won't ever use.
>
> So isn't it better to write something like a proxying repository server that would host your private projects, but proxy all other requests to the npm central repository (with caching of course to avoid heavy load)?

We don't have any other choice, sadly - we have no network connectivity to npm central from where we have to develop.

So our thought was to replicate against npm while connected, then take a copy of couch/npm (even with a bunch of stuff we won't use) offline.

I would prefer to use some sort of proxy/cache, where we only pull mods we need (a la Artifactory or Nexus in Java land), but I hadn't really seen anything like that. I'll have to check out shadow-npm.

Andy



Jonathan Kunkee

May 19, 2013, 12:45:35 AM
to nod...@googlegroups.com
Andy,

I would follow up with Gregg Caines' idea of playing with different versions. I might, perhaps, suggest 1.2.1. I think 1.0.1 works too, but it's been a while. (If I get the chance to test it again myself this coming week I'll let you know.)

Cheers,
Jon

Alex Kocharin

May 22, 2013, 11:04:16 AM
to nod...@googlegroups.com
Andy,


> I would prefer to use some sort of proxy/cache, where we only pull mods we need (a la Artifactory or Nexus in Java land), but I hadn't really seen anything like that.

Well... what do you think about creating one? :)

I started a project on github - https://github.com/rlidwka/npmrepod , and described how I would like to see this thingy. There's no code there, but it's expected to change soon enough. The idea was around for 6 months or so, and I more or less know how it would work (except for authentication and access rights...).

Any thoughts on this?


--
// alex

andy e

May 23, 2013, 11:13:45 PM
to nod...@googlegroups.com
Alex,

I think that's an awesome idea. But...a) I stink at node b) probably don't have time to devote to help beyond complaints/a list of what I'd like to see c) I'm full of excuses.

There is Mike Brevoort's node-reggie - https://github.com/mbrevoort/node-reggie (I keep name dropping that hoping he'll appear out of thin air with a bunch of commits that finish it up :) ), maybe that can be of some use.

I'll definitely keep my eye on your repo and maybe I can help out at some point.

andy

andy e

May 24, 2013, 10:56:03 AM
to nod...@googlegroups.com
Joshwa,

Cool, thanks for the tip. Were you running Couch 1.2 or 1.3?

We still need to try a few of these suggestions (guy doing most of this has been out for a week) but will definitely give this a shot.

Thanks!

Andy

On Thu, May 23, 2013 at 11:53 PM, Joshwa Fugett <jfug...@gmail.com> wrote:
Give this one a shot,

curl -svX PUT http://admin:pas...@127.0.0.1:5984/_replicator/mirror_npm -d '{"source":"http://isaacs.iriscouch.com/registry/", "target":"registry", "user_ctx": {"name": "admin"}, "continuous": true}' -H "Content-Type: application/json"

Change the admin name in the user_ctx to match your own admin user as well. Also make sure that user is added to the registry database as an admin so that it has full permissions. This worked for me for everything except _design/app, which I replicated separately through Futon after this one finished.

Hope it helps,
Joshwa

Joshwa Fugett

May 25, 2013, 1:19:06 AM
to nod...@googlegroups.com
It was 1.2 but should work on 1.3 as well since the only major change to replication in 1.3 from the latest 1.2 was the UUID handling to make resumes more effective.

Andy Ennamorato

Jun 11, 2013, 11:06:30 AM
to nod...@googlegroups.com
Thanks for that update.

I forgot to follow up. Unfortunately, I wasn't the one running the replication (a coworker was trying it at a separate location and he's too nervous/scared/old to post to the mailing list himself - haha), so I don't know exactly what worked in terms of tweaking the settings. I'll try to find out, though.

We tried it using 1.2 on Linux/OS X after trying 1.3 and it failed again. We then ran using Couch on Windows and it worked fine. So maybe we were building couch wrong. (I thought there were binaries for couch on *nix?)

So now we've got 48GB of modules we need to package up and I'm sure there will be more questions along the way.

Big thanks for all the suggestions and help; it's much appreciated (and if you ever come to Denver or the DenverJS meetup, I'll buy you a beer or three).

Andy




On Jun 11, 2013, at 2:18 AM, Erdem Agaoglu <erdem....@gmail.com> wrote:

A somewhat old post, but for anyone stuck on this: it seems that the npmjs couchapp (the thing that serves /registry/_design/app/...) has a validate_doc_update function that controls who may perform 'npm login' and 'npm publish' operations. But it also affects replication: the function simply does not let the replication user write to the database. In CouchDB 1.3, the _replicator database also accepts a user_ctx parameter that the replication process will use. So instead of using the old-style _replicate command, CouchDB 1.3 users should do something like:

curl -X POST http://127.0.0.1:5984/_replicator -d '{"_id": "npmjs_repl", "source":"http://isaacs.iriscouch.com/registry/", "target":"registry", "continuous":true, "user_ctx": {"name":"replicator", "roles":["_admin"]}}' -H "Content-Type: application/json"

This will run the replication as an admin user (roles: _admin), and the validate_doc_update function will allow the writes. My replication seems to be continuing now.
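
For anyone scripting this rather than typing the curl command, here is a small sketch that builds the same `_replicator` document. The field names are taken from the curl command above; the function name and its parameters are made up for illustration.

```javascript
// Build the document to POST to /_replicator.
// Field names mirror the curl command above; `replicationDoc` itself is
// a hypothetical helper, not part of CouchDB or npm.
function replicationDoc(id, source, target, userName) {
  return {
    _id: id,
    source: source,
    target: target,
    continuous: true,
    // Run the replication as an admin so validate_doc_update allows writes.
    user_ctx: { name: userName, roles: ["_admin"] },
  };
}

const body = JSON.stringify(
  replicationDoc(
    "npmjs_repl",
    "http://isaacs.iriscouch.com/registry/",
    "registry",
    "replicator"
  )
);
console.log(body);
```

The printed JSON is what would be sent as the request body with `Content-Type: application/json`.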

--

Dmytro Semenov

Feb 6, 2014, 7:42:16 PM
to nod...@googlegroups.com
Hi Andy,

I am not sure if you need this, but maybe some other folks would find this helpful.

I was able to reproduce this reliably on unrestricted couchdb (no login required)

1. Replicate one module to a new db to make sure replication works:
curl http://localhost:5984/_replicate -X POST -d '{"source":"http://isaacs.iriscouch.com/registry/","target":"pub-registry2","create_target":true,"continuous":false,"doc_ids":["system"]}' -H "Content-Type: application/json"
2. Replicate _design/app
curl http://npmjs-db.stratus.dev.ebay.com/_replicate -X POST -d '{"source":"http://isaacs.iriscouch.com/registry/","target":"pub-registry2","create_target":true,"continuous":false,"doc_ids":["_design/app"]}' -H "Content-Type: application/json"
3. Then try to replicate some other module:
curl http://npmjs-db.stratus.dev.ebay.com/_replicate -X POST -d '{"source":"http://isaacs.iriscouch.com/registry/","target":"pub-registry2","create_target":true,"continuous":false,"doc_ids":["socket.io"]}' -H "Content-Type: application/json"
4. Now delete _design/app, compact the DB, and try replicating a different module, since socket.io will have been marked as failed and CouchDB will probably take a while before retrying it.

The root cause is _design/app (not sure if it is specific to this one or applies to any design doc): once it gets replicated, replication starts failing with Error: `forbidden`, reason: `Please log in before writing to the db`.
And _design/app is used to handle some requests to couchdb.

So in your case it looks like replication ran until the _design doc was replicated, which then forbade further writes.

In my case I was replicating from http://isaacs.iriscouch.com/registry/

I have not yet found which part of _design/app restricts the replication and requires a login first; I'm just posting this in case someone finds a better clue faster than me.

Regards,
Dmytro



Dmytro Semenov

Feb 6, 2014, 8:42:49 PM
to nod...@googlegroups.com
Quick update: the _design document has a validate_doc_update function that validates updates.
Commenting out the following section fixed the issue:

  // can't write to the db without logging in.
  if (!user || !user.name) {
    throw { forbidden: "Please log in before writing to the db" }
  }
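
To see why this check trips replication, here is a simplified stand-in that can be run outside CouchDB. It assumes `user` is the user context CouchDB passes to validate_doc_update, which for an anonymous writer has a null `name`; the real function in _design/app does much more than this, and `tryWrite` is just a hypothetical helper for the demonstration.

```javascript
// Simplified stand-in for the check quoted above; the real
// validate_doc_update in _design/app contains many more rules.
function validateDocUpdate(newDoc, oldDoc, user) {
  // can't write to the db without logging in.
  if (!user || !user.name) {
    throw { forbidden: "Please log in before writing to the db" };
  }
}

// Hypothetical helper: report what would happen to a write made
// under the given user context.
function tryWrite(user) {
  try {
    validateDocUpdate({ _id: "socket.io" }, null, user);
    return "ok";
  } catch (e) {
    return e.forbidden;
  }
}

console.log(tryWrite({ name: null, roles: [] }));
console.log(tryWrite({ name: "replicator", roles: ["_admin"] }));
```

A context with no name is rejected, while a named context (such as one supplied through `user_ctx` on a `_replicator` document) passes this particular check, which is an alternative to editing the design document.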

Regards,
Dmytro

andy e

Feb 6, 2014, 11:21:57 PM
to nod...@googlegroups.com
Dmytro,

Thanks for following up. Glad you got it working. 

Thankfully, someone else took over responsibility for running the repo (so we didn't have to), but the more info there is on how to do this, the better. Hopefully npm, Inc. and the changes they're planning will make it easier to run your own repo and replicate from the main one as well.

andy


