killOp has no effect

Dustin Norlander

Aug 7, 2010, 12:16:34 PM
to mongod...@googlegroups.com
I've been trying to kill a mysteriously long-running op for an hour now.

From db.currentOp():

...
{
    "opid" : 1290985011,
    "active" : true,
    "lockType" : "write",
    "waitingForLock" : false,
    "secs_running" : 33554,
    "op" : "update",
    "ns" : "foursquare.daily",
    "query" : {
        "ts" : "Sun Aug 08 2010 03:59:59 GMT+0000 (UTC)",
        "dataset_id" : ObjectId("4b847454754cc34074347c1c")
    },
    "client" : "67.202.28.9:36833",
    "desc" : "conn"
},
...

This is a normal query that gets run thousands of times a day, but
this one seems to have locked the database.


Running killOp has no effect (I've been trying for an hour). I also
killed the client that generated the query.

> db.killOp(1290985011)
{ "info" : "attempting to kill op" }
> db.killOp('1290985011')
{ "err" : "no op number field specified?" }
> db.killOp(1290985011)
{ "info" : "attempting to kill op" }
> db.killOp(1290985011)
{ "info" : "attempting to kill op" }
> db.killOp(1290985011)
{ "info" : "attempting to kill op" }
> db.killOp(1290985011)
{ "info" : "attempting to kill op" }
> db.killOp(1290985011)
{ "info" : "attempting to kill op" }
> db.killOp(1290985011)
{ "info" : "attempting to kill op" }
>


I'd prefer to not have to hard kill the database. I'm running 1.4.4.
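
The currentOp output above is enough to script the lookup instead of copying opids by hand. What follows is a minimal sketch, assuming the `{ inprog: [...] }` document shape that `db.currentOp()` returns; the helper name `longRunningOps` and the 3600-second threshold are hypothetical and not part of MongoDB. The opid is passed on as a number, since the quoted string form is rejected in the session above.

```javascript
// Hypothetical helper: pick out active ops that have run for at least minSecs.
// Expects the document returned by db.currentOp(), i.e. { inprog: [ ... ] }.
function longRunningOps(currentOp, minSecs) {
  var ops = currentOp.inprog || [];
  var result = [];
  for (var i = 0; i < ops.length; i++) {
    var op = ops[i];
    if (op.active && op.secs_running >= minSecs) {
      // Keep opid numeric: db.killOp('123') with a string is rejected.
      result.push({ opid: Number(op.opid), secs: op.secs_running, ns: op.ns, op: op.op });
    }
  }
  return result;
}

// Sample input modelled on the output in this thread (the second entry is made up).
var sample = { inprog: [
  { opid: 1290985011, active: true, secs_running: 33554, op: "update", ns: "foursquare.daily" },
  { opid: 42, active: true, secs_running: 2, op: "query", ns: "foursquare.daily" }
] };
var stuck = longRunningOps(sample, 3600);
```

In the mongo shell the same helper could be driven with `longRunningOps(db.currentOp(), 3600).forEach(function (o) { printjson(db.killOp(o.opid)); });`. As the session above shows, the reply only says that a kill is being attempted, not that it succeeded.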

Eliot Horowitz

Aug 7, 2010, 1:03:26 PM
to mongod...@googlegroups.com
Is this a single or multi update?
Is the box idle or is something running?


Dustin Norlander

Aug 7, 2010, 1:26:04 PM
to mongod...@googlegroups.com
It should be a single update. The mongo process is at 100% CPU and
seemingly has been for hours.

Eliot Horowitz

Aug 7, 2010, 4:02:02 PM
to mongod...@googlegroups.com
Can you send a snapshot of the web console?

Dustin Norlander

Aug 7, 2010, 10:45:12 PM
to mongod...@googlegroups.com
I ended up doing a kill -9 on it, then upgrading to 1.6. Will keep
an eye out for issues.

I've always had the webconsole disabled, is there info on there that
is not available via the console?

Eliot Horowitz

Aug 7, 2010, 11:00:25 PM
to mongod...@googlegroups.com
The webconsole has a lot of aggregated info from a lot of commands, so
it's a great quick diagnostic.

Dustin Norlander

Aug 10, 2010, 11:29:30 AM
to mongod...@googlegroups.com
Exact same problem again today. Can't get to the webconsole, thought
I had enabled it, but maybe not.

What else can I try? killOp still has no effect.

Eliot Horowitz

Aug 10, 2010, 11:34:35 AM
to mongod...@googlegroups.com
Can you send db.currentOp() again?
How many results should match that query?
Did this box ever crash? It's possible an index is corrupted.
Can you try doing a --repair?
Also - upgrading to 1.6.0 would be interesting.

Dustin Norlander

Aug 10, 2010, 1:00:20 PM
to mongod...@googlegroups.com
> How many results should match that query?
It will match 1 record, and indexes are set up correctly so nscanned is always 1.


I guess it is possible an index is corrupt. I can't really run
repairDatabase on the whole server as it is approaching 0.5 TB and I
can't take the downtime.

from the console if I run:

db.repairDatabase();

will it block the whole server, or just lock that db? (also will the
table still be readable?)

> Also - upgrading to 1.6.0 would be interesting.

I did upgrade to 1.6 the last time this happened.

Thanks so much for the help,

Eliot Horowitz

Aug 10, 2010, 6:35:01 PM
to mongod...@googlegroups.com
It will block the server.
A faster option would be to just drop and recreate the index used
by that query.
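
Dropping and rebuilding a single index can be sketched as follows. This assumes the 1.x-era shell methods `dropIndex` and `ensureIndex` on a collection object (newer shells use `createIndex` for the second step). The helper name `rebuildIndex` and the key pattern `{ dataset_id: 1, ts: 1 }` are hypothetical, since the thread never shows the actual index definition. The rebuild may itself hold a lock for a while on a large collection.

```javascript
// Hypothetical helper: drop one index and build it again from scratch.
// coll is a shell collection object such as db.daily; keys is the index key pattern.
function rebuildIndex(coll, keys) {
  var dropped = coll.dropIndex(keys);  // remove the possibly damaged index
  var built = coll.ensureIndex(keys);  // rebuild it by scanning the collection
  // Return values differ between shell versions, so they are passed through as-is.
  return { dropped: dropped, built: built };
}

// Stand-in collection so the sketch runs outside the shell; it only records calls.
var calls = [];
var fakeColl = {
  dropIndex: function (k) { calls.push("dropIndex"); return { ok: 1 }; },
  ensureIndex: function (k) { calls.push("ensureIndex"); return { ok: 1 }; }
};
var outcome = rebuildIndex(fakeColl, { dataset_id: 1, ts: 1 });
```

In the shell this would be called as `rebuildIndex(db.daily, keys)`, with the real key pattern taken from `db.daily.getIndexes()` rather than guessed.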

Dustin Norlander

Aug 22, 2010, 11:55:14 AM
to mongod...@googlegroups.com
GAH. Mongo keeps locking up on similar queries. About once every
1-3 days it locks completely, and CPU stays at 100% indefinitely.
I have switched to a fresh slave, so I don't think there is any data
corruption (unless corruption can propagate to slaves). The web console
page gives:

error loading page: timeout getting readlock

The query is:

{
    "opid" : 16703858,
    "active" : true,
    "lockType" : "write",
    "waitingForLock" : false,
    "secs_running" : 15934,
    "op" : "update",
    "ns" : "trendrr_data.weekly",
    "query" : {
        "ts" : "Sat Aug 28 2010 03:59:59 GMT+0000 (UTC)",
        "dataset_id" : ObjectId("4c35f778a04bf761eca2e6aa")
    },
    "client" : "xxxx",
    "desc" : "conn"
},

Eliot Horowitz

Aug 22, 2010, 12:11:43 PM
to mongod...@googlegroups.com
This is on 1.6? Can you send the full update you're doing? Can you turn on -vv and run? Will give a lot more debugging.

Adam Greene

Aug 30, 2010, 2:12:18 AM
to mongodb-user
Hi Eliot,

I ran into the same issue. I think it locked up because in an m/r job
I'm pulling from db.collectionA, and in the finalize method upserting
into db.collectionB. I seem to recall that after 1.5.x that was a bad
thing (though I don't know why).

But going back to the thread, shouldn't killOp just kill the op
regardless of the locking state?

thanks,
adam

Eliot Horowitz

Aug 30, 2010, 8:43:06 AM
to mongod...@googlegroups.com
What version are you running?
killOp can only work if the code path the op is running has a hook that
checks for the kill request. So if there is a bug, it may not work.

Can you start a new thread if this is on 1.6.1?
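
The point about hooks can be illustrated with a toy model of cooperative cancellation. This is only a sketch of the general idea and makes no claim about how the server is actually written; `makeOp`, `requestKill`, `runOp`, and the step counts are all hypothetical. A kill request merely sets a flag, and the running op stops only if its loop contains a checkpoint that looks at that flag.

```javascript
// Toy model only; not MongoDB's server code.
function makeOp(opid) {
  return { opid: opid, killPending: false };
}

// Mirrors the shape of the shell reply: the request is recorded, nothing more.
function requestKill(op) {
  op.killPending = true;
  return { info: "attempting to kill op" };
}

// Runs totalSteps units of work. onStep(i) stands in for one unit of work
// and may trigger a kill request partway through.
function runOp(op, totalSteps, hasCheckpoint, onStep) {
  var done = 0;
  for (var i = 0; i < totalSteps; i++) {
    // The "hook": without this check the flag is never looked at.
    if (hasCheckpoint && op.killPending) {
      return { interrupted: true, stepsDone: done };
    }
    onStep(i);
    done++;
  }
  return { interrupted: false, stepsDone: done };
}

// Same kill request, issued during step 2, against both kinds of loop.
var opA = makeOp(1);
var withHook = runOp(opA, 10, true, function (i) { if (i === 2) requestKill(opA); });

var opB = makeOp(2);
var withoutHook = runOp(opB, 10, false, function (i) { if (i === 2) requestKill(opB); });
```

In this model `withHook` stops early, while `withoutHook` runs to completion even though the kill was requested. That matches the symptom reported in this thread, where every killOp call returns "attempting to kill op" and the op keeps running.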

