I have a 2 shard setup and I recently discovered duplicate documents
between shards. I have turned off the balancer so it is not an issue with
an in-progress balancer operation. Is there a tool that I can use to clean
up those duplicates? If not, is there a command that will determine which
shard is the owner of the document?
I'm assuming that you have a unique index (*unique:true*) and that the duplicates exist because a migration from one shard to another failed. That left both shards holding the same data without the config metadata being updated.
Unfortunately, there isn't a single command that will fix this problem.
If this is the case, you'll need a script that finds and removes orphaned documents.
Could you run the orphanage.js script, passing load() the path to the file?
Note: The script must be run from a 2.x shell, and you must connect to the primary.
If the file is in the current working directory (where you started the mongo shell), the command is:
load("orphanage.js")
After loading it, you'll see a series of commands you can run:
Balancer.stop() -- Do this first, if it's not stopped already
Orphans.find('db.collection') -- Find orphans in a given namespace
Orphans.findAll() -- Find orphans in all namespaces
Orphans.remove('db.collection') -- Remove all orphans in a namespace
Balancer.start()
Please follow the directions and make sure the list of documents to delete is correct before running remove.
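For readers wondering what "finding orphans" means in practice, here is a minimal sketch of the general idea. It is not the orphanage.js source. It assumes the usual chunk model, in which each chunk covers a range [min, max) of the shard key and is assigned to exactly one shard, so a document is an orphan when it sits on a shard other than the one that owns its chunk. The chunk list, shard names, the `uid` shard key, and the helper functions are all hypothetical and held in memory; `-Infinity` and `Infinity` stand in for MinKey and MaxKey.

```javascript
// Hypothetical stand-in for the chunk metadata kept on the config servers:
// each chunk covers [min, max) of the shard key and belongs to one shard.
const chunks = [
  { min: -Infinity, max: 100, shard: "shard0000" },
  { min: 100, max: Infinity, shard: "shard0001" },
];

// Return the name of the shard that owns a shard key value, or null if no
// chunk covers it.
function owningShard(chunks, key) {
  const chunk = chunks.find((c) => key >= c.min && key < c.max);
  return chunk ? chunk.shard : null;
}

// A document is an orphan if it is stored on a shard that does not own the
// chunk containing its shard key.
function findOrphans(chunks, shardName, docs, shardKeyField) {
  return docs.filter(
    (doc) => owningShard(chunks, doc[shardKeyField]) !== shardName
  );
}

// Hypothetical documents as read directly from each shard (not via mongos).
// The document with uid 150 exists on both shards after a failed migration.
const docsByShard = {
  shard0000: [{ _id: 1, uid: 5 }, { _id: 2, uid: 150 }],
  shard0001: [{ _id: 2, uid: 150 }, { _id: 3, uid: 300 }],
};

for (const [shardName, docs] of Object.entries(docsByShard)) {
  const orphans = findOrphans(chunks, shardName, docs, "uid");
  console.log(shardName, "orphans:", JSON.stringify(orphans));
}
```

In this toy data the copy of `uid: 150` on shard0000 is the orphan, because the chunk covering 150 is assigned to shard0001. The same lookup answers the question of which shard "owns" a given document.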
On Wednesday, September 26, 2012 6:32:39 PM UTC+1, Patrick Scott wrote:
> So how can I find out which shard "owns" the document?
How is db.collection.count() computed? I noticed that it was decreasing as
orphaned documents were deleted. It scared me enough that I stopped the
script but then I checked each shard individually for the document count
and together they equaled the result of a call to db.collection.count()
from mongos.
My guess is that count() reflects the total count of objects in the
collection on each shard which may include orphaned documents.
On Tue, Oct 2, 2012 at 11:30 AM, Gianfranco <gianfra...@10gen.com> wrote:
> A db.collection.count() issued through mongos is a global operation, so it has
> to communicate with every shard that contains that collection.
> Which version of MongoDB are you running? Is it the same on all nodes?
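To illustrate why the global count drops as orphans are deleted, here is a small model. It assumes that an unfiltered count() through mongos adds up what each shard reports for the collection without filtering out orphaned documents, which matches the behavior described in this thread. The shard names, the `uid` field, and the `clusterCount` helper are hypothetical.

```javascript
// Hypothetical per-shard contents; uid 150 exists on both shards because a
// failed migration left an orphaned copy behind on shard0000.
const shards = {
  shard0000: [{ uid: 5 }, { uid: 150 }],
  shard0001: [{ uid: 150 }, { uid: 300 }],
};

// Model of an unfiltered count: sum the number of documents each shard holds,
// orphans included.
function clusterCount(shards) {
  return Object.values(shards).reduce((total, docs) => total + docs.length, 0);
}

const before = clusterCount(shards);

// Delete the orphaned copy from the shard that does not own it.
shards.shard0000 = shards.shard0000.filter((doc) => doc.uid !== 150);

const after = clusterCount(shards);
console.log("count before:", before, "count after:", after);
```

Under this model the count starts at 4 although only 3 distinct documents exist, and it falls to 3 once the orphan is removed. That is consistent with the per-shard counts adding up to the mongos total both before and after cleanup.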
On Tuesday, October 2, 2012 4:40:40 PM UTC+1, Patrick Scott wrote:
> My shards and mongos' are running 2.0.6.
I'm doing updates but not with upserts. I just want to make sure I'm
deleting true orphaned documents. I have about 100000 out of ~83 million
which isn't a lot. If collection.count() includes orphaned items then it
makes perfect sense for the global count to decrease as I delete orphans. I
just want to verify that behavior.
On Tue, Oct 2, 2012 at 12:10 PM, Gianfranco <gianfra...@10gen.com> wrote:
> If you are doing updates with upserts, there is a fix in 2.1.0 that
> prevents this from happening again.
> https://jira.mongodb.org/browse/SERVER-4639
Sorry, I'm not sure which count() function you're referring to.
Is it the normal one in the shell, or a similar one in the script? If the latter, on which line?
If you want to be able to roll back in case a non-duplicate document is deleted, you should back up the data files or use mongoexport first, as in any similar situation, especially if it's a production system.
On Tuesday, October 2, 2012 5:18:43 PM UTC+1, Patrick Scott wrote:
> I'm doing updates but not with upserts. I just want to make sure I'm > deleting true orphaned documents. I have about 100000 out of ~83 million > which isn't a lot. If collection.count() includes orphaned items then it > makes perfect sense for the global count to decrease as I delete orphans. I > just want to verify that behavior.
> On Tue, Oct 2, 2012 at 12:10 PM, Gianfranco <gianf...@10gen.com<javascript:>
> > wrote:
>> On Tuesday, October 2, 2012 4:40:40 PM UTC+1, Patrick Scott wrote:
>>> My shards and mongos' are running 2.0.6.
>>> On Tue, Oct 2, 2012 at 11:30 AM, Gianfranco <gianf...@10gen.com> wrote:
>>>> The db.collection.count() from mongoS is a global operation, so it has >>>> communicate with the shards containing that collection.
>>>> What version of mongo are you running? all the same?
>>>> On Tuesday, October 2, 2012 2:35:48 PM UTC+1, Patrick Scott wrote:
>>>>> How is db.collection.count() computed? I noticed that it was >>>>> decreasing as orphaned documents were deleted. It scared me enough that I >>>>> stopped the script but then I checked each shard individually for the >>>>> document count and together they equaled the result of a call to >>>>> db.collection.count() from mongos.
>>>>> My guess is that count() reflects the total count of objects in the >>>>> collection on each shard which may include orphaned documents.
>>>>> On Tue, Oct 2, 2012 at 5:12 AM, Gianfranco <gianf...@10gen.com> wrote:
>>>>>> Hi Patrick,
>>>>>> Sorry for the delay.
>>>>>> Could you run this script with the path to the filename of >>>>>> orphanage.js?
>>>>>> Note: The script must be run from a 2.x shell.
>>>>>> And you must connect to primary
>>>>>> If it is in the current working directory, where you started mongo >>>>>> shell, it will be:
>>>>>> 1
>>>>>> load("orphanage.js")
>>>>>> After, you'll see a series of options you can now run:
>>>>>> Balancer.stop() -- Do this first, if it's not stopped already
>>>>>> Orphans.find('db.collection') – Find orphans in a given namespace
>>>>>> Orphans.findAll() – Find orphans in all namespaces
>>>>>> Orphans.remove('db.collection'****) – Remove all orphans in a >>>>>> namespace
>>>>>> Balancer.start()
>>>>>> Please follow the directions and make sure the output of documents to >>>>>> delete is correct before running remove.
>>>>>> On Wednesday, September 26, 2012 6:32:39 PM UTC+1, Patrick Scott >>>>>> wrote:
>>>>>>> So how can I found out which shard "owns" the document?
>>>>>>> On Wed, Sep 26, 2012 at 12:01 PM, Gianfranco <gianf...@10gen.com>wrote:
>>>>>>>> Hi,
>>>>>>>> I'm assuming that you have an index *unique:true* and the >>>>>>>> duplicates exist because of a migration failed from one shard to another.
>>>>>>>> This resulted in 2 shards having the same data and the configs >>>>>>>> didn't get updated.
>>>>>>>> There isn't a single command which will fix this problem >>>>>>>> unfortunately.
>>>>>>>> If this is the case you'll need a script which finds and removes >>>>>>>> orphaned documents.
>>>>>>>> On Wednesday, September 26, 2012 1:24:01 PM UTC+1, Patrick Scott >>>>>>>> wrote:
>>>>>>>>> I have a 2 shard setup and I recently discovered duplicate >>>>>>>>> documents between shards. I have turned off the balancer so it is not an >>>>>>>>> issue with an in-progress balancer operation. Is there a tool that I can >>>>>>>>> use to clean up those duplicates? If not, is there a command that will >>>>>>>>> determine which shard is the owner of the document?
>>>>>>>>> Thanks,
>>>>>>>>> Patrick
>>>>>>>> -- >>>>>>>> You received this message because you are subscribed to the Google
>>>>>>>> Groups "mongodb-user" group.
>>>>>>>> To post to this group, send email to mongod...@googlegroups.com
>>>>>>>> To unsubscribe from this group, send email to
>>>>>>>> mongodb-user...@**googlegroups.**c**om
>>>>>>>> See also the IRC channel -- freenode.net#mongodb
>>>>>>> -- >>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "mongodb-user" group.
>>>>>> To post to this group, send email to mongod...@googlegroups.com
>>>>>> To unsubscribe from this group, send email to
>>>>>> mongodb-user...@**googlegroups.**com
>>>>>> See also the IRC channel -- freenode.net#mongodb
>>>>> -- >>>> You received this message because you are subscribed to the Google
>>>> Groups "mongodb-user" group.
>>>> To post to this group, send email to mongod...@googlegroups.com
>>>> To unsubscribe from this group, send email to
>>>> mongodb-user...@**googlegroups.com
>>>> See also the IRC channel -- freenode.net#mongodb
>>> -- >> You received this message because you are subscribed to the Google
>> Groups "mongodb-user" group.
>> To post to this group, send email to mongod...@googlegroups.com<javascript:>
>> To unsubscribe from this group, send email to
>> mongodb-user...@googlegroups.com <javascript:>
>> See also the IRC channel -- freenode.net#mongodb
On Wed, Oct 3, 2012 at 6:07 AM, Gianfranco <gianfra...@10gen.com> wrote:
> Sorry, I'm not sure what count() function you're referring to.
> The normal one on the shell? or a similar one on the script? which line?
> If you want to make sure you can go back incase a non duplicate is
> deleted, as in similar situations, you should back up the datafiles or use
> mongoexport, specially if it's a production system.
> On Tuesday, October 2, 2012 5:18:43 PM UTC+1, Patrick Scott wrote:
>> I'm doing updates but not with upserts. I just want to make sure I'm
>> deleting true orphaned documents. I have about 100000 out of ~83 million
>> which isn't a lot. If collection.count() includes orphaned items then it
>> makes perfect sense for the global count to decrease as I delete orphans. I
>> just want to verify that behavior.
>> On Tue, Oct 2, 2012 at 12:10 PM, Gianfranco <gianf...@10gen.com> wrote:
>>> On Tuesday, October 2, 2012 4:40:40 PM UTC+1, Patrick Scott wrote:
>>>> My shards and mongos' are running 2.0.6.
>>>> On Tue, Oct 2, 2012 at 11:30 AM, Gianfranco <gianf...@10gen.com> wrote:
>>>>> The db.collection.count() from mongoS is a global operation, so it has
>>>>> communicate with the shards containing that collection.
>>>>> What version of mongo are you running? all the same?
>>>>> On Tuesday, October 2, 2012 2:35:48 PM UTC+1, Patrick Scott wrote:
>>>>>> How is db.collection.count() computed? I noticed that it was
>>>>>> decreasing as orphaned documents were deleted. It scared me enough that I
>>>>>> stopped the script but then I checked each shard individually for the
>>>>>> document count and together they equaled the result of a call to
>>>>>> db.collection.count() from mongos.
>>>>>> My guess is that count() reflects the total count of objects in the
>>>>>> collection on each shard which may include orphaned documents.
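To illustrate the count() behaviour discussed in this thread, here is a minimal sketch in plain JavaScript (not MongoDB code). The shard names, the documents, and the `owner` field are made-up illustration data. The assumption, taken from the replies in this thread, is that count() through mongos on 2.0.x adds up the per-shard collection counts, so an orphaned copy is counted on every shard that still holds it.

```javascript
// Hypothetical data: each shard's local copy of the collection.
// `owner` marks the shard that the cluster metadata says owns the document.
const shards = {
  shard0000: [
    { _id: 1, owner: "shard0000" },
    { _id: 2, owner: "shard0000" },
    { _id: 3, owner: "shard0001" }, // orphan left behind by a failed migration
  ],
  shard0001: [
    { _id: 3, owner: "shard0001" },
    { _id: 4, owner: "shard0001" },
  ],
};

// Sum of the per-shard counts, orphans included.
function globalCount(shards) {
  return Object.values(shards).reduce((sum, docs) => sum + docs.length, 0);
}

// Drop every document that sits on a shard that does not own it.
function removeOrphans(shards) {
  const cleaned = {};
  for (const [name, docs] of Object.entries(shards)) {
    cleaned[name] = docs.filter((doc) => doc.owner === name);
  }
  return cleaned;
}

const before = globalCount(shards);
const after = globalCount(removeOrphans(shards));
console.log(before, after); // prints "5 4"
```

Under that assumption the global count dropping by exactly the number of removed orphans is the expected result, not a sign that owned documents are being deleted.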
On Wed, Oct 3, 2012 at 8:35 AM, Gianfranco <gianfra...@10gen.com> wrote:
> Yes, it does. When you are connected to mongos, it counts all the documents
> across the shards for that collection.
> On Wednesday, October 3, 2012 1:25:40 PM UTC+1, Patrick Scott wrote:
>> I'm referring to the shell command db.<collection>.count(). Does it
>> include orphaned documents?
>> On Wed, Oct 3, 2012 at 6:07 AM, Gianfranco <gianf...@10gen.com> wrote:
>>> Sorry, I'm not sure which count() function you're referring to.
>>> The normal one in the shell, or a similar one in the script? Which line?
>>> If you want to be able to go back in case a non-duplicate is deleted,
>>> you should back up the data files or use mongoexport, especially if
>>> it's a production system.
>>> On Tuesday, October 2, 2012 5:18:43 PM UTC+1, Patrick Scott wrote:
>>>> I'm doing updates but not with upserts. I just want to make sure I'm
>>>> deleting true orphaned documents. I have about 100000 out of ~83 million
>>>> which isn't a lot. If collection.count() includes orphaned items then it
>>>> makes perfect sense for the global count to decrease as I delete orphans. I
>>>> just want to verify that behavior.
>>>> On Tue, Oct 2, 2012 at 12:10 PM, Gianfranco <gianf...@10gen.com> wrote:
>>>>> On Tuesday, October 2, 2012 4:40:40 PM UTC+1, Patrick Scott wrote:
>>>>>> My shards and mongos instances are running 2.0.6.
>>>>>> On Tue, Oct 2, 2012 at 11:30 AM, Gianfranco <gianf...@10gen.com> wrote:
>>>>>>> The db.collection.count() from mongos is a global operation, so it
>>>>>>> has to communicate with the shards containing that collection.
>>>>>>> What version of mongo are you running? Is it the same everywhere?
>>>>>>> On Tuesday, October 2, 2012 2:35:48 PM UTC+1, Patrick Scott wrote:
>>>>>>>> How is db.collection.count() computed? I noticed that it was
>>>>>>>> decreasing as orphaned documents were deleted. It scared me enough that I
>>>>>>>> stopped the script but then I checked each shard individually for the
>>>>>>>> document count and together they equaled the result of a call to
>>>>>>>> db.collection.count() from mongos.
>>>>>>>> My guess is that count() reflects the total count of objects in the
>>>>>>>> collection on each shard which may include orphaned documents.
>>>>>>>> On Tue, Oct 2, 2012 at 5:12 AM, Gianfranco <gianf...@10gen.com> wrote:
>>>>>>>>> Hi Patrick,
>>>>>>>>> Sorry for the delay.
>>>>>>>>> Could you load this script, passing the path to orphanage.js?
>>>>>>>>> Note: The script must be run from a 2.x shell,
>>>>>>>>> and you must connect to the primary.
>>>>>>>>> If it is in the current working directory, where you started the mongo
>>>>>>>>> shell, it will be:
>>>>>>>>> load("orphanage.js")
>>>>>>>>> Afterwards, you'll see a series of options you can now run:
>>>>>>>>> Balancer.stop() -- Do this first, if it's not stopped already
>>>>>>>>> Orphans.find('db.collection') -- Find orphans in a given namespace
>>>>>>>>> Orphans.findAll() -- Find orphans in all namespaces
>>>>>>>>> Orphans.remove('db.collection') -- Remove all orphans in a namespace
>>>>>>>>> Balancer.start()
>>>>>>>>> Please follow the directions and make sure the output of documents
>>>>>>>>> to delete is correct before running remove.
>>>>>>>>> On Wednesday, September 26, 2012 6:32:39 PM UTC+1, Patrick Scott
>>>>>>>>> wrote:
>>>>>>>>>> So how can I find out which shard "owns" the document?
> You received this message because you are subscribed to the Google
> Groups "mongodb-user" group.
> To post to this group, send email to mongodb-user@googlegroups.com
> To unsubscribe from this group, send email to
> mongodb-user+unsubscribe@googlegroups.com
> See also the IRC channel -- freenode.net#mongodb
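On the question of which shard "owns" a document: the cluster metadata records, for each chunk, a shard key range and the shard it is assigned to (the `min`, `max` and `shard` fields of the documents in config.chunks). A document belongs to the shard whose chunk range contains its shard key, with `min` inclusive and `max` exclusive, and a copy found on any other shard is an orphan. Below is a rough sketch of that check in plain JavaScript. The namespace `test.users`, the single numeric shard key `userId`, and the chunk data are all hypothetical, and this is not necessarily how orphanage.js implements it.

```javascript
// Hypothetical chunk metadata, shaped like documents in config.chunks.
// -Infinity / Infinity stand in for MinKey / MaxKey.
const chunks = [
  { ns: "test.users", min: { userId: -Infinity }, max: { userId: 1000 }, shard: "shard0000" },
  { ns: "test.users", min: { userId: 1000 }, max: { userId: Infinity }, shard: "shard0001" },
];

// Return the shard whose chunk range contains the document's shard key,
// or null if no chunk matches. Ranges are min-inclusive and max-exclusive.
function owningShard(chunks, ns, doc) {
  const chunk = chunks.find(
    (c) => c.ns === ns && doc.userId >= c.min.userId && doc.userId < c.max.userId
  );
  return chunk ? chunk.shard : null;
}

// A copy is treated as an orphan when it sits on a shard other than the owner
// (this also flags documents for which no chunk was found).
function isOrphan(chunks, ns, doc, foundOnShard) {
  return owningShard(chunks, ns, doc) !== foundOnShard;
}

console.log(owningShard(chunks, "test.users", { userId: 42 })); // prints "shard0000"
console.log(isOrphan(chunks, "test.users", { userId: 42 }, "shard0001")); // prints "true"
```

A real shard key can be compound or non-numeric, in which case the range comparison has to follow BSON ordering rather than the simple `>=` and `<` used here.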