Mongodb : "from collection cannot be sharded"

2,376 views
Skip to first unread message

yash vora

unread,
Oct 9, 2017, 3:00:32 AM10/9/17
to mongodb-user
Hello,

Scenario :
I have one sharded collection (test1) in which i run below query and it exits with error code 4567 displaying that "from collection cannot be sharded".

db.test1.aggregate([
{ "$group" : { "_id" : "$field1", "field1" : { "$first" : "$field1" } } },  
{ "$lookup" : {  "from" : "vw1",  "localField" : "field1",  "foreignField" : "field1",  "as" : "vw1"  } 
}, { "$match" : { 
    "vw1" : { "$eq" : [] }, "DateTime" : { "$gte" : ISODate("2016-06-19T00:00:00.000Z"), "$lt" : ISODate("2017-06-19T23:59:59.997Z") }  } 
}, { "$project" : { "_id" : "$_id.field1", "field1" : "$field1" } }]).forEach( function(objRule2) {
    db.test2.insert(objRule2);   
});


Below is the error log which i get :

assert: command failed: {
"ok" : 0,
"errmsg" : "from collection cannot be sharded",
"code" : 4567,
"codeName" : "Location4567"
} : aggregate failed
_getErrorWithCode@src/mongo/shell/utils.js:25:13
doassert@src/mongo/shell/assert.js:16:14
assert.commandWorked@src/mongo/shell/assert.js:370:5
DBCollection.prototype.aggregate@src/mongo/shell/collection.js:1319:5
@(shell):1:1

2017-10-09T01:37:12.977-0400 E QUERY    [thread1] Error: command failed: {
"ok" : 0,
"errmsg" : "from collection cannot be sharded",
"code" : 4567,
"codeName" : "Location4567"
} : aggregate failed :
_getErrorWithCode@src/mongo/shell/utils.js:25:13
doassert@src/mongo/shell/assert.js:16:14
assert.commandWorked@src/mongo/shell/assert.js:370:5
DBCollection.prototype.aggregate@src/mongo/shell/collection.js:1319:5
@(shell):1:1

As there were shards, i created a view named vw1 and then queried that view as shown in query.
Need help in understanding log and whether $lookup containing from clause does work on shards or not.
Do we have anything on mongo that it behaves differently in windows and linux ? as same query runs in windows environment successfully.
Whereas for same query when i tried to run on Linux then it gives me above error log.

Thanks in advance !!!
Cheers !!!!

Rhys Campbell

unread,
Oct 9, 2017, 4:10:59 AM10/9/17
to mongodb-user
No it doesn;t work with sharded collection...


"Do we have anything on mongo that it behaves differently in windows and linux ? as same query runs in windows environment successfully."

No difference in behaviour with Linux & Windows AFAIK. Your Windows mongo instance is probably not sharded.  

Stephen Steneker

unread,
Oct 9, 2017, 4:15:00 AM10/9/17
to mongodb-user
On Monday, 9 October 2017 18:00:32 UTC+11, yash vora wrote:
Scenario :
I have one sharded collection (test1) in which i run below query and it exits with error code 4567 displaying that "from collection cannot be sharded".

Hi Yash,

This is an expected limitation of the $lookup aggregation stage:

Performs a left outer join to an unsharded collection in the same database to filter in documents from the “joined” collection for processing.

Do we have anything on mongo that it behaves differently in windows and linux ?

I expect the difference is that you sharded the from collection in Linux but did not shard this collection in your Windows environment. Assuming both environments are running the same version of MongoDB, the query processing should be consistent.

Regards,
Stennie

yash vora

unread,
Oct 9, 2017, 4:27:11 AM10/9/17
to mongodb-user
Hello Rhys, 

Thanks for prompt reply.

I already have sharded the same collection in windows and its working fine.

Also as per its definition suggests lookup works on unsharded collection. I have vw1 as an unsharded view referencing my shard collection. Still it will not work ?

Also is there any alternative for this?

yash vora

unread,
Oct 9, 2017, 4:27:29 AM10/9/17
to mongodb-user
Hello Rhys, 

Thanks for prompt reply.

I already have sharded the same collection in windows and its working fine.

Also as per its definition suggests lookup works on unsharded collection. I have vw1 as an unsharded view referencing my shard collection. Still it will not work ?

Also is there any alternative for this?

Rhys Campbell

unread,
Oct 9, 2017, 5:27:11 AM10/9/17
to mongodb-user
Well, it shouldn't be working on a sharded collection at all according to the documentation. I would guess you have not actually sharded the collection. Double check with sh.status(). Sticking a view in front of the collection isn't a way around the issue. I guess MongoDB force this because you could end up running loads of queries on your entire cluster is they allowed it.

You probably have to do this in two operations.

yash vora

unread,
Oct 9, 2017, 5:53:02 AM10/9/17
to mongodb-user
I checked for both environments.Hashed Shard is done on both of them on DateTime column.

And I ran similar query which ran successfully on windows and again ran same on linux os and it gave me same error.

Is there any issue with build version.

On windows its : 3.4.6
On linux its : 3.4.9.

Also with two operations what i understood is to bring data in different unsharded collection and then include that collection on same query. Right ???

Rhys Campbell

unread,
Oct 9, 2017, 5:57:18 AM10/9/17
to mongodb-user
Paste the output from sh.status from both environments.

yash vora

unread,
Oct 9, 2017, 6:16:52 AM10/9/17
to mongodb-user
On Linux : 

--- Sharding Status --- 
  sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("59d5f493d51272db946b0520")
}
  shards:
{  "_id" : "a",  "host" : "a/localhost.localdomain:27000,localhost.localdomain:27001,localhost.localdomain:27002",  "state" : 1 }
{  "_id" : "b",  "host" : "b/localhost.localdomain:27100,localhost.localdomain:27101,localhost.localdomain:27102",  "state" : 1 }
{  "_id" : "c",  "host" : "c/localhost.localdomain:27200,localhost.localdomain:27201,localhost.localdomain:27202",  "state" : 1 }
{  "_id" : "d",  "host" : "d/localhost.localdomain:27300,localhost.localdomain:27301,localhost.localdomain:27302",  "state" : 1 }
  active mongoses:
"3.4.9" : 1
 autosplit:
Currently enabled: yes
  balancer:
Currently enabled:  yes
Currently running:  no
Balancer lock taken at Thu Oct 05 2017 05:00:03 GMT-0400 (EDT) by ConfigServer:Balancer
Failed balancer rounds in last 5 attempts:  0
Migration Results for the last 24 hours: 
No recent migrations
  databases:
{  "_id" : "DB1",  "primary" : "d",  "partitioned" : true }
DB1.Collection1
shard key: { "DateTime" : "hashed" }
unique: false
balancing: true
chunks:
a 14
b 13
c 14
d 13
too many chunks to print, use verbose if you want to force print
{  "_id" : "FMSCDR_DWH",  "primary" : "a",  "partitioned" : false }

On windows :

--- Sharding Status --- 
  sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("599c0c547907d8be247367c9")
}
  shards:
{  "_id" : "shard_0",  "host" : "shard_0/BSS-BC4-Benchmark2:27000,BSS-BC4-Benchmark2:27001,BSS-BC4-Benchmark2:27002",  "state" : 1,  "tags" : [ "zone1" ] }
{  "_id" : "shard_1",  "host" : "shard_1/localhost:27003,localhost:27004,localhost:27005",  "state" : 1,  "tags" : [ "zone1" ] }
{  "_id" : "shard_2",  "host" : "shard_2/localhost:27006,localhost:27007,localhost:27008",  "state" : 1,  "tags" : [ "zone1" ] }
{  "_id" : "shard_3",  "host" : "shard_3/localhost:27009,localhost:27010,localhost:27011",  "state" : 1,  "tags" : [ "zone1" ] }
  active mongoses:
"3.4.7" : 1
 autosplit:
Currently enabled: yes
  balancer:
Currently enabled:  yes
Currently running:  no
Balancer lock taken at Thu Sep 07 2017 22:05:39 GMT-0700 (Pacific Standard Time) by ConfigServer:Balancer
Failed balancer rounds in last 5 attempts:  5
Last reported error:  interrupted at shutdown
Time of Reported error:  Thu Sep 07 2017 22:03:58 GMT-0700 (Pacific Standard Time)
Migration Results for the last 24 hours: 
No recent migrations
  databases:
{  "_id" : "DB1",  "primary" : "shard_0",  "partitioned" : true }
DB1.Collection1
shard key: { "DateTime" : "hashed" }
unique: false
balancing: true
chunks:
shard_0 14
shard_1 14
shard_2 14
shard_3 14
too many chunks to print, use verbose if you want to force print
DB1.Collection2
shard key: { "DateTime" : "hashed" }
unique: false
balancing: true
chunks:
shard_0 17
shard_1 17
shard_2 17
shard_3 17
too many chunks to print, use verbose if you want to force print
DB1.Collection3
shard key: { "DateTime" : 1, "Hour" : 1 }
unique: false
balancing: true
chunks:
shard_0 85
too many chunks to print, use verbose if you want to force print
tag: zone1  { "DateTime" : { "$minKey" : 1 }, "Hour" : 0 } -->> { "DateTime" : { "$maxKey" : 1 }, "Hour" : 7 }
tag: zone1  { "DateTime" : { "$minKey" : 1 }, "Hour" : 7 } -->> { "DateTime" : { "$maxKey" : 1 }, "Hour" : 13 }
tag: zone1  { "DateTime" : { "$minKey" : 1 }, "Hour" : 13 } -->> { "DateTime" : { "$maxKey" : 1 }, "Hour" : 19 }
tag: zone1  { "DateTime" : { "$minKey" : 1 }, "Hour" : 19 } -->> { "DateTime" : { "$maxKey" : 1 }, "Hour" : 24 }
DB1.Collection4
shard key: { "DateTime" : 1, "Hour" : 1, "Min" : 1 }
unique: false
balancing: true
chunks:
shard_0 83
too many chunks to print, use verbose if you want to force print
tag: zone1  { "DateTime" : { "$minKey" : 1 }, "Hour" : { "$minKey" : 1 }, "Min" : 0 } -->> { "DateTime" : { "$maxKey" : 1 }, "Hour" : { "$maxKey" : 1 }, "Min" : 16 }
tag: zone1  { "DateTime" : { "$minKey" : 1 }, "Hour" : { "$minKey" : 1 }, "Min" : 16 } -->> { "DateTime" : { "$maxKey" : 1 }, "Hour" : { "$maxKey" : 1 }, "Min" : 31 }
tag: zone1  { "DateTime" : { "$minKey" : 1 }, "Hour" : { "$minKey" : 1 }, "Min" : 31 } -->> { "DateTime" : { "$maxKey" : 1 }, "Hour" : { "$maxKey" : 1 }, "Min" : 46 }
tag: zone1  { "DateTime" : { "$minKey" : 1 }, "Hour" : { "$minKey" : 1 }, "Min" : 46 } -->> { "DateTime" : { "$maxKey" : 1 }, "Hour" : { "$maxKey" : 1 }, "Min" : 60 }
DB1.Collection5
shard key: { "IMSI" : "hashed" }
unique: false
balancing: true
chunks:
shard_0 14
shard_1 14
shard_2 14
shard_3 14
too many chunks to print, use verbose if you want to force print
DB1.Collection6
shard key: { "MSISDN" : "hashed" }
unique: false
balancing: true
chunks:
shard_0 14
shard_1 14
shard_2 14
shard_3 14
too many chunks to print, use verbose if you want to force print
DB1.Collection7
shard key: { "DateTime" : "hashed" }
unique: false
balancing: true
chunks:
shard_0 2
shard_1 2
shard_2 2
shard_3 2
{ "DateTime" : { "$minKey" : 1 } } -->> { "DateTime" : NumberLong("-6917529027641081850") } on : shard_0 Timestamp(4, 2) 
{ "DateTime" : NumberLong("-6917529027641081850") } -->> { "DateTime" : NumberLong("-4611686018427387900") } on : shard_0 Timestamp(4, 3) 
{ "DateTime" : NumberLong("-4611686018427387900") } -->> { "DateTime" : NumberLong("-2305843009213693950") } on : shard_1 Timestamp(4, 4) 
{ "DateTime" : NumberLong("-2305843009213693950") } -->> { "DateTime" : NumberLong(0) } on : shard_1 Timestamp(4, 5) 
{ "DateTime" : NumberLong(0) } -->> { "DateTime" : NumberLong("2305843009213693950") } on : shard_2 Timestamp(4, 6) 
{ "DateTime" : NumberLong("2305843009213693950") } -->> { "DateTime" : NumberLong("4611686018427387900") } on : shard_2 Timestamp(4, 7) 
{ "DateTime" : NumberLong("4611686018427387900") } -->> { "DateTime" : NumberLong("6917529027641081850") } on : shard_3 Timestamp(4, 8) 
{ "DateTime" : NumberLong("6917529027641081850") } -->> { "DateTime" : { "$maxKey" : 1 } } on : shard_3 Timestamp(4, 9) 
{  "_id" : "DB2",  "primary" : "shard_2",  "partitioned" : true }
DB2.Collection2
shard key: { "DateTime" : "hashed" }
unique: false
balancing: true
chunks:
shard_0 2
shard_1 2
shard_2 2
shard_3 2
{ "DateTime" : { "$minKey" : 1 } } -->> { "DateTime" : NumberLong("-6917529027641081850") } on : shard_0 Timestamp(4, 2) 
{ "DateTime" : NumberLong("-6917529027641081850") } -->> { "DateTime" : NumberLong("-4611686018427387900") } on : shard_0 Timestamp(4, 3) 
{ "DateTime" : NumberLong("-4611686018427387900") } -->> { "DateTime" : NumberLong("-2305843009213693950") } on : shard_1 Timestamp(4, 4) 
{ "DateTime" : NumberLong("-2305843009213693950") } -->> { "DateTime" : NumberLong(0) } on : shard_1 Timestamp(4, 5) 
{ "DateTime" : NumberLong(0) } -->> { "DateTime" : NumberLong("2305843009213693950") } on : shard_2 Timestamp(4, 6) 
{ "DateTime" : NumberLong("2305843009213693950") } -->> { "DateTime" : NumberLong("4611686018427387900") } on : shard_2 Timestamp(4, 7) 
{ "DateTime" : NumberLong("4611686018427387900") } -->> { "DateTime" : NumberLong("6917529027641081850") } on : shard_3 Timestamp(4, 8) 
{ "DateTime" : NumberLong("6917529027641081850") } -->> { "DateTime" : { "$maxKey" : 1 } } on : shard_3 Timestamp(4, 9) 
DB2.Collection2
shard key: { "DateTime" : "hashed" }
unique: false
balancing: true
chunks:
shard_0 2
shard_1 2
shard_2 2
shard_3 2
{ "DateTime" : { "$minKey" : 1 } } -->> { "DateTime" : NumberLong("-6917529027641081850") } on : shard_0 Timestamp(4, 2) 
{ "DateTime" : NumberLong("-6917529027641081850") } -->> { "DateTime" : NumberLong("-4611686018427387900") } on : shard_0 Timestamp(4, 3) 
{ "DateTime" : NumberLong("-4611686018427387900") } -->> { "DateTime" : NumberLong("-2305843009213693950") } on : shard_1 Timestamp(4, 4) 
{ "DateTime" : NumberLong("-2305843009213693950") } -->> { "DateTime" : NumberLong(0) } on : shard_1 Timestamp(4, 5) 
{ "DateTime" : NumberLong(0) } -->> { "DateTime" : NumberLong("2305843009213693950") } on : shard_2 Timestamp(4, 6) 
{ "DateTime" : NumberLong("2305843009213693950") } -->> { "DateTime" : NumberLong("4611686018427387900") } on : shard_2 Timestamp(4, 7) 
{ "DateTime" : NumberLong("4611686018427387900") } -->> { "DateTime" : NumberLong("6917529027641081850") } on : shard_3 Timestamp(4, 8) 
{ "DateTime" : NumberLong("6917529027641081850") } -->> { "DateTime" : { "$maxKey" : 1 } } on : shard_3 Timestamp(4, 9) 
DB2.Collection2
shard key: { "DateTime" : "hashed" }
unique: false
balancing: true
chunks:
shard_0 2
shard_1 2
shard_2 2
shard_3 2
{ "DateTime" : { "$minKey" : 1 } } -->> { "DateTime" : NumberLong("-6917529027641081850") } on : shard_0 Timestamp(4, 2) 
{ "DateTime" : NumberLong("-6917529027641081850") } -->> { "DateTime" : NumberLong("-4611686018427387900") } on : shard_0 Timestamp(4, 3) 
{ "DateTime" : NumberLong("-4611686018427387900") } -->> { "DateTime" : NumberLong("-2305843009213693950") } on : shard_1 Timestamp(4, 4) 
{ "DateTime" : NumberLong("-2305843009213693950") } -->> { "DateTime" : NumberLong(0) } on : shard_1 Timestamp(4, 5) 
{ "DateTime" : NumberLong(0) } -->> { "DateTime" : NumberLong("2305843009213693950") } on : shard_2 Timestamp(4, 6) 
{ "DateTime" : NumberLong("2305843009213693950") } -->> { "DateTime" : NumberLong("4611686018427387900") } on : shard_2 Timestamp(4, 7) 
{ "DateTime" : NumberLong("4611686018427387900") } -->> { "DateTime" : NumberLong("6917529027641081850") } on : shard_3 Timestamp(4, 8) 
{ "DateTime" : NumberLong("6917529027641081850") } -->> { "DateTime" : { "$maxKey" : 1 } } on : shard_3 Timestamp(4, 9) 

Rhys Campbell

unread,
Oct 9, 2017, 10:18:51 AM10/9/17
to mongodb-user
In your query the collection is test1 but I can;t see that in the sh.status output. 

Also provide the source for vw1.

Also one of your shards has tags confirured while the other hasn't. They all have the same tag too. 

yash vora

unread,
Oct 10, 2017, 12:58:01 AM10/10/17
to mongodb-user
Hi Rhys,

Good Morning !!!

test1 (unsharded collection) in my query is temp collection with which i am going to left join with my vw1(unsharded view) which contains reference to my sharded collection (collection2).

The source of vw1 is as follows :

db.createView(
    'vw1', 
    'collection2', 
    [{ $match: { CallTYPE: {'$eq': 0 } } }]
);

As view suggests its referencing sharded collection (collection2) and will bring records where calltype is found to be zero.
And further as sh.status shows collection2 is having hashed shard on DateTime column.

Rhys Campbell

unread,
Oct 10, 2017, 3:08:29 AM10/10/17
to mongodb-user
Shading is not relevant to views to a view cannot really be unsharded like a collection can. The underlying collection is sharded so it makes sense that MongoDB prevents you from using this in certain queries. MongoDB does a lookup fro every key, on the localhost this is no problem, on a sharded cluster it would have to do a network roundtrip for each one. So it makes sense they prevent you from doing this.

I don't have a sharded windows cluster to test this out. Perhaps you've found a bug if what you describe is correct. try upgrading to the latest version in that branch.

Rhys

yash vora

unread,
Oct 10, 2017, 6:19:28 AM10/10/17
to mongodb-user
Hi Stephen,

Good Morning !!!

I have one query regarding sharding working on Linux and Windows environment.

As i stated above in my issue, $lookup is not working for sharded collection in from clause in Linux OS where it works successfully in windows environment.
Below is the query which is running successfully in windows environment but giving issue in linux environment

db.test1.aggregate([
{ "$group" : { "_id" : "$field1", "field1" : { "$first" : "$field1" } } },  
{ "$lookup" : {  "from" : "vw1",  "localField" : "field1",  "foreignField" : "field1",  "as" : "vw1"  } 
}, { "$match" : { 
    "vw1" : { "$eq" : [] }, "DateTime" : { "$gte" : ISODate("2016-06-19T00:00:00.000Z"), "$lt" : ISODate("2017-06-19T23:59:59.997Z") }  } 
}, { "$project" : { "_id" : "$_id.field1", "field1" : "$field1" } }]).forEach( function(objRule2) {
    db.test2.insert(objRule2);   
});

In my query test1 (unsharded collection) is temp collection with which i am going to left join with my vw1(unsharded view) which contains reference to my sharded collection (collection2).

The source of vw1 is as follows :

db.createView(
    'vw1', 
    'collection2', 
    [{ $match: { CallTYPE: {'$eq': 0 } } }]
);

As view suggests its referencing sharded collection (collection2) and will bring records where calltype is found to be zero.
And further as sh.status shows collection2 is having hashed shard on DateTime column in windows environment as per sharding status.

Stephen, its my humble request to check for the issue, in my both environments i have initialized shard (hashed shard) on DateTime column on my collection.

My Current windows and Linux environment mongodb version are as follows :

Windows : Mongodb Version : 3.4.7
Linux : Mongodb Version : 3.4.9
Both version installed are enterprise editions.

Below is the sharding status for both of the environments :

--- Sharding Status --- 

On Linux :
{  "_id" : "DB2",  "primary" : "a",  "partitioned" : false }

On windows :

--- Sharding Status --- 
  sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("599c0c547907d8be247367c9")
}
  shards:
{  "_id" : "shard_0",  "host" : "shard_0/localhost:27000,localhost:27001,localhost:27002",  "state" : 1,  "tags" : [ "zone1" ] }
On Monday, October 9, 2017 at 1:45:00 PM UTC+5:30, Stephen Steneker wrote:

Wan Bachtiar

unread,
Nov 13, 2017, 1:53:45 AM11/13/17
to mongodb-user

Hi Yash,

The code for $lookup is the same for both Linux and Windows OS, also for both version 3.4.7 and 3.4.9:

v3.4.7: document_source_lookup.cpp#L125
v3.4.9: document_source_lookup.cpp#L125

In both cases, it should throw an exception code 4567, for example:

assert: command failed: {
    "ok" : 0,
    "errmsg" : "from collection cannot be sharded",
    "code" : 4567,
    "codeName" : "Location4567"
} : aggregate failed
_getErrorWithCode@src/mongo/shell/utils.js:25:13
doassert@src/mongo/shell/assert.js:16:14
assert.commandWorked@src/mongo/shell/assert.js:370:5
DBCollection.prototype.aggregate@src/mongo/shell/collection.js:1319:5
@(shell):1:1

Regards,
Wan.

Reply all
Reply to author
Forward
0 new messages