Using regex to query for documents for one day using pymongo


abi...@ipredictt.com

Oct 15, 2015, 5:36:05 AM
to mongodb-user
I am using MongoDB to store around 20 crore (200 million) documents.

A sample of the document in my collection looks like this:

{"key1":"value1","key2":"value2","date":"2015-08-19 12:58:00","key3":"value3",........,....}



There are around 20 key-value pairs in a document.

Among them, the date is stored like this: "date":"2015-08-19 12:58:00". I want to find the documents corresponding to one day, for example the documents for 2015-08-19.

I am also using pandas to convert the documents into a DataFrame. So I used the following commands:

import re
import pandas as pd

regx = re.compile(r"2015-08-01\s[012][0-9]:[0-5][0-9]:[0-5][0-9]")

click_data_sample = pd.DataFrame(list(coll1.find({"date": regx})))



I have indexed the date key in MongoDB.
But it's taking a lot of time. I have been waiting for half an hour now, with still no result.

Please help!!

Thanks in advance!!

Rhys Campbell

Oct 15, 2015, 8:59:13 AM
to mongodb-user
Using a regex probably means an index can't be used. What does explain say?
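
One way to read the explain output from pymongo is sketched below. It assumes the `queryPlanner.winningPlan` layout that MongoDB 3.0 and later return; the helper name `plan_stages` and the sample plan are made up for illustration, and the exact document shape can differ between server versions. A `COLLSCAN` stage would mean the whole collection is being scanned, while `IXSCAN` would mean an index is involved.

```python
def plan_stages(plan):
    """Collect the stage names from a (possibly nested) winningPlan document."""
    stages = [plan.get("stage")]
    # A stage may have one child ("inputStage") or several ("inputStages").
    children = list(plan.get("inputStages", []))
    if "inputStage" in plan:
        children.append(plan["inputStage"])
    for child in children:
        stages.extend(plan_stages(child))
    return stages


# Against a live collection (coll1 and regx as in the question), roughly:
#   explain_doc = coll1.find({"date": regx}).explain()
#   print(plan_stages(explain_doc["queryPlanner"]["winningPlan"]))

# Hand-written sample in the shape of a winning plan, for illustration only.
sample_plan = {
    "stage": "FETCH",
    "inputStage": {"stage": "IXSCAN", "indexName": "date_1"},
}
print(plan_stages(sample_plan))
```

Even when `IXSCAN` shows up, a regex that is not anchored to the start of the string may still have to walk every key in the index, so the stage name alone does not prove the query is fast.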

I'm not sure you're using the correct syntax (this is Python?)


Are you storing the date as a string or an actual ISODate? 

I think you'd be much better off storing it as a date and then doing a range query. Here's an example doing it by month...


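// JavaScript months are zero-based, so 11 is December.
var start = new Date(2010, 11, 1);
var end = new Date(2010, 11, 30);

// $lt excludes the end date, so this matches December 1 up to (not including) December 30.
db.posts.find({created_on: {$gte: start, $lt: end}});
// taken from http://cookbook.mongodb.org/patterns/date_range/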


It should be easy enough to modify this to do it by day.
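
A by-day version in Python might look like the sketch below. The helper names (`day_bounds`, `day_query_dates`, `day_query_strings`) are invented for this example, and the field name `date` and the `"YYYY-MM-DD HH:MM:SS"` string format are taken from the sample document in the question. The first query assumes the field holds real dates; the second assumes it still holds zero-padded strings, which sort in the same order as the times they represent, so a plain string range should work too.

```python
from datetime import datetime, timedelta


def day_bounds(day):
    """Return (start, end) datetimes for a 'YYYY-MM-DD' day, end exclusive."""
    start = datetime.strptime(day, "%Y-%m-%d")
    return start, start + timedelta(days=1)


def day_query_dates(day, field="date"):
    """Range filter for a field stored as a real date (datetime in pymongo)."""
    start, end = day_bounds(day)
    return {field: {"$gte": start, "$lt": end}}


def day_query_strings(day, field="date"):
    """Range filter for a field stored as a 'YYYY-MM-DD HH:MM:SS' string."""
    start, end = day_bounds(day)
    fmt = "%Y-%m-%d %H:%M:%S"
    return {field: {"$gte": start.strftime(fmt), "$lt": end.strftime(fmt)}}


print(day_query_strings("2015-08-19"))

# With pymongo and pandas (coll1 as in the question), roughly:
#   docs = coll1.find(day_query_strings("2015-08-19"))
#   click_data_sample = pd.DataFrame(list(docs))
```

Both filters are ordinary range conditions on a single field, which is the kind of query an index on that field is designed to serve.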


Rhys Campbell

Oct 15, 2015, 9:12:58 AM
to mongodb-user
Had to check what a Crore was...

Bernie Hackett

Oct 15, 2015, 2:05:10 PM
to mongodb-user
To add to what Rhys Campbell said, Python regular expression syntax and MongoDB regular expression syntax are similar, but not the same. MongoDB uses libpcre, which Python's syntax is not fully compatible with. If you really want to use a regex for this (understanding the caveats previously mentioned), you'll likely get better results using $regex syntax rather than Python regular expressions.
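
As a rough illustration of the $regex form, the sketch below builds the filter as a plain dict so the pattern string goes to the server as-is. The helper name `day_prefix_query` is made up for this example, and the field name and date layout are assumed from the sample document in the question. The pattern is anchored with `^`, since MongoDB can use an index much more effectively for a case-sensitive prefix pattern than for an unanchored one.

```python
from datetime import datetime


def day_prefix_query(day, field="date"):
    """Build a $regex filter matching strings that start with 'YYYY-MM-DD '."""
    # Parse and re-format so malformed input raises ValueError and the
    # prefix is always zero-padded like the stored strings.
    normalized = datetime.strptime(day, "%Y-%m-%d").strftime("%Y-%m-%d")
    # Digits and hyphens are literal outside a character class, so no escaping is needed.
    return {field: {"$regex": "^" + normalized + " "}}


print(day_prefix_query("2015-08-19"))

# With pymongo (coll1 as in the question), roughly:
#   docs = coll1.find(day_prefix_query("2015-08-19"))
```

Running explain on the resulting query would be the way to confirm whether the index is actually being used for the prefix.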