Message from discussion
Querying 'views' with a time frame
Date: Thu, 7 Jul 2011 23:05:46 -0400
Message-ID: <CAHWLjBA-gzGwJp7eKY5sQiTdGuTwYNCbKQmhpEA-6sm3v6bJ2A@mail.gmail.com>
Subject: Re: [mongodb-user] Querying 'views' with a time frame
From: Dwight Merriman <dwi...@10gen.com>
To: mongodb-user@googlegroups.com
you might also want to make view_stats a capped collection so that old stats
are automatically aged out.
an index on hour might make sense. in your map/reduce job you could do
something like
{ hour : { $gte : <start>, $lte : <end> } }
as a query filter, and the index would then limit the scan to just those stats documents.
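
to make the pieces concrete, here is a rough sketch in plain JavaScript, with no driver or server involved. it assumes the one-doc-per-image-per-hour layout from the quoted message below; the helper names (hourSinceEpoch, viewIncrement, hourRangeFilter, mostViewed) are made up for illustration, and mostViewed is only an in-memory stand-in for what the map/reduce job would compute, not the job itself.

```javascript
// Sketch only: plain JavaScript, no MongoDB driver. All helper names are hypothetical.
const MS_PER_HOUR = 60 * 60 * 1000;

// Whole hours since the Unix epoch, matching the <hoursinceepoch> field.
function hourSinceEpoch(date) {
  return Math.floor(date.getTime() / MS_PER_HOUR);
}

// Selector and modifier for an upsert, for example
//   db.view_stats.update(selector, modifier, true)
// which bumps the counter for one image in one hour bucket.
function viewIncrement(imageId, date) {
  return {
    selector: { image: imageId, hour: hourSinceEpoch(date) },
    modifier: { $inc: { views: 1 } },
  };
}

// Query filter for an inclusive range of hour buckets.
function hourRangeFilter(start, end) {
  return { hour: { $gte: hourSinceEpoch(start), $lte: hourSinceEpoch(end) } };
}

// In-memory stand-in for the map/reduce step: sum views per image
// over the filtered buckets, then sort by total, highest first.
function mostViewed(statsDocs, filter, limit) {
  const totals = new Map();
  for (const doc of statsDocs) {
    if (doc.hour < filter.hour.$gte || doc.hour > filter.hour.$lte) continue;
    totals.set(doc.image, (totals.get(doc.image) || 0) + doc.views);
  }
  return [...totals.entries()]
    .map(([image, views]) => ({ image, views }))
    .sort((a, b) => b.views - a.views)
    .slice(0, limit);
}
```

for 'today' you would pass the start of the day and the current time to hourRangeFilter; for 'this week', widen the range. the same filter object is what would go in as the query on the map/reduce job.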
On Thu, Jul 7, 2011 at 11:04 PM, Dwight Merriman <dwi...@10gen.com> wrote:
> one way is to have a view_stats collection, as you say. on each view of an
> image, $inc a counter in a particular doc in view_stats. you might for
> example have a doc per image per hour. or you could have the hours as an
> array or subobject. to keep it simple imagine something like
>
> { image: <imageid>, hour : <hoursinceepoch>, views : <number> }
>
> you can then periodically run a map/reduce job which outputs to some
> collection such as most_viewed_today. then anytime needed you query that
> collection. the stats don't need to be instantly up to date (typically) so
> you can run the map/reduce periodically -- maybe once per 5 minutes.
>
>
> On Thu, Jul 7, 2011 at 12:54 PM, strada <msuk...@gmail.com> wrote:
>
>> I'm using mongodb to store image metadata and S3 urls. I would like to
>> show most viewed items on a time period, like 'today', 'this week' ,
>> just like youtube does with videos. How should I store 'views' on
>> mongodb to query that kind of data?
>>
>> I'm thinking of creating a view collection and storing each 'view' as
>> a document with date information, and also storing the view count in
>> individual image documents for fast retrieval of view counts. These
>> two will be updated on each pageview. I'm assuming MapReduce would get
>> the job done from there, but I can't quite picture it.