Modeling question


Olav Rask

Sep 30, 2014, 4:27:09 PM
to rav...@googlegroups.com
Hi Guys

I'm wondering how you guys would go about modeling something like the dating app Tinder in RavenDB. The primary thing I am trying to figure out is the best way to handle the requirement that a user should not be shown the same person twice.

So far, what I can think of is either to have a document for each user with a list of the people that user has already voted on, or to have a document for each person with a list of the users who have voted on that person.

For the first solution i would expect the query to look something like "from p in session.Query<Person>() where !currentUser.HasViewed.Contains(p.Id) select p"
The second solution would be something like "from p in session.Query<Person>() where !p.ViewedBy.Contains(currentUser.Id) select p"
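
To make the two shapes concrete, here is a minimal in-memory sketch using plain LINQ to Objects. It is not a RavenDB query and says nothing about what the RavenDB LINQ provider will translate; the dictionaries, helper names, and sample ids are all hypothetical stand-ins for the two kinds of documents.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data; the current user is assumed to be people/673.
var candidates = new List<string> { "people/1", "people/2", "people/3", "people/4" };
var currentUserId = "people/673";

// Option 1: one list per user, holding the ids that user has already voted on.
var hasViewed = new Dictionary<string, HashSet<string>>
{
    ["people/673"] = new HashSet<string> { "people/2" },
};

// Option 2: one list per person, holding the ids of users who voted on that person.
var viewedBy = new Dictionary<string, HashSet<string>>
{
    ["people/2"] = new HashSet<string> { "people/673" },
};

// Option 1 needs the current user's whole list available to the filter.
List<string> UnseenViaHasViewed(string userId)
{
    var seen = hasViewed.TryGetValue(userId, out var s) ? s : new HashSet<string>();
    return candidates.Where(p => !seen.Contains(p)).ToList();
}

// Option 2 only needs the current user's id; the lists live on the candidates.
List<string> UnseenViaViewedBy(string userId)
{
    return candidates
        .Where(p => !(viewedBy.TryGetValue(p, out var voters) && voters.Contains(userId)))
        .ToList();
}

var optionOne = UnseenViaHasViewed(currentUserId);
var optionTwo = UnseenViaViewedBy(currentUserId);
Console.WriteLine(string.Join(", ", optionOne)); // people/1, people/3, people/4
Console.WriteLine(optionOne.SequenceEqual(optionTwo)); // True
```

Both shapes give the same answer; the difference is where the list lives and therefore what has to travel with the query.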


So what do you think about these solutions? Which would be more practical (or even possible), and which would perform better? Are there any alternatives?

Regards,
Olav

Itamar Syn-Hershko

Sep 30, 2014, 5:00:17 PM
to rav...@googlegroups.com
You should take into account that both will operate under eventual consistency guarantees, meaning it is NOT guaranteed that you won't be given the same people twice.

Also, .Contains won't work in queries. Use id.In(p.ViewedBy).

--

Itamar Syn-Hershko
http://code972.com | @synhershko
Freelance Developer & Consultant


Olav Rask

Oct 1, 2014, 5:28:07 AM
to rav...@googlegroups.com
Hello

Thank you for the input! I did not think about eventual consistency. Do you have any idea how you would compensate for this?

I did a little testing after making the initial post, and it turns out the first solution does not seem viable, since it means sending the ids of the people a user has viewed over the wire in the query. With 200 ids I was getting an error that seems to be due to the query size. It seems the better solution will be !id.In(p.ViewedBy). Actually, the In operator does not seem to work with !, but I found this Lucene query that does the trick:

var peopleNotYetViewed = session.Advanced.LuceneQuery<Person>().Where("HasViewed:(* -people/673)").Take(20).ToList(); (assuming the current user is people/673 :) )

Do you think this solution will perform well even when the number of people starts to rise?

Olav

Itamar Syn-Hershko

Oct 1, 2014, 6:00:51 AM
to rav...@googlegroups.com
The only way around this is Optimistic Concurrency and using Load operations only, and then you should also pay attention to the document size.
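
As a rough illustration of the idea, the sketch below shows optimistic concurrency as a version check on save, with a reload-and-retry when the check fails. It is an in-memory analogy only: the `store` dictionary, `Load`, `TrySave`, and the `views/people/673` id are hypothetical, and the actual RavenDB client setting is not shown here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in for a document store: id -> (version, viewed ids).
var store = new Dictionary<string, (int Version, List<string> Viewed)>
{
    ["views/people/673"] = (1, new List<string>()),
};

// Load returns a private copy of the list plus the version it was read at.
(int Version, List<string> Viewed) Load(string id)
{
    var doc = store[id];
    return (doc.Version, new List<string>(doc.Viewed));
}

// Save succeeds only if nobody else has written since our Load.
bool TrySave(string id, int loadedVersion, List<string> viewed)
{
    if (store[id].Version != loadedVersion) return false; // concurrent write detected
    store[id] = (loadedVersion + 1, viewed);
    return true;
}

// Two requests load the same document, then both try to record a vote.
var first = Load("views/people/673");
var second = Load("views/people/673");
first.Viewed.Add("people/1");
second.Viewed.Add("people/2");

bool firstSaved = TrySave("views/people/673", first.Version, first.Viewed);
bool secondSaved = TrySave("views/people/673", second.Version, second.Viewed);

// The losing request reloads and retries instead of overwriting the first vote.
if (!secondSaved)
{
    var retry = Load("views/people/673");
    retry.Viewed.Add("people/2");
    secondSaved = TrySave("views/people/673", retry.Version, retry.Viewed);
}

Console.WriteLine(string.Join(", ", store["views/people/673"].Viewed)); // people/1, people/2
```

Without the version check, the second save would silently drop `people/1` from the list, and that person could be shown again.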


Justin A

Oct 1, 2014, 6:31:00 AM
to rav...@googlegroups.com
Itamar, my first thought was to do a join (urgh - yes, SQL thinking) between two collections?

Possible?

Basically, I'd hate to see a document that gets huge, like this:

users/1
name: fred
rejectedIds:
[
    "users/2",
    "users/3",
    ... thousands of other user ids here ...
]

Oren Eini (Ayende Rahien)

Oct 1, 2014, 11:08:25 AM
to ravendb
The easiest way to handle this is probably to have a document on the server with the ids that a user has already seen.
Then use a transformer to reject those they have already seen.

Note that you might need to use the streaming API to make sure that you can scan enough, if the user saw a lot of other users.

Another thing to note is that this is effectively an O(N) operation, so you might want to watch it for very active users.
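
The shape of that scan can be sketched in memory as follows. This is not the transformer or streaming API itself: `StreamCandidates` is a hypothetical lazy source standing in for a streamed result set, and `seenIds` stands in for the per-user document of seen ids.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical "seen" document for one user, kept on the server.
var seenIds = new HashSet<string> { "people/1", "people/2", "people/3", "people/5" };

// Stand-in for a streamed result set: yields candidate ids lazily, one at a time.
int scanned = 0;
IEnumerable<string> StreamCandidates()
{
    for (int i = 1; i <= 1000; i++)
    {
        scanned++;
        yield return "people/" + i;
    }
}

// Keep scanning until enough unseen profiles are found. The work grows with the
// number of already-seen candidates that must be skipped first, hence O(N).
List<string> NextUnseen(int count) =>
    StreamCandidates().Where(id => !seenIds.Contains(id)).Take(count).ToList();

var page = NextUnseen(3);
Console.WriteLine(string.Join(", ", page)); // people/4, people/6, people/7
Console.WriteLine($"scanned {scanned} candidates to fill a page of {page.Count}");
```

The more profiles a user has already seen, the further the scan has to run before it fills a page, which is why very active users are the expensive case.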

--

Oren Eini | CEO, Hibernating Rhinos Ltd
Mobile: +972-52-548-6969 | Office: +972-4-622-7811 | Fax: +972-153-4-622-7811

Chris Marisic

Oct 2, 2014, 1:51:08 PM
to rav...@googlegroups.com
When emulating Tinder, you should expect a large number of requests in a short period of time: swipe, swipe, click, swipe, click, click, click, swipe, swipe, swipe, where each one of these actions generates the next profile result.

I would keep a document that is a list of all the profiles viewed. I wouldn't expect an individual user to vastly exceed 10,000 profiles viewed, so even if the doc became several megabytes it should probably be fine. Let's call this the PastViewDoc.

Next, when they initially invoke their search, I would execute the query against profiles that match age, location, and whatever other filters you have, and pull back anywhere from 10 to 100 profile results. I would chop out all profiles that match PastViewDoc and stick the remaining results in the session or a similar user-scoped cache. I would then return 1 result. When the user moves to the next result, I would return 1 more result and keep popping users off the profile queue you have in the session. Once you hit 0, you invoke the query again and rehydrate the queue. (You could get fancier and do this in the background: once the user hits some number, say 10 profiles to go, you background-update the queue to have 10 + 100-PastViewDoc.) At some point, when they are saturating their area, you may have to page into the result list multiple times until you get unique results (you could potentially scale the page size when you're reaching starvation conditions).

The next upside: by pulling back the documents in chunks and sticking them in an in-memory cache, you can load those profiles from memory. It is very reasonable to expect that profiles 1, 2, 3, 4, 5, 6 will all be viewed immediately. This eliminates large amounts of chatter with the database. A well-designed application is "chunky" as opposed to "chatty" when it comes to IO calls.

This design would allow for easy horizontal scaling of the web tier by avoiding the much tighter IO-bound coupling of most of the solutions here.

For social network applications, queueing and background work are your friends. I would likely not have a single synchronous action in this system EXCEPT "show next profile", which only hits an in-memory cache. I would make sure "likes", "view full profile", "update PastViewDoc", and "add users to profile queue" were all handled asynchronously from queues. I would only block if the call to next profile comes back empty, and then block until I get a result (with a timeout if nothing happens soon).
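
The profile queue described in this post could be sketched roughly like this. Everything here is an in-memory stand-in under stated assumptions: `QueryPage` plays the role of the filtered search, `pastViewDoc` is a plain set, paging is a simple page counter, and the background and asynchronous parts are left out. All names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins: PastViewDoc as a set, and a paged search over 25 profiles.
var pastViewDoc = new HashSet<string> { "profiles/2", "profiles/4" };
var allProfiles = Enumerable.Range(1, 25).Select(i => "profiles/" + i).ToList();
int queryCount = 0;
int nextPage = 0;
const int pageSize = 10;

List<string> QueryPage(int page)
{
    queryCount++; // counts round trips to the "database"
    return allProfiles.Skip(page * pageSize).Take(pageSize).ToList();
}

// Per-user queue held in the session or a user-scoped cache.
var queue = new Queue<string>();

bool TryNextProfile(out string profile)
{
    // Refill only when empty; page forward until something unseen turns up.
    while (queue.Count == 0 && nextPage * pageSize < allProfiles.Count)
    {
        foreach (var id in QueryPage(nextPage++).Where(p => !pastViewDoc.Contains(p)))
            queue.Enqueue(id);
    }
    if (queue.Count == 0)
    {
        profile = string.Empty; // area saturated, nothing left to show
        return false;
    }
    profile = queue.Dequeue();
    pastViewDoc.Add(profile); // in the post this update would go through a background queue
    return true;
}

var shown = new List<string>();
for (int i = 0; i < 12; i++)
{
    if (TryNextProfile(out var profile)) shown.Add(profile);
}
Console.WriteLine($"{shown.Count} profiles shown using {queryCount} queries");
```

Twelve swipes cost two queries here instead of twelve, which is the "chunky, not chatty" point in miniature.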

Kijana Woodard

Oct 2, 2014, 2:29:51 PM
to rav...@googlegroups.com
I bet you could hack it further depending on business goals.

1. If you only care about not repeating for {n days}, that leads down one path.
2. If you want to never repeat, you could do something like group users into cohorts. Keep a list of viewed per cohort. Keep cohort size at say 10k. You can keep showing from that cohort until exhausted. Then you don't have to worry about an unbounded list.
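
One possible reading of the cohort idea in option 2, sketched in memory: people are split into fixed-size cohorts, the user works through one cohort at a time, and the viewed list is reset when a cohort is exhausted. The cohort assignment by position, the tiny cohort size, and all names are assumptions made for the demo.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical cohort size; the post suggests something like 10k, 3 keeps the demo small.
const int cohortSize = 3;
var people = Enumerable.Range(1, 7).Select(i => "people/" + i).ToList();

// Cohort n holds the people at positions [n * cohortSize, (n + 1) * cohortSize).
List<string> Cohort(int index) =>
    people.Skip(index * cohortSize).Take(cohortSize).ToList();

// Per-user state: the cohort being worked through and what was seen within it.
int currentCohort = 0;
var seenInCohort = new HashSet<string>();

bool TryNext(out string person)
{
    while (currentCohort * cohortSize < people.Count)
    {
        var unseen = Cohort(currentCohort).FirstOrDefault(p => !seenInCohort.Contains(p));
        if (unseen != null)
        {
            seenInCohort.Add(unseen);
            person = unseen;
            return true;
        }
        // Cohort exhausted: advance and drop its list, so the seen set
        // never holds more than cohortSize ids.
        currentCohort++;
        seenInCohort.Clear();
    }
    person = string.Empty;
    return false;
}

var order = new List<string>();
while (TryNext(out var next)) order.Add(next);
Console.WriteLine(string.Join(", ", order));
```

Each person appears once, and the per-user state stays bounded by the cohort size plus one cohort index, regardless of how many people exist overall.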
