Improving Platform architecture

Noel Welsh

Feb 11, 2014, 7:50:15 AM
to purpose-pl...@googlegroups.com
Hi all,

(I'm assuming I'm not the only subscriber to this list.) 

I have a general concern about Platform's architecture. Specifically, creating ("cutting") a list takes a very long time as the DB has to scan over all events. We have queries taking in excess of a minute, and I understand AllOut has similarly slow queries. I'm not too worried about this time per se, but I am worried about the knock-on effect on other actions, like unsubscribing users, that may be happening concurrently. Unsubscribing a user should be very quick, but contention for the database / disk could be an issue.

I see a number of solutions:

0. More delayed jobs. 
1. Modify Platform to support two database connections -- one for fast actions and one for slower queries. The slow queries can take place on a read replica.
2. Re-architect Platform into two parts: an API layer that communicates with a DB layer via a message queue. The API layer is for recording events (opens, clicks, unsubs) into the message queue. The DB layer performs the actions, persisting to the database. This is like 0 but more consistently applied.
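
To make option 2 concrete, here is a minimal sketch of the split. It assumes an in-process Queue standing in for a real message queue and a Hash standing in for the database; ApiLayer, DbLayer, record and drain are hypothetical names, not existing Platform code.

```ruby
require "thread"

# Hypothetical sketch of option 2; none of these names exist in Platform.
# An in-process Queue stands in for a message queue, a Hash for the database.

class ApiLayer
  def initialize(queue)
    @queue = queue
  end

  # Record an event (open, click, unsub) without touching the database.
  def record(type, user_id)
    @queue << { type: type, user_id: user_id }
  end
end

class DbLayer
  attr_reader :store

  def initialize(queue)
    @queue = queue
    @store = Hash.new { |hash, key| hash[key] = [] }
  end

  # Drain the queue, persisting each event; returns the number processed.
  def drain
    processed = 0
    until @queue.empty?
      event = @queue.pop
      @store[event[:user_id]] << event[:type]
      processed += 1
    end
    processed
  end
end

queue = Queue.new
api = ApiLayer.new(queue)
db = DbLayer.new(queue)

api.record(:open, 1)
api.record(:unsub, 1)
db.drain
```

The point of the split is that record never waits on the database, so a slow list cut cannot delay an unsubscribe being accepted; it only delays when the unsubscribe is persisted.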

Thoughts?

Cheers,
Noel

Scott Feinberg

Feb 11, 2014, 8:03:05 AM
to Noel Welsh, purpose-pl...@googlegroups.com
AllOut's solution to this was to just queue every API post request. Unsubscribes, signatures, everything except for donations is dumped in a queue. This stabilized our DB performance so we had consistent DB usage. The disadvantage is that you lose real-time stats, especially during high traffic.

How big is your events table? The other route we've considered taking, as our table has surpassed 110 million rows, is to store the User Activity Events in a NoSQL data store and run MapReduce queries over it. 


--
You received this message because you are subscribed to the Google Groups "Purpose Platform dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to purpose-platform...@googlegroups.com.

John McCarthy

Feb 11, 2014, 9:35:03 AM
to Scott Feinberg, Noel Welsh, purpose-pl...@googlegroups.com
Hi, also the Platform already supports two database connections. There is a READONLY_DATABASE_URL env var that can be set to specify the db that will be used for list cutting and blasting queries:
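
A minimal sketch of how a setting like this might be consulted, for illustration only. Only the READONLY_DATABASE_URL name comes from this thread; the connection_url_for helper, the DATABASE_URL variable for the primary, and the :slow/:fast labels are assumptions, not Platform's actual code.

```ruby
# Hypothetical helper; only READONLY_DATABASE_URL is named in this thread.
# DATABASE_URL and the :slow / :fast labels are assumptions for illustration.
def connection_url_for(kind, env = ENV)
  primary = env.fetch("DATABASE_URL")
  return primary unless kind == :slow

  # Fall back to the primary when no read replica is configured.
  env.fetch("READONLY_DATABASE_URL", primary)
end
```

With both variables set, connection_url_for(:slow) would return the replica URL for list cutting and blasting queries, while fast actions such as unsubscribes would keep using the primary.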


Thanks,
John


Noel Welsh

Feb 11, 2014, 9:59:27 AM
to Scott Feinberg, purpose-pl...@googlegroups.com
We have a mere 6347904 rows in user_activity_events.

I have some experience with Spark (http://spark.incubator.apache.org/), which is a very good system for querying large amounts of data. However, adding another component to Platform will complicate deployment. I found Platform pretty difficult to deploy to start with.

N.
--
Noel Welsh
Untyped Ltd                 http://www.untyped.com/
+44 (0) 121 285 2285    in...@untyped.com
UK company number    06226466

Noel Welsh

Feb 12, 2014, 11:09:21 AM
to purpose-pl...@googlegroups.com
What I take from this discussion is that there is no grand plan / this is not a concern. 

I'm writing SendGrid event handlers right now. I expect this will be a high-traffic API endpoint, and I would like to contribute this code back. I will use delayed job.

(I'm not a huge fan of the model used by delayed job as I would rather queue the event than the action to take following the event. Enqueuing the event on a general message bus allows many services to listen for that event and take appropriate action. Enqueuing the action leads to a mess if an event requires more than one action.)
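
To illustrate the difference, here is a minimal sketch of enqueuing the event rather than the action. It assumes an in-process bus standing in for a real message bus; EventBus, the :email_bounced event and both listeners are hypothetical, chosen only to show one event fanning out to several actions.

```ruby
# Hypothetical in-process event bus; a real system would use a message broker.
class EventBus
  def initialize
    @listeners = Hash.new { |hash, key| hash[key] = [] }
  end

  # Register a listener block for a named event.
  def subscribe(event_name, &listener)
    @listeners[event_name] << listener
  end

  # Deliver the payload to every listener; returns how many were notified.
  def publish(event_name, payload)
    @listeners[event_name].each { |listener| listener.call(payload) }
    @listeners[event_name].size
  end
end

bus = EventBus.new
log = []

# Two independent services react to the same event (both are made up).
bus.subscribe(:email_bounced) { |event| log << "unsubscribe #{event[:email]}" }
bus.subscribe(:email_bounced) { |event| log << "update stats for #{event[:email]}" }

bus.publish(:email_bounced, { email: "a@example.com" })
```

The publisher only states what happened. Adding a third reaction means adding a third subscriber, with no change to the code that receives the SendGrid event.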

N.

Scott Feinberg

Feb 12, 2014, 11:13:12 AM
to Noel Welsh, purpose-pl...@googlegroups.com
AllOut is actually using Resque for our non-blaster jobs instead of DelayedJob due to de-serialization issues. 



Noel Welsh

Feb 12, 2014, 11:33:32 AM
to Scott Feinberg, purpose-pl...@googlegroups.com
That doesn't fill me with confidence.

N.




Scott Feinberg

Feb 12, 2014, 11:57:36 AM
to Noel Welsh, purpose-pl...@googlegroups.com
I wasn't around when this problem occurred, but my understanding is that as long as you don't pass objects around to the DelayedJobs, you're totally fine.
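
A minimal sketch of that rule, assuming an Array standing in for the job queue and a Hash standing in for the users table. The enqueue and perform helpers are hypothetical, not the delayed_job or Resque API; the idea is just that only primitive arguments are serialized and the record is looked up again when the job runs.

```ruby
require "json"

# Hypothetical helpers; not the delayed_job or Resque API.

# Enqueue only primitive arguments (a job name and an id), serialized as JSON.
def enqueue(queue, job_name, user_id)
  queue << JSON.generate({ "job" => job_name, "user_id" => user_id })
end

# When the job runs, look the user up by id instead of deserializing an object.
def perform(payload, users)
  args = JSON.parse(payload)
  email = users.fetch(args["user_id"])
  job = args["job"]
  "#{job} for #{email}"
end

users = { 42 => "a@example.com" } # stand-in for a users table
queue = []
enqueue(queue, "send_welcome", 42)
perform(queue.shift, users)
```

Because the payload holds only an id, there is no serialized object to go stale or fail to load, and the job works with whatever the record contains at run time.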

John McCarthy

Feb 12, 2014, 11:59:13 AM
to Noel Welsh, Scott Feinberg, purpose-pl...@googlegroups.com
Hi Noel, to clarify: you're concerned with using delayed_job for the SendGrid event handling work you're doing because enqueuing an action is messier than enqueuing the event if disparate parts of the app need to respond to the event? Also, your concern about list cutting putting load on the main database is handled at this point by using the read-only connection?

Thanks,
John





--
John McCarthy | Lead Developer, Purpose Platform | Purpose | jo...@purpose.com | 631.742.3484

Noel Welsh

Feb 12, 2014, 12:15:21 PM
to John McCarthy, Scott Feinberg, purpose-pl...@googlegroups.com
On Wed, Feb 12, 2014 at 4:59 PM, John McCarthy <jo...@purpose.com> wrote:
Hi Noel, to clarify: you're concerned with using delayed_job for the SendGrid event handling work you're doing because enqueuing an action is messier than enqueuing the event if disparate parts of the app need to respond to the event?

Yes. It's not difficult to think of multiple actions occurring after one event. The most obvious one I can think of is subscribing a user and sending an auto responder. There is also sending an auto responder on donation which I know is something FFtF want.

I'm not really concerned about using delayed_job for this work per se. I'm more concerned about the overall architecture of Platform, and not building a plate of spaghetti over the long term.

This blog post describes the type of architecture I'm talking about, and is an architecture we're using in other projects:


 
Also, your concern about list cutting putting load on the main database is handled at this point by using the read-only connection?

We don't have a read replica at the moment, but we do have long queries (and timeouts). We're looking at options to speed up database access, and a read replica is one of them.

N.

 

John McCarthy

Feb 12, 2014, 12:27:03 PM
to Noel Welsh, Scott Feinberg, purpose-pl...@googlegroups.com
I believe Scott has moved action processing to a background job (redis/resque). I'm not sure at which point in the app they enqueue the action, but if it's at the API controller action level, that will take care of everything needed to subscribe/unsubscribe/create/update users and send autofire emails (which may be enqueued themselves).

Noel Welsh

Feb 12, 2014, 4:54:07 PM
to John McCarthy, Scott Feinberg, purpose-pl...@googlegroups.com
What I'm trying to get at is the right direction to take with new code for the long term success of Platform. What's the vision? Where do we want to be in six months? How are we going to get the code base to support the features we want? What I'm getting is there isn't really a vision. If that's the case, that's ok. I'll still write my code and probably submit a pull request, but it does feel like some effort is being wasted. 

N.

John McCarthy

Feb 12, 2014, 5:10:26 PM
to Noel Welsh, Scott Feinberg, purpose-pl...@googlegroups.com
Hi, we haven't formalized and agreed upon the product roadmap and technical roadmap for the open source project yet. That will be coming soon (I expect within the next couple of months).

Noel Welsh

Feb 12, 2014, 5:17:24 PM
to John McCarthy, Scott Feinberg, purpose-pl...@googlegroups.com
Thanks, that's given me the answer I need.

N.