Eventual consistency - when to read?

Neil Sheth

Dec 10, 2014, 4:09:52 AM

to secor...@googlegroups.com

Hello,

I'm trying to see if Secor helps solve an issue we're facing - basically knowing when something is safe to read from S3.

I started here -

https://github.com/pinterest/secor/blob/master/DESIGN.md

One of the stated objectives here is:

low log-to-ready_for_consumption delay: logged messages should be ready for consumption by analytical tools asap.

Reading the design doc, it's not clear to me how this is supported.

In the uploader.check_policy() method, there's a call to upload_files_to_s3(t, p). Now, if we have an external reader, an analytics tool looking at S3 as a data source, how do we get around the eventual consistency issues here?

Perhaps I'm missing something, appreciate any pointers!

Thanks!

Neil

Pawel Garbacki

Dec 10, 2014, 5:33:04 PM

to Neil Sheth, secor...@googlegroups.com

Hi,

It depends on how you define "safe to read". S3 guarantees atomicity, which means that if a file is there, it is complete. Keep in mind, though, that in some rare cases it is possible for Secor to replace an uploaded file.

In general, the eventual consistency model of S3 is inconvenient to work with, especially on the reader end. Secor is designed so that it never has to read any data from S3, so it is agnostic to the peculiarities of the consistency model. There are projects such as Netflix's S3mper that use a strongly consistent metadata store to simplify writing clients that read data from S3, but those types of solutions are quite sophisticated.
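
To make the metadata-store idea concrete, here is a minimal Python sketch of the general technique: the writer records the keys it uploaded in a strongly consistent store, and the reader polls the S3 listing until it contains all of them. This is not S3mper's API. The function name, the `list_keys` callback and the retry parameters are all hypothetical, chosen only for illustration.

```python
import time
from typing import Callable, Iterable, Set


def wait_for_consistent_listing(
    expected_keys: Iterable[str],
    list_keys: Callable[[], Iterable[str]],
    attempts: int = 5,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Set[str]:
    """Poll list_keys() until it contains every expected key.

    expected_keys: keys recorded by the writer in a strongly consistent
        metadata store (hypothetical; how they are stored is out of scope).
    list_keys: callback returning the keys currently visible in the
        eventually consistent listing.

    Returns the keys still missing after the last attempt; an empty set
    means the listing has caught up with the metadata store.
    """
    expected = set(expected_keys)
    missing = expected
    for attempt in range(attempts):
        missing = expected - set(list_keys())
        if not missing:
            break
        # Back off before the next listing, but not after the final one.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return missing
```

A reader would start processing only when the returned set is empty, and treat a non-empty result as "not safe to read yet".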

At Pinterest, real-time analytics tools read data directly from Kafka. Secor logs are consumed by batch jobs that kick off only after daily partitions have been finalized. Secor comes with a partition finalizer tool that implements a heuristic to decide whether data for a complete day has been consumed.
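
As a rough illustration of how a batch job could gate on finalized partitions, the Python sketch below picks out the partitions that carry a completion marker. It assumes finalization is signalled by a marker object named `_SUCCESS` under each daily partition prefix and that keys look like `logs/topic/dt=2014-12-09/...`. Both the marker name and the key layout are assumptions made for this example, so check them against your own setup.

```python
from typing import Iterable, List

# Assumed marker object name; adjust to whatever your finalizer writes.
MARKER = "_SUCCESS"


def finalized_partitions(keys: Iterable[str], marker: str = MARKER) -> List[str]:
    """Return the sorted partition prefixes that contain the marker object.

    keys: object keys from a bucket listing (hypothetical layout:
        <prefix>/<partition>/<file>).
    """
    suffix = "/" + marker
    # A partition counts as finalized only if "<partition>/<marker>" is listed.
    return sorted({key[: -len(suffix)] for key in keys if key.endswith(suffix)})


keys = [
    "logs/topic/dt=2014-12-09/part-0",
    "logs/topic/dt=2014-12-09/_SUCCESS",
    "logs/topic/dt=2014-12-10/part-0",
]
ready = finalized_partitions(keys)
```

In this example only `logs/topic/dt=2014-12-09` is returned; the job would skip `dt=2014-12-10` until its marker appears.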

HTH,

-pawel


Neil Sheth

Dec 12, 2014, 4:30:47 PM

to Pawel Garbacki, secor...@googlegroups.com

Thanks! We were also looking at S3mper, and it is perhaps more in line with what we need. We just wanted to make sure we understood what Secor provides.
