Developing Kinesis Apps Locally


dani...@simplybusiness.co.uk

Jan 19, 2016, 11:18:22 AM
to Snowplow
Hi there,

At Simply Business we have an integration test suite for our real-time (streaming) processing setup, but we didn't want to have to create real Kinesis streams for it, so we started looking for alternatives. We found Kinesalite, which is "an implementation of Amazon's Kinesis, focused on correctness and performance, and built on LevelDB". The setup turned out to be really easy:
  • Installing nodejs and npm (on Mac: brew install node npm)
  • Installing kinesalite: npm install -g kinesalite
  • Running kinesalite: kinesalite

At that point the service is running at localhost:4567 and you can interact with it as if it were Kinesis. For example, you can create a stream with the AWS CLI:

aws kinesis create-stream --stream-name raw-good --shard-count 1 --endpoint-url http://localhost:4567
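This works because Kinesalite speaks the same JSON-over-HTTP protocol as real Kinesis, so anything that lets you override the endpoint can talk to it. As a rough, stdlib-only Python sketch of what the CLI sends (request construction only; SigV4 signing is omitted, and a real SDK would normally add it for you):

```python
import json
from urllib import request

def kinesis_request(action, payload, endpoint="http://localhost:4567"):
    """Build a Kinesis API request aimed at a local Kinesalite.

    The X-Amz-Target header selects the API action; the body is plain
    JSON. SigV4 signing is omitted here - an SDK adds it for you.
    """
    body = json.dumps(payload).encode("utf-8")
    return request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/x-amz-json-1.1",
            "X-Amz-Target": "Kinesis_20131202." + action,
        },
    )

# Equivalent of the awscli call above:
req = kinesis_request("CreateStream", {"StreamName": "raw-good", "ShardCount": 1})
# request.urlopen(req) would send it to a running Kinesalite
```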


Then we just had to modify the endpoint that the Stream Collector writes to and the endpoint that the Stream Enricher reads from, and everything worked fine.
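To make it concrete, the kind of setting we have in mind would look something like this in the collector's HOCON config - purely a hypothetical sketch, since the "custom-endpoint" name and its placement are made up; it's just an illustration of what such a PR might add:

```
# Hypothetical sketch only: "custom-endpoint" is not an existing setting,
# just an illustration of what the proposed PR might add.
collector {
  sink {
    kinesis {
      # Point the sink at a local Kinesalite instead of AWS
      custom-endpoint = "http://localhost:4567"
    }
  }
}
```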


We think it's really cool to be able to run the whole pipeline locally, so the question is: would you accept PRs to make the endpoint configurable in these two projects?


Cheers,

Dani

Alex Dean

Jan 19, 2016, 11:54:18 AM
to Snowplow
Hi Dani,

That's awesome - thanks for letting us know. We've been following the Kinesalite project a little ourselves.

PR welcome! Good timing as we are finishing off a new Kinesis release at the moment...

A




--
Co-founder
Snowplow Analytics
The Roma Building, 32-38 Scrutton Street, London EC2A 4RQ, United Kingdom
+44 (0)203 589 6116
+44 7881 622 925
@alexcrdean

dani...@simplybusiness.co.uk

Jan 21, 2016, 1:01:26 PM
to Snowplow
Hi Alex,

Quick update on this one. I've realized that in addition to using Kinesalite to mock Kinesis, we also have to use Dynalite to mock DynamoDB. There's not much point in mocking just one of them; otherwise you keep connecting to Amazon anyway.
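Dynalite plays the same role for DynamoDB that Kinesalite does for Kinesis: it speaks DynamoDB's JSON-over-HTTP protocol on a local port. As a sketch only (the port 4568 is an arbitrary choice to avoid clashing with Kinesalite on 4567, and the unsigned request would need SigV4 from a real SDK):

```python
import json
from urllib import request

def dynamodb_request(action, payload, endpoint="http://localhost:4568"):
    """Build a DynamoDB API request aimed at a local Dynalite.

    Same shape as the Kinesis protocol, but with DynamoDB's target
    prefix and the x-amz-json-1.0 content type. Signing omitted.
    """
    body = json.dumps(payload).encode("utf-8")
    return request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/x-amz-json-1.0",
            "X-Amz-Target": "DynamoDB_20120810." + action,
        },
    )

req = dynamodb_request("ListTables", {})
# request.urlopen(req) would send it to a running Dynalite
```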

This is annoying, because it makes the configuration options confusing (you have to provide endpoints for both DynamoDB and Kinesis), and the worker construction becomes ugly (it would need six params).

What's worse for us, we're going to use snowplow-common-enrich in Spark, but the Spark-Kinesis integration doesn't expose the DynamoDB endpoint configuration at all. We've contacted Amazon to request it, but it won't happen anytime soon.

So we'll park this option for now and keep using real Kinesis streams.

Regards,
Dani

Alex Dean

Jan 21, 2016, 1:04:07 PM
to Snowplow
Hi Dani,

Makes sense - thanks for the heads-up. Exciting about embedding common enrich in Spark - let us know how you get on!

Alex