Sky v0.3.0.beta1


b...@skylandlabs.com

Mar 21, 2013, 5:36:51 PM
to sk...@googlegroups.com
hey everyone-

I have the Go port of Sky working and it's up on GitHub:


In the rewrite, I made some changes to the database to simplify everything. Here's a quick list:
  1. The API is now JSON over HTTP and it's RESTful. There's a list of all the calls within the README.

  2. There's no longer a concept of actions, per se. Actions were previously encoded through a lookup, so the first action stored (e.g. "signup") would be stored as a 1, the next one (e.g. "checkout") as a 2, etc. That's been replaced with the "factor" data type (similar to R's factors), so you can now just create an "action" factor property.

  3. Properties are now referred to as "transient" or "permanent" instead of "action properties" or "object properties". The value of a transient property exists only at a single point in time, while the value of a permanent property persists from a point in time until it is changed.

  4. When inserting an event, transient and permanent properties are sent in one single hash instead of two separate ones.

  5. The query interface has changed substantially. It's currently composed of two primitives: selections & conditions. These can be nested inside each other to create funnel analysis, and I'll be adding a third iteration primitive (probably called "foreach") that will let you do cohort analysis. This is a much larger topic so I'll post some docs here soon.

If you have a minute, I'd appreciate it if you could give the code a quick download, see if it compiles correctly and run some of the cURL API calls from the README to see if you hit any snags.

I'm going to fix the Ruby gem next and then do a v0.3.0 release soon after!


Ben

Ben Johnson

Mar 21, 2013, 6:55:29 PM
to sk...@googlegroups.com
There are a couple of issues with leveldb in the Sky C API on Linux. I need to do a small refactor and then it'll be fixed. It should work fine on OS X though.

Ben

Ben Johnson

Mar 21, 2013, 11:49:20 PM
to sk...@googlegroups.com
I fixed the Linux issues but I'm having trouble getting Go to recognize /usr/local/lib (despite it being listed in /etc/ld.so.conf). For now I have to set LD_LIBRARY_PATH manually when running "go run", like this:

$ LD_LIBRARY_PATH=/usr/local/lib go run skyd.go

The fixes are on the go branch in the latest commit (5b68ba502). Let me know if anybody knows how to fix the LD_LIBRARY_PATH issue. I'll do some more research later.


Ben

Louis Landry

Mar 22, 2013, 2:36:57 PM
to sk...@googlegroups.com
On CentOS at least:

$ echo '/usr/local/lib' | sudo tee /etc/ld.so.conf.d/sky.conf && sudo ldconfig

That should add /usr/local/lib to the dynamic linker's search path, so you don't have to set LD_LIBRARY_PATH yourself.

Cheers.

- Louis

Ben Johnson

Mar 22, 2013, 4:10:31 PM
to sk...@googlegroups.com
Thanks, Louis. I'll give it a try again.

Ben

Edward Middleton

Mar 26, 2013, 12:42:21 AM
to b...@skylandlabs.com, sk...@googlegroups.com
On 03/22/2013 06:36 AM, b...@skylandlabs.com wrote:
>
> If you have a minute, I'd appreciate it if you could give the code a
> quick download, see if it compiles correctly and run some of the cURL
> API calls from the README to see if you hit any snags.

I just gave it a run. Good call on changing to Go, very nice language.
I had an issue getting the database to compile but it might be a
problem on my end. Go kept trying to link with luajit-2.0.0 instead of
luajit-2.0.1, which I had installed. I couldn't work out where it was
being pulled in from so I just symlinked it back to luajit-2.0.1, which
fixed the problem.

I hacked together a Go based etl/loader for my snowplow logs and got it
to work with the api from the readme. I am finding inserts run fast
until the database gets to about 40M then things slow down dramatically.
I thought it could be a problem with my loader code but restarting the
loader doesn't result in improved performance so I am guessing not.

Edward

Ben Johnson

Mar 26, 2013, 9:46:11 AM
to Edward Middleton, sk...@googlegroups.com
Yeah, Go is a fun language. You were able to get an ETL loader working pretty fast. :)

I'm not sure why Go was linking to luajit-2.0.0 instead of 2.0.1. I'm just specifying the Lua version to link against in skyd/execution_engine.go.

The insert slowness is probably on my side right now. It's currently deserializing the entire event stream for an object, inserting an event and then reserializing the whole thing and saving it. It's really slow for large event streams. I'll get a fix in today or tomorrow to optimize appends (i.e. inserting events that occur after all other existing events for an object). It should speed inserts up by orders of magnitude for large data sets.
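
A minimal in-memory sketch of the append fast path described above, assuming an object's events are kept in ascending timestamp order. The Event and Stream types are hypothetical and say nothing about Sky's actual storage format; the point is only that an event at or after the last timestamp can be appended without touching the rest of the stream, while an out-of-order event still needs the slower insert.

```go
package main

import (
	"fmt"
	"sort"
)

// Event is a hypothetical, simplified event: a Unix timestamp and an action.
type Event struct {
	Timestamp int64
	Action    string
}

// Stream holds one object's events in ascending timestamp order.
type Stream struct {
	events []Event
}

// Insert adds e while keeping the stream ordered.
// It reports whether the append fast path was taken.
func (s *Stream) Insert(e Event) bool {
	n := len(s.events)
	if n == 0 || e.Timestamp >= s.events[n-1].Timestamp {
		// Fast path: the event belongs at the end, so nothing is shifted.
		s.events = append(s.events, e)
		return true
	}
	// Slow path: find the first later event and shift the tail by one.
	i := sort.Search(n, func(i int) bool {
		return s.events[i].Timestamp > e.Timestamp
	})
	s.events = append(s.events, Event{})
	copy(s.events[i+1:], s.events[i:])
	s.events[i] = e
	return false
}

func main() {
	s := &Stream{}
	for _, e := range []Event{{10, "signup"}, {30, "checkout"}, {20, "view"}} {
		fast := s.Insert(e)
		fmt.Printf("inserted %q at %d, fast path: %v\n", e.Action, e.Timestamp, fast)
	}
	fmt.Println(s.events)
}
```

A loader that sends events in timestamp order per object would hit the fast path on every insert in this sketch.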


Ben

Ben Johnson

Mar 26, 2013, 11:01:08 AM
to Edward Middleton, sk...@googlegroups.com
Edward-

I have the Ruby gem working against Sky v0.3.0. I stripped it down a lot so it's pretty simple. I'm going to move the importer into Go so it's fast and so it can be compiled. You can find the new gem in the unstable branch:

https://github.com/skydb/sky.rb/tree/unstable


I have some integration tests to add and I'm adding some ease-of-use functions to the table object but other than that it's pretty much done. I'm going to work on the Go library after that.


Ben



Ben Johnson

Mar 27, 2013, 6:51:58 AM
to Edward Middleton, sk...@googlegroups.com
I added some bug fixes, cleaned up some APIs and finished off the Ruby client. And it has documentation! :)


I have two more outstanding items left before the v0.3.0 release. Hopefully I can get to both tomorrow.



Ben



Ben Johnson

Mar 27, 2013, 6:14:05 PM
to Edward Middleton, sk...@googlegroups.com
The append optimization has been added so hopefully that speeds things up significantly for you on large imports. The data format has changed a little bit though so you'll need to clear out old data and recompile csky:

# From the root Sky source directory.
$ sudo make csky
$ sudo rm -rf /var/lib/sky/*


Ben

Edward Middleton

Mar 27, 2013, 9:36:06 PM
to sk...@googlegroups.com, Ben Johnson
I have been using Go as an alternative scripting language because of the
really fast compile times.

Having the importer in Go would be great. Supporting streaming input
could also be nice. I was thinking about the practicality of an nginx
style approach where a lot of higher level functionality is implemented
as compile-time modules so you get flexibility and high performance.

I pulled the current Go database and it seems to have less of an issue
loading but it still seems to be a bit bursty. I need to look at the Go
profiling tools to get a better idea what is happening.

The make script silently fails to build for me. I worked around it by
manually running

# cd skyd && go build -o /path/to/source ../skyd.go

The issue I mentioned earlier with Go trying to link against
luajit-2.0.0 was a packaging error. I am building system packages for
the dependencies and I followed the convention in the existing system
packages of naming the library libluajit-2.0.so.1 instead of
libluajit-5.1. Unfortunately I missed changing the SONAME in the
library. Fixing that resolved the issue.

Edward

On 03/27/2013 12:01 AM, Ben Johnson wrote:
> Edward-
>
> I have the Ruby gem working against Sky v0.3.0. I stripped down a lot so
> it's pretty simple. I'm going to move the importer into Go so it's fast
> and so it can be compiled. You can find the new gem in the unstable branch:
>
> https://github.com/skydb/sky.rb/tree/unstable
>
> I have some integration tests to add and I'm adding some ease-of-use
> functions to the table object but other than that it's pretty much done.
> I'm going to work on the Go library after that.
>
>
> Ben
>
>
> On Mar 25, 2013, at 10:42 PM, Edward Middleton wrote:
>
>> On 03/22/2013 06:36 AM, b...@skylandlabs.com

Ben Johnson

Mar 28, 2013, 12:42:33 AM
to Edward Middleton, sk...@googlegroups.com
Let me know what you find with the Go profiling tools. I haven't even touched profiling yet so I'm sure there's a lot of room for improvement.

I also like the idea of modularizing the importer. I want to try to make it dead simple at first though. I feel like the old Ruby importer became over-architected. Streaming sounds cool though. Are you thinking of basically tailing the Apache/nginx log?


Ben

Ben Johnson

Mar 28, 2013, 8:37:23 PM
to Edward Middleton, sk...@googlegroups.com
The Go client for Sky is up on GitHub:


Documentation can be found on godoc.org:


You can also find some usage in the test cases. It mostly mirrors the Ruby client. Here's the basic usage overview:

// Create a client and grab a table reference.
client := sky.NewClient("localhost")
table, err := client.GetTable("users")

// Create properties on the table.
table.CreateProperty(sky.NewProperty("action", true, sky.Factor))
table.CreateProperty(sky.NewProperty("gender", false, sky.Factor))

// Create an event for an object, then retrieve and delete it.
timestamp := time.Now()
data := map[string]interface{}{"action": "view home page", "gender": "male"}
table.AddEvent("jsmith", sky.NewEvent(timestamp, data), sky.Merge)
event, err := table.GetEvent("jsmith", timestamp)
table.DeleteEvent("jsmith", event)

The query interface is very raw right now. In fact, it's even named table.RawQuery(). It's mainly meant to be a pass through for queries sent as JSON. If you really want to build a query in Go then check out the table_test.go file for an example.

I'll get a GitHub Archive loader put up in short order and update Skybox to work with v0.3.0 after that.


Ben
