Eve specs

Joseph

May 21, 2017, 3:35:33 AM
to Eve talk
Hi.

It would be useful to review the current specs for Eve, in order to assess what I can currently do with Eve and what to expect in the future. These are questions such as:
1. Does Eve use an in-memory database or is the database stored in mass storage?
2. What are the limitations in terms of size of records, size of strings, number of records that Eve can handle
a) in absolute terms,
b) with reasonable search / response time?
3. Is it possible to search records by substrings contained in a string?
4. Is it possible to programmatically convert a string into a tag? 
5. How does the size of strings / number of tags per record affect search / response time?
6. Is there an absolute limit on the number of tags per record?

If (some of) these questions have been discussed elsewhere, I would appreciate pointers.

Thanks.

Josh Cole

May 22, 2017, 7:09:36 PM
to Eve talk
Preface: 0.2 and 0.3 of Eve are implemented with a JS runtime. Because of this, we've inherited some limitations from the host language (and also picked up a few others as necessary trade-offs for performance). We're working on a non-JS runtime for 0.4 that will lift some of these (and also provide a big performance boost in general). Initial testing suggests that we can package the non-JS runtime using Emscripten so that it can still be used in the browser while maintaining most of those beneficial properties.

1) In-memory. We'll have on-disk persistence in the near-ish future now that we've stabilized the semantics and are finishing the new runtime implementation. There are some interesting options for offloading relatively-unused data to disk, but the performance hit for querying anything that isn't in-memory is immense, so it's pretty low priority.

2a) As the above implies, you're limited by however much RAM you have. All values in Eve are interned, so technically you could exhaust the interned value space, but (in the JS implementation) you'll need about 2^53 (the limit of safe integers in JS) discrete strings before that becomes a problem. That's easily fixed with a runtime built on top of Rust. There's also a theoretical limit on the number of attributes that can be part of the identity of a record (again, specific to the JS implementation), but you'll need 2^29 (maximum string length in JS) bytes' worth of identity-contributing values on a single record to break that.
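
For readers unfamiliar with interning, here is a minimal Python sketch of the general idea: each distinct value is stored once and referred to by an integer ID, so the ID space is what could in principle run out. This is not Eve's implementation; the `Interner` class, its method names, and the `max_id` parameter are hypothetical, and the default cap simply mirrors JavaScript's `Number.MAX_SAFE_INTEGER`.

```python
# Hypothetical sketch of value interning; not Eve's actual implementation.
JS_MAX_SAFE_INTEGER = 2**53 - 1  # Number.MAX_SAFE_INTEGER in JavaScript


class Interner:
    """Maps each distinct value to a small integer ID, and back."""

    def __init__(self, max_id=JS_MAX_SAFE_INTEGER):
        self._ids = {}      # value -> ID
        self._values = []   # ID -> value
        self._max_id = max_id

    def intern(self, value):
        # Reuse the existing ID if this value has been seen before.
        existing = self._ids.get(value)
        if existing is not None:
            return existing
        new_id = len(self._values)
        if new_id > self._max_id:
            raise OverflowError("interned value space exhausted")
        self._ids[value] = new_id
        self._values.append(value)
        return new_id

    def lookup(self, value_id):
        # Recover the original value from its ID.
        return self._values[value_id]
```

Interning the same string twice returns the same ID, so comparisons and joins can work on integers rather than on the strings themselves.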

2b) Starting in 0.4 you should see reliably good performance irrespective of the DB state, so long as your queries are reasonably specific and not huge. In 0.2, total DB size seriously impacted performance. 0.3 greatly reduced that, and in 0.4 it's almost irrelevant (performance scales almost purely with the number of partially matching records). The difference between 100K and 100M facts in the DB is a couple hundred nanoseconds.

3) We haven't provided any special substring indexes, but you can use standard library functions for filtering as you'd expect. If that substring is the only differentiator between millions of records matching or not, that's going to be exactly as expensive as doing it in Python without an index. That said, watchers being able to maintain their own incremental indexes is definitely a possibility down the road, but it's not a priority yet. In the interim, you could probably build a Lucene watcher that mirrors the relevant Eve state there for a specific use case pretty easily.
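
To make the "Python without an index" comparison concrete, a rough sketch of an unindexed substring filter might look like the following. The record layout (plain dicts) and the `find_by_substring` helper are assumptions for illustration only and say nothing about how Eve stores records.

```python
def find_by_substring(records, attribute, needle):
    """Return every record whose `attribute` value contains `needle`.

    There is no index, so every record is inspected once: the cost grows
    with the number of candidate records, not with the number of matches.
    """
    return [
        record for record in records
        if needle in str(record.get(attribute, ""))
    ]


# Made-up sample data.
records = [
    {"tag": "person", "name": "Joseph"},
    {"tag": "person", "name": "Josh Cole"},
    {"tag": "pet", "name": "Rex"},
]
matches = find_by_substring(records, "name", "Jos")
```

If another condition (a tag, say) narrows the candidates first, the scan only has to cover that smaller set.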

4) Yes. Tags are just strings, so you can build one with string interpolation:

```
prefix = "my-custom-prefix"

[tag: "{{prefix}}/my-tag"]
```

5)
Size of strings: irrelevant unless you're using string-manipulating functions.
Tags per record: minor impact on overall query performance in 0.3 (though you'll probably hurt performance more with too few tags), and effectively no impact in 0.4, except that additional specificity lets Eve cheaply reduce the scope of queries.
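
As a rough illustration of why extra tags can narrow a query instead of slowing it down, here is a small Python sketch of a tag index whose lookup intersects per-tag sets, starting from the rarest tag. This is a generic technique, not a description of Eve's runtime; `TagIndex` and its methods are hypothetical names.

```python
from collections import defaultdict


class TagIndex:
    """Hypothetical tag -> record-ID index; not Eve's actual data structure."""

    def __init__(self):
        self._by_tag = defaultdict(set)

    def add(self, record_id, tags):
        # Register the record under each of its tags.
        for tag in tags:
            self._by_tag[tag].add(record_id)

    def query(self, tags):
        """Return the IDs of records carrying all of the given tags."""
        # Start from the rarest tag so each later step touches few candidates.
        sets = sorted((self._by_tag.get(tag, set()) for tag in tags), key=len)
        if not sets:
            return set()
        result = set(sets[0])
        for other in sets[1:]:
            result &= other
        return result
```

In this sketch, the work done by `query` is bounded by the smallest per-tag set involved, so a more specific query never has to look at more candidates than a less specific one.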

6) I guess 2^53 (in the JS runtime), since tags are interned as strings.

Hope that helps. We'll also release more specific performance information about 0.4 as we go. We're pretty excited about it.

Joseph

May 22, 2017, 7:21:12 PM
to Eve talk
Thanks for the detailed replies.

This is starting to look very exciting! I'm very much looking forward to 0.4.

As for alternatives to in-memory databases:

Google is releasing (portions of?) Firebase as open source.

RethinkDB was open sourced a while ago.

Would adopting one of those be an option?


Josh Cole

May 23, 2017, 6:22:13 PM
to Eve talk
Hey Joseph,

Eve is actually both a language and a database. The programs you write in Eve are essentially a big set of queries, so performance that'd be acceptable for a database (which is queried sparingly) is unacceptably slow for us. Because we were focused on building a highly performant query engine, we haven't added persistence yet (which makes it a little harder to see the database story). If we were to try to turn Eve into just a query language on top of Firebase or SQL, we'd be equally slow (actually much slower, since providing our semantics on top of a system not designed for them would also be quite expensive).

Joseph

May 23, 2017, 6:46:52 PM
to Eve talk
Got it. The objective is different. You need a tighter integration with the language, and a database designed to perform in this context. 

Thanks.