Preface: 0.2 and 0.3 of Eve are implemented with a JS runtime. Because of this, we've inherited some limitations from the host language (and also picked up a few others as necessary trade-offs for performance). We're working on a non-JS runtime for 0.4 that will lift some of these (and also provide a big performance boost in general). Initial testing suggests that we can package the non-JS runtime using Emscripten so that it can still be used in the browser while maintaining most of those beneficial properties.
1) In-memory. We'll have on-disk persistence in the near-ish future now that we've stabilized the semantics and are finishing the new runtime implementation. There are some interesting options for offloading relatively-unused data to disk, but the performance hit for querying anything that isn't in-memory is immense, so it's pretty low priority.
2a) As the above implies, however much RAM you have. All values in Eve are interned, so technically you could exhaust the interned value space, but (in the JS implementation) you'd need about 2^53 discrete strings (JS's maximum safe integer is 2^53 - 1) before that becomes a problem. That's easily fixed with a runtime built on top of Rust. There's also a theoretical limit on the number of attributes that can be part of the identity of a record (again, specific to the JS implementation), but you'd need roughly 2^29 bytes' worth of identity-contributing values on a single record to break that (that's about the maximum string length in V8; the exact cap varies by engine).
2b) Starting in 0.4 you should see reliably good performance irrespective of the DB state, so long as your queries are reasonably specific and not huge. In 0.2, total DB size seriously impacted performance. 0.3 greatly reduced that, and in 0.4 it's almost irrelevant (performance scales almost purely with the number of partially matching records instead). The difference between 100K and 100M facts in the DB is a couple hundred nanoseconds.
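To make the interning limit from 2a a bit more concrete, here's a rough sketch of what interning looks like in general. This is not Eve's actual runtime code; `Interner`, `intern`, and `resolve` are made-up names for illustration. Each distinct string gets a numeric ID once, and the last few lines show why the ID space stops being usable past 2^53 in JS.

```typescript
// Hypothetical sketch of value interning; NOT Eve's actual implementation.
class Interner {
  private ids = new Map<string, number>();
  private values = new Map<number, string>();
  private nextId = 0;

  // Return the existing ID for a value, or assign the next free one.
  intern(value: string): number {
    const existing = this.ids.get(value);
    if (existing !== undefined) return existing;
    const id = this.nextId++;
    this.ids.set(value, id);
    this.values.set(id, value);
    return id;
  }

  // Look up the original value for an ID.
  resolve(id: number): string | undefined {
    return this.values.get(id);
  }
}

const interner = new Interner();
const a = interner.intern("person");
const b = interner.intern("employee");
const c = interner.intern("person"); // same string, so same ID as `a`

// Past 2^53, a JS number can no longer tell adjacent integers apart,
// so a plain numeric counter can't hand out unique IDs beyond that point.
const limit = Number.MAX_SAFE_INTEGER; // 2^53 - 1
const collides = limit + 1 === limit + 2; // true
```

In practice you'd run out of memory long before the counter got anywhere near that limit, which is why RAM is the real constraint.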
3) We haven't provided any special substring indexes, but you can use standard library functions for filtering as you'd expect. If that substring is the only differentiator between millions of records matching or not, that's going to be exactly as expensive as doing it in Python without an index. That said, watchers being able to maintain their own incremental indexes is definitely a possibility down the road, but it's not a priority yet. In the interim, you could probably build a Lucene watcher that mirrors the relevant Eve state there for a specific use case pretty easily.
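For a feel of the trade-off in 3, here's a minimal sketch comparing an unindexed substring scan with a small incrementally maintained trigram index. None of this is Eve or watcher API; `Rec`, `scanSubstring`, and `TrigramIndex` are hypothetical names, and a trigram index is just one technique picked for illustration (something like Lucene would do far more).

```typescript
// Hypothetical record shape; stands in for whatever state a watcher might mirror.
interface Rec { id: number; text: string; }

// Unindexed search: every record is inspected, so cost grows with record count.
function scanSubstring(records: Rec[], needle: string): number[] {
  return records.filter(r => r.text.includes(needle)).map(r => r.id);
}

// A small trigram index, updated incrementally as records are added or removed.
class TrigramIndex {
  private postings = new Map<string, Set<number>>();
  private texts = new Map<number, string>();

  private static trigrams(s: string): string[] {
    const out: string[] = [];
    for (let i = 0; i + 3 <= s.length; i++) out.push(s.slice(i, i + 3));
    return out;
  }

  add(rec: Rec): void {
    this.texts.set(rec.id, rec.text);
    for (const g of TrigramIndex.trigrams(rec.text)) {
      let set = this.postings.get(g);
      if (!set) {
        set = new Set<number>();
        this.postings.set(g, set);
      }
      set.add(rec.id);
    }
  }

  remove(id: number): void {
    const text = this.texts.get(id);
    if (text === undefined) return;
    for (const g of TrigramIndex.trigrams(text)) this.postings.get(g)?.delete(id);
    this.texts.delete(id);
  }

  // Intersect the posting sets for the needle's trigrams, then verify each candidate.
  search(needle: string): number[] {
    const grams = TrigramIndex.trigrams(needle);
    let candidates: number[];
    if (grams.length === 0) {
      // Needles shorter than 3 characters fall back to checking every record.
      candidates = Array.from(this.texts.keys());
    } else {
      const sets = grams.map(g => this.postings.get(g) ?? new Set<number>());
      sets.sort((x, y) => x.size - y.size);
      candidates = Array.from(sets[0]).filter(id => sets.every(s => s.has(id)));
    }
    return candidates
      .filter(id => (this.texts.get(id) ?? "").includes(needle))
      .sort((x, y) => x - y);
  }
}

const records: Rec[] = [
  { id: 1, text: "incremental index" },
  { id: 2, text: "in-memory store" },
  { id: 3, text: "index maintenance" },
];
const index = new TrigramIndex();
records.forEach(r => index.add(r));

const scanned = scanSubstring(records, "index");
const indexed = index.search("index");
index.remove(1); // incremental update, no rebuild
const afterRemove = index.search("index");
```

The scan touches every record on every query, while the index only touches records that share all of the needle's trigrams, at the cost of extra memory and bookkeeping on each add and remove.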
4)
```
prefix = "my-custom-prefix"
[tag: "{{prefix}}/my-tag"]
```
5)
size of strings: irrelevant unless you're using string manipulating functions
tags per record: minor impact on overall query performance in 0.3 (though you'll probably hurt performance more with too few tags), effectively no impact in 0.4, excepting that additional specificity lets Eve cheaply reduce the scope of queries.
6) I guess 2^53 (in the JS runtime), since tags are interned as strings.
Hope that helps. We'll also release more specific performance information about 0.4 as we go. We're pretty excited about it.