Hi,
I'm evaluating Rama for our company's use cases (finance), and I have a few related questions from building a small prototype (in Clojure):
1. What is the idiomatic way to create a PState that is derived from another PState? I'm trying to represent an orderbook as a PState, and to have another PState store timeseries features (e.g. {DateTime Float}) computed from the orderbook state every D minutes (let's assume D=5). This ETL should be able to reconstruct past feature values during depot history replay, so it can't use the wall clock as a trigger. I see two approaches:
a) subscribe to the orderbook PState's `updated-at` timestamp and trigger the computation whenever t_prev < floor(t_cur, "5min") <= t_cur. This seems to require the reactive capabilities of a query topology (e.g. diffs), and I'm not sure whether it's even possible to write to PStates from query topologies?
b) do a regular transform like `(local-transform> [(keypath (t/floor *orderbook-updated-at "5min")) (termval (calc-some-feature *orderbook))] $$some-feature)`, which would work if orderbook snapshots arrived from a depot every 5 minutes, but I'm not sure whether an ETL can subscribe to a PState. Or maybe I could periodically write PState snapshots to a new depot?
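For concreteness, here is a rough sketch of what I imagine (b) looking like. `*orderbook-snapshots`, `t/floor`, and `calc-some-feature` are hypothetical names from my prototype, and I'm assuming snapshots somehow arrive on the depot roughly every 5 minutes:

```clojure
;; Sketch only — *orderbook-snapshots, t/floor, and calc-some-feature are
;; placeholders, not real Rama API.
(defmodule FeatureModule [setup topologies]
  (declare-depot setup *orderbook-snapshots :random)
  (let [s (stream-topology topologies "features")]
    ;; bucket-start timestamp -> feature value
    (declare-pstate s $$some-feature {java.time.Instant Double})
    (<<sources s
      (source> *orderbook-snapshots :> *snapshot)
      (get *snapshot :orderbook :> *orderbook)
      (get *snapshot :updated-at :> *orderbook-updated-at)
      (local-transform>
        [(keypath (t/floor *orderbook-updated-at "5min"))
         (termval (calc-some-feature *orderbook))]
        $$some-feature))))
```

The open question is how the snapshots would get into `*orderbook-snapshots` in the first place without depending on the wall clock.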
2. Suppose there are N almost-identical streams of external events. We'd like to merge them into a single stream, deduplicating events that share an ID. All other entities in our system should only know about the final, merged stream:
a) Do I understand correctly that partitioning is purely an optimization strategy and that partitions should not carry semantic meaning: namely, that we should create a separate depot for each pre-merge stream instead of assigning each stream its own partition of a single depot?
b) Could we write the final merged stream into a new dedicated depot for other ETLs to source, or is it necessary to abstract at the level of dataflow functions/macros instead?
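To make 2b concrete, this is roughly what I had in mind: a `$$seen-ids` PState for dedup (events are partitioned by ID, so the check can stay local), with the part I don't know how to express left as a comment. All depot and event-field names here are made up, and I've shown only one of the N source depots:

```clojure
;; Sketch only — depot names and event shape are hypothetical. In the real
;; module there would be one such depot (and source> block) per pre-merge
;; stream, presumably factored out with a macro/function.
(defmodule MergeModule [setup topologies]
  ;; partition by event ID so duplicates land on the same task
  (declare-depot setup *source-a (hash-by :id))
  (let [s (stream-topology topologies "merge")]
    (declare-pstate s $$seen-ids {String Boolean})
    (<<sources s
      (source> *source-a :> *event)
      (get *event :id :> *id)
      (local-select> [(keypath *id)] $$seen-ids :> *seen?)
      (<<if (not *seen?)
        (local-transform> [(keypath *id) (termval true)] $$seen-ids)
        ;; question 2b: can the deduplicated *event be appended to a
        ;; dedicated *merged-events depot here, for other ETLs to source?
        ))))
```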
Thank you for your time.