Hello everyone,
First, thank you for Newebe. I haven't tested it yet, but I already
like its philosophy very much. I also believe that everyone's data
should sit on their own computer, sharing with others only what is
needed, and that storing everyone's life in a central place is a bad
design. Just like IRL. Thus, the approach is very relevant to me.
I have a few questions/notes on the overall implementation:
1/
When a node has some activity related to a contact, it sends it to
the contact and stores it in the database. If, at any moment, a
long-asleep contact wakes up, he fetches all the information he has
missed.
I was thinking that it would be more efficient to store the
activities directly in the DB, and let the participants of each
activity replicate them, instead of sending them to everyone. The
fetching of missed information is already implemented anyway.
Going further, we could let CouchDB handle it directly. It has a
wonderful `_changes` API that gives you every change in the database
since any update sequence in the past, and can feed you continuously.
With that, we could store all the participants' URLs in a field
alongside the activity, and let them retrieve what's related to them
when they want.
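
To make the idea a bit more concrete, here is a minimal sketch of the
pulling side in Python. It only builds the `_changes` URL and filters
the rows of a feed; the `participants` field name, the `newebe`
database name and the example hosts are assumptions made up for
illustration, not Newebe's actual schema.

```python
import json
from urllib.parse import urlencode


def changes_url(base_url, db, since=0, continuous=False):
    """Build a `_changes` URL that resumes after update sequence `since`."""
    params = {"since": since, "include_docs": "true"}
    if continuous:
        params["feed"] = "continuous"  # keep the connection open
    return f"{base_url}/{db}/_changes?{urlencode(params)}"


def activities_for(me, feed_lines):
    """Yield (seq, doc) for rows whose doc lists `me` as a participant.

    `feed_lines` is any iterable of lines from a continuous feed,
    where each non-empty line is one JSON object.
    """
    for line in feed_lines:
        line = line.strip()
        if not line:  # blank lines are heartbeats
            continue
        row = json.loads(line)
        doc = row.get("doc") or {}
        # "participants" is an assumed field name, not Newebe's schema.
        if me in doc.get("participants", []):
            yield row["seq"], doc
```

A contact would remember the last `seq` it has seen and pass it as
`since` on its next visit. Note that the filtering here happens on the
puller's side, so it is a convenience and not access control.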
This approach poses two problems:
- I don't know if we can listen to many `_changes` feeds in parallel.
Even if a user typically doesn't have 100k contacts, we can assume
that around 100 is reasonable.
- Authentication: nobody should be able to retrieve a contact's
activities unless he is one of the participants. There are
some
But it also has some nice points :
- It doesn't require a 99.9% uptime. This is already managed by asking
a contact for all missed information, so why not reuse it?
- It only requires pulls, and no pushes. This eliminates the need for
routing/opening ports, and makes it much easier to set up and use
(think about mobile use, or using a public hotspot).
- What I have in mind is a typical user who runs Newebe on his
computer more as a client than as a server (no uptime needed, no
maintenance, no port forwarding, etc.), just like he would launch his
MUA, which fetches all the mail (-> activities) from his mail server
(-> contacts). This is much more resilient to crashes too =]
- Suppose A has an activity with B and C. At the moment, A sends it to
B and C. The message is not considered delivered until all the
recipients have received it (or so I think). In the reactive approach
I was thinking about, B pulls from A, and C can pull from B or A; if A
stops after B has pulled the activity, C can still retrieve it from B.
After all, it's exactly the same activity for everyone (the
participants are the same, the news/picture/note is the same).
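
The A/B/C scenario can be simulated with a toy in-memory model. This
is only a sketch of the pull logic: the `Node` class and its methods
are hypothetical, and a real setup would go through CouchDB instead of
Python dicts.

```python
class Node:
    """In-memory stand-in for a Newebe node and its local database."""

    def __init__(self, name):
        self.name = name
        self.online = True
        self.docs = {}  # doc id -> activity document

    def publish(self, doc_id, participants, body):
        """Store an activity locally; nothing is pushed to anyone."""
        self.docs[doc_id] = {"participants": list(participants), "body": body}

    def pull(self, peers):
        """Copy activities that involve this node from any reachable peer."""
        fetched = []
        for peer in peers:
            if not peer.online:
                continue  # an offline peer is simply skipped
            for doc_id, doc in peer.docs.items():
                if self.name in doc["participants"] and doc_id not in self.docs:
                    self.docs[doc_id] = doc
                    fetched.append(doc_id)
        return fetched


a, b, c = Node("A"), Node("B"), Node("C")
a.publish("note-1", ["A", "B", "C"], "hello")
b.pull([a])        # B fetches the activity from A
a.online = False   # A stops before C has pulled
c.pull([a, b])     # C still gets it, from B
```

Since the activity is the same document for every participant, it does
not matter which peer C copies it from.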
2/
I see that the data is distributed among multiple databases in
CouchDB. Is there a reason for that? Instinctively, I prefer stashing
everything in one database and telling documents apart with a
`doc_type` field, which is already done anyway.
3/
I see that we use Node.js for developing client-side stuff. Is it
used only as a development tool, and not as a brick of the final
software stack? This is what I understood.
4/
I see in some of your views that you emit the full `doc` as a value.
The usual tip is to emit a `null` value and retrieve the doc at query
time using `&include_docs=true`. This avoids storing the doc twice,
once in the database _and_ once in the view index. The little added
latency is negligible.
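
As an illustration, here is a sketch of what such a view and its query
could look like, written as Python data so it can be uploaded with any
HTTP client. The design document name `activities`, the view name
`by_date`, the `date` field and the `'activity'` value of `doc_type`
are all assumptions; only the `emit(key, null)` plus `include_docs`
pattern is the point.

```python
import json
from urllib.parse import urlencode

# Hypothetical design document: the names "activities" and "by_date",
# the "date" field and the 'activity' type are made up for illustration.
DESIGN_DOC = {
    "_id": "_design/activities",
    "views": {
        "by_date": {
            # The map function is JavaScript, stored as a string.
            "map": (
                "function (doc) {\n"
                "  if (doc.doc_type === 'activity') {\n"
                "    emit(doc.date, null);\n"  # key only, no doc copy
                "  }\n"
                "}"
            )
        }
    },
}


def view_url(base_url, db, design, view, **params):
    """Build a view query URL, asking CouchDB to attach the documents."""
    params.setdefault("include_docs", "true")
    return f"{base_url}/{db}/_design/{design}/_view/{view}?{urlencode(params)}"


def docs_from(view_response):
    """Extract the documents from a decoded view response."""
    return [row["doc"] for row in view_response["rows"]]


design_json = json.dumps(DESIGN_DOC)  # body to PUT to /<db>/_design/activities
```

With `include_docs=true`, each row of the response carries a `doc`
member next to its `null` value, so the calling code barely changes.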
---
As you may have seen, I lean towards CouchDB, mainly
because I like it (and I have made a few toys to play with it, too:
see
https://github.com/rakoo/MultiBin or
https://github.com/rakoo/dml).
In fact, I think that the approach of Newebe is essentially what drove
the development of CouchDB, and that most of its functionality can be
directly provided by it.
Anyway, keep up the good work!
--
Matthieu Rakotojaona