Hi Mat,
Filtering the content by tag (or in any other way) is technically
rather easy; the difficulty lies in the heavy lifting of the offline
scraping, that is, the computational power needed for the offline
part.
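
To give an idea, here is a minimal sketch in Python of what the
filtering itself amounts to, assuming the tiddlers are available as
simple records with a list of tags (the field names and data below
are just for illustration, not the aggregator's actual format):

    # Keep only the tiddlers carrying a given tag; "title" and "tags"
    # are illustrative field names, not the aggregator's real format.
    def filter_by_tag(tiddlers, tag):
        return [t for t in tiddlers if tag in t.get("tags", [])]

    sample = [
        {"title": "HelloThere", "tags": ["intro"]},
        {"title": "Tips", "tags": ["howto", "intro"]},
    ]
    print([t["title"] for t in filter_by_tag(sample, "howto")])  # -> ['Tips']

As you can see, the filtering itself is cheap; the expensive part is
everything upstream of it.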
This is not visible when you look at the resulting wiki, but behind
the scenes the (full) wikis are downloaded as standalone HTML pages
and then converted to the Node.js version: these steps alone take
about 10 to 15 minutes on my machine for around 30 wikis. I
configured a daily update, which is enough for the current usage of
the wiki, but that's a far cry from real-time updates, which is
probably what you would need for Twitter-like behaviour. And of
course the servers in charge would also have to be able to deal with
a much larger number of tiddlers. So basically I don't think the
method is really scalable, but of course if somebody wants to buy a
massive number of servers we could try! ;-)
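
For the record, the offline part boils down to something like the
sketch below (Python, very simplified; the URLs are placeholders and
the actual tw-aggregator scripts do things differently in the
details). I believe the TiddlyWiki CLI's --load and --savewikifolder
commands can handle the HTML-to-Node.js conversion, but don't take
this as the exact pipeline:

    # Simplified sketch: download each wiki as a standalone HTML file,
    # then convert it to a Node.js wiki folder with the TiddlyWiki CLI.
    # Requires a Node.js TiddlyWiki installation; URLs are placeholders.
    import subprocess
    import urllib.request

    WIKIS = ["https://example.org/somewiki.html"]  # placeholder list of source wikis

    for i, url in enumerate(WIKIS):
        html_path = f"wiki{i}.html"
        urllib.request.urlretrieve(url, html_path)         # download standalone HTML
        subprocess.run(["tiddlywiki", "--load", html_path,  # convert to Node.js format
                        "--savewikifolder", f"wikifolder{i}"],
                       check=True)

Running something like that over ~30 wikis once a day (e.g. from a
cron job) is fine; running it continuously over many more wikis is a
different story.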
Anyway, this is only my very "DIY" approach; maybe Jeremy or others
have ideas about how to do this in a smarter way. I'm not at all an
expert in this domain, to be honest :-)
Also, thank you to all the wiki authors for your agreement and to
everybody for the suggestions; this helps me figure out which
direction to take. I made a list of the next possible improvements at
https://github.com/erwanm/tw-aggregator/issues, so don't hesitate to
add to it or comment (but please be very patient for things to get
implemented!).
Regards
Erwan