Proposed Roadmap for Iteration 1 (10 Weeks)


Jon Gosier

Dec 18, 2009, 12:42:34 PM
to swift...@googlegroups.com
So one of the first things I've laid out is what we need to accomplish by when.  We need a working prototype by March 30 at the latest.  So here are my thoughts on how to move forward on a first iteration...

Currently I'm working on the central NLP, which I feel should be in the cloud so all Swift 'instances' can leverage accumulated logic that we build from parsing feeds over time.  I'll cover my thoughts and ideas on scaling databases and architecture in another email.

Feedback, advice or comments are totally encouraged.
2 wks predictive tagging (Jan 13 - 25)
    - code for parsing incoming feeds (iteration 1: headlines only), caching into a central database and extracting meta information that can be used for our tags
    - sits in the cloud, all swift feeds pass through it to build up the natural language filter vocabulary
        - users should be able to add custom 'glossaries' for a swift instance (not in iteration 1)
        - users would download their swift instance, point it to the cloud and designate sources
    - interface design in parallel
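
As a rough illustration of the headline-parsing step above, here is a minimal Python sketch. Everything in it is an assumption made for the example: the names `extract_tags` and `Vocabulary`, the tiny stopword list, and the in-memory counter standing in for the central database. It only shows the idea of every feed passing through one shared vocabulary that then ranks candidate tags.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real deployment would need a fuller one.
STOPWORDS = {"a", "an", "the", "in", "on", "of", "to", "for", "and", "is", "at", "after"}

def extract_tags(headline):
    """Return lowercase candidate tags from one headline, stopwords removed."""
    words = re.findall(r"[a-z0-9']+", headline.lower())
    return [w for w in words if w not in STOPWORDS and len(w) > 2]

class Vocabulary:
    """In-memory stand-in for the central store shared by all instances."""

    def __init__(self):
        self.counts = Counter()

    def ingest(self, headlines):
        """Count each candidate tag once per headline it appears in."""
        for headline in headlines:
            self.counts.update(set(extract_tags(headline)))

    def suggest(self, headline, limit=3):
        """Suggest tags for a headline, most frequently seen first."""
        tags = set(extract_tags(headline))
        return sorted(tags, key=lambda t: (-self.counts[t], t))[:limit]

vocab = Vocabulary()
vocab.ingest([
    "Flood warning in Nairobi",
    "Nairobi flood relief begins",
    "Election results in Kampala",
])
print(vocab.suggest("Flood hits Nairobi after storm"))
```

Tags the shared vocabulary has seen often rank first, so an instance that points at the cloud benefits from headlines parsed for every other instance.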

6 wks trust algorithms (Jan 25 - Mar 19)
    - user can designate news sources
    - user can open the 'flood' from a large list of sources we aggregate feeds from to find content
    - voting/sweeping
        - voting up tags that the 'trusted' users are voting down decreases trust points (and vice versa)
        - voting up tags that trusted users are voting up builds credibility
    - news sources also build or lose 'trust' as users interact with items that have originated from that source
    - we'll end up needing several complex algorithms.  One for users, one for verified users (editors) and one for organizations
    - web interface design in parallel
    - design and build downloadable swift instance
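
One way the voting rule above could be sketched in Python. This is only an illustration under assumptions: the `User` class, the `apply_vote` function, the fixed `STEP` size, and the use of a simple majority of trusted votes as the "consensus" are all hypothetical and not part of the plan itself.

```python
from dataclasses import dataclass

STEP = 1.0  # hypothetical number of trust points gained or lost per vote

@dataclass
class User:
    name: str
    trust: float = 0.0

def apply_vote(voter, vote, trusted_votes):
    """Adjust voter.trust by comparing a vote (+1 up, -1 down) on a tag
    with the votes already cast on that tag by trusted users."""
    consensus = sum(trusted_votes)
    if consensus == 0:
        return voter.trust  # trusted users are split, so nothing changes
    agrees = (vote > 0) == (consensus > 0)
    voter.trust += STEP if agrees else -STEP
    return voter.trust

newcomer = User("newcomer")
apply_vote(newcomer, +1, [+1, +1, -1])  # sides with trusted users: gains trust
apply_vote(newcomer, +1, [-1, -1])      # votes against them: loses trust
print(newcomer)
```

The same comparison could be applied to a news source by treating each user interaction with one of its items as a vote, though separate algorithms for users, editors and organizations would presumably weight things differently.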

2 wks (Mar 19 - 30)
    - final interface design and integration
    - testing
Beyond this point the focus will shift to improving on components and concepts developed during the initial 10-week sprint.  This way we have a fully functional, scalable product moving forward.

Kaushal Jhalla

Dec 18, 2009, 1:13:55 PM
to swift...@googlegroups.com
Jon, definitely have a look at the roadmap that we have up currently. A lot of your outline is very much in sync with this which is great :) http://wiki.github.com/ajturner/swiftriver/roadmap

Kaushal


Jon Gosier

Dec 18, 2009, 2:05:22 PM
to swift...@googlegroups.com
I have indeed, Kaushal, and basically I've just condensed my own work into a definitive two month timeframe around what's in place already.  So, first, I'm building a centralized architecture around what we've got.  It would be really helpful to have the user installation continue development in parallel. 

Our goal is to have an intuitive user interface that talks to that central resource as soon as possible.  Since we do have a deadline, it's important that we have some dedicated resources (attention span, etc.) leading up to it.  So if there are people with the time and ability, we can definitely support you to help move things along; just let me know what you can or can't contribute, by when, and what your requirements are from the team.  Happy to have anyone on the list contribute so that we can set some definitive milestones.

Our specific 'mandate' is the ability to filter through Twitter and SMS streams first; everything else should be prioritized for iteration 2 or 3.

Keep in mind these aren't our version numbers.  I estimate we get to version 1 around Nov of 2010 but possibly faster with additional contributions.

Jon

Steve Andersen

Dec 20, 2009, 11:46:26 AM
to swift...@googlegroups.com
My name is Steve Andersen--I'm a coder and technical PM at Salesforce.com Foundation. I'm happy to help however I can. I won't be able to contribute any code on this aggressive timeline, but I'm happy to help with docs, wikis, plans, etc.

Thanks,

Steve

Steve Andersen
Salesforce.com Foundation
Solutions Architect 

Chris Blow

Dec 20, 2009, 9:10:43 PM
to swift...@googlegroups.com
Welcome Steve!

Wonderful to have your help.

Jon and I are getting together in person soon to do more brainshare and planning; we will have specific todos in the coming weeks.


c

Gabriele Sani

Dec 21, 2009, 9:52:03 AM
to swift...@googlegroups.com
Hi,

my two cents on the fantastic developments planned:

If you plan on using Bayesian categorization, I would suggest running the raw text through a Shannon information theory-like filter to identify the most relevant words in a text. With even a mild cut on the relevancy you can reduce the index size while increasing the overall quality of the matches... and all this would be language-independent.
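
A minimal Python sketch of one possible reading of this filter, assuming "relevance" means a word's self-information in bits, -log2(df / N), where df is the number of documents containing the word and N is the number of documents. The function names and the `min_bits` cut are hypothetical. Words that appear everywhere carry close to zero bits and are dropped, with no language-specific stopword list involved.

```python
import math
from collections import Counter

def word_information(documents):
    """Map each word to its self-information in bits, -log2(df / N)."""
    n = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc.lower().split()))
    return {word: -math.log2(count / n) for word, count in doc_freq.items()}

def filter_relevant(text, info, min_bits=1.0):
    """Keep words at or above the cut; unseen words are treated as informative."""
    return [w for w in text.lower().split() if info.get(w, math.inf) >= min_bits]

corpus = [
    "the flood in nairobi",
    "the election in kampala",
    "the flood relief",
    "the road is closed",
]
info = word_information(corpus)
print(filter_relevant("the flood in nairobi", info, min_bits=1.5))
```

Raising `min_bits` shrinks the index further at the cost of discarding more mid-frequency words, which is the "cut on the relevancy" trade-off.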

Regarding the tagging as trusted or not trusted: having trusted editors is always good, but then you risk not being able to scale, and being attacked for enforcing a left/right/religious/atheist/whatever point of view. What I would love to see is a system that correlates info, and then lets users understand it. For example: I have A, B, C, D, and E submitting reports. A, B and C tell me that the sun is yellow and the grass is green, D tells me that the sun is red and the grass is blue, E tells me that the sun is red and grass is yellow. The system will cluster A, B, C, and give me a value that determines the cluster veracity as a function of the veracity of the 3 people submitting the reports, while it shows that D agrees mildly with them while E doesn't on any point. As a user, I can see a computed veracity that will point me to the most likely truthful reports, but if I know for a fact that the grass in that region is yellow, as E states, then maybe I will trust E more than the others. This system would offer several advantages: besides lowering the challenge of identifying experts in the field in a short time, it would show who departs more often from the truth, and allow users to choose their "side" of the truth, while being aware of other points of view.
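
The clustering idea can be sketched in Python as follows. This is an illustration under assumptions, not a proposed design: reporters are grouped only when their claims match exactly, cluster veracity is computed as 1 minus the product of each member's doubt (a "noisy-or" choice, one of many possible functions), and the per-reporter veracity scores are invented for the example.

```python
from collections import defaultdict

def cluster_reports(reports):
    """Group reporters whose claims are identical."""
    clusters = defaultdict(list)
    for reporter, claims in reports.items():
        clusters[tuple(sorted(claims.items()))].append(reporter)
    return [sorted(members) for members in clusters.values()]

def cluster_veracity(members, veracity):
    """Combine member scores as 1 - product(1 - v): corroboration raises the score."""
    doubt = 1.0
    for member in members:
        doubt *= 1.0 - veracity[member]
    return 1.0 - doubt

def agreement(claims_a, claims_b):
    """Fraction of shared topics on which two sets of claims match."""
    shared = claims_a.keys() & claims_b.keys()
    if not shared:
        return 0.0
    return sum(claims_a[k] == claims_b[k] for k in shared) / len(shared)

reports = {
    "A": {"sun": "yellow", "grass": "green"},
    "B": {"sun": "yellow", "grass": "green"},
    "C": {"sun": "yellow", "grass": "green"},
    "D": {"sun": "red", "grass": "blue"},
    "E": {"sun": "red", "grass": "yellow"},
}
# Hypothetical prior veracity scores in [0, 1].
veracity = {"A": 0.6, "B": 0.6, "C": 0.6, "D": 0.5, "E": 0.5}

for members in cluster_reports(reports):
    print(members, round(cluster_veracity(members, veracity), 3))
print("D vs E:", agreement(reports["D"], reports["E"]))
```

With these made-up scores the A/B/C cluster comes out at roughly 0.936, well above any single reporter. Using the claims exactly as listed, exact matching puts D and E in their own clusters, and the only partial overlap is between D and E on the red sun; a softer similarity measure would be needed to express "mild" agreement with the main cluster.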

Cheers, and welcome Jon!!

Gabriele

