Ehren, that's exactly the kind of brainstormy response that'll
encourage me to keep digging at this idea. Thank you!
In a couple of places you're describing the possibilities in terms of
either/or ("Should the emphasis be on linking within the social
actions corpus or outside of it?" "Publish those tags/links, or offer
them as fields on the API").
Do you think there'd be an opportunity to pursue both/and? In other
words,
-- link within the social actions corpus as an example of what can be
done in any dataset, no matter where it is, and encourage/support that
as much as possible too
-- publish the tags/links AND offer them as fields in the API (and
make sure the action sources know they can fold the data w/ tags etc.
back into what they use in-house any time they want, too)
Also, here's a suggested course of action culled from your post, which
we can add to and otherwise tweak/enhance as the ideas keep flowing:
1. Tag the social actions data as best we can, publish those tags, and
encourage others to link to it
2. Figure out how to pull the generalized topics from the actions so
it's easier to link actions to actions from different sources, easier
to link those actions to the canonical tags for those terms or topics,
and easier to provide feeds of actions of a particular topic (rather
than search term or source or type).
3. Try linking actions (or topics) with individuals who have published
linked data about themselves.
Any big pieces missing there? I'm also looking for input on "how huge
of an undertaking is this" and "what kind of group might like to fund
it." I guess those answers depend on the shape the project takes, e.g.
whether the tagging is automatic, volunteer-based, or Mechanical
Turk-ish...
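
To make the "automatic" option in step 2 a bit more concrete, here's a
rough sketch of the tf-idf idea from the first Wikipedia link in Ehren's
message. It is only an illustration, not what Ehren is actually
building: the function name top_terms, the tiny stopword list, and the
three sample action descriptions are all made up for this example. The
idea is that words frequent in one action but rare across the corpus
score highest, so they make reasonable candidate tags.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def top_terms(docs, k=3, stopwords=frozenset()):
    """Return the k highest-scoring tf-idf terms for each document.

    tf  = count of the term in the document / number of terms in it
    idf = log(number of documents / number of documents with the term)
    Ties are broken alphabetically so the output is deterministic.
    """
    tokenized = [[t for t in tokenize(d) if t not in stopwords] for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    results = []
    for tokens in tokenized:
        tf = Counter(tokens)
        total = len(tokens) or 1  # avoid dividing by zero on empty docs
        scores = {
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
        ranked = sorted(scores, key=lambda term: (-scores[term], term))
        results.append(ranked[:k])
    return results


# Hypothetical sample data, invented for this sketch.
STOPWORDS = frozenset({"to", "and", "the"})
SAMPLE_ACTIONS = [
    "Donate to plant trees and protect the forest",
    "Volunteer to help refugees find housing",
    "Sign the petition to protect refugees",
]

for action, terms in zip(SAMPLE_ACTIONS, top_terms(SAMPLE_ACTIONS, k=2, stopwords=STOPWORDS)):
    print(action, "->", terms)
```

Raw top terms like these would still need mapping onto consistent
topics ("environment", "refugees", "health"), which is where something
like LDA or a human/Mechanical Turk pass might come in.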
Christine
On Nov 2, 12:40 pm, Ehren Foss <ehren.f...@gmail.com> wrote:
> That's a tough question - should the emphasis be on linking within the
> social actions corpus or outside of it?
>
> Since the actions are already tagged as related to the action source
> and type, what comes to mind for me is topics. Topics like: actions
> about the environment, actions about refugees, actions about health.
> There are some techniques I've mentioned before, and am still working
> on, that would provide this service without requiring that the action
> sources mark them up themselves. E.g.
>
>
> http://en.wikipedia.org/wiki/Tf%E2%80%93idf
> http://en.wikipedia.org/wiki/Latent_Dirichlet_allocation
>
> If we can figure out how to pull the generalized topics from the
> actions, it would be easier to link actions to actions from different
> sources. It would also be easier to link those actions to the
> canonical tags for those terms or topics, and to provide feeds of
> actions of a particular topic (rather than search term or source or
> type).
>
>
> http://open.blogs.nytimes.com/2009/10/29/first-5000-tags-released-to-...
>
> If that goes well I'd recommend trying to then link actions (or
> topics) with individuals who have published linked data about
> themselves.
>
> Also, a big part of this is simply to tag the social actions data as
> best we can, publish those tags, and encourage others to link to it -
> saves a lot of work that way!
>
> So I guess in answer to your question, I'd recommend tagging the data
> by source, type, and to the best of our abilities "tags" or "topics"
> that are consistent across the corpus. Publish those tags/links, or
> offer them as fields on the API.
>
> Ehren
>
> On Mon, Nov 2, 2009 at 10:11 AM, Christine Egger
>