Update from informal 'hackathon'

Ehren Foss

Jan 13, 2010, 12:17:05 PM
to social-actions-dev
Hey there,

Jake and I had a codeathon on Sunday and I wanted to update you on
what happened. Help and additions are welcome on this and all our
projects; SVN access is available.

1. sadata (graphs)

http://sadata.preludeinteractive.com/site/graphs/all

I did some cosmetic work here: improving the X axis, widening the
graph to the maximum allowed by Google Charts, and adding the "latest
figures" section at the top. I also added a metric counting the total
number of actions each source has in the system. Some of the graphs
show a puzzling dive into negative numbers; I think it is an issue
with the y-axis labels.

2. satuner (change the web)

Jake helped me fix the satuner application from the Change the Web
contest. So this works again, but nobody uses it, which is not
surprising. It would be great to see whether the technology actually
works in the wild, and to do that we'd need a willing action source as
a participant, one that can record the clicks of individual users.
More on this in the next section...

3. "tfidf" tool

tfidf or "term frequency, inverse document frequency" is a simpler
approach than the one we had been trying for boiling down the 'topic'
of an action. It means you assign a score to words in an action based
on the number of times they are used in that action, but penalizing
them if they are used a lot in all the actions. So "giving" and
"help" get low scores but "Guatemalan" or "Microloan" might get a high
score in some actions. If we apply the same techniques to text about
people, I think it should be possible to match people with actions.
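
To make that concrete, here's roughly what the scoring looks like in
Ruby (a sketch for illustration, not code from the repo):

  # tfidf sketch: a word's score in one action is its count in that
  # action times log(total actions / actions containing the word), so
  # words that show up everywhere get pulled toward zero.
  def tfidf_scores(action_texts)
    docs = action_texts.map { |t| t.downcase.scan(/[a-z]+/) }
    doc_freq = Hash.new(0)
    docs.each { |words| words.uniq.each { |w| doc_freq[w] += 1 } }
    docs.map do |words|
      tf = Hash.new(0)
      words.each { |w| tf[w] += 1 }
      tf.map { |w, n| [w, n * Math.log(docs.size.to_f / doc_freq[w])] }.to_h
    end
  end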

Like the stuff behind satuner, we think the tfidf technique might be
useful and effective, but there's no way to know without testing it.
Does anyone know of any action sources that would be willing to
partner with the effort? We'd need to work with them to, say, select
an experimental and a control group of actions. One example: the
experimental group would be sent to a number of people whose tfidf
terms matched well, while the control group would be randomly sent to
the same number of people without regard to matching. At the end of
the test we'd see whether the experimentally treated actions were
resolved more quickly or received more responses than the control
group, and by how much.
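
As a toy example of the arithmetic at the end of that test (numbers
made up):

  exp_rate = 18.0 / 100   # say 18 of 100 matched actions got a response
  ctl_rate = 11.0 / 100   # vs. 11 of 100 randomly assigned actions
  puts "lift: #{((exp_rate - ctl_rate) / ctl_rate * 100).round}%"  # => lift: 64%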

With satuner, we'd need to work with someone who has a base of users
who search for and click on lots of actions over a measurable period
of time (say 10 actions a month). We'd feed their clicks into satuner
and, from then on, generate their search results from satuner's
suggestions. The control group would see the normal search results
without satuner. We'd see if there was any bump in usage or response.

I know it's asking a lot, but that's probably the next stage for these
two projects. Do you know of any orgs who would be interested?

Cheers,

Ehren

Peter Deitz

Jan 13, 2010, 4:39:07 PM
to social-actions
Hi Ehren,

1. sadata

As always, thank you for your incredible efforts to illuminate Social
Actions metrics. The X-axes on the SAData graphs are looking great.

A few observations:

- Not sure where the 'goal completion' data is coming from. As far as I
know, we don't have this kind of information for any of the action
sources, and yet all of the graphs are showing something for the two
Avg. Goal Completion charts.

- Similarly, the days since 'update' chart should be constant because
(for better or worse) we're only aggregating new content from action
sources. I'd be curious how you're distinguishing Avg. Days since
Update vs. Avg. Days since Creation. Would it be possible to add
another stat in the header of the results, below "Total Actions in
System:", that reads simply "Days since last update" (and then, in
parentheses, the date on which the action source was last updated)?

2. satuner and tfidf

I have an idea for a use case of the SAtuner / tfidf technology. For the
last several months, we have been focusing increasingly on the action
packs that Joe Solomon set up last year. You can see a full list of the
action packs here:

http://socialactions.com/actionpacks

We have run into exactly the problem that satuner and tfidf attempt to
solve... namely, that the content of each action pack doesn't always
relate to the topic the action pack is supposed to address. For
example, AidsActions sometimes shows actions with the keyword 'hearing
aids' instead of actions related to HIV/AIDS.

If you are able to use the tfidf technology to create intelligent feeds
for an action pack like @aidsactions or @climateactions, we can pass the
content through the action pack along with the basic (unintelligent)
feed that the action packs currently rely on. You could then compare
click-through rates of the same user community (i.e., people who follow
@aidsactions or @climateactions) to determine the effectiveness of the
system in providing accurate actions for that 'community of action'.

If this interests you, we would be incredibly interested in partnering
with you to improve the content in the action packs.

Let me know what you think.

All the best,
Peter

Social Actions
http://socialactions.com


Ehren Foss

Jan 14, 2010, 10:25:53 PM
to social-ac...@googlegroups.com
> - Not sure where the 'goal completion' data is coming from. As far as I
> know, we don't have this kind of information for any of the action sources,
> and yet all of the graphs are showing something for the two Avg. Goal
> Completion charts.

You're right - you mentioned this last time too. I added a check not
to print the graph if there are no data points.

> - Similarly, the days since 'update' chart should be constant because (for
> better or worse) we're only aggregating new content from action sources. I'd
> be curious how you're distinguishing Avg. Days since Update vs. Avg. Days
> since Creation. Would it be possible to add another stat in the header of
> the results, below "Total Actions in System:", that reads simply "Days since
> last update" (and then, in parentheses, the date on which the action source
> was last updated)?

I'm using the updated_at and created_at fields of the database. I
believe these are standard Ruby on Rails fields, which should indicate
when the action rows were added to the database and when they were
updated by the application. If these aren't actually indicative of
events related to the actions, I can remove the graph. The main point
was to show how 'stale' the actions are on average.
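
In case it's useful, the staleness number is computed along these
lines (a simplified sketch, not the exact code):

  # average age in days of a batch of created_at / updated_at values
  def avg_days_since(timestamps)
    return nil if timestamps.empty?
    days = timestamps.map { |t| (Time.now - t) / 86_400.0 }
    days.inject(0.0) { |sum, d| sum + d } / days.size
  end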

> 2. satuner and tfidf
>
> I have an idea for a use case of the SAtuner / tfidf technology. For the
> last several months, we have been focusing increasingly on the action packs
> that Joe Solomon set up last year. You can see a full list of the action
> packs here:
>
> http://socialactions.com/actionpacks
>
> We have run into exactly the problem that satuner and tfidf attempt to
> solve... namely, that the content of each action pack doesn't always relate
> to the topic the action pack is supposed to address. For example, AidsActions
> sometimes shows actions with the keyword 'hearing aids' instead of actions
> related to HIV/AIDS.
>
> If you are able to use the tfidf technology to create intelligent feeds for
> an action pack like @aidsactions or @climateactions, we can pass the content
> through the action pack along with the basic (unintelligent) feed that the
> action packs currently rely on.  You could then compare click-through rates
> of the same user community (i.e., people who follow @aidsactions or
> @climateactions) to determine the effectiveness of the system in providing
> accurate actions for that 'community of action'.
>
> If this interests you, we would be incredibly interested in partnering with
> you to improve the content in the action packs.

I think this interests us very much, and I think the satuner
technology could be retooled for this purpose without much effort. Do
you have, or can we work towards, sets of the action pack actions
which are 'marked up' as good or bad examples? For either method, we
would need at least a corpus of good examples or a corpus of bad
examples for each action pack; 20-30 would probably do, but the more
the merrier.
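
To make 'marked up' concrete, the crudest version of what we'd do with
those corpora might look like this (purely a sketch, not satuner code):

  def word_set(texts)
    texts.map { |t| t.downcase.scan(/[a-z]+/) }.flatten.uniq
  end

  # First cut: call an action on-topic if it shares more words with the
  # good examples than with the bad ones. A real version would weight
  # the overlaps with tfidf scores instead of raw counts.
  def on_topic?(text, good_examples, bad_examples)
    words = text.downcase.scan(/[a-z]+/).uniq
    (words & word_set(good_examples)).size >
      (words & word_set(bad_examples)).size
  end
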
Happy to chat more about this anytime!

Ehren

Peter Deitz

Jan 15, 2010, 11:29:35 AM
to social-ac...@googlegroups.com
Hi Ehren,

Re graphs:
Thanks for your replies. I'd suggest that for now we remove the
"updated at" graph and keep the "created at" graph. It is definitely
showing accurate information, particularly for partners we have
contacted who have subsequently fixed their feeds (when broken). See
Pincgiving for example.

Re action packs:
Let's start with a single action pack -- one that has a fair number of
subscribers. Perhaps @enviroactions would be a good place to start.
Based on previous tweets, we can identify 30 on-topic alerts and 30 not
so on-topic alerts, and then attempt to apply the satuner magic to the
system. If it's possible to identify the on-topic and not so on-topic
actions based on total click-throughs for the action pack via Twitter,
then we'll have an intelligent system on which to build the corpus. Not
sure what the best next step should be for this. Perhaps a quick phone
call would help.

All the best,
Peter

Social Actions
http://socialactions.com

Ehren Foss

Jan 15, 2010, 2:53:57 PM
to social-ac...@googlegroups.com
> Thanks for your replies. I'd suggest that for now we remove the "updated at"
> graph and keep the "created at" graph. It is definitely showing accurate
> information, particularly for partners we have contacted who have
> subsequently fixed their feeds (when broken). See Pincgiving for example.

I removed the 'updated' graphs and metrics. Good to know the created
ones are useful.

> Re action packs:
> Let's start with a single action pack -- one that has a fair number of
> subscribers. Perhaps @enviroactions would be a good place to start. Based on
> previous tweets, we can identify 30 on-topic alerts and 30 not so on-topic
> alerts, and then attempt to apply the satuner magic to the system. If it's
> possible to identify the on-topic and not so on-topic actions based on total
> click-throughs for the action pack via Twitter, then we'll have an
> intelligent system on which to build the corpus. Not sure what the best next
> step should be for this. Perhaps a quick phone call would help.

I think a phone call would be great. We can probably figure out a way
to make the tool fairly self-serve: you feed in an RSS feed or CSV
file, it helps you select good and bad examples, and from then on it
provides a filtered feed.

Ehren

Peter Deitz

Jan 20, 2010, 12:15:52 PM
to social-ac...@googlegroups.com
Hi Ehren,

I really like the idea of a self-serve system for identifying the
appropriate actions, based on per-community click-through rates and RT
rates. Are you available on Thursday, February 11th at 4pm for a call on
the subject?

All the best,
Peter

Social Actions
http://socialactions.com

Peter Deitz

Feb 11, 2010, 5:21:34 PM
to Social Actions Developers
Hi SA-Dev,

Ehren, Christine, and I had a really interesting discussion on the
ideas contained in this thread.

If you get a chance, you can have a listen here:
http://www.blogtalkradio.com/social-actions/2010/02/11/social-actions-tuner-actions-packs

All the best,
Peter


