Site analytics


Philip Rutledge

May 27, 2010, 10:54:31 PM
to gcd-software-committee
Just curious... 

I'm a bit of a data-driven sort of guy and I was wondering what data is available for the site. Do we have stats that show how people are getting to the site (Google, direct links, other sites, etc.), which links they actually click on while at the site, usage patterns, etc.?

I'm not a heavy web developer so while I know there are packages and ways of collecting and analyzing this data, I'm not sure of the infrastructure required to obtain it.

I think it would be interesting to see which parts of the site are being used and which aren't.

Phil

Alexandros Diamantidis

May 28, 2010, 8:57:30 AM
to gcd-software-committee
* Philip Rutledge [2010-05-27 21:54]:

> I'm a bit of a data driven sort of guy and I was wondering what data is
> available for the site. Do we have stats that show how people are getting
> to the site (google, direct links, other sites etc), which links they
> actually click on while at the site, usage patterns etc?

As far as I know, we have both a plain web server log with a statistics
front-end (AWStats, I think), as well as integration with Google
Analytics, which gives lots of info about usage patterns, how visitors
find us, and so on. I don't know who looks at those statistics, though.
I think there is talk about a GCD web team, which I suppose will have as
part of its duties the task of preparing periodic reports based on those
statistics?

By the way, related to the idea of site statistics... I think periodic
reports of user activity based on how indexers have contributed to the
site would be very useful: for example, percentage of registered users
who have done any indexing (and how this changes with time), statistics
about how long it takes the average new user to start contributing,
average time a change waits in the queue before being dealt with,
average number of changes submitted by users correlated with when they
registered to the site, that sort of thing. I think all those questions
can be answered with not too much difficulty with our current code.
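
A minimal sketch of how three of these figures could be computed. The
record layout here (a user-to-registration-time mapping and a list of
(user, submitted, resolved) tuples) is an assumption made for
illustration, not the actual GCD schema, and all function names are
hypothetical:

```python
from datetime import datetime, timedelta
from statistics import mean


def indexer_share(registered, changes):
    """Fraction of registered users who have submitted at least one change."""
    if not registered:
        return 0.0
    active = {user for user, _, _ in changes if user in registered}
    return len(active) / len(registered)


def mean_queue_wait(changes):
    """Average time from submission to resolution; pending changes are skipped."""
    waits = [
        (resolved - submitted).total_seconds()
        for _, submitted, resolved in changes
        if resolved is not None
    ]
    if not waits:
        return None
    return timedelta(seconds=mean(waits))


def mean_days_to_first_change(registered, changes):
    """Average number of days between registering and the first submission."""
    first = {}
    for user, submitted, _ in changes:
        if user in registered and (user not in first or submitted < first[user]):
            first[user] = submitted
    if not first:
        return None
    seconds = mean((first[u] - registered[u]).total_seconds() for u in first)
    return seconds / 86400
```

Users who never submit anything are left out of the last figure, so it
only describes people who did eventually contribute.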

A few months ago I set up a simple script to scrape our web pages and
plot a couple of basic variables: http://comix.gr/gcd-stats/200.html

Having some statistics gathering hooked up directly to the site would
enable much richer trend reporting. This will be useful to see if GCD
is gaining momentum, and if the results are also split by language /
country, it would be even better.

For example, looking at the last graph, the graph of users becoming
active (that is, submitting at least one change) seems to have flattened
out, which is not a good sign. I'll see if I can add a monthly running
average of the "additions" graph as well.
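
A running average of that kind could be sketched roughly as follows.
This assumes the "additions" data is available as one count per day and
uses a 30-day trailing window as a stand-in for "monthly"; both are
assumptions for illustration and not how the script at the URL above
works:

```python
from collections import deque


def running_average(values, window=30):
    """Trailing mean over the last `window` values (fewer at the start)."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque()
    total = 0
    averages = []
    for value in values:
        recent.append(value)
        total += value
        if len(recent) > window:
            # Drop the oldest value once the window is full.
            total -= recent.popleft()
        averages.append(total / len(recent))
    return averages
```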

Alexandros

Jochen Garcke

May 30, 2010, 1:08:22 PM
to gcd-softwar...@googlegroups.com
On 28.05.2010 14:57, Alexandros Diamantidis wrote:
> * Philip Rutledge [2010-05-27 21:54]:
>> I'm a bit of a data driven sort of guy and I was wondering what data is
>> available for the site. Do we have stats that show how people are getting
>> to the site (google, direct links, other sites etc), which links they
>> actually click on while at the site, usage patterns etc?
>
> As far as I know, we have both a plain web server log with a statistics
> front-end (AWstats I think), as well as integration with Google
> Analytics, which gives lots of info about usage patterns, how visitors
> find us, and so on. I don't know who looks at those statistics, though.
> I think there is talk about a GCD web team, which I suppose will have as
> part of its duties the task of preparing periodic reports based on those
> statistics?

So what could we take from these statistics?

Yes, we do run AWStats for the website.

Jochen

Philip Rutledge

May 30, 2010, 2:20:19 PM
to gcd-software-committee
Is it possible to just make the standard AWStats summary stats available?

I'd be interested in understanding:
  • Where people are coming from, both geographically and by source: a search engine, a link from another site, or direct access to the site.
  • How long they spend at the site, and the ratio of those consuming the data to those contributing to it (at a gross level).
  • How many unique visitors we get, and whether it is the same batch of folks or the number is growing or shrinking.
Ultimately, however, I think we want to develop a set of KPIs (Key Performance Indicators) to measure our success against some of the strategic goals, and I think visitor stats can be one aspect of that.

Phil


Jochen Garcke

May 31, 2010, 3:16:39 AM
to gcd-softwar...@googlegroups.com
On 30.05.2010 20:20, Philip Rutledge wrote:
> Is it possible to just make the standard awstats summary stats available?
>
> I'd be interested in understanding
>
> * Where people are coming from, both geographically and whether they
>   are coming from a search engine, linked from other sites or
>   directly accessing the sites.
> * How long they are spending at the site and the ratio of consumers
>   of the data vs those contributing to the data (at a gross level).
> * How many uniques are visiting and is it the same batch of folks or
>   is it growing/shrinking.
>
> Ultimately however I think we want to develop a set of KPIs (Key
> Performance Indicators) to measure our success in some of
> the strategic goals and I think visitor stats can be one aspect of that.

The analytics are on a password-protected web page. I think we can give
you access to that; it is not a secret, but we don't want it to be
accessible to everyone.

Jochen

Jason Sacks

Jun 1, 2010, 6:44:17 PM
to gcd-software-committee
I'm curious what the objectives are for such analytics given that the site is a non-profit. I understand the need to gather metrics around issues like page load times, time in returning query results, page weight statistics, etc. And I see the need to gather statistics around issues like the bandwidth of the user's access method, browsers used to access the site, and charset of the user. Those are all topics that will help make decisions around how and where to spend GCD's limited dev, test and PM resources, and inform long-term plans for the site.
 
I can also see the need to track metrics like raw site hits and SQL queries because they will help determine the need for higher performance and faster servers.
 
But what is the case for use of such statistics as Unique Users, amount of time on the site, referral from a search engine, etc.? To me those seem like numbers that a commercial site might want, and numbers that can be used to point to specific goals around page hits. But those sorts of issues seem secondary or even tertiary to me as compared to what I see as the mission of the site. Does GCD really care that much about how many UUs hit the site each month? How would such data get the site closer to its long-term goals?
 
I just don't see metrics like UUs as feeding useful KPIs.
 
I'm not making an argument here; I'm just curious.
 
Jason

Jason Sacks

Jun 1, 2010, 6:52:07 PM
to gcd-software-committee
Again, a broad question: is it a goal of the site to have as many users as
possible becoming active? Is there a generally accepted percentage of users
that we would like to have as active users? From my experience, the number
1-3% comes to mind; would we expect variance from those numbers for a site
like GCD?

Jason

--------------------------------------------------
From: "Alexandros Diamantidis" <ad...@hellug.gr>
Sent: Friday, May 28, 2010 5:57 AM
To: "gcd-software-committee" <gcd-softwar...@googlegroups.com>
Subject: Re: [gcd-software] Site analytics

Henry Andrews

Jun 1, 2010, 7:08:24 PM
to gcd-softwar...@googlegroups.com
We definitely want to keep an eye on retention of active indexers, and on whether we're doing better or worse at attracting new indexers. I would expect that we got a significant jump in folks giving the new system a try after December. And then I'd also expect that to tail off into some sort of steady state. Once we establish that sort of steady state, as we work on both the site and our process (particularly how approvers and indexers interact) we can see whether our changes are making the site more or less attractive.

With the GCD, it's definitely not a simple calculation of what percentage of people are interested. The amount of effort is relatively high, and many people decide against it after they take their first close look. We also historically have a very low rate of conversion of new indexers to steady indexers: it used to be about 10%, and that was 10% of the people who were dedicated enough to dig out somebody's email address and ask us for an account, which was a *very* small percentage of visitors. There was no "register" button anywhere.

A very interesting question is, of the people who submit their first change in a given month, how many are still submitting changes three, six or twelve months later? And is there a common thread that indicates why they are leaving (only worked on a certain type of change, got in a big fight with an approver, etc.)?
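
A rough sketch of that cohort calculation, assuming the change history can be exported as (user, date) pairs. That layout, the function names, and the reading of "still submitting" as "has at least one change N or more calendar months after the first one" are all assumptions for illustration:

```python
from collections import defaultdict
from datetime import date


def month_index(day):
    """Number of whole calendar months since year 0, for easy arithmetic."""
    return day.year * 12 + day.month - 1


def cohort_retention(changes, horizons=(3, 6, 12)):
    """Map (year, month) of first change -> {horizon: fraction retained}."""
    months_by_user = defaultdict(set)
    for user, day in changes:
        months_by_user[user].add(month_index(day))

    # Group users by the month in which they submitted their first change.
    cohorts = defaultdict(list)
    for months in months_by_user.values():
        cohorts[min(months)].append(months)

    result = {}
    for first, members in cohorts.items():
        year, month0 = divmod(first, 12)
        result[(year, month0 + 1)] = {
            h: sum(1 for m in members if any(x >= first + h for x in m))
            / len(members)
            for h in horizons
        }
    return result
```

Cohorts younger than a given horizon will show low retention simply because not enough time has passed, so those figures should be read with that in mind.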

thanks,
-henry


----- Original Message ----
> From: Jason Sacks <jason...@hotmail.com>
> To: gcd-software-committee <gcd-softwar...@googlegroups.com>
> Sent: Tue, June 1, 2010 3:52:07 PM
> Subject: Re: [gcd-software] Site analytics
>
> Again, a broad question: is it a goal of the site to have as many users as
> possible becoming active? Is there a generally accepted percentage of users that
> we would like to have as active users? From my experience, the number 1-3% comes
> to mind; would we expect variance from those numbers for a site like
> GCD?
>
> Jason
>
> --------------------------------------------------
>> From: "Alexandros Diamantidis" <ad...@hellug.gr>

Henry Andrews

Jun 1, 2010, 7:12:39 PM
to gcd-softwar...@googlegroups.com
Some of this is just general curiosity.  It's neat to see how the site's being used.  But even though we're not trying to drive revenue the way a commercial site would, we do want to drive mindshare.  Except for traffic changes driven by new features, that's not really the point of this particular subcommittee.  But the PR team would be very interested in what sort of links we can get that would increase our visibility, which will in turn increase the likelihood of getting new contributors, which will increase the amount of data we can get.  And hopefully its accuracy :-)

thanks,
-henry

