Dax
Jan 13, 2012, 5:12:28 PM
to google-analytics-api - GA Data Export API
Hi. I've been trying to identify the reason for the different
Analytics figures I'm getting for one of my clients. Knowing this
will help me draw appropriate conclusions from the data.
Background:
* I selected the date range (all of year 2011) for both the Data Feed
Query Explorer and Google Analytics. No filters in either location. No
segments selected.
* In the Data Feed Query Explorer (DFQE), I am using the dimension
ga:keywords and pulling the following metrics: visits, pageviews per
visit, avg time on site.
* In GA, I navigate to traffic sources > sources > overview.
* I then searched GA and DFQE for two phrases: "c.s. lewis sermons"
and "bible sermons love".
* Searching in both data sets pulled up several keywords using the
phrase "c.s. lewis sermons". For the exact phrase, GA shows 11 visits
in the last year, while DFQE showed 17.
* For "bible sermons love" I got only one result in both data sets. GA
shows 14 visits while DFQE shows 17.
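To make the side-by-side check above repeatable, here is a minimal Python sketch. It assumes the keyword/visit pairs have been copied by hand from each tool into plain dictionaries; the names ga_visits, dfqe_visits and compare_visits are made up for illustration and are not part of any Google API.

```python
# Hypothetical hand-copied figures: keyword -> visits for all of 2011.
ga_visits = {"c.s. lewis sermons": 11, "bible sermons love": 14}
dfqe_visits = {"c.s. lewis sermons": 17, "bible sermons love": 17}


def compare_visits(ga, api):
    """Return (keyword, ga_visits, api_visits, api - ga) for every keyword.

    A keyword missing from one source gets None for that source and for
    the difference, so keywords present in only one set stand out.
    """
    rows = []
    for keyword in sorted(set(ga) | set(api)):
        g = ga.get(keyword)
        a = api.get(keyword)
        delta = None if g is None or a is None else a - g
        rows.append((keyword, g, a, delta))
    return rows


for row in compare_visits(ga_visits, dfqe_visits):
    print(row)
```

With the two phrases above, the differences come out as 3 and 6 visits in favor of the DFQE.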
I have checked another 10 or so phrases in both locations, and the
visits are never the same. Typically the API has higher numbers.
The pages/visit and avg time on site metrics usually differ as well,
and they are generally lower in GA along with the visits number. For
example, DFQE shows 17 visits with 5.5 pageviews/visit and GA shows 14
with 4.79. When I multiply out the ratio to get the number of
pageviews, pageviews usually go up or down with visits. The funny
thing is that the pageviews/visit ratio consistently does too.
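The multiplication I mean is simply visits times pageviews/visit. A small Python sketch using only the two figures quoted above (the function name total_pageviews is just a label for this example):

```python
def total_pageviews(visits, pageviews_per_visit):
    """Implied total pageviews from a visit count and a per-visit average."""
    return visits * pageviews_per_visit


dfqe_pageviews = total_pageviews(17, 5.5)   # DFQE figures for the phrase
ga_pageviews = total_pageviews(14, 4.79)    # GA figures for the same phrase

print(round(dfqe_pageviews, 2), round(ga_pageviews, 2))
# Share of the DFQE totals that GA reports, for visits and for pageviews.
print(round(14 / 17, 2), round(ga_pageviews / dfqe_pageviews, 2))
```

For this phrase the implied pageviews are about 93.5 in the DFQE versus about 67.06 in GA, so pageviews fall further than visits do, because the per-visit ratio is lower as well.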
Something else that is different: GA says there are 280,000 keywords
sending traffic to my client's site, and I can easily navigate to the
170,000th phrase or higher. The DFQE, however, maxes out at 170,000; I
cannot get it to give me keywords beyond the 170,000th mark. The funny
thing is that the DFQE seems to return a broad sample of the key
phrases. The phrases don't stop partway through the alphabet, say at
phrases starting with Q; they run all the way to the letter Z. To test
the theory that the tool stops at a certain range, in GA I changed the
date range to the first 6 months of 2011 and also tried the latter 6
months of 2011. But the numbers dropped to levels that weren't
consistent enough among the phrases to draw any conclusions. The
latter 6 months in GA returns only 170,000 phrases. I thought that
number might be the connection, but I can't find a correlation.
I also tested selecting only December 2011, and the numbers still vary.
Sometimes I find key phrases in the API data set that are not in the
GA set. My purpose in gathering this data is to compile a list of all
280,000 phrases in one spot so I can run pivot tables and analyze
long-tail effects on site content and high-level keyword tagging, to
help understand weaknesses/strengths in the site content and
conceptual popularity on the web.
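For compiling the full list in one spot, this is roughly the paging loop I have in mind, as a Python sketch. The fetch_page callable is hypothetical: it stands in for whatever client actually calls the API, and the 1-based start index and page size of 10,000 are assumptions modelled on the feed's start-index and max-results parameters. It does not explain or work around the 170,000 ceiling described above.

```python
def fetch_all_rows(fetch_page, page_size=10000):
    """Collect rows by requesting successive pages until one comes back short.

    fetch_page(start_index, max_results) is a hypothetical callable that
    returns a list of rows; start_index is assumed to be 1-based.
    """
    rows = []
    start_index = 1
    while True:
        page = fetch_page(start_index, page_size)
        rows.extend(page)
        if len(page) < page_size:
            break  # a short (or empty) page means there is nothing left
        start_index += page_size
    return rows


# Stand-in data source so the loop can be tried without any API access.
fake_rows = [("keyword %d" % i, 1) for i in range(25)]


def fake_fetch(start_index, max_results):
    return fake_rows[start_index - 1:start_index - 1 + max_results]


all_rows = fetch_all_rows(fake_fetch, page_size=10)
print(len(all_rows))
```

The collected rows could then be written to a single CSV file and loaded into a spreadsheet for the pivot tables.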
I cannot find any information as to why the numbers are different.
Does anyone have any insight? I'm wondering if both take a sample of
the data set before returning results or if the DFQE includes a type
of visitor from search that GA does not (since visits are always
higher). If I at least know why they are different, I can qualify my
findings to the client or throw out the conclusions altogether.
Thanks for any insight anyone can give!
Thank you,
Dax