Data Mining Google Analytics Data


TheLibrarian

Mar 8, 2010, 8:58:09 PM
to google-analytics-api - GA Data Export API
Hello,

Forgive me if this is a naive question. Assuming I were able to export
all of my Google Analytics data into a Microsoft SQL database: would
it make any sense at all to use SQL Server Analysis Services to
perform data mining routines on that data? I have no real good reason
for wanting to do that other than I am interested in learning data
mining and thought my GA data might serve as interesting data. Maybe
it is more trouble than it is worth. I haven't seen anyone else
talking about this and there's probably a reason for that! It's a dumb
idea!

I did notice that the API documentation mentions that "raw hit-level
data" from the Analytics server logs is not the same as the data found
on the Web UI.

Thanks, by the way, to Mikael's GA Excel functions for giving me a
glimpse of the possibility of doing this.

Amer

Mar 11, 2010, 12:26:37 PM
to google-analytics-api - GA Data Export API
Yes, it would make sense.
But if your main purpose is to learn data mining, then it is not
advisable to start with the GA data that you get.
Try to work through the well-known data mining algorithms first, and use
specialized software like Weka to help you achieve your goal. At a
later stage, once you are confident in your data mining
background, try to apply your skills to GA ;)

regards,

TheLibrarian

Mar 11, 2010, 10:06:12 PM
to google-analytics-api - GA Data Export API
Amer, I appreciate the response. That's a big help just knowing it at
least "makes sense." As a follow-up, while all data needs to be
cleansed and preprocessed, are you suggesting that it would take some
extra work to get the GA data ready for data mining? You have me
curious as to why GA data is for advanced DMers.

Thanks for the tip about using Weka. It's good to have recommendations
like that.

csumm...@abfs.com

Mar 12, 2010, 9:30:26 AM
to google-analytics-api - GA Data Export API
I've done exactly what you have described, except I pulled the data
from my data warehouse. I think that you will have a very frustrating
time with data mining if you don't have a specific data-mining problem
you are attempting to solve. I was trying to figure out how to do
differential response analysis on my landing page test results. I was
trying to find low-occurrence, highly responsive segments of my test
population that preferred a particular recipe. This was beyond the
capabilities of SQL Server Analysis Services, but it was a great
learning experience and I really got to know my data in the process.
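As a rough illustration of the differential response idea described above, here is a minimal Python sketch. It assumes the test results have already been aggregated into rows of (segment, recipe, visits, conversions); the segment labels, recipe names, counts, and the function name `differential_response` are all hypothetical and are not taken from any GA export or from the analysis described in this post.

```python
from collections import defaultdict

# Hypothetical landing-page test rows: (segment, recipe, visits, conversions).
# All labels and counts are made up for illustration.
ROWS = [
    ("US / morning", "A", 4000, 120),
    ("US / morning", "B", 4100, 127),
    ("US / evening", "A", 3500, 105),
    ("US / evening", "B", 3400, 99),
    ("CA / evening", "A", 220, 4),
    ("CA / evening", "B", 210, 21),
]


def differential_response(rows, control="A", variant="B"):
    """Return (segment, share_of_visits, lift) tuples, largest lift first.

    lift is the variant conversion rate minus the control conversion rate.
    """
    visits = defaultdict(lambda: defaultdict(int))
    conversions = defaultdict(lambda: defaultdict(int))
    for segment, recipe, v, c in rows:
        visits[segment][recipe] += v
        conversions[segment][recipe] += c

    total_visits = sum(v for _, _, v, _ in rows)
    results = []
    for segment in visits:
        v_control = visits[segment][control]
        v_variant = visits[segment][variant]
        if v_control == 0 or v_variant == 0:
            continue  # cannot compare recipes without traffic on both
        lift = (conversions[segment][variant] / v_variant
                - conversions[segment][control] / v_control)
        share = (v_control + v_variant) / total_visits
        results.append((segment, share, lift))
    return sorted(results, key=lambda r: r[2], reverse=True)


if __name__ == "__main__":
    for segment, share, lift in differential_response(ROWS):
        print(f"{segment:14s} share={share:.1%} lift={lift:+.2%}")
```

A segment with a small share of visits and a large lift is the kind of "low-occurrence, highly responsive" group mentioned above. With counts this small the difference could easily be noise, so a real analysis would also need a significance check.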

You could probably have some success with segmentation using the
decision tree algorithm for geo-targeting, day-part, etc.
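To give a feel for what a decision tree does with attributes like geography and day-part, here is a small Python sketch that picks the first split by information gain (ID3-style). This is only an illustration of the general technique, not necessarily the scoring method that Analysis Services uses, and the visit records, the `FEATURES` mapping, and the function names are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Hypothetical visit records: (country, day_part, converted).
VISITS = [
    ("US", "morning", 0), ("US", "morning", 0), ("US", "evening", 1),
    ("US", "evening", 1), ("CA", "morning", 0), ("CA", "evening", 1),
    ("UK", "morning", 0), ("UK", "evening", 1), ("UK", "evening", 0),
    ("CA", "morning", 1),
]

# Maps a feature name to its column index in each record.
FEATURES = {"country": 0, "day_part": 1}


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, column):
    """Entropy reduction from splitting rows on every value of one column."""
    labels = [row[-1] for row in rows]
    groups = defaultdict(list)
    for row in rows:
        groups[row[column]].append(row[-1])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_split(rows, features):
    """Name of the feature whose split yields the highest information gain."""
    return max(features, key=lambda name: information_gain(rows, features[name]))


if __name__ == "__main__":
    for name, column in FEATURES.items():
        print(name, round(information_gain(VISITS, column), 3))
    print("best first split:", best_split(VISITS, FEATURES))
```

A full tree would repeat the same step inside each resulting group until the groups are pure or too small to split further.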

I recommend you get the book Data Mining Techniques: For Marketing,
Sales, and Customer Relationship Management by Michael Berry & Gordon
S. Linoff to help you get started. You will find that eighty percent
of the work is in how you prepare the data. The Excel Data Mining
Add-in can help you speed up the data prep, and you can run the
algorithms right in Excel.

Good Luck,
Chad

TheLibrarian

Mar 16, 2010, 8:20:57 PM
to google-analytics-api - GA Data Export API
I appreciate the words of support, Chad. So there's nothing in
particular about how the GA data is collected and how it is made
accessible to us through the API that would make it unsuitable for
data mining. It's just that you have to be aware of the kind of data
it is and do the necessary preparation, which you agree is 80% of the
work. Of course, if one was serious about data mining, like you are,
one would have their own data in a data warehouse.

Tremendously helpful guys, I appreciate it!

mike3s

Mar 17, 2010, 9:19:03 PM
to google-analytics-api - GA Data Export API
If you are considering Excel, we have a new tool available that runs
as an add-in (Nextanalytics for Excel) that will help with both the
download and the data prep...even some of the analysis. It's not free,
but it is quick, and it may give you a better idea of what you really
are looking for.

Microsoft Analysis Services can take a fair bit of work to get
configured, so you need to start with a pretty good idea of what you
want before you start. Playing with a multi-dimensional model is great
fun, but rebuilding them because you got the design wrong is not. If
it is just you doing the playing/learning, then I would avoid the OLAP
server and use one of the many personal tools available these days.
QlikTech has a free personal edition, and I saw an article in their
forum saying that someone built a GA connector for it.

m
