Google Analytics Data inconsistency


Dániel Farkas

Oct 16, 2015, 6:47:03 AM
to Google Analytics Reporting API
Hi All,

We are facing a very worrying issue with Google Analytics. We are developing software on the .NET Framework that queries the Google Analytics servers using HTTP requests (through the Measurement Protocol).

The issue is that we find inconsistencies when we query the same historical data.

Let me share two example queries that demonstrate it:

ProfileId, 
Metrics(ga:sessions), 
Dimensions(ga:year, ga:month), 
the CustomSegment, 
Start-index, 
Max-results, 
and End-date parameters are the SAME for both queries. ONLY the Start-date parameter differs, and there are NO FILTERS defined.

Query1 date range:
start-date: 2014.10.27
end-date: 2015.08.20

Query2 date range:
start-date: 2014.01.01
end-date: 2015.08.20
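
The two requests can be sketched as follows. This is a minimal illustration, assuming the Core Reporting API v3 endpoint and its `YYYY-MM-DD` date format; the view ID, segment ID, `start-index`, and `max-results` values are placeholders that the post does not give, and authentication is omitted.

```python
from urllib.parse import urlencode

# Core Reporting API v3 endpoint (assumed to be the API in use here).
BASE_URL = "https://www.googleapis.com/analytics/v3/data/ga"


def build_query_url(start_date: str, end_date: str) -> str:
    """Build a report URL; only the dates vary between the two queries."""
    params = {
        "ids": "ga:XXXXXXXX",            # placeholder view (profile) ID
        "start-date": start_date,        # the API expects YYYY-MM-DD
        "end-date": end_date,
        "metrics": "ga:sessions",
        "dimensions": "ga:year,ga:month",
        "segment": "gaid::PLACEHOLDER",  # placeholder custom segment ID
        "start-index": 1,                # assumed value, not from the post
        "max-results": 1000,             # assumed value, not from the post
    }
    return BASE_URL + "?" + urlencode(params)


query1_url = build_query_url("2014-10-27", "2015-08-20")
query2_url = build_query_url("2014-01-01", "2015-08-20")
```

Apart from `start-date`, the two URLs are identical, which matches the setup described above.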

With these two queries, for example, the session count of the following row should be the same, but it is not.

Query1 result:
ga:year: 2015
ga:month: 02
ga:sessions: 52979

Query2 result:
ga:year: 2015
ga:month: 02
ga:sessions: 58289
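
To put the gap in proportion, here is a small calculation using only the two February 2015 figures quoted above. Expressing it relative to Query1 is an arbitrary choice made for illustration.

```python
# February 2015 session counts reported by the two queries.
query1_sessions = 52979
query2_sessions = 58289

absolute_gap = query2_sessions - query1_sessions
relative_gap = absolute_gap / query1_sessions  # relative to Query1

print(f"gap: {absolute_gap} sessions ({relative_gap:.1%} of Query1)")
# prints "gap: 5310 sessions (10.0% of Query1)"
```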

In the Google Analytics API JSON response we can see that this Custom Segment contains a huge amount of data, and that the queries use sampling.

Example sample sizes:
Query1 sample size: 509968
Query2 sample size: 813574
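
The sample sizes above can be turned into a sampling rate with a sketch like the one below. It assumes the Core Reporting API v3 response fields `containsSampledData`, `sampleSize`, and `sampleSpace`; the helper name `sampling_rate` is hypothetical, and the `sampleSpace` value in the example is made up, since the post does not report it.

```python
def sampling_rate(response: dict) -> float:
    """Return the fraction of sessions the report is based on (1.0 if unsampled)."""
    if not response.get("containsSampledData", False):
        return 1.0
    # The v3 API returns sampleSize and sampleSpace as strings.
    return int(response["sampleSize"]) / int(response["sampleSpace"])


# Query1's sampleSize is from the post; sampleSpace is a made-up placeholder.
example_response = {
    "containsSampledData": True,
    "sampleSize": "509968",
    "sampleSpace": "2000000",  # hypothetical, not reported in the thread
}
rate = sampling_rate(example_response)
```

If the two queries report different rates, their February 2015 rows are estimates extrapolated from different samples, so they would not be expected to match exactly.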

So my question is: how can the session count for February 2015 differ if the only difference between the two queries is the start-date parameter?

Could it be the sampling that causes the data discrepancy?

Could it be one of the properties or settings of the Custom Segment?

Do you have any idea what could cause this mathematically?

If not, I am afraid this is a bug, and I'd like to report it to the Google Analytics developer team. Please share a direct contact with me if you can.

Thank you, Regards,
Daniel Farkas

chris@shufflepoint

Oct 16, 2015, 10:07:09 AM
to Google Analytics Reporting API
Sampling is probably your answer. Does the discrepancy go away with a smaller timeframe?

JD Becht

Oct 16, 2015, 11:57:15 AM
to Google Analytics Reporting API
What frustrates me most is using the Google Sheets add-on and comparing its results with Google Analytics. Analytics states how many sessions a report is based on and what percent of the total that is. When I run multiple queries with the add-on, perhaps one in three returns data matching GA's. Each time, however, the add-on says that no data has been sampled.

Dániel Farkas

Oct 17, 2015, 5:58:04 PM
to Google Analytics Reporting API
You are probably right; maybe it is sampling.
However, the difference is very big, even if we use HIGHER_PRECISION for the samplingLevel parameter (the result is the same).
In addition, when we query these values in the web interface at www.google.com/analytics, they are the same, so there are no differences.
Is there any way to query the data without sampling?
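
One commonly used workaround, in line with the suggestion above to try a smaller timeframe, is to split the date range into shorter chunks, query each chunk separately, and add up the results. The sketch below only generates month-sized chunks; the helper name `monthly_ranges` is hypothetical. It does not guarantee unsampled data, so each response would still need its sampling flag checked, and it suits additive metrics such as ga:sessions rather than user counts.

```python
from datetime import date, timedelta


def monthly_ranges(start: date, end: date):
    """Yield (chunk_start, chunk_end) pairs covering start..end, one per calendar month."""
    current = start
    while current <= end:
        # First day of the month after `current`.
        if current.month == 12:
            next_month = date(current.year + 1, 1, 1)
        else:
            next_month = date(current.year, current.month + 1, 1)
        # Clip the last chunk to the requested end date.
        chunk_end = min(next_month - timedelta(days=1), end)
        yield current, chunk_end
        current = next_month


# Query2's date range from the first message.
chunks = list(monthly_ranges(date(2014, 1, 1), date(2015, 8, 20)))
```

Each `(chunk_start, chunk_end)` pair would be sent as the start-date and end-date of its own request, with all other parameters unchanged.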