Will I get incorrect data/analysis if I choose "10,000" as the max value for Data Options and don't click "avoid Google analytics sampling"?


Super-New-User

Sep 28, 2014, 2:48:02 PM
to automate...@googlegroups.com
Hi, I am a new user of Data Grabber (Mac version). I like your product, but I have a few questions:

1. Will I get incorrect data/analysis if I choose "10,000" as the max value for Data Options and don't click "avoid Google sampling"? If the total number of rows in my data exceeds the value I specify, such as 10,000, does Data Grabber just truncate the data arbitrarily (e.g., by dropping the older data)? If so, wouldn't that make the data sheet I get useless? If not, what criteria does your tool use to decide which data to include or exclude without affecting the validity of the data/analysis?

2. Similarly, if I choose the "avoid Google analytics sampling" option under Data Options, how accurate will the data/analysis be? 30%, 50%, or 90% accurate?

3. I am a Mac user. Am I better off using the Mac version or the Google Sheets version of Data Grabber? Do I miss out on much compared to the Windows PC version?

Thank you so much for your help!

Regards,

A super new user, who needs help!

Mikael Thuneberg

Sep 29, 2014, 2:52:32 AM
to automate...@googlegroups.com

1. Will I get incorrect data/analysis if I choose "10,000" as the max value for Data Options and don't click "avoid Google sampling"? If the total number of rows in my data exceeds the value I specify, such as 10,000, does Data Grabber just truncate the data arbitrarily (e.g., by dropping the older data)? If so, wouldn't that make the data sheet I get useless? If not, what criteria does your tool use to decide which data to include or exclude without affecting the validity of the data/analysis?

The "max rows" and "avoid sampling" settings are not related to each other. It's fine to run reports with max rows=10,000 and avoid sampling=off; in fact, this is the recommended setting. If the query has more than 10,000 result rows available, the first 10,000 are returned as follows:
  • If you are splitting by a time dimension, the rows will be sorted in chronological order, and older rows will be returned first. So you may not get some newer results. However, it's rare to miss rows this way, because a report split by time seldom has more than 10,000 rows.
  • If you are not splitting by any time dimension, the rows will be sorted by the first metric, and the 10,000 rows with the largest values will be returned. This usually ensures you get the most relevant results. If the first metric is a calculated metric such as bounce rate, the results will be sorted by the divisor metric. For bounce rate, the divisor is visits, as bounce rate is calculated as bounces divided by visits.
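
To make the two rules above concrete, here is a minimal Python sketch of the ordering. It only illustrates the behaviour described in this reply and is not the tool's actual code; the function name `truncate_rows`, the `DIVISOR_OF` table, and the sample rows are all made up for the example.

```python
# Hypothetical table mapping a calculated metric to its divisor metric.
# Only the bounce rate / visits pair comes from the description above.
DIVISOR_OF = {"bounce_rate": "visits"}


def truncate_rows(rows, max_rows, first_metric, time_dimension=None):
    """Return at most max_rows rows, ordered as described above."""
    if time_dimension is not None:
        # Split by time: chronological order, so older rows come first.
        ordered = sorted(rows, key=lambda r: r[time_dimension])
    else:
        # No time split: sort by the first metric, largest values first.
        # A calculated metric is sorted by its divisor metric instead.
        sort_key = DIVISOR_OF.get(first_metric, first_metric)
        ordered = sorted(rows, key=lambda r: r[sort_key], reverse=True)
    return ordered[:max_rows]


# Made-up sample data, three rows with a limit of two.
rows = [
    {"date": "2014-09-03", "page": "/a", "visits": 400, "bounce_rate": 0.2},
    {"date": "2014-09-01", "page": "/b", "visits": 50, "bounce_rate": 0.9},
    {"date": "2014-09-02", "page": "/c", "visits": 120, "bounce_rate": 0.5},
]

# Split by date: the newest row (/a) is the one that gets dropped.
by_time = truncate_rows(rows, 2, "visits", time_dimension="date")

# No time split, first metric is bounce rate: rows are ranked by visits,
# so the low-traffic page (/b) is dropped despite its high bounce rate.
by_metric = truncate_rows(rows, 2, "bounce_rate")
```

With a time split, the cut-off falls on the newest rows; without one, it falls on the rows with the least traffic, which is why the truncated report usually still contains the rows that matter most.
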
2. Similarly, if I choose the "avoid Google analytics sampling" option under Data Options, how accurate will the data/analysis be? 30%, 50%, or 90% accurate?

Regardless of whether you choose "avoid sampling", you will usually get 100% accurate results, meaning the results will be identical to what you see in the Google Analytics web interface. However, in some reports, you may see a note in the upper left corner mentioning that Google has sampled the results. In these cases, the results won't be identical to the GA web UI, but they should usually be close enough to be usable. If you need 100% accuracy and are willing to accept slower data fetching, you can use the avoid sampling setting to get 100% accurate data in these rare cases as well. Note that this setting does not always help: the data may still be sampled (if so, there will be a note about it in the upper left corner of the report).

As the avoid sampling setting can significantly slow data fetching and usually won't make much difference to the results, we recommend leaving it off.
 
3. I am a Mac user. Am I better off using the Mac version or the Google Sheets version of Data Grabber? Do I miss out on much compared to the Windows PC version?

You should test both to see which you like more, as both have their advantages. The Excel version (both Mac and Windows) has more visualization options and PowerPoint export. On the other hand, Google Sheets has better sharing options (use the regular Google Sheets sharing feature, and soon there will be an option to schedule PDF emailing). The Mac and Windows versions of the Excel tool have the same features, so you're not missing anything in that sense, but the Windows version is faster and more stable.
