I've searched the forum and don't see anything that explains the odd behavior I'm running into.
In short: when I submit the same list of keywords to the API, I get different numbers of returned results.
- I submit 12,031 keywords to the TargetingIdeaService
- On one run I get 143,644 results returned (12 months of data per keyword)
- On another run I get 143,592 results returned
I don't understand why, when I submit these keywords 200 at a time, I get a different number of results back on each run. And why are results missing at all? If I submit 12,031 keywords, shouldn't I get 144,372 results back (12,031 keywords × 12 months of data)?
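To make the gap concrete, here is a minimal sketch of the arithmetic using only the counts quoted above. The names (`KEYWORDS_SENT`, `shortfall`, and so on) are made up for illustration and are not part of the AdWords API.

```python
# Counts quoted in this post.
KEYWORDS_SENT = 12031
MONTHS_PER_KEYWORD = 12
ROWS_RETURNED = {"run 1": 143644, "run 2": 143592}

expected_rows = KEYWORDS_SENT * MONTHS_PER_KEYWORD


def shortfall(rows_returned, expected=expected_rows, months=MONTHS_PER_KEYWORD):
    """Return (missing_rows, whole_keywords, leftover_rows) for one run."""
    missing = expected - rows_returned
    return missing, missing // months, missing % months


for run, rows in ROWS_RETURNED.items():
    missing, whole, leftover = shortfall(rows)
    print(f"{run}: {missing} rows short = {whole} keywords' worth + {leftover} leftover rows")
```

If the leftover is not zero, the gap cannot be explained purely by whole keywords (12 rows each) being dropped; at least one keyword would have to have come back with some other number of monthly rows, or some rows would have to be duplicated.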
Below I've included a sample of what I'm sending, along with the code I'm using. I appreciate any and all ideas. I'm also logging the request and response, but they're quite large, and I don't get any obvious errors or exceptions.
My current workaround idea is to compare what I sent with what I got back (once every chunk has run at least once) and then re-send whatever didn't return results. But I suspect something else is at play here, and I'd like to understand it before I waste time solving the wrong problem.
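A minimal sketch of that comparison, assuming the returned keyword text can be matched to the submitted text after trimming whitespace and lower-casing (an assumption; the API may normalise keywords differently). `normalise` and `find_incomplete_keywords` are hypothetical helper names.

```python
from collections import Counter


def normalise(keyword):
    # Assumption: only case and surrounding whitespace can differ
    # between what was sent and what came back.
    return keyword.strip().lower()


def find_incomplete_keywords(sent, returned, months_expected=12):
    """Compare submitted keywords with the keyword column of the results.

    sent     -- iterable of keyword strings that were submitted
    returned -- iterable with one keyword string per returned row
    Returns (missing, partial): keywords with no rows at all, and keywords
    whose row count differs from months_expected.
    """
    sent_set = {normalise(k) for k in sent}
    rows_per_keyword = Counter(normalise(k) for k in returned)
    missing = sorted(sent_set - set(rows_per_keyword))
    partial = sorted(
        k for k in sent_set
        if k in rows_per_keyword and rows_per_keyword[k] != months_expected
    )
    return missing, partial


# With the variables from the code further down, this would be called as:
#   missing, partial = find_incomplete_keywords(s.search_query, stats_pd['keyword'])
```

Separating "missing" from "partial" would show whether whole keywords are being dropped or whether some keywords simply have fewer than 12 months of data.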
Thank you in advance for any pointers, ideas, critiques, comments, etc!!!
[ SAMPLE chunk.search_query ]
In: chunk.search_query.head()
Out[545]:
key
561  151             "scrum lifecycle"
     152           "scrum master exam"
     153    "definition of done scrum"
     154          "scrum team members"
     155     "what is agile and scrum"
Name: search_query, dtype: object
[ ADWORDS API SETUP ]
import math
import time
import numpy as np
import pandas as pd
from googleads import adwords
# AdWords setup
adwords_client = adwords.AdWordsClient.LoadFromStorage()
# Initialize appropriate service.
targeting_idea_service = adwords_client.GetService(
    'TargetingIdeaService', version='v201609')
[ CODE FOR REQUEST AND INTERPRETING RESPONSE ]
stats_pd = pd.DataFrame()
num_chunks = math.ceil(s.shape[0] / 200)
for chunk in np.array_split(s, num_chunks):
    print(chunk.display_term.unique())
    print(chunk.shape)
    PAGE_SIZE = len(chunk.search_query)
    # Construct selector object and retrieve related keywords.
    offset = 0
    stats_selector = {
        'searchParameters': [
            {
                'xsi_type': 'RelatedToQuerySearchParameter',
                'queries': chunk.search_query.tolist()
            },
            {
                # Language setting (optional).
                # The ID can be found in the documentation:
                # https://developers.google.com/adwords/api/docs/appendix/languagecodes
                'xsi_type': 'LanguageSearchParameter',
                'languages': [{'id': '1000'}],
            },
            {
                # Network search parameter (optional)
                'xsi_type': 'NetworkSearchParameter',
                'networkSetting': {
                    'targetGoogleSearch': True,
                    'targetSearchNetwork': False,
                    'targetContentNetwork': False,
                    'targetPartnerSearchNetwork': False
                }
            }
        ],
        'ideaType': 'KEYWORD',
        'requestType': 'STATS',
        'requestedAttributeTypes': ['KEYWORD_TEXT', 'TARGETED_MONTHLY_SEARCHES'],
        'paging': {
            'startIndex': str(offset),
            'numberResults': str(PAGE_SIZE)
        }
    }
    retry_run = True
    number_of_attempts = 0
    while retry_run:
        try:
            stats_page = targeting_idea_service.get(stats_selector)
            logger.debug(" >>> STATS_PAGE: " + str(stats_page))
        except Exception as e:
            error_msg += "::Problems with Google AdWords connection::"
            inner_error = e
            if 'RateExceededError.RATE_EXCEEDED' == inner_error.fault.detail.ApiExceptionFault.errors.errorString:
                number_of_attempts += 1
                time_to_wait = int(inner_error.fault.detail.ApiExceptionFault.errors.retryAfterSeconds) * (2 * number_of_attempts)
                print('>> ***** RATE EXCEEDED ###### <<')
                print('>> Sleeping for ' + str(time_to_wait) + ' seconds <<')
                time.sleep(time_to_wait)
                break
            else:
                print(">> OTHER ERROR CAUGHT, NOT HANDLED <<")
                raise
        retry_run = False
        # raise (useful for troubleshooting)
    ##########################################################################
    # Parse results to pandas dataframe
    try:
        if 'entries' in stats_page:
            for stats_result in stats_page['entries']:
                stats_attributes = {}
                for stats_attribute in stats_result['data']:
                    #print (stats_attribute)
                    if stats_attribute['key'] == 'KEYWORD_TEXT':
                        kt = stats_attribute['value']['value']
                    else:
                        for i, val in enumerate(stats_attribute['value'][1]):
                            data = {'keyword': kt,
                                    'year': val['year'],
                                    'month': val['month'],
                                    'count': val['count']}
                            data = pd.DataFrame(data, index=[i])
                            stats_pd = stats_pd.append(data, ignore_index=True)
        else:
            print(" ######### >>> NO ENTRIES IN STATS_PAGE <<< ########## ")
    except:
        error_msg += "::Invalid results returned from Google AdWords::"
    time.sleep(15)  # play with different sleep times to rate limit how hard I hit AdWords
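One thing that might be worth ruling out in the loop above: in the RATE_EXCEEDED branch, `break` leaves the `while` loop right after sleeping, so the request for that chunk is not sent again, and the parsing step then works with whatever `stats_page` held from the previous chunk. Whether that accounts for the varying counts is only a guess. The sketch below shows the retry shape with the same request being re-sent; `RateExceeded`, `get_with_retry` and `flaky_fetch` are hypothetical stand-ins, not the googleads client's real exception or API.

```python
import time


class RateExceeded(Exception):
    """Hypothetical stand-in for the API's RATE_EXCEEDED fault."""

    def __init__(self, retry_after_seconds):
        super().__init__("RATE_EXCEEDED")
        self.retry_after_seconds = retry_after_seconds


def get_with_retry(fetch, selector, max_attempts=5, sleep=time.sleep):
    """Call fetch(selector), re-sending the same selector after a rate-limit error."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch(selector)
        except RateExceeded as err:
            if attempt == max_attempts:
                raise  # give up loudly instead of silently skipping the chunk
            # Same back-off shape as the loop above: retryAfterSeconds * (2 * attempt).
            sleep(err.retry_after_seconds * 2 * attempt)


# Demo with a fake service that is rate limited on the first two calls.
calls = []


def flaky_fetch(selector):
    calls.append(selector)
    if len(calls) < 3:
        raise RateExceeded(retry_after_seconds=30)
    return {"entries": ["stats for " + selector]}


waits = []  # record the sleeps instead of actually waiting
page = get_with_retry(flaky_fetch, "chunk-1", sleep=waits.append)
print(page, waits)
```

Returning the page from a helper like this also means the parsing step can only ever see the response for the chunk that was just requested, never a leftover one.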