Diacritic characters are collated with Latin characters, often giving strange search volumes for the keywords


Martin

Nov 2, 2010, 6:46:29 AM
to AdWords API Forum
Hi (Eric?),

In Swedish and other Scandinavian languages, letters with diacritics
are not just accented variants (umlauts) of other letters. They are
distinct letters in their own right and form different words.

We have noticed strange search volumes for keywords with diacritic
characters (in Swedish: åäö) that should have fairly large search
volumes when targeting Sweden/Swedish. This problem is probably
common in several other languages with other diacritic characters,
but we are mainly interested in Swedish keywords.

One example is 'friskvård' (meaning health care), which is collated to
'friskvard' (nonsense). There are other examples where the collation
of non-Latin characters changes the meaning of the word completely,
e.g. 'väder' (weather) vs. 'vader' (calves) or 'hål' (hole) vs.
'hal' (slippery).
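
To illustrate the kind of folding we suspect, here is a minimal Python
sketch. It is only an assumption about what the collation might look
like, not a description of how the keyword tool actually works, and
the helper name fold_diacritics is hypothetical. It decomposes each
character (Unicode NFD) and drops the combining marks, which maps the
Swedish letters å, ä and ö onto plain a and o:

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Drop combining marks after NFD decomposition (hypothetical helper)."""
    # NFD splits 'å' into 'a' + COMBINING RING ABOVE, 'ä' into 'a' + DIAERESIS, etc.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only the base characters; combining marks have a non-zero combining class.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# The example words from this post: distinct Swedish words collapse
# onto other words (or nonsense) once the diacritics are removed.
for word in ("friskvård", "väder", "hål"):
    print(word, "->", fold_diacritics(word))
```

With folding like this, 'väder' and 'vader' become the same key, so
their search volumes could no longer be told apart.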

For the example 'friskvard/friskvård' there are 12 local monthly
searches using exact match, but for the more specific keyword
'friskvård uppsala' (where Uppsala is a city) there are 880 local
monthly searches using the same match type. One could argue that the
search volumes actually are correct, but we think that is very
improbable.

The following localized searches suffixed by the 5 largest Swedish
cities (ordered by population size) have suspiciously low or missing
search volumes for "diacritically challenged" keywords:

'catering stockholm' - 74000
'catering göteborg' - n/a
'catering malmö' - n/a
'catering uppsala' - 12100
'catering västerås' - 320

Since we fail to retrieve local search volumes for exact matching,
this might be a different problem, but we suspect the non-Latin
character collation here as well. Two keywords that do not seem to be
affected by this problem and have reasonable search volumes are
'catering örebro' and 'catering norrköping'.
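
The comparison above can be sketched in a few lines of Python. This is
only an illustration using the five volumes listed in this post (with
n/a represented as None); the function name split_by_diacritics is
hypothetical, and nothing here calls the AdWords API:

```python
from typing import Dict, Optional, Tuple

Volumes = Dict[str, Optional[int]]

# Local monthly search volumes as listed above; None stands for "n/a".
volumes: Volumes = {
    "catering stockholm": 74000,
    "catering göteborg": None,
    "catering malmö": None,
    "catering uppsala": 12100,
    "catering västerås": 320,
}


def split_by_diacritics(data: Volumes) -> Tuple[Volumes, Volumes]:
    """Split keywords into (plain ASCII, containing non-ASCII letters)."""
    plain = {k: v for k, v in data.items() if k.isascii()}
    accented = {k: v for k, v in data.items() if not k.isascii()}
    return plain, accented


plain, accented = split_by_diacritics(volumes)
print("ASCII-only keywords:   ", plain)
print("Keywords with å/ä/ö:   ", accented)
```

Grouping the keywords this way makes the pattern easy to see: the
missing and unusually low volumes all fall in the group with
diacritic characters.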

All problems are reproducible in the online AdWords keyword tool, so
it's not an API- or client-specific problem.

Best regards
--
Martin

AdWords API Advisor

Nov 3, 2010, 2:22:27 PM
to AdWords API Forum
Hi Martin,

Thank you for reporting this issue, and for including detailed
examples of the problem. As you mentioned, this isn't an API-specific
issue, but I'll forward this feedback on to the appropriate team for
consideration.

Best,
- Eric Koleda, AdWords API Team

Martin

Nov 30, 2010, 5:15:06 AM
to AdWords API Forum
We have noticed a positive change for local search volumes on most of
the words we've previously looked at.

It also seems that the "diacritically challenged" words are now
properly returned in responses from the AdWords API.

There are, however, some examples that still have suspiciously low
volumes. Compare "flyttstädning gävle" with "flyttstädning uppsala",
where Uppsala is a considerably larger Swedish city than Gävle.

Is the issue officially resolved, or is the resolution still a work
in progress?

Best regards
--
Martin


AdWords API Advisor

Dec 1, 2010, 7:04:01 PM
to AdWords API Forum
Hi Martin,

The core issues behind the anomalies you previously noticed are
complex and still being worked on. I'm glad that the team's recent
work has improved the results. Please let us know if you see further
problems with the data.

Best,
- Eric