Why is the Cloud Natural Language API not returning the English Wikipedia URL?

Frederick Zhang

May 9, 2022, 12:50:59 PM
to cloud-nl-discuss
Why is the Cloud Natural Language API not returning the English Wikipedia URL?

A concrete example is as follows:
>>> from google.cloud import language_v1
>>> client = language_v1.LanguageServiceClient()
>>> document = {"content": "United States", "type_": "PLAIN_TEXT", "language": "EN"}
>>> response = client.analyze_entities(request={"document": document, "encoding_type": "UTF8"})
>>> response
entities {
  name: "United States"
  type_: LOCATION
  metadata {
    key: "mid"
    value: "/m/09c7w0"
  }
  metadata {
    key: "wikipedia_url"
    value: "https://de.wikipedia.org/wiki/Vereinigte_Staaten"
  }
  salience: 1.0
  mentions {
    text {
      content: "United States"
    }
    type_: PROPER
  }
}
language: "en"

As you can see, the returned Wikipedia URL links to the German version rather than the English version (i.e., https://en.wikipedia.org/wiki/United_States). I think this API call used to work as expected and returned the English Wikipedia URL most of the time.
Does anyone know how to force the response to contain the English Wikipedia URL? I would appreciate any comments and suggestions.

Eduardo Ortiz Caraveo

May 9, 2022, 4:21:05 PM
to cloud-nl-discuss
I found this documentation, which might be useful to you. It explains how to use the method correctly.

Gabriel Laframboise

May 10, 2022, 12:09:12 PM
to cloud-nl-discuss
I have the same problem. I followed the documentation and even ran some tests in the Google APIs Explorer, and I always get the Wikipedia URLs in German. Hopefully this is only a temporary problem.

Frederick Zhang

May 10, 2022, 12:09:18 PM
to cloud-nl-discuss
Hi,

Thanks! I have checked the documentation, and I believe I am following the steps it describes exactly.
Request body:
{
  "encodingType": "UTF8",
  "document": {
    "content": "United States",
    "language": "en",
    "type": "PLAIN_TEXT"
  }
}

Response: 
{
  "entities": [
    {
      "name": "United States",
      "type": "LOCATION",
      "metadata": {
        "mid": "/m/09c7w0",
        "wikipedia_url": "https://de.wikipedia.org/wiki/Vereinigte_Staaten"
      },
      "salience": 1,
      "mentions": [
        {
          "text": {
            "content": "United States",
            "beginOffset": 0
          },
          "type": "PROPER"
        }
      ]
    }
  ],
  "language": "en"
}

Again, it still returns the German Wikipedia URL. I think there might be something wrong with Google KB.
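
For anyone who wants to detect this in their own results, here is a minimal sketch that flags entities whose wikipedia_url does not point at the expected language edition. It assumes the response has the JSON shape shown above (already parsed into a Python dict). The helper names wikipedia_language and mismatched_entities are hypothetical, not part of the API, and this only detects the mismatch; it does not change what the API returns.

```python
from urllib.parse import urlparse


def wikipedia_language(url):
    """Return the language subdomain of a Wikipedia URL, or None if it is not one."""
    host = urlparse(url).hostname or ""
    parts = host.split(".")
    # Accept hosts such as "de.wikipedia.org" or "en.m.wikipedia.org".
    if len(parts) >= 3 and parts[-2:] == ["wikipedia", "org"]:
        return parts[0]
    return None


def mismatched_entities(response, expected_lang="en"):
    """List (name, url) pairs whose wikipedia_url is not in expected_lang."""
    mismatches = []
    for entity in response.get("entities", []):
        url = entity.get("metadata", {}).get("wikipedia_url")
        if url and wikipedia_language(url) != expected_lang:
            mismatches.append((entity.get("name"), url))
    return mismatches


# Trimmed copy of the response from this thread.
response = {
    "entities": [
        {
            "name": "United States",
            "type": "LOCATION",
            "metadata": {
                "mid": "/m/09c7w0",
                "wikipedia_url": "https://de.wikipedia.org/wiki/Vereinigte_Staaten",
            },
            "salience": 1,
        }
    ],
    "language": "en",
}

for name, url in mismatched_entities(response):
    print(f"{name}: unexpected Wikipedia edition -> {url}")
```

The "mid" value in the metadata is unchanged in both responses above, so it may be a more stable key to store than the URL while the behavior is being investigated.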

Eduardo Ortiz Caraveo

May 10, 2022, 6:23:18 PM
to cloud-nl-discuss
I would recommend raising a ticket in the issue tracker, since I was able to reproduce your problem and I see the same behavior you describe.