Re: SoilGrids error


Tomislav Hengl

Sep 15, 2017, 3:48:34 AM
to Rossiter, David, Miguel Alejandro Becerra, Global Soil Information

Thank you David, that is a much longer answer than I would have written
;) but you explained the problem very well.

I would just like to add to that: you can at any time check the Taxonomy
points we are using for supervised training:

https://github.com/ISRICWorldSoil/SoilGrids250m/blob/master/grids/models/TAXNWRB/TAXNWRB_observed.kmz
https://github.com/ISRICWorldSoil/SoilGrids250m/blob/master/grids/models/TAXOUSDA/TAXOUSDA_observed.kmz

Indeed, even though USDA Soil Taxonomy is one of the most comprehensive
and most detailed classification systems, we have only a dozen or so points
for Argentina, Chile etc. So yes, if you have some ground truth data from
existing surveys, please forward them and we will add them to the training data.

Thanks,

T. (Tom) Hengl
Researcher @ ISRIC - World Soil Information
Url: http://www.isric.org/tomhengl
Network: http://profiles.google.com/tom.hengl
Publications: http://scholar.google.com/citations?user=2oYU7S8AAAAJ
ORCID ID: http://orcid.org/0000-0002-9921-5129

Important note: I do not work on Wednesdays (parental leave).

On 14-09-17 22:18, Rossiter, David wrote:
> Hello Miguel,
>
> I am copying my answer to Tom Hengl, who is the brains behind SoilGrids; he may be able to add some information.
>
> 1. SoilGrids is based on a global-scale machine-learning model, and there is no way to adjust it for local knowledge. What you see is how a machine learns, based on a large number of (global) calibration points. One solution is to add known points to WoSIS http://www.isric.org/explore/wosis, our profile database; this will improve predictions. That page explains how to contribute.
>
> 2. The most probable class, in this case Ustox, is “competing” with other classes. In SoilGrids, choose “Class Probabilities” instead of “Predicted most probable class”, and you will see a legend showing how probable Ustox is. For example, in the area around La Rioja, probabilities range from near 0 to about 30%; most of the areas where Ustox is the most probable class are at about 20%. Other classes have their own probability. In that area Ustolls are not at all probable, around 5%; Ustalfs a bit more, about 10-15%. So you can see just how much one class is favoured over the others.
>
> 3. A particular problem with ST (Soil Taxonomy) classes is the poor or obsolete classification of profiles worldwide. We can only go by the classification given to us; we cannot reclassify everything from the profile description, even if we had time. So SoilGrids is probably more useful for the soil properties themselves. Look at the SOC at various depths; I think you will find that a much more realistic picture. You could use those maps to do your own classification, at least in part.
>
> 4. We encourage you to use SoilGrids as a prior layer, that is, as a covariate in your own mapping using the same methods, which are documented (with R code provided) on the SoilGrids web pages. If you have more or more accurate points and you work in a smaller area, the prior (global) predictions will be corrected.
>
> I hope this is useful information.
>
> With warm regards, sincerely,
>
> Rossiter, D G (David)
> Guest Researcher
> ISRIC - World Soil Information
> david.r...@wur.nl
>
>
>
>
>
> On 13 Sep 2017, at 20:39, Miguel Alejandro Becerra <mabe...@agro.unc.edu.ar> wrote:
>
> Hello David,
> I was checking soilgrids.org and I noticed that a large area of Argentina (and Chile) is shown as having Oxisols, which is not true. At least in Argentina, most of the supposed Oxisol area is occupied by Mollisols (in particular Ustolls) or Alfisols.
> I do not know how the prediction is made but I hope that it can be improved in the future.
> I'll be glad to collaborate if you need any help.
>
> Best regards.
>
>
> --
> Ing. Agr. M. Alejandro Becerra
> Cátedra de Topografía
> FCA-UNC
>
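
To make David's point 2 concrete, here is a minimal Python sketch of how one might compare the "competing" class probabilities at a single pixel. The probability values are only rough readings of the figures quoted above for the La Rioja area (Ustox about 20%, Ustalfs about 10-15%, Ustolls about 5%); the function names are hypothetical and this is not how SoilGrids itself computes anything. The values do not sum to 1 because the remaining classes are left out.

```python
def rank_classes(probs):
    """Return (class, probability) pairs sorted from most to least probable."""
    return sorted(probs.items(), key=lambda kv: kv[1], reverse=True)


def margin(probs):
    """Difference between the most probable class and the runner-up."""
    ranked = rank_classes(probs)
    return ranked[0][1] - ranked[1][1]


# Illustrative values for one pixel (assumed, not read from the actual grids).
pixel = {"Ustox": 0.20, "Ustalfs": 0.125, "Ustolls": 0.05}

ranked = rank_classes(pixel)
best_class, best_prob = ranked[0]
print("Most probable class:", best_class, "at", best_prob)
print("Margin over runner-up:", round(margin(pixel), 3))
```

A small margin like this one means the "most probable class" map shows a class that is only slightly favoured, so it should be read together with the probability maps.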

Rossiter, David

Sep 15, 2017, 9:00:00 AM
to Tomislav Hengl, Miguel Alejandro Becerra, Global Soil Information
Miguel, Tom,

Thanks for this, Tom. I attach a screenshot of the ST points for NW Argentina and N Chile on Google Earth, from the KML which Tom links to. As you can see, there are only some Orthents in the mountains, and the SoilGrids machine learning will find those because of the elevation and slope. But for the whole of the lower-elevation part of N Argentina (also for all of Paraguay!) there is nothing! So the algorithm is looking for similar positions worldwide. Notice the big difference with Uruguay and Brazil, which have plenty of points. So the solution is to add the INTA database to WoSIS. ISRIC has a good system for integrating databases, while protecting them according to whatever law is in force (of course, open is best but sometimes that is not possible). Please contact the WoSIS people for more information.
[Attachment: Argentina_samples.jpg]

Tomislav Hengl

Sep 18, 2017, 4:26:23 AM
to Global Soil Information

Thank you David,

My suggested framework for contributing point data to improve SoilGrids is:

1. Register your point data through a professional data registry: 

https://www.nature.com/sdata/policies/repositories

e.g. https://dataverse.harvard.edu/ 

This would allow you to set up a license, the way of accessing the data etc. (note that Dataverse is not only for publicly available data but also for data with restricted access).

2. After you have registered your data set, send the URL to the WoSIS team (http://www.isric.org/explore/share); they will then either obtain the data from you or simply harvest it from the repository.

3. Once the data set is included in WoSIS, it will automatically be used to produce improved SoilGrids predictions.

An alternative is to contact the ISRIC director directly; we can then sign a data usage agreement under which we would use your point data exclusively for generating SoilGrids.

HTH,

Tom

Tomislav Hengl

Sep 19, 2017, 3:30:57 AM
to Miguel Alejandro Becerra, Rossiter, David, Global Soil Information

Hi Miguel,

Unless the polygon maps are at a very detailed scale, e.g. 1:25,000 or better
(in which case we might generate pseudo-observations), NO, they cannot help
in making SoilGrids. SoilGrids is driven by ground observations and
measurements of soil properties and soil classes (i.e. the ground truth
or 'hard' soil data). We have quite a lot of data for soil property mapping
(http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0169748#pone-0169748-g003),
but yes, many areas in the world are under-represented at the moment.

I encourage you, however, to try using soil polygon maps in
combination with SoilGrids to generate more detailed local soil maps (as
explained in: https://arxiv.org/abs/1705.08323).

I am sure that there are many soil profiles / many soil field
observations and measurements in Argentina (sometimes hidden in old
reports, Government funded studies etc). I am always surprised to see
how little of these data are actually used.
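
As a rough illustration of combining a polygon map with SoilGrids for local mapping, the Python sketch below shifts the global SoilGrids value by the mean residual observed within each polygon map unit. This is a deliberately simplified stand-in and not the method described in the arXiv paper linked above, nor the SoilGrids workflow. The map-unit labels, the SOC values and the function names are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def unit_corrections(points):
    """Mean residual (observed minus SoilGrids) for each polygon map unit."""
    residuals = defaultdict(list)
    for unit, soilgrids_value, observed_value in points:
        residuals[unit].append(observed_value - soilgrids_value)
    return {unit: mean(values) for unit, values in residuals.items()}


def corrected_prediction(unit, soilgrids_value, corrections):
    """Shift the global prediction by the unit's mean residual, if one is known."""
    return soilgrids_value + corrections.get(unit, 0.0)


# Hypothetical local profiles: (map unit, SoilGrids SOC, observed SOC), in g/kg.
local_points = [
    ("A", 10.0, 14.0),
    ("A", 12.0, 15.0),
    ("B", 20.0, 18.0),
]

corrections = unit_corrections(local_points)
print(corrections)
print(corrected_prediction("A", 11.0, corrections))
```

Map units without any local profile keep the unmodified global value, which matches the idea that the prior is only corrected where local ground data exist.
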
On 19-09-17 00:05, Miguel Alejandro Becerra wrote:
> David, Tom,
> thank you both for the quick answer.
> I'll try to add known points to WoSIS following the link you've provided.
> I will also talk with members of INTA to ask if they can add their database.
> There is an open database from INTA with digital soil maps (scale
> 1:500,000)
> [www.geointa.inta.gob.ar/wp-content/uploads/downloads/shapefiles/Suelos-de-la-Republica-Argentina-1:500.000_2.zip]
> but it doesn't have soil profile information, so it won't be useful for the
> machine learning algorithm, maybe just as a check.
>
> Best regards
>
>