prepsources

29 views
Skip to first unread message

McKenzie Boyer

unread,
Mar 1, 2024, 11:16:25 AMMar 1
to IsoriX
Hello! 
I am trying to figure out the natal origins of some thrushes and I am using this package to build my maps but I have run into some problems using the prepsources function.
When I run the following code I get the following warning messages:
GNIPDataNAagg <- prepsources(data = GNIPData,
                             long_min = -180, long_max = -45,
                             lat_min = 0, lat_max = 80,
                             month = 8:11,
                             split_by = NULL)

Warning messages:
1: In FUN(X[[i]], ...) :
  Some latitude values are not unique but should be, so the first element was taken among: 47.83492738 47.34437749 .
2: In FUN(X[[i]], ...) : and so on 

I do have multiple H2 values for the sample location, but they have different years and months so I'm not quite sure why it wants to only take the first element. 
Any help at all would be greatly appreciated! 

Alexandre Courtiol

unread,
Mar 2, 2024, 10:54:38 AMMar 2
to IsoriX
Hi McKenzie,

Welcome to this forum and thanks for using (or trying to use) IsoriX.

Without the data it is hard to be sure, but the message suggests is that it is because you have different source_ID that share the exact same latitude.

So 2 options:
- either you have some source_IDs that correspond to the very same locations, and that would be a mistake -> you need to create new source_IDs.
- the tests that IsoriX uses to detect mistakes from user inputs are too strict and you do have different locations but they happen to share the exact same latitude -> I need to revise the tests.

It is true that I did not foresee that stations could be present on a perfect grid, but that could happen, so I will revise that in any case.
But when I implement that fix depends on you, I will do it in the next days if that is indeed an issue on my side, but if that is a data issue in your side and do not need the patch, I will wait a little more.
Please let me know,

Alex

McKenzie Boyer

unread,
Mar 4, 2024, 1:55:18 AMMar 4
to IsoriX
Hi Alex! 

I gave all my source_id's different names even though they have the same lat and long values. Our samples look like this: Bird A1 has three samples, the P1, P9, and a nail. So for each one I changed the Id to like A1nail, A1P1 and so on. I'm not quite sure why the package isn't differing between the sample_ids. They all have the same lat long values but have different Ids and H2 values. 
Is there a way to alter the package to be able to recognize all the samples?

Thank you again for all your help! 

McKenzie 



Alexandre Courtiol

unread,
Mar 4, 2024, 2:14:41 AMMar 4
to IsoriX
Hi McKenzie,

I see, so that is indeed the issue I had suspected.

IsoriX does not allow that for now since data P1, P9 and nail are not independent.
Dealing with this automatically in prepsources() would be a little difficult since unless I add additional columns (e.g. tissue)...

I will think about that, but in the meantime there is a easy work around for you.
The easiest is that you just skip the call to prepsources() and instead create by "hand" a dataset equivalent in structure to that one created by prepsources().
Here is an example:

Screenshot_20240304_075906.png
So the dataset must have these columns and each row must really be a different location.
For this, you will have to average your P1, P9 and nail per location.
The only thing that is a little unclear to me is whether each group of 3 measurements {P1, P9 and nail} should count for 1 or 3 when counting n_source_value.
I would make that count for 1, but I don't think it will be a large difference as long as you did the same on all birds.
You could in the end rerun the workflow using 3 rather than 1 to see if it makes any difference.

To prepare the aggregated dataset, you could make good use of dplyr::summarise(). I don't know your level in R, but if you struggle and have no idea of what I am talking about, I will show you and example if you send me a short version of your input data.

An alternative to creating the aggregated table by hand is that you select only one tissue (e.g. P1), run the usual workflow and then rerun the workflow independently for P2 and nail.
You should obtain similar results unless the tissue matters, but if the tissue matters, then the first approach sketched above would be problematic.

In general, how to best deal with multiple measurements per individuals, is a complex topic.
But since you have good biological reason to expect the same response here in 3 inert tissues, and since your design is hopefully balanced, I would go for averages (i.e. solution 1) and counting trios of measurements as a single datum.

Let us know how it goes.

Best,

Alex

McKenzie Boyer

unread,
Mar 4, 2024, 4:06:50 PMMar 4
to IsoriX
Hi Alex! 

I went ahead and tried to manually create the data set instead of prepsources using the dplyr package and it did run, however; the isofit function gives the following error: "The dataset seem to contain null or negative value for 'var_source_value'"
I do have "NA"s in the var_source_value, and I have tried changing them to zeros, writing code to ignore the NA and nothing seems to work and the error persists. 
Here is what my data looks like and my code I did. Is there anything jumping out at you that is wrong?
GNIPDataNAagg1 <- GNIPData %>%group_by(lat, long) %>%
  filter(month>8&month<11) %>%
  summarize(
    mean_source_value = mean(source_value, na.rm = TRUE),
    var_source_value = var(source_value, na.rm = TRUE),
    n_source_value = n()
  )
Thank you, 
McKenzie 

Alexandre Courtiol

unread,
Mar 4, 2024, 4:28:35 PMMar 4
to IsoriX
Hi McKenzie,

You have done the right thing but since you ask, many things do jump out to me ;-)
(used old pipes rather than |>, filtering after grouping which slows things down without changing the outcome, not ungrouping at the end which may trigger issues for future operations, using the outdated group_by functions rather than the .by argument within summarize)
but nothing is "wrong" and you can keep the code as it is.

The real issue is that you must have locations with a single measurement, which surprises me since I thought you had at least 3 measurements per bird.

If that is expected, a simple workaround implemented in prepsources() is that IsoriX transforms the variances of 0 into 0.01, and after doing so it displays:
"Null variances were obtained during aggregation. They were changed to 0.01 assuming that the actual variance cannot be smaller than the measurement error variance."

So I would modify things as follows:

GNIPData |>
  filter(month > 8 & month < 11) |>
  summarize(
    mean_source_value = mean(source_value, na.rm = TRUE),
    var_source_value = var(source_value, na.rm = TRUE),
    n_source_value = n(), .by = c("long", "lat")) |>
  mutate(var_source_value = ifelse(is.na(var_source_value), 0.01, var_source_value))

The reason IsoriX "wants" non-zero variances is that it tries to estimate how the variation of the isotopic signature varies in space (no typo: a variation of a variation), since in some locations on earth there is more natural variability than in others.
It is expected that there is always some variability but this cannot be estimated if there is no replicate at all in a given location.

As long as those situations are rare, it should not be a problem, since the 0.01 will be associated with a n_source_value of 1 and will thus not weigh much in the analysis.

I hope this help.

++

Alex

McKenzie Boyer

unread,
Mar 5, 2024, 2:44:48 AMMar 5
to IsoriX
Hi Alex!

Yes that code works better thank you! I think there was miscommunication on my part I'm sorry! I am on chapter 3 on bookdown and I am using a dataset that I downloaded from the GNIP website and I am using this as the data to build my standard isoscape to get the mean H2 values in Canada and North America throughout the decades. Once I have the mean H2 and calibrate it, I am then using my dataset of the birds (1 bird has 3 tissue samples tested to get 3 separate H2 values). 
When I use the code above, I do get all the mean, var and n values, but I am still getting the error about the null or negative values in the var column. 
I uploaded the GNIP data and have them separated into the following columns, source_id, lat, long, elev, year, month and source_value. 
Have you ever seen this error before? 
Error in .prepare_data_sources(data) :
  The dataset seem to contain null or negative value for 'var_source_value'.

Thank you again for all your help this is all new to me so I am trying my best! 

Alexandre Courtiol

unread,
Mar 6, 2024, 2:21:03 AMMar 6
to IsoriX
Hi McKenzie,
Indeed you lost me... If you are using GNIP data, then prepsources() should work as the normal workflow should work.
So please send me you raw data file for the GNIP data to my private email and I will have a look.
++

McKenzie Boyer

unread,
Apr 15, 2024, 3:23:56 PMApr 15
to IsoriX
Hi Alexandre! 

I was able to get the package to work in the end and produce some good hydrogen isoscapes! I was just wondering if there is a way to crop the maps to where we can zoom in onto specific/probably areas? For example I made my maps but I know by birds can only come from a subset of Canada so I would like to crop out the portion of interest. 
 I have tried the zoom and cropping functions but nothing seems to work!


Thank you for all your help! 

Alexandre Courtiol

unread,
Apr 17, 2024, 5:55:19 AMApr 17
to IsoriX
Hi McKenzie,

It is not clear to me if you want to crop or if you want to mask some areas, both things are possible in IsoriX.
For cropping, this is most easily done at the prepraster() stage: just the argument "manual_crop" and provide your 4 coordinates as a vector (see ?prepraster); but you can also crop at the final stage (see example below).
For masking, when you call plot() on an object created by isofind, there is an argument called mask2 and you can add any SpatVector object to define the mask.
You can define the latter in R directly or you can create one in R using the function vect from terra (https://search.r-project.org/CRAN/refmans/terra/html/vect.html) based on any shape file produced in any GIS software.
Often IsoriX user download shape files from the IUCN: https://www.iucnredlist.org/resources/spatial-data-download

#### Example

library(IsoriX)

# Example from ?isofind ---------------------------------------------------

GNIPDataDEagg <- prepsources(data = GNIPDataDE)

GermanFit <- isofit(
  data = GNIPDataDEagg,
  mean_model_fix = list(elev = TRUE, lat_abs = TRUE)
)

## We build the isoscape
GermanScape <- isoscape(
  raster = ElevRasterDE,
  isofit = GermanFit
)

## We fit the calibration model
CalibAlien <- calibfit(
  data = CalibDataAlien,
  isofit = GermanFit
)

## We perform the assignment on land only
AssignmentDry <- isofind(
  data = AssignDataAlien,
  isoscape = GermanScape,
  calibfit = CalibAlien
)

## Simple plot
plot(AssignmentDry, who = "group")


# Cropping assignment maps ---------------------------------------------------

## Define region to keep
coord_after_croping <- terra::ext(5.50207, 15.50583, 47.00592, 53) #long1 long2 lat1 lat2

## Cropping
croppedAssignmentDry <- AssignmentDry
croppedAssignmentDry$group$pv <- terra::crop(AssignmentDry$group$pv, coord_after_croping)
croppedAssignmentDry$sample$pv <- terra::crop(AssignmentDry$sample$pv, coord_after_croping)

## Plotting cropped map
plot(croppedAssignmentDry, who = "group")
plot(croppedAssignmentDry, who = "Alien_1")

# Masking assignment maps ---------------------------------------------------

## Simple creation of a spatial vector on the file
my_mask <- terra::vect("POLYGON ((8 49, 7 50, 8 51, 10 50.5, 11 51, 12 50, 12 49, 10 49.5, 8 49))")

## Import shape file (here from: https://download.geofabrik.de/europe/germany/brandenburg.html)
polygon_object <- terra::vect("~/Downloads/brandenburg-latest-free.shp/gis_osm_landuse_a_free_1.shp")
my_mask2 <- terra::aggregate(polygon_object) ## the shape file as > 270000 geometries, so we merge them all (slow)

## Plot cropped map with custom masks
plot(croppedAssignmentDry, who = "group", mask2 = list(mask = my_mask))
plot(croppedAssignmentDry, who = "group", mask2 = list(mask = my_mask2))

++
Reply all
Reply to author
Forward
0 new messages