Number of genome features

Dylan Feldner

Dec 9, 2024, 9:28:18 AM
to Genomes to Fields (G2F) Genotype by Environment Prediction Competition
Hi all,

This competition data has ~2000 polymorphisms characterising each hybrid, but the previous G2F competition had around 400 000. Isn't genomic feature selection part of the fun? Any chance we could get the full genomic dataset?

Best,
Dylan

Maize GxE Prediction

Dec 9, 2024, 9:32:06 AM
to Genomes to Fields (G2F) Genotype by Environment Prediction Competition
Hi Dylan,

After last year's competition, there was significant feedback that the number of SNPs was far too high and that a set of a few thousand would be preferred. Many options were discussed, but this is the one we landed on.

Xing Wu

Dec 18, 2024, 9:06:30 AM
to Genomes to Fields (G2F) Genotype by Environment Prediction Competition
Hi, 

A set of only 2000 SNPs is not enough to make meaningful genomic predictions. Is it possible to provide the full genomic dataset as a separate resource for people who are interested in it (a separate file in the training/testing folder)?

Best,

Xing

Maize GxE Prediction

Dec 18, 2024, 9:23:06 AM
to Genomes to Fields (G2F) Genotype by Environment Prediction Competition
Hi Xing,

We understand that some participants would prefer a larger number of SNPs, but based on our testing we are confident that good predictions can be made with this small set. Although the marker matrix is not full rank, there are many methods that can deal with this.
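To illustrate the point about rank deficiency, here is a minimal sketch of ridge regression on marker data, one common method that stays well defined when there are more SNPs than hybrids or when markers are collinear. It is not the organizers' method: the function names, the 0/1/2 genotype coding, the penalty value, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def ridge_marker_effects(X, y, lam=1.0):
    """Estimate ridge-penalized marker effects for an n x p genotype matrix X.

    Uses the dual (n x n) form, so it works when p > n or when columns of X
    are collinear. lam > 0 is the ridge penalty (hypothetical default).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean                      # center markers
    n = Xc.shape[0]
    K = Xc @ Xc.T                        # n x n cross-product of hybrids
    # (K + lam * I) is positive definite for lam > 0, so the solve is safe
    alpha = np.linalg.solve(K + lam * np.eye(n), y - y_mean)
    beta = Xc.T @ alpha                  # map back to p marker effects
    intercept = y_mean - x_mean @ beta
    return beta, intercept

def predict(X, beta, intercept):
    """Predict phenotypes from genotypes and fitted marker effects."""
    return np.asarray(X, dtype=float) @ beta + intercept

# Toy data (made up): 50 hybrids, 2000 markers coded 0/1/2
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(50, 2000)).astype(float)
X[:, 1] = X[:, 0]                        # duplicated marker: exact collinearity
true_beta = np.zeros(2000)
true_beta[:20] = rng.normal(size=20)
y = X @ true_beta + rng.normal(scale=0.5, size=50)

beta, intercept = ridge_marker_effects(X, y, lam=10.0)
y_hat = predict(X, beta, intercept)
```

The penalty is what makes the problem solvable: the two duplicated markers simply share their effect equally instead of breaking the fit. In practice `lam` would be chosen by cross-validation.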

As mentioned previously, the decision to use this smaller set was made based on substantial feedback from our previous competition. At this point, it would be unfair to the participants who have already submitted predictions if we changed or added to the provided data.

Thank you for your understanding and participation.

Mª Cinta Romay

Dec 18, 2024, 12:02:24 PM
to Genomes to Fields (G2F) Genotype by Environment Prediction Competition

Hi Xing,

If you really want a denser dataset, the genotypic data for the inbreds and hybrids in the 2014-2023 germplasm, including the data from the previous competition, are all publicly available at a higher SNP density and can therefore be used for the competition. The raw reads for the 2024 germplasm are also publicly available on SRA under PRJNA1142968, and you could use them to call SNPs at known positions. All of this information is in the README.
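For anyone who does pull in a denser public marker set and wants to do their own feature selection, a simple starting point is a minor-allele-frequency (MAF) filter followed by even thinning along the genome. The sketch below assumes genotypes coded as 0/1/2 allele counts with `np.nan` for missing calls and columns already sorted by genomic position. The function name and thresholds are hypothetical, and this is not the procedure used to build the competition's SNP set.

```python
import numpy as np

def thin_markers(G, min_maf=0.05, n_keep=2000):
    """Return column indices of markers to keep.

    G: n_lines x n_markers array of 0/1/2 allele counts, np.nan = missing.
    Markers below min_maf are dropped, then the survivors are thinned to at
    most n_keep evenly spaced markers (assumes columns are in genome order).
    """
    G = np.asarray(G, dtype=float)
    freq = np.nanmean(G, axis=0) / 2.0          # allele frequency per marker
    maf = np.minimum(freq, 1.0 - freq)          # minor allele frequency
    passing = np.flatnonzero(maf >= min_maf)    # NaN frequencies fail the test
    if passing.size <= n_keep:
        return passing
    # Evenly spaced positions within the list of passing markers
    picks = np.linspace(0, passing.size - 1, n_keep).round().astype(int)
    return passing[picks]

# Toy example (made up): markers 0 and 2 are monomorphic and get dropped
G = np.array([[0, 0, 2, 1],
              [0, 1, 2, 1],
              [0, 2, 2, 0],
              [0, 1, 2, 2]])
kept = thin_markers(G, min_maf=0.05, n_keep=10)
```

Even thinning ignores linkage disequilibrium and trait relevance, so it is only a baseline; LD-based pruning or model-based selection may serve better.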


Thanks,
Cinta

DAYANE CRISTINA LIMA

Dec 19, 2024, 9:04:55 AM
to Genomes to Fields (G2F) Genotype by Environment Prediction Competition
Genomes to Fields inbred genotypic data from 2014 to 2023: https://doi.org/10.25739/ragt-7213
2014-2023 G2F hybrids used in the past competition: https://doi.org/10.25739/tq5e-ak26