Dear UCSC Genome Browser Team,
Good day. I am writing to seek your expert advice on a coordinate conversion problem I am facing with chicken genomic data.
Several years ago, I genotyped a population of chickens using the Axiom 600K SNP genotyping array, which is based on the GRCg6a reference genome. More recently, I have generated whole-genome sequencing data from the same population, aligned to the GRCg7b reference genome, and extracted variants (including SNPs). My goal is to identify which SNPs from the original 600K array are present in the new WGS dataset, so that I can integrate both datasets for downstream analyses such as genomic prediction and GWAS.
I am seeking your guidance on the most appropriate method to:
1. Lift the 580K SNP positions from the 600K array (GRCg6a coordinates) to the GRCg7b reference genome.
2. Identify overlapping SNPs between the lifted array positions and my WGS-derived variant set.
At present, my files are formatted as shown below.
600K SNP (GRCg6a-based) file:
1 310223 310224 AX-75442064
1 312099 312100 AX-75445978
1 313093 313094 AX-75447879
1 316523 316524 AX-75454925
1 317114 317115 AX-75456096
.map file of the WGS-based SNPs (GRCg7b reference):
1 1:13695:A:G 0 13695
1 1:14293:C:T 0 14293
1 1:14304:A:T 0 14304
1 1:16286:C:T 0 16286
1 1:16506:C:T 0 16506
1 1:20203:C:T 0 20203
1 1:20211:A:G 0 20211
1 1:20737:A:T 0 20737
I have already attempted the conversion using an existing chain file (galGal6ToGCF_016699485.2.over.chain), but the number of array SNPs that overlapped with the WGS variants was quite low, which leads me to believe that I may have done something wrong.
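For reference, the overlap check after lifting can be sketched in Python roughly as follows. This is only a minimal illustration under stated assumptions, not a record of the exact commands that were run: it assumes the lifted file keeps the four-column BED layout shown above, that each 1-bp BED interval was built as (position - 1, position) so that the end column is the 1-based SNP position, and that the lifted BED and the .map file use the same chromosome names. The function names and the example rows are hypothetical.

```python
import io


def read_lifted_bed(handle):
    """Return {(chrom, 1-based position): SNP ID} from a 4-column BED file."""
    snps = {}
    for line in handle:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        chrom, _start, end, snp_id = fields[:4]
        # BED intervals are 0-based and half-open, so for a 1-bp interval
        # the end column is the 1-based position (assumption: the BED was
        # built as position-1, position).
        snps[(chrom, int(end))] = snp_id
    return snps


def read_map(handle):
    """Return {(chrom, position): variant ID} from a PLINK .map file."""
    variants = {}
    for line in handle:
        fields = line.split()
        if len(fields) < 4:
            continue
        chrom, variant_id, _cm, pos = fields[:4]
        variants[(chrom, int(pos))] = variant_id
    return variants


def overlap(bed_snps, map_variants):
    """Return {array SNP ID: WGS variant ID} for positions found in both sets."""
    return {
        snp_id: map_variants[key]
        for key, snp_id in bed_snps.items()
        if key in map_variants
    }


# Hypothetical rows for illustration only; these are not real array SNPs.
lifted_bed = io.StringIO("1 99 100 AX-EXAMPLE-1\n1 199 200 AX-EXAMPLE-2\n")
wgs_map = io.StringIO("1 1:100:A:G 0 100\n1 1:300:C:T 0 300\n")

shared = overlap(read_lifted_bed(lifted_bed), read_map(wgs_map))
print(shared)  # {'AX-EXAMPLE-1': '1:100:A:G'}
```

If the chromosome names differ between the lifted BED and the .map file, or if the BED start column rather than the end column is compared with the .map position, a check like this would report very few matches even when the lift-over itself succeeded.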
Regards,
Samkelo Motsa