Hello Brian.
Thank you for submitting your question. I think we can help you out.
First I’d like to comment that if you’re lifting your file from hg19 to hg38 just to view the file in the browser, you can select the hg19 assembly from the "assembly" drop-down at the top of the hgCustom page: http://genome.ucsc.edu/cgi-bin/hgCustom, and load your VCF onto hg19 directly, without lifting first.
As for using CrossMap, the reference genome you supply should be the sequence of the target assembly, in this case hg38. You can see some examples here (also see note #2 below the example):
http://crossmap.sourceforge.net/#convert-vcf-format-files
This would mean that instead of using the “allHG19files.fa” you were putting together, you would use an equivalent for hg38, which you can find on our servers as “hg38.fa.gz”:
http://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.fa.gz
This should allow crossmap to successfully lift your VCF to hg38.
If this doesn’t work, there is another possibility: the sequence names in your VCF CHROM column may be formatted ‘1’, ‘2’, ‘3’, etc., while our files use the ‘chr1’, ‘chr2’, ‘chr3’ format. If so, you would simply need to change that first column to match.
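As a rough illustration of that renaming, here is a minimal Python sketch. The function name `add_chr_prefix` is hypothetical, and the sketch assumes a tab-delimited VCF; it leaves header lines (including any `##contig` entries) untouched and does not handle special cases such as ‘MT’ versus ‘chrM’.

```python
def add_chr_prefix(line: str) -> str:
    """Prepend 'chr' to the CHROM column of one VCF data line.

    Assumptions (not from the original thread): the file is tab-delimited,
    and header lines starting with '#' should be passed through unchanged.
    """
    if line.startswith("#"):
        return line
    chrom, sep, rest = line.partition("\t")
    if chrom.startswith("chr"):
        # Already in the 'chr1' style; nothing to do.
        return line
    return "chr" + chrom + sep + rest


def convert_lines(lines):
    """Apply add_chr_prefix to every line of a VCF, preserving order."""
    return [add_chr_prefix(line) for line in lines]
```

To convert a file, one could read it line by line, pass each line through `add_chr_prefix`, and write the result to a new file, keeping the original untouched.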
If the problem persists or any other issues arise, please feel free to message us back and include a snippet of your VCF file as well to help us troubleshoot. Please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible Google Groups forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.
Sorry, I didn't explain why I need the hg19 exome file in hg38
coordinates. The human genome file that matches it was produced
later, on the new build. I want to have both tracks in one, so I
have to convert one of them.
The first run gave me this. (I got the error when I concatenated
the hg19 files I downloaded. I had to unpack them and then use
bgzip. I think someone putting those out on your server made a
mistake and didn't use bgzip.)
[E::fai_build3_core] Cannot index files compressed with gzip,
please use bgzip ... Could not build fai index hg38.fa.gz.fai\n'
So I unzipped, then rezipped with bgzip. Same problem.
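The indexing error above distinguishes plain gzip from bgzip (BGZF). One way to check which kind a `.gz` file is would be to inspect its first bytes: a BGZF block is a gzip member whose extra field carries a ‘BC’ subfield. The sketch below is only an illustration of that check; the function name `looks_like_bgzf` is hypothetical, and it examines only the first block header.

```python
import gzip
import struct


def looks_like_bgzf(header: bytes) -> bool:
    """Return True if the bytes look like the start of a BGZF block.

    Checks the gzip magic, the deflate method, the FEXTRA flag, and then
    walks the extra subfields looking for 'BC' with a 2-byte payload.
    """
    if len(header) < 12:
        return False
    if header[0:3] != b"\x1f\x8b\x08":
        return False          # not gzip / not deflate
    if not header[3] & 0x04:
        return False          # no extra field, so plain gzip
    xlen = struct.unpack("<H", header[10:12])[0]
    extra = header[12:12 + xlen]
    pos = 0
    while pos + 4 <= len(extra):
        subfield_id = extra[pos:pos + 2]
        subfield_len = struct.unpack("<H", extra[pos + 2:pos + 4])[0]
        if subfield_id == b"BC" and subfield_len == 2:
            return True
        pos += 4 + subfield_len
    return False


def file_looks_like_bgzf(path: str) -> bool:
    """Read the first bytes of a file and apply looks_like_bgzf."""
    with open(path, "rb") as handle:
        return looks_like_bgzf(handle.read(64))
```

Calling `file_looks_like_bgzf("hg38.fa.gz")` before and after recompressing would show whether the file being indexed really is the bgzip version.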
@ 2018-08-16 15:13:09: Read chain_file: hg19ToHg38.over.chain.gz
@ 2018-08-16 15:13:09: Creating index for hg38.fa.gz
@ 2018-08-16 15:13:31: Updating contig field ...
@ 2018-08-16 15:13:32: Total entries: 351912
@ 2018-08-16 15:13:32: Failed to map: 351912
I checked the format and it's chr(n) format.
Example of format:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 1657261
chr1 13116 . T G 144 PASS AC=4;AC1=4;AF=1.00;AF1=1;AN=4;DP=40;DP4=0,0,25,0;FQ=-66;FS=0.000;MLEAC=4;MLEAF=1.00;MQ0=0;QD=30.67;VDB=2.371346e-01;VQSLOD=-3.213e+01;culprit=MQ;set=Samtools-filterInHaplotypeCaller GT:DP:GQ:PL 1/1:12:75:85,36,0
chr1 13118 . A G 144 PASS AC=4;AC1=4;AF=1.00;AF1=1;AN=4;DP=41;DP4=0,0,25,0;FQ=-66;FS=0.000;MLEAC=4;MLEAF=1.00;MQ0=0;QD=26.61;VDB=2.297144e-01;VQSLOD=-3.170e+01;culprit=MQ;set=Samtools-filterInHaplotypeCaller GT:DP:GQ:PL 1/1:12:75:85,36,0
VCF snippet attached.
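Since every entry failed to map, one thing that could be double-checked is whether the CHROM names in the VCF actually occur as source sequence names in the chain file. The Python sketch below illustrates the comparison. It assumes the UCSC chain layout, where each header line starts with `chain` and the third field is the name of the sequence being lifted from (hg19 here); the function names are hypothetical.

```python
def chain_source_names(chain_lines):
    """Collect the source sequence names from chain header lines.

    Assumption: header lines look like
    'chain score tName tSize tStrand tStart tEnd qName ...'
    and tName (field index 2) is the assembly being lifted from.
    """
    names = set()
    for line in chain_lines:
        if line.startswith("chain "):
            names.add(line.split()[2])
    return names


def vcf_chrom_names(vcf_lines):
    """Collect the distinct CHROM values from VCF data lines."""
    names = set()
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue
        # Split on any whitespace so space-pasted snippets also work.
        names.add(line.split(None, 1)[0])
    return names


def unmatched_chroms(vcf_lines, chain_lines):
    """Return VCF CHROM names that never appear as a chain source name."""
    return vcf_chrom_names(vcf_lines) - chain_source_names(chain_lines)
```

Both inputs are plain iterables of text lines, so a gzipped chain file could be opened with `gzip.open(path, "rt")` and passed in directly. An empty result would suggest that the naming is not the cause of the failure.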