Hello. I am unable to run elprep sfm because of a bufio.Scanner panic ("token too long") during the elprep filter step. I had no issues with either the vcf-to-elsites or the fasta-to-elfasta command. Please see below.
$ which go && go version
~/.conda/envs/WES_dev2/bin/go
go version go1.17.6 linux/amd64
$ which elprep && elprep
~/.conda/envs/WES_dev2/bin/elprep
elprep version 5.1.3 compiled with go1.17.6
The command I used:
elprep sfm $sam_path $bam_path \
    --output-type bam \
    --replace-read-group "ID:group1 LB:1 PL:illumina PU:unit1 SM:sample" \
    --mark-duplicates \
    --mark-optical-duplicates $dupmetrics \
    --remove-duplicates \
    --sorting-order coordinate \
    --bqsr $cal_tbl_path \
    --known-sites $dbsnp_elsites \
    --target-regions $bed_file \
    --reference $elref_path \
    --haplotypecaller $vcf_out \
    --timed
Output from my terminal:
elprep version 5.1.3 compiled with go1.17.6 - see http://github.com/exascience/elprep for more information.
2024/03/21 10:44:31 Created log file at /home/areese/logs/elprep/elprep-2024-03-21-10-44-31-633572959-EDT.log
2024/03/21 10:44:31 Command line: [elprep sfm /home/shared/projects/WES/PilotStudy/ILN2_2024AE/HTB-2D-TR1_elprep/map_reads/debug.reads_to_hg38.P14.sam /home/shared/projects/WES/PilotStudy/ILN2_2024AE/HTB-2D-TR1_elprep/map_reads/HTB-2D-TR1_pilotWES_reads_to_hg38.P14.processed.bam --output-type bam --replace-read-group ID:group1 LB:1 PL:illumina PU:unit1 SM:sample --mark-duplicates --mark-optical-duplicates /home/shared/projects/WES/PilotStudy/ILN2_2024AE/HTB-2D-TR1_elprep/map_reads/reads_to_hg38.P14.markedDuplicatesMetrics.txt --remove-duplicates --sorting-order coordinate --bqsr /home/shared/projects/WES/PilotStudy/ILN2_2024AE/HTB-2D-TR1_elprep/map_reads/bqsr_recalibration.tbl --known-sites /home/shared/databases/dbsnp/Homo_sapiens_assembly38.dbsnp138.elsites --target-regions /home/shared/projects/WES/hg38_Twist_ILMN_Exome_2.0_Plus_Panel_Combined_Mito.UCSC.bed --reference /home/shared/projects/RNA-seq/references/UCSC/latest/hg38.P14.elfasta --haplotypecaller /home/shared/projects/WES/PilotStudy/ILN2_2024AE/HTB-2D-TR1_elprep/map_reads/HTB-2D-TR1_pilotWES_reads_to_hg38.P14.vcf --timed]
2024/03/21 10:44:31 Executing command:
elprep sfm /home/shared/projects/WES/PilotStudy/ILN2_2024AE/HTB-2D-TR1_elprep/map_reads/debug.reads_to_hg38.P14.sam /home/shared/projects/WES/PilotStudy/ILN2_2024AE/HTB-2D-TR1_elprep/map_reads/HTB-2D-TR1_pilotWES_reads_to_hg38.P14.processed.bam --output-type bam --replace-read-group ID:group1 LB:1 PL:illumina PU:unit1 SM:sample --mark-duplicates --mark-optical-duplicates /home/shared/projects/WES/PilotStudy/ILN2_2024AE/HTB-2D-TR1_elprep/map_reads/reads_to_hg38.P14.markedDuplicatesMetrics.txt --optical-duplicates-pixel-distance 100 --remove-duplicates --bqsr /home/shared/projects/WES/PilotStudy/ILN2_2024AE/HTB-2D-TR1_elprep/map_reads/bqsr_recalibration.tbl --reference /home/shared/projects/RNA-seq/references/UCSC/latest/hg38.P14.elfasta --quantize-levels 0 --max-cycle 500 --known-sites /home/shared/databases/dbsnp/Homo_sapiens_assembly38.dbsnp138.elsites --target-regions /home/shared/projects/WES/hg38_Twist_ILMN_Exome_2.0_Plus_Panel_Combined_Mito.UCSC.bed --haplotypecaller /home/shared/projects/WES/PilotStudy/ILN2_2024AE/HTB-2D-TR1_elprep/map_reads/HTB-2D-TR1_pilotWES_reads_to_hg38.P14.vcf --sorting-order coordinate --timed --intermediate-files-output-prefix debug.reads_to_hg38.P14 --intermediate-files-output-type sam
2024/03/21 10:44:31 Splitting...
2024/03/21 10:53:23 Filtering (phase 1)...
2024/03/21 10:53:24 exit status 2
2024/03/21 10:53:23 Created log file at /home/areese/logs/elprep/elprep-2024-03-21-10-53-23-776329743-EDT.log
2024/03/21 10:53:23 Command line: [elprep filter /home/shared/repos/RNA-seq_develop/elprep-splits-f58c8369-975d-439f-8426-5b564ed24d3b/splits/debug.reads_to_hg38.P14-unmapped.sam /home/shared/repos/RNA-seq_develop/elprep-splits-processed-f58c8369-975d-439f-8426-5b564ed24d3b/debug.reads_to_hg38.P14-unmapped.sam --replace-read-group ID:group1 LB:1 PL:illumina PU:unit1 SM:sample --mark-duplicates --remove-duplicates --reference /home/shared/projects/RNA-seq/references/UCSC/latest/hg38.P14.elfasta --max-cycle 500 --known-sites /home/shared/databases/dbsnp/Homo_sapiens_assembly38.dbsnp138.elsites --sorting-order coordinate --timed --pg-cmd-line elprep sfm /home/shared/projects/WES/PilotStudy/ILN2_2024AE/HTB-2D-TR1_elprep/map_reads/debug.reads_to_hg38.P14.sam /home/shared/projects/WES/PilotStudy/ILN2_2024AE/HTB-2D-TR1_elprep/map_reads/HTB-2D-TR1_pilotWES_reads_to_hg38.P14.processed.bam --output-type bam --replace-read-group ID:group1 LB:1 PL:illumina PU:unit1 SM:sample --mark-duplicates --mark-optical-duplicates /home/shared/projects/WES/PilotStudy/ILN2_2024AE/HTB-2D-TR1_elprep/map_reads/reads_to_hg38.P14.markedDuplicatesMetrics.txt --optical-duplicates-pixel-distance 100 --remove-duplicates --bqsr /home/shared/projects/WES/PilotStudy/ILN2_2024AE/HTB-2D-TR1_elprep/map_reads/bqsr_recalibration.tbl --reference /home/shared/projects/RNA-seq/references/UCSC/latest/hg38.P14.elfasta --quantize-levels 0 --max-cycle 500 --known-sites /home/shared/databases/dbsnp/Homo_sapiens_assembly38.dbsnp138.elsites --target-regions /home/shared/projects/WES/hg38_Twist_ILMN_Exome_2.0_Plus_Panel_Combined_Mito.UCSC.bed --haplotypecaller /home/shared/projects/WES/PilotStudy/ILN2_2024AE/HTB-2D-TR1_elprep/map_reads/HTB-2D-TR1_pilotWES_reads_to_hg38.P14.vcf --sorting-order coordinate --timed --intermediate-files-output-prefix debug.reads_to_hg38.P14 --intermediate-files-output-type sam --bqsr-tables-only 
/home/shared/repos/RNA-seq_develop/elprep-tabs-f58c8369-975d-439f-8426-5b564ed24d3b/debug.reads_to_hg38.P14-unmapped.sam.elrecal --mark-optical-duplicates-intermediate /home/shared/repos/RNA-seq_develop/elprep-metrics-f58c8369-975d-439f-8426-5b564ed24d3b/debug.reads_to_hg38.P14-unmapped.sam --optical-duplicates-pixel-distance 100 --target-regions /home/shared/projects/WES/hg38_Twist_ILMN_Exome_2.0_Plus_Panel_Combined_Mito.UCSC.bed]
2024/03/21 10:53:24 bufio.Scanner: token too long
panic: bufio.Scanner: token too long

goroutine 1 [running]:
log.Panic({0xc0001c5458, 0xc002331890, 0xc002333750})
	/opt/conda/conda-bld/elprep_1651164620400/_build_env/go/src/log/log.go:354 +0x65
github.com/exascience/elprep/v5/bed.ParseBed({0x7ffd3bb9ba8c, 0xc00011e030})
	/opt/conda/conda-bld/elprep_1651164620400/work/bed/bed-files.go:57 +0x56b
github.com/exascience/elprep/v5/cmd.Filter()
	/opt/conda/conda-bld/elprep_1651164620400/work/cmd/filter.go:738 +0x353f
main.main()
	/opt/conda/conda-bld/elprep_1651164620400/work/main.go:67 +0x245
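The stack trace points to bed.ParseBed, so the panic appears to happen while parsing the BED file passed to --target-regions. Does the BED file also need to be converted into an elprep-specific file type, as the VCF and FASTA files were?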
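One thing I can rule out on my side is an overlong line in the BED file. Go's bufio.Scanner fails with "token too long" when a single line does not fit in its buffer, which is capped at 64 KiB by default (bufio.MaxScanTokenSize). Below is a minimal sketch of such a check. It assumes that elprep's BED parser uses that default limit, which I have not verified in the elprep source, and `find_long_lines` is a hypothetical helper name of my own, not part of elprep.

```python
# Default maximum token size of Go's bufio.Scanner (bufio.MaxScanTokenSize).
# Assumption: elprep's BED parser scans lines with this default limit.
GO_MAX_SCAN_TOKEN_SIZE = 64 * 1024


def find_long_lines(path, limit=GO_MAX_SCAN_TOKEN_SIZE):
    """Return (line_number, byte_length) for every line of at least `limit` bytes.

    Hypothetical helper for illustration only; it is not part of elprep.
    """
    long_lines = []
    # Read bytes and split on b"\n" only, so lengths are counted in bytes and a
    # file with CR-only line endings shows up as a single very long line.
    with open(path, "rb") as handle:
        for number, line in enumerate(handle, start=1):
            length = len(line.rstrip(b"\r\n"))
            if length >= limit:
                long_lines.append((number, length))
    return long_lines
```

Calling `find_long_lines` with the path of the file given to --target-regions returns an empty list when every line is under the limit. If it does report a line, likely causes would be a very long header line or line endings other than `\n`, for example CR-only endings, which would make the whole file look like one line to the scanner.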
Thanks.