THOR File Truncation Error and Stuck at Extension Sizes Step

Phoebe Valdes

Jun 9, 2025, 1:15:16 PM
to RGT Users
Hello RGT Developers,

I am using your differential peak calling program, THOR, for the first time. It has been a great tool for my ChIP-seq analysis, but I've been running into some issues. The run either gets stuck at the "Computing read extension sizes for ChIP-seq profiles" step or fails with the following error:

Call DPs on whole genome.
Computing read extension sizes for ChIP-seq profiles
[W::hts_idx_load3] The index file is older than the data file: /tscc/lustre/ddn/scratch/prvaldes/ROSMAP/ChIP-seq/THOR/Cluster2vs.Cluster0.F/R1073074.bb.mapped.sorted.md.bam.bai
[W::hts_idx_load3] The index file is older than the data file: /tscc/lustre/ddn/scratch/prvaldes/ROSMAP/ChIP-seq/THOR/Cluster2vs.Cluster0.F/R1073074.bb.mapped.sorted.md.bam.bai
[W::hts_idx_load3] The index file is older than the data file: /tscc/lustre/ddn/scratch/prvaldes/ROSMAP/ChIP-seq/THOR/Cluster2vs.Cluster0.F/R1133844.bb.mapped.sorted.md.bam.bai
[W::hts_idx_load3] The index file is older than the data file: /tscc/lustre/ddn/scratch/prvaldes/ROSMAP/ChIP-seq/THOR/Cluster2vs.Cluster0.F/R1133844.bb.mapped.sorted.md.bam.bai
Traceback (most recent call last):
  File "/tscc/nfs/home/prvaldes/anaconda3/bin/rgt-THOR", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/tscc/nfs/home/prvaldes/anaconda3/lib/python3.11/site-packages/rgt/THOR/THOR.py", line 155, in main
    m, exp_data, func_para, init_mu, init_alpha, distr = train_HMM(region_giver, options, bamfiles, genome,
                                                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/tscc/nfs/home/prvaldes/anaconda3/lib/python3.11/site-packages/rgt/THOR/THOR.py", line 63, in train_HMM
    exp_data = initialize(name=options.name, dims=dims, genome_path=genome, regions=train_regions,
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/tscc/nfs/home/prvaldes/anaconda3/lib/python3.11/site-packages/rgt/THOR/dpc_help.py", line 435, in initialize
    exts, exts_inputs = _compute_extension_sizes(bamfiles, exts, inputs, exts_inputs, report)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/tscc/nfs/home/prvaldes/anaconda3/lib/python3.11/site-packages/rgt/THOR/dpc_help.py", line 396, in _compute_extension_sizes
    e, ext_data = get_extension_size(bamfile, start=start, end=end, stepsize=ext_stepsize)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/tscc/nfs/home/prvaldes/anaconda3/lib/python3.11/site-packages/rgt/THOR/get_extension_size.py", line 102, in get_extension_size
    read_length = math.ceil(get_read_size(filename))
                            ^^^^^^^^^^^^^^^^^^^^^^^
  File "/tscc/nfs/home/prvaldes/anaconda3/lib/python3.11/site-packages/rgt/THOR/get_extension_size.py", line 52, in get_read_size
    f = pysam.Samfile(filename, "rb")
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "pysam/libcalignmentfile.pyx", line 748, in pysam.libcalignmentfile.AlignmentFile.__cinit__
  File "pysam/libcalignmentfile.pyx", line 958, in pysam.libcalignmentfile.AlignmentFile._open
  File "pysam/libchtslib.pyx", line 361, in pysam.libchtslib.HTSFile.check_truncation
OSError: no BGZF EOF marker; file may be truncated

Would you happen to know how to resolve these issues?
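
For context on the final OSError: a complete BAM file ends with a fixed 28-byte empty BGZF block, and the error is raised when that block is missing. Below is a minimal sketch of one way to test each input file for the marker before running THOR. The function name `has_bgzf_eof` is hypothetical and is not part of RGT or pysam; the check only looks at the last 28 bytes and does not validate the rest of the file.

```python
import os

# The 28-byte empty BGZF block that marks the end of a complete BAM file
# (defined in the SAM/BAM format specification).
BGZF_EOF = bytes.fromhex(
    "1f8b0804" "00000000" "00ff" "0600" "4243" "0200"
    "1b00" "0300" "00000000" "00000000"
)


def has_bgzf_eof(path):
    """Return True if the file at `path` ends with the BGZF EOF marker.

    Hypothetical helper for illustration only; it does not check that
    the rest of the file is a valid BAM.
    """
    if os.path.getsize(path) < len(BGZF_EOF):
        return False
    with open(path, "rb") as handle:
        # Jump to the last 28 bytes and compare them with the marker.
        handle.seek(-len(BGZF_EOF), os.SEEK_END)
        return handle.read() == BGZF_EOF
```

Calling `has_bgzf_eof` on each BAM path listed in the config file would show which inputs lack the marker. A file without it was probably copied incompletely or was still being written when it was read, although that is an assumption to verify against the original data. `samtools quickcheck` performs a similar end-of-file test from the command line.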


Also, in case the installation could be causing the issues above: I installed THOR as part of the RGT package, version 1.0.2, using the following pip command:

(base) [prvaldes@login2 ~]$ pip install RGT
Collecting RGT
  Downloading RGT-1.0.2.tar.gz (36.8 MB)
Successfully installed Biopython-1.85 HTSeq-2.0.9 RGT-1.0.2 adjustText-1.3.0 fisher-0.1.14 hmmlearn-0.2.2 logomaker-0.8.7 matplotlib_venn-1.1.2 moods-python-1.9.4.1 natsort-8.4.0 pyBigWig-0.3.24 pyx-0.16

Finally, these are the options I am using to run THOR:

/tscc/nfs/home/prvaldes/anaconda3/bin/rgt-THOR THOR.C0.C2.F.config -n c2vsc0.F --merge --no-merge-bin --output-dir /tscc/nfs/home/prvaldes/scratch_new/ROSMAP/ChIP-seq/THOR/Output/Cluster2vs.Cluster0.F --report --pvalue=0.05

Thank you for your help!

Phoebe
