Sequenza v4.0.0 Error/Issue Debugging help required

NILESH MUKHERJEE

Mar 11, 2025, 2:01:55 AM
to Sequenza User Group
Hello Francesco,

First of all, thank you for your work on the Sequenza project.
I am reaching out about the errors and issues I faced while working with Sequenza v4.0.0.

After some troubleshooting, I have successfully installed it on my Ubuntu system.
Here are my system specs, in case they are needed:
# System Details Report
## Hardware Information:
- **Hardware Model:**                              ASUS PRIME B560M-A
- **Memory:**                                      64.0 GiB
- **Processor:**                                   11th Gen Intel® Core™ i5-11400F × 12
- **Graphics:**                                    NV106
- **Disk Capacity:**                               4.0 TB

## Software Information:
- **Firmware Version:**                            1410
- **OS Name:**                                     Ubuntu 24.04.2 LTS
- **OS Build:**                                    (null)
- **OS Type:**                                     64-bit
- **GNOME Version:**                               46
- **Windowing System:**                            X11
- **Kernel Version:**                              Linux 6.11.0-19-generic

I tried running the analysis on the example file provided in the package, but it led to an error. I have tried debugging to narrow down the issue.
The entire run (the basic run on the example file, along with the debug runs) is provided in the HTML file I have attached.

Disclaimer: I am new to debugging R packages, and I wanted to provide you with the entire run while keeping it coherent, hence I made it into an HTML file.

Here is a breakdown of the debug file, with line numbers for reference:
1-62: R session start in Valgrind with the memory check tool, with session details
63-490: Sequenza example file run with default settings, leading to an error (#ERROR)
491-627: traceback output of the error
628-631: check of whether any data was assigned to containers$mutation.list or to the test variable; both gave #ERROR Object not found
632-788: Debug #1: run of the extract function with parallel set to 1, to rule out a parallel-processing error; it turned out that params was not found (#ERROR)
789-791: Debug #2: check of extract_initialize_parameters to see what happened to params; got #ERROR Object not found
792-958: inspection of what happens inside the sequenza.extract function
959-1142: Debug #3
1143-1302: Inspection of all functions and objects in the Sequenza namespace
1303-5739: Debug #4: FULL RUN
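
A side note on the "Object not found" results and the namespace inspection: in R, a function that a package does not export is not visible at the prompt even when it exists, so "Object not found" on its own does not prove the function is missing. Below is a minimal sketch of how this could be checked. The helper `where_in_package` is hypothetical (it is not part of sequenza), whether extract_initialize_parameters is exported is an assumption to verify, and the demonstration uses base R's stats package so that it runs without sequenza installed.

```r
# Hypothetical helper (not part of sequenza): report whether a name exists in
# a package namespace and whether the package exports it.
where_in_package <- function(name, pkg) {
  ns <- asNamespace(pkg)  # loads the namespace if needed, without attaching it
  c(
    in_namespace = exists(name, envir = ns, inherits = FALSE),
    exported     = name %in% getNamespaceExports(pkg)
  )
}

# Demonstrated on "stats" so the snippet runs anywhere.
where_in_package("sd", "stats")  # an exported function

# Names that are in the namespace but not exported are internal: they give
# "object not found" at the prompt unless accessed through the namespace.
internal <- setdiff(ls(asNamespace("stats")), getNamespaceExports("stats"))
where_in_package(internal[1], "stats")

# The same check for sequenza (assumes sequenza is installed):
# where_in_package("extract_initialize_parameters", "sequenza")
# An internal function can be reached with pkg:::name or with
# get("name", envir = asNamespace("pkg")).
```

Note that a variable such as params, if it is local to a function, is likewise only visible while that function is being debugged, not from the top-level prompt.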

As you can see, I had to run the R session inside a Valgrind memory-check terminal session, because the basic run was crashing and restarting RStudio, which did not allow me to run traceback().

Example 1 of R crashing in direct run: Segmentation fault / segfault
```
(base) bioinfo@bioinfo:~$ R

R version 4.4.3 (2025-02-28) -- "Trophy Case"
Copyright (C) 2025 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(sequenza)
> sessionInfo()
R version 4.4.3 (2025-02-28)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 24.04.2 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.12.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8  
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C      

time zone: Asia/Kolkata
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base    

other attached packages:
[1] sequenza_4.0.0

loaded via a namespace (and not attached):
 [1] vctrs_0.6.5        cli_3.6.4          rlang_1.1.5        stringi_1.8.4    
 [5] KernSmooth_2.23-26 mclust_6.1.1       generics_0.1.3     glue_1.8.0        
 [9] squash_1.0.9       pracma_2.4.4       hms_1.1.3          Runuran_0.40      
[13] grid_4.4.3         tibble_3.2.1       ks_1.14.3          tzdb_0.4.0        
[17] mvtnorm_1.3-3      lifecycle_1.0.4    stringr_1.5.1      compiler_4.4.3    
[21] dplyr_1.1.4        Rcpp_1.0.14        pkgconfig_2.0.3    pbapply_1.7-2    
[25] digest_0.6.37      lattice_0.22-5     iotools_0.3-5      R6_2.6.1          
[29] readr_2.1.5        tidyselect_1.2.1   seqminer_9.7       pillar_1.10.1    
[33] parallel_4.4.3     magrittr_2.0.3     Matrix_1.7-2       tools_4.4.3      
> test <- sequenza.extract("/home/bioinfo/R_libs/sequenza/extdata/example.seqz.txt.gz", assembly = "hg38")
Collecting GC information . done
                                                                             
Windows calculation results for chromosome 1:
  ratio entries: 498
  raw_ratio entries: 498
  BAF entries: 498
Segmenting depth ratios
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|

 *** caught segfault ***
address 0x70ec2a792fe8, cause 'invalid permissions'

 *** caught segfault ***
address 0x70ec2a792ff8, cause 'invalid permissions'

 *** caught segfault ***
address 0x70ec2a792fe8, cause 'invalid permissions'
Lost warning messages
Error: no more error handlers available (recursive errors?); invoking 'abort' restart
```
Example 2 of R crashing in direct run: R_Reprotect
```
(base) bioinfo@bioinfo:~$ R

R version 4.4.3 (2025-02-28) -- "Trophy Case"
Copyright (C) 2025 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(sequenza)
> sessionInfo()
R version 4.4.3 (2025-02-28)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 24.04.2 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.12.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8  
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C      

time zone: Asia/Kolkata
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base    

other attached packages:
[1] sequenza_4.0.0

loaded via a namespace (and not attached):
 [1] vctrs_0.6.5        cli_3.6.4          rlang_1.1.5        stringi_1.8.4    
 [5] KernSmooth_2.23-26 mclust_6.1.1       generics_0.1.3     glue_1.8.0        
 [9] squash_1.0.9       pracma_2.4.4       hms_1.1.3          Runuran_0.40      
[13] grid_4.4.3         tibble_3.2.1       ks_1.14.3          tzdb_0.4.0        
[17] mvtnorm_1.3-3      lifecycle_1.0.4    stringr_1.5.1      compiler_4.4.3    
[21] dplyr_1.1.4        Rcpp_1.0.14        pkgconfig_2.0.3    pbapply_1.7-2    
[25] digest_0.6.37      lattice_0.22-5     iotools_0.3-5      R6_2.6.1          
[29] readr_2.1.5        tidyselect_1.2.1   seqminer_9.7       pillar_1.10.1    
[33] parallel_4.4.3     magrittr_2.0.3     Matrix_1.7-2       tools_4.4.3      
> test <- sequenza.extract("/home/bioinfo/R_libs/sequenza/extdata/example.seqz.txt.gz", assembly = "hg38")
Collecting GC information . done
                                                                             
Windows calculation results for chromosome 1:
  ratio entries: 498
  raw_ratio entries: 498
  BAF entries: 498
Segmenting depth ratios
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Warning: Processing chromosome 1 failed: R_Reprotect: only -435 protected items, can't reprotect index -438

Windows calculation results for chromosome 2:
  ratio entries: 485
  raw_ratio entries: 485
  BAF entries: 485
Segmenting depth ratios
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Warning: Processing chromosome 2 failed: R_Reprotect: only -447 protected items, can't reprotect index -450

Windows calculation results for chromosome 3:
  ratio entries: 395
  raw_ratio entries: 395
  BAF entries: 395
Segmenting depth ratios
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Warning: Processing chromosome 3 failed: R_Reprotect: only -133 protected items, can't reprotect index -136

Windows calculation results for chromosome 4:
  ratio entries: 381
  raw_ratio entries: 381
  BAF entries: 381
Segmenting depth ratios
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Segmenting allele frequencies
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Warning: Processing chromosome 4 failed: R_Reprotect: only -235 protected items, can't reprotect index -238

Windows calculation results for chromosome 5:
  ratio entries: 361
  raw_ratio entries: 361
  BAF entries: 361
Segmenting depth ratios
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Segmenting allele frequencies
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Error in simpleError(msg, call) : could not find function "simpleError"
In addition: There were 50 or more warnings (use warnings() to see the first 50)

```
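
Since crashes like the two above take down or corrupt the interactive session, one option besides Valgrind is to run the failing call in a throwaway child process and only inspect its exit status and output. The following is a minimal sketch under stated assumptions: the helper `run_in_child` is hypothetical, and the commented sequenza call uses a placeholder path.

```r
# Hypothetical helper: evaluate a string of R code in a fresh Rscript process,
# so that a crash in the child cannot take down the calling session.
run_in_child <- function(code) {
  rscript <- file.path(R.home("bin"), "Rscript")
  # system2() warns on a non-zero exit status; the status is read from the
  # "status" attribute instead, so the warning is suppressed here.
  out <- suppressWarnings(
    system2(rscript, c("-e", shQuote(code)), stdout = TRUE, stderr = TRUE)
  )
  status <- attr(out, "status")
  list(
    status = if (is.null(status)) 0L else as.integer(status),
    output = as.character(out)
  )
}

# A child that finishes normally, and one that exits with a non-zero status.
ok  <- run_in_child("cat('ok\\n')")
bad <- run_in_child("quit(status = 3)")

# Illustration only (placeholder path, assumes sequenza is installed):
# res <- run_in_child(paste(
#   "library(sequenza);",
#   "test <- sequenza.extract('path/to/example.seqz.txt.gz', assembly = 'hg38');",
#   "saveRDS(test, 'test.rds')"
# ))
# res$status  # non-zero if the child crashed or stopped with an error
```

This does not fix the crash; it only keeps the parent session alive so the captured output can be saved and compared between runs.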
I also came across some more errors:
Error Ref#1: I have seen this error pop up near the end of a run.
Screenshot from 2025-03-10 11-15-33.png
Error Ref#2: I have seen this error happen sometimes right after the "Collecting GC information" line.
It used to happen a lot when I was running the scarHRD package, which calls sequenza.extract internally. I figured out that scarHRD uses "grch38" as its default value, which I believe gets passed on to the assembly argument of sequenza.extract and is then substituted into the UCSC goldenpath URL. When I run sequenza directly and provide the assembly as "hg38", it works 8 out of 10 times; the 2 times it does not work, it gives the following error:
Screenshot from 2025-03-10 11-24-40.png
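
If the reading above is right, two workarounds on the caller's side would be to translate the assembly name to its UCSC form before calling sequenza.extract, and to retry a call that fails intermittently. The sketch below only illustrates that idea: `ucsc_names`, `normalize_assembly` and `with_retry` are hypothetical names, not part of sequenza or scarHRD, and the assumption that the intermittent failure is a transient download problem is unverified.

```r
# Hypothetical alias table: GRC-style assembly names mapped to UCSC names.
ucsc_names <- c(grch37 = "hg19", grch38 = "hg38", hg19 = "hg19", hg38 = "hg38")

# Translate an assembly name to the UCSC form, failing loudly on unknown input.
normalize_assembly <- function(assembly) {
  key <- tolower(assembly)
  if (!key %in% names(ucsc_names)) {
    stop("Unknown assembly: ", assembly)
  }
  unname(ucsc_names[key])
}

# Hypothetical retry wrapper: call fun() up to `tries` times, waiting `wait`
# seconds between attempts, and re-raise the last error if all attempts fail.
with_retry <- function(fun, tries = 3, wait = 1) {
  last_error <- NULL
  for (i in seq_len(tries)) {
    result <- tryCatch(fun(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    last_error <- result
    if (i < tries) Sys.sleep(wait)
  }
  stop(last_error)
}

# Illustration only (seqz_file is a placeholder, assumes sequenza is installed):
# test <- with_retry(function() {
#   sequenza.extract(seqz_file, assembly = normalize_assembly("grch38"))
# })
```

A retry only helps with R-level errors; it cannot recover from a segfault in compiled code.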

So far this is the information I have gathered. I believe:
1. There is an internal variable-assignment error: the extract_initialize_parameters function may not be working as intended, since params was coming back NULL, which I think breaks the rest of the cascade.
2. When I assigned values to the variables locally and continued executing sequenza.extract in the debug session, I noticed that when a chromosome did not have the required information, its table was set to NULL, and that made the information collected for previous chromosomes NULL too. If I am not wrong, you can see this happening in Debug #4. Information was being collected for various chromosomes and added to containers$mutation.list, but whenever the process came across a chromosome with a warning like:
```
Warning: Processing chromosome 16 failed: cannot open the connection
```
it made the previously collected information NULL too.
I believe the output of containers$mutation.list at line 4060 shows what I am referring to.
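
To illustrate the behaviour I would have expected in point 2, here is a minimal sketch of a per-chromosome loop in which a failure on one chromosome leaves the results of earlier chromosomes untouched. It is not sequenza's actual code: `process_chromosome` and `collect_per_chromosome` are hypothetical stand-ins, and the failure on chromosome 16 is simulated.

```r
# Hypothetical stand-in for the per-chromosome work; chromosome "16" is made
# to fail with the same message as in the warning above.
process_chromosome <- function(chr) {
  if (chr == "16") stop("cannot open the connection")
  data.frame(chromosome = chr, stringsAsFactors = FALSE)
}

# Collect results per chromosome; a failure is recorded and skipped instead of
# overwriting what has already been collected.
collect_per_chromosome <- function(chromosomes, worker) {
  results <- list()
  failed <- character(0)
  for (chr in chromosomes) {
    res <- tryCatch(worker(chr), error = function(e) {
      warning("Processing chromosome ", chr, " failed: ",
              conditionMessage(e), call. = FALSE)
      NULL
    })
    if (is.null(res)) {
      # Note: `results[[chr]] <- NULL` would delete an element rather than
      # store a NULL, so failures are tracked in a separate vector.
      failed <- c(failed, chr)
    } else {
      results[[chr]] <- res
    }
  }
  list(results = results, failed = failed)
}

out <- suppressWarnings(
  collect_per_chromosome(c("15", "16", "17"), process_chromosome)
)
names(out$results)  # "15" "17"
out$failed          # "16"
```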


That is all I have to share. I apologise for the information dump.
Please let me know if you need further information so that I can help in fixing this issue.
Looking forward to your reply!
Thank you for your time.

Best Regards,
Nilesh M.
sequenza-extract_debug_console_record.html

Francesco Favero

Mar 19, 2025, 7:43:16 PM
to NILESH MUKHERJEE, Sequenza User Group
Hi Nilesh,

I messed up by adding some optimizations that work only on my laptop, which uses clang to compile the package.
Unfortunately I did not realize it until now, when I had to run samples on our cluster :), so I have to identify the breaking changes in the few commits I made last month.

I'm fixing and testing the latest version; hopefully it will be fully fixed in a few days.

Best regards

Francesco
