Hello Meren,
hmmscan works (output all the way down below).
However, I think I may have narrowed down the problem.
The segfaults are not really consistent (see output below): I can't get anvi-gen-contigs-database (and other scripts) to behave the same way twice.
It is almost as if anvi-gen-contigs-database is calling something, a dependency or a library, that has two (maybe more?) options, and every time it just randomly selects one of them.
If anvi-gen-contigs-database works and creates the contigs db, I can use it downstream, but then I eventually get the same behavior:
sometimes it works, sometimes it just shows a segmentation fault (and sometimes it seems to have worked, then shows the segfault message at the end).
I saw this behavior happening for:
anvi-gen-contigs-database
anvi-run-hmms
anvi-db-info
(and I tried adding --debug on the above scripts... same behavior, but more output... ).
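Since --debug gives more output but no hint about where the crash happens, Python's built-in faulthandler might help: when it is enabled, the interpreter prints a Python-level traceback on SIGSEGV. Below is a minimal sketch, assuming the anvi'o programs are ordinary Python scripts that inherit environment variables; `run_with_faulthandler` is a hypothetical helper name, not part of anvi'o.

```python
import os
import subprocess


def run_with_faulthandler(cmd):
    """Run cmd with PYTHONFAULTHANDLER=1 and return (returncode, stderr).

    With this variable set, a Python interpreter started by cmd dumps the
    Python traceback to stderr when it receives SIGSEGV.
    """
    env = dict(os.environ, PYTHONFAULTHANDLER="1")
    proc = subprocess.run(cmd, env=env, stderr=subprocess.PIPE, text=True)
    return proc.returncode, proc.stderr


# Hypothetical usage (the path is whatever contigs db you are testing):
# code, err = run_with_faulthandler(["anvi-db-info", "test-output/CONTIGS6.db"])
# print(code)
# print(err)
```

If the traceback ends in the same module on every crash, that would at least point to the library involved.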
However, for:
anvi-profile
anvi-import-taxonomy-for-layers
anvi-merge
although they sometimes throw the segfault directly, that seems to be much rarer. Most often the program appears to run fine and just prints the segfault at the end:
* Happy 😇
✓ anvi-profile took 0:00:11.749114
Segmentation fault
or:
* t_genus (w/2 items)
* t_family (w/2 items)
* t_species (w/3 items)
* Happy ☘
Segmentation fault
And anvi-import-functions never seems to give segfaults.
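To put numbers on "sometimes", "much rarer" and "never", each command could be run many times and the crashes counted. This is a minimal sketch, assuming a POSIX system; `count_segfaults` is a hypothetical helper, and the default of 20 runs is arbitrary.

```python
import signal
import subprocess


def count_segfaults(cmd, runs=20):
    """Run cmd `runs` times and return how many runs died from SIGSEGV."""
    # subprocess reports death-by-signal as a negative return code;
    # a shell wrapper would instead exit with 128 + signal number.
    segv_codes = (-signal.SIGSEGV, 128 + signal.SIGSEGV)
    segfaults = 0
    for _ in range(runs):
        proc = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        if proc.returncode in segv_codes:
            segfaults += 1
    return segfaults


# Hypothetical usage:
# print(count_segfaults(["anvi-db-info", "test-output/CONTIGS6.db"]))
```

Note that this counts both kinds of crash the same way: a segfault at startup and a segfault after the program has printed all of its output.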
Again, no Python wizard here, but the problem doesn't seem to be an outside program (at least I can call version/help for all of them from the command line); it seems to be something in how Python is dealing with the code and its dependencies/modules (isn't the conda env supposed to keep things nice and self-contained anyway?).
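One way to check whether the env really is self-contained is to list the shared libraries a Python process has loaded and flag those that live outside the env. The sketch below is Linux-only (it reads /proc/self/maps); the function names are hypothetical, and sqlite3 is only an example of a module to load before looking. System libraries such as glibc are expected to show up outside the env, so the interesting entries would be anything else.

```python
import sys


def libraries_outside_prefix(maps_text, prefix):
    """Return sorted paths of shared objects in maps_text not under prefix.

    maps_text is the content of /proc/<pid>/maps.
    """
    outside = set()
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode [path]
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no path
        path = fields[5].strip()
        if ".so" in path and not path.startswith(prefix):
            outside.add(path)
    return sorted(outside)


def check_current_process():
    """List shared objects loaded by this interpreter from outside sys.prefix."""
    import sqlite3  # noqa: F401  (example module; load it before inspecting)
    with open("/proc/self/maps") as handle:
        return libraries_outside_prefix(handle.read(), sys.prefix)


# Hypothetical usage, run with the env's python:
# for path in check_current_process():
#     print(path)
```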
Hopefully these tests can help shed some light onto it.
Let me know what I need to do now and I will do it.
Thanks!
Chris
### series of tests, running the program calls from ./run_mini_test.sh independently
### sometimes I tried different --num-threads values, because it seemed related to the issue... but I'm not sure about that
cd ~/anvioTestOutput
export output_dir=/home/iaf/christian.hoffmann/anvioTestOutput/test-output
export files=/home/iaf/christian.hoffmann/github/anvio/anvio/tests/sandbox
(anvio-master) christian.hoffmann@scm01:~/anvioTestOutput> anvi-gen-contigs-database -f $files/contigs.fa -o /home/iaf/christian.hoffmann/anvioTestOutput/test-output/CONTIGS.db -L 1000 --project-name "Contigs DB for anvi'o mini self-test" --num-threads 1
Segmentation fault
(anvio-master) christian.hoffmann@scm01:~/anvioTestOutput> anvi-gen-contigs-database -f $files/contigs.fa -o /home/iaf/christian.hoffmann/anvioTestOutput/test-output/CONTIGS.db -L 1000 --project-name "Contigs DB for anvi'o mini self-test" --num-threads 24
Segmentation fault
(anvio-master) christian.hoffmann@scm01:~/anvioTestOutput> anvi-gen-contigs-database -f $files/contigs.fa -o /home/iaf/christian.hoffmann/anvioTestOutput/test-output/CONTIGS.db -L 1000 --project-name "Contigs DB for anvi'o mini self-test" --num-threads 1
Segmentation fault
(anvio-master) christian.hoffmann@scm01:~/anvioTestOutput> anvi-gen-contigs-database -f $files/contigs.fa -o /home/iaf/christian.hoffmann/anvioTestOutput/test-output/CONTIGS5.db -L 1000 --project-name "Contigs DB for anvi'o mini self-test" --num-threads 1
Input FASTA file .............................: /home/iaf/christian.hoffmann/github/anvio/anvio/tests/sandbox/contigs.fa
Name .........................................: Contigs DB for anvi'o mini self-test
Description ..................................: No description is given
Num threads for gene calling .................: 1
Finding ORFs in contigs
===============================================
Genes ........................................: /home/iaf/christian.hoffmann/tmp/tmpbi59hsct/contigs.genes
Amino acid sequences .........................: /home/iaf/christian.hoffmann/tmp/tmpbi59hsct/contigs.amino_acid_sequences
Log file .....................................: /home/iaf/christian.hoffmann/tmp/tmpbi59hsct/00_log.txt
CITATION
===============================================
Anvi'o will use 'prodigal' by Hyatt et al (doi:10.1186/1471-2105-11-119) to
identify open reading frames in your data. When you publish your findings,
please do not forget to properly credit their work.
Result .......................................: Prodigal (v2.6.3) has identified 51 genes.
CONTIGS DB CREATE REPORT
===============================================
Split Length .................................: 1,000
K-mer size ...................................: 4
Skip gene calling? ...........................: False
External gene calls provided? ................: False
Ignoring internal stop codons? ...............: False
Splitting pays attention to gene calls? ......: True
Contigs with at least one gene call ..........: 6 of 6 (100.0%)
Contigs database .............................: A new database, /home/iaf/christian.hoffmann/anvioTestOutput/test-output/CONTIGS5.db, has been created.
Number of contigs ............................: 6
Number of splits .............................: 38
Total number of nucleotides ..................: 57,030
Gene calling step skipped ....................: False
Splits broke genes (non-mindful mode) ........: False
Desired split length (what the user wanted) ..: 1,000
Average split length (what anvi'o gave back) .: 1,620
(anvio-master) christian.hoffmann@scm01:~/anvioTestOutput> ls test-output/
CONTIGS5.db contigs.fa contigs-reformat-report.txt SAMPLE-01.bam SAMPLE-01.bam.bai SAMPLE-02.bam SAMPLE-02.bam.bai SAMPLE-03.bam SAMPLE-03.bam.bai
(anvio-master) christian.hoffmann@scm01:~/anvioTestOutput> anvi-gen-contigs-database -f $files/contigs.fa -o /home/iaf/christian.hoffmann/anvioTestOutput/test-output/CONTIGS.db -L 1000 --project-name "Contigs DB for anvi'o mini self-test" --num-threads 1
Input FASTA file .............................: /home/iaf/christian.hoffmann/github/anvio/anvio/tests/sandbox/contigs.fa
Name .........................................: Contigs DB for anvi'o mini self-test
Description ..................................: No description is given
Num threads for gene calling .................: 1
Finding ORFs in contigs
===============================================
Genes ........................................: /home/iaf/christian.hoffmann/tmp/tmpwzmsq3pi/contigs.genes
Amino acid sequences .........................: /home/iaf/christian.hoffmann/tmp/tmpwzmsq3pi/contigs.amino_acid_sequences
Log file .....................................: /home/iaf/christian.hoffmann/tmp/tmpwzmsq3pi/00_log.txt
CITATION
===============================================
Anvi'o will use 'prodigal' by Hyatt et al (doi:10.1186/1471-2105-11-119) to
identify open reading frames in your data. When you publish your findings,
please do not forget to properly credit their work.
Result .......................................: Prodigal (v2.6.3) has identified 51 genes.
CONTIGS DB CREATE REPORT
===============================================
Split Length .................................: 1,000
K-mer size ...................................: 4
Skip gene calling? ...........................: False
External gene calls provided? ................: False
Ignoring internal stop codons? ...............: False
Splitting pays attention to gene calls? ......: True
Contigs with at least one gene call ..........: 6 of 6 (100.0%)
Contigs database .............................: A new database, /home/iaf/christian.hoffmann/anvioTestOutput/test-output/CONTIGS.db, has been created.
Number of contigs ............................: 6
Number of splits .............................: 38
Total number of nucleotides ..................: 57,030
Gene calling step skipped ....................: False
Splits broke genes (non-mindful mode) ........: False
Desired split length (what the user wanted) ..: 1,000
Average split length (what anvi'o gave back) .: 1,620
####### running on a contig.db that was created with no apparent errors:
(anvio-master) christian.hoffmann@scm01:~/anvioTestOutput> anvi-run-hmms -c /home/iaf/christian.hoffmann/anvioTestOutput/test-output/CONTIGS6.db --num-threads 1 --just-do-it
WARNING
===============================================
Previous entries for "Ribosomal_RNAs" is being removed from "hmm_hits_info,
hmm_hits, hmm_hits_in_splits, genes_in_contigs, gene_functions"
WARNING
===============================================
Previous entries for "Bacteria_71" is being removed from "hmm_hits_info,
hmm_hits, hmm_hits_in_splits, genes_in_contigs, gene_functions"
WARNING
===============================================
Previous entries for "Archaea_76" is being removed from "hmm_hits_info,
hmm_hits, hmm_hits_in_splits, genes_in_contigs, gene_functions"
WARNING
===============================================
Previous entries for "Protista_83" is being removed from "hmm_hits_info,
hmm_hits, hmm_hits_in_splits, genes_in_contigs, gene_functions"
Target found .................................: AA:GENE
Target found .................................: RNA:CONTIG
HMM Profiling for Ribosomal_RNAs
===============================================
Kind .........................................: Ribosomal_RNAs
Alphabet .....................................: RNA
Context ......................................: CONTIG
Domain .......................................: N/A
HMM model path ...............................: /home/iaf/christian.hoffmann/tmp/tmpjcmyqv6y/Ribosomal_RNAs.hmm
Number of genes in HMM model .................: 12
Noise cutoff term(s) .........................: --cut_ga
Number of CPUs will be used for search .......: 1
HMMer program used for search ................: nhmmscan
Temporary work dir ...........................: /home/iaf/christian.hoffmann/tmp/tmp1ro2l74d
Log file for thread 0 ........................: /home/iaf/christian.hoffmann/tmp/tmp1ro2l74d/RNA_contig_sequences.fa.0_log
Number of raw hits ...........................: 0
* The HMM source 'Ribosomal_RNAs' returned 0 hits. SAD (but it's stil OK).
HMM Profiling for Protista_83
===============================================
Kind .........................................: singlecopy
Alphabet .....................................: AA
Context ......................................: GENE
Domain .......................................: eukarya
HMM model path ...............................: /home/iaf/christian.hoffmann/tmp/tmpjcmyqv6y/Protista_83.hmm
Number of genes in HMM model .................: 83
Noise cutoff term(s) .........................: -E 1e-25
Number of CPUs will be used for search .......: 1
HMMer program used for search ................: hmmscan
Temporary work dir ...........................: /home/iaf/christian.hoffmann/tmp/tmpdzy71m4f
Log file for thread 0 ........................: /home/iaf/christian.hoffmann/tmp/tmpdzy71m4f/AA_gene_sequences.fa.0_log
Number of raw hits ...........................: 0
* The HMM source 'Protista_83' returned 0 hits. SAD (but it's stil OK).
HMM Profiling for Bacteria_71
===============================================
Kind .........................................: singlecopy
Alphabet .....................................: AA
Context ......................................: GENE
Domain .......................................: bacteria
HMM model path ...............................: /home/iaf/christian.hoffmann/tmp/tmpjcmyqv6y/Bacteria_71.hmm
Number of genes in HMM model .................: 71
Noise cutoff term(s) .........................: --cut_ga
Number of CPUs will be used for search .......: 1
HMMer program used for search ................: hmmscan
Temporary work dir ...........................: /home/iaf/christian.hoffmann/tmp/tmpdzy71m4f
Log file for thread 0 ........................: /home/iaf/christian.hoffmann/tmp/tmpdzy71m4f/AA_gene_sequences.fa.0_log
Number of raw hits ...........................: 1
Number of weak hits removed ..................: 0
Number of hits in annotation dict ...........: 1
HMM Profiling for Archaea_76
===============================================
Kind .........................................: singlecopy
Alphabet .....................................: AA
Context ......................................: GENE
Domain .......................................: archaea
HMM model path ...............................: /home/iaf/christian.hoffmann/tmp/tmpjcmyqv6y/Archaea_76.hmm
Number of genes in HMM model .................: 76
Noise cutoff term(s) .........................: --cut_ga
Number of CPUs will be used for search .......: 1
HMMer program used for search ................: hmmscan
Temporary work dir ...........................: /home/iaf/christian.hoffmann/tmp/tmpdzy71m4f
Log file for thread 0 ........................: /home/iaf/christian.hoffmann/tmp/tmpdzy71m4f/AA_gene_sequences.fa.0_log
Number of raw hits ...........................: 1
Number of weak hits removed ..................: 0
Number of hits in annotation dict ...........: 1
(anvio-master) christian.hoffmann@scm01:~/anvioTestOutput> anvi-run-hmms -c /home/iaf/christian.hoffmann/anvioTestOutput/test-output/CONTIGS6.db --num-threads 1 --just-do-it
Segmentation fault
(anvio-master) christian.hoffmann@scm01:~/anvioTestOutput> anvi-db-info $output_dir/CONTIGS.db
Segmentation fault
(anvio-master) christian.hoffmann@scm01:~/anvioTestOutput> anvi-db-info $output_dir/CONTIGS6.db
DB Info (no touch)
===============================================
Database Path ................................: /home/iaf/christian.hoffmann/anvioTestOutput/test-output/CONTIGS6.db
Description ..................................: [Not found, but it's OK]
Type .........................................: contigs
Version ......................................: 18
DB Info (no touch also)
===============================================
project_name .................................: Contigs DB for anvi'o mini self-test
contigs_db_hash ..............................: hashf0356823
split_length .................................: 1000
kmer_size ....................................: 4
num_contigs ..................................: 6
total_length .................................: 57030
num_splits ...................................: 38
gene_level_taxonomy_source ...................: None
gene_function_sources ........................: None
genes_are_called .............................: 1
external_gene_calls ..........................: 0
external_gene_amino_acid_seqs ................: 0
skip_predict_frame ...........................: 0
splits_consider_gene_calls ...................: 1
scg_taxonomy_was_run .........................: 0
scg_taxonomy_database_version ................: None
creation_date ................................: 1598448487.69134
* Please remember that it is never a good idea to change these values. But in some
cases it may be absolutely necessary to update something here, and a programmer
may ask you to run this program and do it. But even then, you should be
extremely careful.
AVAILABLE GENE CALLERS
===============================================
* 'prodigal' (51 gene calls)
AVAILABLE HMM SOURCES
===============================================
* 'Ribosomal_RNAs' (type: Ribosomal_RNAs; num genes: 12)
* 'Protista_83' (type: singlecopy; num genes: 83)
* 'Bacteria_71' (type: singlecopy; num genes: 71)
* 'Archaea_76' (type: singlecopy; num genes: 76)
Segmentation fault
(anvio-master) christian.hoffmann@scm01:~/anvioTestOutput> anvi-db-info $output_dir/CONTIGS6.db
DB Info (no touch)
===============================================
Database Path ................................: /home/iaf/christian.hoffmann/anvioTestOutput/test-output/CONTIGS6.db
Description ..................................: [Not found, but it's OK]
Type .........................................: contigs
Version ......................................: 18
DB Info (no touch also)
===============================================
project_name .................................: Contigs DB for anvi'o mini self-test
contigs_db_hash ..............................: hashf0356823
split_length .................................: 1000
kmer_size ....................................: 4
num_contigs ..................................: 6
total_length .................................: 57030
num_splits ...................................: 38
gene_level_taxonomy_source ...................: None
gene_function_sources ........................: None
genes_are_called .............................: 1
external_gene_calls ..........................: 0
external_gene_amino_acid_seqs ................: 0
skip_predict_frame ...........................: 0
splits_consider_gene_calls ...................: 1
scg_taxonomy_was_run .........................: 0
scg_taxonomy_database_version ................: None
creation_date ................................: 1598448487.69134
* Please remember that it is never a good idea to change these values. But in some
cases it may be absolutely necessary to update something here, and a programmer
may ask you to run this program and do it. But even then, you should be
extremely careful.
AVAILABLE GENE CALLERS
===============================================
* 'prodigal' (51 gene calls)
AVAILABLE HMM SOURCES
===============================================
* 'Ribosomal_RNAs' (type: Ribosomal_RNAs; num genes: 12)
* 'Protista_83' (type: singlecopy; num genes: 83)
* 'Bacteria_71' (type: singlecopy; num genes: 71)
* 'Archaea_76' (type: singlecopy; num genes: 76)
(anvio-master) christian.hoffmann@scm01:~/anvioTestOutput> anvi-db-info $output_dir/CONTIGS6.db
Segmentation fault
(anvio-master) christian.hoffmann@scm01:~/anvioTestOutput> anvi-db-info $output_dir/CONTIGS6.db
DB Info (no touch)
===============================================
Database Path ................................: /home/iaf/christian.hoffmann/anvioTestOutput/test-output/CONTIGS6.db
Description ..................................: [Not found, but it's OK]
Type .........................................: contigs
Version ......................................: 18
DB Info (no touch also)
===============================================
project_name .................................: Contigs DB for anvi'o mini self-test
contigs_db_hash ..............................: hashf0356823
split_length .................................: 1000
kmer_size ....................................: 4
num_contigs ..................................: 6
total_length .................................: 57030
num_splits ...................................: 38
gene_level_taxonomy_source ...................: None
gene_function_sources ........................: None
genes_are_called .............................: 1
external_gene_calls ..........................: 0
external_gene_amino_acid_seqs ................: 0
skip_predict_frame ...........................: 0
splits_consider_gene_calls ...................: 1
scg_taxonomy_was_run .........................: 0
scg_taxonomy_database_version ................: None
creation_date ................................: 1598448487.69134
* Please remember that it is never a good idea to change these values. But in some
cases it may be absolutely necessary to update something here, and a programmer
may ask you to run this program and do it. But even then, you should be
extremely careful.
AVAILABLE GENE CALLERS
===============================================
* 'prodigal' (51 gene calls)
AVAILABLE HMM SOURCES
===============================================
* 'Ribosomal_RNAs' (type: Ribosomal_RNAs; num genes: 12)
* 'Protista_83' (type: singlecopy; num genes: 83)
* 'Bacteria_71' (type: singlecopy; num genes: 71)
* 'Archaea_76' (type: singlecopy; num genes: 76)
(anvio-master) christian.hoffmann@scm01:~/anvioTestOutput> anvi-db-info $output_dir/CONTIGS6.db
DB Info (no touch)
===============================================
Database Path ................................: /home/iaf/christian.hoffmann/anvioTestOutput/test-output/CONTIGS6.db
Description ..................................: [Not found, but it's OK]
Type .........................................: contigs
Version ......................................: 18
DB Info (no touch also)
===============================================
project_name .................................: Contigs DB for anvi'o mini self-test
contigs_db_hash ..............................: hashf0356823
split_length .................................: 1000
kmer_size ....................................: 4
num_contigs ..................................: 6
total_length .................................: 57030
num_splits ...................................: 38
gene_level_taxonomy_source ...................: None
gene_function_sources ........................: None
genes_are_called .............................: 1
external_gene_calls ..........................: 0
external_gene_amino_acid_seqs ................: 0
skip_predict_frame ...........................: 0
splits_consider_gene_calls ...................: 1
scg_taxonomy_was_run .........................: 0
scg_taxonomy_database_version ................: None
creation_date ................................: 1598448487.69134
* Please remember that it is never a good idea to change these values. But in some
cases it may be absolutely necessary to update something here, and a programmer
may ask you to run this program and do it. But even then, you should be
extremely careful.
AVAILABLE GENE CALLERS
===============================================
* 'prodigal' (51 gene calls)
AVAILABLE HMM SOURCES
===============================================
* 'Ribosomal_RNAs' (type: Ribosomal_RNAs; num genes: 12)
* 'Protista_83' (type: singlecopy; num genes: 83)
* 'Bacteria_71' (type: singlecopy; num genes: 71)
* 'Archaea_76' (type: singlecopy; num genes: 76)
(anvio-master) christian.hoffmann@scm01:~/anvioTestOutput> anvi-db-info $output_dir/CONTIGS6.db
DB Info (no touch)
===============================================
Database Path ................................: /home/iaf/christian.hoffmann/anvioTestOutput/test-output/CONTIGS6.db
Description ..................................: [Not found, but it's OK]
Type .........................................: contigs
Version ......................................: 18
DB Info (no touch also)
===============================================
project_name .................................: Contigs DB for anvi'o mini self-test
contigs_db_hash ..............................: hashf0356823
split_length .................................: 1000
kmer_size ....................................: 4
num_contigs ..................................: 6
total_length .................................: 57030
num_splits ...................................: 38
gene_level_taxonomy_source ...................: None
gene_function_sources ........................: None
genes_are_called .............................: 1
external_gene_calls ..........................: 0
external_gene_amino_acid_seqs ................: 0
skip_predict_frame ...........................: 0
splits_consider_gene_calls ...................: 1
scg_taxonomy_was_run .........................: 0
scg_taxonomy_database_version ................: None
creation_date ................................: 1598448487.69134
* Please remember that it is never a good idea to change these values. But in some
cases it may be absolutely necessary to update something here, and a programmer
may ask you to run this program and do it. But even then, you should be
extremely careful.
AVAILABLE GENE CALLERS
===============================================
* 'prodigal' (51 gene calls)
AVAILABLE HMM SOURCES
===============================================
* 'Ribosomal_RNAs' (type: Ribosomal_RNAs; num genes: 12)
* 'Protista_83' (type: singlecopy; num genes: 83)
* 'Bacteria_71' (type: singlecopy; num genes: 71)
* 'Archaea_76' (type: singlecopy; num genes: 76)
(anvio-master) christian.hoffmann@scm01:~/anvioTestOutput> anvi-db-info $output_dir/CONTIGS6.db
Segmentation fault
###### anvi-import-functions tests
(anvio-master) christian.hoffmann@scm01:~/anvioTestOutput> anvi-import-functions -c $output_dir/CONTIGS.db -i $files/example_interpro_output.tsv -p interproscan
Gene functions ...............................: 273 function calls from 11 sources (PRINTS, TIGRFAM, SUPERFAMILY, ProSitePatterns, PIRSF, Gene3D, Hamap, Coils, SMART, Pfam,
ProSiteProfiles) for 48 unique gene calls has been added to the contigs database.
(anvio-master) christian.hoffmann@scm01:~/anvioTestOutput> anvi-import-functions -c /home/iaf/christian.hoffmann/anvioTestOutput/test-output/CONTIGS6.db -i $files/example_interpro_output.tsv -p interproscan
Gene functions ...............................: 273 function calls from 11 sources (Pfam, Gene3D, PIRSF, PRINTS, TIGRFAM, SUPERFAMILY, ProSiteProfiles, SMART, Coils,
ProSitePatterns, Hamap) for 48 unique gene calls has been added to the contigs database.
(anvio-master) christian.hoffmann@scm01:~/anvioTestOutput> anvi-import-functions -c /home/iaf/christian.hoffmann/anvioTestOutput/test-output/CONTIGS6.db -i $files/example_interpro_output.tsv -p interproscan
WARNING
===============================================
Some of the annotation sources you want to add into the database are already in
the db. So anvi'o will REPLACE those with the incoming data from these sources:
Hamap, ProSiteProfiles, PRINTS, Gene3D, Coils, TIGRFAM, PIRSF, SMART,
SUPERFAMILY, Pfam, ProSitePatterns
Gene functions ...............................: 273 function calls from 11 sources (Hamap, ProSiteProfiles, PRINTS, Gene3D, Coils, TIGRFAM, PIRSF, SMART, ProSitePatterns,
SUPERFAMILY, Pfam) for 48 unique gene calls has been added to the contigs database.
#### hmm with external profile
(anvio-master) christian.hoffmann@scm01:~/anvioTestOutput> anvi-run-hmms -c $output_dir/CONTIGS.db -H $files/external_hmm_profile --just-do-it
Segmentation fault
(anvio-master) christian.hoffmann@scm01:~/anvioTestOutput> anvi-run-hmms -c $output_dir/CONTIGS.db -H $files/external_hmm_profile --just-do-it
HMM profiles .................................: 1 source been loaded: external_hmm_profile (2 genes)
Target found .................................: AA:GENE
HMM Profiling for external_hmm_profile
===============================================
Kind .........................................: external_hmm_genes
Alphabet .....................................: AA
Context ......................................: GENE
Domain .......................................: N/A
HMM model path ...............................: /home/iaf/christian.hoffmann/tmp/tmpf1_c_v6s/external_hmm_profile.hmm
Number of genes in HMM model .................: 2
Noise cutoff term(s) .........................: -E 1e-12
Number of CPUs will be used for search .......: 1
HMMer program used for search ................: hmmscan
Temporary work dir ...........................: /home/iaf/christian.hoffmann/tmp/tmpwtemymy6
Log file for thread 0 ........................: /home/iaf/christian.hoffmann/tmp/tmpwtemymy6/AA_gene_sequences.fa.0_log
Number of raw hits ...........................: 4
Number of weak hits removed ..................: 0
Number of hits in annotation dict ...........: 4
Segmentation fault
(anvio-master) christian.hoffmann@scm01:~/anvioTestOutput> anvi-run-hmms -c $output_dir/CONTIGS.db -H $files/external_hmm_profile --just-do-it
Segmentation fault
(anvio-master) christian.hoffmann@scm01:~/anvioTestOutput> anvi-run-hmms -c /home/iaf/christian.hoffmann/anvioTestOutput/test-output/CONTIGS6.db -H /home/iaf/christian.hoffmann/github/anvio/anvio/tests/sandbox/external_hmm_profile/ --just-do-it
Segmentation fault
(anvio-master) christian.hoffmann@scm01:~/anvioTestOutput> anvi-run-hmms -c /home/iaf/christian.hoffmann/anvioTestOutput/test-output/CONTIGS6.db -H /home/iaf/christian.hoffmann/github/anvio/anvio/tests/sandbox/external_hmm_profile/ --just-do-it
HMM profiles .................................: 1 source been loaded: external_hmm_profile (2 genes)
Target found .................................: AA:GENE
HMM Profiling for external_hmm_profile
===============================================
Kind .........................................: external_hmm_genes
Alphabet .....................................: AA
Context ......................................: GENE
Domain .......................................: N/A
HMM model path ...............................: /home/iaf/christian.hoffmann/tmp/tmp1bq3dgqu/external_hmm_profile.hmm
Number of genes in HMM model .................: 2
Noise cutoff term(s) .........................: -E 1e-12
Number of CPUs will be used for search .......: 1
HMMer program used for search ................: hmmscan
Temporary work dir ...........................: /home/iaf/christian.hoffmann/tmp/tmpijgdkfzs
Log file for thread 0 ........................: /home/iaf/christian.hoffmann/tmp/tmpijgdkfzs/AA_gene_sequences.fa.0_log
Number of raw hits ...........................: 4
Number of weak hits removed ..................: 0
Number of hits in annotation dict ...........: 4
Segmentation fault
(anvio-master) christian.hoffmann@scm01:~/anvioTestOutput> anvi-run-hmms -c /home/iaf/christian.hoffmann/anvioTestOutput/test-output/CONTIGS6.db -H /home/iaf/christian.hoffmann/github/anvio/anvio/tests/sandbox/external_hmm_profile/ --just-do-it --num-threads 1
Segmentation fault
(anvio-master) christian.hoffmann@scm01:~/anvioTestOutput> anvi-run-hmms -c /home/iaf/christian.hoffmann/anvioTestOutput/test-output/CONTIGS6.db -H /home/iaf/christian.hoffmann/github/anvio/anvio/tests/sandbox/external_hmm_profile/ --just-do-it --num-threads 1
Segmentation fault
(anvio-master) christian.hoffmann@scm01:~/anvioTestOutput> anvi-run-hmms -c /home/iaf/christian.hoffmann/anvioTestOutput/test-output/CONTIGS6.db -H /home/iaf/christian.hoffmann/github/anvio/anvio/tests/sandbox/external_hmm_profile/ --just-do-it --num-threads 1
HMM profiles .................................: 1 source been loaded: external_hmm_profile (2 genes)
WARNING
===============================================
Previous entries for "external_hmm_profile" is being removed from
"hmm_hits_info, hmm_hits, hmm_hits_in_splits, genes_in_contigs, gene_functions"
Target found .................................: AA:GENE
HMM Profiling for external_hmm_profile
===============================================
Kind .........................................: external_hmm_genes
Alphabet .....................................: AA
Context ......................................: GENE
Domain .......................................: N/A
HMM model path ...............................: /home/iaf/christian.hoffmann/tmp/tmpvibetvad/external_hmm_profile.hmm
Number of genes in HMM model .................: 2
Noise cutoff term(s) .........................: -E 1e-12
Number of CPUs will be used for search .......: 1
HMMer program used for search ................: hmmscan
Temporary work dir ...........................: /home/iaf/christian.hoffmann/tmp/tmp55i5q9qy
Log file for thread 0 ........................: /home/iaf/christian.hoffmann/tmp/tmp55i5q9qy/AA_gene_sequences.fa.0_log
Number of raw hits ...........................: 4
Number of weak hits removed ..................: 0
Number of hits in annotation dict ...........: 4
#### sqlite3 cmd
(anvio-master) christian.hoffmann@scm01:~/anvioTestOutput> sqlite3 $output_dir/CONTIGS.db '.tables'
amino_acid_additional_data hmm_hits
collections_bins_info hmm_hits_in_splits
collections_info hmm_hits_info
collections_of_contigs kmer_contigs
collections_of_splits kmer_splits
contig_sequences nt_position_info
contigs_basic_info nucleotide_additional_data
gene_amino_acid_sequences scg_taxonomy
gene_functions self
genes_in_contigs splits_basic_info
genes_in_splits splits_taxonomy
genes_taxonomy taxon_names
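The sqlite3 command-line tool and Python's sqlite3 module may be linked against different copies of the SQLite library, so the CLI opening the file cleanly does not say much about the Python side. This minimal sketch repeats the same check from inside the env's Python; `list_tables` is a hypothetical helper.

```python
import sqlite3


def list_tables(db_path):
    """Return the names of the tables in the SQLite file at db_path, sorted."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


# Hypothetical usage:
# print(sqlite3.sqlite_version)  # version of the SQLite library Python uses
# print(list_tables("test-output/CONTIGS.db"))
```

Running this a number of times in a row would show whether plain database access from Python is enough to trigger the crash.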
#### anvi-profile
The same thing happens here: sometimes it just throws a segfault, but more often it runs like this:
(anvio-master) christian.hoffmann@scm01:~/anvioTestOutput> anvi-profile -W -i /home/iaf/christian.hoffmann/anvioTestOutput/test-output/SAMPLE-03.bam -o /home/iaf/christian.hoffmann/anvioTestOutput/test-output/SAMPLE-03 -c $output_dir/CONTIGS6.db --cluster --profile-SCVs
Contigs DB .........................: Initialized: /home/iaf/christian.hoffmann/anvioTestOutput/test-output/CONTIGS6.db (v. 18)
anvio ..............................: 6.2-master
profiler_version ...................: 34
sample_id ..........................: SAMPLE_03
description ........................: None
profile_db .........................: /home/iaf/christian.hoffmann/anvioTestOutput/test-output/SAMPLE-03/PROFILE.db
contigs_db .........................: True
contigs_db_hash ....................: hashf0356823
cmd_line ...........................: /home/iaf/christian.hoffmann/github/anvio/bin/anvi-profile -W -i /home/iaf/christian.hoffmann/anvioTestOutput/test-output/SAMPLE-03.bam -o /home/iaf/christian.hoffmann/anvioTestOutput/test-output/SAMPLE-03 -c /home/iaf/christian.hoffmann/anvioTestOutput/test-output/CONTIGS6.db --cluster --profile-SCVs
merged .............................: False
blank ..............................: False
split_length .......................: 1,000
min_contig_length ..................: 1,000
max_contig_length ..................: 9,223,372,036,854,775,807
min_mean_coverage ..................: 0
clustering_performed ...............: True
min_coverage_for_variability .......: 10
skip_SNV_profiling .................: False
skip_INDEL_profiling ...............: False
profile_SCVs .......................: True
min_percent_identity ...............: None
report_variability_full ............: False
WARNING
=====================================
Your minimum contig length is set to 1,000 base pairs. So anvi'o will not take
into consideration anything below that. If you need to kill this an restart your
analysis with another minimum contig length value, feel free to press CTRL+C.
input_bam ..........................: /home/iaf/christian.hoffmann/anvioTestOutput/test-output/SAMPLE-03.bam
output_dir .........................: /home/iaf/christian.hoffmann/anvioTestOutput/test-output/SAMPLE-03
num_reads_in_bam ...................: 24,870
num_contigs ........................: 3
num_contigs_after_M ................: 3
num_splits .........................: 35
total_length .......................: 56,709
New data for 'layers' in data group 'default'
=====================================
Data key "total_reads_mapped" ......: Predicted type: int
Data key "num_SNVs_reported" .......: Predicted type: int
Data key "total_reads_kept" ........: Predicted type: int
NEW DATA
=====================================
Database ...........................: profile
Data group .........................: default
Data table .........................: layers
New data keys ......................: total_reads_mapped, num_SNVs_reported, total_reads_kept.
New items order ....................: "tnf:euclidean:ward" (type newick) has been added to the database...
New items order ....................: "tnf-ab-cov:euclidean:ward" (type newick) has been added to the database...
* Happy 😇
✓ anvi-profile took 0:00:11.749114
Segmentation fault
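Since the crash is intermittent, one way to put numbers on it would be to run the same command many times and tally how each run ends. The sketch below assumes the anvi'o scripts are run by the Python interpreter of the active conda environment. `describe` and `tally_runs` are hypothetical helper names, not part of anvi'o. It sets the standard `PYTHONFAULTHANDLER` environment variable, which makes Python print a traceback to stderr when it receives a fatal signal such as SIGSEGV, so a crashing run should at least show which Python call was active.

```python
import os
import signal
import subprocess
from collections import Counter


def describe(returncode):
    """Turn a subprocess return code into a readable label."""
    if returncode < 0:
        # On POSIX, a negative return code means "killed by that signal".
        try:
            return "killed by " + signal.Signals(-returncode).name
        except ValueError:
            return f"killed by signal {-returncode}"
    return f"exit {returncode}"


def tally_runs(cmd, runs=20):
    """Run cmd (a list of arguments) several times and count the outcomes."""
    # PYTHONFAULTHANDLER=1 enables Python's faulthandler in the child process.
    env = dict(os.environ, PYTHONFAULTHANDLER="1")
    outcomes = Counter()
    for _ in range(runs):
        # stdout is discarded; stderr is left alone so tracebacks stay visible.
        result = subprocess.run(cmd, env=env, stdout=subprocess.DEVNULL)
        outcomes[describe(result.returncode)] += 1
    return outcomes


# Example call (assumption: CONTIGS.db is in the current directory):
# print(tally_runs(["anvi-db-info", "CONTIGS.db"], runs=20))
```

A result with both `exit 0` and `killed by SIGSEGV` entries for the same command would confirm that identical invocations really do end differently, and the counts would show whether changing the number of threads has any effect.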
##### HMMSCAN test
(anvio-master) christian.hoffmann@scm01:~/anvioTestOutput> which hmmscan
/home/iaf/christian.hoffmann/miniconda3/envs/anvio-master/bin/hmmscan
(anvio-master) christian.hoffmann@scm01:~/anvioTestOutput> hmmscan -h
# hmmscan :: search sequence(s) against a profile database
(anvio-master) christian.hoffmann@scm00:~/anvioTestOutput/tutorial3.3> cat globins4.hmm fn3.hmm Pkinase.hmm > minifam
(anvio-master) christian.hoffmann@scm00:~/anvioTestOutput/tutorial3.3> hmmpress minifam
Working... done.
Pressed and indexed 3 HMMs (3 names and 2 accessions).
Models pressed into binary file: minifam.h3m
SSI index for binary model file: minifam.h3i
Profiles (MSV part) pressed into: minifam.h3f
Profiles (remainder) pressed into: minifam.h3p
(anvio-master) christian.hoffmann@scm00:~/anvioTestOutput/tutorial3.3> hmmscan minifam 7LESS_DROME
# hmmscan :: search sequence(s) against a profile database
# Copyright (C) 2019 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query sequence file: 7LESS_DROME
# target HMM database: minifam
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query: 7LESS_DROME [L=2554]
Accession: P13368
Description: RecName: Full=Protein sevenless; EC=2.7.10.1;
Scores for complete sequence (score includes all domains):
--- full sequence --- --- best 1 domain --- -#dom-
E-value score bias E-value score bias exp N Model Description
------- ------ ----- ------- ------ ----- ---- -- -------- -----------
5.6e-57 178.0 0.4 3.5e-16 47.2 0.9 9.4 9 fn3 Fibronectin type III domain
1.1e-43 137.2 0.0 1.7e-43 136.5 0.0 1.3 1 Pkinase Protein kinase domain
Domain annotation for each model (and alignments):
>> fn3 Fibronectin type III domain
# score bias c-Evalue i-Evalue hmmfrom hmm to alifrom ali to envfrom env to acc
--- ------ ----- --------- --------- ------- ------- ------- ------- ------- ------- ----
1 ? -1.3 0.0 0.33 0.5 61 74 .. 396 409 .. 395 411 .. 0.85
2 ! 40.7 0.0 2.6e-14 3.8e-14 2 84 .. 439 520 .. 437 521 .. 0.95
3 ! 14.4 0.0 4.1e-06 6.1e-06 13 85 .. 836 913 .. 826 914 .. 0.73
4 ! 5.1 0.0 0.0032 0.0048 10 36 .. 1209 1235 .. 1203 1259 .. 0.82
5 ! 24.3 0.0 3.4e-09 5e-09 14 80 .. 1313 1380 .. 1304 1386 .. 0.82
6 ? 0.0 0.0 0.13 0.19 58 72 .. 1754 1768 .. 1739 1769 .. 0.89
7 ! 47.2 0.9 2.3e-16 3.5e-16 1 85 [. 1799 1890 .. 1799 1891 .. 0.91
8 ! 17.8 0.0 3.7e-07 5.5e-07 6 74 .. 1904 1966 .. 1901 1976 .. 0.90
9 ! 12.8 0.0 1.3e-05 2e-05 1 86 [] 1993 2107 .. 1993 2107 .. 0.89
Alignments for each domain:
== domain 1 score: -1.3 bits; conditional E-value: 0.33
EES--TT-EEEEEE CS
fn3 61 ltgLepgteYefrV 74
l+ L p+t+Y+fr
7LESS_DROME 396 LEALIPYTQYRFRF 409
67899*******95 PP
== domain 2 score: 40.7 bits; conditional E-value: 2.6e-14
---CEEEEEEECTTEEEEEEE--S--SS--SEEEEEEEETTTCCGCEEEEEETTTSEEEEES--TT-EEEEEEEEEETTEE-E CS
fn3 2 saPenlsvsevtstsltlsWsppkdgggpitgYeveyqekgegeewqevtvprtttsvtltgLepgteYefrVqavngagegp 84
saP ++ + ++ l ++W p + +gpi+gY++++++++++ + e+ vp+ s+ +++L++gt+Y++ + +n++gegp
7LESS_DROME 439 SAPVIEHLMGLDDSHLAVHWHPGRFTNGPIEGYRLRLSSSEGNA-TSEQLVPAGRGSYIFSQLQAGTNYTLALSMINKQGEGP 520
78999999999*****************************9998.**********************************9997 PP
== domain 3 score: 14.4 bits; conditional E-value: 4.1e-06
CTTEEEEEEE--S.--SS--S.....EEEEEEEETTTCCGCEEEEEETTTSEEEEES--TT-EEEEEEEEEETTE.E-EB CS
fn3 13 tstsltlsWsppk.dgggpit.....gYeveyqekgegeewqevtvprtttsvtltgLepgteYefrVqavngag.egpe 85
++ + +sW++p+ ++ + + +Ye+e+ ++ ++++ +++ ++ + l+ L+p+ Y++rV+a+n +g g++
7LESS_DROME 836 GAQAAKISWKEPErNPYQSADaarswSYELEVLDVASQSAFSIRNIRGPI--FGLQRLQPDNLYQLRVRAINVDGePGEW 913
56677889999987443333223333899999999999955555566666..**********************965655 PP
== domain 4 score: 5.1 bits; conditional E-value: 0.0032
EEECTTEEEEEEE--S--SS--SEEEE CS
fn3 10 sevtstsltlsWsppkdgggpitgYev 36
++ + ++++W+p+++gg + ++Y++
7LESS_DROME 1209 DDGHWDDFHVRWQPSTSGGNHSVSYRL 1235
555678999*******99999999997 PP
== domain 5 score: 24.3 bits; conditional E-value: 3.4e-09
TTEEEEEEE--S...--SS--SEEEEEEEETTTCCGCEEEEEETTTSEEEEES--TT-EEEEEEEEEETT CS
fn3 14 stsltlsWsppk...dgggpitgYeveyqekgegeewqevtvprtttsvtltgLepgteYefrVqavnga 80
+ s++l+W++p+ + + + Y + ++ ++ + e ++ ++ ++ +++L+p+++Y+f+V+a+ +a
7LESS_DROME 1313 NVSAVLRWDAPEqgqEAPMQALEYHISCWVG-SEL-HEELRLNQSALEARVEHLQPDQTYHFQVEARVAA 1380
5689********76556667899******55.665.688888888888*****************98665 PP
== domain 6 score: 0.0 bits; conditional E-value: 0.13
EEEEES--TT-EEEE CS
fn3 58 svtltgLepgteYef 72
s++lt+L p t+Y++
7LESS_DROME 1754 SLNLTDLLPFTRYRV 1768
799**********98 PP
== domain 7 score: 47.2 bits; conditional E-value: 2.3e-16
----CEEEEEEECTTEEEEEEE--S--SS--SEEEEEEEETTTCC.......GCEEEEEETTTSEEEEES--TT-EEEEEEEEEETTEE-EB CS
fn3 1 psaPenlsvsevtstsltlsWsppkdgggpitgYeveyqekgege.......ewqevtvprtttsvtltgLepgteYefrVqavngagegpe 85
ps+P+n+sv+ ++++l++sW pp++ +++ ++Y++++q++ +ge ++ + ++ +t+ ++ ltg++pg+ Y+++Vqa+ + +++ +
7LESS_DROME 1799 PSPPRNFSVRVLSPRELEVSWLPPEQLRSESVYYTLHWQQELDGEnvqdrreWEAHERRLETAGTHRLTGIKPGSGYSLWVQAHATPTKSNS 1890
9*************************************99989998****97777777777777*******************988776665 PP
== domain 8 score: 17.8 bits; conditional E-value: 3.7e-07
EEEEEEECTTEEEEEEE--S--SS--SEEEEEEEETTTCCGCEEEEEETTTSEEEEES--TT-EEEEEE CS
fn3 6 nlsvsevtstsltlsWsppkdgggpitgYeveyqekgegeewqevtvprtttsvtltgLepgteYefrV 74
+l++ e ++ sl+l+W +p+ + ++e++++ e+ +++v +++t ++++ L+p t+Y+ r+
7LESS_DROME 1904 ELQLLELGPYSLSLTWAGT---PDPLGSLQLECRSSAEQ---LRRNVAGNHTKMVVEPLQPRTRYQCRL 1966
5788899************...8***********88555...79**********************986 PP
== domain 9 score: 12.8 bits; conditional E-value: 1.3e-05
----CEEEEEEECTTEEEEEEE--S--SS--SEEEEEEEETTTCC...........................GCEEEEEETTTS.EEEEES--TT- CS
fn3 1 psaPenlsvsevtstsltlsWsppkdgggpitgYeveyqekgege...........................ewqevtvprttt.svtltgLepgt 68
ps+P+ ++++ + + ++++W++ + +g+pi Y++e + ++ e+q+ + +tt+ s+ ++ L ++
7LESS_DROME 1993 PSQPGKPQLEHIAEEVFRVTWTAARGNGAPIALYNLEALQARSDIrrrrrrrrrnsggsleqlpwaeepvvvEDQWLDFCNTTElSCIVKSLHSSR 2088
89***********************9************7775554799999999999999999999999999999999999999999999999999 PP
EEEEEEEEEETTE.E-EBE CS
fn3 69 eYefrVqavngag.egpes 86
frV+a++ + +gp+s
7LESS_DROME 2089 LLLFRVRARSLEHgWGPYS 2107
9999999999554488876 PP
>> Pkinase Protein kinase domain
# score bias c-Evalue i-Evalue hmmfrom hmm to alifrom ali to envfrom env to acc
--- ------ ----- --------- --------- ------- ------- ------- ------- ------- ------- ----
1 ! 136.5 0.0 1.1e-43 1.7e-43 2 256 .. 2210 2479 .. 2209 2482 .. 0.85
Alignments for each domain:
== domain 1 score: 136.5 bits; conditional E-value: 1.1e-43
EEEEEEEEETTEEEEEEEETTTTE....EEEEEEEEHHHCCCCCCHHHHHHHHHHHHHSSSSB--EEEEEEETTEEEEEEE--TS-BHHHHHH... CS
Pkinase 2 elleklGsGsfGkVykakkkktgk....kvAvKilkkeeekskkektavrElkilkklsHpnivkllevfetkdelylvleyveggdlfdllk... 90
+ll+ lGsG+fG+Vy+++ k++ + +vA+K l+k ++ + +E++++ +++H+niv l++++ + +++ l++e++e+gdl ++l+
7LESS_DROME 2210 KLLRFLGSGAFGEVYEGQLKTEDSeepqRVAIKSLRKGASEFAELL---QEAQLMSNFKHENIVCLVGICFDTESISLIMEHMEAGDLLSYLRaar 2302
67899*********88776655444444********9998887764...4*******************************************998 PP
........HHHST-HHHHHHHHHHHHHHHHHHHHTTEE-S--SGGGEEEETTTEE.......EE--GTT.E..EECSS-C-S--S..-GGGS-HHH CS
Pkinase 91 ........kegklseeeikkialqilegleylHsngiiHrDLKpeNiLldkkgev.......kiaDFGlakkleksseklttlvg..treYmAPEv 169
ls e+ ++ +++g +yl +++++HrDL N+L++++ ki DFGla+ ++ks+ ++ g ++m+PE
7LESS_DROME 2303 atstqepqPTAGLSLSELLAMCIDVANGCSYLEDMHFVHRDLACRNCLVTESTGStdrrrtvKIGDFGLARDIYKSDYYRKEGEGllPVRWMSPES 2398
8887766555666************************************9554445999*************988887777766622679****** PP
HCCS-CTHHHHHHHHHHHHHHHHHH.SS-TTSSSHHCCTHHHHSSHHH......TTS.....HHHHHHHHHHT-SSGGGSTTHHHHHT CS
Pkinase 170 llkakeytkkvDvWslGvilyellt.gklpfsgeseedqleliekilkkkleedepkssskseelkdlikkllekdpakRltaeeilk 256
l + t+++DvW++Gv+++e+lt g+ p+ + ++ e++++++++ ++ p ++ e+l +l+ ++++dp +R++++++++
7LESS_DROME 2399 LV-DGLFTTQSDVWAFGVLCWEILTlGQQPYAAR---NNFEVLAHVKEGGRLQQ-PPMCT--EKLYSLLLLCWRTDPWERPSFRRCYN 2479
**.9999***************999899999999...55666655555443333.33344..89*******************99887 PP
Internal pipeline statistics summary:
-------------------------------------
Query sequence(s): 1 (2554 residues searched)
Target model(s): 3 (495 nodes)
Passed MSV filter: 2 (0.666667); expected 0.1 (0.02)
Passed bias filter: 2 (0.666667); expected 0.1 (0.02)
Passed Vit filter: 2 (0.666667); expected 0.0 (0.001)
Passed Fwd filter: 2 (0.666667); expected 0.0 (1e-05)
Initial search space (Z): 3 [actual number of targets]
Domain search space (domZ): 2 [number of targets reported over threshold]
# CPU time: 0.00u 0.00s 00:00:00.00 Elapsed: 00:00:00.03
# Mc/sec: 39.07
//
[ok]