Hi Leszek,
I have had the pleasure of trying your software Redundans, which I downloaded a few weeks ago.
I have tried to understand the reduction part specifically. For that I created a very small contigs FASTA file with 9 strings. I then copied these 9 strings and simply renamed the copies.
To test whether Redundans would prune NODE\_11 to NODE\_18, I ran it with the following parameters:
>redundans.py --noscaffolding --nogapclosing -f contigs-double8.fa --identity 0.51 --overlap 0.8 --minLength 50 --log double8.log -o double8
[ERROR] Empty FastA file encountered: double8/contigs.reduced.fa !
I assume I get the error message because the .fa file is too small:
> wc double8/contigs.reduced.fa
19 19 651 double8/contigs.reduced.fa
However, much to my surprise, the file 'double8/contigs.reduced.fa' contains both NODE\_8 and NODE\_18.
The other contigs are reduced as I would expect. I have tried the same exercise with nine duplicated contigs. In that case, too, the last contig was present twice after the reduction (NODE\_9 and NODE\_19).
Any thoughts on what is going on?
Kind regards
Nina
My contigs file (contigs-double8.fa):
>NODE_1
ACGTATAGGGTGTCGAGCACATATCAATATGTCATGAACTGAGAACCTTTACCTTTTTGG
AGTAAGTACCCCTTTTGGGGAGAAGAA
>NODE_2
TTGCATTGAGAGGCGGTATGTTTTTCCAAGATTCTCAAGTAACAAGATTTTTAGTCTAGG
>NODE_3
AATGCTTGAATTTTAAAATTCACTGCAATTAAAGTAAGAGAAGGGTTGTGTACATCAAAA
>NODE_4
AATTACTTTTAAAATCATAAAGGTTGATAATCAGAAGTCAAAGTACTATACTTTTGCTTA
>NODE_5
AAATTAGACGAATTTCACGAGATAAAAATAGCTACATGCCTTTCCGCATTAATGCAAGAA
>NODE_6
TGAAAAACTACCTTGCAATACAGAATCCTAACTATAATCTAGATCCAATCAGTTTGAGGT
>NODE_7
GCGCATTTGCATTGCGTAGCCAAGGGATTAATGACAAGTATACACCAAAATAGAACGCAC
>NODE_8
GCGATAGCTTCAATACTCCAATCGAAAATGAATGGCGTGTTTTTATTCAACAAAGTCTTA
>NODE_11
ACGTATAGGGTGTCGAGCACATATCAATATGTCATGAACTGAGAACCTTTACCTTTTTGG
AGTAAGTACCCCTTTTGGGGAGAAGAA
>NODE_12
TTGCATTGAGAGGCGGTATGTTTTTCCAAGATTCTCAAGTAACAAGATTTTTAGTCTAGG
>NODE_13
AATGCTTGAATTTTAAAATTCACTGCAATTAAAGTAAGAGAAGGGTTGTGTACATCAAAA
>NODE_14
AATTACTTTTAAAATCATAAAGGTTGATAATCAGAAGTCAAAGTACTATACTTTTGCTTA
>NODE_15
AAATTAGACGAATTTCACGAGATAAAAATAGCTACATGCCTTTCCGCATTAATGCAAGAA
>NODE_16
TGAAAAACTACCTTGCAATACAGAATCCTAACTATAATCTAGATCCAATCAGTTTGAGGT
>NODE_17
GCGCATTTGCATTGCGTAGCCAAGGGATTAATGACAAGTATACACCAAAATAGAACGCAC
>NODE_18
GCGATAGCTTCAATACTCCAATCGAAAATGAATGGCGTGTTTTTATTCAACAAAGTCTTA
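For anyone who wants to reproduce a test file like the one above, the copy-and-rename step could be scripted roughly as follows. This is only a minimal sketch, not the exact procedure used here: the function name `duplicate_with_renamed_headers` and the `offset` parameter are hypothetical, and it assumes every header has the form `>NODE_<number>`.

```python
import re


def duplicate_with_renamed_headers(fasta_text, offset=10):
    """Return the FASTA text followed by a copy of itself in which every
    header of the form >NODE_<n> is renamed to >NODE_<n + offset>.

    Hypothetical helper for building a test file; sequences are unchanged.
    """
    renamed = []
    for line in fasta_text.splitlines():
        match = re.match(r">NODE_(\d+)(.*)", line)
        if match:
            # Shift the numeric part of the header, keep anything after it.
            line = ">NODE_%d%s" % (int(match.group(1)) + offset, match.group(2))
        renamed.append(line)
    return fasta_text.rstrip("\n") + "\n" + "\n".join(renamed) + "\n"


if __name__ == "__main__":
    example = ">NODE_1\nACGT\nAC\n>NODE_2\nTTGC\n"
    # Prints the two original records followed by NODE_11 and NODE_12.
    print(duplicate_with_renamed_headers(example), end="")
```

With `offset=10`, NODE\_1 to NODE\_8 become NODE\_11 to NODE\_18, which matches the naming in the file above.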
The reduced contigs output file (double8/contigs.reduced.fa):
>NODE_8
GCGATAGCTTCAATACTCCAATCGAAAATGAATGGCGTGTTTTTATTCAACAAAGTCTTA
>NODE_7
GCGCATTTGCATTGCGTAGCCAAGGGATTAATGACAAGTATACACCAAAATAGAACGCAC
>NODE_6
TGAAAAACTACCTTGCAATACAGAATCCTAACTATAATCTAGATCCAATCAGTTTGAGGT
>NODE_5
AAATTAGACGAATTTCACGAGATAAAAATAGCTACATGCCTTTCCGCATTAATGCAAGAA
>NODE_4
AATTACTTTTAAAATCATAAAGGTTGATAATCAGAAGTCAAAGTACTATACTTTTGCTTA
>NODE_3
AATGCTTGAATTTTAAAATTCACTGCAATTAAAGTAAGAGAAGGGTTGTGTACATCAAAA
>NODE_2
TTGCATTGAGAGGCGGTATGTTTTTCCAAGATTCTCAAGTAACAAGATTTTTAGTCTAGG
>NODE_11
ACGTATAGGGTGTCGAGCACATATCAATATGTCATGAACTGAGAACCTTTACCTTTTTGG
AGTAAGTACCCCTTTTGGGGAGAAGAA
>NODE_18
GCGATAGCTTCAATACTCCAATCGAAAATGAATGGCGTGTTTTTATTCAACAAAGTCTTA
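The leftover duplicate in the reduced output above can also be confirmed programmatically. The sketch below groups records by exact sequence and reports names that share one. It is only an illustration under stated assumptions: `read_fasta` and `identical_groups` are hypothetical helpers, only exact forward-strand matches are considered (no reverse complements, no partial overlaps), and this is not meant to reflect how Redundans itself decides what is redundant.

```python
from collections import defaultdict


def read_fasta(text):
    """Parse FASTA text into a {name: sequence} dict, joining wrapped lines."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def identical_groups(records):
    """Return lists of record names whose sequences are exactly identical."""
    by_sequence = defaultdict(list)
    for name, sequence in records.items():
        by_sequence[sequence.upper()].append(name)
    return [names for names in by_sequence.values() if len(names) > 1]


if __name__ == "__main__":
    # Three records taken from the reduced output listed above.
    reduced = (
        ">NODE_8\n"
        "GCGATAGCTTCAATACTCCAATCGAAAATGAATGGCGTGTTTTTATTCAACAAAGTCTTA\n"
        ">NODE_7\n"
        "GCGCATTTGCATTGCGTAGCCAAGGGATTAATGACAAGTATACACCAAAATAGAACGCAC\n"
        ">NODE_18\n"
        "GCGATAGCTTCAATACTCCAATCGAAAATGAATGGCGTGTTTTTATTCAACAAAGTCTTA\n"
    )
    print(identical_groups(read_fasta(reduced)))  # [['NODE_8', 'NODE_18']]
```

To run it on the real output, the text of double8/contigs.reduced.fa could be read from disk and passed to `read_fasta` instead of the inline example.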