Hello,
I need to edit the headers of a FASTA file. Specifically, I need to add a description to each header. For example, in the FASTA file below (the real file has ~50,000 sequences), I need to change the header from >comp1_c0_seq1 len=262 path=[2229:0-261] to >comp1_c0_seq1 comp1_c0
In other words, I need to keep the SEQ_NAME (comp1_c0_seq1), add a space, and then add the same SEQ_NAME without the _seq1 suffix. Keeping the len= and path= fields is optional.
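To make the desired result concrete, here is a minimal sketch of the transformation in Python rather than biopieces. The function names rewrite_header and rewrite_fasta are hypothetical, and the sketch assumes that every SEQ_NAME ends in _seq followed by digits and that the len= and path= fields can be dropped.

```python
import re


def rewrite_header(header: str) -> str:
    """Turn '>comp1_c0_seq1 len=262 ...' into '>comp1_c0_seq1 comp1_c0'."""
    tokens = header[1:].split()
    if not tokens:
        return header  # empty header: leave it untouched
    seq_name = tokens[0]  # first whitespace-delimited field after '>'
    # Assumption: the suffix to strip is always '_seq' followed by digits.
    group = re.sub(r"_seq\d+$", "", seq_name)
    return f">{seq_name} {group}"


def rewrite_fasta(lines):
    """Yield FASTA lines with rewritten headers; sequence lines pass through."""
    for line in lines:
        line = line.rstrip("\n")
        yield rewrite_header(line) if line.startswith(">") else line


if __name__ == "__main__":
    example = [
        ">comp1_c0_seq1 len=262 path=[2229:0-261]\n",
        "GAGATCTCTTTTTACTTAACGCTTAAACATTGAGATGTCAGGATAAGAGGAAGAACTGCA\n",
    ]
    for out_line in rewrite_fasta(example):
        print(out_line)
```

For a real file, the same generator could be fed an open file handle instead of the example list.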
I ran:
read_fasta -i test.fna | split_vals -k SEQ_NAME -d ' ' | rename_keys -k SEQ_NAME_0,SEQ_NAME | write_fasta -o out.fna -x
This effectively deleted the len= and path= fields, but now I need to add the extra portion. I have the SEQ_NAME values and the additional descriptions in a separate file. Can biopieces do this, and if so, with which tool?
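For reference, the lookup against the separate file could be sketched in Python as follows. This is not a biopieces solution. It assumes the separate file is tab-separated with SEQ_NAME in the first column and the description in the second; that layout, the function names load_descriptions and annotate, and the sample rows are all hypothetical.

```python
import io


def load_descriptions(handle):
    """Read 'SEQ_NAME<TAB>description' lines into a dict (assumed layout)."""
    table = {}
    for line in handle:
        parts = line.rstrip("\n").split("\t", 1)
        if len(parts) == 2 and parts[0]:
            table[parts[0]] = parts[1]
    return table


def annotate(lines, descriptions):
    """Yield FASTA lines, replacing each header with 'SEQ_NAME description'."""
    for line in lines:
        line = line.rstrip("\n")
        if not line.startswith(">"):
            yield line  # sequence line: pass through unchanged
            continue
        tokens = line[1:].split()
        name = tokens[0] if tokens else ""
        desc = descriptions.get(name)
        # Headers without an entry in the table keep only the SEQ_NAME.
        yield f">{name} {desc}" if desc else f">{name}"


if __name__ == "__main__":
    # Hypothetical contents of the separate description file.
    mapping = io.StringIO("comp1_c0_seq1\tcomp1_c0\ncomp3_c0_seq1\tcomp3_c0\n")
    fasta = [">comp1_c0_seq1 len=262 path=[2229:0-261]\n", "GAGATCTC\n"]
    for out_line in annotate(fasta, load_descriptions(mapping)):
        print(out_line)
```

Holding the table in a dict keeps the lookup at one pass over each file, which should be comfortable for ~50,000 sequences.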
Thanks for the great tools!
Seth
>comp1_c0_seq1 len=262 path=[2229:0-261]
GAGATCTCTTTTTACTTAACGCTTAAACATTGAGATGTCAGGATAAGAGGAAGAACTGCA
GGCAGATTTTCAAGACGCCTCCTGGCAATCTGTTTGCTGTCAAAGTTAGAAACTATCAGA
ATAGTTAGAAACTATTGCTATTGGTAGTACATTATCACTAAAGGGGGCTTCTTTTTGCAT
ACCCCTTTGTCTTATGAAAAGGCTTGAACCCACCCTTCTTCATTCTTTAATTGGGAGGGG
GGAAAGAAGTGAAGAATTACTG
>comp3_c0_seq1 len=390 path=[37:0-389]
CCGTGCTTTTCCTTTAAGTGCACTACTTCAAAGAAATTTGGCTGAGTGGGCTTGGCTTTT
TTTAGACAATCTGTTATTGTGCTTTCAACTAAAAAGACACTGAATAAATTATAGATGCTG
GGTTCAGAGCTAAAAAGCAAATGAGCTTATTTGGTGGCTTCAT