On Thu, May 4, 2017 at 2:58 PM, Luke McKay <mcg...@gmail.com> wrote:

Quick question, what is the easiest way to extract a list of genes from each bin? For example, let's say I want to make a concatenated ribosomal protein tree and I have a list of 16 ribo proteins of interest. Is there a simple way to retrieve this information quickly from a handful of specified bins?

--
Dr. Luke McKay
Postdoctoral Fellow
NASA Astrobiology Program
Department of Land Resources and Environmental Sciences (815 LJH)
Center for Biofilm Engineering (313 Barnard Hall)
Montana State University

tar -zxvf INFANTGUTTUTORIAL.tar.gz && cd INFANT-GUT-TUTORIAL
anvi-import-collection additional-files/collections/merens.txt -p PROFILE.db -c CONTIGS.db -C merens
anvi-get-sequences-for-hmm-hits -p PROFILE.db \
                                -c CONTIGS.db \
                                -C merens \
                                -o OUTPUT.fa \
                                --hmm-source Campbell_et_al \
                                --gene-names Ribosomal_L27,Ribosomal_L28,Ribosomal_L3 \
                                --return-best-hit \
                                --get-aa-sequences \
                                --concatenate
cat OUTPUT.fa>P_rhinitidis|genes:Ribosomal_L3,Ribosomal_L27,Ribosomal_L28|separator:XXX------------------------------------------MKYLVGKKIGMTQI----------FDEEGTVTPVSVIEVEPNVVVQKKTIESDGYNAIQVATQEVKEK--------KLNKPQKGHLDKAGVGYKKHLSEFRTDDVD-SYNLG---------------------------------------------------DEIKVD-IFEVAEHVDVVGTSKGKGTAGVIKRHNFGRGRETHG-SKFHRMPGGMGAASYPGKVFKNHRMAGKMGNERVTVQNLEIVRI------------------------------DTDKNLILVKGAIPGPKKGTVKIKSTVKLTK-------------------------------------------XXX-MIKFDLLLFSS----------------------KKGAGSSKNGRDSNSKRLGVKRGDGQFVLAGNILVRQRGTKIHPGENVMKGSDDTLFATADGVLRF----------------------------TTKGKGG---------------------------------------------------------------------KKFANVY----------------------------------------------------------------------------------------------------------------------------------------------------------------------------VEEKVEAXXX-------------------------------------------------------------------------------------MAKRCEICGKEKTFGNKISFSHSRSNRSWSPNLRKVKAIVN--GSPKRIYVCTRCLRS-----------------------------------------------------------------------------GKVERAI-------------------------------------->C_albicans|genes:Ribosomal_L3,Ribosomal_L27,Ribosomal_L28|separator:XXXMSHRKYEAPRHGSLGFLPRKRAAKQRGRVKSFPKDVKSKPVALTAFLGYKAGMTTIVRDLDRPGSKMHKREVVEAATVVDTPPMVVVGVV-----GYVETPRGLRSLTTVWAEHLSEEVRRRFYKNWYKSKKKAFTKYSGKYATDAKQVETELARIKKYASVVRVLAHTQIKKTPLSQKKAHLAEIQINGGSVSDKVDWAKEHFEKEVSVDSVFEQDEMIDVIAVTKGHGFEGVTHRWGTKKLPRKT--HRGLRKVACIG-AWHPANVNWTVARAGQNGYHHRTSINHKVYRVGKGTDEANGATEFDRTKKTINPMGGFVRYGNVNNDFVLLKGSIPGVKKRVVTLRKSLYVDTSRRAVEKVNLKWIDTASRFGKGRFQTPAEKHAFMGTLKKDLENXXX-MSSFVKGLFSHTRKSIDLTSNPLHTSIQIRTAKKRVSGSRTNNKDSAGRRLGPKKNEGHFVNPGQIIMRQRGTKIHPGDNVKIGVDHTIFAVEPGYVRYYFDPFHPLRKYVGVSLKKNLKLPRPHFEPRLRRFGYVQITDPIEAQEEEASQSRKEMLAQPELEKLKEKKLNEKIQFIESTKTALVNEFGFDSEPSSKQLEDASERLYNIYQLRASGQLLSEARIQTTFNTLYDLKLQAQKNNIDSLPNLLNEAKEFITRIDSIVGIEPTGELFKNLTKEEQLNLQKEISSELDTLYQTKALEKDYRIEAKKLINTPGVFEPLQREELMAKYLPQVLPMDYPGSIIEISDSDSKNKNKKLSENIVIQRIFDETTRKVKLIGRP
KEAFASAXXXMNVFRGLISIPRISCVSQIYSARQLSSTLPLSTKRTYDKFYKITKQLQPIDKNVYEIGQERPDNISIPKDLPEFPKYEYEPRFFKRQNRGLYGGLQRKRSKSCSEYLNKTLRAHRPNAQWTKLWSETLNKRLRLRVATRVLKTISKEGGLDQYLLKSTPARVKTMGLKAWQLRYRILQEREQKQRGNVTLLDGTTKPIQYISSNGLKFHATKDAMLSELYEAVQRDSYYPIKPFHFERDYSWLSYEEIVKKLEQYNWDFSELATK>F_magna|genes:Ribosomal_L3,Ribosomal_L27,Ribosomal_L28|separator:XXX------------------------------------------MKSILGKKIGMTQI----------FNEDGSVVPVTVIEAGPMVVTQIKTKEKEGYNAIQVGYIEKKEK--------HVNQPMRGHFGKAGVSFKKHLQEFRIGDDE-QFNLG---------------------------------------------------DEIKSD-IFQDGDVVDVIGISKGKGTQGAIVRHNYSRGPMGHG-SKSHRVAGARSAGSYPARVFKGRKGSGKMGHDRVTVQNLKIVKV------------------------------DNERNLLLIKGAVPGNKGGVVTVREAIKSK--------------------------------------------XXXMMIKLDLQLFSS----------------------KKGVSSTKNGRDSESKRLGTKKGDGQYVLAGNILVRQRGTKIHPGNNVGKGGDDTLFTKIDGVVKF----------------------------ERIGKN----------------------------------------------------------------------RKQVSVY-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------PKEAXXX------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------->Aneorococcus_sp|genes:Ribosomal_L3,Ribosomal_L27,Ribosomal_L28|separator:XXX------------------------------------------MKSIFTTKVGMTQV----------IDEDGVVTPVTVLKADENVVVQVKTEETDGYNAVQIGYMDKKEK--------NVKKPVKGHFDKAGASYKRYLKEVNYGNDPIELAVG---------------------------------------------------DKLAVD-IFEAGEVVDVVATSKGKGTQGAI------------------------------------------------------------------------------------------------------------------------------------------------------------------XXX-------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------XXX------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------->E_facealis|genes:Ribosomal_L3,Ribosomal_L27,Ribosomal_L28|separator:XXX-----------------------------------------MTKGILGKKVGMTQI----------FTESGELIPVTVVEATPNVVLQVKTVETDGYEAIQVGYQDKREV--------LSNKPAKGHVAKANTAPKRFIKEFKNVELG-EYEVG---------------------------------------------------KEIKVD-VFQAGDVVDVTGTTKGKGFQGAIKRHGQSRGPMSHG-SRYHRRPGSMG-PVAPNRVFKNKRLAGRMGGDRVTIQNLEVVKV------------------------------DVERNVILIKGNIPGAKKSLITIKSAVKAK--------------------------------------------XXXMLLTMNLQLFAH----------------------KKGGGSTSNGRDSESKRLGAKSADGQTVTGGSILYRQRGTKIYPGVNVGIGGDDTLFAKVDGVVRF----------------------------ERKGRD----------------------------------------------------------------------KKQVSVY-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------PVANXXX-------------------------------------------------------------------------------------MAKVCYFTGRKTSSGNNRSHAMNSTKRTVKPNLQKVRVLID--GKPKKVWVSTRALKS-----------------------------------------------------------------------------GKIERV--------------------------------------->S_aureus|genes:Ribosomal_L3,Ribosomal_L27,Ribosomal_L28|separator:XXX-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------XXX-MLKLNLQFFAS----------------------KKGVSSTKNGRDSESKRLGAKRADGQFVTGGSILYRQRGTKIYPGENVGRGGDDTLFAKIDGVVKF----------------------------ERKGRD----------------------------------------------------------------------KKQVSVY-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------AVAEXXX-------------------------------------------------------------------------------------MGKQCFVTGRKASTGNRRSHALNSTKRRWNANLQKVRILVD--GKPKKVWVSARALKS-----------------------------------------------------------------------------GKVTRV--------------------------------------->S_epidermidis|genes:Ribosomal_L3,Ribosomal_L27,Ribosomal_L28|separator:XXX-----------------------------------------MTKGILGRKIGMTQV----------FGENGELIPVTVVEASQNVVLQKKTEEVDGYNAIQVGFEDKQAYKKGSKSNKYANKPAEGHAKKADTAPKRFIREFRNVNVD-EYEVG---------------------------------------------------QEVSVD-TFETGDIIDVTGVSKGKGFQGAIKRHGQGRGPMAHG-SHFHRAPGSVGMASDASKVFKGQKMPGRMGGNTVTVQNLEVVQV------------------------------DTENSVILVKGNVPGPKKGLVEITTSIKKGNK------------------------------------------XXX-MLKLNLQFFAS----------------------KKGVSSTKNGRDSESKRLGAKRADGQYVSGGSILYRQRGTKIYPGENVGRGGDDTLFAKIDGVVKF----------------------------ERKGRD----------------------------------------------------------------------KKQVSVY-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------AVAEXXX-------------------------------------------------------------------------------------MGKQCFVTGRKASTGNHRSHALNANKRRWNANLQKVRILVD--GKPKKVWVSARALKS-----------------------------------------------------------------------------GKVTRV--------------------------------------->P_avidum|genes:Ribosomal_L3,Ribosomal_L27,Ribosomal_L28|separator:XXX--
----------------------------------MTNERTVKGVLGTKLGMTQL----------WDEHNKLVPVTVIQAGPCVVTQVRTPETDGYSAVQLGIGAVKAK--------KVTKPEAGHFEKAGVTPRRHLVELRTADAS-EYTLG---------------------------------------------------QEITAD-VFSESDFVDVTGTSKGKGTAGVMKRHGFGGLRATHGVHRKHRSPGSIGGCSTPGKVIKGLRMAGRMGAERVTVQNLQVHSV------------------------------DAERGIMLVRGAVPGPKGSLLVVRSAAKKAAKNGDAA-------------------------------------XXX---------MAH----------------------KKGASSSRNGRDSNAQRLGVKRFGGQLVNAGEIIVRQRGTHFHPGDGVGRGGDDTLFALRDGNVEF---------------------------GTRRG------------------------------------------------------------------------RKIVNVN---------------------------------------------------------------------------------------------------------------------------------------------------------------------------PVEVPVEAXXX-------------------------------------------------------------------------------------MSRRCQVRGTKPGFGNNVSHSQRHTKRRWNPNIQKKRYWVPSLGRQVTLTLTPKAMKEIDRRG---------------------------------------------------------------VDVVIAEMLARGEKI--------------------------------------->S_hominis|genes:Ribosomal_L3,Ribosomal_L27,Ribosomal_L28|separator:XXX-----------------------------------------MTKGILGRKIGMTQV----------FGENGELIPVTVVEANQNVVLQKKTEEVDGYNAIQVGFADKQAYKKDAKSNKYANKPAEGHAKKAGAAPKRFIREFRNVNVD-EYEVG---------------------------------------------------QEVTVD-TFEAGDIIDVTGTSKGKGFQGAIKRHGQGRGPMAHG-SHFHRAPGSVGMASDASRVFKGQKMPGRMGGNTVTVQNLEVVQV------------------------------DTDNNVILVKGNVPGPKKGFVEIKSSIKKGNK------------------------------------------XXX-MLKLNLQFFAS----------------------KKGVSSTKNGRDSESKRLGAKRADGQFVTGGSILYRQRGTKIYAGENVGRGGDDTLFAKIDGVVRF----------------------------ERKGRD----------------------------------------------------------------------KKQVSVY-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------AVAEXXX--------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------->L_citreum|genes:Ribosomal_L3,Ribosomal_L27,Ribosomal_L28|separator:XXX-----------------------------------------MTKGILGRKVGMTQV----------FTESGELIAVTAVEATPNVVLQVKNIATDGYNAIQLGYQDKRTV--------LSNKPEQGHASKANTTPKRYVREVRDAEG--EFNAG---------------------------------------------------DEIKVD-TFQAGDYVDVTGITKGHGFQGAIKKLGQSRGPMAHG-SRYHRRPGSMGAII--NRVFKGKLLPGRMGNNKRTMQNVAIVHV------------------------------DVENNLLLLKGNVPGANKSLLTIKSTVKVN--------------------------------------------XXX------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------XXX-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
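
Several bins in the output above have no hit for one or more of the requested genes, so their block for that gene is all gaps (for example, the Ribosomal_L3 block for S_aureus). Before building a tree it can help to see how much of each concatenated sequence is actually gap characters. Here is a minimal sketch that does this with awk. Note that `gap_report` is a hypothetical helper name, not part of anvi'o, and it assumes the header format shown above, where the bin name is the text before the first `|`. The reported length includes the `XXX` separators.

```shell
# Print one line per sequence: bin name, total length, number of "-" characters.
# Assumes FASTA headers look like ">BIN|genes:...|separator:XXX".
gap_report() {
    awk '
        function report() {
            if (name != "") {
                # gsub returns the number of replacements, i.e. the gap count;
                # replacing "-" with "-" leaves the sequence unchanged.
                gaps = gsub(/-/, "-", seq)
                printf "%s\t%d\t%d\n", name, length(seq), gaps
            }
        }
        /^>/ {
            report()
            name = substr($0, 2)    # drop the leading ">"
            sub(/\|.*/, "", name)   # keep only the text before the first "|"
            seq = ""
            next
        }
        { seq = seq $0 }            # join sequences wrapped over several lines
        END { report() }
    ' "$1"
}
```

Running `gap_report OUTPUT.fa` prints a tab-separated table. Bins whose gap count is close to their total length contributed little or nothing to the alignment and may be worth excluding.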
Luke$ anvi-get-sequences-for-hmm-hits -c contigs.db -p PROFILE.db/PROFILE.db -C tSNE_5000 -b C000137_1 --gene-names Ribosomal_L15,Ribosomal_S10,Ribosomal_L2,Ribosomal_L3,Ribosomal_L4,Ribosomal_L18,Ribosomal_L6,Ribosomal_S8,Ribosomal_L5,Ribosomal_L24,Ribosomal_L14,Ribosomal_S17,Ribosomal_S3,Ribosomal_L22,Ribosomal_S19,Ribosomal_10 --return-best-hit --hmm-source Rinke_et_al --get-aa-sequences -o C137_RiboProt_Concat_update.fa --concatenate