Sorry to say this is probably not a new issue. I saw that MAKER is an easy-to-use genome annotation pipeline designed to be usable by small research groups with little bioinformatics experience. That statement describes me to a T. I am new to whole genome sequencing and the associated analysis, and have a limited skill set with computers. I have a 1.52 GB, 75k-unitig assembly of a non-photosynthetic vascular plant, for which I would like to predict functional genes. I have tried to follow the tutorial that you posted for Winter School 2018 to see if I can duplicate your installation and implementation on my HPC. I doubt that you will be impressed, but I recently received a “certificate” for completing our HPC Linux course path at the University of Wyoming. I realize now how much I don’t know and how much work it will take to complete an annotation, but I need to generate at least some preliminary data to include in an upcoming grant submission (~2 months).
Sincerely,
Steve Miller
Botany
University of Wyoming
_______________________________________________
maker-devel mailing list
maker...@yandell-lab.org
http://yandell-lab.org/mailman/listinfo/maker-devel_yandell-lab.org
Thanks, Jason!
Yes, thank you, Jason, for responding! I have tried the Galaxy approach and, as you predicted, my dataset is too large to run an annotation locally on my laptop: about 157 times too large by my calculations. I would be happy to try your funannotate tool, but my dataset is in FASTA. Would I have to run BUSCO to get my FASTA data into a different format? And because of the size of my WGS assembly, would that likely have to be done on the HPC?
Although I do not have all the experience necessary to pull this off quickly, I am thinking about my experimental design and the steps that I will have to accomplish. My experimental design is this: I have a non-photosynthetic plant. I want to eventually find out how this plant functions differently from its fully photosynthetic cousin, and whether all related non-photosynthetic plants function in the same way. There are two annotated genomes of closely related plants in NCBI: 1. another fairly closely related non-photosynthetic plant and 2. a fairly closely related fully photosynthetic plant. I do not have a clue as to how good either of these annotations is.
For the annotation portion of my WGS, I can understand using the fairly closely related fully photosynthetic plant to train my annotation gene prediction, assuming that all functional genes required for photosynthesis and metabolism are present in this plant. Once my WGS is annotated, I can then move on to compare the functional genes in both non-photosynthetic plants.
So, these are the steps I must figure out to run this analysis on my HPC:
1. Installation of the necessary MAKER-related bits of software? MAKER version 2.31.10 is installed on my HPC, but I don’t know if all the associated tools are also installed (e.g. Augustus).
2. Upload my assembled WGS in FASTA format (from my limited knowledge I would need to use Globus FTP?).
3. Upload the annotated WGS of the fully photosynthetic plant from NCBI (again in FASTA using Globus FTP?).
4. Upload the annotated WGS of the other non-photosynthetic plant (again in FASTA using Globus FTP?).
5. Run MAKER, using one or the other fully annotated WGS to train MAKER (or Augustus) to predict the genes from my non-photosynthetic plant.
6. Use the ugly MAKER output data in GFF3, along with reports from InterProScan, a BLAST report of homology, and the script maker_map_ids, to make pretty graphics.
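For step 1, a minimal sketch of how one might check which tools are visible on the HPC before running anything. The list of executable names below is an assumption (programs commonly used alongside MAKER), not a confirmed list of what this MAKER 2.31.10 installation requires, and on clusters that use environment modules the relevant module may need to be loaded first for the tools to appear on the PATH.

```python
import shutil

# Assumed executable names for tools commonly used alongside MAKER;
# adjust to match whatever the local installation actually expects.
TOOLS = ["maker", "augustus", "snap", "blastn", "exonerate", "RepeatMasker"]


def find_tools(names):
    """Map each executable name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools(TOOLS).items():
        print(f"{name:15s} {path if path else 'NOT FOUND on PATH'}")
```

Any tool reported as not found is either not installed or not yet loaded into the current session, which would be a question for the HPC support staff.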
Thanks again for any help you can provide. I wish there was an upcoming workshop I could attend to get into this process in a hurry!
On Oct 6, 2021, at 11:11 AM, Mark Yandell <myan...@genetics.utah.edu> wrote: