_______________________________________________
maker-devel mailing list
maker...@box290.bluehost.com
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
On Aug 28, 2017, at 9:24 AM, Emmanuel Nnadi <een...@gmail.com> wrote:
Hi Ence,

Thanks for your reply.

This is the step and the error received:

emmannamekasMBP:src emmannaemeka$ ./build install
Installing MAKER...
Building MAKER
Skip /Users/emmannaemeka/desktop/Gpm/maker/src/../perl/config-darwin-thread-multi-2level-5.018002 (unchanged)
The build status is
=============================================================================
STATUS MAKER v2.31.9
==============================================================================
PERL Dependencies: VERIFIED
External Programs: VERIFIED
External C Libraries: VERIFIED
MPI SUPPORT: DISABLED
MWAS Web Interface: DISABLED
MAKER PACKAGE: CONFIGURATION OK

Nnadi Nnaemeka Emmanuel
Department of Microbiology,
Faculty of Natural and Applied Science,
Plateau State University, Bokkos, Plateau State, Nigeria.
On Aug 28, 2017, at 10:00 AM, Emmanuel Nnadi <een...@gmail.com> wrote:
Hi Daniel,

The reply is:

emmannamekasMBP:maker emmannaemeka$ MAKER -ctl
-bash: MAKER: command not found

Nnadi Nnaemeka Emmanuel
Department of Microbiology,
Faculty of Natural and Applied Science,
Plateau State University, Bokkos, Plateau State, Nigeria.
emmannamekasMBP:src emmannaemeka$ ./build install
Installing MAKER...
Building MAKER
Skip /Users/emmannaemeka/desktop/Gpm/maker/src/../perl/config-darwin-thread-multi-2level-5.018002 (unchanged)
The build status is
=============================================================================
STATUS MAKER v2.31.9
==============================================================================
PERL Dependencies: VERIFIED
External Programs: VERIFIED
External C Libraries: VERIFIED
MPI SUPPORT: DISABLED
MWAS Web Interface: DISABLED
MAKER PACKAGE: CONFIGURATION OK
emmannamekasMBP:maker emmannaemeka$ MAKER -ctl
-bash: MAKER: command not found
/usr/bin:/bin:/usr/sbin:/sbin:/opt/X11/bin:/Users/emmannaemeka/Desktop/Gpm/maker/bin/maker
1. Does this mean that PATH has been exported?
Secondly, I tried to run the commands maker -h, which maker, and maker -CTL;
nothing was returned.
2. How do I start up maker?
3. Do I need to be in the maker directory to start maker?
Thanks
Nnadi Nnaemeka Emmanuel
Department of Microbiology,
Faculty of Natural and Applied Science,
Plateau State University, Bokkos, Plateau State, Nigeria.
Publications: https://www.researchgate.net/profile/Emmanuel_Nnadi/publications
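
A note on the PATH value shown in this message: its last entry ends in .../maker/bin/maker, which looks like the executable itself rather than the directory that holds it. The shell only searches directories listed in PATH, so that would be consistent with "which maker" returning nothing. The help output quoted later in this thread also shows the command as lowercase maker with the option -CTL. A minimal sketch for checking this, assuming the program is named maker; the function name find_on_path is hypothetical:

```python
import os

def find_on_path(program, path_value):
    """Return (hits, problems) for `program` across the entries of a PATH string."""
    hits, problems = [], []
    for entry in path_value.split(os.pathsep):
        if not entry:
            continue
        if os.path.isdir(entry):
            candidate = os.path.join(entry, program)
            # A usable hit is a regular file with the execute bit set.
            if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
                hits.append(candidate)
        elif os.path.isfile(entry):
            # PATH entries must be directories; a file listed here is never searched.
            problems.append(
                f"{entry} is a file, not a directory; try {os.path.dirname(entry)}"
            )
        else:
            problems.append(f"{entry} does not exist")
    return hits, problems

if __name__ == "__main__":
    hits, problems = find_on_path("maker", os.environ.get("PATH", ""))
    print("found:", hits if hits else "nothing")
    for problem in problems:
        print("problem:", problem)
```

If the script reports that an entry is a file, listing the containing bin directory in PATH instead is the usual fix.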
On Sep 1, 2017 10:50 PM, "Carson Holt" <cars...@gmail.com> wrote:

It would need to be a new run. You won't be able to use the updated contig names with the old run.

--Carson

Sent from my iPhone

Hi Carson,

Thanks for the tip:

perl -ane 's/1_S7_R1_001_\(paired\)_trimmed_\(paired\)_//g; print' genome.fasta

It worked well: when I ran it, it removed 1_S7_R1_001_(paired)_trimmed_(paired)_. However, I have already run maker with 1_S7_R1_001_(paired)_trimmed_(paired)_ in the contig names.

1. How can I effect the change when maker has already produced some files from the old sequence?

I have spent more than 24 hours running maker and it has produced some folders already. How can I make this change?

Thanks

Nnadi Nnaemeka Emmanuel
Department of Microbiology,
Faculty of Natural and Applied Science,
Plateau State University, Bokkos, Plateau State, Nigeria.
On Fri, Sep 1, 2017 at 4:54 PM, Carson Holt <cars...@gmail.com> wrote:

BLAST, which is used by MAKER, cannot handle really long contig names. MAKER tries to get around this by adding a secondary tag to the fasta header when long names are detected. Even then it would be better to change the IDs of your contigs to avoid downstream failures.

I would recommend removing '1_S7_R1_001_(paired)_trimmed_(paired)_' from each contig name.

Example command to do that -->

perl -ane 's/1_S7_R1_001_\(paired\)_trimmed_\(paired\)_//g; print' genome.fasta

--Carson

On Aug 30, 2017, at 3:54 PM, Emmanuel Nnadi <een...@gmail.com> wrote:

Hi Carson,

Thanks for your response; it's been helpful. Please bear with me as I work through this.

1. How do I generate ESTs for my novel sequences?
2. I am currently running maker without EST and protein sequences. Is that wrong? Can it still predict properly?
3. One contig just returned this error:

FastaDB::_cleanIndexAndCompact(): Fasta file contains a sequence identifier which is too long ( max id length = 50 )
at /usr/local/bin/RepeatMasker line 1464.
FastaDB::_cleanIndexAndCompact(): Fasta file contains a sequence identifier which is too long ( max id length = 50 )
at /usr/local/bin/RepeatMasker line 1464.
FastaDB::_cleanIndexAndCompact(): Fasta file contains a sequence identifier which is too long ( max id length = 50 )
at /usr/local/bin/RepeatMasker line 1464.
ERROR: RepeatMasker failed
--> rank=NA, hostname=emmannaemekas-MacBook-Pro.local
ERROR: Failed while doing repeat masking
ERROR: Chunk failed at level:0, tier_type:1
FAILED CONTIG:1_S7_R1_001_(paired)_trimmed_(paired)_contig_2
ERROR: Chunk failed at level:2, tier_type:0
FAILED CONTIG:1_S7_R1_001_(paired)_trimmed_(paired)_contig_2
examining contents of the fasta file and run log

Nnadi Nnaemeka Emmanuel
Department of Microbiology,
Faculty of Natural and Applied Science,
Plateau State University, Bokkos, Plateau State, Nigeria.
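
The renaming step discussed in this exchange can also be sketched in Python. This is only an illustration, not the method used in the thread: it removes the prefix from header lines only, and it reports any sequence IDs still longer than the 50-character limit quoted in the RepeatMasker error. The function names and the output file name are hypothetical.

```python
import sys

PREFIX = "1_S7_R1_001_(paired)_trimmed_(paired)_"  # prefix quoted in the thread
MAX_ID_LEN = 50  # limit reported in the RepeatMasker error message

def shorten_headers(lines, prefix=PREFIX):
    """Yield FASTA lines with `prefix` removed from header lines only."""
    for line in lines:
        if line.startswith(">"):
            line = line.replace(prefix, "")
        yield line

def long_ids(lines, max_len=MAX_ID_LEN):
    """Return sequence IDs (first word of each header) longer than max_len."""
    too_long = []
    for line in lines:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts and len(parts[0]) > max_len:
                too_long.append(parts[0])
    return too_long

# Usage (hypothetical file names): python rename.py genome.fasta genome.renamed.fasta
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as src:
        renamed = list(shorten_headers(src))
    with open(sys.argv[2], "w") as dst:
        dst.writelines(renamed)
    for seq_id in long_ids(renamed):
        print("still too long:", seq_id)
```

Like the Perl one-liner without the -i switch, this leaves the original file untouched; the renamed sequences go to a new file, which is the file a new MAKER run would then point to.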
On Wed, Aug 30, 2017 at 4:12 PM, Carson Holt <cars...@gmail.com> wrote:

You can query valid species names using the queryTaxonomyDatabase.pl script that comes with RepeatMasker. Try not to be too specific. In general you should use the genus rather than the species, for example (or even use all of RepBase).

Example -->

perl …/RepeatMasker/util/queryTaxonomyDatabase.pl -species "drosophila"

--Carson

On Aug 30, 2017, at 9:05 AM, Emmanuel Nnadi <een...@gmail.com> wrote:

Hi Carson,

Thanks. I was able to start using maker.

However, I am working with a novel plant genome. I had set the repeat masking to Dcotrep, a name from the RepBase release, but maker returned it as not known to RepeatMasker.

How can I use specific known genomes for repeat masking?

Thanks
Nnadi Nnaemeka Emmanuel
Department of Microbiology,
Faculty of Natural and Applied Science,
Plateau State University, Bokkos, Plateau State, Nigeria.
Publications: https://www.researchgate.net/profile/Emmanuel_Nnadi/publications
On Aug 29, 2017 4:26 PM, "Carson Holt" <cars...@gmail.com> wrote:

MAKER will read the genome= option from the maker_opts.ctl file in your current directory or the maker_opts.ctl you specified on the command line. The error means you have left the value empty. Perhaps you did not save the changes you made, or you did not specify the location of the maker_opts.ctl file to use.

You can check the contents of the file using cat. Example --> cat maker_opts.ctl

--Carson

On Aug 29, 2017, at 5:11 AM, Emmanuel Nnadi <een...@gmail.com> wrote:

Hi Carson,

Thanks a lot for yesterday. I was able to resolve the issue of running maker, and I followed the commands in the tutorial.

However, I encountered another problem. I ran the command nano -c maker_opts.ctl and specified the name of the genome, 1_S7_assembly.fa. It gave the following:

#-----Genome (these are always required)
genome=1_S7_assembly.fa #genome sequence (fasta file or fasta embeded in GFF3 file)
organism_type=eukaryotic #eukaryotic or prokaryotic. Default is eukaryotic

#-----Re-annotation Using MAKER Derived GFF3
maker_gff= #MAKER derived GFF3 file
est_pass=0 #use ESTs in maker_gff: 1 = yes, 0 = no
altest_pass=0 #use alternate organism ESTs in maker_gff: 1 = yes, 0 = no
protein_pass=0 #use protein alignments in maker_gff: 1 = yes, 0 = no
rm_pass=0 #use repeats in maker_gff: 1 = yes, 0 = no
model_pass=0 #use gene models in maker_gff: 1 = yes, 0 = no
pred_pass=0 #use ab-initio predictions in maker_gff: 1 = yes, 0 = no
other_pass=0 #passthrough anyything else in maker_gff: 1 = yes, 0 = no

#-----EST Evidence (for best results provide a file for at least one)
est= #set of ESTs or assembled mRNA-seq in fasta format
altest= #EST/cDNA sequence file in fasta format from an alternate organism
est_gff= #aligned ESTs or mRNA-seq from an external GFF3 file
altest_gff= #aligned ESTs from a closly relate species in GFF3 format

#-----Protein Homology Evidence (for best results provide a file for at least one)
protein= #protein sequence file in fasta format (i.e. from mutiple oransisms)
protein_gff= #aligned protein homology evidence from an external GFF3 file

#-----Repeat Masking (leave values blank to skip repeat masking)
model_org=all #select a model organism for RepBase masking in RepeatMasker
rmlib= #provide an organism specific repeat library in fasta format for RepeatMasker
repeat_protein=/Users/emmannaemeka/Desktop/Gpm/maker/data/te_proteins.fasta #provide a fasta file of transposable element proteins for RepeatRunner
rm_gff= #pre-identified repeat elements from an external GFF3 file
prok_rm=0 #forces MAKER to repeatmask prokaryotes (no reason to change this), 1 = yes, 0 = no
softmask=1 #use soft-masking rather than hard-masking in BLAST (i.e. seg and dust filtering)

I ran the maker command in another tab and it returned the following:

STATUS: Parsing control files...
ERROR: You have failed to provide a value for 'genome' in the control files.

--> rank=NA, hostname=emmannamekasMBP

Questions

1. After specifying the genome location, do I need to run maker in the same tab, or can I open another bash tab?
2. My genome is novel and does not have proteins. How do I generate protein fasta and ESTs for the de novo sequence?

Thanks

Nnadi Nnaemeka Emmanuel
Department of Microbiology,
Faculty of Natural and Applied Science,
Plateau State University, Bokkos, Plateau State, Nigeria.
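
The check suggested in this exchange (look at what genome= actually contains in the maker_opts.ctl that MAKER will read) can be sketched as a small script. It assumes the key=value #comment layout shown in the quoted control file, and it assumes a relative genome path is resolved against the directory the script is run from; the function names are hypothetical.

```python
import os

def read_ctl(path):
    """Parse key=value lines of a MAKER control file, dropping '#' comments."""
    opts = {}
    with open(path) as fh:
        for raw in fh:
            line = raw.split("#", 1)[0].strip()
            if "=" in line:
                key, value = line.split("=", 1)
                opts[key.strip()] = value.strip()
    return opts

def check_genome(ctl_path="maker_opts.ctl"):
    """Return a short status message for the genome= entry of ctl_path."""
    if not os.path.isfile(ctl_path):
        return f"{ctl_path} not found (current directory: {os.getcwd()})"
    genome = read_ctl(ctl_path).get("genome", "")
    if not genome:
        return "genome= is empty"
    if not os.path.isfile(genome):
        return f"genome={genome} is set, but that file was not found"
    return f"ok: genome={genome}"

if __name__ == "__main__":
    print(check_genome())
```

Running it from the same directory in which maker is started shows whether the edited file was saved there; an empty value or a missing file points to the two causes named in the reply above.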
On Mon, Aug 28, 2017 at 6:47 PM, Carson Holt <cars...@gmail.com> wrote:

Here is a class on how to use MAKER taught a couple of years back --> http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/MAKER_Tutorial_for_GMOD_Online_Training_2014

There is also a linked video as well as an amazon image of the class material where you can run the image in the cloud and follow along.

Thanks,
Carson

On Aug 28, 2017, at 11:43 AM, Emmanuel Nnadi <een...@gmail.com> wrote:

Hi Carson,

Thanks a lot. I ran the command maker -h and it returned the following.

The last thing I wish to ask you: how can I load my genome file and begin annotation?

Thanks

emmannamekasMBP:maker emmannaemeka$ maker -h

MAKER version 2.31.9

Usage:

     maker [options] <maker_opts> <maker_bopts> <maker_exe>

Description:

     MAKER is a program that produces gene annotations in GFF3 format using
     evidence such as EST alignments and protein homology. MAKER can be used to
     produce gene annotations for new genomes as well as update annotations
     from existing genome databases.

     The three input arguments are control files that specify how MAKER should
     behave. All options for MAKER should be set in the control files, but a
     few can also be set on the command line. Command line options provide a
     convenient machanism to override commonly altered control file values.
     MAKER will automatically search for the control files in the current
     working directory if they are not specified on the command line.

     Input files listed in the control options files must be in fasta format
     unless otherwise specified. Please see MAKER documentation to learn more
     about control file configuration. MAKER will automatically try and
     locate the user control files in the current working directory if these
     arguments are not supplied when initializing MAKER.

     It is important to note that MAKER does not try and recalculated data that
     it has already calculated. For example, if you run an analysis twice on
     the same dataset you will notice that MAKER does not rerun any of the
     BLAST analyses, but instead uses the blast analyses stored from the
     previous run. To force MAKER to rerun all analyses, use the -f flag.

     MAKER also supports parallelization via MPI on computer clusters. Just
     launch MAKER via mpiexec (i.e. mpiexec -n 40 maker). MPI support must be
     configured during the MAKER installation process for this to work though

Options:

     -genome|g <file>    Overrides the genome file path in the control files
     -RM_off|R           Turns all repeat masking options off.
     -datastore/         Forcably turn on/off MAKER's two deep directory
      nodatastore        structure for output. Always on by default.
     -old_struct         Use the old directory styles (MAKER 2.26 and lower)
     -base <string>      Set the base name MAKER uses to save output files.
                         MAKER uses the input genome file name by default.
     -tries|t <integer>  Run contigs up to the specified number of tries.
     -cpus|c <integer>   Tells how many cpus to use for BLAST analysis.
                         Note: this is for BLAST and not for MPI!
     -force|f            Forces MAKER to delete old files before running again.
                         This will require all blast analyses to be rerun.
     -again|a            recaculate all annotations and output files even if no
                         settings have changed. Does not delete old analyses.
     -quiet|q            Regular quiet. Only a handlful of status messages.
     -qq                 Even more quiet. There are no status messages.
     -dsindex            Quickly generate datastore index file. Note that this
                         will not check if run settings have changed on contigs
     -nolock             Turn off file locks. May be usful on some file systems,
                         but can cause race conditions if running in parallel.
     -TMP                Specify temporary directory to use.
     -CTL                Generate empty control files in the current directory.
     -OPTS               Generates just the maker_opts.ctl file.
     -BOPTS              Generates just the maker_bopts.ctl file.
     -EXE                Generates just the maker_exe.ctl file.
     -MWAS <option>      Easy way to control mwas_server for web-based GUI
                         options: STOP
                                  START
                                  RESTART
     -version            Prints the MAKER version.
     -help|?             Prints this usage statement.
Emmanuel,
Look for anything that will help calculate basic assembly metrics, such as N50, NG50, L50, etc.; these almost always give overall assembly size, and total scaffolds/contigs. For instance I’ve used this:
http://korflab.ucdavis.edu/datasets/Assemblathon/Assemblathon2/Basic_metrics/assemblathon_stats.pl
(it requires FAlite, which is here: http://korflab.ucdavis.edu/Unix_and_Perl/FAlite.pm )
The Broad also has GAEMR (http://software.broadinstitute.org/software/gaemr/ ), but I haven’t tested it myself (I’ve heard it’s a bit finicky).
Also, see this: https://www.biostars.org/p/237591/ , which has a few more options.
chris
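
For a quick look before installing any of the tools mentioned in this message, the basic numbers (contig count, total assembly size, N50, L50) can be computed directly from the FASTA file. A minimal sketch, not a replacement for those tools: the function names are hypothetical, and NG50 is left out because it needs an estimated genome size.

```python
import sys

def read_lengths(lines):
    """Return the length of every sequence in FASTA-formatted lines."""
    lengths, current = [], None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths

def n50_l50(lengths):
    """N50 is the length of the contig at which the running total, taken
    longest first, reaches half the assembly size; L50 is how many contigs
    that takes. Returns (0, 0) for an empty assembly."""
    total = sum(lengths)
    running = 0
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        running += length
        if running * 2 >= total:
            return length, count
    return 0, 0

# Usage (hypothetical file name): python stats.py genome.fasta
if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1]) as fh:
        lengths = read_lengths(fh)
    n50, l50 = n50_l50(lengths)
    print("contigs:", len(lengths))
    print("total size:", sum(lengths))
    print("N50:", n50, "L50:", l50)
```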