FIMO Output


Prashant Kumar

Oct 25, 2016, 7:03:37 PM
to MEME Suite Q&A
Hi All,

I used the command line version of FIMO.

fimo --oc . --verbosity 1 --bgfile E_coli_K12_MG1655.fna.bfile --thresh 1.0E-5 x.meme E_coli_K12_MG1655.fna

I don't get the same output as the website. I don't see the gene names for each matched result.

I downloaded the E. coli .fna file from NCBI. Is FIMO using a different version?

Thanks,
Prashant

CharlesEGrant

Oct 25, 2016, 7:45:40 PM
to MEME Suite Q&A
Which database category did you select on the web form? For E. coli K12 MG1655 we provide both the full genome (under Genbank Bacteria Genomes and Proteins) and upstream sequences (under Upstream Sequences Prokaryotic). For all the online databases, the source information can be found in the "Sequence Databases" section under the "Databases" menu in the sidebar.

The full genome wouldn't be annotated with gene names, so I suspect you were looking at the "Upstream Sequences" category. We download the Prokaryotic upstream sequences from RSAT.
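A quick way to tell which kind of file you have is to look at its FASTA headers. The sketch below is only an illustration: the two inline records are made-up stand-ins rather than real database content, the helper names are hypothetical, and the `identifier|gene` layout is assumed from the sequence names shown in the FIMO output in this thread. It also assumes the sequence name is the first whitespace-delimited word of the header.

```python
def fasta_headers(fasta_text):
    """Return the header line (without '>') of every FASTA record."""
    return [line[1:].strip()
            for line in fasta_text.splitlines()
            if line.startswith(">")]


def sequence_name(header):
    """Take the first whitespace-delimited word of the header as the name."""
    parts = header.split()
    return parts[0] if parts else ""


def has_gene_label(header):
    """True when the name looks like 'identifier|gene' (assumed layout)."""
    ident, sep, gene = sequence_name(header).partition("|")
    return bool(sep) and bool(ident) and bool(gene)


# Illustrative records only; the sequences are placeholders.
genome_fasta = (
    ">NC_000913.3 Escherichia coli str. K-12 substr. MG1655, complete genome\n"
    "AGCTTTTCATTCTGACTGCA\n"
)
upstream_fasta = (
    ">NP_417194.2|ascG\nACGTACGT\n"
    ">YP_026182.1|ascF\nTTGACAAT\n"
)

genome_labelled = [has_gene_label(h) for h in fasta_headers(genome_fasta)]
upstream_labelled = [has_gene_label(h) for h in fasta_headers(upstream_fasta)]
```

A whole-genome file typically holds one long record per replicon, so every hit is reported against that single name, while an upstream-sequence file holds one short record per gene.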

Prashant Kumar

Oct 25, 2016, 9:00:35 PM
to MEME Suite Q&A
Thanks a lot

Prashant Kumar

Oct 26, 2016, 12:06:25 PM
to MEME Suite Q&A
I downloaded the file from http://rsat-tagc.univ-mrs.fr/rsat/ and ran FIMO with the same command-line syntax. This time I get output with the gene names.

But the website output is different from, and larger than, what I got on the command line.

CharlesEGrant

Oct 26, 2016, 3:50:43 PM
to MEME Suite Q&A
Did you set the p-value threshold on the website to 1e-05, as you did on the command line?

If so, please attach copies of the two FIMO outputs. That would help us diagnose the problem.

Prashant Kumar

Oct 26, 2016, 4:37:44 PM
to MEME Suite Q&A
Yes, I set the p-value threshold to 1e-5 on the website as well.

Website output:
#pattern name	sequence name	start	stop	strand	score	p-value	q-value	matched sequence
1	YP_026182.1|ascF	161	172	+	16.3761	2.59e-07	0.248	GTGAAACCGGTT
1	NP_417194.2|ascG	88	99	-	16.3761	2.59e-07	0.248	GTGAAACCGGTT
1	NP_417370.1|xerD	112	123	+	16.1743	4.6e-07	0.248	GTGAAACAGGAT
1	YP_026182.1|ascF	89	100	-	16.1651	5.1e-07	0.248	GTGAAACCGGTC
1	NP_417194.2|ascG	160	171	+	16.1651	5.1e-07	0.248	GTGAAACCGGTC
1	NP_416820.1|dedA	198	209	+	16.1651	5.1e-07	0.248	GTGAAACCGGTC
1	NP_417877.1|malT	272	283	+	15.6514	1.21e-06	0.442	GTGAAACAGTTT
1	YP_026218.1|malP	329	340	-	15.6514	1.21e-06	0.442	GTGAAACAGTTT
1	NP_417274.1|queF	61	72	+	15.5963	1.55e-06	0.501	GTGAAACATGTC
1	NP_418354.1|tpiA	34	45	-	15.2385	2.75e-06	0.687	GTGAAACAGTAT
1	NP_418175.3|yieL	132	143	+	15.2385	2.75e-06	0.687	GTGAAACAGTAT
1	NP_414879.3|lacI	77	88	+	15.1835	3.14e-06	0.687	GTGAAACCAGTA
1	YP_026189.1|yghJ	626	637	+	15.1835	3.14e-06	0.687	GTGAAACCTGAT
1	NP_415055.1|purK	1	12	+	15.0734	3.32e-06	0.687	ATGAAACAGGTT
1	NP_417020.1|fdx	72	83	+	15.0275	3.68e-06	0.687	GTGAAACCATTC
1	NP_414653.1|ampE	103	114	+	14.8716	4.29e-06	0.687	GTGAAACATTTT
1	NP_415901.2|ydbL	108	119	+	14.8716	4.29e-06	0.687	GTGAAACATTTT
1	NP_418735.2|fimI	248	259	-	14.8624	4.56e-06	0.687	ATGAAACCGGTT
1	NP_415276.1|gpmA	258	269	-	14.8624	4.56e-06	0.687	GTGAAACGGTTT
1	NP_416410.1|otsA	131	142	+	14.8073	5.31e-06	0.687	GTGAAACAGGGA
1	NP_417227.1|ispD	169	180	+	14.8073	5.31e-06	0.687	GTGAAACGTGTC
1	NP_415279.1|galT	74	85	-	14.7706	5.42e-06	0.687	GTGAAACCAGAA
1	NP_417074.1|grcA	485	496	+	14.7706	5.42e-06	0.687	GTGAAACCAGAA
1	NP_414698.1|erpA	219	230	-	14.6147	6.4e-06	0.713	GTGAAACCATAC
1	NP_416743.1|glpT	91	102	+	14.6055	6.72e-06	0.713	GTGAAACGTGAT
1	NP_417418.1|galP	114	125	-	14.6055	6.72e-06	0.713	GTGAAACGTGAT
1	NP_416744.1|glpA	171	182	-	14.6055	6.72e-06	0.713	GTGAAACGTGAT
1	NP_414790.1|insI1	222	233	+	14.4587	7.13e-06	0.713	GTGAAACAATTA
1	b4587|insN	222	233	+	14.4587	7.13e-06	0.713	GTGAAACAATTA
1	b1577|ydfE	19	30	-	14.4495	7.73e-06	0.713	ATGAAACCGGAT
1	NP_415006.1|htpG	146	157	+	14.2936	8.37e-06	0.713	GTGAAACAGCTT
1	NP_415419.2|ycaM	24	35	-	14.2477	9.05e-06	0.713	GTGAAACAGTAA
1	NP_414865.1|prpB	39	50	-	14.2477	9.05e-06	0.713	ATGAAACAAGAC
1	NP_416307.1|yoaF	77	88	+	14.2477	9.05e-06	0.713	ATGAAACAAGAC
1	NP_416308.4|yeaP	95	106	-	14.2477	9.05e-06	0.713	ATGAAACAAGAC
1	NP_414864.1|prpR	189	200	+	14.2477	9.05e-06	0.713	ATGAAACAAGAC
1	NP_416130.3|manA	359	370	-	14.2477	9.05e-06	0.713	GTGAAACGATAT
1	NP_416300.2|yeaJ	146	157	-	14.1927	9.58e-06	0.716	GTGAAACGAGAA
1	NP_417027.1|trmJ	236	247	+	14.1927	9.58e-06	0.716	GTGAAACCCGAC

Command Line Output:
#pattern name	sequence name	start	stop	strand	score	p-value	q-value	matched sequence
1	947154|ascF	89	100	-	16.7196	3.4e-08	0.0183	GTGAAACCGGTC
1	947305|ascG	160	171	+	16.7196	3.4e-08	0.0183	GTGAAACCGGTC
1	947305|ascG	88	99	-	16.6168	1.32e-07	0.0355	GTGAAACCGGTT
1	947154|ascF	161	172	+	16.6168	1.32e-07	0.0355	GTGAAACCGGTT
1	947921|malT	61	72	+	15.2617	2.37e-06	0.426	GTGAAACAGTTT
1	947922|malP	118	129	-	15.2617	2.37e-06	0.426	GTGAAACAGTTT
1	948409|tpiA	34	45	-	14.8505	4.61e-06	0.608	GTGAAACAGTAT
1	948035|yhjC	312	323	+	14.6822	5.26e-06	0.608	GTGAAACCCGGC
1	946704|glpT	91	102	+	14.5327	6.2e-06	0.608	GTGAAACGTGAT
1	947434|galP	91	102	-	14.5327	6.2e-06	0.608	GTGAAACGTGAT
1	946713|glpA	171	182	-	14.5327	6.2e-06	0.608	GTGAAACGTGAT
1	946113|ydfE	19	30	-	14.3738	7.57e-06	0.68	ATGAAACCGGAT
1       946290|yeaJ        146     157  
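
The two listings can also be compared programmatically. The following is a minimal sketch, not part of the MEME Suite: `parse_fimo` and `compare_hits` are hypothetical helper names, and hits are matched on gene name, strand and matched sequence because the identifiers before the `|` differ between the two files. It uses a handful of rows copied from the outputs above, and it skips rows that do not have all nine columns, such as the truncated last line.

```python
def parse_fimo(text):
    """Parse FIMO text output (tab- or space-separated) into a list of dicts."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and the header
        fields = line.split()
        if len(fields) != 9:
            continue  # skip truncated or malformed rows
        _, seq_name, start, _stop, strand, score, _pval, _qval, matched = fields
        hits.append({
            "gene": seq_name.split("|")[-1],  # part after the '|'
            "start": int(start),
            "strand": strand,
            "score": float(score),
            "matched": matched,
        })
    return hits


def compare_hits(web_hits, cli_hits):
    """Return (shared, only_web, only_cli), keyed on gene, strand and site.

    Assumes a gene has at most one hit per strand with a given site; a
    second such hit would overwrite the first in this simple sketch.
    """
    def key(hit):
        return (hit["gene"], hit["strand"], hit["matched"])

    web = {key(h): h for h in web_hits}
    cli = {key(h): h for h in cli_hits}
    shared = {
        k: {
            "start_shift": web[k]["start"] - cli[k]["start"],
            "score_delta": round(web[k]["score"] - cli[k]["score"], 4),
        }
        for k in web.keys() & cli.keys()
    }
    return shared, sorted(web.keys() - cli.keys()), sorted(cli.keys() - web.keys())


# A few rows copied from the two outputs in this thread.
web_text = """#pattern name sequence name start stop strand score p-value q-value matched sequence
1 YP_026182.1|ascF 161 172 + 16.3761 2.59e-07 0.248 GTGAAACCGGTT
1 NP_417370.1|xerD 112 123 + 16.1743 4.6e-07 0.248 GTGAAACAGGAT
1 NP_417877.1|malT 272 283 + 15.6514 1.21e-06 0.442 GTGAAACAGTTT
"""
cli_text = """#pattern name sequence name start stop strand score p-value q-value matched sequence
1 947154|ascF 161 172 + 16.6168 1.32e-07 0.0355 GTGAAACCGGTT
1 947921|malT 61 72 + 15.2617 2.37e-06 0.426 GTGAAACAGTTT
1 948035|yhjC 312 323 + 14.6822 5.26e-06 0.608 GTGAAACCCGGC
1 946290|yeaJ 146 157
"""

shared, only_web, only_cli = compare_hits(parse_fimo(web_text), parse_fimo(cli_text))
```

On these rows the malT site sits 211 positions further along in the website sequence, and the same matched sequence carries a different score in each run. That pattern would be consistent with the two runs differing in the sequence window and possibly in the background model, though the thread does not confirm either.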

CharlesEGrant

Oct 27, 2016, 1:29:09 PM
to meme-...@googlegroups.com
I'm not sure what you downloaded. It is similar, but clearly not exactly the same set of sequences as our online database.

The MEME Suite web site uses the RSAT programmatic API to download the upstream sequences. The database download source code we use can be found in the MEME Suite source distribution under the Projects/meme-trunk/website/src/au/edu/uq/imb/memesuite/updatedb directory. We don't provide assistance in running this program locally, though.

However, you should be able to use the RSAT web interface to get the same data.

Go to the RSAT Prokaryotes web page. Click on "Sequence Tools", choose "retrieve sequence", and select your organism. Under "Genes", select the "All" radio button. For "Reference feature type", select the "CDS" radio button. Select "upstream" as the sequence type, and set "from" to -1000 and "to" to 200. Finally, set the "Sequence label" to "gene identifier + name", then click on the "GO" button.
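
Once a file has been retrieved this way, one rough check that it matches another copy of the database is to compare sequence lengths gene by gene, since sequences cut with a different window will generally differ in length. This is only a sketch under stated assumptions: the helper names are hypothetical, the inline records are toy placeholders, and genes are matched on the part of the sequence name after the `|`.

```python
def read_fasta(fasta_text):
    """Return {sequence name: sequence} from FASTA text.

    The name is taken as the first whitespace-delimited word of the header.
    """
    chunks = {}
    name = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            words = line[1:].split()
            name = words[0] if words else ""
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line)
    return {n: "".join(parts) for n, parts in chunks.items()}


def gene_of(name):
    """Gene name is assumed to be the part after the last '|'."""
    return name.split("|")[-1]


def length_differences(fasta_a, fasta_b):
    """Return {gene: (length in a, length in b)} for genes whose lengths differ."""
    len_a = {gene_of(n): len(s) for n, s in read_fasta(fasta_a).items()}
    len_b = {gene_of(n): len(s) for n, s in read_fasta(fasta_b).items()}
    return {
        gene: (len_a[gene], len_b[gene])
        for gene in sorted(len_a.keys() & len_b.keys())
        if len_a[gene] != len_b[gene]
    }


# Toy records only; real upstream sequences are far longer.
copy_a = ">NP_417877.1|malT\nACGTACGTACGT\n>YP_026182.1|ascF\nACGTACGT\n"
copy_b = ">947921|malT\nACGTACGT\n>947154|ascF\nACGT\nACGT\n"

differences = length_differences(copy_a, copy_b)
```

An empty result would not prove the two files are identical, but any gene that shows up here is a candidate explanation for hits that appear at different coordinates or in only one of the two FIMO outputs.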

