GOMo pipeline clarification

Gianluca Mattei

Aug 8, 2024, 2:03:46 PM

to MEME Suite Q&A

Hi all,

I am using the MEME Suite to find out whether enriched motifs can alter specific pathways.

I would like to ask whether the commands I am using are correct. I also have some doubts about the GO term database file used as GOMo input. Where can I find it? Is the one I downloaded with GOMo the one I should use, or is it just an example file?

Thanks

G.

These are the commands I am using:

# to obtain the file with 0-order Markov Model

fasta-get-markov -m 0 ${FILE%.bed}.fasta > ${FILE%.bed}.xml

# to obtain the scoring files

ama --oc /output/Ama/$(basename ${FILE%.bed})_JASPAR2022_CORE -pvalues /database/motif_databases/JASPAR/JASPAR2022_CORE_non-redundant_v2.meme ${FILE%.bed}.fasta ${FILE%.bed}.xml

# running enrichment

gomo --oc /output/Gomo/$(basename ${FILE%.bed})_JASPAR2022_CORE --motifs /database/motif_databases/JASPAR/JASPAR2022_CORE_non-redundant_v2.meme /database/gomo_databases/mammal_homo_sapiens_1000_199.na.csv /output/Ama/$(basename ${FILE%.bed})_JASPAR2022_CORE/ama.xml

tlawb...@gmail.com

Aug 9, 2024, 8:30:12 PM

to MEME Suite Q&A

Hi G.,

You can download the actual database used by GOMo here:
https://meme-suite.org/meme/doc/download.html
under Databases, click on "GOMo Databases".

Cheers,

T.

Gianluca Mattei

Aug 14, 2024, 7:19:59 AM

to MEME Suite Q&A

Hi,

Thanks for the answer.

The files from that link are those I was using.

The point is that across all my analyses, which cover different experiments and different parameters, GOMo did not return anything (about 200 runs in total).

This is the command given in the tutorial:

gomo --oc gomo_example_output_files --dag go.dag --motifs dpinteract_subset.meme bacteria_escherichia_coli_k12_1000_199.na.csv ama.xml

I am using mammal_homo_sapiens_1000_199.na.csv instead, but this file seems to use UniProt IDs. The pipeline (fasta-get-markov --> ama --> gomo) and the other pipelines in the MEME Suite always return symbols. Could this be the problem?

Thanks

Gianluca

cegrant

Sep 9, 2024, 9:01:16 PM

to MEME Suite Q&A

Hi Gianluca,

The GO term database (the first required argument to GOMo) maps sequences to the GO terms they have been annotated with, using the sequence name. The sequence names in the GO term database have to correspond to the sequence names in the FASTA file submitted to AMA. You are using mammal_homo_sapiens_1000_199.na.csv as your GO term database, but that was built for the sequences in the provided mammal_homo_sapiens_1000_199.na, whereas your sequence file is generated from Homo_sapiens_assembly38.fasta. This isn't going to work: none of your sequence names will match the annotated names in mammal_homo_sapiens_1000_199.na.csv, so you get no results.

We provide the GO term databases corresponding to the sequence databases we provide for the GOMo application. If you are going to provide your own sequence file, you'll have to generate your own GO term database too. This might be problematic for you, since your sequence names are just going to be chromosome names and coordinates as generated from the BED file, while annotations are typically in terms of gene or protein names.
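
A quick way to confirm the name mismatch before running AMA and GOMo is to count how many FASTA sequence names also occur in the GO term database. The following is only a rough sketch: count_matching_names is a hypothetical helper (not part of the MEME Suite), and it assumes the database is plain tab-separated text, treating every field as a candidate sequence name rather than relying on a particular column layout.

```shell
# Hypothetical helper, not a MEME Suite tool.
# Prints how many distinct FASTA sequence names also appear as a field
# in the GO term database (assumed here to be tab-separated text).
count_matching_names() {
    fasta=$1
    go_db=$2
    names=$(mktemp)
    fields=$(mktemp)

    # Sequence name = first whitespace-delimited token after '>'
    sed -n 's/^>\([^[:space:]]*\).*/\1/p' "$fasta" | LC_ALL=C sort -u > "$names"

    # Every tab-separated field of the database, one per line
    tr '\t' '\n' < "$go_db" | tr -d '\r' | LC_ALL=C sort -u > "$fields"

    # Count the names present in both sorted lists
    LC_ALL=C comm -12 "$names" "$fields" | wc -l | tr -d ' '

    rm -f "$names" "$fields"
}

# Example call, using the paths from the commands earlier in this thread:
# count_matching_names "${FILE%.bed}.fasta" /database/gomo_databases/mammal_homo_sapiens_1000_199.na.csv
```

If this prints 0, none of the sequence names are annotated in the database, which would be consistent with GOMo returning no results.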
