Number of sequences in database


James

Nov 22, 2012, 1:33:32 PM
to slimfinder...@googlegroups.com
I've been using SLiMSearch3 to look at motif conservation and have found it very helpful. One question: how many sequences are in the human database that is being queried? For instance, for one of my motifs it reports 2562 instances in 2101 proteins, but I'd like to know the total number of proteins searched, so I can work out whether the motif is enriched in my dataset relative to the whole proteome.

I can't seem to find this information. The SLiMSearch2 paper mentions the UniProt release v1.37, but I'm not sure where to get this or whether a newer release is now being searched.

Thanks for the help.

James

Niall Haslam

Nov 22, 2012, 2:47:42 PM
to slimfinder...@googlegroups.com
Hi James,

Thanks for your comments.

The UniProt release used has 20284 proteins. However, sequences that are impossible to align are filtered out, so there are effectively only 20258 proteins in the final search database. If you want, I can send you a list of the UniProt IDs.

Niall





--
You received this message because you are subscribed to the Google
Groups "SLiMFinder Users" group.
To post to this group, send email to
slimfinder...@googlegroups.com
To unsubscribe from this group, send email to
slimfinder-user-...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/slimfinder-user-group?hl=en
To visit the SLiMFinder webserver goto:
http://bioware.ucd.ie/~compass/

James

Nov 22, 2012, 2:53:51 PM
to slimfinder...@googlegroups.com
Thanks Niall, that's perfect. I just needed to know the total number of proteins.

James
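
For anyone wanting to run the enrichment check described in this thread, here is a minimal sketch. It is not from the SLiMSearch authors; it assumes a one-sided hypergeometric test is a reasonable model and uses the figures quoted above (20258 proteins in the search database, 2101 of them containing the motif). The dataset numbers (dataset_size, dataset_hits) are made-up placeholders, and hypergeom_sf is a hypothetical helper name.

```python
from math import comb

def hypergeom_sf(k, N, K, n):
    """Return P(X >= k) for X ~ Hypergeometric.

    N: proteins in the background database
    K: background proteins containing the motif
    n: proteins in the dataset of interest
    k: dataset proteins containing the motif
    """
    upper = min(K, n)
    total = comb(N, n)
    # math.comb returns 0 when the second argument exceeds the first,
    # so impossible configurations contribute nothing to the sum.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total

# Figures quoted in this thread.
background_size = 20258   # proteins in the final search database
background_hits = 2101    # proteins with at least one motif instance

# Placeholder values (assumptions): replace with your own dataset counts.
dataset_size = 100
dataset_hits = 25

background_fraction = background_hits / background_size
expected_hits = dataset_size * background_fraction
p_value = hypergeom_sf(dataset_hits, background_size, background_hits, dataset_size)

print(f"Background fraction with motif: {background_fraction:.4f}")
print(f"Expected hits in dataset by chance: {expected_hits:.1f}")
print(f"One-sided enrichment p-value: {p_value:.3g}")
```

Note that the count used is proteins with the motif (2101), not motif instances (2562), since a protein either contains the motif or does not in this model.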


