Size of data universe in IPA v5 KB?


Yannick Pouliot

Jun 12, 2007, 6:34:06 PM
to Ingenuity Customer Support, stanf...@googlegroups.com

Hello. I'm preparing a class that features IPA and am looking for the number of assertions contained in the IPA knowledge base, and how these numbers have changed since June 2006 if possible, specifically:

1. total number of interactions
2. gene coverage
3. number of metabolic and signaling pathways

Might you have these numbers?

Cheers,

Yannick

_____________________
Yannick Pouliot, PhD MBA
Bioresearch Informationist
Lane Medical Library
Knowledge Management Center
Information Resources and Technology
Stanford University
ypou...@stanford.edu
http://lane.stanford.edu/contacts/pouliot.html
F: 650-725-2238

lhi...@ingenuity.com

Jun 14, 2007, 1:59:26 PM
to StanfordIPA
//
Question 1: What is the total number of interactions in Ingenuity
Pathways Analysis (IPA)?

Answer 1:
Ingenuity Pathways Analysis accesses knowledge stored in the Ingenuity
Pathways Knowledge Base. This knowledge base houses millions of
relationships between proteins, genes, complexes, cells, tissues,
drugs and diseases, including millions of pathway interactions
extracted from literature and systematic capture of canonical pathway
relationships. All relationships are manually curated and modeled by
a team of Ph.D. scientists, and are supported by experimental evidence
published in the peer-reviewed literature.

The collection of Interactions for a particular gene or gene product
can be viewed using the Neighborhood Explorer feature in IPA by first
searching for a gene, then clicking on the Neighborhood Explorer link
within the Gene View page. Interactions between molecules (proteins,
genes, drugs, chemicals, etc.) can be searched for and explored using
specific interaction search criteria on the Pathway canvas view within
IPA.

For a more complete description of the interaction types in the
Ingenuity Pathways Knowledge Base, see:
http://www.ingenuity.com/products/pathways_knowledge.html

//

Question 2: What is the gene coverage in the Ingenuity Pathways
Knowledge Base?

Answer 2:
Ingenuity Pathways Analysis includes coverage of all of the NCBI
EntrezGene Human, Mouse and Rat gene identifiers, and mapping
capability for various other types of identifiers to the EntrezGene
identifiers (which IPA uses as a primary identifier). The specific
type of knowledge covered and available within IPA for each gene
varies depending on the source material available in the knowledge
base.

In addition to the expert-extracted primary literature findings for a
gene, the Ingenuity Pathways Knowledge Base includes many other types
of content, including major NCBI databases (EntrezGene, RefSeq, OMIM
disease associations), FDA-approved and clinical trial drugs, Gene
Ontology annotations, normal gene expression for various tissues from
the GNF (Genomics Institute of the Novartis Research Foundation) Body
Atlas, KEGG and LIGAND metabolic pathways, cell signaling pathways,
and more.

Detailed information regarding coverage for each gene within IPA can
be accessed through the unique Gene View page for each gene (see link
below for an example). On the Gene View page within IPA, aggregate
knowledge for each gene is displayed along with categorized literature
findings from the knowledge base, organized according to categories
from the Ingenuity Ontology. These Categorized Literature Findings
appear at the bottom of each Gene View page, and are linked to the
citation and PubMed Identifier for the primary source article.

e.g.
Ingenuity Gene View: VEGFA (7206 categorized literature findings)
https://analysis.ingenuity.com/pa/api/v2/geneview?geneidtype=entrezgene&geneid=7422&applicationname=Entrez
Note: Please login using your IPA username & password to view the Gene
View Page
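
The Gene View link above follows a simple query-string pattern. Below is a minimal sketch for building such links for other genes. It assumes the endpoint and the three parameter names work exactly as they appear in that one example URL; this is inferred from the link alone, not from Ingenuity documentation, and the helper name gene_view_url is hypothetical. A valid IPA login is still needed to open the resulting page.

```python
from urllib.parse import urlencode

# Endpoint copied from the example link in this message (assumed stable).
GENE_VIEW_BASE = "https://analysis.ingenuity.com/pa/api/v2/geneview"


def gene_view_url(entrez_gene_id: int, application_name: str = "Entrez") -> str:
    """Build a Gene View link for an EntrezGene ID (hypothetical helper)."""
    params = {
        "geneidtype": "entrezgene",   # identifier type used in the example link
        "geneid": entrez_gene_id,     # e.g. 7422 for VEGFA
        "applicationname": application_name,
    }
    return f"{GENE_VIEW_BASE}?{urlencode(params)}"


print(gene_view_url(7422))
```

With the VEGFA identifier 7422, this reproduces the link quoted above.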

//

Question 3: What is the total number of metabolic and signaling
pathways within IPA?

Answer 3:
Ingenuity Pathways Analysis includes a Library of consensus metabolic
and cell signaling pathways from primary sources such as KEGG/LIGAND,
annual reviews, journal publications, textbooks, etc. The Ingenuity
library of Canonical Pathways is available to view within your IPA
user account. Each IPA Library pathway has been expertly modeled by a
PhD Biologist to represent the pathway as reported in the primary
reference source (cited for each pathway).

Ingenuity's goal in assembling the Ingenuity Pathway Library is to
represent the well studied and well described metabolic and signaling
pathways for the organisms currently supported in IPA (Human, Mouse &
Rat). The pathways within the IPA Library include a representative
view of the relevant biology from the cited source material. Pathways
with multiple inputs and end points have been represented as unified
pathways representing a biological process, rather than being broken
apart. The goal of this approach is to maximize the representative
biological view presented to the biologist. The library has been
constructed without consideration of total number of pathways. The
definition and concept of a pathway varies by source and author.
Ingenuity's goal is to accurately model canonical, or well accepted,
pathways from highly reliable primary sources.

In addition to the library of metabolic and signaling pathways
available within IPA, users can create and share their own MyPathways
representing knowledge of the pathway particular to their field of
research. This feature can also be used to model pathways from
various other public and private sources to which the individual
researcher has access (e.g. BioCarta, Science STKE, Reactome, Cancer
Cell Maps, HumanCyc, GenMAPP, etc.). Once modeled within an IPA user
account, these MyPathways can be used in the same way as the
pre-existing pathways for enrichment scoring of newly analyzed
datasets, to overlay expression results, etc. We encourage users to
build and share their own library of pathways using the modeling tools
and sharing capabilities within IPA.

//



Yannick Pouliot

Jun 14, 2007, 4:51:19 PM
to StanfordIPA
Thanks Lucas. Hum, I note the absence of hard, definitive numbers for
all three questions as they pertain to the state of the KB at time X.
Are these not disclosed? Being able to characterize the growth of the
database in general terms is what I'm seeking.

Cheers,

Yannick


Lucas Hickey

Jun 15, 2007, 12:02:07 PM
to stanf...@googlegroups.com, ypou...@stanford.edu
Hi Yannick,

To track growth of the knowledgebase, we typically use the number of literature findings arising from our ongoing expert extraction activity. The number of unique findings in our system with our current content release (5.1, June 2007) is now more than 1.7 million (~1.717M). This is an increase of roughly 200k to 300k findings over the past year, since June 2006.

Each finding is extracted from the full text of peer-reviewed journal publications by a trained PhD Biologist, most with more than 5 years of modeling experience and an affiliation with a top Academic Research Institution in the US, UK or Australia. Findings are structured using an extraction and reporting protocol and a web-based finding modeling and entry tool. This highly controlled process ensures that each finding and its context are captured and represented accurately, as substantiated by the experimental evidence the authors present. The protocol preserves the authors' scientifically valid observations while minimizing false associations, speculative findings arising from extraction of non-substantiated claims, and the erroneous relationships characteristic of alternative text-mining approaches.

Interaction findings, each reporting an experimentally observed interaction between two molecules substantiated by evidence presented in the full-text article, make up approximately 57% of the total number of literature findings in our KB.
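
For readers who want the figures above as rough absolute numbers, here is a back-of-the-envelope sketch. It uses only the values quoted in this message (~1.717M findings, approx. 57% interaction findings, growth of 200k to 300k since June 2006). The derived counts are estimates computed from those rounded figures, not numbers reported by Ingenuity, and the variable names are illustrative.

```python
# Figures quoted in this message (all approximate).
TOTAL_FINDINGS = 1_717_000          # content release 5.1, June 2007
INTERACTION_SHARE_PCT = 57          # approx. share of interaction findings
GROWTH_LOW, GROWTH_HIGH = 200_000, 300_000  # increase since June 2006

# Estimated number of interaction findings (integer arithmetic).
interaction_findings = TOTAL_FINDINGS * INTERACTION_SHARE_PCT // 100

# Implied range for the June 2006 total.
prior_low = TOTAL_FINDINGS - GROWTH_HIGH
prior_high = TOTAL_FINDINGS - GROWTH_LOW

# Implied year-over-year growth, relative to the June 2006 total.
growth_pct_low = 100 * GROWTH_LOW / prior_high
growth_pct_high = 100 * GROWTH_HIGH / prior_low

print(f"Estimated interaction findings: {interaction_findings:,}")
print(f"Implied June 2006 total: {prior_low:,} to {prior_high:,}")
print(f"Implied annual growth: {growth_pct_low:.1f}% to {growth_pct_high:.1f}%")
```

Under these assumptions the KB holds roughly 0.98 million interaction findings, and the total grew by about 13% to 21% over the year.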

Hopefully this will serve as an adequate proxy measure of Knowledgebase growth moving forward, and provide some additional context for readers regarding the origin and quality (believability) of content reported in the Ingenuity KB, accessed via IPA.


Thank you Yannick.

Lucas


- Lucas Hickey, Ingenuity Systems, www.ingenuity.com, office (650) 381-5056, mobile (408) 891-5705


