Hello Maggi,
The GenMAPP databases use information from Ensembl for Agilent links,
so if a particular Agilent ID is not mapped by Ensembl, that ID will
not be found in the GenMAPP database on import. This is most likely
the case for the subset of IDs in your data that don't import.
You say that you checked the expression dataset for the exceptions.
How did you check the IDs? Duplicated entries in the input file will
not be reported as exceptions; they will all be imported as long as
there is a match in the database.
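
One way to check this is to compare the IDs in the exception file against the input file and look for duplicates directly. The snippet below is only a rough sketch: the function name and the probe IDs are made up for illustration, and it assumes you have already read the ID column of each file into a Python list. Note that every exception ID should be present in the input file, since exceptions are input rows that found no match, so presence alone says nothing about duplication.

```python
from collections import Counter

def summarize_ids(input_ids, exception_ids):
    """Report duplicated probe IDs and how the exceptions relate to them.

    Hypothetical helper for illustration; not part of GenMAPP.
    """
    counts = Counter(input_ids)
    # IDs that occur more than once in the input file
    duplicated = {pid for pid, n in counts.items() if n > 1}
    exceptions = set(exception_ids)
    return {
        "duplicated": duplicated,
        # exceptions that also occur in the input (expected: all of them)
        "exceptions_in_input": exceptions & set(counts),
        # exceptions that are also duplicated entries
        "duplicated_exceptions": exceptions & duplicated,
    }

# Made-up IDs, only to show the shape of the check
input_ids = ["A_23_P100001", "A_23_P100011", "A_23_P100011", "A_24_P500000"]
exception_ids = ["A_24_P500000"]
report = summarize_ids(input_ids, exception_ids)
print(report)
```

If "duplicated_exceptions" comes back empty or small compared with the full exception list, duplication is not what is causing the exceptions.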
The MAPPFinder results you got seem reasonable. If you are not getting
as many hits as you expect, try setting slightly less stringent
criteria for the MAPPFinder analysis.
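
As a rough sanity check, the coverage implied by the calculation summary quoted below can be worked out directly. This sketch uses only the counts from that summary; the variable names are my own, and it makes no assumption about how MAPPFinder computes them internally.

```python
# Counts copied from the MAPPFinder calculation summary quoted below
total_probes = 43373      # probes in the dataset
probes_linked = 31648     # probes linked to an Ensembl ID
probes_meeting = 320      # probes meeting [Regulation] = "Down"
meeting_linked = 287      # of those, probes linked to an Ensembl ID

# Probes with no Ensembl link at all
unlinked = total_probes - probes_linked

# Fraction of the whole array, and of the filtered probes, that is linked
linked_fraction = probes_linked / total_probes
filter_linked_fraction = meeting_linked / probes_meeting

print(f"{unlinked} probes have no Ensembl link")
print(f"{linked_fraction:.1%} of all probes are linked to Ensembl")
print(f"{filter_linked_fraction:.1%} of the 'Down' probes are linked to Ensembl")
```

The number of probes without an Ensembl link (11725) is in the same range as the 11038 exceptions reported at import, which fits the explanation that the exceptions are unmapped IDs rather than duplicates.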
It is not a problem that your dataset is quite large; in fact, it is
often better to have a large dataset than one that is too small.
Regards,
Kristina
On Jul 7, 1:48 am, maggi <m.franc...@gmail.com> wrote:
> Hello,
> I am new to GenMAPP MAPPFinder. I need some help understanding the
> process of input data import into GenMAPP.
> I have installed GenMAPP 2. I am using the gene database
> Hs-Std_20070817.gdb.
> My expression dataset contains four columns. The first column is the
> Agilent probe ID, the second is the system code (Ag), the third contains
> the p value, and the fourth is the regulation (Up/Down) for criteria
> selection.
> After reading the recommendations for the input data, I understand that
> I have to upload the information for every probe on the array (background
> information or measured gene information). For instance, I am using the
> Agilent human whole genome array, which consists of 43373 probes,
> with some genes represented multiple times.
> After loading the expression data I get 11038 exceptions saying
> gene not found in Agilent or related system. But I checked some of the
> genes from the exception file to see if the genes are present in the
> expression dataset. I found that most of the genes marked as
> exceptions are present in the expression dataset. Does this mean
> that the exception IDs are actually multiple gene entries in the dataset?
> After running GenMAPP and MAPPFinder, the calculation summary in the
> MAPPFinder results looks like this:
> Calculation Summary:
> 320 probes met the [Regulation] = "Down" criteria.
> 287 probes meeting the filter linked to a Ensembl ID.
> 211 genes meeting the criterion linked to a GO term.
> 43373 Probes in this dataset
> 31648 Probes linked to a Ensembl ID.
> 14993 Genes linked to a GO term.
> The z score is based on an N of 14993 and a R of 211 distinct genes in
> the GO.
> Am I doing something wrong?
> Do I have to eliminate the multiple gene entries before importing the
> expression data to GenMAPP?
> Is this a large dataset to analyze?
> Any help would be welcome.
> Thank you for your help,
> Maggi.