Combine sub-arrays for analysis    

New method

 

[Version 4/10/06+] Two sub-arrays can be combined at "Open group" without using external data files, as described below. (Discussion for combining SNP sub-arrays)

  1. First analyze the data of each sub-array separately through normalization and model-based signal to obtain DCP files.
  2. Then at "Analysis/Open group/Other information", specify the 2nd sub-array CDF file at "Subarray CDF" and uncheck "Open group/Options/Load probe data in memory". dChip assumes that the CEL/DCP file name of the 2nd sub-array is the same as that of the 1st sub-array, except with "_2" appended at the end. For example, 01X_298B_x.dcp and 01x_298B_x_2.dcp will match each other.
  3. After "Open group", the genome information files containing the probe sets of both sub-arrays should be used at "Analysis/Chromosome". For the group name at "Open group", using the group name of the first subarray is fine, since this step just reads in additional signal and call values from the 2nd subarray, without altering DCP files.
  4. [V10/17/07+] The same sample information file used for the first subarray can be used for the second subarray, since dChip adds "_2" to the array names in the sample information file and tries to match them to the array names in the group.
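
The "_2" naming rule in steps 2 and 4 can be checked before opening the group. The following is a minimal sketch in Python, not part of dChip: the function names are hypothetical, and the case-insensitive comparison is an assumption based only on the example pair above (01X_298B_x.dcp and 01x_298B_x_2.dcp).

```python
from pathlib import Path


def second_subarray_name(first_name, suffix="_2"):
    """Build the file name expected for the 2nd sub-array.

    The suffix goes before the extension: 'a.dcp' -> 'a_2.dcp'.
    """
    p = Path(first_name)
    return f"{p.stem}{suffix}{p.suffix}"


def match_pairs(file_names):
    """Pair each 1st sub-array file with its '_2' partner, if present.

    ASSUMPTION: names are compared case-insensitively, as the example
    in the text suggests; dChip's actual rule may differ.
    """
    by_lower = {name.lower(): name for name in file_names}
    pairs = []
    for name in file_names:
        partner = by_lower.get(second_subarray_name(name).lower())
        if partner is not None:
            pairs.append((name, partner))
    return pairs
```

Calling match_pairs on the DCP file names in a folder lists the arrays that have a partner; any first sub-array file missing from the result would need its second file renamed.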

 

Old method

Since arrays for each sub-array type are normalized among themselves, the expression values for each probe set are comparable across samples. We can row-wise combine the probe sets of the sub-chips using the following steps.

 

1. Open a group of subA arrays. To keep gene descriptions out of the output, clear the “Gene Information File” in the “Analysis/Open Group/other information” dialog; alternatively, delete the gene description columns in the export file so that it conforms to the “Get external data” format. In the “Export data” dialog, select the arrays to be exported, check “has absolute call” and “has standard error”, and click OK to export the data (assuming the output file is 11kA.xls).

 

2. Do the same for a group of subB arrays (assuming the output file is 11kB.xls), making sure that the columns (arrays) correspond to those in 11kA.xls.

 

For the second sub-array, you can check “Tools/Export expression value/Append to this file” to append the output data to an existing data file of the first sub-array. The array list file used should be the same as, or have the same array ordering as, the one used for the existing file. Afterwards, open the file in Excel, delete the “gene, Accession, LocusLink, Description” columns if there are any, and save it as a text file.

 

3. Open 11kA.xls and 11kB.xls, copy all the data in 11kB.xls except the first “array name” row, and paste it into 11kA.xls starting at the first blank row below the existing data. Delete the “gene, Accession, LocusLink, Description” columns if there are any. Save 11kA.xls in Text (tab delimited) format, ignoring the warning message by clicking “Yes”.

 

4. Close all groups in dChip. Select “Analysis/Get External Data”, set “data file” to 11kA.xls, and check “has absolute call” and “has standard error”. The data will be read in.

 

5. Cluster or compare samples as usual. Currently, the maximum number of genes that dChip 1.1 can read in is 23000 (65000 in dChip 1.3), so you may need to first use “Compare Samples”, “Filter genes” or “Clustering/Export Selected” to get a subset of gene names. Then use these files as the “Gene list file” in the “Tools/Export Data/Expression values” dialog to export only the data for the filtered genes for each sub-chip type, and combine the resulting data files for dChip to read in.

 

6. Gene information files for the two sub-chips can be combined row-wise (save as tab-delimited text format) and specified at “Analysis/Get external data/Other information/Gene information file”. A better way is to use the “Tools/Make gene information” function, which accepts a combined CSV file of the sub-arrays made with a Windows “Command Prompt” command such as “copy HG-U133?_annot.csv HG-U133.csv” (do not replace “?” with A or B; “?” matches any single character in the command). Copying and pasting the two CSV files directly in a text editor or Excel can change the CSV file format and should not be used.
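
For readers who prefer to script the manual file handling in steps 3, 5 and 6 instead of using Excel and the Command Prompt, here is a minimal sketch in Python. It is not part of dChip. The function names are hypothetical, and the code assumes that the exported data files are tab-delimited text with one “array name” header row and that the first column of each data row holds the probe set name. The default limit of 23000 rows is the dChip 1.1 figure from step 5.

```python
import csv
import glob
from pathlib import Path


def combine_rowwise(path_a, path_b, out_path, keep=None, max_genes=23000):
    """Append the data rows of path_b below those of path_a (step 3).

    keep: optional set of probe set names to retain (step 5).
    ASSUMPTION: the first column of each data row is the probe set name.
    Returns the number of data rows written.
    """
    with open(path_a, newline="") as fa, open(path_b, newline="") as fb:
        rows_a = list(csv.reader(fa, delimiter="\t"))
        rows_b = list(csv.reader(fb, delimiter="\t"))
    header = rows_a[0]
    if len(rows_b[0]) != len(header):
        raise ValueError("the two files have different numbers of columns")
    # Drop the header row of the second file and any blank lines.
    body = [row for row in rows_a[1:] + rows_b[1:] if row]
    if keep is not None:
        body = [row for row in body if row[0] in keep]
    if len(body) > max_genes:
        raise ValueError(f"{len(body)} rows exceed the limit of {max_genes}")
    with open(out_path, "w", newline="") as fout:
        writer = csv.writer(fout, delimiter="\t", lineterminator="\n")
        writer.writerow(header)
        writer.writerows(body)
    return len(body)


def concat_files(pattern, out_path):
    """Concatenate all files matching pattern, byte for byte (step 6).

    In the pattern, '?' matches any single character. Copying bytes leaves
    the CSV formatting of each source file untouched.
    Returns the list of source files in the order they were written.
    """
    out = Path(out_path).resolve()
    sources = sorted(p for p in glob.glob(pattern) if Path(p).resolve() != out)
    with open(out, "wb") as fout:
        for src in sources:
            fout.write(Path(src).read_bytes())
    return sources
```

For example, combine_rowwise("11kA.xls", "11kB.xls", "11k_combined.txt") would produce a single file for “Analysis/Get External Data”, and concat_files("HG-U133?_annot.csv", "HG-U133.csv") plays the role of the copy command in step 6. The column-count check only catches gross mismatches; it does not verify that the arrays are in the same order in both files.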
