Subarna Sinha
Oct 14, 2011, 9:10:10 PM
to dchip-s...@googlegroups.com
Dear Cheng,
I have a few questions about using dChip to infer copy number from
Affymetrix SNP 6.0 arrays for matched normal-tumor samples.
1. I have about 200 matched samples of tumor and normal SNP 6.0 data. I
wasn't sure whether I could analyze all pairs at once, so I first
analyzed 1 pair (say, [a_tumor, a_normal]) and then a small set of
10 pairs (which also contains [a_tumor, a_normal]). The copy number
results for [a_tumor] differ depending on whether 1 pair or 10 pairs
are used for the analysis. Is this difference expected? If so, what is
your recommendation for running the 200 samples? Will I be able to run
all 200 samples in dChip in one run?
2. Related to question 1, when you have matched tumor-normal samples, do
you use information from other sample pairs to derive the copy number
of a given tumor-normal sample pair? My understanding from the manual
was that you shouldn't, but maybe I am wrong.
3. What kind of inference method would you recommend with 1 paired
sample versus multiple paired samples? I tried 'Median Smoothing'
versus 'HMM' and I found them to perform differently for 1 paired sample
versus 10 paired samples.
4. I often get fractional copy numbers after both HMM and Median
Smoothing, though mainly with Median Smoothing. Is that expected? I was
expecting copy number data to be integers, but I am getting numbers
like 0.04, 15.19, etc. That seems really strange.
5. Is there a description of how the output of the copy number analysis
should be interpreted?
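To make question 4 concrete, here is a minimal Python sketch of how a
running median over per-SNP ratios can produce non-integer values. This
is not dChip's actual code. The raw copy number formula
(2 x tumor / normal), the window size, the function names and the
intensity values are all assumptions made up for illustration.

```python
from statistics import median


def raw_copy_number(tumor, normal):
    """Per-SNP raw copy number, ASSUMING 2 copies in the matched normal."""
    return [2.0 * t / n for t, n in zip(tumor, normal)]


def median_smooth(values, window=3):
    """Running median over a centered window, truncated at both ends."""
    half = window // 2
    return [
        median(values[max(0, i - half): i + half + 1])
        for i in range(len(values))
    ]


# Hypothetical probe-level signals for six consecutive SNPs.
tumor = [1010.0, 980.0, 1500.0, 1530.0, 1490.0, 1020.0]
normal = [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]

raw = raw_copy_number(tumor, normal)
smoothed = median_smooth(raw, window=3)

for r, s in zip(raw, smoothed):
    print(f"raw={r:.2f}  smoothed={s:.2f}")
```

In this toy example the smoothed values are medians of noisy ratios, so
they stay close to 2 or 3 without being exact integers. Whether dChip
rounds or otherwise post-processes such values is part of what I am
asking.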
I am using the latest dChip version. Below, you can find the dChip
parameters I am using.
CDF_FILE=C:\Users\subar\COMMON\Project\GenomeWideSNP_6.cdf
READ_DAT=0
READ_CEL=1
READ_DCP=0
DATA_PATH=C:\Users\subar\COMMON\Project\sample_subset
WORKING_DIR=C:\Users\subar\COMMON\Software\DCHIP
GOSURFER_DIR=C:\Users\subar\COMMON\Project
USE_UNNORM=0
MAS5_SIGNAL=0
OPTION_PAGE=3
SAMPLE_INFO_FILE=sample_info_subset.txt
GENE_NAME_FILE=
GENOME_INFO_FILE=C:\Users\subar\COMMON\Project\combined_genome_wide_snp6.txt
REF_GENE_FILE=None C:\Users\subar\COMMON\Project\refFlat.txt
CYTOBAND_FILE=
MASK_FILE=
DETECT_SINGLE_OUTLIER=1
NORM_SMOOTH_METHOD=0
PROBE_SEQ_FILE=
BACKGROUND_METHOD=1
NO_REP_ARRAY_OUTLIER=0
ALTERNATE_TWO_VIEW=0
WHICH_ONLINE_DATABASE=0
DIRECT_GO_ONLINE_DATABASE=0
MODEL_METHOD=1
DIST_MEASURE=0
LINKAGE_METHOD=0
OUTLIER_IN_RANGE=1
USE_CV_FILTER=1
CV_LOWER=0.5
CV_UPPER=1000
USE_CALL_FILTER=1
FILTER_PRESENCE=20
USE_EXPR_FILTER=0
EXPR_FILTER_VALUE=20
EXPR_FILTER_PCT=50
USE_REP_FILTER=0
REP_FILTER_LOW=0
REP_FILTER_HIGH=0.5
FILTER_INPUT_LIST=
FILTER_INPUT_USE=1
FILTER_OUTPUT_LIST=C:\Users\subar\COMMON\Software\DCHIP\dChip association.xls
ANOVA_FACTOR=None (overall score)
ANOVA_PVALUE=0.05
CLUSTER_GENE=1
CLUSTER_SAMPLE=0
ARRAY_LIST_FILE=C:\Users\subar\COMMON\Software\DCHIP\dChip_array_list.txt
USE_STD_SEP=1
STANDARDIZE_COL=0
EXTERNAL_DATA_FILE=
MBEI_MEMORY=500
SHOW_PROFILE=0
ADD_NEW_COLOR=1
STANDARDIZE_ROW=1
GREEN_RED=0
DISPLAY_RANGE=5
SAMPLE_NAME_VISIBLE=1
SIG_SAMPLE_PVALUE=0.01
SIG_GENE_PVALUE=0.001
STORE_DISTANCE=1
CLUSTER_SHOW_PROBE_SET=0
GENE_LIST_FILE=
ALLOW_MISSING=0
HAS_STD=0
HAS_BOTH=1
SKIP_ROW_END=1
EXPORT_FILE_FORMAT=0
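
To check whether the 1-pair and 10-pair runs used the same settings, a
small Python sketch like the one below could parse two parameter
listings of this kind and report the keys that differ. The function
names are hypothetical, the KEY=VALUE layout is assumed from the listing
above, and the sample values are invented. A line without "=" is treated
as the wrapped continuation of the previous value.

```python
def parse_params(text):
    """Parse KEY=VALUE lines into a dict (assumed layout, not a dChip API)."""
    params = {}
    last_key = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if "=" in line:
            key, _, value = line.partition("=")
            last_key = key.strip()
            params[last_key] = value.strip()
        elif last_key is not None:
            # No "=": assume an email-wrapped continuation of the last value.
            params[last_key] = (params[last_key] + " " + line).strip()
    return params


def diff_params(a, b):
    """Return {key: (value_in_a, value_in_b)} for keys whose values differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Invented example listings; raw strings keep the backslashes literal.
run_one_pair = r"""
READ_CEL=1
MODEL_METHOD=1
FILTER_OUTPUT_LIST=C:\example\dChip
association.xls
"""
run_ten_pairs = r"""
READ_CEL=1
MODEL_METHOD=0
FILTER_OUTPUT_LIST=C:\example\dChip association.xls
"""

one = parse_params(run_one_pair)
ten = parse_params(run_ten_pairs)
print(diff_params(one, ten))
```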