23andme says (https://www.23andme.com/you/faqwin/dataaccuracy/) their technology
has over 99.9% repeatability. That means, if you get genotyped twice,
99.9% of the results will be the same.
That does not mean 99.9% accuracy. Particular DNA sequences may be prone
to being called incorrectly in the same way every time. This was the case with
Affymetrix arrays.
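To make the distinction concrete, here is a toy sketch in Perl. The genotypes are made up for illustration and fraction_equal is a hypothetical helper, not part of any 23andme or Illumina tooling. Two runs that share the same systematic miscall agree with each other perfectly, yet neither matches the truth.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Fraction of positions at which two equal-length lists of calls agree
sub fraction_equal {
    my ($x, $y) = @_;
    my $same = grep { $x->[$_] eq $y->[$_] } 0 .. $#$x;
    return $same / scalar(@$x);
}

# Made-up data: true genotypes at five SNPs, and two runs that
# both miscall the third SNP in the same way
my @truth = qw(AA AG GG CT TT);
my @run1  = qw(AA AG AG CT TT);
my @run2  = qw(AA AG AG CT TT);

printf "repeatability: %.2f\n", fraction_equal(\@run1, \@run2);
printf "accuracy:      %.2f\n", fraction_equal(\@run1, \@truth);
```

Here the repeatability is 1.00 while the accuracy is only 0.80, which is why a repeatability figure by itself says little about correctness.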
I'm not familiar with the 23andme technology.
This page:
http://www.chromosomechronicles.com/2010/03/27/analysis-of-23andmes-genotyping-high-accuracy-of-illumina-platform-confirmed-by-comparing-siblings/
claims that the SNP calls are 99.15% accurate.
There are about 1 million SNP calls in the 23andme data.
This would mean there are about 8500 incorrect SNP calls.
423586 of the 956734 SNPs in the report are listed in dbSNP.
I don't know why over half of the given rs numbers are not listed in
dbSNP; that shouldn't happen.
Only about 70,000 of the SNPs on the chip are non-synonymous
changes in protein (other changes may be important, but we seldom
know the importance of specific SNPs outside genes).
Only 3370 of the SNPs on the chip can be looked up in dbSNP,
and are in a named gene that has an entry in OMIM.
So we expect to find about (1-.9915) x 3370 = 28.6 false positives
(false genotypes reported at SNP locations).
Most genotypes are homozygous,
so about 3/4 of these false positives will mismatch the disease allele.
Some that match will be a heterozygous match to a recessive allele.
2603/3370 have dominance info; 2/3 of these are dominant.
If 1/4 of genotypes are heterozygous, screening out recessives will remove about
2603/3370 x 1/3 x 1/4 x 28.6 = 1.84 of the false positives.
So we expect 28.6 x 1/4 - 1.84 = 5.3 false positive disease indications.
(Note dominance should be applied per mutation rather than per gene,
but I don't have that info.)
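The arithmetic above can be checked with a short Perl sketch. All inputs are the figures quoted in this post; the variable names and the function expected_counts are made up for illustration, and the 1/4 and 1/3 fractions are the same rough assumptions used in the text.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Reproduce the back-of-envelope estimate from the figures in the post
sub expected_counts {
    my $error_rate  = 1 - 0.9915;  # claimed per-call error rate
    my $total_calls = 1_000_000;   # approximate SNP calls in the file
    my $omim_snps   = 3370;        # SNPs in dbSNP, in a named gene with an OMIM entry
    my $with_dom    = 2603;        # of those, SNPs whose gene has dominance info
    my $frac_recess = 1 / 3;       # 2/3 dominant, so 1/3 recessive
    my $frac_het    = 1 / 4;       # assumed fraction of heterozygous genotypes
    my $frac_match  = 1 / 4;       # assumed fraction matching the disease allele

    my $bad_calls = $error_rate * $total_calls;
    my $false_gt  = $error_rate * $omim_snps;
    my $removed   = $with_dom / $omim_snps * $frac_recess * $frac_het * $false_gt;
    my $remaining = $false_gt * $frac_match - $removed;
    return ($bad_calls, $false_gt, $removed, $remaining);
}

my ($bad_calls, $false_gt, $removed, $remaining) = expected_counts();
printf "incorrect calls overall:        %.0f\n", $bad_calls;
printf "false genotypes in OMIM genes:  %.1f\n", $false_gt;
printf "removed by recessive screening: %.2f\n", $removed;
printf "expected false indications:     %.1f\n", $remaining;
```

Carrying the unrounded 28.645 through the calculation rather than 28.6 changes the final figure only in the second decimal place.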
I found zero implicated disease mutations in my data.
(This was not due to bugs in the program filtering out all results.)
Sequencing a dozen genes with indicated SNPs
and checking the accuracy of the SNPs in them
is something that would be cheap and worth doing.
Neither 23andme nor Illumina provides any data
on the accuracy of the SNP chip! Only on reproducibility.
(23andme doesn't even tell you exactly which Illumina chip they use.)
This is the process I used:
0. Download your file from 23andme.
1. Download dbSNP from NCBI at ftp://ftp.ncbi.nih.gov/snp/.
You only need database/shared_data/SnpFunctionCode.bcp.gz
and database/organisms/human_9606/database/organism_data/b132_SNPContigLocusId_37_1.bcp.gz.
2. Read the explanation of SNPContigLocusId at
http://www.ncbi.nlm.nih.gov/projects/SNP/snp_db_table_description.cgi?t=SNPContigLocusId
3. Build a database from b132_SNPContigLocusId_37_1.bcp.
Save snp_id, locus_symbol (gene name when in a gene), protein_acc,
fxn_class, allele.
Index by snp_id.
4. Download OMIM from ftp://ftp.ncbi.nih.gov/repository/OMIM/ARCHIVE/ .
5. Build a DB from OMIM/ARCHIVE/genemap and omim.txt.
Read dominance info from omim.txt.
Read other info from genemap.
Read genemap.key for a description of the fields.
Save locus, gene, and disorder.
(Join the 3 disorder fields together, throwing out fields that are
just a space.)
Parse the gene field into individual gene symbols.
6. Parse your 23andme file.
For each rs# in the file, strip off the 'rs' and look up the number in
your SNPContigLocusId db.
Retrieve the locus_symbol, allele, and fxn_class.
If fxn_class is in the range 41-44 (stop-gain, missense, stop-loss, or frameshift; mutations that change the protein and often destroy it),
look up the locus_symbol in omim.
If you find it, retrieve the associated dominance and disorder.
Check the genotype from your 23andme data against
the allele of the SNP and the dominance info (if any).
If it is possible for the individual to have a disease caused by this
mutation, print out the SNP info, including disease;
and add the associated diseases to your list of possible diseases.
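The genotype check in step 6 can be sketched as a small standalone Perl function. This is a simplified illustration, not the code from the script below: the name guess_expressed is made up, it handles only single-nucleotide alleles, and treating an absent allele as "not expressed" is an assumption made here for completeness.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Returns 1 (probably expressed), -1 (probably not), or 0 (unknown).
# $genotype: two-letter call from the 23andme file, e.g. 'AG'
# $allele:   single-nucleotide SNP allele from dbSNP, e.g. 'A'
# $dom:      'd', 'r', 'x', or undef, as read from OMIM
sub guess_expressed {
    my ($genotype, $allele, $dom) = @_;
    return -1 if index($genotype, $allele) < 0;    # allele absent (assumption)
    return 1  if $genotype eq "$allele$allele";    # homozygous for the allele
    return 1  if defined $dom && ($dom eq 'd' || $dom eq 'x');  # x is a guess
    return -1 if defined $dom && $dom eq 'r';      # heterozygous, recessive
    return 0;                                      # heterozygous, dominance unknown
}

foreach my $case (['AA', 'A', undef], ['AG', 'A', 'r'],
                  ['AG', 'A', 'd'],   ['AG', 'A', undef]) {
    my ($genotype, $allele, $dom) = @$case;
    printf "%s %s %s => %d\n", $genotype, $allele,
        defined $dom ? $dom : '-', guess_expressed($genotype, $allele, $dom);
}
```

Because dominance is recorded per gene rather than per mutation, the 'r' and 'd' branches are only a rough guide.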
Here is Perl code to build the databases.
You'll need to install all referenced Perl modules, and also DBD::SQLite,
which is magically invoked without being referenced.
#!/usr/bin/perl -w
# Construct an SQLite DB from the NCBI dbSNP SNPContigLocusId table
# See http://www.ncbi.nlm.nih.gov/projects/SNP/snp_db_table_description.cgi?t=SNPContigLocusId
# Construct an SQLite DB from OMIM
# Copyright 2011 by Phil Goetz
use strict;
use Carp qw(cluck);
# This awesome line makes warnings give a stack trace:
local $SIG{__WARN__} = \&Carp::cluck;
use DBI;
use Getopt::Long qw(:config no_ignore_case no_auto_abbrev);
use lib 'lib';
use dbiSqlite; # library in 'lib/dbiSqlite.pm' with more DB functions
## Declare and define all constants
my $CliFile = "b132_SNPContigLocusId_37_1.bcp";
my $CliDir = "/data/gene/SNP/dbSNP/organisms/human_9606/database/organism_data";
my $Db; # Target database
my $Dbdir = 'data'; # directory to create databases in
my $DbSnp = 1;
my $Omim = 1;
my $OmimDir = '/data/gene/OMIM/ARCHIVE';
my $Test = 0;
## Read command-line arguments
GetOptions(
'dbdir:s' => \$Dbdir,
'dbsnp!' => \$DbSnp,
'cli:s' => \$CliFile,
'clidir:s' => \$CliDir,
'omim!' => \$Omim,
'omimdir:s' => \$OmimDir, # ':s' so the option takes a directory name
'test!' => \$Test,
);
mkdir($Dbdir) if !-d $Dbdir;
&makeDb($Dbdir, 'snp', \&dbSnp) if $DbSnp;
&makeDb($Dbdir, 'omim', \&omim) if $Omim;
exit(0);
########################
sub makeDb {
my ($dbdir, $db, $subRef) = @_;
my $dbfile = "$dbdir/$db";
print "Making $dbfile\n";
if (-e $dbfile) {
unlink($dbfile);
}
# Open connection to new db
my $dbh = &connectSqlite($dbdir, $db)
or die "Could not create SQLite DB $dbfile";
$subRef->($dbh);
print "Committing $dbfile\n";
$dbh->commit() or die "Could not commit inserts into $db";
$dbh->disconnect;
}
# $snpH: Database handle
sub dbSnp {
my ($snpH) = @_;
# Create table
my $cmd = "CREATE TABLE locusid (" .
"snp_id INTEGER, " .
"contig_acc VARCHAR(15), " .
#"contig_ver INTEGER, " .
"asn_from INTEGER, " .
"asn_to INTEGER, " .
"locus_id INTEGER, " .
"locus_symbol VARCHAR(50), " .
"mrna_acc VARCHAR(17), " .
"protein_acc VARCHAR(17), " .
"fxn_class INTEGER, " .
"reading_frame INTEGER, " .
"allele VARCHAR(25), " . # nucleotide present in SNP
"residue VARCHAR(5), " .
"aa_position INTEGER, " .
#"build_id VARCHAR(10), " .
"ctg_id INTEGER, " .
"mrna_pos INTEGER, " .
"codon VARCHAR(3) " .
")";
&makeTable($snpH, $cmd);
my $InsertCLI = "INSERT INTO locusid ( " .
"'snp_id', 'contig_acc', 'asn_from', 'asn_to', " .
"'locus_id', 'locus_symbol', 'mrna_acc', 'protein_acc', " .
"'fxn_class', 'reading_frame', 'allele', 'residue', " .
"'aa_position', 'ctg_id', 'mrna_pos', 'codon' " .
") VALUES ( ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ? )";
&makeSnp($snpH, "$CliDir/$CliFile", $InsertCLI);
&makeIndex($snpH, 'locusid', 'snp_id');
&makeIndex($snpH, 'locusid', 'locus_symbol');
&makeIndex($snpH, 'locusid', 'protein_acc');
&makeIndex($snpH, 'locusid', 'asn_from');
}
sub makeSnp {
my ($snpH, $file, $insert) = @_;
print "Reading $file\n";
open(my $IN, "<$file") or die "Could not open $file";
my $count = 0;
while ((my $line = <$IN>) && (!$Test || $count++ < 1000)) {
chomp($line);
my ($snp_id, $contig_acc, $contig_ver, $asn_from, $asn_to,
$locus_id, $locus_symbol, $mrna_acc, $mrna_ver,
$protein_acc, $protein_ver, $fxn_class, $frame, $allele,
$residue, $aapos, $build_id, $ctg_id, $mrna_pos0, $mrna_pos1, $codon)
= split(/\t/, $line);
$mrna_acc = "$mrna_acc.$mrna_ver" if $mrna_acc;
$protein_acc = "$protein_acc.$protein_ver" if $protein_acc;
&sql($snpH, $insert,
$snp_id, $contig_acc, $asn_from, $asn_to,
$locus_id, $locus_symbol, $mrna_acc,
$protein_acc, $fxn_class, $frame, $allele,
$residue, $aapos, $ctg_id, $mrna_pos0, $codon);
}
close($IN);
print "Done reading $file\n";
}
# Make OMIM DB
# $omimH: Database handle
sub omim {
my ($omimH) = @_;
my $cmd = "CREATE TABLE genemap (" .
"id INTEGER, " . # how we link our tables
"date INTEGER, " .
"loc VARCHAR(15), " .
"status VARCHAR(1), " .
"mim INTEGER, " .
"dominance VARCHAR(1), " . # 'd', 'r', 'i', 'm', 's', 'x', 'y', 'M', or NULL (see makeOmim)
"disorder VARCHAR, " .
# mim is NOT unique
#"CONSTRAINT id_uniq UNIQUE (mim))";
"CONSTRAINT id_uniq UNIQUE (id) )";
&makeTable($omimH, $cmd);
$cmd = "CREATE TABLE gene (" .
"id INTEGER, " .
"mim INTEGER, " .
"gene VARCHAR)";
&makeTable($omimH, $cmd);
&makeOmim($omimH, 'genemap');
&makeIndex($omimH, 'genemap', 'id');
&makeIndex($omimH, 'gene', 'id');
&makeIndex($omimH, 'gene', 'gene');
}
sub makeOmim {
my ($dbh, $table) = @_;
# First, read the omim.txt file to try to pick up dominant/recessive info
# Sadly, this is not a very machine-readable file
my $file = "$OmimDir/omim.txt";
my $count = 0;
my $mim;
my %dominance; # $dominance{$mim} = 'd' for dominant; 'r' for recessive
print "Reading $file\n";
open(my $OMIM, "<$file") or die "Could not open $file";
while ((my $line = <$OMIM>) && (!$Test || $count++ < 10000)) {
chop($line) while ($line =~ m/[\n\r]$/);
if ($line eq '*FIELD* NO') {
$line = <$OMIM>; # MIM number is on next line
chop($line) while ($line =~ m/[\n\r]$/);
$mim = $line;
}
elsif ($line eq 'INHERITANCE:') {
$line = <$OMIM>; # Inheritance mode is on next line
chop($line) while ($line =~ m/[\n\r]$/);
if ($line =~ m/dominant/i) {
$dominance{$mim} = 'd';
}
elsif ($line =~ m/recessive/i) {
$dominance{$mim} = 'r';
}
elsif ($line =~ m/isolated cases/i) {
$dominance{$mim} = 'i';
}
elsif ($line =~ m/multifactorial/i) {
$dominance{$mim} = 'm';
}
elsif ($line =~ m/somatic/i) {
$dominance{$mim} = 's';
}
elsif ($line =~ m/X-linked/i) {
$dominance{$mim} = 'x';
}
elsif ($line =~ m/Y-linked/i) {
$dominance{$mim} = 'y';
}
elsif ($line =~ m/mitochondrial/i) {
$dominance{$mim} = 'M';
}
else {
print "Unknown inheritance: $line\n";
}
}
}
close($OMIM);
# Now read genemap
$file = "$OmimDir/$table";
my $insert = "INSERT INTO $table ('id', 'date', 'loc', 'status',
'mim', 'dominance', 'disorder')" .
" VALUES ( ?, ?, ?, ?, ?, ?, ? )";
my $insertGene = "INSERT INTO gene ( 'id', 'mim', 'gene' ) VALUES (?, ?, ?)";
print "Reading $file\n";
open(my $IN, "<$file") or die "Could not open $file";
my $id = 1;
while (my $line = <$IN>) {
chop($line) while ($line =~ m/[\n\r]$/);
# Disorder is split arbitrarily into up to 3 fields
my ($num, $month, $day, $year, $loc, $gene, $status, $title, $foo,
$mim, $method, $comments, $thirteen, $diso14, $diso15, $diso16)
= split(/\|/, $line);
my @disos;
foreach my $diso ($diso14, $diso15, $diso16) {
# Skip empty fields, some of which have a space
push(@disos, $diso) if ($diso && length($diso) > 1);
}
my $disorder = join('', @disos);
my $date = "$year$month$day";
if ($loc eq 'Chr.X') {
if ($dominance{$mim} && $dominance{$mim} ne 'x') {
print "NOTE: $mim is on $loc yet says dominance=$dominance{$mim}\n";
}
else {
$dominance{$mim} = 'x'; # not sure how to interpret this
print "NOTE: Calling $mim dominance='x'\n";
}
}
# Insert most of data into main table
&sql($dbh, $insert, $id, $date, $loc, $status, $mim,
$dominance{$mim}, $disorder);
# Insert one row per gene, so we can look up by gene
# $gene from omim text file may be "GJA9, CX59"
my @genes = split(/,\s*/, $gene);
foreach my $g (@genes) {
# Should possibly insert lc($g) (lowercase)
&sql($dbh, $insertGene, $id, $mim, $g);
}
$id++;
}
close($IN);
}
----------------------------------------------
Here is the Perl file to parse the data, parseSnps ('parsnips').
Remove all lines that say '&track', '&trace', '&trackBegin', or '&trackEnd'.
It might not run right away; I am testing a slightly different version
that uses my libraries.
Usage: perl parseSnps -dbdir <directory for SQLite dbs> [-o outfile]
-snpdir <directory with genome file> -snps <genome file name>
[-syn|-nosyn] [-omim|-noomim] [-ud|-noud] [-us|-nous]
where
-syn: Print info on synonymous substitutions
-omim: Skip SNPs not matched to genes in OMIM
-ud: Print info on heterozygous SNPs in genes of unknown dominance
-us: Print info on SNPs of fxn_class other than 41-44
Note that dominance information is not reliable; it is taken on a
per-gene basis,
when it can be different for different SNPs. Possibly dominance
should not be checked for that reason.
#!/usr/bin/perl -w
# Read a 23andme SNP file.
# Use the sqlite DB made from NCBI SNPContigLocusId
# See http://www.ncbi.nlm.nih.gov/projects/SNP/snp_db_table_description.cgi?t=SNPContigLocusId
# Construct:
# a new version of the reference genome with those SNPs substituted into it
# a new file that is the original SNP file, plus gene ID, gene name,
# a field indicating whether the SNP is synonymous or not,
# and a field indicating any entry in OMIM for that SNP
# Copyright 2011 by Phil Goetz
use strict;
use Carp qw(cluck);
# This awesome line makes warnings give a stack trace:
local $SIG{__WARN__} = \&Carp::cluck;
use DBI;
use English;
use File::Basename;
my $Bin;
BEGIN {
$Bin = &dirname($0);
}
use lib "$Bin/lib";
use Getopt::Long qw(:config no_ignore_case no_auto_abbrev);
use dbiSqlite;
select STDOUT; $OUTPUT_AUTOFLUSH = 1;
## Declare and define all constants
my $Db; # Target database
my $Dbdir = 'data';
my $Omim = 1; # 1 => output only genes with disease from OMIM
my $OutFile = 'genome.txt';
my $SnpFile = 'genome_Your_Name_Full_Date.txt';
my $SnpDir ='directory/containing/SnpFile';
my $Syn = 0; # 1 => output synonymous substitutions
my $Test = 0;
my $UnknownDom = 1; # 1 => output SNPs with unknown dominance
my $UnknownSnp = 0; # 0 => output only SNPs with definitely-bad fxn_class
my $DEBUG = 0; # set by -debug; must be declared under 'use strict'
## Read command-line arguments
GetOptions(
'dbdir:s' => \$Dbdir,
'debug|d!' => \$DEBUG,
'omim!' => \$Omim,
'out|o:s' => \$OutFile,
'snpdir:s' => \$SnpDir,
'snps:s' => \$SnpFile,
'syn!' => \$Syn,
'test!' => \$Test,
'ud!' => \$UnknownDom,
'us!' => \$UnknownSnp,
);
my %Diseases; # hash of all implicated diseases
# Open connection to new db
my $snpH = &connectSqlite($Dbdir, 'snp') or die "Could not open DB snp";
my $omimH = &connectSqlite($Dbdir, 'omim') or die "Could not open DB omim";
# NOTE: 'locus_symbol' = gene symbol for genes
my $qSnpId = "SELECT contig_acc, asn_from, locus_id, locus_symbol,
protein_acc, allele, fxn_class FROM locusid WHERE snp_id=?";
my $qOmim = "SELECT gm.disorder, gm.dominance FROM gene g, genemap gm
WHERE g.gene=? AND g.id=gm.id";
my $snpFile = "$SnpDir/$SnpFile";
open(my $SNPS, "<$snpFile") or die "Could not open $snpFile";
open(my $OUT, ">$OutFile") or die "Could not write to $OutFile";
print $OUT "# Output from parseSnp, omim=$Omim syn=$Syn ud=$UnknownDom us=$UnknownSnp\n";
my ($count, $found, $unfound, $ocount, $dcount) = (0, 0, 0, 0, 0);
while ((my $line = <$SNPS>) && (!$Test || $count++ < 1000)) {
chop($line) while ($line =~ m/[\n\r]$/);
my ($allele, $asn_from, $contig_acc, $fxn_class, $locus_id, $gene,
$protein_acc, $disease);
# = ('', '', '', '', '', '');
if ($line =~ m/^[rsi]+(\d+)\t/) {
my ($snpid, $chromo, $posnt, $genotype) = split(/\t+/, $line);
# Some snpid are eg i21234, which is a 23andme-unique identifier
# Someday this program should look up genes by chromosome and position
my $rsnum;
my $expressed = 0; # -1 => no, 0 => unknown, 1 => yes
if ($snpid =~ m/^rs(\d+)$/) {
$rsnum = $1;
# Look up rs# in dbSNP
($contig_acc, $asn_from, $locus_id, $gene, $protein_acc, $allele,
$fxn_class) =
&colSql($snpH, $qSnpId, $rsnum);
if ($gene) {
$found++;
$protein_acc = '' if $protein_acc eq '.'; # fix bug in DB
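if (defined($allele) && $genotype =~ m/\Q$allele\E/) {
# Genotype has at least 1 copy of SNP allele
# Look up diseases and dominance in OMIM
# Assign to the outer $disease; re-declaring it with 'my' here
# would hide the value from the output test further down
my $dom;
($disease, $dom) = &colSql($omimH, $qOmim, $gene);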
if ($disease) {
$ocount++; # count of SNPs matched to a disease in Omim
$disease =~ s/\t//; # fix bug in DB
#&track("disease for $gene = $disease");
if (length($allele) == 1) {
# Guess whether the disease phenotype is expressed
if ($genotype eq "$allele$allele") {
$expressed = 1; # homozygous
}
# Most entries in genemap lack an entry in omim.txt,
# therefore lack dominance info
elsif ($dom && $dom eq 'r') {
$expressed = -1; # recessive heterozygous
}
# Expressed if dominant (just guessing about x)
$expressed = 1 if $dom && ($dom eq 'd' or $dom eq 'x');
$dcount++ if ($expressed != 0); # dominance count
}
else {
&trace("NOTE: $snpid in $gene: Cannot determine expression of allele $allele");
}
}
}
}
else {
#print "NOTE: Did not find $snpid\n";
$unfound++;
}
}
# Print only if SNP is non-synonymous, or if $Syn
# fxn_class: see database/shared_data/SnpFunctionCode.bcp for complete list
# 3 => synonymous SNP
# 6 => intron
# 8 => cds-reference: ?
# 9 => unknown
# 41 => STOP-GAIN
# 42 => missense
# 43 => STOP-LOSS
# 44 => frameshift
if ($Syn || !defined($fxn_class) || $fxn_class != 3) {
# Print only if linked to a disease in OMIM, or if !$Omim
if (!$Omim || $disease) {
&track("$snpid\t$gene\t$genotype\t$allele\t$expressed\t$fxn_class") if $allele;
# If $Omim and !$UnknownSnp, print only if fxn_class is serious
if ($UnknownSnp ||
# If $UnknownSnp, skip dominance test
(defined($fxn_class) && $fxn_class > 40 && $fxn_class < 45 &&
($expressed > 0 || ($UnknownDom && $expressed > -1)))) {
# Indicate this person may have this disease
my @diseases = split(/;\s+/, $disease);
foreach my $dis (@diseases) {
$dis =~ s/[{}]//g;
$dis =~ s/^\s+//g;
$dis =~ s/\s+$//g;
$Diseases{$dis}++;
}
no warnings 'uninitialized';
print $OUT
"$snpid\t$chromo\t$posnt\t$genotype\t$allele\t$locus_id\t$gene\t$protein_acc\t$fxn_class\t$disease\n";
use warnings 'uninitialized';
}
}
}
}
else {
print $OUT "$line\n";
}
}
print $OUT "Found $found SNPs in dbSNP; $ocount matched genes in OMIM; $dcount had dominance info.\n";
print $OUT "Did not find $unfound SNPs.\n";
# Print implicated diseases
while (my ($disease, $count) = each (%Diseases)) {
print $OUT "disease\t$disease\t$count\n";
}
close($SNPS);
close($OUT);
$snpH->disconnect;
$omimH->disconnect;
exit(0);
----------------------------------------------
Here is the library module they both use.
#!/usr/bin/perl -w
# Methods for using SQLite via DBI (save as lib/dbiSqlite.pm)
package dbiSqlite;
# Copyright 2011 by Phil Goetz
use strict;
use DBI;
use DBI::Const::GetInfoType; # eg $dbh->get_info($GetInfoType{SQL_DBMS_NAME})
use Exporter 'import';
our @EXPORT = qw( &connectSqlite &makeTable &makeIndex &sql &colSql );
# NOTE: SQLite will create the DB if it doesn't exist
sub connectSqlite {
my($dbdirFull, $db, $cache) = @_;
my $dbh;
my $dbargs = {AutoCommit => 0, PrintError => 1};
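# Check the connection before using the handle
$dbh = DBI->connect("DBI:SQLite:dbname=$dbdirFull/$db", "", "", $dbargs)
or die "FATAL ERR: Did not connect to DB=$db: $DBI::errstr";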
if ($cache) {
# Convert bytes to pages. Each page is about 1500 bytes.
$cache = $cache / 1500;
}
else {
$cache = 500000; # 750M for SQLite
}
my $cmd = "PRAGMA cache_size = $cache";
$dbh->do($cmd) or die "FATAL ERR: Can't do $cmd: $DBI::errstr";
# Tell SQL to hold temporary tables in memory
$cmd = 'PRAGMA temp_store = MEMORY';
$dbh->do($cmd) or die "FATAL ERR: Can't do $cmd: $DBI::errstr";
# Set a non-zero busy timeout, so we can catch 'db locked' with timeout
# NOTE: SQLite is notorious for locking the DB for many seconds to
# write a single row
$dbh->func(10000, 'busy_timeout'); # 10 seconds
die "FATAL ERR: Did not connect to DB=$db: $DBI::errstr" if !$dbh;
return $dbh;
}
sub makeTable($$) {
my ($dbh, $cmd) = @_;
print "$cmd\n";
$dbh->do($cmd) or die "Could not do($cmd)";
$dbh->commit() or die "Could not commit $cmd into DB";
}
sub makeIndex($$$) {
my ($dbh, $table, $field) = @_;
my $cmd = "CREATE INDEX ${table}_${field}_index ON $table($field)";
if ( !$dbh->do($cmd) ) {
if ($DBI::errstr !~ m/index ${table}_${field}_index already exists/) {
die "Could not $cmd: $DBI::errstr";
}
else { print "NOTE: $DBI::errstr\n"; }
}
$dbh->commit() or die "Could not commit $cmd into DB";
}
# Wrap prepare, execute, and fetchrow_array
sub sql {
my( $dbh, $query, @args ) = @_;
my @results = ();
my $statementHandle = $dbh->prepare($query);
@results = &_executeSql($statementHandle, $query, @args);
return @results;
}
# Do a query for which we want only a single column in the results
# Return a list of the values from that column
# If given a multi-column query, will append all values of all columns
# into @result
sub colSql {
my( $dbh, $query, @args ) = @_;
my @rows = &sql($dbh, $query, @args);
my @result;
foreach my $row (@rows) {
push(@result, @$row);
}
return @result;
}
# Returns a list of list references
sub _executeSql {
my( $statementHandle, $query, @args ) = @_;
$statementHandle->execute(@args)
or die "Failed query=($query) args=(@args)\n" . $statementHandle->errstr;
my @results = ();
my @row;
if ($query =~ m/^SELECT/i) {
while (@row = $statementHandle->fetchrow_array) {
my @local = @row;
push(@results, \@local);
}
}
#release the statement handle resources
$statementHandle->finish;
return(@results); # Returns an empty list if $query wasn't a SELECT query
}
1; # a module file must end with a true value
423586 of the 956734 SNPs in the report are listed in dbSNP.
I don't know why over half of the given rs numbers are not listed in
dbSNP; that shouldn't happen.
As of late 2010 they are using the Illumina HumanOmniExpress Plus
which is basically just a custom version of the HumanOmniExpress.
http://www.illumina.com/products/human_omni_express.ilmn
23andme added a bunch of extra SNPs to it - 20,000 or so iirc.
-cory
How on earth did 23andme get the price down to around $100?
They must be losing money.
The spreadsheet link was bad, I think. Here's another try:
http://bit.ly/publicgenetic