Hello Alva,
Thank you for your question about finding items from dbSNP with minor allele frequency under 1%. You will need to look in another table for those SNPs, as the snp*Common tables (snp135Common, snp141Common, etc.) only include SNPs with minor allele frequency >= 1%. You can see a description of each of the SNP tracks by clicking on the track's name from our main browser page at http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg19. Please note that the most recent version of dbSNP data currently available on our site is SNP141, but we will be releasing SNP142 for display soon.
The "by frequency" that you sometimes find in the 13th column ("valid") is a reference to how the SNP was validated - it does not correspond to the actual frequency of the allele. Many of the columns are explained on the track description page (e.g., http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=snp141). You can also select a table with the UCSC Table Browser (http://genome.ucsc.edu/cgi-bin/hgTables) and then click the "describe table schema" button for more information.
The frequency data that you are looking for can be found in the 24th and 25th columns. The 24th column, labeled "alleleNs", gives the number of times each allele was reported. The 25th column, labeled "alleleFreqs", gives the frequencies computed from those counts. Please note that some SNPs have very low reported counts, so the associated frequencies are likely to be inaccurate. For example, the following SNP is listed with frequencies of 50% for each allele, but each allele was reported only once:
| 585 | chr1 | 10256 | 10257 | rs111200574 | 0 | + | A | A | A/C | genomic | single | unknown | 0.5 | 0 | near-gene-5 | exact | 1 | | 1 | BUSHMAN, | 2 | A,C, | 1.000000,1.000000, | 0.500000,0.500000, |
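To make the column layout concrete, here is a minimal Python sketch that pulls the alleles, counts, and frequencies out of the example row above. It assumes the row has already been split into a list of column values (the table dumps are tab-separated), and the function name parse_allele_fields is only illustrative, not part of any UCSC tool.

```python
# Columns are 1-based in the text above, so the 23rd, 24th, and 25th
# columns are list indexes 22, 23, and 24 here.
ALLELES_IDX, ALLELE_NS_IDX, ALLELE_FREQS_IDX = 22, 23, 24


def parse_allele_fields(fields):
    """Return (alleles, counts, freqs) from one row of a snp141-style table.

    Each of these columns is a comma-separated list with a trailing comma,
    so empty pieces are dropped after splitting.
    """
    alleles = [a for a in fields[ALLELES_IDX].split(",") if a]
    counts = [float(x) for x in fields[ALLELE_NS_IDX].split(",") if x]
    freqs = [float(x) for x in fields[ALLELE_FREQS_IDX].split(",") if x]
    return alleles, counts, freqs


# The example row for rs111200574, one list element per column.
EXAMPLE_FIELDS = [
    "585", "chr1", "10256", "10257", "rs111200574", "0", "+", "A", "A",
    "A/C", "genomic", "single", "unknown", "0.5", "0", "near-gene-5",
    "exact", "1", "", "1", "BUSHMAN,", "2", "A,C,",
    "1.000000,1.000000,", "0.500000,0.500000,",
]

alleles, counts, freqs = parse_allele_fields(EXAMPLE_FIELDS)
for allele, count, freq in zip(alleles, counts, freqs):
    print(allele, count, freq)
```

With only one reported observation per allele, the 50% figures here say very little about the true population frequency.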
As I noted above, the snp135Common table will only contain SNPs with a minor allele frequency >= 1%. There are other tables that contain all SNPs, including ones with a frequency < 1%. These tables do not have "Common" in their names. For example, you may be interested in using the "snp135" table (or, for more recent data, "snp141"). You can then apply your own filter to the table to find the SNPs you are looking for. We recommend that you download the collection of SNP data either directly from dbSNP or from the table dumps on our download server at http://hgdownload.soe.ucsc.edu. For example, you can download the snp141 table from http://hgdownload.soe.ucsc.edu/goldenPath/hg19/database/snp141.txt.gz (note that this compressed file is 1.7 GB), and then filter that table yourself based on the frequency data. You will probably need a program to do the filtering for you, as there are more than 60 million SNPs described in that table. If you do not have your own program for this, you may find the online tools at Galaxy (https://usegalaxy.org) to be useful.
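If you would rather write the filter yourself, the following Python sketch shows one possible approach. It streams the compressed table dump line by line, so the whole file never has to fit in memory. Several things here are assumptions rather than anything defined by the table: the function names, the output file name, treating the second-highest value in the 25th column as the minor allele frequency, skipping SNPs with fewer than two reported frequencies, and the optional minimum total count used to drop SNPs with very few observations.

```python
import gzip

# 0-based indexes of the 24th (alleleNs) and 25th (alleleFreqs) columns.
ALLELE_NS_IDX = 23
ALLELE_FREQS_IDX = 24


def minor_allele_frequency(freq_field):
    """Return the second-highest frequency, or None if fewer than two exist.

    Assumption: "minor allele" means the second most frequent allele.
    """
    freqs = sorted((float(x) for x in freq_field.split(",") if x), reverse=True)
    if len(freqs) < 2:
        return None
    return freqs[1]


def iter_rare_snps(lines, max_maf=0.01, min_total_count=0.0):
    """Yield the split columns of rows whose minor allele frequency < max_maf."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= ALLELE_FREQS_IDX:
            continue  # malformed or truncated row
        maf = minor_allele_frequency(fields[ALLELE_FREQS_IDX])
        if maf is None or maf >= max_maf:
            continue  # no usable frequency data, or not rare
        total = sum(float(x) for x in fields[ALLELE_NS_IDX].split(",") if x)
        if total < min_total_count:
            continue  # too few observations to trust the frequency
        yield fields


def filter_file(in_path, out_path, max_maf=0.01, min_total_count=0.0):
    """Write the rare-SNP rows of a gzipped table dump to a plain text file."""
    with gzip.open(in_path, "rt") as src, open(out_path, "w") as dst:
        for fields in iter_rare_snps(src, max_maf, min_total_count):
            dst.write("\t".join(fields) + "\n")
```

For example, filter_file("snp141.txt.gz", "snp141.rare.txt", min_total_count=100) would keep SNPs with a minor allele frequency under 1% and at least 100 reported observations in total; the cutoff of 100 is arbitrary and should be adjusted to suit your analysis.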
I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu or genome...@soe.ucsc.edu. Questions sent to those addresses will be archived in publicly-accessible forums for the benefit of other users. If your question contains sensitive data, you may send it instead to genom...@soe.ucsc.edu.
--
Jonathan Casper
UCSC Genome Bioinformatics Group
--