Dear
Over the past three days, I have been trying to figure out how to analyze my ChIP-seq data: reading review articles, searching online, and going through the UCSC web information. Having no bioinformatics background, I am not sure I understand any of it. Here is my question:
I have sequence data in Excel as well as BED files with: chr, start, end, log p, etc. (attached file). These regions were also selected over the preimmune IgG control.
Now I would like to find the binding sites/sequences.
Can I do that on the UCSC Genome Browser? What are the steps? How can I download those data?
Can I do this on Windows, or do I need Unix/Linux or a Unix-like environment on Windows (Cygwin)?
Sorry for so many questions. If you could direct me to the specific UCSC links, or advise me whether other, more user-friendly programs are available, that would be highly appreciated.
Thanks in advance.
Rafiq Islam
Biochemistry
Northwest Missouri State University
Dear Rafiq,
Thank you for using the UCSC Genome Browser and for your question about ChIP-seq peak calling.
It is not immediately clear which stage of the process you have reached with your data, so you may wish to ask questions at a bioinformatics forum such as https://www.biostars.org/.
It may also be of interest to look at other external reference sites, such as the ENCODE project's Transcription Factor ChIP-seq Pipeline page, https://www.encodeproject.org/chip-seq/transcription_factor/, and the related software information: https://www.encodeproject.org/search/?type=Software&limit=all
If you have finalized BED regions, which appears to be what the files you attached contain, you can use them to perform various queries on the UCSC Genome Browser.
You can view your data by loading it as a custom track. Paste data such as the following on the hg19 Custom Track page, http://genome.ucsc.edu/cgi-bin/hgCustom?db=hg19, which is accessible from the top blue bar under "My Data" and then "Custom Tracks":
track name=uniqueNameBed1 description="chip5v-4v_1"
chr1 4588690 4589065
chr1 28320368 28320820
chr1 28700424 28701069
chr1 31415317 31415763
chr1 31501278 31501618
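If your regions currently live in a spreadsheet, a short script can produce text in the shape shown above. The following is only a minimal Python sketch: the function name make_custom_track and the in-memory list of peaks are illustrative assumptions, not part of any UCSC tool, and you would replace the list with rows read from your own file.

```python
def make_custom_track(rows, name, description):
    """Build custom-track text: one track line followed by BED3 lines.

    rows is an iterable of (chrom, start, end) tuples. This helper is a
    hypothetical illustration, not a UCSC-provided function.
    """
    lines = [f'track name={name} description="{description}"']
    for chrom, start, end in rows:
        start, end = int(start), int(end)
        if start >= end:
            raise ValueError(f"start must be less than end: {chrom} {start} {end}")
        # BED fields may be separated by tabs or spaces; tabs are used here.
        lines.append(f"{chrom}\t{start}\t{end}")
    return "\n".join(lines) + "\n"


# Two of the regions from the example track above.
peaks = [
    ("chr1", 4588690, 4589065),
    ("chr1", 28320368, 28320820),
]
track_text = make_custom_track(peaks, "uniqueNameBed1", "chip5v-4v_1")
print(track_text)
```

The printed text can then be pasted into the Custom Track page in the same way as the example.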
With this loaded in the Browser, you can use tools such as the Data Integrator to pull out other relevant data. Go to the top "Tools" menu and select "Data Integrator"; there you can set the "track group" to "Custom Tracks" and add the "track" uniqueNameBed1 by clicking the "Add" button.
If you are interested in transcription factor binding sites associated with these regions, next change the "track group" from "Custom Tracks" to "Regulation" and the "track" to "ENCODE Regulation -Txn Factor ChIP (wgEncodeRegTfbsClusteredV3)", then click "Add".
At this point, if you click "Get output", you will get all fields from this table for items that intersect your custom track. To trim that output, first click "Choose fields" under "Output options", then for "Txn Factor ChIP" select only the first five fields, "chrom chromStart chromEnd name score", and click "Done".
At the very top of the page, be sure you have "region to annotate" set to "genome" and then click "Get output".
This will result in information such as the following:
# hgIntegrator: database=hg19 region=genome Thu Jul 21 16:50:38 2016
#ct_uniqueNameBed1_6074.chrom ct_uniqueNameBed1_6074.chromStart ct_uniqueNameBed1_6074.chromEnd wgEncodeRegTfbsClusteredV3.chrom wgEncodeRegTfbsClusteredV3.chromStart wgEncodeRegTfbsClusteredV3.chromEnd wgEncodeRegTfbsClusteredV3.name wgEncodeRegTfbsClusteredV3.score
chr1 4588690 4589065
chr1 28320368 28320820 chr1 28320680 28320956 MAX 186
chr1 28700424 28701069 chr1 28699794 28700504 SETDB1 249
chr1 28700424 28701069 chr1 28699816 28700679 KAP1 312
chr1 28700424 28701069 chr1 28700735 28701179 POLR2A 307
chr1 28700424 28701069 chr1 28700803 28701133 EP300 130
chr1 31415317 31415763
chr1 31501278 31501618 chr1 31500919 31501509 IKZF1 216
chr1 31501278 31501618 chr1 31500943 31501307 TEAD4 255
chr1 31501278 31501618 chr1 31501026 31501362 TBL1XR1 218
chr1 31501278 31501618 chr1 31501322 31501672 SIRT6 108
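If you save this output to a file, it may help to summarize which factors overlap each of your regions. The Python sketch below is one possible way to do that, assuming the eight-column, whitespace-separated layout shown above, in which regions with no overlap carry only their own three columns. The function name factors_by_region is a hypothetical helper, not part of the Data Integrator.

```python
def factors_by_region(text):
    """Group (factor name, score) pairs by custom-track region.

    Assumes each data line holds chrom/start/end of the custom-track item,
    optionally followed by chrom/start/end/name/score of an overlapping
    Txn Factor ChIP item. Lines starting with '#' are treated as headers.
    """
    hits = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        region = (fields[0], int(fields[1]), int(fields[2]))
        # Record the region even if nothing overlaps it.
        factors = hits.setdefault(region, [])
        if len(fields) >= 8:
            factors.append((fields[6], int(fields[7])))
    return hits


# A few lines copied from the output above.
sample = """\
# hgIntegrator: database=hg19 region=genome
chr1 4588690 4589065
chr1 28320368 28320820 chr1 28320680 28320956 MAX 186
chr1 28700424 28701069 chr1 28699794 28700504 SETDB1 249
chr1 28700424 28701069 chr1 28699816 28700679 KAP1 312
"""

for region, factors in factors_by_region(sample).items():
    names = ", ".join(name for name, _score in factors) or "(no overlap)"
    print(region, names)
```

For your own data, you would read the saved output file into a string and pass it to the function in place of sample.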
Please note too that you can search our MLQ archives for previously answered questions, where you may find much valuable information: https://groups.google.com/a/soe.ucsc.edu/forum/?hl=en&fromgroups#!search/chipseq$20peak$20
Thank you again for your inquiry and for using the UCSC Genome Browser. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly accessible forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.
All the best,
Brian Lee
UCSC Genomics Institute