Are there published GC and mappability files for MOSAiCS in one-sample analysis?

hongxu

Mar 18, 2012, 7:26:36 AM
to MOSAiCS User Group
Hi Dongjun and everyone in this forum,

I'm trying to use MOSAiCS to analyze our data in one-sample analysis
mode. From my reading of the MOSAiCS pipeline, one-sample analysis needs
a mappability score file and a GC content score file to find peaks.
Are there any published GC and mappability files for MOSAiCS
one-sample analysis with reads of ~30 bp? Or could you provide scripts
to generate the GC and mappability files?

Thanks.

Best,
Hong Xu

Dongjun Chung

Mar 18, 2012, 6:21:33 PM
to MOSAiCS User Group
Hi Hong,

Yes, if you want to do one-sample analysis, you need the corresponding
mappability, GC content, and sequence ambiguity score files. Which genome
are you working on?

If it is hg18 or mm9, you can find the corresponding files here:

http://www.stat.wisc.edu/~keles/Software/mosaics/download.html

and if it is hg19, you can find the corresponding files here:

http://www.stat.wisc.edu/~chungdon/hg19/

If you are working on another genome, we can generate the corresponding
files for you. The MOSAiCS companion website
(http://www.stat.wisc.edu/~keles/Software/mosaics/) also provides scripts
to generate these files.
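
For readers who want a feel for what the GC content and sequence ambiguity scores measure, here is a minimal base R sketch that computes the fraction of G/C bases and of N bases in fixed-width bins of a toy sequence. The helper `bin_gc_n`, its `bin_size` argument, and the toy string are hypothetical. They are not taken from the MOSAiCS scripts, which may define and smooth these scores differently.

```r
# Hypothetical helper, not part of mosaics: per-bin GC and N fractions.
bin_gc_n <- function(dna, bin_size = 50L) {
  bases <- strsplit(toupper(dna), "")[[1]]               # one element per base
  bin   <- (seq_along(bases) - 1L) %/% bin_size          # 0-based bin index
  data.frame(
    start = as.integer(sort(unique(bin)) * bin_size),    # 0-based bin start
    gc    = as.numeric(tapply(bases %in% c("G", "C"), bin, mean)),
    n     = as.numeric(tapply(bases == "N", bin, mean))
  )
}

# Toy sequence (made up for illustration), split into 5 bp bins.
toy <- "ACGTGGCCNNATATATGCGC"
print(bin_gc_n(toy, bin_size = 5L))
```

For real use, the precomputed hg18, mm9, and hg19 files linked above are the safer choice, since they match what MOSAiCS expects.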

Best,
Dongjun

hongxu

Mar 19, 2012, 11:20:24 AM
to MOSAiCS User Group
Hi Dongjun,

Thanks for your reply. The genome we used for read mapping is hg18.

May I ask a further question? It seems that the function
"mosaicsRunAll()" in the latest version of mosaics (1.2.5) doesn't
support the standard BED format. Will mosaics address this in a future
release?

Best
Hong Xu

Dongjun Chung

Mar 23, 2012, 5:45:55 AM
to MOSAiCS User Group
Hi Hong,

Then, I think that you can use the files we uploaded to our MOSAiCS
companion webpage.

In the most recent version, "mosaicsRunAll()" (and also
"constructBins()") is supposed to work well with standard BED file
format. Can you please provide the first few lines of your BED file?
Then, I can test the functions with your files and in this case, it
should be much easier for me to fix the problem.

Thanks,
Dongjun

xu hong

Mar 23, 2012, 7:05:18 AM
to mosaics_u...@googlegroups.com
Hi Dongjun,

We have already analyzed our data step by step with the individual
MOSAiCS functions, and the results work well for us.

The first lines of our data, which you may want to test, are:
chr17 56966142 56966161 U0 0 +
chr2 135974326 135974345 U0 0 +
chr8 34493762 34493781 U0 0 +
chr10 21029741 21029760 U0 0 -
chr6 86878728 86878747 U0 0 -
chr13 67787147 67787166 U0 0 +
chr6 86757684 86757703 U0 0 -
chr14 77174899 77174918 U0 0 -
chr11 34155037 34155056 U0 0 +
chr3 78920690 78920709 U0 0 +
chr12 121438617 121438636 U0 0 -
chr2 211657334 211657353 U0 0 +
chr2 144318083 144318102 U0 0 -
chr7 130773223 130773242 U0 0 +
chr3 17512553 17512572 U0 0 -
chr5 106044449 106044468 U0 0 -
chr3 160207425 160207444 U0 0 +
chr14 38603321 38603340 U0 0 -
chr13 53690843 53690862 U0 0 +
chr8 139371280 139371299 U0 0 +
chr2 108791183 108791202 U0 0 +
chr10 103098771 103098790 U0 0 -
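
A quick sanity check on lines like these can be sketched in base R. The helper below, `check_bed6`, is hypothetical and not part of mosaics. It assumes six whitespace-separated fields (chrom, start, end, name, score, strand), numeric coordinates with start < end, and a strand of "+" or "-". The third example line is deliberately truncated to show what gets flagged.

```r
# Hypothetical helper, not part of mosaics: flag malformed 6-column BED records.
check_bed6 <- function(lines) {
  fields <- strsplit(trimws(lines), "[ \t]+")
  ok <- vapply(fields, function(f) {
    length(f) == 6L &&
      grepl("^[0-9]+$", f[2]) && grepl("^[0-9]+$", f[3]) &&
      as.numeric(f[2]) < as.numeric(f[3]) &&
      f[6] %in% c("+", "-")
  }, logical(1))
  which(!ok)   # positions of the records that fail any check
}

bed <- c("chr17 56966142 56966161 U0 0 +",
         "chr10 21029741 21029760 U0 0 -",
         "chr8 34493762 34493781 U0 0")     # strand removed on purpose
print(check_bed6(bed))
```

On a real file the same check could be run as `check_bed6(readLines("sample.bed"))`, where the file name is only a placeholder.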

Best
Hong Xu

hongxu

Mar 23, 2012, 7:38:25 AM
to MOSAiCS User Group
The help page on the Bioconductor website
(http://www.bioconductor.org/packages/2.10/bioc/html/mosaics.html) says:
"Format of the aligned read file of ChIP sample to be processed.
Currently, mosaicsRunAll permits the following aligned read file
formats: "eland_result" (Eland result), "eland_extended" (Eland
extended), "eland_export" (Eland export), "bowtie" (default Bowtie),
and "sam" (SAM)."

Maybe one solution for our ChIP-Seq data analysis is to re-map our
reads to the genome using ELAND or Bowtie.

Thanks.

Best
Hong Xu

Dongjun Chung

Mar 23, 2012, 3:02:21 PM
to MOSAiCS User Group
Hi Hong,

Both the "mosaicsRunAll()" and "constructBins()" functions in the newer
version of mosaics actually support the standard BED file format. This
change was not properly reflected in the help document of
"mosaicsRunAll()" before. Sorry about this. I noticed it a few days ago
and updated the help document. I briefly checked whether
"mosaicsRunAll()" works with the BED data you posted and could not
find any problem. Please try "mosaicsRunAll()" with your BED files
when convenient. :)

Thanks,
Dongjun

hongxu

Mar 23, 2012, 9:51:41 PM
to MOSAiCS User Group
Thanks. I'll try mosaicsRunAll() in version 1.3.2. ^_^

Best
Hong Xu

hongxu

Apr 11, 2012, 8:35:54 PM
to MOSAiCS User Group
Hi Dongjun,

I guess you have updated the mosaics user manual in Bioconductor 2.10.
However, when I ran mosaicsRunAll() for a minimal run in mosaics version
1.3.4, I still hit an error. Could you help fix it?
Thanks.

Here is the error reported by my analysis:
> mosaicsRunAll(chipFile="sample.bed",chipFileFormat="bed",controlFile="control.bed",controlFileFormat="bed",binfileDir = "/home/hongxu/R-2.14.2/bin/bin/",peakFile ="/home/hongxu/R-2.14.2/bin/R-2.14.2/bin/sample_versus_control_peak_list.txt",peakFileFormat="txt")
Error in mosaicsRunAll(chipFile = "ARallcombined_seq.bed", chipFileFormat = "bed", :
  Please specify 'chipDir'!

Best
Hong Xu

Dongjun Chung

Apr 16, 2012, 12:42:03 AM
to MOSAiCS User Group
Hi Hong,

I tried to reproduce your problem with the BED files I have, but I
could not. In mosaics version 1.3.4 or higher, the error "Please specify
'chipDir'!" should no longer occur. To figure out the problem, can you
please provide more details about this error? It would help if you
could provide some example lines from your BED files.
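
One detail that could be included in such a report is the mosaics version that R actually loads. It is only an assumption that an older installed copy might be involved, since the thread never establishes the cause. A minimal base R sketch, using a hypothetical helper `pkg_at_least`:

```r
# Hypothetical helper: TRUE/FALSE if the package is installed, NA if it is missing.
pkg_at_least <- function(pkg, min_version) {
  if (!nzchar(system.file(package = pkg))) return(NA)   # package not found
  utils::packageVersion(pkg) >= min_version
}

# 1.3.4 is the version mentioned above; the result depends on the local install.
print(pkg_at_least("mosaics", "1.3.4"))
```

The output of `sessionInfo()` gives the same information along with the R version and can be pasted into a reply.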

Thanks,
Dongjun

xu hong

Apr 16, 2012, 2:06:45 PM
to mosaics_u...@googlegroups.com
Hi Dongjun,

Thanks for your reply. We did not use mosaics for the manuscript we
are submitting this time, but we cited it. You should be able to find
our ChIP-seq data once the paper is accepted, if it gets published.

BR,
Hong Xu

Dongjun Chung

Apr 16, 2012, 9:49:24 PM
to mosaics_u...@googlegroups.com
Hi Hong,

Hope that you still enjoyed mosaics. :) Please update me when your data is publicly available. Good luck on your paper!

Thanks,
Dongjun

xu hong

Apr 17, 2012, 2:50:59 AM
to mosaics_u...@googlegroups.com
Hi Dongjun,
 
mosaics is a good ChIP-seq tool, and we will always enjoy it. I think you will find our ChIP-seq data on the website when our paper is close to publication, if it gets that far. ^_^
 
Thanks.
 
Best,
HX 

hongxu

Apr 20, 2012, 12:53:35 AM
to MOSAiCS User Group
Hi Dongjun and everyone in this forum,

Our ChIP-seq data is now public at http://www.systemsbiozju.org/data/chipseq/.
I think you could use it to benchmark both one-sample and two-sample
analysis.

Thanks.

Best,
Hong Xu

Dongjun Chung

Apr 20, 2012, 1:15:11 AM
to mosaics_u...@googlegroups.com
Hi Hong,

Thanks for the link to your chipseq data!

I guess that ARallcombined.bed.zip and IGGallcombined.bed.zip might be AR (androgen receptor?) ChIP and IgG control samples, respectively. Am I right?

I believe that they will be useful for benchmarks. Thanks!

Dongjun

xu hong

Apr 20, 2012, 3:32:30 AM
to mosaics_u...@googlegroups.com
Hi Dongjun,
 
Yes, you are right. AR is androgen receptor, and IgG is the corresponding control.
 
Best,
Hong Xu

xu hong

Apr 20, 2012, 11:47:13 PM
to mosaics_u...@googlegroups.com
Hi Dongjun,

I have found some errors in the IgG control file. For example:
chr10 124308450 124308469 U0 0+
chr9 84745735 84745754 U0 0-
chr4 48807287 48807306 U0 0+
chr11 44872405 44872424 U0 0-
chr1 155315247 155315266 U0 0+
chr5 66377557 66377576 U0 0+
chr17 12136161 12136180 U0 0-
chr10 72287406 72287425 U0 0-
chr9 128245432 128245451 U0 0+
chrX 69450797 69450816 U0 0-
chr18 57347234 57347253 U0 0+
chrX 121107869 121107888 U0 0+
chr6 124163274 124163293 U0 0+
chr14 104849304 104849323 U0 0+
chr11 117469927 117469946 U0 0+
chr3 3666275 3666294 U0 0-

I have asked our web administrator to correct this.
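
In the lines shown above, the score and strand appear fused into one token (for example `0+`), unlike the earlier sample where they are separate fields. Assuming that is the problem being reported (the message does not say so explicitly), a local copy could be patched with a small base R sketch like the following. The helper `split_fused_strand` and its `sep` argument are hypothetical and not part of mosaics.

```r
# Hypothetical helper, not part of mosaics: split a trailing "<score><strand>" token.
split_fused_strand <- function(lines, sep = "\t") {
  # Digits immediately followed by + or - at the end of the line are separated.
  sub("([0-9]+)([+-])[ \t]*$", paste0("\\1", sep, "\\2"), lines)
}

igg <- c("chr10 124308450 124308469 U0 0+",
         "chr9 84745735 84745754 U0 0-",
         "chr17 56966142 56966161 U0 0 +")   # already well-formed, left unchanged
print(split_fused_strand(igg, sep = " "))
```

The default separator is a tab, as in standard BED files. A space is used here only to match how the lines are displayed in this thread.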

Thanks.

Best
HX
