estimated run time for deadzones


Alex Koeppel

Aug 15, 2016, 11:03:37 AM
to RSEG Users
Hello all,

I'm running RSEG and I need to make a deadzones file for mm10 for 60 bp reads. I gathered from the documentation that I could speed things up by increasing the value of the -prefix option from the default of 5 up to 24 (I have plenty of memory).
What run time is reasonable to expect for creating the deadzones file? I ask because I launched it on Friday and left it running over the weekend, but I still have no output, no text written to stdout, and no error messages of any kind.

My command was:

deadzones -s fa -k 60 -prefix 24 -o deadzones-mm10-k60.bed $GENOME

where $GENOME is the path to the directory containing my chromosome FASTA files for mm10 (chr1.fa, chr2.fa, etc.).

As is, the process seems to be using ~5 GB of RAM. I have more than that, so I could raise -prefix further, but I'm somewhat hesitant to kill it in case it's almost done. Aside from memory, is there any other limit on how high it is advisable to set -prefix?

Any help or advice would be appreciated.  

Thanks,

Alex

Moshe Olshansky

Aug 15, 2016, 10:13:22 PM
to rseg-s...@googlegroups.com
Hi Alex,

Which version of rseg are you using?
I have not used rseg for a long time, but from memory it never took me more than overnight to build the deadzones with the default value of -prefix (5).
For 24-mers there are 4^24 > 10^14 possibilities. Of course, only a tiny proportion of them are present in mm10, but I do not know how rseg accounts for this.
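
To put those numbers in perspective, here is a rough Python sketch of the arithmetic. It only counts k-mers; it does not reproduce how deadzones works internally, and the mm10 size used (about 2.7 billion bases) is an approximation chosen for illustration. The function names are hypothetical.

```python
# Rough arithmetic only; this does not model the internals of deadzones.

MM10_APPROX_BASES = 2_700_000_000  # approximate mm10 size (assumption for illustration)


def kmer_space(k: int) -> int:
    """Number of distinct DNA k-mers over the alphabet A, C, G, T."""
    return 4 ** k


def max_fraction_present(k: int, genome_bases: int) -> float:
    """Upper bound on the fraction of all k-mers that can occur in a genome.

    Each position on each of the two strands contributes at most one k-mer.
    """
    return min(1.0, 2 * genome_bases / kmer_space(k))


for k in (5, 24):
    print(f"k={k}: {kmer_space(k):,} possible k-mers, "
          f"at most {max_fraction_present(k, MM10_APPROX_BASES):.2e} of them in the genome")
```

With the default of 5 there are only 1,024 possible prefixes, whereas any per-prefix work at 24 would have to cover roughly 2.8 x 10^14 cases, almost all of which cannot occur in the genome.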

Best regards,
Moshe.

P.S. I was using rseg-0.4.8



--
You received this message because you are subscribed to the Google Groups "RSEG Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rseg-support...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Song, Qiang

Aug 15, 2016, 11:25:57 PM
to RSEG Users
Hi Alex,

It took about 24 hours when we generated the deadzone files for mm9 with 36 bp reads.
I have no empirical estimate for 60 bp reads.

Actually, with 60 bp reads, the effect of deadzones may not be as dramatic as it was for the shorter reads that were common when we developed RSEG.
You can try running rseg or rseg-diff without deadzones while you wait for the deadzones job to finish.

Qiang



Alex Koeppel

Aug 16, 2016, 8:41:15 AM
to RSEG Users
Sorry, I should have mentioned that I'm using rseg-0.4.9.

The deadzones run is still going this morning with no results or stdout messages, so I think I'm going to have to kill it.

I went ahead and ran rseg-diff without the deadzone correction, but I'd still like to try re-running it with the correction if possible.


Andrew D. Smith

Aug 16, 2016, 10:31:21 AM
to rseg-s...@googlegroups.com
I think you can get some idea about the required time by turning on the verbose output.

Andrew

Wu, Tao

Aug 16, 2016, 10:54:19 AM
to rseg-s...@googlegroups.com

Hi Andrew,

Just want to say, RSEG is great!

I wonder, is there any plan to release an updated version?

Thanks!

Tao




Andrew D. Smith

Aug 16, 2016, 10:58:21 AM
to rseg-s...@googlegroups.com
Lots of "plans" for quite a while now, but nothing concrete.

I hope we can make an update sometime this semester. In the meantime, I'm very interested in hearing about your desired features or areas for improvement!

Cheers,
Andrew


Moshe Olshansky

Aug 18, 2016, 11:43:32 PM
to rseg-s...@googlegroups.com
Hi Alex,

I am now quite convinced that specifying -prefix 24 makes deadzones do some work for each of the 4^24 possible 24-mers, and this will never finish.
Last night I ran it with the default prefix (5) for 100 bp reads on mm10 and it took about 19 hours. It may take a bit longer for 60 bp reads (since there will be more dead zones), but it should not take more than part of a weekend.
Let me know if you encounter any difficulties: I can run it and e-mail you the gzipped .bed file.



______________________________________________________________________

The information in this email is confidential and intended solely for the addressee.
You must not disclose, forward, print or use it without the permission of the sender.
______________________________________________________________________

Alex Koeppel

Aug 19, 2016, 8:40:42 AM
to rseg-s...@googlegroups.com
Thanks for your help with this. I'll try re-running with -prefix left at its default. My understanding from the RSEG manual was that increasing the value of -prefix was supposed to decrease the run time at the cost of additional memory. Did I read it wrong?


Andrew D. Smith

Aug 19, 2016, 11:57:46 AM
to rseg-s...@googlegroups.com
You read it correctly. But if you start using more memory than the physical memory available on your machine, it will start to slow down again.
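
One way to check this before launching a long job is to compare the footprint you observe (for example in top) against the machine's physical RAM. The Python sketch below is only an illustration: it does not know how the memory use of deadzones scales with -prefix, the function names are hypothetical, and the 80% headroom threshold is an arbitrary assumption.

```python
import os


def physical_memory_bytes() -> int:
    """Physical RAM in bytes, via POSIX sysconf (works on Linux; not on Windows)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def fits_in_ram(needed_bytes: int, physical_bytes: int, headroom: float = 0.8) -> bool:
    """True if the expected footprint stays under a fraction of physical RAM.

    The default headroom of 0.8 is an arbitrary safety margin, not an RSEG setting.
    """
    return needed_bytes <= headroom * physical_bytes


if __name__ == "__main__":
    observed = 5 * 1024**3  # the ~5 GB footprint reported earlier in this thread
    total = physical_memory_bytes()
    print(f"physical RAM: {total / 1024**3:.1f} GiB")
    print("fits comfortably:", fits_in_ram(observed, total))
```

If the observed footprint approaches physical RAM, the machine starts swapping, which is the slowdown described above.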
