Hi Alex,

Could you please explain how --genomeLoad with shared memory works in the example below? These jobs run in a core-facility production environment, whenever sequencing completes and STAR mapping is required.

Version: STAR_2.4.1d
Let's say the script is called by project-A, which needs to submit 50 jobs to node1 on a queue (rnaseq.q) that allows four jobs (4*20 slots) to run at a time:

1. STAR --genomeLoad LoadAndExit
2. sleep 1
3. STAR --genomeLoad LoadAndKeep --readFilesIn sample-A.fastq
   STAR --genomeLoad LoadAndKeep --readFilesIn sample-B.fastq
   STAR --genomeLoad LoadAndKeep --readFilesIn sample-C.fastq
   STAR --genomeLoad LoadAndKeep --readFilesIn sample-D.fastq

As these jobs finish, the subsequent jobs enter the queue, and at the end:

4. STAR --genomeLoad Remove

As you mentioned in this thread, the "sleep 1" helps prevent any of the mapping jobs (sample-A, sample-B, sample-C, sample-D) from loading the genome at the same time that step 1 is happening.
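For concreteness, the four steps above could be sketched as a single script like this (a minimal sketch only: GENOME_DIR and the sample names are placeholders, and the STAR commands are printed rather than executed so the sketch can be run without STAR installed):

```shell
#!/bin/sh
# Sketch of the shared-memory mapping pattern described above.
GENOME_DIR=/path/to/genomeDir   # placeholder path

# Step 1: load the genome into shared memory once, then exit.
LOAD_CMD="STAR --genomeLoad LoadAndExit --genomeDir $GENOME_DIR"
echo "$LOAD_CMD"

# Step 2: brief pause so no mapping job races the load above.
sleep 1

# Step 3: mapping jobs attach to the already-loaded genome.
for SAMPLE in sample-A sample-B sample-C sample-D; do
    MAP_CMD="STAR --genomeLoad LoadAndKeep --genomeDir $GENOME_DIR --readFilesIn $SAMPLE.fastq"
    echo "$MAP_CMD"
done

# Step 4: only after ALL mapping jobs have finished, free the shared memory.
REMOVE_CMD="STAR --genomeLoad Remove --genomeDir $GENOME_DIR"
echo "$REMOVE_CMD"
```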
Now consider this case: the 49th and 50th jobs of project-A are running on node1 (48 sample jobs have completed, so the node has two free slots), and the script submits another set of 20 jobs for project-B. Of the two available slots, one would be taken by "STAR --genomeLoad LoadAndExit" and the other by sample-AA of project-B, which maps using the genome already loaded by project-A. If project-A's 49th and 50th sample mappings complete before this mapping finishes and proceed to "STAR --genomeLoad Remove", would that affect the in-progress mapping of sample-AA of project-B?

Also, what is the difference between "LoadAndKeep" and "LoadAndExit"? Can both be used just to load the genome?

And the --limitIObufferSize reduction you mentioned, from the default 150000000 (i.e. ~150 MB) to 50000000: is that only for the situation where several jobs load the genome at the same time, or may it also be needed during mapping with parallel jobs?
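One possible way to avoid the premature-Remove race described above (a hypothetical arrangement, assuming a Grid Engine-style scheduler where -hold_jid job dependencies are available; job names and script paths are placeholders) is to submit the Remove step as a job that waits for every mapping job of that project:

```shell
# Submit project-A's mapping jobs under a common job name ...
qsub -q rnaseq.q -N map_projA map_sample.sh sample-A.fastq
qsub -q rnaseq.q -N map_projA map_sample.sh sample-B.fastq
# ... then make the Remove step depend on all jobs named map_projA,
# so the genome is freed only after every mapping has finished:
qsub -q rnaseq.q -hold_jid map_projA -b y \
     STAR --genomeLoad Remove --genomeDir /path/to/genomeDir
```

This does not protect against a second project still using the genome, only against removing it while this project's own mappings are running.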
Could you please explain the mechanism of how shared memory works in STAR? Does the loaded genome have an ID to identify or specify it?
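For reference on the ID question: STAR keeps the loaded genome in a System V shared-memory segment, and such segments are node-wide objects that can be listed with the standard ipcs utility. Each segment shows a numeric shmid, an owner, and permission bits:

```shell
# List the System V shared-memory segments on this node; a genome loaded
# by STAR appears here with its shmid, owner, permissions, and size.
ipcs -m
```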
Beyond parallel/sequential jobs run by one user, can the genome loaded into shared memory be accessed (read) by another user on the same node, or is that restricted by user permissions?

Hi Alex,
Thank you for your reply.
This is wonderful, as long as the genome loaded into shared memory is not removed while any jobs are using it. Do you think it would be a violation of cluster policy if my jobs accessed a genome another user has already loaded into memory?

I have another query, about sorting the BAM: when I use the shared-memory option, how much RAM should "--limitBAMsortRAM" be given? My jobs failed at BAM sorting with a not-enough-memory error (Max memory needed for sorting = 16516456), even though the cluster had that much available at the time. I used --outBAMsortingThreadN 20 and --limitBAMsortRAM 20; the genome is human.
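A likely cause, if the value was literally 20: --limitBAMsortRAM is specified in bytes, so "--limitBAMsortRAM 20" requests only 20 bytes of sorting memory, far below the ~16 MB the log says is needed. A sketch of computing a sensible value, assuming you want to allow roughly 20 GiB:

```shell
# --limitBAMsortRAM is in bytes; "20" means 20 bytes, not 20 GB.
# Convert the intended 20 GiB to bytes:
LIMIT_BYTES=$((20 * 1024 * 1024 * 1024))
echo "$LIMIT_BYTES"    # 21474836480
# then e.g.: STAR ... --limitBAMsortRAM $LIMIT_BYTES --outBAMsortingThreadN 20
```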