Confused about --genomeLoad parameter


kows...@gmail.com

Jul 27, 2015, 12:34:12 PM
to rna-star
Dear all,

I have several .fastq files that I need to map to one and the same reference genome. Since STAR is really fast, I would usually put something like this in a script:

./STAR --runThreadN 16 --genomeDir ./starindex-mm/ --outFileNamePrefix /home/of/Desktop/run1 --outSAMtype BAM SortedByCoordinate --readFilesIn /home/of/Desktop/reads/run1.fastq
./STAR --runThreadN 16 --genomeDir ./starindex-mm/ --outFileNamePrefix /home/of/Desktop/run2 --outSAMtype BAM SortedByCoordinate --readFilesIn /home/of/Desktop/reads/run2.fastq

...

and let it fly.
But now I would really like to avoid loading the index into memory for each run, and I simply don't know how to do it properly. I've read in the manual about loading the genome into shared memory, and then tried this:

./STAR --genomeDir ./starindex-mm --genomeLoad LoadAndExit
Jul 27 18:20:04 ..... Started STAR run
Jul 27 18:20:04 ..... Loading genome

Shared memory error: 4, errno: Invalid argument(22)
EXITING because of FATAL ERROR: problems with shared memory: error from shmget() or shm_open().
SOLUTION: check shared memory settings as explained in STAR manual, OR run STAR with --genomeLoad NoSharedMemory to avoid using shared memory

Jul 27 18:20:04 ...... FATAL ERROR, exiting


but as you can see, it doesn't work - I failed at the first step.
Could you help me out? STAR works normally otherwise, so I'm guessing something about my Ubuntu installation probably needs configuring? Is genome sharing even possible for the mouse genome on a 32 GB machine?
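For reference, here is the workflow I'm aiming for, pieced together from my reading of the manual (a sketch, untested; the DRY_RUN switch just prints the commands instead of running them):

```shell
#!/usr/bin/env bash
# Sketch of the shared-memory workflow (my reading of the manual, untested).
# DRY_RUN=1 (the default here) prints each command instead of executing it.
DRY_RUN=${DRY_RUN:-1}
STAR=${STAR:-./STAR}
INDEX=./starindex-mm
run() { if [ "$DRY_RUN" = 1 ]; then echo "$@"; else "$@"; fi; }

# Load the index into shared memory once.
run "$STAR" --genomeDir "$INDEX" --genomeLoad LoadAndExit

# Map each run against the already-loaded index.
for r in run1 run2; do
  run "$STAR" --runThreadN 16 --genomeDir "$INDEX" --genomeLoad LoadAndKeep \
      --outFileNamePrefix "/home/of/Desktop/$r" \
      --readFilesIn "/home/of/Desktop/reads/$r.fastq"
done

# Free the shared-memory segments when everything is done.
run "$STAR" --genomeDir "$INDEX" --genomeLoad Remove
```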

Thanks in advance,

Bero.

Alexander Dobin

Jul 27, 2015, 3:18:32 PM
to rna-star, kows...@gmail.com, kows...@gmail.com
Hi Bero,

this is most likely caused by the system parameters not allowing enough shared memory. What is the output of
$ sysctl -A | grep shm
The important values are shmall and shmmax.
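(Roughly: shmmax caps the size of a single segment in bytes, while shmall caps total shared memory system-wide in pages. A quick back-of-the-envelope check, assuming the usual 4 KB page size and common old defaults:)

```shell
PAGE_SIZE=4096                   # bytes per page; check with `getconf PAGE_SIZE`
SHMMAX=33554432                  # common old default: max bytes in ONE segment (32 MB)
SHMALL=2097152                   # common old default: max TOTAL shared memory, in PAGES
echo $(( SHMALL * PAGE_SIZE ))   # 8589934592 bytes = 8 GB allowed in total
# A mouse STAR index needs ~25 GB in a single segment - far above a 32 MB
# shmmax - so shmget() fails with EINVAL ("Invalid argument").
```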

Cheers
Alex

kows...@gmail.com

Jul 27, 2015, 3:27:20 PM
to rna-star, ado...@gmail.com
Hi Alex,

thanks for a quick reply. The output is:

kernel.shm_next_id = -1
kernel.shm_rmid_forced = 0
kernel.shmall = 2097152
kernel.shmmax = 33554432
kernel.shmmni = 4096
vm.hugetlb_shm_group = 0

Alexander Dobin

Jul 27, 2015, 4:43:39 PM
to rna-star, kows...@gmail.com, kows...@gmail.com
Please try:
sysctl -w kernel.shmall=8000000
sysctl -w kernel.shmmax=32000000000
To make the changes permanent, you can change the /etc/sysctl.conf file.
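(For example, the lines to add - assuming the stock /etc/sysctl.conf location - would be:)

```
# /etc/sysctl.conf
kernel.shmall = 8000000
kernel.shmmax = 32000000000
```

then `sudo sysctl -p` reloads the file without a reboot.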

kows...@gmail.com

Jul 27, 2015, 5:13:50 PM
to rna-star, ado...@gmail.com
Thanks Alex.

Now STAR doesn't crash, but it also appears to be stuck at the genome loading step - RAM is not filling up, and I cannot hear the hard drive working either. I fired it up with:


./STAR --genomeDir ./starindex-mm --genomeLoad LoadAndExit

Alexander Dobin

Jul 27, 2015, 5:22:14 PM
to rna-star, kows...@gmail.com, kows...@gmail.com
Please send me the Log.out file.

Alexander Dobin

Jul 27, 2015, 5:57:47 PM
to rna-star, ado...@gmail.com, kows...@gmail.com, ado...@gmail.com
One more thing - run '--genomeLoad Remove' before mapping with --genomeLoad LoadAndExit

kows...@gmail.com

Jul 28, 2015, 5:11:21 AM
to rna-star, ado...@gmail.com
Ok, I rebooted the machine, just to be on the safe side.
Below is attempt 1 (only genomeLoad):

of@of-desktop:~/Desktop$ ./STAR --genomeDir ./starindex-mm/ --genomeLoad LoadAndExit
Jul 28 09:54:39 ..... Started STAR run
Jul 28 09:54:39 ..... Loading genome
of@of-desktop:~/Desktop$

Now, after ~5 min, STAR apparently finishes, but RAM appears to be empty (per htop and the system monitor).

If I now try something like this:

of@of-desktop:~/Desktop$ ./STAR --runThreadN 16 --outFilterMultimapNmax 1 --genomeDir ./starindex-mm/ --genomeLoad LoadAndExit --outFileNamePrefix /home/of/Desktop/test --readFilesIn ./uni1.fastq
Jul 28 10:32:01 ..... Started STAR run
Jul 28 10:32:01 ..... Loading genome

of@of-desktop:~/Desktop$

Same thing happens (testLog.out).

With this (now with LoadAndKeep - testlog2):

of@of-desktop:~/Desktop$ ./STAR --runThreadN 16 --outFilterMultimapNmax 1 --genomeDir ./starindex-mm/ --genomeLoad LoadAndKeep --outFileNamePrefix /home/of/Desktop/test --readFilesIn ./uni1.fastq
Jul 28 10:48:05 ..... Started STAR run
Jul 28 10:48:05 ..... Loading genome

Jul 28 10:52:11 ..... Started mapping
Jul 28 10:55:29 ..... Finished successfully
of@of-desktop:~/Desktop$

STAR apparently finishes normally, but uses only ~4 GB of RAM.

Finally, with this (log3):

of@of-desktop:~/Desktop$ ./STAR --runThreadN 16 --outFilterMultimapNmax 1 --genomeDir ./starindex-mm/ --outFileNamePrefix /home/of/Desktop/test --readFilesIn ./uni1.fastq
Jul 28 11:01:02 ..... Started STAR run
Jul 28 11:01:02 ..... Loading genome
Jul 28 11:05:06 ..... Started mapping
Jul 28 11:08:16 ..... Finished successfully
of@of-desktop:~/Desktop$

RAM fills up and everything looks the way I'm used to. Is this how it should look? Is it possible that RAM usage is reported differently when STAR uses shared memory?

Thanks once more for taking the time to answer these newbie questions.
Log.out
Screenshot from 2015-07-28 09:59:32.png
testLog.out
testLog2(LoadAndKeep).out
testLog3.out

Alexander Dobin

Jul 28, 2015, 11:06:27 AM
to rna-star, kows...@gmail.com, kows...@gmail.com
It seems all runs completed OK, but the genome was not kept in RAM.
After LoadAndExit, please run 
$ ipcs
Also run this command after the LoadAndKeep option.

This should list the shared memory segments, two per genome.
One other thing to try is to use the absolute path to the genome rather than a relative one, though on most systems it should not matter.
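(A quick way to pick STAR's segments out of the `ipcs -m` listing - a sketch, and the helper name is mine: STAR's segments are the rows with a non-zero key.)

```shell
# Filter `ipcs -m` down to STAR's segments (the rows with a non-zero key;
# STAR creates two such segments per loaded genome).
star_shm() {
  awk '$1 ~ /^0x/ && $1 != "0x00000000" { print $1, $5, $6 }'   # key, bytes, nattch
}
# Usage: ipcs -m | star_shm
```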

kows...@gmail.com

Jul 28, 2015, 12:47:15 PM
to rna-...@googlegroups.com, ado...@gmail.com
Ok, did it - attached are screenshots with htop and ipcs for:
1) After LoadAndExit
2) During LoadAndKeep mapping
3) After LoadAndKeep mapping
4) After Remove.

Orange bars in htop fill the 'memory bracket' all the way to the right in the first three cases, while after Remove they disappear. The system monitor in all four cases shows RAM as not filled/empty.
Absolute vs. relative paths did not make any difference. So it appears it all goes into cached memory - whatever that really means - but that's ok for running STAR?
LoadAndExit.png
LoadAndKeep - during mapping.png
LoadAndKeep - post mapping.png
After Remove.png

Alexander Dobin

Jul 28, 2015, 6:54:23 PM
to rna-star, kows...@gmail.com, kows...@gmail.com
Case 2) is missing the ipcs output, and that's the most informative one - it will tell us whether the actual mapping job connects to the shared memory. The LoadAndExit run loads the genome into shared memory all right; you can see the two fragments with a non-zero key.

kows...@gmail.com

Jul 29, 2015, 2:14:43 AM
to rna-star, ado...@gmail.com
Sorry, I missed that one. This is it during mapping step.

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status     
0x00000000 131072     of         600        524288     2          dest        
0x00000000 163841     of         600        524288     2          dest        
0x00000000 262146     of         600        524288     2          dest        
0x00000000 622595     of         600        524288     2          dest        
0x00000000 393220     of         600        16777216   2                      
0x00000000 950277     of         600        524288     2          dest        
0x00000000 655366     of         600        4194304    2          dest        
0x00000000 753671     of         600        524288     2          dest        
0x00000000 786440     of         600        134217728  2          dest        
0x00000000 1048585    of         600        524288     2          dest        
0x00000000 1081354    of         600        4194304    2          dest        
0x00000000 1114123    of         600        1048576    2          dest        
0x17310240 1146892    of         666        1          1                      
0x1731023f 1179661    of         666        26660210006 1                      

------ Semaphore Arrays --------
key        semid      owner      perms      nsems    

------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages  
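(For scale: the large segment above works out to roughly 25 GB - consistent with a whole mouse STAR index sitting in shared memory.)

```shell
# 26660210006 bytes (the big segment above) in whole GiB:
echo $(( 26660210006 / 1024 / 1024 / 1024 ))   # 24 (GiB, rounded down)
```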
 

Alexander Dobin

Jul 29, 2015, 8:57:24 AM
to rna-star, kows...@gmail.com, kows...@gmail.com
It seems to be working fine. While mapping, the nattch column shows that there is a job attached to the shared memory fragments. What is strange is that, with the genome already loaded, it took 4 min to start mapping - this should only take a few seconds.

Could you try two LoadAndKeep jobs one after another, and send me the screen output (to check the loading time) and `ipcs` while the jobs are running?

Note that htop will not show the shared memory fragments unless they are attached to running jobs, and that the RAM will probably be counted as cached. To see the shared memory fragments you need to use `ipcs`.

kows...@gmail.com

Jul 29, 2015, 11:50:46 AM
to rna-star, ado...@gmail.com
Great. About that genome loading - this time it was, for all practical purposes, instantaneous. I think that previously a Remove might have sneaked in between LoadAndExit and LoadAndKeep, so the genome was not actually preloaded for step "2) During LoadAndKeep mapping". That's my mistake - sorry for that, and for taking so much of your time. I was so focused on doing each step 'cleanly' that I obviously overdid it. :-(
LoadAndKeep - run 1.png
LoadAndKeep - run 2.png

Alexander Dobin

Jul 30, 2015, 8:50:35 AM
to rna-star, kows...@gmail.com, kows...@gmail.com
Now it all makes sense - good luck with the rest of your runs.
Cheers
Alex

Mary Thomas

Apr 20, 2018, 6:44:55 PM
to rna-star
Hi -
We are looking at using the LoadAndKeep parameter for the shared memory.
But when I run the command you gave above, I get HUGE values for shmall and shmmax.
What does this mean?


ubuntu@herc:~$ sudo sysctl -A | grep shm
kernel.shm_next_id = -1
kernel.shm_rmid_forced = 0
kernel.shmall = 18446744073692774399
kernel.shmmax = 18446744073692774399
kernel.shmmni = 4096
sysctl: reading key "net.ipv6.conf.all.stable_secret"
sysctl: reading key "net.ipv6.conf.default.stable_secret"
sysctl: reading key "net.ipv6.conf.eth0.stable_secret"
sysctl: reading key "net.ipv6.conf.lo.stable_secret"
vm.hugetlb_shm_group = 0

Alexander Dobin

Apr 20, 2018, 6:50:16 PM
to rna-star
Hi Mary,

I am not certain about that, but I think it means that shared memory is not limited and can be as large as the physical RAM - so you do not need to adjust these parameters and can try the LoadAndKeep option.
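(This supports the "unlimited" reading: the reported value sits just below 2^64 - 1, i.e. ULONG_MAX, which is how newer kernels express "no practical limit". Bash arithmetic is signed 64-bit and would overflow here, so this quick check shells out to python3:)

```shell
# Distance of the reported shmmax from ULONG_MAX (needs >63-bit math):
python3 -c 'print(2**64 - 1 - 18446744073692774399)'   # 16777216 = 2^24
```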

Cheers
Alex