gfServer running more than one .2bit genome

Jayaraman, Pushkala

Sep 21, 2017, 5:19:06 PM

to gen...@soe.ucsc.edu

Hello,

I’m a Bioinformatics Scientist at the Children’s Hospital of Philadelphia and I am currently working on setting up a local BLAT instance for a web application that we have developed.

Until now we have been running BLAT only for hg19.2bit, but we are now expanding the application to also support galGal4 and a couple of other model organism reference files.

I was wondering what the best way would be to set this up without running into "out of memory" errors.

I have a few options (listed at the end) and would like to confirm which is best.

We run gfServer the following way:

cd /data/blatDb/ && /usr/local/isPcr/gfServer start localhost 17779 stepSize=5 <assembly.2bit> &

Our web application is currently hosted on the same server. On a smaller server, when we run gfServer for more than a couple of .2bit files, we sometimes get this error:

gfServer(25656,0xa36c0000) malloc: *** mach_vm_map(size=2720165888) failed (error code=3)

*** error: can't allocate region

*** set a breakpoint in malloc_error_break to debug

needHugeMem: Out of huge memory - request size 2720165344 bytes, errno: 12

How do you suggest I resolve this? What is the best approach if I need BLAT to be deployed and run locally?

1. Run each 2bit file on its own port (which means handling the logic for which port number serves which genome).

2. Run gfServer for all genomes on a separate server and then call that server by hostname.

3. Get more memory on the server (how much memory would I need if I were hosting at least 5-7 genomes on one server?).

Regards,

Pushkala

Hiram Clawson

Sep 21, 2017, 5:33:14 PM

to Jayaraman, Pushkala, gen...@soe.ucsc.edu

Good Afternoon Pushkala:

One gfServer instance is only for one genome assembly in one 2bit file.

To run other assemblies you need to run one gfServer instance for
each assembly. Each assembly is a single 2bit file.
Each gfServer has its own unique set of ports; no ports are shared.
Each gfServer instance requires approximately 3 to 6 GB of memory to operate,
depending upon sequence size. The gfServers can be on different hosts if one
host does not have enough memory to run them all at the same time.
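
As a rough sizing sketch, the per-instance estimate above can be multiplied out for the 5-7 genomes mentioned in the question. The function names below are hypothetical, and the result is only a ballpark: it ignores memory needed by the operating system and by a web application hosted on the same machine.

```python
# Rough memory budget for running several gfServer instances on one host.
# The 3-6 GB per-instance range is the estimate quoted in this thread;
# the genome counts (5 and 7) come from the original question.

PER_INSTANCE_GB = (3, 6)  # (low, high) estimate per gfServer instance


def memory_budget_gb(n_genomes, per_instance_gb=PER_INSTANCE_GB):
    """Return the (low, high) total memory in GB for n_genomes instances."""
    low, high = per_instance_gb
    return n_genomes * low, n_genomes * high


def instances_that_fit(available_gb, per_instance_gb=PER_INSTANCE_GB):
    """Return how many instances fit if each one needs the high estimate."""
    return int(available_gb // per_instance_gb[1])


if __name__ == "__main__":
    for n in (5, 7):
        low, high = memory_budget_gb(n)
        print(f"{n} genomes: roughly {low} to {high} GB")
```

Under these assumptions, 5 genomes need roughly 15 to 30 GB and 7 genomes roughly 21 to 42 GB, so a host with less memory than that would have to share the load with a second host.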

Your application needs to be aware of which ports and/or hosts are for which assemblies.
The UCSC genome browser uses a MySQL database table 'blatServers' to record
sequence name, host and ports for the gfServer instance.
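
A minimal sketch of such a lookup on the application side, assuming a plain in-memory mapping rather than a MySQL table. Only the hg19 entry (localhost, port 17779) and the gfServer path and stepSize=5 option come from the command in the original question; the galGal4 port, the file names, and all function names are hypothetical.

```python
from typing import NamedTuple


class BlatServer(NamedTuple):
    host: str
    port: int
    two_bit: str  # 2bit file served by this gfServer instance


# Hypothetical registry: one gfServer instance per assembly.
# Only hg19 on localhost:17779 is taken from the thread; the rest is made up.
BLAT_SERVERS = {
    "hg19": BlatServer("localhost", 17779, "hg19.2bit"),
    "galGal4": BlatServer("localhost", 17780, "galGal4.2bit"),
}


def server_for(assembly):
    """Return the host/port record for an assembly, or raise ValueError."""
    try:
        return BLAT_SERVERS[assembly]
    except KeyError:
        raise ValueError(f"no gfServer registered for assembly {assembly!r}") from None


def start_command(assembly, gfserver="/usr/local/isPcr/gfServer", step_size=5):
    """Build the argument list for starting the gfServer of one assembly."""
    server = server_for(assembly)
    return [
        gfserver, "start", server.host, str(server.port),
        f"stepSize={step_size}", server.two_bit,
    ]


if __name__ == "__main__":
    for name in BLAT_SERVERS:
        print(" ".join(start_command(name)))
```

Moving an assembly to another machine then only means changing its host entry in the registry; the code that queries it stays the same.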

--Hiram

Jayaraman, Pushkala

Sep 22, 2017, 11:50:30 AM

to Hiram Clawson, gen...@soe.ucsc.edu

That’s an idea!
Thanks Hiram!

Pushkala Jayaraman,
Bioinformatics Scientist II
Division of Genomic Diagnostics, CHOP
Tel: 215-590-1390
Email: jayar...@email.chop.edu
