I want to pursue setting up a Chroma instance at BNL to run cosmic
muon MC for Daya Bay. This one class of simulation is very CPU
intensive, as we are interested in rare processes like fast neutron and
Li9 production as well as high muon statistics.
To that end, the first thing I'd like to understand is how much memory
on the GPU card I will need for our geometry.
To get a feel for the geometry you can look at, for example, this presentation:
http://neutrino.physics.wisc.edu/talks/2011-07-DayaBay/DayaBay_July2011.pdf
At each site we have a water Cherenkov pool separated into inner and
outer regions. The pools hold either 2 or 4 stainless steel cylinders
that house the antineutrino detectors (AD). The ADs have two nested
acrylic vessels and layers of oil, scintilator and an innermost
scintilator + Gd. There are also reflectors on the top and bottom.
The two pool regions and the ADs each have ~200 8" PMTs. In the water
pool or on top of the ADs there are also details like cable conduits,
AD supports, and calibration pods (see page 18 of that talk for a
picture).
So, each detector by itself is simpler than the LBNE WCD (which took
3GB iirc) but do you think I can get away with a 1.5GB card for the
ensemble or will a 3GB card be needed?
-Brett.
> To that end, the first thing I'd like to understand is how much memory
> on the GPU card I will need for our geometry.
>
> To get a feel for the geometry you can look at, for example, this presentation:
>
> http://neutrino.physics.wisc.edu/talks/2011-07-DayaBay/DayaBay_July2011.pdf
>
> At each site we have a water Cherenkov pool separated into inner and
> outer regions. The pools hold either 2 or 4 stainless steel cylinders
> that house the antineutrino detectors (ADs). The ADs have two nested
> acrylic vessels and layers of oil, scintillator, and an innermost
> scintillator + Gd. There are also reflectors on the top and bottom.
> The two pool regions and the ADs each have ~200 8" PMTs. In the water
> pool or on top of the ADs there are also details like cable conduits,
> AD supports, and calibration pods (see page 18 of that talk for a
> picture).
>
> So, each detector by itself is simpler than the LBNE WCD (which took
> 3GB iirc) but do you think I can get away with a 1.5GB card for the
> ensemble or will a 3GB card be needed?
Short version: 1.5GB is plenty for Daya Bay, but I can quantify this below.
GPU storage in Chroma is currently used to hold triangles, unique vertices, per-triangle properties, material and surface definition tables, and the bounding volume hierarchy (BVH) tree. Material and surface definitions take up negligible space, so the storage essentially scales like this:
* Vertices: 12 bytes per vertex
* Triangles: 12 bytes (3 vertex IDs) + 4 bytes (color) + 4 bytes (solid ID numbers) + 4 bytes (surface and bulk material ID codes) = 24 bytes per triangle
* BVH Nodes: 32 bytes per node
I find that most geometries have a unique vertex set that is about half the number of triangles, so you can fold the vertices into the per-triangle cost and conservatively say that each triangle requires ~36 bytes of storage (24 bytes for the triangle plus a full 12-byte vertex).
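As a back-of-envelope helper, here is a small Python sketch of that accounting; the byte counts are just the ones listed above, and the 36-byte figure simply budgets one full vertex per triangle.

    # Per-element storage costs in Chroma's GPU geometry (numbers from above).
    BYTES_PER_VERTEX = 12     # 3 floats: x, y, z
    BYTES_PER_TRIANGLE = 24   # 3 vertex IDs + color + solid ID + material/surface IDs
    BYTES_PER_BVH_NODE = 32

    # Conservative effective cost: budget one full vertex per triangle, even
    # though the unique vertex set is usually only ~half the triangle count.
    BYTES_PER_TRIANGLE_EFFECTIVE = BYTES_PER_TRIANGLE + BYTES_PER_VERTEX  # 36 bytes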
LBNE had between 46 and 60 million triangles in it, depending on how much detail I was willing to render the PMTs with. That broke down into:
* 4.4M triangles for the entire cylinder
* 1464 triangles per PMT in the lower resolution case
The reason for so many triangles in the cylinder is to reduce the solid angle size of individual triangles to better match the size of PMTs, as that improves the efficiency of the entire BVH. A smaller cylinder can use fewer triangles.
So to estimate the worst case for a pool of 4 ADs, I'll assume that each cylinder is approximately 1/10th the linear size of the LBNE cavity, and therefore needs 1/100th the number of triangles to discretize the surface adequately. With some rounding up for contingency, that gives something like:
* 200k triangles for the water tank
* ~200 veto PMTs @ 2k triangles each
* 4 ADs * 200 PMTs @ 2k triangles each
* 4 ADs * 6 surfaces (steel can + two vessels) * 50k triangles each
Total = 3.4M triangles
The size of the BVH is dominated by the lowest layers, and scales like the # of triangles, so we can assume something < 2M BVH nodes for this geometry. That gives a total size of:
* Triangles + Vertices = ~117 MB
* BVH = ~60 MB
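Spelled out as a quick Python sanity check (just the rounded counts from above; nothing here is measured from an actual geometry file):

    # Back-of-envelope Daya Bay geometry size, using the estimates above.
    BYTES_PER_TRIANGLE_EFF = 36   # 24 (triangle record) + 12 (vertex budget)
    BYTES_PER_BVH_NODE = 32
    MiB = 1024.0 ** 2

    triangles = (200000 +             # water tank
                 200 * 2000 +         # ~200 veto PMTs @ 2k triangles each
                 4 * 200 * 2000 +     # 4 ADs * 200 PMTs @ 2k triangles each
                 4 * 6 * 50000)       # 4 ADs * 6 surfaces @ 50k triangles each
    bvh_nodes = 2000000               # assumed upper bound; scales with triangles

    print("triangles: %.1fM" % (triangles / 1e6))                                        # 3.4M
    print("triangles + vertices: %.0f MB" % (triangles * BYTES_PER_TRIANGLE_EFF / MiB))  # ~117 MB
    print("BVH: %.0f MB" % (bvh_nodes * BYTES_PER_BVH_NODE / MiB))                       # ~61 MB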
Given memory overhead for the CUDA driver plus the desire to store and propagate many photons on the GPU at once, I'd say you could easily get by with a 1 GB card, so the 1.5GB GTX 580 will be both as fast as possible and have more than adequate storage.
Thanks!
It's looking like the purse holders will have some funds for this
project. I'll shop from the specs that you listed in your WCD
reconstruction workshop presentation.
My favorite vendor (aslab.com) sells a system with a 1.5GB GTX 580, but
unfortunately charges more than $600 for the card alone. So, I'll
probably get the card separately from the host computer.
Beyond having an available PCI-E slot, is there anything I need to
ensure about the host computer?
For example, the power supply? Cooling? What about Batman sticker
vs. Spiderman sticker?...
-Brett.
Two things:
1) I would make sure you have a separate graphics card for running your display if you plan to have this computer sitting on your desk. (Visualization is smoother if you aren't going through an Ethernet connection.) Chroma can potentially make long (few-second) CUDA function calls, which delay screen updates or, worse, trigger the display watchdog to terminate the CUDA operation so the screen can refresh. Basically, GPUs only support something like cooperative multitasking at this point in their evolution.
The easiest approach is to get another NVIDIA card so the same driver works for both cards. Anything cheap from the GeForce 8/9/200/400/500 series is fine for the display. If you intend to use the computer only as a headless compute node, then you don't have to worry about this and only need the one GTX 580 card; the watchdog is disabled if you don't start X.
2) The high-end NVIDIA cards require supplementary power in the form of two separate power connectors from the power supply directly to the card. The GTX 580 requires an 8-pin and a 6-pin PCI-Express power cable. These are pretty standard on power supplies above 700W, unless you are buying your computer from a big OEM like Dell, which tends to have custom components manufactured with all excess capability stripped out. For the display-only card, you can easily find one with low enough power draw that it does not require separate power connectors.
> On Tue, Jan 10, 2012 at 4:35 PM, Stanley Seibert <ssei...@hep.upenn.edu> wrote:
>>
>> Short version: 1.5GB is plenty for Daya Bay, but I can quantify this below.
>
> Beyond having an available PCI-E slot, is there anything I must assure
> for the host computer?
> For example, the power supply? Cooling? What about Batman sticker
> vs. Spiderman sticker?...
And actually, this Batman sticker really does make it go faster:
http://www.newegg.com/Product/Product.aspx?Item=N82E16814130736
(This card is factory overclocked to run 3% faster than the standard GTX 580, which seems kind of silly to get excited over...)