SAN Storage Performance


Jacky Wong

Feb 1, 2017, 8:08:50 PM
to Dataverse Users Community
Hello to all the superheroes in this project, :)

We are setting up our infrastructure for Dataverse. We intend to attach SAN storage to our servers. 

Our hardware guys wonder what performance criteria are required for SAN storage in terms of IOPS ( https://en.wikipedia.org/wiki/IOPS ).

Any idea whether Dataverse is intensive on I/O operations, and what typical IOPS figures you see on the SAN storage you are using, please?

Thanks. 

Best Regards,
Jacky Wong
National Institute of Education, Singapore

danny...@g.harvard.edu

Feb 1, 2017, 8:34:21 PM
to Dataverse Users Community
Hi Jacky - it's great to hear that you're moving forward with a Dataverse installation! Let me check with our operations group so that I can get a good answer for your question. I'll write more soon.

Others on the list may have input from their own installations as well. 

Thanks,

Danny

Don Sizemore

Feb 1, 2017, 8:35:16 PM
to dataverse...@googlegroups.com
Hello,

Speaking only from my own experience with https://dataverse.unc.edu, I doubt you'll push your storage very hard. We run our Dataverse in a shared environment (VMware/Dell cluster over NFS mounts from our NetApp). The storage I/O will be bursty only during uploads and downloads, and unless you'll have a very active user base with a ton of huge files I don't see you bogging it down.
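For a rough feel of what bursty upload/download traffic means in IOPS terms, a back-of-the-envelope estimate can help. The sketch below is illustrative only: the number of concurrent transfers, the per-transfer rate, and the I/O size are made-up placeholders (not measurements from any installation in this thread), and the function names are hypothetical. It relies only on the general relation IOPS = throughput / I/O size.

```python
def peak_throughput_mb_per_s(concurrent_transfers: int,
                             per_transfer_mb_per_s: float) -> float:
    """Aggregate throughput if every transfer runs at the same rate."""
    return concurrent_transfers * per_transfer_mb_per_s


def required_iops(throughput_mb_per_s: float, io_size_kb: float) -> float:
    """I/O operations per second needed to sustain a throughput
    at a given I/O size (1 MB = 1024 KB)."""
    return throughput_mb_per_s * 1024 / io_size_kb


if __name__ == "__main__":
    # Hypothetical burst: 20 simultaneous uploads/downloads at 10 MB/s each,
    # issued to storage as 64 KB operations. Substitute your own estimates.
    peak = peak_throughput_mb_per_s(20, 10.0)
    print(f"peak throughput: {peak} MB/s")
    print(f"required IOPS:   {required_iops(peak, 64)}")
```

Large sequential file transfers tend to use large I/O sizes, so the same throughput translates into fewer operations than a small-block, database-style workload would need. Plugging in your own expected user counts and file sizes gives the hardware team a number to compare against the SAN's rated figures.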

Be sure to give GlassFish plenty of CPUs (for us, 4, the maximum supported by VMware Fault Tolerance) and RAM (for us, 64 GB, the maximum supported by Fault Tolerance). Our GlassFish installation typically holds 18-21 GB of active memory; the OS keeps 1-2 GB free and uses the rest for buffers/cache.

If you install the rApache module, Apache's CPU usage will increase significantly, but the one time I've seen our installation start to approach a system load of 1.0 was during our webinar when our archivist had all the folks following along at home publish their test datasets simultaneously. Solr indexing kept the machine busy for about 30 seconds, then things quieted back down.

I didn't really answer your question, but in planning for performance you'll want to focus on giving it plenty of RAM and CPU. If we start to outgrow our current VM I imagine we'll break Solr off onto its own host. I hope this helps.

Donald




Jacky Wong

Feb 2, 2017, 3:59:28 AM
to Dataverse Users Community
Dear Danny, 

Thanks so much for this. 

Yes, excited to join the Dataverse community. Looking forward to learning more about this. Thanks.

Jacky Wong

Feb 2, 2017, 4:30:26 AM
to Dataverse Users Community
Hi Donald, 

Thanks for sharing so generously. The info helps tremendously. I am definitely going for a good-sized CPU and RAM allocation. Thanks for the tips. :)