[PR] InterSystems Cache to capture data on Billions of stars


Rich Taylor, InterSystems

Dec 20, 2013, 10:01:39 AM12/20/13
to mvd...@googlegroups.com
"By the end of the decade, the Gaia archive of processed data is expected to exceed 1 Petabyte (1 million Gigabytes), equivalent to about 200,000 DVDs of information." As Gaia satellite maps the precise positions and distances to more than a billion stars, Caché will crunch all that big data!


Brian Speirs

Dec 20, 2013, 8:21:08 PM12/20/13
to mvd...@googlegroups.com
That's great news!

I guess we can conclude from this that we need a multi-dimensional database to store information about 3D space ... or the UniVerse isn't flat!

Cheers,

Brian

Anthony Youngman

Dec 20, 2013, 8:30:39 PM12/20/13
to mvd...@googlegroups.com
On 21/12/13 01:21, Brian Speirs wrote:
> That's great news!
>
> I guess we can conclude from this that we need a multi-dimensional
> database to store information about 3D space ... or the UniVerse isn't flat!

Except it isn't 3D! :-) It is (has to be) 4D, because we get all sorts
of "bendy light" effects once we start looking - even on a scale of just
the solar system ...

But yes, 2D databases just don't cut the mustard :-)
>
> Cheers,
>
> Brian

Cheers,
Wol

Jeremy Thomson

Dec 20, 2013, 8:32:47 PM12/20/13
to mvd...@googlegroups.com
 
I'm wondering how the 'java objects' of Caché will be used for the GAIA data.
I suppose the raw data is the billion-pixel images, but the science happens when those pixels move as you take subsequent pictures at a different phase of orbit.
The moving pixels become the java objects, with a record of which images they appear in and where, plus a current best estimate of location.
Then there's spectroscopy data for detecting a star's composition and velocity relative to GAIA, and Doppler spectroscopy, which can detect the wobble of a star that may indicate planets.
I assume the spectroscopy is not collected for all billion target stars, and not as often as the optical images.
A good match for the MV model: irregular, repeating, or even absent sets of data.
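The "irregular, repeating, or absent" shape of the data can be sketched in miniature. This is purely illustrative: the field names, identifiers, and values below are made up for the example and are not Gaia's actual schema. It shows one record per star with repeating astrometric fields per epoch, and a sparser spectroscopy set that may have fewer entries, or none at all, in the MultiValue style:

```python
# Hypothetical MV-style record for one star (illustrative values only).
# Astrometric fields repeat once per observation epoch; spectroscopy
# fields are collected less often, so they repeat fewer times or are absent.
star_record = {
    "id": "STAR-0000000001",
    "epoch": ["2014-07-25", "2014-11-02", "2015-03-14"],
    "ra":    [266.41683, 266.41684, 266.41684],   # right ascension, degrees
    "dec":   [-29.00781, -29.00780, -29.00781],   # declination, degrees
    # Sparser "association": only one spectroscopic reading so far.
    "spec_epoch":      ["2014-11-02"],
    "radial_velocity": [-12.4],                   # km/s, from Doppler shift
}

def astrometric_readings(record):
    """Pair up the repeating astrometric fields, epoch by epoch."""
    return list(zip(record["epoch"], record["ra"], record["dec"]))

for epoch, ra, dec in astrometric_readings(star_record):
    print(epoch, ra, dec)
```

The point of the sketch is that the repeating fields need not line up across groups: three astrometric epochs and one spectroscopic one coexist in the same record without padding or NULLs.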

A billion-star survey ... I just love this 21st-century tech.

Jeremy Thomson

Rich Taylor, InterSystems

unread,
Dec 22, 2013, 10:45:10 AM12/22/13
to mvd...@googlegroups.com
Jeremy,

Unfortunately, I was not the Sales Engineer on this project. Too bad, as it looks like a fun one! There is a white paper we published on this that I believe has the data structure in it and a lot more detail on the project. Here is the link to that if you are curious:

If memory serves, you are basically correct. Data such as position, luminosity, and spectral measurements will be stored. Each record is only about 600 bytes, but there are a lot of them, and each object will have multiple readings over the 5-year mission. The really cool part is the requirement for insertion and query rates: during the proof-of-concept they achieved something like 110,000 inserts per second, and the production environment tested out at around 250,000. The white paper discusses the initial project requirements and the proof-of-concept work.
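A quick back-of-envelope check on those figures, using only the numbers quoted above (600-byte records, 110,000 inserts/s in the proof-of-concept, ~250,000 inserts/s in the production test):

```python
# Rough throughput implied by the numbers quoted in this thread.
RECORD_BYTES = 600
POC_RATE = 110_000       # inserts per second, proof-of-concept
PROD_RATE = 250_000      # inserts per second, production test

poc_throughput = RECORD_BYTES * POC_RATE      # bytes per second
prod_throughput = RECORD_BYTES * PROD_RATE

print(f"POC:  {poc_throughput / 1e6:.0f} MB/s")    # 66 MB/s
print(f"Prod: {prod_throughput / 1e6:.0f} MB/s")   # 150 MB/s

# At the production rate, a billion 600-byte records could be ingested in:
seconds = 1_000_000_000 / PROD_RATE
print(f"1 billion records in ~{seconds / 3600:.1f} hours")  # ~1.1 hours
```

So the sustained write load is on the order of 150 MB/s, which makes the petabyte-scale archive figure from the announcement sound plausible given repeated readings per object over the mission.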

Rich