The database should be able to store about 50,000 instruments that have
between 1 and 10,000 update records per day (some instruments even more
often). At peak times the database has to handle about 1,000 updates
per second.
To handle such update frequencies I have implemented some caching as
well as buffering: I store blocks of small price records (Instrument,
Time, Last, Vol) in one database record.
The structure of the tables is very simple: (Instrument, Time, Last,
Vol).
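Roughly sketched in C++ (simplified, with all names made up for this
posting), the blocking works like this:

    // Sketch of the record blocking: many small ticks are buffered
    // in memory and written to the database as one block record.
    // All names here are illustrative, not my real code.
    #include <ctime>

    struct Tick {                  // one price update
        std::time_t time;
        double      last;          // last price
        long        vol;           // volume
    };

    const int BLOCK_SIZE = 64;     // ticks packed into one DB record

    struct TickBlock {
        long count;                // used slots, starts at 0
        Tick ticks[BLOCK_SIZE];
    };

    // Append a tick; flush the whole block as a single database
    // record once it is full (one insert instead of 64).
    void append_tick(long instrument, TickBlock &blk, const Tick &t)
    {
        blk.ticks[blk.count++] = t;
        if (blk.count == BLOCK_SIZE) {
            // store_block(instrument, blk);  // depends on the DB API
            blk.count = 0;
        }
    }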
No complex queries are required. The most frequently used transactions are:
1) Append a new price record (up to 1,000 per second)
2) Get a large block of price records for one instrument.
Could anybody tell me whether this kind of task can be done by a Sybase or
Oracle database with the same performance as a c-tree database, or even
better?
Are there any special design considerations I have to take care of?
Thanks for your help
Josef
email: hol...@teletrader.com
You are going to be hard-pressed to match the performance of a
procedural storage library that (I assume) doesn't perform logging
using a standard relational database. The real problem is that the
relational model buys you lots of flexibility and transparency at a
cost in performance.
>The database should be able to store about 50,000 instruments that have
>between 1 and 10,000 update records per day (some instruments even more
>often). At peak times the database has to handle about 1,000 updates
>per second.
A single connection updating 1,000 records/sec in a single table is
going to hammer most databases (well, short of exotic solutions
such as multiple servers with middleware or smart clients). I have
seen this sort of thing done, but you'll find that you can't have
much else going on in the database (at least on that table) while you
are performing the updates.
>To handle such update frequencies I have implemented some caching as
>well as buffering: I store blocks of small price records (Instrument,
>Time, Last, Vol) in one database record.
>
>The structure of the tables is very simple: (Instrument, Time, Last,
>Vol).
>No complex queries are required. The most frequently used transactions are:
>1) Append a new price record (up to 1,000 per second)
>2) Get a large block of price records for one instrument.
>
>Could anybody tell me whether this kind of task can be done by a Sybase or
>Oracle database with the same performance as a c-tree database, or even
>better?
>Are there any special design considerations I have to take care of?
It sounds like the caching trick you are performing more or less
breaks with the relational model (you can probably shoe-horn it in
there, but you will find it doesn't perform well and isn't very flexible).
One option would be to look into some of the newer object/relational
technologies... many of them support a time-series datatype that would
at least properly model your problem, but I really question whether
any of them can keep up with the data rates that you are suggesting.
You may want to go with a hybrid approach... use c-tree as
a staging area during bursts of high activity, then use a separate
thread to migrate records into the relational database. I don't know
if this fits with your problem, but it's an idea.
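Very roughly, and only as a sketch (the names below are invented, and a
real staging area would be the c-tree file itself rather than an
in-memory queue), the split could look like:

    // Sketch of the hybrid approach: the feed thread appends ticks
    // to a fast staging area; a background thread drains them into
    // the relational database in batches.
    #include <condition_variable>
    #include <deque>
    #include <mutex>

    struct Tick { long instrument; long time; double last; long vol; };

    std::deque<Tick>        staging;
    std::mutex              mtx;
    std::condition_variable cv;
    bool                    done = false;   // set at shutdown

    void stage(const Tick &t)          // called at feed speed
    {
        std::lock_guard<std::mutex> lock(mtx);
        staging.push_back(t);
        cv.notify_one();
    }

    void migrator()                    // runs in its own thread
    {
        for (;;) {
            std::unique_lock<std::mutex> lock(mtx);
            cv.wait(lock, [] { return !staging.empty() || done; });
            if (staging.empty()) return;

            std::deque<Tick> batch;    // grab everything staged so far
            batch.swap(staging);
            lock.unlock();

            // insert_batch(batch);    // one transaction per batch
        }
    }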
-scott
--
Scott C. Gray gr...@voicenet.com "my keybard is brken"
Sybase Professional Services scott...@sybase.com
http://www.voicenet.com/~gray/sqsh.html
I have done similar things using a Sybase back-end, and using a
Sun Sparc 10 as the server. We were storing a live data feed of
futures & options data, and it worked pretty well, although
we sometimes had a fairly large backlog when the volume got
really heavy (such as when the US labor report is issued on the
first Friday of every month).
We had a dynamically allocated buffer on the incoming feed that could
grow to several megabytes in worst-case scenarios, and we stored
tick-by-tick price information in flat files, since we usually loaded
the entire files when we needed them. We optimized the number of
records in the historical database by storing data minute by minute
in the following format:
time open high low close [p2, p3, p4...]
where high, low, close, p2, etc. are stored as differences from
open (the first price in the minute).
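In case it helps, the encoding is easy to sketch (the field names and
tick size below are mine for this post; the real system differed in
details):

    // Sketch of the minute-bar encoding: the open is stored at full
    // precision, everything else as a small delta from it.
    #include <vector>

    const double TICK = 0.01;          // assumed minimum price increment

    struct MinuteBar {
        long   time;                   // start of the minute
        double open;                   // first price, stored in full
        short  high, low, close;       // deltas from open, in ticks
        std::vector<short> extras;     // p2, p3, p4, ... also deltas
    };

    short encode(double price, double open)
    {
        double d = (price - open) / TICK;
        return (short)(d < 0 ? d - 0.5 : d + 0.5);  // round to nearest
    }

    double decode(short delta, double open)
    {
        return open + delta * TICK;
    }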
We had to write some specialized software to manage the flat files,
but all in all it was a pretty fast and robust system.
Michael
--
Michael Peppler -||- Data Migrations Inc.
mpep...@mbay.net -||- http://www.mbay.net/~mpeppler
>Josef Holzer wrote:
>>
>> I am currently using a c-tree (Faircom) database for collecting
>> price data of financial instruments. As I want to improve the reliability
>> of the database as well as increase the number of data records, I
>> would like to move to a more robust database like Sybase or Oracle. But
>> I do not want to lose performance compared to my existing c-tree
>> database, which also runs fine on a desktop computer.
...
>We had to write some specialized software to manage the flat files,
>but all in all it was a pretty fast and robust system.
If a C++ class library is an option, you could have a look at MetaKit:
http://www.meta4.com/metakit/summary.html
It should be well suited to the performance level and amount of instrument
data you have described. The relevant features of MetaKit here are that it
offers portable and fail-safe data storage, and that it can handle nested
data structures, which can help in achieving very compact and fast storage.
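For a feel of the API, storing ticks nested per instrument might look
roughly like this (simplified; please check the documentation for the
exact details):

    // Rough MetaKit sketch: one row per instrument, with a nested
    // subview holding that instrument's ticks.
    #include "mk4.h"

    int main()
    {
        c4_Storage storage ("prices.mk", true);    // true = writable

        c4_View instruments =
            storage.GetAs ("instruments[name:S,ticks[time:I,last:F,vol:I]]");

        c4_StringProp pName ("name");
        c4_ViewProp   pTicks ("ticks");
        c4_IntProp    pTime ("time"), pVol ("vol");
        c4_FloatProp  pLast ("last");

        instruments.Add (pName ["IBM"]);           // new instrument row
        c4_View ticks = pTicks (instruments[0]);   // its tick subview
        ticks.Add (pTime [865432100] + pLast [101.5f] + pVol [500]);

        storage.Commit ();                         // fail-safe commit
        return 0;
    }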
Feel free to contact me to discuss ideas for your particular situation; I
have worked with scientific measurement data in the past (chromatography:
thousands of samples, millions of data points) and might be able to offer
some suggestions.
Regards,
Jean-Claude
----------------------------------------------------------------------
Jean-Claude Wippler ~ Meta Four Software ~ www.meta4.com
As other posters have pointed out, where the data collector is
faster than the data storer, double buffering (usually to a flat
file) is the solution, provided the storer can eventually catch up.
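The flat-file side of that is trivial to sketch (record layout and
names are illustrative only):

    // Double buffering through a flat file: the collector appends
    // ticks at full speed; the storer opens the same file with a
    // second handle and drains from its own offset when it can.
    #include <cstdio>

    struct Tick { long instrument; long time; double last; long vol; };

    // collector side: append-only, never waits for the database
    void buffer_tick(std::FILE *spool, const Tick &t)
    {
        std::fwrite(&t, sizeof t, 1, spool);
        std::fflush(spool);
    }

    // storer side: read everything appended since the last drain
    long drain(std::FILE *spool, long offset)
    {
        Tick t;
        std::fseek(spool, offset, SEEK_SET);
        while (std::fread(&t, sizeof t, 1, spool) == 1) {
            // insert_into_rdbms(t);   // or batch several per commit
            offset += sizeof t;
        }
        return offset;   // persist this offset to survive a restart
    }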
Secondly, transaction logging imposes performance losses.
But transaction logging can be turned off on several major RDBMSes,
and you can still have a backup/recovery strategy and at-least-as-good
confidence in the system as with c-tree.
To answer your question about special design considerations:
What's often missed by novice users of RDBMSes, especially complex
ones like Oracle, is the importance of tuning. There are
so many different tuning options once you look, and it is something
of a black art to understand. Without attention to tuning,
programmers get really bad performance, and hate SQL forever.
With attention to tuning, they still dislike SQL but admit the
speed is workable.
I would point out that the difference between a badly tuned or untuned
system and a tuned one is quite considerable, but tuning
requires study (of the manuals) if you're new to the area.
Tuning also takes time and effort to get the tweaks and pokes right.
Tuning applies everywhere - server, disk, O/S, clients, programmatic
code, SQL, client-server comms, everywhere.
- Nigel
--
------------------------------------------------------------------------------
Nigel McFarlane. TUSC Computer Systems n...@tusc.com.au
666 Doncaster Rd, Doncaster 3108 BH: +61.3.9840.4476
Melbourne, Victoria, Australia. GMT +10 or +11 hours
When was the last time you were that frustrated _away_ from a
computer?
In article <338E3E...@tusc.com.au>, Nigel McFarlane <n...@tusc.com.au> writes:
> With attention to tuning, they still dislike SQL but admit the
> speed is workable.
Gosh darn that was a funny line!
--
Pablo Sanchez | wk: 415.933.3812| pg: 800.930.5635 -or- pab...@pager.sgi.com
--------------+-----------------+--------------------------------------------
pa...@sgi.com ... when mailing me, place "not spam" in the Subject