searching 500 K entries database

Martin at Codim

unread,

Mar 12, 1998, 3:00:00 AM3/12/98

to

What type of database is adviseable for a HUGHE database that has to be
searched FAST? The database consists of only one table with four fields. A
bit like an internet search engine... Shall i use a paradox table and sql
queries, an indexed table or intrabase or other SQL server database? What is
the fastest for this application?

I hope someone can help me out with this decision.

You can also answer me by email: mar...@byzantine.nl

Sundial Services

unread,

Mar 12, 1998, 3:00:00 AM3/12/98

to

In article <6e8r66$c53$1...@news.utrecht.NL.net> "Martin at Codim" <mar...@codim.nl> writes:

>What type of database is adviseable for a HUGHE database that has to be
>searched FAST? The database consists of only one table with four fields. A
>bit like an internet search engine... Shall i use a paradox table and sql
>queries, an indexed table or intrabase or other SQL server database? What is
>the fastest for this application?

>I hope someone can help me out with this decision.

Almost any database should be able to provide rapid searching, even of 500,000
records, if all of the searches being performed are ones that can take
advantage of B-tree indexes... which is the nearly-universal type of index
that you find.

Interbase search-engines usually use a different type of searching algorithm
because they are called upon to do fast full-text searches of document data.
It does not sound to me like you are doing that kind of search.

A B-tree takes a "divide and conquer" approach, using a tree structure that
always remains balanced. For example, the topmost level of the tree might be
(conceptually speaking...) the "100,000's." The next, "10,000's," then
"1,000's" and so on. You can see that any one of 500,000 entries could be
retrieved with no more than five or six index-page fetches and one data-page
fetch, if the tree had the structure I suggested. Actually the performance
would be better than that.

Jerry Gitomer

unread,

Mar 13, 1998, 3:00:00 AM3/13/98

to Sundial Services

Hi,

A B-Tree is a balanced index structure organized so that every
subsequent access to the index reduces the number of unsearched entries
by half. In a pure B-tree accessing a file with between 256K and 512K
records, 19 accesses to the index would be required to locate the record
and one to retrieve it. In practice most B-Tree developers adopt
shortcuts which can significantly reduce the number of physical reads of
the index at the expense of slightly slower inserts. (For instance,
keep a "master block" with pointers into the index. Even a 1K block for
an index with 12 byte keys and a 4 byte address will result in cutting
the number of accesses down from 19 to 13 and using a 16K master block
would reduce the number down to 9.

Regards

Jerry

--
Jerry Gitomer Since I know how to spell DBA I became one.
jgit...@p3.net

Michael Kay

unread,

Mar 13, 1998, 3:00:00 AM3/13/98

to

>A B-Tree is a balanced index structure organized so that every
>subsequent access to the index reduces the number of unsearched entries
>by half.

You're referring to a Binary Tree, where each node refers to two children.
The B-Tree
is a more general structure, in which there is usually a much higher fan-out
than
this. The B in B-Tree actually stands for Balanced, not for Binary.

Mike Kay, ICL

Sundial Services

unread,

Mar 13, 1998, 3:00:00 AM3/13/98

to

A very easy oversight to make. But the bottom line is that a tree with just a
few levels of depth can easily manage millions of entries, provided only
that the searches being requested are compatible with the index's abilities to
help them. Also, it is necessary to provide end-users with the ability (and
the obligation!) to be specific in searching. "Browsing" the database is out.

John Nickelson

unread,

Mar 14, 1998, 3:00:00 AM3/14/98

to

A lot depends on your current enviroment. Paradox, Dbase, Foxpro are
"desktop" databases meaning that if the table does not reside on the
workstation's hard drive that the search is being performed on, it will have
to load the entire table over the network for every search.

A client-server database, such as: MS SQL server, interbase, or Oracle, the
huge table exists in only one place, the server, and the query is sent to
the server , which executes the search, then only the "answer" is sent back
to the workstation.

I'm not familiar with any performance comparisons between the databases, i'm
sure you can get a lot of info on that if you pick up a database magazine.
But, again , first you must decide whether you need a desktop or
client-server database. What is your enviroment?

--
Jack
http://www.iamproud.com/~galactic/index.html
Martin at Codim wrote in message <6e8r66$c53$1...@news.utrecht.NL.net>...

>What type of database is adviseable for a HUGHE database that has to be
>searched FAST? The database consists of only one table with four fields. A
>bit like an internet search engine... Shall i use a paradox table and sql
>queries, an indexed table or intrabase or other SQL server database? What
is
>the fastest for this application?
>
>I hope someone can help me out with this decision.
>