New database stuff

51 views
Skip to first unread message

The Beez

unread,
Sep 14, 2016, 2:54:54 PM9/14/16
to 4tH-compiler
Hi 4tH-ers!

You haven't heard from me for a long time, that doesn't mean I wasn't working on anything. Reason? I wanted a facility that would update automatically the indexes after a table insert. Well, I got it. Compare this:

OLD
===
begin
  refill                               \ read next CSV record
while
  db.clear                             \ clear the DB buffer
    Field> db.buffer -> Rank      place
    Field> db.buffer -> Name      place
    Field> db.buffer -> City      place
    Field> db.buffer -> Country   place
    Field> db.buffer -> Height_m  place
    Field> db.buffer -> Height_ft place
    Field> db.buffer -> Floors    place
    Field> db.buffer -> Built     place
  db.insert                            \ read fields and insert record

  Sky.Name    idx.insert drop          \ update index, drop flag
  Sky.Height  idx.insert drop          \ update index, drop flag
  Sky.City    idx.insert drop          \ update index, drop flag
  Sky.Country idx.insert drop          \ update index, drop flag
  Sky.Rank    idx.insert drop          \ update index, drop flag
  Sky.Floors  idx.insert drop          \ update index, drop flag
  Sky.Built   idx.insert drop          \ update index, drop flag
repeat


NEW
===
begin
  refill                               \ read next CSV record
while
  db.clear                             \ clear the DB buffer
    Field> db.buffer -> State      place
    Field> db.buffer -> Abbr       place
    Field> db.buffer -> Capital    place
    Field> db.buffer -> Population place
    Field> db.buffer -> Area       place
    Field> db.buffer -> Seats      place
  dbs.insert                            \ read fields and insert record
repea
t

The only overhead is that you need to declare the indexes once:

US.State US.Abbr US.Capital 3 dbs.indexes abort" Cannot declare indexes"


And if you work neat, you close the database using "dbs.close" instead of "db.close". You'll see all the stuff pop up shortly, just be patient.

I also noticed people have no idea what they can do with the 4tH database library. Note, this thing is about 400 lines of native 4tH, which compiles to 1200 opcodes. Compared to that SQLITE is a giant!

So let's say we have a table with skyscrapers and a table with US states and their capitals and we want to list all skyscrapers in capital cities, like this:

SELECT US.Capital, US.State, Skyscraper.Name, Skyscraper.City
FROM US INNER JOIN Skyscraper ON US.Capital = Skyscraper.City
ORDER BY US.Capital

In 4tH a table consists of 3 elements:
- The table ID;
- The table buffer;
- The table indexes.

All communication goes by the table buffer, keep that in mind. Note we omitted all opening and closing overhead:

US.Capital idx.first                   \ goto the first record

begin                                  \ ** ENTER THE MAIN LOOP **
  US.Capital idx.error 0=              \ are we end-of-index?
while
  Sky db.use                           \ switch to the 'Skyscraper' table
  UState -> Capital count Sky.City dup idx.clear idx.find
                                       \ find capital using the 'City' index
  if                                   \ if we found it
    begin                              \ print fields from both tables
      UState -> Capital    count type ." , "
      UState -> State      count type ." , "
      Skyscraper -> Name   count type ." , "
      Skyscraper -> Floors count type cr
      Sky.City dup idx.next idx.error 0=
    while                              \ are we end-of-index?
      UState -> Capital count Skyscraper -> City count compare
    until                              \ have we had all records
  then                                 \ for this capital

  US db.use                            \ switch back to 'US' table
  US.Capital dup idx.next idx.error    \ get the next capital
until                                  \ are we end-of-index?
                                       \ ** END OF MAIN LOOP **


First we use the "US.Capital" index to fill the "UState" buffer. We use the field "Capital" in the buffer to find the appropriate record in the "Sky.City" index, which in turn fills the "Skyscraper" buffer. We use the flag it returns to determine if we print anything at all - hence an "inner" join. If we used the flag to determine whether to use the contents of a valid "Skyscraper" buffer or to simply print "NULL", it would have been an "outer" join.

The table ID's "Sky" and "US" are used to activate a table (and associated buffers and indexes). As you can see, (de)activation does not invalidate a buffer. Hence, you can still use a valid buffer, even if another table is active.

For those of you who still remember the time of dBase III or Dataflex this should be quite familiar. Sure, it may require a bit more manual work than SQLITE, but that doesn't translate to size. The program doing this query is in total 1300 opcodes - 1200 of them are dedicated to the database engine. On the other hand, you can decide how to approach a query - you don't have to worry about "query optimizers", since you're it. ;-)

Hans Bezemer

The Beez

unread,
Sep 15, 2016, 3:23:39 PM9/15/16
to 4tH-compiler
Hi 4tH-ers!

I added another function to the DBMIDX.4TH library. Let's say you're searching the "skyscraper" database and you need all the ones 1000 ft and over. Chances are slim there is one EXACTLY like 1000 ft, so this will most likely tell you "No such record, have a nice day".

: .record                              ( --)
  ." Rank   : " db.buffer -> Rank count type cr
  ." Name   : " db.buffer -> Name count type cr
  ." City   : " db.buffer -> City count type cr
  ." Country: " db.buffer -> Country count type cr
  ." Height : " db.buffer -> Height_ft count type ."  ft" cr
  9 spaces db.buffer -> Height_m count type ."  m" cr
  ." Floors : " db.buffer -> Floors count type cr
  ." Built  : " db.buffer -> Built count type cr cr
;

s" 1000" Sky.Height @ idx.find if .record then


What you want is this:

s" 1000" Sky.Height @ idx.search 0> 0= if .record then

This will give you the first record equal to or bigger than 1000 ft (>=). If the search results in "end of index" it will return 1. If it is found, it returns 0. If a bigger one is found, it will return -1. If the return value is less than 1, there is a valid buffer you can use. E.g. if you don't want the bigger one, but the smaller one, you can do this:

s" 1000" Sky.Height @ idx.search dup          \ perform the search
0< if
Sky.Height @ idx.previous then          \ not found, get previous one
0> 0= if .record
then                                                    \ only print a valid buffer

That brings this tiny program almost up to par with most other (small) database systems. Yes, we are still missing compound indexes - but there is a way round that: ugly, but it works. Combine the fields in an extra field, separated by a TAB, and add an index to that one. It works, I've already tested that one.

Hans Bezemer


The Beez

unread,
Sep 28, 2016, 5:37:55 PM9/28/16
to 4tH-compiler
Dear 4tH-ers!

Note I made some changes to the API since. dbs.indexes has been replaced by dbs.bind. dbs.close is replaced by dbs.release - and it doesn't do an implicit db.close anymore. dbs.binder ( xt -- xt) has been added. It takes an xt with the stackdiagram (x -- ) and loops through all index addresses of the extended dictionary. dbs.release is defined in terms of dbs.binder.

Code in SVN, as always.

Hans Bezemer
Reply all
Reply to author
Forward
0 new messages