Backup ignores data?

jarle....@hjemme.no

Oct 27, 2006, 3:13:00 AM

IB 7.5, running on Fedora 9

Backing up a 25G database produced a backup file of less than 3G, with no
error message. Restoring this backup file produced a 5G database, again with
no error messages, and the new database worked all right. Since I received
no error messages during backup and restore, and since I had deleted most of
the text log records (BLOB) that are produced during data import, I
ignored the 25G versus 5G discrepancy. I was stupid and deleted the original
25G database.
I later discovered that the biggest table, T_PERIODEDATA, had only 20E6
records left after the restore. The table T_PERIODEDATA contains hourly
consumption of energy, gas, current etc., collected from approx. 15000 meters
in different buildings in Norway.
This is the metadata for T_PERIODEDATA, with size in bytes for each column:

CREATE TABLE "T_PERIODEDATA"
(
"PD" NUMERIC(18, 0) NOT NULL, /* 8 bytes */
"PD_PG" INTEGER NOT NULL, /* 4 bytes */
"PD_VERSJON" SMALLINT NOT NULL, /* 2 bytes */
"PD_TYPE" CHAR(1) NOT NULL, /* 1 byte */
"PD_FOM" TIMESTAMP NOT NULL, /* 8 bytes */
"PD_OT" TIMESTAMP NOT NULL, /* 8 bytes */
"PD_FORBRUK" NUMERIC(18, 4) NOT NULL, /* 8 bytes */
"PD_STAND" NUMERIC(18, 4) NOT NULL, /* 8 bytes */
"PD_TOTAL" NUMERIC(18, 4) NOT NULL, /* 8 bytes */
"PD_CRC" INTEGER NOT NULL, /* 4 bytes */
CONSTRAINT "PK_PERIODEDATA" PRIMARY KEY ("PD"),
CONSTRAINT "UQ_PD_FOM" UNIQUE ("PD_PG", "PD_FOM")
);
ALTER TABLE "T_PERIODEDATA" ADD CONSTRAINT "FK_PD_PG" FOREIGN KEY ("PD_PG")
REFERENCES "T_PULSGIVER" ("PG") ON UPDATE CASCADE ON DELETE CASCADE;

Referring to http://www.ib-aid.com/interbase/firebird/bug/research.html:
Adding a header of 14 bytes, the record size of T_PERIODEDATA is 73 bytes.
The maximum page count for one table can be calculated as MaxDataPageCount =
(MaxInt / PageSize) * 17.476.
With page size = 4096, one page can contain 4096/73 = 56 records. Therefore
the record limit of T_PERIODEDATA is approx. 500E6 records.
If the record limit is exceeded I should find "pointer page vanished from
DPM_next (249)" in interbase.log and my database should stop working.
This has not happened.
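
The estimate above can be reproduced with a short calculation. This is only a sketch in Python: the 14-byte header, the constant 17.476 and the formula are taken as quoted from the referenced article, the column sizes come from the table definition, and the variable names are made up for illustration. Treating MaxInt as the signed 32-bit maximum is an assumption.

```python
# Column sizes in bytes, from the T_PERIODEDATA definition.
# SMALLINT is counted as 2 bytes, which is what makes the total come to 73.
column_bytes = {
    "PD": 8, "PD_PG": 4, "PD_VERSJON": 2, "PD_TYPE": 1, "PD_FOM": 8,
    "PD_OT": 8, "PD_FORBRUK": 8, "PD_STAND": 8, "PD_TOTAL": 8, "PD_CRC": 4,
}
RECORD_HEADER = 14      # bytes, as quoted from the article
PAGE_SIZE = 4096
MAX_INT = 2**31 - 1     # assumption: MaxInt is the signed 32-bit maximum

record_size = sum(column_bytes.values()) + RECORD_HEADER
records_per_page = PAGE_SIZE // record_size
max_data_pages = int(MAX_INT / PAGE_SIZE * 17.476)
record_limit = max_data_pages * records_per_page

print(record_size, records_per_page, max_data_pages, record_limit)
```

With these inputs the limit comes out at roughly 513E6 records, in line with the approximate figure of 500E6.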

There is another table containing information about daily consumption. This
table, T_DOGN_FORBRUK, has 8584276 records with no corresponding records in
T_PERIODEDATA. This means that T_PERIODEDATA has lost 8584276 * 24 = 206E6
records. This is less than half of the calculated record limit of 500E6. This
also means that the T_DOGN_FORBRUK and T_PERIODEDATA tables are out of sync.
That should not happen.
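
As a quick check of the arithmetic, here is a minimal Python sketch. It assumes, as the paragraph does, that every daily record corresponds to exactly 24 hourly records; the constant names are illustrative only.

```python
DAILY_RECORDS_WITHOUT_HOURLY = 8584276   # T_DOGN_FORBRUK rows with no match
HOURS_PER_DAY = 24                       # assumption: 24 hourly rows per day
RECORD_LIMIT_ESTIMATE = 500_000_000      # the approximate limit quoted earlier

lost_records = DAILY_RECORDS_WITHOUT_HOURLY * HOURS_PER_DAY
share_of_limit = lost_records / RECORD_LIMIT_ESTIMATE

print(lost_records)               # 206022624
print(round(share_of_limit, 2))   # 0.41
```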

We also know for certain that a backup < 3G (again the crazy size) was
produced automatically on the 20th of October, and that the invoice produced
(based on T_PERIODEDATA) on the 23rd of October was successful. After my manual
backup and restore on the 24th of October, the data in T_PERIODEDATA used for
the invoice of the 23rd of October is partly missing.

The automatic backup is done on a Windows server, while the manual backup is
done on the Linux (Fedora 9) server.
I am certain the problem is caused by gbak.
Is this a known issue?

Jarle Nilsen
Scandinavian Electric AS

Direct tel.: 55 50 60 41
Fax: 55 50 60 99
Mobile: 40 40 21 58
E-mail: jarle....@scel.no

Visit our website: http://www.scel.no

Bill Todd

Oct 27, 2006, 10:28:56 AM

<jarle....@hjemme.no> wrote:

> I am certain the problem is caused by gbak.
> Is this a known issue?

I assume you are running version 7.5.1. It is possible to lose data
from a corrupt database when you do a gfix -mend followed by a backup
and restore. That is the only case of data loss that I am aware of.

--
Bill Todd (TeamB)

Craig Stuntz [TeamB]

Oct 27, 2006, 10:26:51 AM

I have a vague recollection of some versions of gbak on some platforms
(OS/filesystem) being limited to 2 or 4 GB file sizes. The workaround
was to use multi-file backups. But I don't remember the specifics. If
you have a test case which fails with a single-file backup, try a
multi-file backup and see what happens.

--
Craig Stuntz [TeamB] · Vertex Systems Corp. · Columbus, OH
Delphi/InterBase Weblog : http://blogs.teamb.com/craigstuntz
Please read and follow Borland's rules for the user of their
server: http://support.borland.com/entry.jspa?externalID=293

jarle....@hjemme.no

Oct 27, 2006, 12:36:29 PM

I have never used gfix -mend.
I'm not sure of the version, but I have an SP1 folder with ibserver,
gds_lock_print and interclient.jar. This ibserver has the same size as
/interbase/bin/ibserver.

> I assume you are running version 7.5.1. It is possible to lose data
> from a corrupt database when you do a gfix -mend followed by a backup
> and restore. That is the only case of data loss that I am aware of.

Jarle Nilsen

jarle....@hjemme.no

Oct 27, 2006, 12:27:13 PM

When this database was approx. 17G we had a successful manual gbak and
restore on the same configuration. I believe the backup file was approx. 7G.
Could there be an 8G limit for backup files?
I did have a problem when restoring if temporary files exceeded 2G. There is
a 2G limit for temporary files in IB7. The workaround was several
TEMP_DIRECTORY entries of 2G each in ibconfig.
I am writing code for inserting records in T_PERIODEDATA, based on the records
for daily consumption that lack corresponding records in
T_PERIODEDATA. This should give the exact number of missing records. Then I
can try your suggestion and do a multi-file backup.

Jarle Nilsen

Bill Todd

Oct 27, 2006, 12:41:56 PM

<jarle....@hjemme.no> wrote:

> Could there be an 8G limit for backup files?

No. File system size limits are determined by whether they use a 32 bit
signed integer for the file size (max = 2g), an unsigned 32 bit integer
(max = 4g) or a 64 bit integer (max = 18 exabytes).
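
The ceilings implied by each integer width can be worked out directly. The Python sketch below only shows the arithmetic limit of the integer type; a real file system may impose a lower limit for other reasons, and the names used here are illustrative.

```python
GIB = 2**30   # bytes in one gibibyte
EIB = 2**60   # bytes in one exbibyte

# Largest file size that fits in each kind of size field
limits = {
    "signed 32 bit": 2**31 - 1,
    "unsigned 32 bit": 2**32 - 1,
    "unsigned 64 bit": 2**64 - 1,
}

for name, max_bytes in limits.items():
    if max_bytes < EIB:
        print(f"{name}: {max_bytes / GIB:.1f} GiB")
    else:
        print(f"{name}: {max_bytes / EIB:.1f} EiB")
```

The 64 bit ceiling is about 1.8e19 bytes, i.e. roughly 18 exabytes in decimal units, so none of these widths produces a limit anywhere near 8G.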

Before you try the multi-file backup you might want to run gfix
-no_update and see if it reports any errors.

--
Bill Todd (TeamB)

Quinn Wildman

Oct 27, 2006, 5:34:46 PM

gbak doesn't use 32 bit I/O and doesn't need to. It simply appends to
the file for backup and reads serially through it for restore.
Therefore, the only limitations on file size for gbak are imposed by the
OS.

As Bill has suggested, gbak losing data implies a corrupt database, as
gbak reads data from the database just as any other client does.

jarle....@hjemme.no

Oct 30, 2006, 3:54:11 AM

"Quinn Wildman" <qwil...@borland.com> wrote in message
news:45427aaa$1...@newsgroups.borland.com...

> gbak doesn't use 32 bit I/O and doesn't need to. It simply appends to
> the file for backup and reads serially through it for restore. Therefore,
> the only limitations on file size for gbak are imposed by the OS.

1) At least some of the data that was troublesome for gbak could be
read using my application.
Unfortunately gbak gave no warning about the situation.
How can I tell that something is wrong, i.e. what are the best steps to take
before using gbak?

2) I have been experimenting all weekend. Went back to the last backup and
restored it again. A new db of ca. 5G was created. I created a procedure
that inserted 9642408 records each time it was executed from IBConsole. I
checked the record count for the first 3 operations. Everything appeared to
be correct.
I executed the procedure 21 times, for PD_PG = 0 through PD_PG = 20, and
yes, I remembered to commit :-).
This should give 202E6 records. During these 21 operations I watched the
size of the database, ls -al, as it grew. The size grew from 5G to 25G
during the inserts.

After the operations I performed a select count(*) from T_PERIODEDATA2,
which gave me 40345909
records. It should have been 21 * 9642408 = 202490568.

The newest records existed in the table; the others were not visible, but the
size had grown from 5G to 25G.

SQL> select count(*) from T_PERIODEDATA2 where PD_PG = 20; => count = 9642408
SQL> select count(*) from T_PERIODEDATA2 where PD_PG = 19; => count = 9642408
SQL> select count(*) from T_PERIODEDATA2 where PD_PG = 18; => count = 9642408
SQL> select count(*) from T_PERIODEDATA2 where PD_PG = 17; => count = 9642408
SQL> select count(*) from T_PERIODEDATA2 where PD_PG = 16; => count = 1776277
(Where are the rest of the 9642408 records?) => Total of 40345909 records,
which is consistent with the result from select count(*) from T_PERIODEDATA2.

SQL> select count(*) from T_PERIODEDATA2 where PD_PG = 15; => count = 0
...
And where have those records gone? I verified their existence...
SQL> select count(*) from T_PERIODEDATA2 where PD_PG = 2; => count = 0, did have 9642408
SQL> select count(*) from T_PERIODEDATA2 where PD_PG = 1; => count = 0, did have 9642408
SQL> select count(*) from T_PERIODEDATA2 where PD_PG = 0; => count = 0, did have 9642408
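
The counts above can be tallied to see how much is missing. This is a rough Python sketch using only the figures reported in this post; the groups hidden behind "..." are left out rather than guessed, and the names are illustrative.

```python
ROWS_PER_RUN = 9642408
RUNS = 21                  # PD_PG = 0 through PD_PG = 20

# Per-group counts as reported by the queries; elided groups are omitted.
reported = {20: 9642408, 19: 9642408, 18: 9642408, 17: 9642408,
            16: 1776277, 15: 0, 2: 0, 1: 0, 0: 0}
reported_total = 40345909  # select count(*) from T_PERIODEDATA2

expected_total = ROWS_PER_RUN * RUNS
visible = sum(reported.values())
missing = expected_total - reported_total

print(expected_total, visible, missing)
# Listed groups that are short of the per-run row count
print(sorted(pg for pg, n in reported.items() if n < ROWS_PER_RUN))
```

Since the listed groups already add up to the full count(*) result, the groups not shown contribute no visible rows, and about 162E6 of the inserted records are unaccounted for.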

How on earth can I avoid this problem? I receive *no* error messages.

I executed the procedure a few times more with correct results, i.e. the
total number of records increased as it should.

I have deleted the database, but I will restore again and repeat this with a
script running from isql.

Suggestions are most welcome !

Jarle Nilsen

jarle....@hjemme.no

Mar 11, 2007, 11:33:50 AM

We continued to use InterBase despite the problems described earlier. It now
turns out that the problem is most likely a defective BIOS or other hardware.
The machine is an HP G3 XENON with Fedora 9 and IB 7.5.1, which
died completely last Friday, according to interbase.log. The BIOS error
message appears during startup. This machine has RAID, and all disks show a
green light, i.e. OK state. Not only InterBase but also system files have been
corrupted. The machine is approx. 12 months old. There are "binary" entries
in the interbase.log files, dating back to February 2007. They are rare, but
they are there. Sometimes just one line of nonsense.

Although nobody can say that they have seen the BIOS error message during
startup before today (it is a server...), we are now wondering if the machine
has been faulty since it was delivered in March 2006, and whether this caused
the described loss of data during backup in October 2006. Worse, the
database is corrupted again:
[root@SEE-Energi kundedb]# /opt/interbase/bin/gfix -v 2007.03.11.qkundedb.ib
database file appears corrupt ()
-wrong page type
-page 11949242 is of wrong type (expected 4, found 0)
[root@SEE-Energi kundedb]#

Unfortunately a lot of work has been done since our latest backup last
Wednesday. :-(

Jarle Nilsen

<jarle....@hjemme.no> wrote in message
news:4542...@newsgroups.borland.com...

Bill Todd

Mar 11, 2007, 12:54:14 PM

Thanks for letting us know what you found.

--
Bill Todd (TeamB)