I have an environment where I run an old version of xHarbour,
xHarbour Compiler build 1.1.0.
I'm trying to migrate it to the latest version of Harbour because I
think Harbour is much more stable than xHarbour and I feel secure with
its current development.
The biggest problem in our current environment is index corruption.
We run a terminal-based application with DBFCDX indexes and lots of
users, about 1000 concurrent. All tables are reindexed at midnight, but
we still get corruption on some days during working hours. We have a
clue that it might occur because our support team is killing user
processes that freeze.
I want to know whether, without external intervention (no processes
being killed), the index could still become corrupted by itself, i.e.
whether there was some error in the DBFCDX code of xHarbour 1.1.0 that
Harbour has fixed in its latest version. It would be a great argument
to convince our staff to do the migration.
Best regards.
Raul Almeida.
I'm already doing some work to port our code to Harbour.
Actually, I prefer to really change our code to remove xHarbour
extensions that are not Clipper/Harbour compatible, such as the third
argument of the "AT" function.
But I would still like to know whether the DBFCDX code has evolved
significantly since the projects split apart.
I have already compared the source files. They seem to be very
different, but I cannot tell whether the functions that could possibly
break an index have changed.
Can you?
Thanks very much.
On Nov 29, 12:24 pm, Massimo Belgrano <mbelgr...@deltain.it> wrote:
> If you have index corruption with xHarbour Compiler build 1.1.0, you can
> recompile the application with Harbour 3.0/3.1,
> reindex with the Harbour application and verify.
>
> Add #include "hbcompat.ch" to each source:
> hbmk2 source1.prg source2.prg xhb.lib
>
> Then I suggest converting your app so that it does not use the bad xhb.
>
> http://harbourlanguage.blogspot.com/2011/10/understanding-xharbour-li...
>
> 2011/11/29 Raul Almeida <raul....@gmail.com>
Hope it helps
Maurizio
On 29/11/2011 12:52 p.m., Raul Almeida wrote:
Hi,
> I have a environment where I run some old version of xHarbour,
> xHarbour Compiler build 1.1.0.
> I'm trying to migrate it to latest version of Harbour cause I think
> Harbour is much more stable than xHarbour and I feel secure with it
> current development.
> The biggest problem in out current environment is index corruption.
> We run a terminal based application with DBFCDX index with lots of
> users, about 1000 concurrent. All tables are reindexed at midnight but
> still we have corruption some days during the working hours. We have a
> clue that it might occur because our support team are killing users
> processes that freeze.
I also have quite big installations with hundreds of users working
simultaneously and they work without any problems. I do not use any
automatic reindex procedures. Such tricks do not fix anything; they
only hide the real cause of the problem.
Without any doubt your code should work without reindexing too.
> I want to know if it is possible that without external intervention,
> no processes being killed, the index still could corrupt per si, if
> there was some error on DBFCDX code on xHarbour 1.1.0 that Harbour in
> it latest version had fix. It would be a great argument to convince
> our staff to do the migration.
AFAIK an index should never be corrupted without external intervention.
You are using an xHarbour version created between:
2007-11-01 10:09 UTC+0300 Phil Krylov <phil a t newstar.rinet.ru>
* include/hbver.h
* Changed CVS HEAD version number to 1.1.0.
2008-10-21 18:11 UTC+0100 Patrick Mast <patric...@xharbour.com>
* include/hbver.h
* Changed CVS HEAD version number to 1.2.0.
I cannot say much about xHarbour binaries from this period because
I had moved to Harbour. Shortly before that, Sandro Freire asked me for
help with xHarbour HPUX builds and I created some patches which were
later committed to xHarbour:
2007-12-19 11:10 UTC-0300 Luiz Rafael Culik Guimaraes <luiz at xharbour.com.br>
! source/rtl/filesys.c
source/rtl/fssize.c
source/hbffind.c
! sync with harbour to allow usage of files >=4gb on hpux and linux
As far as I can see, Luiz also committed some build-time modifications for MT builds:
2007-10-23 11:50 UTC-0300 Luiz Rafael Culik Guimaraes <luiz at xharbour.com.br>
* config/hpux/dir.cf
config/hpux/gcc.cf
config/hpux/global.cf
config/hpux/install.cf
! updated to proper compile on hpux and generate mt libs
Anyhow, this xHarbour version cannot be used to access the same files
simultaneously in aliased WAs or by RDD from different threads in MT
mode, so if you are creating MT programs or accessing the same files in
aliased WAs then that can be the source of your problems.
In recent years I haven't made any critical modifications related to
file corruption in core DBFCDX, only a few fixes for some very rare
situations, some extensions for CPs with multibyte characters and
accented characters with the same weight, and added support for
concurrent file access in POSIX systems for MT programs.
I'm attaching ChangeLog entries. I do not think that they are related
to your problem.
The problem you have is rather local to your installation.
E.g. maybe the corruption is caused by reindexing. It may happen if you
remove indexes while some applications are active and create new ones.
In such a case the active applications still use the old indexes and
update them instead of the newly created ones, which are then not
synced correctly.
Killing an application which is updating an index can also cause index
corruption. You should catch the SIGTERM signal and set some internal
exit flag in the signal handler without interrupting the program. Then
you should cleanly close your application when it checks this flag. You
need current Harbour code, which can repeat IO operations interrupted
by signals.
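The advice above (set a flag in the handler, exit cleanly later, repeat interrupted IO) could look roughly like the following at the C level. This is only a minimal sketch, not Harbour's actual code: the names install_sigterm_flag(), exit_requested() and write_all() are hypothetical helpers invented for the illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Set by the handler, polled by the application's main loop. */
static volatile sig_atomic_t g_exit_requested = 0;

/* The handler only records the request; it does no I/O and does not exit. */
static void on_sigterm(int signo)
{
    (void) signo;
    g_exit_requested = 1;
}

/* Hypothetical helper: install the SIGTERM handler. Returns 0 on success. */
int install_sigterm_flag(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigterm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;            /* no SA_RESTART: slow calls may fail with EINTR */
    return sigaction(SIGTERM, &sa, NULL);
}

/* Hypothetical helper: non-zero once SIGTERM has been received. */
int exit_requested(void)
{
    return g_exit_requested != 0;
}

/* Hypothetical helper: write the whole buffer, repeating the call when it
 * is interrupted by a signal instead of leaving a partial update behind. */
ssize_t write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    size_t done = 0;

    while (done < len) {
        ssize_t n = write(fd, p + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: try again */
            return -1;          /* real error */
        }
        done += (size_t) n;
    }
    return (ssize_t) done;
}
```

The application would poll exit_requested() at a safe point, i.e. between record updates and never in the middle of one, and then quit normally. Note that SIGKILL cannot be caught this way, so it only helps if the support team sends SIGTERM.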
BTW why are your applications killed?
What does "users processes that freeze" mean?
Why are they frozen? What used to happen in such cases?
best regards,
Przemek
==========================================================================
2010-11-30 23:25 UTC+0100 Przemyslaw Czerpak (druzus/at/priv.onet.pl)
* harbour/src/rdd/dbfntx/dbfntx1.c
* harbour/src/rdd/dbfnsx/dbfnsx1.c
* harbour/src/rdd/dbfcdx/dbfcdx1.c
! fixed hb_cdpcmp() call used in indexing RDDs so it can work with
CPs using digraphs
2010-03-15 14:04 UTC+0100 Przemyslaw Czerpak (druzus/at/priv.onet.pl)
* harbour/src/rdd/dbfcdx/dbfcdx1.c
! fixed bad copy and past typo which could cause internal error when
new index using existing order (subindex) was created without ADDITIVE
clause. Bug reported by Mindaugas - many thanks for the information.
2009-10-31 12:44 UTC+0100 Przemyslaw Czerpak (druzus/at/priv.onet.pl)
* harbour/src/rdd/dbfcdx/dbfcdx1.c
% added small protection against code which may want to create
degenerated index tree using very large keys (over 158 bytes length)
2009-09-30 23:15 UTC+0200 Przemyslaw Czerpak (druzus/at/priv.onet.pl)
* harbour/source/rdd/dbfcdx/dbfcdx1.c
! fixed sorting with code pages using accented characters with the
same weight - it's necessary to disable some optimizations for
such CPs. Thanks to Jaroslav Janik for the information and example.
2009-09-22 12:58 UTC+0200 Przemyslaw Czerpak (druzus/at/priv.onet.pl)
* harbour/include/hbrddcdx.h
* harbour/source/rdd/dbfcdx/dbfcdx1.c
+ added support for national sorting using accented and multibyte
characters.
Warning CDX indexes created so far for such CDPs are not sorted using
the same conditions as current SVN code so new applications should
reindex.
Harbour codepages with accented characters:
cs852, csiso, cskam, cswin,
sk852, skiso, skkam, skwin
sv850, sviso, svwin
fr850, friso, frwin
el737, elwin,
Harbour codepages with multibyte characters:
cs852, csiso, cskam, cswin,
sk852, skiso, skkam, skwin
Now string keys in CDX indexes using above codepages are sorted
in the same way as HVM sorts strings. Please only remember that
CDX internal format was designed for byte weight sorting so such
CPs may reduce internal compression level increasing the size of
final indexes.
After this modification all native index types (NTX, NSX and CDX)
fully respects national sorting defined in Harbour CPs.
2009-09-11 20:37 UTC+0200 Przemyslaw Czerpak (druzus/at/priv.onet.pl)
* harbour/source/rdd/delim1.c
* harbour/source/rdd/sdf1.c
* harbour/source/rdd/dbf1.c
* harbour/source/rdd/dbffpt/dbffpt1.c
* harbour/source/rdd/dbfnsx/dbfnsx1.c
* harbour/source/rdd/dbfcdx/dbfcdx1.c
* harbour/source/rdd/dbfntx/dbfntx1.c
* replaced depreciated hb_cdp[n]Translate() functions with new
hb_cdp[n]Dup*()
+ added support for CP translations which can change the size of
translated strings
! disabled memo field translations for fields marked as binary
+ added support for UTF8 translations.
Using UTF8 as CP disables any national sorting in indexes.
Warning: please remember that normal character fields in tables have
fixed size and translation to UTF8 may increase the string
and cut results.
2009-04-21 18:24 UTC+0200 Przemyslaw Czerpak (druzus/at/priv.onet.pl)
* harbour/include/hbrddcdx.h
* harbour/source/rdd/dbfcdx/dbfcdx1.c
+ added support for CLIP indexes with IgnoreCase flag
2009-04-21 13:41 UTC+0200 Przemyslaw Czerpak (druzus/at/priv.onet.pl)
* harbour/include/hbrddcdx.h
* harbour/source/rdd/dbfcdx/dbfcdx1.c
+ added support for numeric indexes which store keys as modified
32bit big endian integer values - undocumented VFP hack.
2009-03-21 16:07 UTC+0100 Przemyslaw Czerpak (druzus/at/priv.onet.pl)
* harbour/source/rdd/dbf1.c
+ added native support for time and timestamp fields
* harbour/include/hbrddcdx.h
* harbour/include/hbrddnsx.h
* harbour/source/rdd/dbfntx/dbfntx1.c
* harbour/source/rdd/dbfcdx/dbfcdx1.c
* harbour/source/rdd/dbfnsx/dbfnsx1.c
* harbour/source/rdd/dbffpt/dbffpt1.c
+ added support for indexing timestamp fields
+ added support for using DATE values with timestamp fields
which replicate HVM behavior.
SEEK and SEEKLAST with date value when active index is on
timestamp positions to 1-st or last record where date part
of indexed timesamp value is equal.
Settings scopes to date values when active index is on timestamp
value reduce the visible record range to these ones which have
date part of timestamp value in the range of dates values used
for scopes. It possible to mix date and timestamp values in scope
and set one scope to date value and the second to timesamp.
2008-11-12 01:48 UTC+0100 Przemyslaw Czerpak (druzus/at/priv.onet.pl)
* harbour/source/rdd/dbfcdx/dbfcdx1.c
! fixed bug in joined leaf pages size calculation which activated
error massage enabled by HB_CDX_DBGCODE_EXT macro.
Thanks to Saulius for reporting the problem.
[TOMERGE 1.0]
2008-10-02 14:33 UTC+0200 Przemyslaw Czerpak (druzus/at/priv.onet.pl)
* harbour/include/hbrdddbf.h
* harbour/include/hbrddcdx.h
* harbour/include/hbrddntx.h
* harbour/include/hbrdddel.h
* harbour/include/hbrddsdf.h
* harbour/source/rdd/dbf1.c
* harbour/source/rdd/delim1.c
* harbour/source/rdd/sdf1.c
* harbour/source/rdd/dbffpt/dbffpt1.c
* harbour/source/rdd/dbfntx/dbfntx1.c
* harbour/source/rdd/dbfcdx/dbfcdx1.c
* harbour/source/rdd/hsx/hsx.c
* harbour/contrib/hbbmcdx/bmdbfcdx.c
* harbour/contrib/hbbmcdx/hbbmcdx.h
* use PHB_FILE and hb_file*() functions instead of HB_FHANDLE (hb_fs*())
to access files.
2008-09-18 19:28 UTC+0200 Przemyslaw Czerpak (druzus/at/priv.onet.pl)
* harbour/source/rdd/dbfcdx/dbfcdx1.c
! fixed memory leak - Many thanks to Miguel for report
2008-07-08 14:13 UTC+0200 Przemyslaw Czerpak (druzus/at/priv.onet.pl)
* harbour/include/hbrddcdx.h
* harbour/source/rdd/dbfcdx/dbfcdx1.c
! fixed casting for indexes with key length greater then 196 bytes
created on tables with record number smaller then 256. In such
case if keys have equal value then after decoding they may need
more then 32767 bytes and casting to SHORT gives negative indexes.
Thanks to Saulius Zrelskis for example.
"Przemysław Czerpak" <dru...@poczta.onet.pl> wrote:
> On Tue, 29 Nov 2011, Raul Almeida wrote:
>
> Hi,
>
[...]
> BTW why your applications are killed?
> What does it mean: "users processes that freeze"?
> Why they are frozen? What used to happen in such cases?
>
>
> best regards,
> Przemek
Raul, what is your operating system?
If it is NetWare, then that is the reason.
Regards,
Marek Horodyski
It's HPUX.
Regards.
It's funny, actually I work for the same company as Sandro Freire
and Luiz Culik.
We already intercept some signals to do a "CLOSE ALL" before the
process terminates. But I know that in some situations processes are
being killed with a SIGKILL signal. I will try to change our internal
processes so that no process is killed this way.
As for the processes that freeze, there are situations where our users
complain that their applications are frozen. They call our support
team to kill their terminal sessions. I need to investigate these
cases further. I used to think that these situations were rare and
that the application might have entered an infinite loop or something,
but I recently watched a process that had frozen in an fcntl system
call on a CDX index file. I can't say why that happened.
I have already identified that when there is a lot of contention on a
CDX file, for instance many reports being generated on the same table,
it starves the processes that are trying to update the index file. We
are trying to eliminate this by implementing a POSIX semaphore feature
inside the DBFCDX code, so that when some process tries to get a write
(exclusive) lock on the index it signals the semaphore, and the other
processes that would be getting a read (shared) lock on the index file
go to sleep.
I don't know if you have ever run into this situation. It can be
reproduced by writing two programs that use the same indexed table:
one just goes "dbskipping" through the table, the other writes to the
table in a way that forces the index to be updated. You first start
the program that updates the index, then you start several instances
of the program that reads the index. At some point (when I had started
about 20 "index reader" instances on our development machine) they
prevent the "index writer" program from getting an exclusive lock on
the index, so it sleeps for as long as those "readers" are actively
reading the index file.
Regards,
Raul Almeida
Hi Raul,
> It's funny, actually I work for the same company that Sandro Freire
> and Luiz Culik do.
>
> We already intercept some signals for do a "CLOSE ALL" before the
> process terminate. But I know that in some situations processes are
> being killed with a SIGKILL signal. I will try to change our internal
> processes so no process will be killed this way.
You do not have to call CLOSE ALL. You can simply QUIT. CLOSE ALL
is executed automatically in that case. The only important thing is
that the index update operation is not interrupted.
> Well, about those process that freeze, I have some situations where
> our users complains that their application are frozen. They call to
> our support team for kill their terminal sessions. I need to do more
> investigation about these cases. I used to think that these situations
> were rare and that application might had entered in a infinite loop or
> something, but I recently watched a process that had freeze on a fcntl
> system call on a cdx index file. But I can't say why that happened.
fcntl() is used to lock the index, so it means that the process was
blocked by some other one which was holding an active lock. It's also
possible that you exceeded some system resource, e.g. the total number
of FCNTL locks in the system. Please check your OS configuration to see
how many locks can be set by a single process and by all processes, and
if necessary raise the existing limits. Each file used by RDD needs one
lock to emulate DOS/Windows DENY flags. Then each index needs one or
two locks for read/write operations, each memo file needs one lock, and
each DBF file needs one lock for header locking in update operations
plus the number of RLOCKs set concurrently (FLOCK is like a single RLOCK).
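As a rough worked example of the lock accounting described above, the per-process figure can be estimated with a few additions. The struct and function names are hypothetical, and the result is only a pessimistic upper bound: it assumes two locks per index and that all the locks are held at the same moment.

```c
/* Hypothetical description of what one process has open. */
struct rdd_usage {
    unsigned tables;   /* open DBF files */
    unsigned indexes;  /* open index files */
    unsigned memos;    /* open memo files */
    unsigned rlocks;   /* record locks held at once (an FLOCK counts as 1) */
};

/* Pessimistic upper bound on fcntl locks held by one process. */
unsigned long locks_upper_bound(const struct rdd_usage *u)
{
    unsigned long files = (unsigned long) u->tables + u->indexes + u->memos;

    return files                 /* one lock per file for DENY flag emulation */
         + 2UL * u->indexes      /* up to two locks per index for read/write */
         + u->memos              /* one lock per memo file */
         + u->tables             /* one header lock per DBF during updates */
         + u->rlocks;            /* concurrently held RLOCKs */
}
```

For a process with 10 tables, 10 indexes, 2 memo files and 5 record locks this gives 22 + 20 + 2 + 10 + 5 = 59 locks, so about 1000 such processes could need up to 59000 locks system-wide. That total is the number to compare against the OS limits.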
> I already identified that when there are much competition on a cdx
> file, many reports being generate on the same table for instance, it
> causes starve on process that are trying to update the index file. We
> are trying to eliminate this implementing a posix semaphore feature
> inside dbfcdx code, so that when some process try to get a write lock
> (exclusive) on dbfcdx it sign the semaphore for other processes that
> will be geting a read lock (shared) on index file get into sleep
> state.
It's a classic starvation effect. It may happen when the kernel does
not try to block readers when a writer sets locks. I have no idea if
this can be tuned in HPUX. If not, then you need a different locking
scheme, but I'm not sure using IPC for it is a good idea. I do not know
the details of your implementation, but usually in such a case you
should disable the existing locks and use IPC only instead. Otherwise
there is a risk of creating deadlocks. Anyhow, I do not see how IPC can
help in such a case. It would have to mean that the HPUX kernel uses
different scheduling algorithms for processes waiting on IPC semaphores
and for processes waiting on FCNTL locks, which seems strange to me.
Can you present the exact locking scheme you implemented?
> I don't know if you ever got into this situation. It can be
> replicated by writing two programs that use the same indexed table,
> one is going to be just "dbskiping" on the table, the other will be
> writing to table in a way that index has to be updated. You first
> start the program that will update the index, then you start several's
> instances of program that will be reading the index. It will be a
> moment, when I started about 20 "index reader" instances on our
> development machine, that they will prevent the "index writer"
> program to get a exclusive lock on index so it will sleep for the time
> those "readers" are active reading the index file.
So it looks like your system does not block new readers setting a
shared lock when a writing process is waiting for an exclusive lock.
If this system behavior cannot be tuned then you have to use a
different locking scheme which eliminates the problem. I guess that at
least exclusive (write) locks are implemented as a simple queue, so
each process can eventually set such a lock, and the trivial solution
is disabling shared locks in DBFCDX (and in the memo drivers if you
are using them). It's not a very efficient solution on a multi-CPU
machine if many reading processes access the same index concurrently,
because only one of them can hold the lock at a time, but it may be
enough for you. You can easily test it. If it is not, then you may try
to implement a more advanced lock scheme which allows many concurrent
readers using shared locks which periodically switch to exclusive
locks to give the writer a chance to update the index.
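The "more advanced" scheme mentioned above, readers that normally take shared locks but periodically take an exclusive one, could be sketched as follows. This is only an illustration under stated assumptions and not the DBFCDX implementation: struct index_lock, index_lock_read() and index_unlock() are hypothetical names, the lock covers the whole file for simplicity, and the descriptor is assumed to be opened read/write (an exclusive fcntl lock needs write access).

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical per-index state; not a Harbour structure. */
struct index_lock {
    int      fd;          /* index file descriptor, assumed opened O_RDWR */
    unsigned reads;       /* read locks taken since the last exclusive one */
    unsigned excl_every;  /* every N-th read lock is exclusive; 0 disables */
};

/* Set or clear a blocking whole-file lock, retrying if a signal interrupts. */
static int set_lock(int fd, short type)
{
    struct flock fl;
    int rc;

    memset(&fl, 0, sizeof(fl));
    fl.l_type   = type;          /* F_RDLCK, F_WRLCK or F_UNLCK */
    fl.l_whence = SEEK_SET;
    fl.l_start  = 0;
    fl.l_len    = 0;             /* 0 means "up to the end of the file" */
    do {
        rc = fcntl(fd, F_SETLKW, &fl);
    } while (rc == -1 && errno == EINTR);
    return rc;
}

/* Take a lock for reading. Normally shared, but every excl_every-th call
 * asks for an exclusive lock, so this reader must queue like a writer.
 * Returns the lock type taken (F_RDLCK or F_WRLCK), or -1 on error. */
int index_lock_read(struct index_lock *il)
{
    short type = F_RDLCK;

    if (il->excl_every != 0 && ++il->reads >= il->excl_every) {
        il->reads = 0;
        type = F_WRLCK;
    }
    return set_lock(il->fd, type) == 0 ? type : -1;
}

/* Release whichever lock index_lock_read() took. Returns 0 on success. */
int index_unlock(struct index_lock *il)
{
    return set_lock(il->fd, F_UNLCK);
}
```

With excl_every set to 1 every read lock is exclusive, which matches the "trivial solution" of disabling shared locks. Larger values trade reader concurrency against how long a writer may have to wait. Whether this actually removes the starvation depends on how the kernel queues exclusive requests, so it would need testing on the target system.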
best regards,
Przemek
> You do not have to call CLOSE ALL. You can make simple QUIT. CLOSE ALL
> is executed automatically in such case. Import is only that index updated
> operation is not interrupted.
That is ok. I'll adjust.
> fcntl() is used to lock the index so it means that the process was
> stopped by some other one which were keeping active lock. It's also
> possible that you exceed some system resources, i.e. total number of
> FCNTL locks in system. Please check your OS configuration and check
> how many locks can be set by single and by all processes and if
> necessary extend existing limits. Each file used by RDD needs one
> lock to emulate DOS/Windows DENY flags. Then each index needs one or
> two lock for read/write operation, each memo file needs one lock and
> each DBF file needs one lock for header locking in update operations
> plus number of RLOCKs set concurrently (FLOCK is like a single RLOCK).
I think the first hypothesis is the most likely. The system's maximum
file locks parameter is set to a very high number. I have already
looked at it and its utilization is almost always below 50%.
> It's a classic starvation effect. It may happen when kernel does not
> try to block readers when writer set locks. I have no idea if this
> can be tuned in HPUX. If not then you need different locking scheme
> but I'm not sure using IPC for it is good idea. I do not know details
> of your implementation but usually in such case you should disable
> existing locks and use IPC only instead. Otherwise it's a risk of
> creating deadlocks. Anyhow I do not see how IPC can help in such case.
> It has to mean that HPUX kernel uses different scheduling algorithms
> for processes waiting on IPC semaphores and for processes waiting on
> FCNTL locks what seems to be strange for me. Can you present the
> exact locking scheme you implanted?
The "solution" consists of creating a semaphore pair for each table a
process has open. One semaphore indicates read locks, the other
indicates write locks. Before a process sets a read lock it first waits
on the write lock semaphore, then it updates the read lock semaphore,
then it does the read. Before a process sets a write lock it first
updates the write lock semaphore, then waits on the read semaphore,
then it does the write.
This algorithm was implemented in the hb_cdxIndex(Un)Lock(Read|Write)
functions, but the first implementation failed because the people who
implemented it didn't create the semaphores with the right access
permissions and placed the algorithm in the wrong places inside the
hb_cdxIndex(Un)Lock(Read|Write) functions, causing a deadlock when, for
instance, an index was opened in exclusive mode.
> So looks that your system does not block new readers setting shared
> lock when writing process is waiting for exclusive lock. If this
> system behavior cannot be tuned then you have to use different
> locking scheme which can eliminate this problem. I guess that at
> least exclusive (write) lock are implemented as simple queue so
> each process can set such lock so the trivial solution is disabling
> shared locks in DBFCDX (and memo drivers if you are using them).
> Anyhow it's net very efficient solution on multi CPU machine if
> many read processes access concurrently the same index because only
> one of them can set the lock anyhow it may be enough for you.
> You can easy test it. If not then you may try to implement some more
> advanced lock scheme which allows to use concurrently many readers
> using shared locks which periodically switch ti exclusive locks to
> give writer a chance for updating the index.
Indeed, our production server is a 16-core machine, but I think your
solution is much cleverer than ours.
I naively imagined that I could put an unsigned integer counter on
read locks and check it so that after a certain number of cycles the
process takes an exclusive lock. Could this work?
Thank you again.
Regards,
Raul Almeida.
Hi Raul,
> The "solution" consist on creating a semaphore pair for each process's
> open table. One semaphore indicates read locks, other indicates write
> locks. Before a process put a read lock it first wait on write lock
> semaphore, then it updates read lock semaphore, then it do the read.
> Before a process put a write lock it first updates write lock
> semaphore, then wait on read semaphore, then it do the write.
>
> This algorithm was implemented on hb_cdxIndex(Un)Lock(Read|Write)
> functions but the first implementation failure because people who
> implemented it didn't create semaphores with right access permissions
> and had placed the algorithm in wrong places inside
> hb_cdxIndex(Un)Lock(Read|Write) functions, causing deadlock when index
> were opened in exclusive mode for instance.
Reading your description I do not understand why you need two locks.
The above text suggests that one exclusive lock is enough.
Can you send me the exact code of the hb_cdxIndex(Un)Lock(Read|Write)
functions so I can see the exact locking order?
BTW what happens when your application is killed while holding an
active IPC lock? Who unlocks it?
best regards,
Przemek
Please check your mailbox.
Thanks.
> Anyhow this xHarbour version cannot be used to access the same files
> simultaneously in aliased WA or by RDD from different threads in MT
> mode so if you are creating MT programs or access the same files in
> aliassed WAs then it can be the source of your problems.
We don't use MT programs, but I would like you to tell me more about
this problem with the same file in aliased WAs.
I don't believe we open exactly the same file with different aliases,
but I know we do open the same file with different paths.
We archive old data to a "years before" area in a different
path, and some programs open the current-date table and the "years
before" table concurrently with different aliases.
Can this be a problem?
Regards,
Raul Almeida
Hi,
> We don't use MT programs, but I like if you could tell me more about
> this same file in aliassed WAs problem.
> I don't believe we open the exactly same file with different alias,
> but I know we do open same file with different paths.
If this is exactly the same file (the same inode pointed to by
different soft or hard links) then the kernel recognizes it as a single
file, and locks set on different descriptors are shared between
workareas in the same process and can be overwritten, so it's
definitely not safe to use such a file in different WAs without
additional synchronization code, which is not present in the xHarbour
version you are using.
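Whether two paths really name "exactly the same file" in the sense above can be checked by comparing device and inode numbers. A minimal sketch, where same_file() is a hypothetical helper and not part of Harbour or xHarbour:

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/stat.h>

/* Returns 1 if both paths resolve to the same inode on the same device,
 * 0 if they are different files, and -1 if either path cannot be examined.
 * stat() follows symbolic links, so soft and hard links are both covered. */
int same_file(const char *path_a, const char *path_b)
{
    struct stat sa, sb;

    if (stat(path_a, &sa) != 0 || stat(path_b, &sb) != 0)
        return -1;
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

Two tables that merely share a name in different directories, such as a current table and its archived copy, would return 0 here.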
> We do archiving on old data to a "years before" area in a different
> path and some program do open current date table and "years before"
> table concurrently with different alias's.
If I understand you correctly then these are different files, so it
should not be a problem.
best regards,
Przemek
Hi Raul,
> Indeed our production server is a 16 cores machine, but I think your
> solution is much clever then ours.
> I naively imagined that I could put a unsigned integer counter on
> reads locks then check it for a certain number for cycles to do a
> exclusive lock. Could this work?
Nearly. I committed an exact implementation to Harbour SVN today.
best regards,
Przemek