Need some help to avoid ConflictError

Jürgen Gmach

May 23, 2019, 4:13:29 PM
to zodb
Setting
- a software license management system based on Zope 2.13.29  / Python 2.7.15 / ZEO / ZODB3 = 3.10.5
- a one-off script which upgrades roughly 2000 software licenses, creates delivery notes (PDF), updates company data, creates build infos (for the software), and packs everything into a zip file (called a zil) for further processing
- the script gets triggered via the browser.

Test run on dev machine
On my dev machine the script ran without a problem.

First test run on staging
On staging the script ran to completion, then hit a ConflictError at the very end and started again right from the beginning (one request = one transaction).


2019-05-17T15:34:34 INFO ZPublisher.Conflict ConflictError at /CompanyCenter/perform_mass_delivery: database conflict error (oid 0x6c8a1d, class BTrees.OOBTree.OOBucket, serial this txn started with 0x03cfc5fb1c25c788 2019-05-17 12:43:06.597088, serial currently committed 0x03cfc6232ad84677 2019-05-17 13:23:10.041756) (1 conflicts (0 unresolved) since startup at Thu May 16 11:24:35 2019)
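
For reference, the two serials in that log line can be decoded by hand. This is a minimal sketch, assuming the usual ZODB transaction id layout (upper 4 bytes count minutes in a calendar with 31-day months starting in 1900, lower 4 bytes hold the seconds as a fraction of a minute); `tid_to_datetime` is a hypothetical helper name, not a ZODB API.

```python
import struct
from datetime import datetime, timedelta


def tid_to_datetime(tid_hex):
    """Decode a ZODB serial given as a hex string, e.g. '0x03cfc5fb1c25c788'."""
    raw = struct.pack(">Q", int(tid_hex, 16))
    minutes, fraction = struct.unpack(">II", raw)

    # Upper half: minutes, hours, days, months and years packed into one integer.
    minute = minutes % 60
    hours = minutes // 60
    hour = hours % 24
    days = hours // 24
    day = days % 31 + 1
    months = days // 31
    month = months % 12 + 1
    year = months // 12 + 1900

    # Lower half: seconds within the minute, scaled to the full 32-bit range.
    seconds = fraction * 60.0 / 2 ** 32
    return datetime(year, month, day, hour, minute) + timedelta(seconds=seconds)


if __name__ == "__main__":
    started = tid_to_datetime("0x03cfc5fb1c25c788")
    committed = tid_to_datetime("0x03cfc6232ad84677")
    print(started, committed, committed - started)
```

Decoded this way, the two serials match the timestamps printed in the log line, and the committed state of the bucket is roughly 40 minutes newer than the state this request had loaded.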


Second test run on staging
For the next test run, I put a transaction.commit() after each license upgrade.

After about half of the licenses, another ConflictError occurred, but this time only the affected license got retried; it worked, and then the rest of the licenses got upgraded as well.

At the same time the ConflictError occurred, a long-running cron job was also triggered via XML-RPC. It does not act on licenses but on calendar data, so I am unsure whether it caused the problem.


2019-05-17T17:03:11 INFO ZPublisher.Conflict ConflictError at /CompanyCenter/perform_mass_delivery: database conflict error (oid 0x6cc634, class BTrees.OOBTree.OOBucket, serial this txn started with 0x03cfc68726d9b288 2019-05-17 15:03:09.105558, serial currently committed 0x03cfc687300f5311 2019-05-17 15:03:11.264030) (5 conflicts (1 unresolved) since startup at Fri May 17 16:35:08 2019)


When I had a look at the oid, I got a BTrees.OOBTree.OOBucket. Without deeper knowledge, I had expected to get a business object, like a license or a company, that two transactions had tried to write to.

The bucket contains...
[('C2WGKES3NZBI', 'DeliveryNotePDF'), ('C2WGKES4VDVZ', 'BillingPdfPart'), ('C2WGKEWEYHYI', 'Licence'), ('C2WGKEWF6UNI', 'LicenseProfile'), ('C2WGKEWGVXYG', 'DeliveryNotePDF'), ('C2WGKEWIHBW4', 'BillingPdfPart'), ('C2WGKFA26DPJ', 'BillingPdfPart'), ('C2WGKFAY22TM', 'Licence'), ('C2WGKFAZMYZM', 'LicenseProfile'), ('C2WGKFAZVBHH', 'DeliveryNotePDF'), ('C2WGKFEFYQCI', 'Licence'), ('C2WGKFEGMMVU', 'LicenseProfile'), ('C2WGKFEGUFEK', 'DeliveryNotePDF'), ('C2WGKFEIFTFG', 'BillingPdfPart'), ('C2WGKFIS5NHJ', 'Licence')]


The first element of each tuple is a custom identifier.
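
The dump is sorted by identifier, which fits a BTree bucket being a sorted run of neighbouring keys rather than a single hash-table slot. Here is a minimal stdlib sketch of that idea, using the first entries of the dump; `bucket_get` is a hypothetical helper and not part of the BTrees package.

```python
from bisect import bisect_left

# First entries of the bucket dump above: (custom identifier, class name).
BUCKET = [
    ("C2WGKES3NZBI", "DeliveryNotePDF"),
    ("C2WGKES4VDVZ", "BillingPdfPart"),
    ("C2WGKEWEYHYI", "Licence"),
    ("C2WGKEWF6UNI", "LicenseProfile"),
    ("C2WGKEWGVXYG", "DeliveryNotePDF"),
]


def bucket_get(bucket, key):
    """Look up a key in a sorted list of (key, value) pairs by binary search."""
    keys = [k for k, _ in bucket]
    index = bisect_left(keys, key)
    if index < len(keys) and keys[index] == key:
        return bucket[index][1]
    raise KeyError(key)


if __name__ == "__main__":
    print(bucket_get(BUCKET, "C2WGKEWEYHYI"))
```

Since many unrelated objects share one such bucket, adding or removing any of them modifies the same persistent bucket object, which is presumably why the conflict is reported on the bucket and not on a single license.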

one off script

    def perform_mass_delivery(self):
        """This method gets triggered by the browser."""
        ziller_log.info("beginning zil generation")
        licenses = self.db.Licenses.get_licenses_for_update()
        self._process_all_licenses(licenses)
        return "Success!"

    def _process_all_licenses(self, licenses):
        ziller_log.info("licenses to be updated: %s" % len(licenses))
        for i, license in enumerate(licenses):
            ziller_log.info("about updating license no %s of %s" % (i, len(licenses)))
            self._try_zil_generation(license)
            # best place for transaction.commit?

    @staticmethod
    def _try_zil_generation(license):
        try:
            # a lot is going on in generate_zil_for_initial_delivery
            # including the zip file generation - which is not covered by the transaction
            file_name, successor = license.generate_zil_for_initial_delivery()
        except Exception:
            ziller_log.error("license: " + license.getId(), exc_info=True)
        else:
            ziller_log.info("license: " + license.getId() + " successfully updated")
            ziller_log.info("location of ziller file: %s" % file_name)
            ziller_log.info("successor: %s" % successor.getId())



Many questions... (as I really, really want to understand what's going on)

1. Why did the ConflictError in the first test run happen exactly at the end of the run? Coincidence?

2. Why do I get an OOBucket from an oid and not a business object - like a single license?

3. What exactly does the ConflictError mean? Problem when writing to the bucket or writing to a single business object?

4. Is it common that a bucket contains so many objects? I always thought a good "hash table" contains one or zero values for a given key.

5. Where exactly did the ConflictError occur? Unfortunately there is no line number in the traceback.

7. When I look at my code again, I also think that "except Exception" is a bad decision. Should I at least catch ConflictError and re-raise?

8. If the `except Exception` had caught the above ConflictError, then the log message would have to start with "license: ..." - but it does not.

The ConflictError gets logged as "INFO" - which line of my code triggered that message?

9. When a ConflictError occurs the transaction gets rolled back - but the created zil/zip file is still on disk. What is the best way to clean it up? In the except block?

10. When I redo the second test run under the same conditions, will the ConflictError occur at the very same license, or is this non-deterministic?

11. Why does it read "1 unresolved" in the second test run, when the license upgrade did get retried and finally succeeded?

12. Why do ConflictErrors even occur when I am the only user?

For the production license upgrade, I plan to:
- add a transaction.commit after each license upgrade
- deactivate all cron jobs
- deactivate nginx/haproxy, so I am the only user of the application (via lynx)
...

... any other hints/tips/improvement suggestions for the above code or how to proceed?

Thank you very much for your help!
Jürgen

P.S.: This was first posted on the Plone community board - but from there I got directed here.

Jürgen Gmach

Jul 8, 2019, 4:29:29 AM
to zodb

Some quick feedback to close this discussion.

I did the mass update on Friday night (I was not allowed to do it on Sunday because of the Working Hours Act), and it worked like a charm.

I did the following:

  • added a transaction.commit after each update
  • added an `except ConflictError` clause that re-raises the error
  • shut down Nginx so no concurrent requests could hit the server
  • temporarily deactivated the cron jobs which also act on the db
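
The first two points can be sketched roughly as follows. This is a stand-alone illustration and not the production code: `process_all`, `upgrade`, `commit` and `abort` are hypothetical names, and the local `ConflictError` class only stands in for `ZODB.POSException.ConflictError` so the sketch runs without ZODB. In a Zope setup, `commit` and `abort` would be `transaction.commit` and `transaction.abort`. Aborting after a failed upgrade is an assumption added here, so that half-finished changes are not committed together with the next license.

```python
import logging

log = logging.getLogger("ziller")


class ConflictError(Exception):
    """Stand-in for ZODB.POSException.ConflictError (assumption, see above)."""


def process_all(licenses, upgrade, commit, abort):
    """Upgrade every license in its own transaction; return the ones that failed."""
    failed = []
    for lic in licenses:
        try:
            upgrade(lic)
            commit()  # one transaction per license instead of one per request
        except ConflictError:
            # Do not swallow conflicts: re-raise so the caller can retry.
            raise
        except Exception:
            log.error("license: %s", lic, exc_info=True)
            abort()  # assumption: discard partial changes of this license
            failed.append(lic)
    return failed
```

With one commit per license, a conflict only costs the work of the license currently being processed, not the whole run.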