appenddb error duing load

41 views
Skip to first unread message

Bill Dorsey

unread,
May 17, 2021, 9:35:38 AM5/17/21
to python-etl
Hello..

During my appenddb to oracle, i had a data issue with a child table.

Is there a way to tell petl to write the error-record to a flat file, and continue loading the good data?

Or better, capture the db error (ORA-02291), then run a different sql which would fix the issue, then reprocess the record?

As always, thanks in advance.


Brad Maggard

unread,
May 17, 2021, 8:40:21 PM5/17/21
to python-etl
"ORA-02291" is "Integrity Constraint,parent Key Not Found"

So I'm inferring that "had a data issue with a child table" means that you tried to insert children that had no parent.
 
In one proposed work-around, you are asking to be able to "run a different sql which would fix the issue, then reprocess".  I'm therefore assuming that your petl Table contains values for all required columns in both tables

Two ways to approach this:

1) fromdb(<parent table key colunns>).recordcomplement(<distinct parent table key columns from your petl Table>) joined back to all parent table columns from your petl Table and appenddb() that to the parent before the appenddb() to the child table

2) Instead of appenddb(), iterate over your petl Table with records() and process the parent, then the child, with e.g. SQLAlchemy

Bill Dorsey

unread,
May 24, 2021, 3:27:30 PM5/24/21
to python-etl
Thanks for the help.

Yea, I thought though options 1 and 2.   Both will work in my case.

It would be really cool if petl could isolate the bad record(s) based on the ora error, and keep loading the rest.   But of course, the is a request for enhancement and would end up in the pipeline of enhancements. 

Brad Maggard

unread,
May 25, 2021, 8:18:55 AM5/25/21
to python-etl
Fundamentally: petl's current implementation forces better design.  Having no information is better than having bad information.  Allowing only "good" records in a unit of work corrupts your database; e.g.:
  • Foreign key violation?  Invoices with only some of the line items.
  • Check constraint violation?  Database is missing all of the hazel-eyed folk (bad example: nobody but me likes or uses database constraints)
  • Resource problems on the target?  Now you have a random sample of rows committed.
Practically: .todb() and appenddb() process a unit of work.  There's no way to allow only "good" records in a unit of work.  The first "bad" record puts the transaction in a state where rollback is the only option.  
Reply all
Reply to author
Forward
0 new messages