Re: [python-etl] How can I deal with clob(big text) data in oracle

706 views
Skip to first unread message

Alistair Miles

unread,
Jan 4, 2017, 4:59:24 AM1/4/17
to pytho...@googlegroups.com
Hi there,

I don't have direct experience with Oracle myself so I'm afraid I won't be much help. 

There is a lot of debug logging that is done by the petl.io.db module which you could try outputting to inspect the SQL statements that petl is generating and executing, that might give some clues for people familiar with the Oracle SQL dialect.

Logging is done via the Python standard logging module. If you add something like:

import logging
logging.basicConfig(filename='petl.log', level=logging.DEBUG)

...to your script you should get some debug logging from petl.io.db. If that doesn't give you any clues, feel free to post the log file here.

Cheers,
Alistair

On Wednesday, January 4, 2017, 조수웅 <suun...@gmail.com> wrote:
Hi

I've been used this petl for postgresql. It's Good.

But I'm struggle to insert in oracle database.

I have some clob clumns and I want to load data but It's not work.

anybody help me?? 

this my 
import petl as etl
from sqlalchemy import create_engine
import cx_Oracle

ORACLE_CONNECT = "the connection string for target oracle"
engine = create_engine('the connection string for source oracle')
connection = engine.connect()

#connection = cx_Oracle.connect(ORACLE_CONNECT)

class CursorProxy(object):
     def __init__(self, cursor):
         self._cursor = cursor
     def executemany(self, statement, parameters, **kwargs):
         # convert parameters to a list
         parameters = list(parameters)
         # check clob
         print(statement)
         input_sizes = {}
         input_sizes[1] = cx_Oracle.CLOB
         #for row in list(parameters):
             #print(row)
             #for k, v in row.items():
             #    if isinstance(v, basestring) and len(v) > 2000:
             #        input_sizes[k] = cx_Oracle.CLOB

         self._cursor.setinputsizes(**input_sizes)
         # pass through to proxied cursor
         return self._cursor.executemany(statement, parameters, **kwargs)
     def __getattr__(self, item):
         return getattr(self._cursor, item)

def get_cursor():
     return CursorProxy(cx_Oracle.connect(ORACLE_CONNECT).cursor())

table = etl.fromdb(connection, 'SELECT ID , BIG_TEXT FROM RESEARCH WHERE ROWNUM <= 10')

table = etl.rename(table,{'id': 'ID', 'big_text': 'BIG_TEXT'})
etl.todb(table, get_cursor(), 'TESTAAAAA', commit=True, dialect='oracle')

--
You received this message because you are subscribed to the Google Groups "python-etl" group.
To unsubscribe from this group and stop receiving emails from it, send an email to python-etl+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


--
Alistair Miles
Head of Epidemiological Informatics
Centre for Genomics and Global Health <http://cggh.org>
The Wellcome Trust Centre for Human Genetics
Roosevelt Drive
Oxford
OX3 7BN
United Kingdom
Email: alim...@googlemail.com
Web: http://purl.org/net/aliman
Twitter: https://twitter.com/alimanfoo
Tel: +44 (0)1865 287721

조수웅

unread,
Jan 5, 2017, 12:01:14 AM1/5/17
to python-etl
Thank you for your help.

But I need more.

Sorry. 
Please, Look at my code
class CursorProxy(object):
     def __init__(self, cursor):
         self._cursor = cursor
     def executemany(self, statement, parameters, **kwargs):
         # convert parameters to a list
         parameters = list(parameters)
         # check clob
         #print(statement)
         
         abc = [None,cx_Oracle.CLOB]
         self._cursor.setinputsizes(*abc)
         # pass through to proxied cursor
         return self._cursor.executemany(statement, parameters, **kwargs)
     def __getattr__(self, item):
         return getattr(self._cursor, item)


I changed CursorProxy class and It works.
But I hard coded the abc var part. In the oracle, I need to set cx_Oracle.CLOB in the right position which is more than 4000 char.
I want to know which column(or field) is going to be big string so I can set cx_Oracle.CLOB

Is there good way to know field type?

actually. In the oracle, data type is different(varchar2 is short string, clob is big string)

is there way to know the field type before the query or when I execute the fromdb.

Sorry for my english. But I love this petl and want to communicate.

Thank you. 


2017년 1월 4일 수요일 오후 6시 59분 24초 UTC+9, aliman 님의 말:

Alistair Miles

unread,
Jan 6, 2017, 10:52:49 AM1/6/17
to pytho...@googlegroups.com
Hi there,

Again I am not Oracle expert so may not give good advice. But it looks to me like you are using petl.todb() to insert some data into an Oracle database table. It also looks like that table already exists. I.e., you created this table yourself, at some previous time, not using petl (I assume by executing SQL statements directly). Therefore you already know the schema for the table you are loading into. If you know the schema, then you must already know the data type for all columns. Is this true?

Petl does not provide any tools for inspecting the schema of a table in an existing database. If you want to programmatically inspect a database schema and find out what data type is used for what column, I would suggest using SQLAlchemy.

Apologies if I have misunderstood anything.

Cheers,
Alistair

Reply all
Reply to author
Forward
0 new messages