What is the best way to declare a table with 260 columns and add rows on that table

16 views
Skip to first unread message

janio mendonca junior

unread,
Feb 23, 2022, 3:42:03 PM2/23/22
to sqlal...@googlegroups.com
Hi all,

I have a inquiry from my job to create 2 tables related one-to-one and insert some rows on the table. I have a .CSV with the data dictionary from the table and I am wondering to know how to declare the tables columns automatically without write one by one column (there are 260 columns). Same thing for the insert, how to add rows to the multiple columns table without write column by column?

I have the data in a Data frame but I was not able to insert it using df.to_sql from pandas. Do you guys have any similar example?

Thank you all

Simon King

unread,
Feb 23, 2022, 3:52:30 PM2/23/22
to sqlal...@googlegroups.com
Build a list of Column objects from the columns in the CSV file, and
use that list to create a Table:

https://docs.sqlalchemy.org/en/14/core/metadata.html

Once you've created the Table, you can insert data into it using the
table.insert() method:

https://docs.sqlalchemy.org/en/14/core/tutorial.html#executing-multiple-statements

Hope that helps,

Simon
> --
> SQLAlchemy -
> The Python SQL Toolkit and Object Relational Mapper
>
> http://www.sqlalchemy.org/
>
> To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description.
> ---
> You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/sqlalchemy/CADF7wwb0_ncRuU_CadqFegE9583W-xWWD4x%3DGy8V%3DgW0jKtcyg%40mail.gmail.com.

janio mendonca junior

unread,
Feb 23, 2022, 4:13:50 PM2/23/22
to sqlal...@googlegroups.com
Hi Simon,

Thank you for your help. I am brand new working with SQLalchemy, really appreciate if you explain how to generate the metadata with the list of column names from the .CSV to create the tables?

Simon King

unread,
Feb 24, 2022, 4:01:10 AM2/24/22
to sqlal...@googlegroups.com
Before we do that, you said that you tried pandas dataframe.to_sql but
it didn't work - can you explain what you mean? Did it raise an error,
or produce the wrong result, or something else?

Simon

On Wed, Feb 23, 2022 at 9:13 PM janio mendonca junior
> To view this discussion on the web visit https://groups.google.com/d/msgid/sqlalchemy/CADF7wwYLD-OrxU9-Eq5mnSMoCBJ-EcFEcD%3DVXUEJJNLCx_dOjw%40mail.gmail.com.

janio mendonca junior

unread,
Feb 24, 2022, 8:22:21 AM2/24/22
to sqlal...@googlegroups.com
Hi Simon,
let me try to explain myself.

The picture below is a sample of the 260 fields from the table res_treb data_dictionary.csv that I have to create. 
Screen Shot 2022-02-24 at 7.28.33 AM.png

I have started trying to create the table by defining the class table, and defining the fields, however I noticed that will take so long because I have two more tables to define with the same number of fields. Below is the sample code that I am manually defining (column by column) in the Res table.

Screen Shot 2022-02-24 at 7.36.59 AM.png
Screen Shot 2022-02-24 at 7.44.42 AM.png

I was thinking about something like your suggestion to read the DBName column from .csv, create a list and define the table columns, however I don't know how to do it. I know how to create the list with a DBName column, but how to make the attribution, for example: area_code = Column(String(4))?

Second Approach

Regarding the df.to_sql from pandas I have found two problems, how to define the primary key and foreign key using the command below?
Screen Shot 2022-02-24 at 8.11.23 AM.png
I have tried to overcome this problem by using.Screen Shot 2022-02-24 at 8.12.08 AM.png
this is the problem when I try to insert using df.to_sql: I have a rowScreen Shot 2022-02-24 at 8.17.30 AM.png
and I am trying to insert on the table mls_treb if it doesn't exist
Screen Shot 2022-02-24 at 8.18.23 AM.png
I got this error that looks like it is related to the index but I could not figure it out. The field mls_number is declared as the primary key in my table mls_treb.

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1782, in _execute_context
    self.dialect.do_executemany(
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 729, in do_executemany
    cursor.executemany(statement, parameters)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/mysql/connector/cursor.py", line 670, in executemany
    return self.execute(stmt)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/mysql/connector/cursor.py", line 568, in execute
    self._handle_result(self._connection.cmd_query(stmt))
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/mysql/connector/connection.py", line 854, in cmd_query
    result = self._handle_result(self._send_cmd(ServerCmd.QUERY, query))
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/mysql/connector/connection.py", line 664, in _handle_result
    raise errors.get_exception(packet)
mysql.connector.errors.ProgrammingError: 1054 (42S22): Unknown column 'mls_number' in 'field list'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pandas/io/sql.py", line 1419, in to_sql
    raise err
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pandas/io/sql.py", line 1411, in to_sql
    table.insert(chunksize, method=method)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pandas/io/sql.py", line 845, in insert
    exec_insert(conn, keys, chunk_iter)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pandas/io/sql.py", line 762, in _execute_insert
    conn.execute(self.table.insert(), data)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1289, in execute
    return meth(self, multiparams, params, _EMPTY_EXECUTION_OPTS)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/sqlalchemy/sql/elements.py", line 325, in _execute_on_connection
    return connection._execute_clauseelement(
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1481, in _execute_clauseelement
    ret = self._execute_context(
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1845, in _execute_context
    self._handle_dbapi_exception(
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 2026, in _handle_dbapi_exception
    util.raise_(
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/sqlalchemy/util/compat.py", line 207, in raise_
    raise exception
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1782, in _execute_context
    self.dialect.do_executemany(
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 729, in do_executemany
    cursor.executemany(statement, parameters)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/mysql/connector/cursor.py", line 670, in executemany
    return self.execute(stmt)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/mysql/connector/cursor.py", line 568, in execute
    self._handle_result(self._connection.cmd_query(stmt))
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/mysql/connector/connection.py", line 854, in cmd_query
    result = self._handle_result(self._send_cmd(ServerCmd.QUERY, query))
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/mysql/connector/connection.py", line 664, in _handle_result
    raise errors.get_exception(packet)
sqlalchemy.exc.ProgrammingError: (mysql.connector.errors.ProgrammingError) 1054 (42S22): Unknown column 'mls_number' in 'field list'
[SQL: INSERT INTO mlstreb (mls_number, `0`) VALUES (%(mls_number)s, %(0)s)]
[parameters: ({'mls_number': 'mls_number', '0': 'C5405199'}, {'mls_number': 'active', '0': True}, {'mls_number': 'class_name', '0': 'RES'}, {'mls_number': 'active_date', '0': datetime.date(2022, 1, 3)})]
(Background on this error at: https://sqlalche.me/e/14/f405)

Simon King

unread,
Feb 25, 2022, 5:33:44 AM2/25/22
to sqlal...@googlegroups.com
I don't know enough about Pandas to explain your error, but I did
notice that the SQL that is failing is trying to insert into a table
called "mlstreb", but elsewhere in your code you refer to "mls_treb".
Could that be part of the problem?

To define a table from your CSV file, you could do something like this
(untested):

###############################
import csv
import sqlalchemy as sa

def get_columntype(column_definition):
if column_definition["DataType"] == "Character":
return sa.String(column_definition["MaximumLength"])
elif column_definition["DataType"] == "Decimal":
return sa.Numeric()
# etc.


def get_columns(csvfile):
with open(csvfile, "r") as f:
reader = csv.DictReader(f)
column_definitions = list(reader)

columns = []
for cd in column_definitions:
column = sa.Column(cd["DBName"], get_columntype(cd))
columns.append(column)
return columns

def create_table(metadata, tablename, csvfile):
columns = get_columns(csvfile)
return sa.Table(tablename, metadata, *columns)


metadata = sa.MetaData()
res_table = create_table(metadata, "res", "res_data_dictionary.csv")
mls_table = create_table(metadata, "mls", "mls_data_dictionary.csv")

engine = sa.create_engine("...")
metadata.create_all(engine)

###############################

"res_table" and "mls_table" in the code above are instances of
sqlalchemy.Table, which is part of "SQLAlchemy Core":

https://docs.sqlalchemy.org/en/14/core/metadata.html#sqlalchemy.schema.Table

If you want to use the ORM layer of SQLAlchemy, you can map classes to
those tables something like this:

https://docs.sqlalchemy.org/en/14/orm/mapping_styles.html#imperative-a-k-a-classical-mappings

import sqlalchemy.orm as asorm

mapper_registry = saorm.registry(metadata)

class Res:
pass

mapper_registry.map_imperatively(Res, res_table)

# Now Res will have properties for each of the columns from res_table

Hope that helps,

Simon


On Thu, Feb 24, 2022 at 1:22 PM janio mendonca junior
<jani...@gmail.com> wrote:
>
> Hi Simon,
> let me try to explain myself.
>
> The picture below is a sample of the 260 fields from the table res_treb data_dictionary.csv that I have to create.
>
>
> I have started trying to create the table by defining the class table, and defining the fields, however I noticed that will take so long because I have two more tables to define with the same number of fields. Below is the sample code that I am manually defining (column by column) in the Res table.
>
>
>
>
> I was thinking about something like your suggestion to read the DBName column from .csv, create a list and define the table columns, however I don't know how to do it. I know how to create the list with a DBName column, but how to make the attribution, for example: area_code = Column(String(4))?
>
> Second Approach
>
> Regarding the df.to_sql from pandas I have found two problems, how to define the primary key and foreign key using the command below?
>
> I have tried to overcome this problem by using.
> this is the problem when I try to insert using df.to_sql: I have a row
> and I am trying to insert on the table mls_treb if it doesn't exist
>
> To view this discussion on the web visit https://groups.google.com/d/msgid/sqlalchemy/CADF7wwbR5mSkPHJ-sRQJGKzp9i2Xk_thj4UhbRVhwSsZfYshkw%40mail.gmail.com.

janio mendonca junior

unread,
Feb 25, 2022, 7:38:51 AM2/25/22
to sqlal...@googlegroups.com
Hi Simon,

Thank you for your explanation.

I managed to create the table by using df.to_sql. My mistake was because I was using a serie to sql instead of a dataframe.



Reply all
Reply to author
Forward
0 new messages