Hi!
OK. So if I understand you correctly, you want to reserve query parameters solely for DBAPI driver connection parameters, and would therefore not accept a PR that implements something that changes that.
There are other reasons, though, why I was looking into this. In particular, what I am describing is already partly done by PyAthena: it uses at least two query parameters to tell where data is stored.
One (`s3_staging_prefix`) tells where query results are stored and fits nicely among the connection parameters.
The second (`s3_prefix`) tells where data should be stored when a table is created, and does not fit so well.
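For illustration, such query parameters travel in the database URL and get split out when the dialect builds the DBAPI connect arguments. A minimal stdlib sketch of that splitting (the URL scheme here mirrors PyAthena's, but the helper itself is hypothetical):

```python
from urllib.parse import parse_qs, urlsplit

def split_connect_args(url):
    """Hypothetical helper: separate a database URL from its query
    parameters, the way a dialect would when assembling DBAPI connect
    args. The s3_staging_prefix / s3_prefix names follow the discussion
    above and are assumptions about PyAthena's parameters.
    """
    parts = urlsplit(url)
    # parse_qs returns lists; keep the first value of each parameter.
    query = {k: v[0] for k, v in parse_qs(parts.query).items()}
    return parts._replace(query="").geturl(), query

base, params = split_connect_args(
    "awsathena+rest://athena.us-east-1.amazonaws.com/default"
    "?s3_staging_prefix=s3://bucket/staging/&s3_prefix=s3://bucket/data/"
)
# params now holds both storage locations, stripped from the base URL.
```

The point of the sketch: both parameters arrive through the same channel (the URL query string), even though only the first one is really a connection-level setting.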
In particular, DDL statement compilation blows up in your face. A statement like:
```python
Table('name', MetaData(), Column('c', Integer)).create(bind=engine)
```
fails with:
```
  File "~/pyathena/sqlalchemy_athena.py", line 313, in post_create_table
    raw_connection = table.bind.raw_connection()
AttributeError: 'NoneType' object has no attribute 'raw_connection'
```
I guess the storage location of a table does fit among the table dialect kwargs:
```python
Table('<name>', MetaData(), ..., awsathena_location='s3://...')
```
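With the location in the table's dialect kwargs, the DDL compiler could read it from the table itself instead of reaching for a connection. A minimal sketch of that idea (the `awsathena_location` kwarg name and this helper are assumptions, not PyAthena's actual API):

```python
def location_clause(dialect_kwargs):
    """Hypothetical helper: build the LOCATION clause of a CREATE TABLE
    statement from a per-table dialect kwarg rather than a connection
    parameter, so compilation no longer needs table.bind at all.
    """
    location = dialect_kwargs.get("awsathena_location")  # assumed kwarg name
    if location is None:
        raise ValueError(
            "awsathena_location must be set on the Table to compile its DDL"
        )
    return "LOCATION '%s'" % location

clause = location_clause({"awsathena_location": "s3://bucket/prefix/"})
```

The design point is that the failure mode changes: a missing location becomes an explicit error at compile time, instead of an `AttributeError` on an unbound table.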
Initially I thought it could be useful, e.g. when building ETL pipelines that move data around, to bind a table to its actual storage location as late as possible (so as to reuse a Table object).
But generally other bits of the table definition need to change too, such as the schema name. So there is no real benefit, and one has to create several Table objects anyway.
And using the connection there is just an unfortunate hack; that is an issue that should be addressed in PyAthena.
Thanks for your input; it helps in choosing the best fix for this.
Regards,
Nicolas