Am I doing this query of lots of columns correctly?

Mike Soultanian

unread,

Jun 25, 2020, 5:25:06 PM6/25/20

to sqlalchemy

I have a table with lots of columns (30) and I need to pull all columns and convert the selected rows to JSON and send it to my server (and I want the column names in the JSON so the server can easily figure out which value is which). I first tried this:

results = session.query(RecoveryLogEntries).limit(record_count)

I did this thinking it would give me everything and I could pull the column names and build my JSON, but that didn't seem to work because "results.column_descriptions" isn't available for some reason.

Then I tried this (provided every column):

results = session.query(RecoveryLogEntries.id, RecoveryLogEntries.recovery_instance_uuid, etc).limit(record_count)

After that I did the following:

columns = [desc['name'] for desc in results.column_descriptions]

recovery_data_list = []

for row in results:

row_dict = {}

for column_name in columns:

print('column name:', column_name)

row_dict[column_name] = getattr(row, column_name)

recovery_data_list.append(row_dict)

If I name each column the above code works, if I don't, I've learned it doesn't (only took me like 5 hours to figure that out!!).

So, my question: is it generally better practice to name every column that you want to pull, even if it's a long list? Also, why does using just RecoveryLogEntries instead of naming each column yield a different result? It seems weird because in the SQL world, I could do a "SELECT *" or "SELECT id, ..." and the output is still in the same format regardless of whether I explicitly name name each column or use * to select all columns. It just seems like it's a whole bunch of typing which could be error-prone. I'll do it if I need to, but what I'm really asking is what is the most appropriate/accepted/standard way to do this.

btw, SQLAlchemy rocks! It's a headfull, but I'm digging it!

thanks!

Mike

Jonathan Vanasco

unread,

Jun 26, 2020, 1:49:21 PM6/26/20

to sqlalchemy

I use a mixin class to handle this stuff. Example below.

> So, my question: is it generally better practice to name every column that you want to pull, even if it's a long list?

Not really. There is a "bundle" api here that might be better for you- https://docs.sqlalchemy.org/en/13/orm/loading_columns.html

> Also, why does using just RecoveryLogEntries instead of naming each column yield a different result?

One is querying a "table", the other is querying a list of items. The queried items could be a "table", "table column", "subquery column" or several other things.

> It seems weird because in the SQL world, I could do a "SELECT *" or "SELECT id, ..." and the output is still in the same format regardless of whether I explicitly name name each column or use * to select all columns.

A lot of people approach SqlAclhemy like they are writing SQL. IMHO, a better approach is to remember that SqlAlchemy lets you write Python that generates Sql for multiple dialects.

> It just seems like it's a whole bunch of typing which could be error-prone. I'll do it if I need to, but what I'm really asking is what is the most appropriate/accepted/standard way to do this.

The more standard ways are to expect/inspect the types of objects that are queried and act upon it. The results are not text, but objects. If you inspect them, you can pull out the relevant information.

anyways, using a generic mixin (far below), I use the following code. I also sometimes have methods on my objects to return json that only has specific fields (such as `as_json_v1`, `as_json_v2`, etc)

class RecoveryLogEntries(Base, UtilityObjectMixin):
    # ... your class ...

# then...

results = session.query(RecoveryLogEntries).limit(record_count)
as_json = [r.loaded_columns_as_dict for r in results]

class UtilityObjectMixin(object):
 """see https://github.com/jvanasco/pyramid_sqlassist/blob/master/pyramid_sqlassist/objects.py#L127-L165"""


    def columns_as_dict(self):
        """
        Beware- this function will trigger a load of attributes if they have not been loaded yet.
        """
        return dict((col.name, getattr(self, col.name))
                    for col
                    in sa_class_mapper(self.__class__).mapped_table.c
                    )


    def loaded_columns_as_dict(self):
        """
        This function will only return the loaded columns as a dict.
        See Also: ``loaded_columns_as_list``
        """
        _dict = self.__dict__
        return {col.name: _dict[col.name]
                for col in sa_class_mapper(self.__class__).mapped_table.c
                if col.name in _dict
                }


    def loaded_columns_as_list(self, with_values=False):
        """
        This function will only return the loaded columns as a list.
        By default this returns a list of the keys(columns) only.
        Passing in the argument `with_values=True` will return a list of key(column)/value tuples, which could be blessed into a dict.
        See Also: ``loaded_columns_as_dict``
        """
        _dict = self.__dict__
        if with_values:
            return [(col.name, _dict[col.name], )
                    for col in sa_class_mapper(self.__class__).mapped_table.c
                    if col.name in _dict
                    ]
        return [col.name
                for col in sa_class_mapper(self.__class__).mapped_table.c
                if col.name in _dict
                ]

Jonathan Vanasco

unread,

Jun 26, 2020, 1:50:16 PM6/26/20

to sqlalchemy

that should be `loaded_columns_as_dict()` , unless you decorate the method with @property.

Reply all

Reply to author

Forward