Cheers,
Will
There's not a whole lot to it, frankly, assuming you've created a
sqlalchemy Engine:
df = DataFrame.from_records(engine.execute(sql))
any post-processing / adding of indexes will be up to you. e.g. you could do:
indexed_df = df.set_index(index_column_list)
here engine is an Engine and sql is some string with an SQL query
- Wes
Sorry, shooting from the hip here (or have had too much coffee).
Here's code that may actually work (forgot that SQLAlchemy returns a
ResultProxy object that pandas doesn't understand):
result = engine.execute(sql)
rows = [tuple(x) for x in result]
data = DataFrame.from_records(rows, columns=result.keys())
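Putting it together, something like the following should work end to end
(just a sketch: the in-memory SQLite database and the "people" table are
only placeholders so the snippet is self-contained, and it assumes an
SQLAlchemy version where Engine.execute is available; substitute your own
engine and query):

from pandas import DataFrame
from sqlalchemy import create_engine

# throwaway in-memory database purely so the example runs on its own
engine = create_engine('sqlite:///:memory:')
engine.execute('CREATE TABLE people (id INTEGER, name TEXT, age INTEGER)')
engine.execute("INSERT INTO people VALUES (1, 'Ann', 34), (2, 'Bob', 29)")

sql = 'SELECT id, name, age FROM people'
result = engine.execute(sql)               # ResultProxy
rows = [tuple(x) for x in result]          # plain tuples pandas understands
df = DataFrame.from_records(rows, columns=result.keys())

# any post-processing is up to you, e.g. index by one of the columns
indexed_df = df.set_index('id')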
- W
On Dec 12, 9:29 pm, Wes McKinney <wesmck...@gmail.com> wrote:
> On Mon, Dec 12, 2011 at 4:19 PM, Wes McKinney <wesmck...@gmail.com> wrote:
> > On Thu, Dec 8, 2011 at 8:06 AM, Wouter Overmeire <loda...@gmail.com> wrote:
> >> 2011/12/8 Will Furnass <willfurn...@gmail.com>
With an ORM-based query, I will get a list of user-defined objects
mapping (roughly) to rows of database tables; the data are held in
these objects' attributes (names corresponding to columns). One added
complication is that sometimes the data I care about are 'a couple of
tables away', so I need to catch things like
"qry_result[0].some_attr.data_from_another_table".
The best I have come up with so far is pasted below. Does this seem
like a common enough use case to merit an alternative to
DataFrame.from_records() eventually? I'd be happy to look into the
pandas core if given some pointers.
Best,
Hans-Martin
def getattr_recursively(object, name):
    """Same as getattr() built-in, but first split *name* into parts
    and descend into object hierarchy.

    Return *None* if any of the attributes along the way is *None*.
    """
    current_object = object
    for n in name.split('.'):
        try:
            current_object = getattr(current_object, n)
        except AttributeError as err:
            if current_object is None:
                break
            else:
                raise err
    return current_object
columns = ('age_id', 'is_working.value')
list_of_lists = [[getattr_recursively(row, col) for col in columns]
                 for row in qry_result]
data = DataFrame.from_records(list_of_lists, columns=columns)
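For reference, the sort of mapping I have in mind looks roughly like this
(a minimal sketch with made-up table and attribute names: age_id lives on
the object itself, while is_working.value sits one relationship away):

from sqlalchemy import Column, ForeignKey, Integer
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship

Base = declarative_base()

class WorkStatus(Base):
    __tablename__ = 'work_status'
    id = Column(Integer, primary_key=True)
    value = Column(Integer)

class Person(Base):
    __tablename__ = 'person'
    id = Column(Integer, primary_key=True)
    age_id = Column(Integer)
    work_status_id = Column(Integer, ForeignKey('work_status.id'))
    is_working = relationship(WorkStatus)   # row.is_working.value

# qry_result would then come from something like
# qry_result = session.query(Person).all()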
Having only made very basic use of SQLAlchemy I can't immediately
comment, but I imagine you can inspect each value in result.keys(), and
if it's a nested record (a row in another table) you presumably get
another ResultProxy? So if you wanted a "flattened DataFrame" from the
SQLAlchemy results, that would probably be the way to go. Performance
won't be all that great for large data sets, but then you might not be
using SQLAlchemy if that were a major concern.
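If a flattened frame is really all that's needed, another option (just a
sketch reusing the recipe from earlier in the thread, with engine and
DataFrame set up as above and table/column names made up to match your
example) is to do the join in SQL so the rows already come back flat:

sql = """
    SELECT person.age_id, work_status.value AS is_working_value
    FROM person
    JOIN work_status ON work_status.id = person.work_status_id
"""
result = engine.execute(sql)
rows = [tuple(x) for x in result]
data = DataFrame.from_records(rows, columns=result.keys())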
- Wes