Extract a list of columns?

1,239 views
Skip to first unread message

Matt

unread,
Apr 22, 2009, 6:49:52 PM4/22/09
to sqlparse
Can sqlparse 0.10 be used to extract a list of columns and details
(type, width, nullability, ...) from a CREATE TABLE statement?

Andi Albrecht

unread,
Apr 23, 2009, 2:45:32 AM4/23/09
to sqlp...@googlegroups.com
Hi Matt,

the current release (0.1.0) doesn't yet support methods to analyze
statements. But this is something that will go into some upcoming
version.

Anyway, you can retrieve this information by using low-level
functions. Here's an example:

# --- SNIP
#!/usr/bin/env python

import sqlparse

SQL = """CREATE TABLE foo (
id integer primary key,
title varchar(200) not null,
description text
);"""


parsed = sqlparse.parse(SQL)[0]

# extract the parenthesis which holds column definitions
par = parsed.token_next_by_instance(0, sqlparse.sql.Parenthesis)


# Helper function, that splits the column definitions
def extract_definitions(token_list):
# assumes that token_list is a parenthesis
definitions = []
# grab the first token, ignoring whitespace
token = token_list.token_next(0)
tmp = []
while token and not token.match(sqlparse.tokens.Punctuation, ')'):
tmp.append(token)
idx = token_list.token_index(token)
# grab the next token, this times including whitespace
token = token_list.token_next(idx, skip_ws=False)
# split on ","
if (token is not None # = end of statement
and token.match(sqlparse.tokens.Punctuation, ',')):
definitions.append(tmp)
tmp = []
idx = token_list.token_index(token)
token = token_list.token_next(idx)
if tmp and isinstance(tmp[0], sqlparse.sql.Identifier):
definitions.append(tmp)
return definitions


columns = extract_definitions(par)

for column in columns:
print 'NAME: %-12s DEFINITION: %s' % (column[0],
''.join(str(t) for t in column[1:]))

# --- SNIP

Internally the column definitions are not properly tagged ATM. For
example the definition for the "title" column is represented as this
(omitting the whitespaces):

- Identifier 'title'
- Builtin 'varchar'
- Parenthesis '(200)'
- Keyword 'not'
- Keyword 'null'

Note that both "Builtin varchar"/"Parenthesis (200)" and the two
keywords are not recognized as single items. When a top-level function
for this is implemented it may look something like this:

>>> stmt = sqlparse.parse(SQL)[0]
>>> stmt.get_type()
CREATE
>>> stmt.get_columns()
[<Column 'id'>, <Column 'title'>, <Column 'description']
>>> col = stmt.get_column('title')
>>> col.name
'title'
>>> col.notnull
True

BTW, I've file an issue on the tracker as a reminder to myself:
http://code.google.com/p/python-sqlparse/issues/detail?id=5

Andi
Reply all
Reply to author
Forward
0 new messages