On 06/24/2017 10:53 AM, Dave Vitek wrote:
> Hi all,
>
> I'll start with an example. Assume A and B are Table objects:
>
> >>> print select([A.id], from_obj=[A], whereclause=(B.field == 123))
> SELECT A.id FROM A, B WHERE B.id = 123
>
> As a convenience, sqlalchemy has added B to the FROM clause on the
> assumption that I meant to do this.
>
> However, in my experience, this behavior has hidden many bugs.
So yes, the implicit FROM behavior is for a very long time the ultimate
symptom of many other bugs, probably moreso in the ORM itself within
routines such as eager loading and joined table inheritance. A lot of
ORM bugs manifest as this behavior. However, as often as this is the
clear sign of a bug, it is actually not a bug in a whole bunch of other
cases when the "FROM comma list" is what's intended.
It isn't
> a feature we intentionally rely on. Usually, someone forgot a join and
> the query sqlalchemy invents is drastically underconstrained. These
> bugs sometimes go unnoticed for a long time. I want to raise an
> exception when this happens instead of letting the computer take a guess
> at what the human meant.
So to continue from my previous statement that "comma in the FROM
clause" is more often than not not what was intended, there is a large
class of exceptions to this, which is all those times when this *is*
what's intended.
Within the ORM, the most common place the FROM clause w/ comma is seen
is in the lazy load with a "secondary" table. In modern versions, the
two tables are added as explicit FROM, so your patch still allows this
behavior to work, though in older versions it would have failed:
from sqlalchemy import *
from sqlalchemy.orm import *
from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()
atob = Table(
'atob', Base.metadata,
Column('aid', ForeignKey('
a.id'), primary_key=True),
Column('bid', ForeignKey('
b.id'), primary_key=True)
)
class A(Base):
__tablename__ = 'a'
id = Column(Integer, primary_key=True)
bs = relationship("B", secondary=atob)
class B(Base):
__tablename__ = 'b'
id = Column(Integer, primary_key=True)
e = create_engine("sqlite://", echo=True)
Base.metadata.create_all(e)
s = Session(e)
s.add(A(bs=[B(), B()]))
s.commit()
a1 = s.query(A).first()
print
a1.bs
lazyload at the end is:
SELECT
b.id AS b_id
FROM b, atob
WHERE ? = atob.aid AND
b.id = atob.bid
So this works with your patch because that routine is using explicit
select_from(). However, there are a lot of other codepaths in the ORM
that still rely on implicit FROM and will fail with this change.
>
> I have modified _get_display_froms from selectable.py to return fewer
> FROM entities.
I am pleased that you have the motivation to dig in and work on this
code, and you've produced a quality patch that has taken the time to
understand how this works. So I think we can work here to possibly
come up with a behavioral adjustment though it's not certain yet.
So far, this seems to be working. Can you think of any
> pitfalls with this change?
It's always a good idea to run the tests to see. The
README.unittests.rst file has lots of information on how to do this but
for a deep change like this it's as easy as "python setup.py test".
There are lots of cases that break, many because you've asked that the
feature be "global opt in" and that isn't yet implemented, but also many
that would need to be fixed even if the "global opt in" is enabled.
I've pushed up your patch to gerrit here:
https://gerrit.sqlalchemy.org/#/c/441/
The builds there are both the main SQLAlchemy test suite as well as a
selection of test suites from the Openstack project, and in this case
both fail:
https://jenkins.sqlalchemy.org/job/sqlalchemy_gerrit/962/
https://jenkins.sqlalchemy.org/job/openstack_gerrit/824/
(note that the above links are only good for a few days as the builds
get automatically truncated).
It's not surprising that the SQLAlchemy suite has failures, though it
has a lot, but even the Openstack suite has failures and that is much
more surprising, because Openstack does not use the fancy mapping
patterns that we see failing in the main SQLAlchemy tests.
Within the SQLAlchemy suite, we have failures in the aaa_profiling
suite, which is testing that the number of Python calls in a select() is
limited to a known value; for these particular tests the guidelines are
pretty strict, so the change adds a small performance overhead, for example:
Adjusted function call count 145 not within 5.0% of expected 157,
platform 2.7_postgresql_psycopg2_dbapiunicode_cextensions
That's not a deal breaker as those tests are intended just to alert when
the performance profile of something changes. Additionally, you've
stated you would want this to be "opt-in", so one would be opting in to
that small performance hit in any case.
The patch causes many failures just in the Core tutorial, meaning there
are a bunch of basic SELECT examples that have been there for years that
don't work under strict mode:
OperationalError: (sqlite3.OperationalError) no such column:
users.fullname [SQL: u'SELECT users.fullname || ? ||
addresses.email_address AS title \nWHERE
users.id = addresses.user_id
AND
users.name BETWEEN ? AND ? AND (addresses.email_address LIKE ? OR
addresses.email_address LIKE ?)'] [parameters: (', ', 'm', 'z',
'%@
aol.com', '%@
msn.com')]
But those are end-user queries and if they had opted in to the alternate
behavior, they'd not be writing them that way, OK.
Then, lots of ORM-inheritance kinds of tests, CTE tests, and others
break with this change, and in many cases, the Core / ORM itself is
relying on implicit FROM taking place in an internal sense, so the
application that "globally opts in" here would still fail on all these
other cases, unless fixed. Many if not all of these breakages may be
fixable by reworking all the various logic to no longer make use of the
implicit FROM logic, however that would be lots of complexity and
destabilization to make that happen.
One example is, using the Company/Person/Engineer set of models that are
similar to those introduced at
http://docs.sqlalchemy.org/en/latest/orm/inheritance.html#joined-table-inheritance,
the following query fails:
sess.query(Person)
.with_polymorphic(
Engineer,
people.outerjoin(engineers))
.order_by(Person.person_id)
The above query names only one entity, Person, and does not outwardly
break any rules about adding additional FROM objects implicitly, however
the logic that triggers when we go to load additional columns that have
to do with Manager and Boss objects (another form of lazy load) renders
an invalid SELECT:
OperationalError: (sqlite3.OperationalError) no such column:
managers.manager_name [SQL: u'SELECT managers.manager_name AS
managers_manager_name, managers.status AS managers_status,
boss.golf_swing AS boss_golf_swing \nWHERE ? = managers.person_id AND
managers.person_id = boss.boss_id'] [parameters: (3,)]
The query it wants to emit is:
SELECT managers.status AS managers_status, boss.golf_swing AS
boss_golf_swing, managers.manager_name AS managers_manager_name
FROM managers, boss
WHERE ? = managers.person_id AND managers.person_id = boss.boss_id
So fixing cases like that means a lot of new work and testing, not to
mention that all of these cases must be tested in *two* entirely
different behavioral contracts, making this a much greater maintenance
and testing burden. Any time we fix a new bug in the area of SELECT
rendering in Core or ORM, we will often have to write twice as many
tests to make sure they work in "alternate FROM mode" as well.
Over in the Openstack suite, here's a Keystone test that fails:
keystone.tests.unit.test_backend_sql.SqlIdentity.test_list_users_with_unique_id_and_idp_id
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) no such
column: federated_user.user_id [SQL: u'SELECT user.enabled AS
user_enabled,
user.id AS user_id, user.domain_id AS user_domain_id,
user.extra AS user_extra, user.default_project_id AS
user_default_project_id, user.created_at AS user_created_at,
user.last_active_at AS user_last_active_at \nFROM user LEFT OUTER JOIN
local_user ON
user.id = local_user.user_id AND user.domain_id =
local_user.domain_id \nWHERE
user.id = federated_user.user_id AND
federated_user.unique_id = ? AND federated_user.idp_id = ?']
[parameters: ('9a5a4599df20426c9e9e6c86c8e36d3b', 'ORG_IDP')]
There's seven more in that variety in Keystone, looking at the query
they are testing it is:
query = session.query(model.User).outerjoin(model.LocalUser)
query = query.filter(
model.User.id == model.FederatedUser.user_id)
query = self._update_query_with_federated_statements(hints, query)
user_refs = sql.filter_limit_query(model.User, query, hints)
return [identity_base.filter_user(x.to_dict()) for x in user_refs]
We can guess that they are probably making use of the implicit FROM
feature within _update_query_with_federated_statements or
filter_limit_query(). Curious, I dug into Keystone to see what query
they intend to render; I thought hey, perhaps it would be funny if they
actually were getting a cartesian product here and your feature has
"found" it. Although, if that were the case, already I think this
points more to the kind of feature I'd prefer, which is that instead of
it just emitting the wrong SQL, it emits the SQL exactly as it does now
but emits a warning; that way, everyone gets to know about their poor
FROM listing and exactly why it is occurring, without having to
"globally opt in" to anything:
SELECT user.enabled AS user_enabled,
user.id AS user_id, user.domain_id
AS user_domain_id, user.extra AS user_extra, user.default_project_id AS
user_default_project_id, user.created_at AS user_created_at,
user.last_active_at AS user_last_active_at
FROM federated_user, user LEFT OUTER JOIN local_user ON
user.id =
local_user.user_id AND user.domain_id = local_user.domain_id
WHERE
user.id = federated_user.user_id AND federated_user.unique_id = ?
AND federated_user.idp_id = ?
So above, we can see they are in fact using the dreaded comma in the
FROM and even in an unusual way, "FROM table1, table2 JOIN table3" which
is very odd for SQL, however they are setting up the join from
federated_user to the user table properly, and they know what they are
doing.
So let's think of "global opt in" from that perspective. Openstack
application are enormous, sprawling apps that are developed and
maintained by dozens of people who often haven't met each other. By
requiring that the feature is only available as "global opt in",
Keystone in order to use this feature would need to do the same work we
did earlier; that they'd need to locate these queries and add in the
explicit FROM clause, or otherwise alter their logic that is taking
advantage of the implicit FROM to simplify their code. Additionally,
developers who work on Keystone would observe that for some reason they
don't understand, SQLAlchemy now behaves completely differently than it
does when they work on another Openstack app like Nova that does not use
this feature.
What would more likely happen is that Keystone just wouldn't make this
change; "global opt in" is too coarse grained to make migrations towards
using this feature simple and risk-free. That would lead to a
balkanized chunk of very intricate code in the middle of Core that
doesn't get used often, which is a recipe for heavy maintenance burdens,
when the behavior of FROM logic itself needs to be maintained and now it
must be maintained across two very different behavioral contracts, one
of which is almost never used.
The original use case you had is, "the current behavior leads to subtle
bugs". Why don't we re-think how to fix that directly, which I would
propose is most easily grantable to the entire SQLAlchemy ecosystem at
once by using a warning.
The warning would be, if someone does this:
stmt = select([t1.c.a]).select_from(t1).where(t2.c.b == 5)
that is, they put explicit FROM but then suggested a second implicit
FROM, and therefore are mixing the approaches, would emit a warning, and
that's it.
For your case, you'd see these warnings in your logs, and I would argue
this is sufficient for you to prevent the thing you're concerned about,
which is the mixing of explicit / implicit FROM.
But I'm not even sure I'm comfortable with a warning here because as we
see in Keystone, they are using this feature successfully. They
shouldn't have to see a warning if they like how it works now; there is
of course the warnings filter but I've observed people are rightly still
annoyed about warnings that are perceived as undeserved.
So far I've spent a lot of time showing my thought process in great
detail here because I hope to illustrate that yes, the FROM comma-list
thing is a problem, but the use is too much a core of SQLAlchemy's being
for it to really be changed, noting that I'd love to find a way to
improve this situation. An alternative use contract is definitely an
option, but as given as a patch to Core I fear this produces a so-called
"balkanized block" that greatly adds to maintenance burden. I've had to
remove lots of "pet features" over the years that I was too liberal in
accepting up front. Usually, if we can really detect a pattern that
"smells" of an error, and there's a way to explicitly prevent it, we can
make a warning in that case, but even in that case I'm not sure people
would appreciate being pushed away from "implicit from"; it's a handy
feature when used correctly and simplifies code.
A more elaborate detection would be, look for implicit FROM, then look
inside the WHERE clause and detect that in fact the multiple FROM
entries aren't being related to each other. That logic would be very
complicated as well as a performance hit, and would often be incorrect.
We would essentially be reinventing a small fragment of what the
database's query planner already does.
We can instead just ask the database using EXPLAIN to analyze the query,
or even just turn on slow query logging. This is pretty much the
answer I'd normally have given in this area in general; your developers
may write an inefficient SQL statement whether or not SQLAlchemy makes
one particular class of them more likely, and to really look for errors
of this class, you should be looking at SQL logs, running EXPLAIN on
queries you see, and setting up slow query logging on your development /
CI servers. There are other query-planner error cases besides this
one, such as missing indexes or other table-scan scenarios, gratuitous
use of OFFSET, etc. There's lots of things that can make a query slow.
Generalized safeguards in CI would guard against this one very
specific class of error in any case.
After all that, what I can suggest for you is that since you are looking
to "raise an error", that is actually easy to do here using events or
compiles. All you need to do is compare the calculated FROM list to
the explicit FROM list:
from sqlalchemy import table, column, select
from sqlalchemy.ext.compiler import compiles
from sqlalchemy.sql import Select
from sqlalchemy import exc
@compiles(Select)
def _strict_froms(stmt, compiler, **kw):
actual_froms = stmt.froms
# TODO: tune this as desired
if stmt._from_obj:
# TODO: we can give you non-underscored access to
# _from_obj
legal_froms = stmt._from_obj
else:
legal_froms = set([
col.table for col in stmt.inner_columns
])
if len(legal_froms) != len(actual_froms):
raise exc.CompileError("SELECT has implicit froms")
return compiler.visit_select(stmt, **kw)
t1 = table('t1', column('a'))
t2 = table('t2', column('b'))
# works
print select([t1]).where(t2.c.b == 5).select_from(t1).select_from(t2)
# raises
print select([t1]).where(t2.c.b == 5)
The issue of "FROM with comma" *is* a problem I'd like to continue to
explore solutions towards. I've long considered various rules in
_display_froms but none have ever gotten through the various issues I've
detailed above. I hope this helps to clarify the complexity and depth
of this situation.