Re: [Django] #34459: SearchVector() can return query strings that are unsafe to combine.

5 views
Skip to first unread message

Django

unread,
Apr 4, 2023, 6:41:25 PM4/4/23
to django-...@googlegroups.com
#34459: SearchVector() can return query strings that are unsafe to combine.
----------------------------------+------------------------------------
Reporter: Patryk Zawadzki | Owner: (none)
Type: Bug | Status: new
Component: contrib.postgres | Version: 4.2
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
----------------------------------+------------------------------------

Comment (by Patryk Zawadzki):

David, I’m not sure if it’s “supported” or not, but my project uses
SearchVector with expressions such as Value instead of column names so I
see a lot of percent signs passed as the corpus of text, not the config or
weight. We calculate some of the values we index in Python code and
combine strings with different weights.

--
Ticket URL: <https://code.djangoproject.com/ticket/34459#comment:3>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Apr 4, 2023, 10:09:58 PM4/4/23
to django-...@googlegroups.com
#34459: SearchVector() can return query strings that are unsafe to combine.
----------------------------------+------------------------------------
Reporter: Patryk Zawadzki | Owner: (none)
Type: Bug | Status: new
Component: contrib.postgres | Version: 4.2
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
----------------------------------+------------------------------------
Changes (by Simon Charette):

* cc: Simon Charette (added)


--
Ticket URL: <https://code.djangoproject.com/ticket/34459#comment:4>

Django

unread,
Apr 5, 2023, 3:54:14 AM4/5/23
to django-...@googlegroups.com
#34459: SearchVector() can return query strings that are unsafe to combine.
----------------------------------+------------------------------------
Reporter: Patryk Zawadzki | Owner: (none)
Type: Bug | Status: new
Component: contrib.postgres | Version: 4.2
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
----------------------------------+------------------------------------

Comment (by Mariusz Felisiak):

We use `compose_sql()` in different places. Unfortunately, it's sometimes
used together with uncomposed SQL statements causing issues when binding
parameters. IMO, we should escape `%` when we know that the parameters are
bound later, e.g.

{{{#!diff
diff --git a/django/contrib/postgres/search.py
b/django/contrib/postgres/search.py
index 4e370aa167..279a39e80e 100644
--- a/django/contrib/postgres/search.py
+++ b/django/contrib/postgres/search.py
@@ -146,7 +146,9 @@ class SearchVector(SearchVectorCombinable, Func):

# These parameters must be bound on the client side because we
may
# want to create an index on this expression.
- sql = connection.ops.compose_sql(sql, config_params + params +
extra_params)
+ sql = connection.ops.compose_sql(
+ sql, config_params + params + extra_params, quote_params=True
+ )
return sql, []


@@ -321,7 +323,9 @@ class SearchHeadline(Func):
if self.options:
options_params.append(
", ".join(
- connection.ops.compose_sql(f"{option}=%s", [value])
+ connection.ops.compose_sql(
+ f"{option}=%s", [value], quote_params=True
+ )
for option, value in self.options.items()
)
)
diff --git a/django/db/backends/postgresql/operations.py
b/django/db/backends/postgresql/operations.py
index 18cfcb29cb..8aa8bb5173 100644
--- a/django/db/backends/postgresql/operations.py
+++ b/django/db/backends/postgresql/operations.py
@@ -201,7 +201,14 @@ class DatabaseOperations(BaseDatabaseOperations):
return name # Quoting once is enough.
return '"%s"' % name

- def compose_sql(self, sql, params):
+ def quote_value(self, value):
+ if isinstance(value, str):
+ value = value.replace("%", "%%")
+ return value
+
+ def compose_sql(self, sql, params, quote_params=False):
+ if quote_params and params:
+ params = [self.quote_value(param) for param in params]
return mogrify(sql, params, self.connection)

def set_time_zone_sql(self):
}}}
Patryk, does it work for you?

--
Ticket URL: <https://code.djangoproject.com/ticket/34459#comment:5>

Django

unread,
Apr 5, 2023, 4:37:45 AM4/5/23
to django-...@googlegroups.com
#34459: SearchVector() can return query strings that are unsafe to combine.
----------------------------------+------------------------------------
Reporter: Patryk Zawadzki | Owner: (none)
Type: Bug | Status: new
Component: contrib.postgres | Version: 4.2
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
----------------------------------+------------------------------------

Comment (by David Sanders):

I've confirmed Simon's patch works with this test:

{{{#!diff
diff --git a/tests/postgres_tests/test_search.py
b/tests/postgres_tests/test_search.py
index 6ec20c0654..9b5b841539 100644
--- a/tests/postgres_tests/test_search.py
+++ b/tests/postgres_tests/test_search.py
@@ -161,6 +161,12 @@ class SearchVectorFieldTest(GrailTestData,
PostgreSQLTestCase):
)
self.assertNotIn("COALESCE(COALESCE", str(searched.query))

+ def test_values_with_percent(self):
+ searched = Line.objects.annotate(
+ search=SearchVector(Value("This week everything is 10% off"))
+ ).filter(search="10 % off")
+ self.assertEqual(len(searched), 9)
+

class SearchConfigTests(PostgreSQLSimpleTestCase):
def test_from_parameter(self):
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/34459#comment:6>

Django

unread,
Apr 5, 2023, 4:53:05 AM4/5/23
to django-...@googlegroups.com
#34459: SearchVector() can return query strings that are unsafe to combine.
----------------------------------+------------------------------------
Reporter: Patryk Zawadzki | Owner: (none)
Type: Bug | Status: new
Component: contrib.postgres | Version: 4.2
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
----------------------------------+------------------------------------

Comment (by Mariusz Felisiak):

Replying to [comment:6 David Sanders]:


> I've confirmed Simon's patch works with this test

Thanks :+1: I'm not Simon ;)

--
Ticket URL: <https://code.djangoproject.com/ticket/34459#comment:7>

Django

unread,
Apr 5, 2023, 10:16:08 AM4/5/23
to django-...@googlegroups.com
#34459: SearchVector() can return query strings that are unsafe to combine.
----------------------------------+------------------------------------
Reporter: Patryk Zawadzki | Owner: (none)
Type: Bug | Status: new
Component: contrib.postgres | Version: 4.2
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
----------------------------------+------------------------------------

Comment (by Phil Gyford):

This is probably the same issue, but wanted to check it wasn't only
tangentially related and so missed.

If I have a model with a field like:

{{{
search_document = SearchVectorField(null=True)
}}}

Then I try to update that for object 123 by doing:

{{{
MyModel.objects.filter(pk=123).update(search_document=SearchVector(Value('10%')))
}}}

then with psycopg2 I get "IndexError: tuple index out of range".

and with psycopg3 I get "ProgrammingError: only '%s', '%b', '%t' are
allowed as placeholders, got '%''"

If I omit the % then there's no error.

--
Ticket URL: <https://code.djangoproject.com/ticket/34459#comment:8>

Django

unread,
Apr 5, 2023, 10:22:56 AM4/5/23
to django-...@googlegroups.com
#34459: SearchVector() can return query strings that are unsafe to combine.
----------------------------------+------------------------------------
Reporter: Patryk Zawadzki | Owner: (none)
Type: Bug | Status: new
Component: contrib.postgres | Version: 4.2
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
----------------------------------+------------------------------------

Comment (by Florian Apolloner):

I am a bit confused, shouldn't `compose_sql` (and then `mogrify`) do that
automatically? ie is this a psycopg bug?

--
Ticket URL: <https://code.djangoproject.com/ticket/34459#comment:9>

Django

unread,
Apr 5, 2023, 10:28:02 AM4/5/23
to django-...@googlegroups.com
#34459: SearchVector() can return query strings that are unsafe to combine.
----------------------------------+------------------------------------
Reporter: Patryk Zawadzki | Owner: (none)
Type: Bug | Status: new
Component: contrib.postgres | Version: 4.2
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
----------------------------------+------------------------------------

Comment (by Florian Apolloner):

> Unfortunately, it's sometimes used together with uncomposed SQL


statements causing issues when binding parameters. IMO, we should escape %
when we know that the parameters are bound later, e.g.

Ugh, we should probably not use `compose_sql` then

--
Ticket URL: <https://code.djangoproject.com/ticket/34459#comment:10>

Django

unread,
Apr 5, 2023, 11:33:01 AM4/5/23
to django-...@googlegroups.com
#34459: SearchVector() can return query strings that are unsafe to combine.
----------------------------------+------------------------------------
Reporter: Patryk Zawadzki | Owner: (none)
Type: Bug | Status: new
Component: contrib.postgres | Version: 4.2
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
----------------------------------+------------------------------------
Changes (by Natalia Bidart):

* cc: Natalia Bidart (added)


--
Ticket URL: <https://code.djangoproject.com/ticket/34459#comment:11>

Django

unread,
Apr 5, 2023, 12:02:01 PM4/5/23
to django-...@googlegroups.com
#34459: SearchVector() can return query strings that are unsafe to combine.
----------------------------------+------------------------------------
Reporter: Patryk Zawadzki | Owner: (none)
Type: Bug | Status: new
Component: contrib.postgres | Version: 4.2
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
----------------------------------+------------------------------------

Comment (by Mariusz Felisiak):

Replying to [comment:10 Florian Apolloner]:


> > Unfortunately, it's sometimes used together with uncomposed SQL
statements causing issues when binding parameters. IMO, we should escape %
when we know that the parameters are bound later, e.g.
>
> Ugh, we should probably not use `compose_sql` then

Escaping `%` sounds like a reasonable compromise to me, we already do this
in schema editors, e.g. in
[https://github.com/django/django/blob/38e63c9e61152682f3ff982c85a73793ab6d3267/django/db/backends/mysql/schema.py#L59-L60
MySQL] or
[https://github.com/django/django/blob/38e63c9e61152682f3ff982c85a73793ab6d3267/django/db/backends/postgresql/schema.py#L59-L60
PostgreSQL]. Also, we avoid side-effect by providing the `quote_params`
argument.

--
Ticket URL: <https://code.djangoproject.com/ticket/34459#comment:12>

Django

unread,
Apr 5, 2023, 12:33:11 PM4/5/23
to django-...@googlegroups.com
#34459: SearchVector() can return query strings that are unsafe to combine.
----------------------------------+------------------------------------
Reporter: Patryk Zawadzki | Owner: (none)
Type: Bug | Status: new
Component: contrib.postgres | Version: 4.2
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
----------------------------------+------------------------------------

Comment (by Florian Apolloner):

Yes, but the schema editor has generally been a thing where we know we do
suboptimal things and we have the generated SQL under control. Here we
have (in the worst case) user input and it simply feels like opening a can
of worms if we are not able to distinguish between parameters and the sql
iteself clearly.

--
Ticket URL: <https://code.djangoproject.com/ticket/34459#comment:13>

Django

unread,
Apr 5, 2023, 1:38:53 PM4/5/23
to django-...@googlegroups.com
#34459: SearchVector() can return query strings that are unsafe to combine.
----------------------------------+------------------------------------
Reporter: Patryk Zawadzki | Owner: (none)
Type: Bug | Status: new
Component: contrib.postgres | Version: 4.2
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
----------------------------------+------------------------------------

Comment (by Mariusz Felisiak):

Replying to [comment:13 Florian Apolloner]:


> Yes, but the schema editor has generally been a thing where we know we
do suboptimal things and we have the generated SQL under control. Here we
have (in the worst case) user input and it simply feels like opening a can
of worms if we are not able to distinguish between parameters and the sql
iteself clearly.

Alternatively, we can remove `compose_sql()` from `SearchVector` (it was
added in
[https://github.com/django/django/commit/f83ee2a23ee28a271a50a005b62259d735fbea3f
f83ee2a23ee28a271a50a005b62259d735fbea3f]) but looks unnecessary now, see
draft [https://github.com/django/django/pull/16731 PR]. We can add test
for an index with `SearchVector` to confirm this.

--
Ticket URL: <https://code.djangoproject.com/ticket/34459#comment:14>

Django

unread,
Apr 5, 2023, 1:51:33 PM4/5/23
to django-...@googlegroups.com
#34459: SearchVector() can return query strings that are unsafe to combine.
----------------------------------+------------------------------------
Reporter: Patryk Zawadzki | Owner: (none)
Type: Bug | Status: new
Component: contrib.postgres | Version: 4.2
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
----------------------------------+------------------------------------

Comment (by Florian Apolloner):

Lovely, if that passes the testsuite that would be certainly preferred.
Might be that follow up changes made this unnecessary

--
Ticket URL: <https://code.djangoproject.com/ticket/34459#comment:15>

Django

unread,
Apr 6, 2023, 12:47:37 AM4/6/23
to django-...@googlegroups.com
#34459: SearchVector() can return query strings that are unsafe to combine.
-------------------------------------+-------------------------------------
Reporter: Patryk Zawadzki | Owner: Mariusz
| Felisiak
Type: Bug | Status: assigned

Component: contrib.postgres | Version: 4.2
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* owner: (none) => Mariusz Felisiak
* status: new => assigned
* has_patch: 0 => 1


--
Ticket URL: <https://code.djangoproject.com/ticket/34459#comment:16>

Django

unread,
Apr 6, 2023, 7:33:05 AM4/6/23
to django-...@googlegroups.com
#34459: SearchVector() can return query strings that are unsafe to combine.
-------------------------------------+-------------------------------------
Reporter: Patryk Zawadzki | Owner: Mariusz
| Felisiak
Type: Bug | Status: closed
Component: contrib.postgres | Version: 4.2
Severity: Release blocker | Resolution: fixed

Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak <felisiak.mariusz@…>):

* status: assigned => closed
* resolution: => fixed


Comment:

In [changeset:"4bf4222010fd8e413963c6c873e4088614332ef9" 4bf4222]:
{{{
#!CommitTicketReference repository=""
revision="4bf4222010fd8e413963c6c873e4088614332ef9"
Fixed #34459 -- Fixed SearchVector() crash for parameters with % symbol.

Thanks Patryk Zawadzki for the report.

Regression in 09ffc5c1212d4ced58b708cbbf3dfbfb77b782ca.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/34459#comment:17>

Django

unread,
Apr 6, 2023, 7:34:11 AM4/6/23
to django-...@googlegroups.com
#34459: SearchVector() can return query strings that are unsafe to combine.
-------------------------------------+-------------------------------------
Reporter: Patryk Zawadzki | Owner: Mariusz
| Felisiak
Type: Bug | Status: closed
Component: contrib.postgres | Version: 4.2
Severity: Release blocker | Resolution: fixed
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Mariusz Felisiak <felisiak.mariusz@…>):

In [changeset:"db49def5fc03757048d6725097d4c3da44e7ea84" db49def5]:
{{{
#!CommitTicketReference repository=""
revision="db49def5fc03757048d6725097d4c3da44e7ea84"
[4.2.x] Fixed #34459 -- Fixed SearchVector() crash for parameters with %
symbol.

Thanks Patryk Zawadzki for the report.

Regression in 09ffc5c1212d4ced58b708cbbf3dfbfb77b782ca.

Backport of 4bf4222010fd8e413963c6c873e4088614332ef9 from main
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/34459#comment:18>

Reply all
Reply to author
Forward
0 new messages