[Django] #27227: Full text search by UUIDField returns DataError

20 views
Skip to first unread message

Django

unread,
Sep 15, 2016, 6:25:52 AM9/15/16
to django-...@googlegroups.com
#27227: Full text search by UUIDField returns DataError
----------------------------------------------+----------------------------
Reporter: danclaudiupop | Owner: nobody
Type: Uncategorized | Status: new
Component: Database layer (models, ORM) | Version: 1.10
Severity: Normal | Keywords: UUIDField fts
| postgres
Triage Stage: Unreviewed | Has patch: 0
Easy pickings: 0 | UI/UX: 0
----------------------------------------------+----------------------------
Consider the following model:

{{{
class Show(models.Model):
cid = models.UUIDField(
default=uuid.uuid4,
editable=False,
verbose_name="Content ID",
help_text="Unique Identifier"
)
title_short = models.CharField(
max_length=60,
blank=True,
verbose_name="Short Title",
help_text="Short title (60 chars)"
)
}}}

The following snippet returns DataError, see below traceback:

{{{
>>> from django.contrib.postgres.search import SearchVector
>>> Show.objects.annotate(search=SearchVector('cid')).filter(search='foo')

Traceback (most recent call last):
File "<console>", line 1, in <module>
File "/home/vagrant/ve/lib/python3.4/site-
packages/django/db/models/query.py", line 232, in __repr__
data = list(self[:REPR_OUTPUT_SIZE + 1])
File "/home/vagrant/ve/lib/python3.4/site-
packages/django/db/models/query.py", line 256, in __iter__
self._fetch_all()
File "/home/vagrant/ve/lib/python3.4/site-
packages/django/db/models/query.py", line 1087, in _fetch_all
self._result_cache = list(self.iterator())
File "/home/vagrant/ve/lib/python3.4/site-
packages/django/db/models/query.py", line 54, in __iter__
results = compiler.execute_sql()
File "/home/vagrant/ve/lib/python3.4/site-
packages/django/db/models/sql/compiler.py", line 835, in execute_sql
cursor.execute(sql, params)
File "/home/vagrant/ve/lib/python3.4/site-
packages/django/db/backends/utils.py", line 79, in execute
return super(CursorDebugWrapper, self).execute(sql, params)
File "/home/vagrant/ve/lib/python3.4/site-
packages/django/db/backends/utils.py", line 64, in execute
return self.cursor.execute(sql, params)
File "/home/vagrant/ve/lib/python3.4/site-packages/django/db/utils.py",
line 94, in __exit__
six.reraise(dj_exc_type, dj_exc_value, traceback)
File "/home/vagrant/ve/lib/python3.4/site-packages/django/utils/six.py",
line 685, in reraise
raise value.with_traceback(tb)
File "/home/vagrant/ve/lib/python3.4/site-
packages/django/db/backends/utils.py", line 64, in execute
return self.cursor.execute(sql, params)
django.db.utils.DataError: invalid input syntax for uuid: ""
LINE 1: ...tag", to_tsvector(COALESCE("entities_show"."cid", '')) AS "s...
}}}

Tried with PostgreSQL 9.3.14 and PostgreSQL 9.5.3

--
Ticket URL: <https://code.djangoproject.com/ticket/27227>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Sep 15, 2016, 6:26:30 AM9/15/16
to django-...@googlegroups.com
#27227: Full text search by UUIDField returns DataError
-------------------------------------+-------------------------------------

Reporter: danclaudiupop | Owner: nobody
Type: Uncategorized | Status: new
Component: Database layer | Version: 1.10
(models, ORM) |
Severity: Normal | Resolution:
Keywords: UUIDField fts | Triage Stage:
postgres | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by danclaudiupop):

* cc: danclaudiupop@… (added)
* needs_better_patch: => 0
* needs_tests: => 0
* needs_docs: => 0


--
Ticket URL: <https://code.djangoproject.com/ticket/27227#comment:1>

Django

unread,
Sep 15, 2016, 6:26:57 AM9/15/16
to django-...@googlegroups.com
#27227: Full text search by UUIDField returns DataError
-------------------------------------+-------------------------------------
Reporter: danclaudiupop | Owner: nobody
Type: Bug | Status: new

Component: Database layer | Version: 1.10
(models, ORM) |
Severity: Normal | Resolution:
Keywords: UUIDField fts | Triage Stage:
postgres | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by danclaudiupop):

* type: Uncategorized => Bug


--
Ticket URL: <https://code.djangoproject.com/ticket/27227#comment:2>

Django

unread,
Sep 15, 2016, 7:15:21 AM9/15/16
to django-...@googlegroups.com
#27227: Full text search by UUIDField returns DataError
-------------------------------------+-------------------------------------
Reporter: danclaudiupop | Owner: nobody

Type: Bug | Status: new
Component: Database layer | Version: 1.10
(models, ORM) |
Severity: Normal | Resolution:
Keywords: UUIDField fts | Triage Stage:
postgres | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by melvyn-sopacua):

Analysis:

In `django/contrib/postgres/search.py` we find in __init__() around line
55 (stable/1.10.x):


{{{
self.source_expressions = [
Coalesce(expression, Value('')) for expression in
self.source_expressions
]
}}}

These are propagated to Coalesce() and generate the expression:
{{{
COALESCE("entities_show"."cid", '')
}}}

For most cases, the empty string is a valid input value, yet for the
native PostgreSQL UUID type, this is not the case. The API doesn't provide
a way to specify the fallback value.

--
Ticket URL: <https://code.djangoproject.com/ticket/27227#comment:3>

Django

unread,
Sep 15, 2016, 9:05:32 AM9/15/16
to django-...@googlegroups.com
#27227: Full text search by UUIDField returns DataError
-------------------------------------+-------------------------------------
Reporter: danclaudiupop | Owner: nobody

Type: Bug | Status: new
Component: Database layer | Version: 1.10
(models, ORM) |
Severity: Normal | Resolution:
Keywords: UUIDField fts | Triage Stage:
postgres | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by timgraham):

What's the use case for running full text search on a UUID?

--
Ticket URL: <https://code.djangoproject.com/ticket/27227#comment:4>

Django

unread,
Sep 15, 2016, 9:42:27 AM9/15/16
to django-...@googlegroups.com
#27227: Full text search by UUIDField returns DataError
-------------------------------------+-------------------------------------
Reporter: danclaudiupop | Owner: nobody

Type: Bug | Status: new
Component: Database layer | Version: 1.10
(models, ORM) |
Severity: Normal | Resolution:
Keywords: UUIDField fts | Triage Stage:
postgres | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by danclaudiupop):

We have a system in which we identify a resource by uuid field. We also
expose a search endpoint where we give back a list of different resource
types. For commodity we also search in the uuid field to give back the
proper resource on the search endpoint.

Instead of checking "/api/book/<uuid>/" or "/api/author/<uuid>/" for an
<uuid> we can go directly to "api/search/?q=<uuid>" to fetch the resource.

--
Ticket URL: <https://code.djangoproject.com/ticket/27227#comment:5>

Django

unread,
Sep 15, 2016, 10:28:38 AM9/15/16
to django-...@googlegroups.com
#27227: Full text search by UUIDField returns DataError
-------------------------------------+-------------------------------------
Reporter: danclaudiupop | Owner: nobody

Type: Bug | Status: new
Component: Database layer | Version: 1.10
(models, ORM) |
Severity: Normal | Resolution:
Keywords: UUIDField fts | Triage Stage:
postgres | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by charettes):

We could be extra safe and always cast the expression to text in a
`CAST(expression, output_field=models.TextField())` but I don't understand
why you're using FTS here, using `LIKE` through the `__contains` lookup
would be more appropriate IMHO. In the meant time you can take care of
casting the expression yourself before annotating your queryset to work
around your issue.

--
Ticket URL: <https://code.djangoproject.com/ticket/27227#comment:6>

Django

unread,
Sep 16, 2016, 2:19:37 AM9/16/16
to django-...@googlegroups.com
#27227: Full text search by UUIDField returns DataError
-------------------------------------+-------------------------------------
Reporter: danclaudiupop | Owner: nobody

Type: Bug | Status: new
Component: Database layer | Version: 1.10
(models, ORM) |
Severity: Normal | Resolution:
Keywords: UUIDField fts | Triage Stage:
postgres | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by melvyn-sopacua):

So unfortunately, even when supporting the fallback value, this fails
because to_tsvector(uuid) fails for reasons pointed out by charettes: uuid
isn't searchable. However, having a way to provide the fallback is useful,
if one wants to have NULL returned. From the Coalesce docs:

The COALESCE function returns the first of its arguments that is not
null. Null is returned only if all arguments are null.

But, I don't have a real-world use case for this. If there's no interest
in supporting the fallback value, I'll throw away my local branch.

--
Ticket URL: <https://code.djangoproject.com/ticket/27227#comment:7>

Django

unread,
Sep 16, 2016, 2:44:11 AM9/16/16
to django-...@googlegroups.com
#27227: Full text search by UUIDField returns DataError
-------------------------------------+-------------------------------------
Reporter: danclaudiupop | Owner: nobody

Type: Bug | Status: new
Component: Database layer | Version: 1.10
(models, ORM) |
Severity: Normal | Resolution:
Keywords: UUIDField fts | Triage Stage:
postgres | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by danclaudiupop):

The use case that we encounter is this: having an administration interface
which lists all the shows, i want to search shows by `cid`, `title_short`
and other fields (such as `description`, etc). So I wanted to leverage
FTS, because `__contains` is really slow over many records and fields.

By charettes suggestion, I used `CAST` and the queryset looks like this:
{{{
Show.objects.annotate(cidtxt=Cast('cid',
TextField())).annotate(search=SearchVector('cidtxt',
'title_short')).filter(search='foo')
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/27227#comment:8>

Django

unread,
Sep 19, 2016, 1:30:22 PM9/19/16
to django-...@googlegroups.com
#27227: Full text search by UUIDField returns DataError
-------------------------------------+-------------------------------------
Reporter: danclaudiupop | Owner: nobody

Type: Bug | Status: new
Component: Database layer | Version: 1.10
(models, ORM) |
Severity: Normal | Resolution:
Keywords: UUIDField fts | Triage Stage:
postgres | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by timgraham):

I'm not sure. Maybe you can send a pull request to give an idea of what
the patch allows?

--
Ticket URL: <https://code.djangoproject.com/ticket/27227#comment:9>

Django

unread,
Sep 23, 2016, 2:24:50 PM9/23/16
to django-...@googlegroups.com
#27227: Full text search by UUIDField returns DataError
-------------------------------------+-------------------------------------
Reporter: Dan Claudiu Pop | Owner: nobody
Type: Bug | Status: closed

Component: Database layer | Version: 1.10
(models, ORM) |
Severity: Normal | Resolution: needsinfo

Keywords: UUIDField fts | Triage Stage:
postgres | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Tim Graham):

* status: new => closed
* resolution: => needsinfo


--
Ticket URL: <https://code.djangoproject.com/ticket/27227#comment:10>

Reply all
Reply to author
Forward
0 new messages