{{{
class Paper(models.Model):
...
authors = models.JSONField(default=list)
...
}}}
{{{
In [53]: p = Paper.objects.get(id=1)
In [54]: p.__dict__
Out[54]:
{'_state': <django.db.models.base.ModelState at 0x7f2aa74b3790>,
'id': 1,
...
'authors': [{'name': 'Rehmann Kim-Thomas',
'workplace': 'SAP SE, Walldorf, Germany'},
{'name': 'Folkerts Enno', 'workplace': 'SAP SE, Walldorf, Germany'}],
...
}
In [55]: print(Paper.objects.filter(authors__icontains="rehmann").count())
0
In [56]: print(Paper.objects.filter(authors__icontains="Rehmann").count())
1
In [57]: print(Paper.objects.filter(authors__icontains="Rehmann").query)
SELECT `scholardata_paper`.`id`, `scholardata_paper`.`created`,
`scholardata_paper`.`modified`, `scholardata_paper`.`doi`,
`scholardata_paper`.`s2_id`, `scholardata_paper`.`wanfang_id`,
`scholardata_paper`.`corpus_id`, `scholardata_paper`.`pm_id`,
`scholardata_paper`.`arxiv_id`, `scholardata_paper`.`acl_id`,
`scholardata_paper`.`mag_id`, `scholardata_paper`.`language`,
`scholardata_paper`.`url`, `scholardata_paper`.`abstract`,
`scholardata_paper`.`title`, `scholardata_paper`.`authors`,
`scholardata_paper`.`keywords`, `scholardata_paper`.`issued`,
`scholardata_paper`.`venue`, `scholardata_paper`.`journal_id`,
`scholardata_paper`.`volume`, `scholardata_paper`.`issue`,
`scholardata_paper`.`page`, `scholardata_paper`.`references_count`,
`scholardata_paper`.`citations_count`, `scholardata_paper`.`refreshed`,
`scholardata_paper`.`references_refreshed`,
`scholardata_paper`.`citations_refreshed` FROM `scholardata_paper` WHERE
JSON_UNQUOTE(`scholardata_paper`.`authors`) LIKE %Rehmann% ORDER BY
`scholardata_paper`.`id` ASC
In [58]: print(Paper.objects.filter(authors__iregex="rehmann").count())
1
In [59]: print(Paper.objects.filter(authors__iregex="Rehmann").count())
1
In [60]: print(Paper.objects.filter(authors__iregex="Rehmann").query)
SELECT `scholardata_paper`.`id`, `scholardata_paper`.`created`,
`scholardata_paper`.`modified`, `scholardata_paper`.`doi`,
`scholardata_paper`.`s2_id`, `scholardata_paper`.`wanfang_id`,
`scholardata_paper`.`corpus_id`, `scholardata_paper`.`pm_id`,
`scholardata_paper`.`arxiv_id`, `scholardata_paper`.`acl_id`,
`scholardata_paper`.`mag_id`, `scholardata_paper`.`language`,
`scholardata_paper`.`url`, `scholardata_paper`.`abstract`,
`scholardata_paper`.`title`, `scholardata_paper`.`authors`,
`scholardata_paper`.`keywords`, `scholardata_paper`.`issued`,
`scholardata_paper`.`venue`, `scholardata_paper`.`journal_id`,
`scholardata_paper`.`volume`, `scholardata_paper`.`issue`,
`scholardata_paper`.`page`, `scholardata_paper`.`references_count`,
`scholardata_paper`.`citations_count`, `scholardata_paper`.`refreshed`,
`scholardata_paper`.`references_refreshed`,
`scholardata_paper`.`citations_refreshed` FROM `scholardata_paper` WHERE
REGEXP_LIKE(JSON_UNQUOTE(`scholardata_paper`.`authors`), Rehmann, 'i')
ORDER BY `scholardata_paper`.`id` ASC
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/32411>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
Old description:
New description:
given model
In [63]:
print(Paper.objects.filter(authors__0__name__icontains="rehmann").count())
1
In [64]:
print(Paper.objects.filter(authors__0__name__icontains="rehmann").query)
SELECT `scholardata_paper`.`id`, `scholardata_paper`.`created`,
`scholardata_paper`.`modified`, `scholardata_paper`.`doi`,
`scholardata_paper`.`s2_id`, `scholardata_paper`.`wanfang_id`,
`scholardata_paper`.`corpus_id`, `scholardata_paper`.`pm_id`,
`scholardata_paper`.`arxiv_id`, `scholardata_paper`.`acl_id`,
`scholardata_paper`.`mag_id`, `scholardata_paper`.`language`,
`scholardata_paper`.`url`, `scholardata_paper`.`abstract`,
`scholardata_paper`.`title`, `scholardata_paper`.`authors`,
`scholardata_paper`.`keywords`, `scholardata_paper`.`issued`,
`scholardata_paper`.`venue`, `scholardata_paper`.`journal_id`,
`scholardata_paper`.`volume`, `scholardata_paper`.`issue`,
`scholardata_paper`.`page`, `scholardata_paper`.`references_count`,
`scholardata_paper`.`citations_count`, `scholardata_paper`.`refreshed`,
`scholardata_paper`.`references_refreshed`,
`scholardata_paper`.`citations_refreshed` FROM `scholardata_paper` WHERE
LOWER(JSON_UNQUOTE(JSON_EXTRACT(`scholardata_paper`.`authors`,
$[0]."name"))) LIKE LOWER(%rehmann%) ORDER BY `scholardata_paper`.`id` ASC
}}}
--
--
Ticket URL: <https://code.djangoproject.com/ticket/32411#comment:1>
* cc: Mariusz Felisiak (added)
* status: new => closed
* resolution: => invalid
Comment:
Hi. Thanks for the report.
Your usage is wrong for what you're trying to do. See the
[https://docs.djangoproject.com/en/3.2/topics/db/queries/#querying-
jsonfield Querying JSONField topic docs].
For JSONField `contains` takes key pairs, rather than a string:
{{{
>>> Dog.objects.filter(data__contains={'owner': 'Bob'})
}}}
`icontains` is not explicitly implemented on the field itself, so it's
falling back to the base implementation (wrongly — the
[https://database.guide/json_unquote-remove-quotes-from-a-json-document-
in-mysql/ JSON_UNQUOTE] gives us a string but we didn't generate the
`LOWER`) **but** it's not clear that makes much sense. (Mariusz: I'm
closing this as invalid, but do we want to accept in order to raise an
error here? 🤔)
As you've discovered the correct usage is to query by field
([https://docs.djangoproject.com/en/3.1/topics/db/queries/#key-index-and-
path-transforms as per here]):
{{{
>>> Dog.objects.filter(data__owner__name__icontains='bob')
}}}
For your example
`Paper.objects.filter(authors__name__icontains="rehmann")` should give you
what you want.
I hope that helps.
This seems related to the documentation issue #26511.
--
Ticket URL: <https://code.djangoproject.com/ticket/32411#comment:2>
* status: closed => new
* cc: sage (added)
* type: Bug => Cleanup/optimization
* keywords: JSONField => JSONField mysql
* resolution: invalid =>
* stage: Unreviewed => Accepted
Comment:
This is not a documented usage, however it doesn't work only on MySQL, so
IMO we can fix this for consistency. MySQL handles strings used in `JSON`
context using the `utf8mb4_bin` binary collation, that's why comparison
of JSON values is case-sensitive and we need to use
`CaseInsensitiveMixin`. We can fix this be registering a custom case-
insensitive lookup for `JSONField`, e.g.
{{{
class JSONFieldIContains(CaseInsensitiveMixin, lookups.IContains):
pass
JSONField.register_lookup(JSONFieldIContains)
}}}
I don't think it makes sense to do the same for `__istartswith`,
`iendswith`, and `__iexact`.
--
Ticket URL: <https://code.djangoproject.com/ticket/32411#comment:3>
* owner: nobody => Hasan Ramezani
* status: new => assigned
* has_patch: 0 => 1
Comment:
[https://github.com/django/django/pull/13979 PR]
--
Ticket URL: <https://code.djangoproject.com/ticket/32411#comment:4>
* stage: Accepted => Ready for checkin
--
Ticket URL: <https://code.djangoproject.com/ticket/32411#comment:5>
* status: assigned => closed
* resolution: => fixed
Comment:
In [changeset:"63d239db037f02d98b7771c90422840bbb4a319a" 63d239db]:
{{{
#!CommitTicketReference repository=""
revision="63d239db037f02d98b7771c90422840bbb4a319a"
Fixed #32411 -- Fixed __icontains lookup for JSONField on MySQL.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/32411#comment:6>
Comment (by elonzh):
Replying to [comment:2 Carlton Gibson]:
> Hi. Thanks for the report.
>
> Your usage is wrong for what you're trying to do. See the
[https://docs.djangoproject.com/en/3.2/topics/db/queries/#querying-
jsonfield Querying JSONField topic docs].
>
> For JSONField `contains` takes key pairs, rather than a string:
>
> {{{
> >>> Dog.objects.filter(data__contains={'owner': 'Bob'})
> }}}
>
> `icontains` is not explicitly implemented on the field itself, so it's
falling back to the base implementation (wrongly — the
[https://database.guide/json_unquote-remove-quotes-from-a-json-document-
in-mysql/ JSON_UNQUOTE] gives us a string but we didn't generate the
`LOWER`) **but** it's not clear that makes much sense. (Mariusz: I'm
closing this as invalid, but do we want to accept in order to raise an
error here? 🤔)
>
> As you've discovered the correct usage is to query by field
([https://docs.djangoproject.com/en/3.1/topics/db/queries/#key-index-and-
path-transforms as per here]):
>
> {{{
> >>> Dog.objects.filter(data__owner__name__icontains='bob')
> }}}
>
> For your example
`Paper.objects.filter(authors__name__icontains="rehmann")` should give you
what you want.
> I hope that helps.
>
> This seems related to the documentation issue #26511.
Thanks for your information, I know how to query field in a json object,
but the problem is the `authors` fields is a list and Django does not
support filter it.
`Paper.objects.filter(authors__name__icontains="rehmann")` won't give me
what I want.
{{{
>>> print(Paper.objects.filter(authors__name__icontains="rehmann").query)
SELECT `scholardata_paper`.`id`, `scholardata_paper`.`created`,
`scholardata_paper`.`modified`, `scholardata_paper`.`doi`,
`scholardata_paper`.`s2_id`, `scholardata_paper`.`wanfang_id`,
`scholardata_paper`.`corpus_id`, `scholardata_paper`.`pm_id`,
`scholardata_paper`.`arxiv_id`, `scholardata_paper`.`acl_id`,
`scholardata_paper`.`mag_id`, `scholardata_paper`.`language`,
`scholardata_paper`.`url`, `scholardata_paper`.`abstract`,
`scholardata_paper`.`title`, `scholardata_paper`.`authors`,
`scholardata_paper`.`keywords`, `scholardata_paper`.`issued`,
`scholardata_paper`.`venue`, `scholardata_paper`.`journal_id`,
`scholardata_paper`.`volume`, `scholardata_paper`.`issue`,
`scholardata_paper`.`page`, `scholardata_paper`.`references_count`,
`scholardata_paper`.`citations_count`, `scholardata_paper`.`refreshed`,
`scholardata_paper`.`references_refreshed`,
`scholardata_paper`.`citations_refreshed` FROM `scholardata_paper` WHERE
LOWER(JSON_UNQUOTE(JSON_EXTRACT(`scholardata_paper`.`authors`, $."name")))
LIKE LOWER(%rehmann%) ORDER BY `scholardata_paper`.`id` ASC
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/32411#comment:7>
Comment (by Simon Charette):
> the problem is the authors fields is a list and Django does not support
filter it
That little to do with this ticket though. The ORM has no way of
''knowing'' `authors` is a JSON Array because `JSONField` are schema less.
If you want to filter out papers that have at least one author with a
matching name you'll have to use the MySQL equivalent of
`json_array_elements` and a subquery through a `Func` expression but
that's something you should ask about on
[https://docs.djangoproject.com/en/3.1/faq/help/#how-do-i-do-x-why-
doesn-t-y-work-where-can-i-go-to-get-help support channels].
--
Ticket URL: <https://code.djangoproject.com/ticket/32411#comment:8>
Comment (by elonzh):
Replying to [comment:8 Simon Charette]:
> > the problem is the authors fields is a list and Django does not
support filter it
>
> That little to do with this ticket though. The ORM has no way of
''knowing'' `authors` is a JSON Array because `JSONField` are schema less.
If you want to filter out papers that have at least one author with a
matching name you'll have to use the MySQL equivalent of PostgreSQL
`json_array_elements` and a subquery through a `Func` expression but
that's something you should ask about on
[https://docs.djangoproject.com/en/3.1/faq/help/#how-do-i-do-x-why-
doesn-t-y-work-where-can-i-go-to-get-help support channels].
Yes, that little to do with this ticket. I just want to tell Carlton
Gibson that he has given the wrong solution.
--
Ticket URL: <https://code.djangoproject.com/ticket/32411#comment:9>