{{{
def test_non_exact_match(self):
    searched = Line.objects.filter(dialogue__search='hearts')
    self.assertSequenceEqual(searched, [self.verse2])

def test_search_two_terms(self):
    searched = Line.objects.filter(dialogue__search='heart bowel')
    self.assertSequenceEqual(searched, [self.verse2])
}}}
The first test calls:
> SELECT to_tsvector('His head smashed in and his heart cut out, ') @@
plainto_tsquery('hearts')
which is false. In particular:
{{{SELECT to_tsvector('His head smashed in and his heart cut out, ');}}}
returns {{{'and':5 'cut':8 'head':2 'heart':7 'his':1,6 'in':4 'out':9
'smashed':3}}} and {{{SELECT plainto_tsquery('hearts');}}} returns
{{{'hearts'}}}.
However this works:
> SELECT to_tsvector('english', 'His head smashed in and his heart cut
out, ') @@ plainto_tsquery('english', 'hearts');
as {{{SELECT to_tsvector('english', 'His head smashed in and his heart cut
out, ')}}} returns {{{'cut':8 'head':2 'heart':7 'smash':3}}} and
{{{SELECT plainto_tsquery('english', 'hearts');}}} returns {{{'heart'}}}.
The second test calls:
> SELECT to_tsvector(COALESCE('His head smashed in and his heart cut out,
And his liver removed and his bowels unplugged, And his nostrils ripped
and his bottom burned off, And his--')) @@ plainto_tsquery('heart bowel');
which is again false. In particular {{{SELECT COALESCE('His head smashed
in and his heart cut out, And his liver removed and his bowels unplugged,
And his nostrils ripped and his bottom burned off, And his--');}}} returns
{{{His head smashed in and his heart cut out, And his liver removed and
his bowels unplugged, And his nostrils ripped and his bottom burned off,
And his--}}}, {{{SELECT to_tsvector(COALESCE('His head smashed in and his
heart cut out, And his liver removed and his bowels unplugged, And his
nostrils ripped and his bottom burned off, And his--'));}}} returns
{{{'and':5,10,14,18,22,27 'bottom':24 'bowels':16 'burned':25 'cut':8
'head':2 'heart':7 'his':1,6,11,15,19,23,28 'in':4 'liver':12
'nostrils':20 'off':26 'out':9 'removed':13 'ripped':21 'smashed':3
'unplugged':17}}} and {{{SELECT plainto_tsquery('heart bowel');}}} returns
{{{'heart' & 'bowel'}}}.
Here again 'english' helps:
{{{SELECT to_tsvector('english', COALESCE('His head smashed in and his
heart cut out, And his liver removed and his bowels unplugged, And his
nostrils ripped and his bottom burned off, And his--'));}}} returns
{{{'bottom':24 'bowel':16 'burn':25 'cut':8 'head':2 'heart':7 'liver':12
'nostril':20 'remov':13 'rip':21 'smash':3 'unplug':17}}} and
> SELECT to_tsvector('english', COALESCE('His head smashed in and his
heart cut out, And his liver removed and his bowels unplugged, And his
nostrils ripped and his bottom burned off, And his--')) @@
(plainto_tsquery('heart bowel'));
is true.
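The comparison above can be repeated from Python with any DB-API cursor. This is only a minimal sketch: the helper name `matches` is made up for illustration, and it assumes a driver that uses `%s` placeholders (as psycopg2 does). It passes the configuration name explicitly to both `to_tsvector` and `plainto_tsquery`, so the result does not depend on `default_text_search_config`.

```python
def matches(cursor, config, document, query):
    """Return whether `document` matches `query` under the text search
    configuration `config` (for example 'english' or 'simple').

    `matches` is a hypothetical helper, not part of Django or PostgreSQL.
    `cursor` is assumed to be a DB-API cursor using %s placeholders.
    """
    cursor.execute(
        # The ::regconfig casts make the configuration argument unambiguous.
        "SELECT to_tsvector(%s::regconfig, %s) @@ "
        "plainto_tsquery(%s::regconfig, %s)",
        [config, document, config, query],
    )
    return cursor.fetchone()[0]
```

Going by the query results quoted above, calling it with `'simple'` and the query `'hearts'` on the first verse should give false, and calling it with `'english'` should give true.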
--
Ticket URL: <https://code.djangoproject.com/ticket/29084>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
Comment (by Tim Graham):
Are you saying that the tests don't pass on your system? Is the difference
based on the system's language or something? Maybe a skip condition can be
added for those tests.
--
Ticket URL: <https://code.djangoproject.com/ticket/29084#comment:1>
Comment (by Дилян Палаузов):
That is exactly what I am saying. These are my search configurations:
{{{
psql \dF
List of text search configurations
Schema | Name | Description
------------+------------+---------------------------------------
pg_catalog | danish | configuration for danish language
pg_catalog | dutch | configuration for dutch language
pg_catalog | english | configuration for english language
pg_catalog | finnish | configuration for finnish language
pg_catalog | french | configuration for french language
pg_catalog | german | configuration for german language
pg_catalog | hungarian | configuration for hungarian language
pg_catalog | italian | configuration for italian language
pg_catalog | norwegian | configuration for norwegian language
pg_catalog | portuguese | configuration for portuguese language
pg_catalog | romanian | configuration for romanian language
pg_catalog | russian | configuration for russian language
pg_catalog | simple | simple configuration
pg_catalog | spanish | configuration for spanish language
pg_catalog | swedish | configuration for swedish language
pg_catalog | turkish | configuration for turkish language
(16 rows)
psql SHOW default_text_search_config ;
default_text_search_config
----------------------------
pg_catalog.simple
(1 row)
psql \dF+ simple
Text search configuration "pg_catalog.simple"
Parser: "pg_catalog.default"
Token | Dictionaries
-----------------+--------------
asciihword | simple
asciiword | simple
email | simple
file | simple
float | simple
host | simple
hword | simple
hword_asciipart | simple
hword_numpart | simple
hword_part | simple
int | simple
numhword | simple
numword | simple
sfloat | simple
uint | simple
url | simple
url_path | simple
version | simple
word | simple
}}}
The {{{simple}}} dictionary is described at
https://www.postgresql.org/docs/9.6/static/textsearch-dictionaries.html#TEXTSEARCH-SIMPLE-DICTIONARY .
The tests assume that the default configuration uses the
{{{english_stem}}} dictionary. However, {{{simple}}} is the default
configuration for an unconfigured PostgreSQL:
https://www.postgresql.org/docs/9.6/static/runtime-config-client.html#GUC-DEFAULT-TEXT-SEARCH-CONFIG .
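A skip condition along these lines could check the server setting before running the stemming-dependent tests. This is a rough sketch under stated assumptions, not the committed fix: the names `ENGLISH_CONFIGS`, `default_search_config` and `require_english` are invented for illustration, and `cursor` is assumed to be a DB-API cursor on the test database.

```python
import unittest

# Assumption: both the schema-qualified and the bare name count as English.
ENGLISH_CONFIGS = {"pg_catalog.english", "english"}


def default_search_config(cursor):
    """Return the server's default text search configuration name."""
    cursor.execute("SHOW default_text_search_config")
    return cursor.fetchone()[0]


def require_english(test_case, cursor):
    """Skip `test_case` unless the default configuration is English.

    Hypothetical helper, meant to be called from a test's setUp().
    """
    config = default_search_config(cursor)
    if config not in ENGLISH_CONFIGS:
        test_case.skipTest(
            "default_text_search_config is %r, not English." % config
        )
```

With the `pg_catalog.simple` setting shown above, `require_english` would raise `unittest.SkipTest`, so the tests that rely on stemming would be reported as skipped instead of failing.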
--
Ticket URL: <https://code.djangoproject.com/ticket/29084#comment:2>
* type: Uncategorized => Cleanup/optimization
* stage: Unreviewed => Accepted
Comment:
Skipping the tests is an option. Feel free to propose something else.
--
Ticket URL: <https://code.djangoproject.com/ticket/29084#comment:3>
Comment (by Дилян Палаузов):
I propose explicitly passing 'english' as the first parameter to both
{{{to_tsvector}}} and {{{plainto_tsquery}}}:
{{{
diff --git a/tests/postgres_tests/test_search.py b/tests/postgres_tests/test_search.py
index b93077f..b721f1f 100644
--- a/tests/postgres_tests/test_search.py
+++ b/tests/postgres_tests/test_search.py
@@ -88,11 +88,11 @@ class SimpleSearchTest(GrailTestData, PostgreSQLTestCase):
         self.assertSequenceEqual(searched, [self.verse1])
 
     def test_non_exact_match(self):
-        searched = Line.objects.filter(dialogue__search='hearts')
+        searched = Line.objects.annotate(search=SearchVector('dialogue', config='english')).filter(search=SearchQuery('hearts', config='english'))
         self.assertSequenceEqual(searched, [self.verse2])
 
     def test_search_two_terms(self):
-        searched = Line.objects.filter(dialogue__search='heart bowel')
+        searched = Line.objects.annotate(search=SearchVector('dialogue', config='english')).filter(search=SearchQuery('heart bowel', config='english'))
         self.assertSequenceEqual(searched, [self.verse2])
 
     def test_search_two_terms_with_partial_match(self):
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/29084#comment:4>
Comment (by Tim Graham):
The test needs to test `__search`.
--
Ticket URL: <https://code.djangoproject.com/ticket/29084#comment:5>
Comment (by Дилян Палаузов):
I don't know. Take the words as they are: "heart" and "bowels":
{{{
diff --git a/tests/postgres_tests/test_search.py b/tests/postgres_tests/test_search.py
--- a/tests/postgres_tests/test_search.py
+++ b/tests/postgres_tests/test_search.py
@@ -88,11 +88,11 @@ class SimpleSearchTest(GrailTestData, PostgreSQLTestCase):
         self.assertSequenceEqual(searched, [self.verse1])
 
     def test_non_exact_match(self):
-        searched = Line.objects.filter(dialogue__search='hearts')
+        searched = Line.objects.filter(dialogue__search='heart')
         self.assertSequenceEqual(searched, [self.verse2])
 
     def test_search_two_terms(self):
-        searched = Line.objects.filter(dialogue__search='heart bowel')
+        searched = Line.objects.filter(dialogue__search='heart bowels')
         self.assertSequenceEqual(searched, [self.verse2])
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/29084#comment:6>
* owner: (none) => Pablo Nicolas Estevez
* status: new => assigned
Comment:
Hi, I will try to solve the problem.
--
Ticket URL: <https://code.djangoproject.com/ticket/29084#comment:7>
* status: assigned => closed
* resolution: => fixed
Comment:
The tests require this setting in postgresql.conf:
default_text_search_config = 'pg_catalog.english'
In case the configuration uses another language, I wrote a pull request
to skip the tests:
https://github.com/django/django/pull/16357
--
Ticket URL: <https://code.djangoproject.com/ticket/29084#comment:8>
* status: closed => new
* has_patch: 0 => 1
* resolution: fixed =>
Comment:
The ticket isn't closed until the fix is committed.
--
Ticket URL: <https://code.djangoproject.com/ticket/29084#comment:9>
* status: new => assigned
* needs_better_patch: 0 => 1
--
Ticket URL: <https://code.djangoproject.com/ticket/29084#comment:10>
* needs_better_patch: 1 => 0
* stage: Accepted => Ready for checkin
--
Ticket URL: <https://code.djangoproject.com/ticket/29084#comment:11>
* status: assigned => closed
* resolution: => fixed
Comment:
In [changeset:"e673c87b5620a0801432a3d628508a09522e8e2b" e673c87]:
{{{
#!CommitTicketReference repository=""
revision="e673c87b5620a0801432a3d628508a09522e8e2b"
Fixed #29084 -- Skipped some postgres_tests.test_search tests when
pg_catalog isn't English.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/29084#comment:12>