Search feature (Ticket #76)

RobertGawron

Dec 8, 2009, 4:19:17 PM
to byteflow-hackers
Hi byteflow-hackers,

I made a search engine for byteflow (it's running on my personal blog;
you can try it at http://rgawron.megiteam.pl/search/?q=by%C5%82a)
and I would like to hear your feedback/opinion.

It's only a search form where you enter your phrase, plus a list of
results (with pagination and keyword highlighting). Do you think that
other options are necessary?

I used haystack (http://haystacksearch.org/), which is a search
engine layer that can work with whoosh, solr and xapian. I used whoosh;
this search engine is unavailable for Python 2.4 users (permanently).

There are two ways to update the index: add a cron task, or use a class
that makes updates automatically when a model is saved/deleted (they said
it's inefficient). Which solution should be used, in your opinion?

If you find other bugs, please let me know. I noticed that some of
the keywords aren't highlighted; I will check why this happens later.
When everything is OK I will make a patch.

regards,
Robert Gawron

Alexander Solovyov

Dec 13, 2009, 12:11:41 PM
to byteflow...@googlegroups.com
On 2009-12-08, RobertGawron wrote:

> I made a search engine for byteflow (it's running on my personal blog;
> you can try it at http://rgawron.megiteam.pl/search/?q=by%C5%82a)
> and I would like to hear your feedback/opinion.

> It's only a search form where you enter your phrase, plus a list of
> results (with pagination and keyword highlighting). Do you think that
> other options are necessary?

I think that's enough. It looks really nice. :)

> I used haystack (http://haystacksearch.org/), which is a search
> engine layer that can work with whoosh, solr and xapian. I used whoosh;
> this search engine is unavailable for Python 2.4 users (permanently).

That's actually their problem. ;)

> There are two ways to update the index: add a cron task, or use a class
> that makes updates automatically when a model is saved/deleted (they said
> it's inefficient). Which solution should be used, in your opinion?

As we're not going to have search enabled by default anyway, it would be
OK to have an option named something like "SEARCH_AUTOUPDATE" (whatever),
which is True by default. Then, after enabling search, you can disable it
if you have enabled the cron task.

Though... I hope it updates only the saved/deleted object, right? Not the
whole index?
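
To make the distinction in that question concrete, here is a minimal, self-contained sketch. It is not haystack code: `ToyIndex` and its methods are made-up names for illustration, and the write counter only stands in for indexing cost. It contrasts touching a single object on save with rebuilding everything, as a cron job would.

```python
class ToyIndex:
    """In-memory stand-in for a search index (illustrative, not haystack)."""

    def __init__(self):
        self.documents = {}  # maps object id -> indexed text
        self.writes = 0      # counts how many documents were (re)written

    def update_object(self, obj_id, text):
        # Per-object style: touch only the object that was just saved.
        self.documents[obj_id] = text
        self.writes += 1

    def remove_object(self, obj_id):
        # Per-object style: drop only the object that was just deleted.
        self.documents.pop(obj_id, None)

    def rebuild(self, all_objects):
        # Cron style: discard the index and re-add every object.
        self.documents = {}
        for obj_id, text in all_objects.items():
            self.update_object(obj_id, text)


posts = {1: "first post", 2: "second post", 3: "third post"}

index = ToyIndex()
index.rebuild(posts)  # the initial build writes all three posts
writes_after_build = index.writes

posts[2] = "second post, edited"
index.update_object(2, posts[2])  # saving one post costs one write
writes_for_save = index.writes - writes_after_build
```

In this toy model a save costs one write, while a rebuild costs one write per post, so the per-object path stays cheap as long as saves are rare compared to the size of the blog.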

> If you find other bugs, please let me know. I noticed that some of
> the keywords aren't highlighted; I will check why this happens later.
> When everything is OK I will make a patch.

Cool, I'm eagerly awaiting it.

--
Alexander

RobertGawron

Dec 17, 2009, 4:50:22 PM
to byteflow-hackers
Hi Alex, hi all,

"It looks really nice. :)"

Thanks!

"As we're anyway not going to have search enabled by default, it would
be OK to have an option named something like "SEARCH_AUTOUPDATE"
(whatever), which is True by default. Then, after enabling search, you can
disable it if you have enabled the cron task."

Done. Named HAYSTACK_SEARCH_AUTOUPDATE.
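
One detail worth noting: the patch's get_search_index_class() reads settings.HAYSTACK_SEARCH_AUTOUPDATE directly, while settings.py ships that option commented out. A possible way to get the "True by default" behaviour is a getattr() fallback. The sketch below is only an illustration under assumptions: the two classes are empty stand-ins for haystack's index classes, and passing `settings` as a parameter is a hypothetical change so the snippet runs without Django.

```python
from types import SimpleNamespace


class SearchIndex(object):
    """Stand-in for haystack's indexes.SearchIndex."""


class RealTimeSearchIndex(SearchIndex):
    """Stand-in for haystack's indexes.RealTimeSearchIndex."""


def get_search_index_class(settings):
    # Fall back to True when the option is missing, which matches the
    # "True by default" behaviour discussed in this thread.
    if getattr(settings, "HAYSTACK_SEARCH_AUTOUPDATE", True):
        return RealTimeSearchIndex
    return SearchIndex


# Stand-ins for django.conf.settings in two configurations.
unset = SimpleNamespace()  # option left commented out
cron = SimpleNamespace(HAYSTACK_SEARCH_AUTOUPDATE=False)

default_class = get_search_index_class(unset)
cron_class = get_search_index_class(cron)
```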

You also need to update your templates (files from templates/search)
and add a CSS class named "highlighted" (the default name in haystack;
it can be set to something different if this one is already used in
another context) to make keywords stand out in the results.
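
For readers who wonder what the CSS class is attached to, here is a rough, self-contained approximation of keyword highlighting. It is not haystack's implementation: the function name, its parameters and the matching rules are assumptions for illustration, and it ignores details such as max_length truncation.

```python
import re
from html import escape


def highlight(text, query, css_class="highlighted", html_tag="span"):
    """Wrap each query word found in text in an HTML tag (toy version)."""
    # Longer words first, so a short word does not shadow a longer one.
    words = sorted(set(query.split()), key=len, reverse=True)
    if not words:
        return escape(text)
    pattern = re.compile(
        "(" + "|".join(re.escape(w) for w in words) + ")", re.IGNORECASE)
    # With one capturing group, split() alternates plain text and matches.
    parts = pattern.split(text)
    out = []
    for position, part in enumerate(parts):
        if position % 2:  # odd positions hold the matched keywords
            out.append('<%s class="%s">%s</%s>'
                       % (html_tag, css_class, escape(part), html_tag))
        else:
            out.append(escape(part))
    return "".join(out)


snippet = highlight("Byteflow search was added", "search")
```

A stylesheet rule for `.highlighted` (for example a background colour) is then enough to make the matches visible.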

BTW, whoosh deletes all files in HAYSTACK_WHOOSH_PATH while rebuilding
the index, so set this variable to a separate directory (to avoid data
loss).
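
A small sketch of what "a separate directory" could look like in settings.py. The root path /srv/byteflow, the directory name whoosh_index and both helper functions are assumptions made up for this example; only HAYSTACK_WHOOSH_PATH comes from the patch.

```python
import os


def whoosh_index_dir(project_root, name="whoosh_index"):
    """Return a directory meant to hold nothing but the whoosh index."""
    return os.path.join(project_root, name)


def shares_files_with(index_dir, other_dir):
    """True if other_dir is index_dir itself or lives inside it."""
    index_dir = os.path.abspath(index_dir)
    other_dir = os.path.abspath(other_dir)
    return os.path.commonpath([index_dir, other_dir]) == index_dir


# Hypothetical project root; adjust to your own layout.
HAYSTACK_WHOOSH_PATH = whoosh_index_dir("/srv/byteflow")

# Sanity check: the media directory must not sit inside the index directory,
# otherwise an index rebuild could wipe it.
media_is_safe = not shares_files_with(HAYSTACK_WHOOSH_PATH, "/srv/byteflow/media")
```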

"I noticed that some of
the keywords aren't highlighted; I will check why this happens later."

After all, I think it's a bug, so I've sent mail to the haystack mailing
group with a sample + bugfix proposal. The post is waiting for
moderation.

In case of bugs, please let me know so I can fix them.

diff -r 62e994990709 apps/blog/search_indexes.py
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/apps/blog/search_indexes.py Thu Dec 17 22:33:47 2009 +0100
@@ -0,0 +1,24 @@
+from django.conf import settings
+from datetime import datetime as dt
+from haystack import indexes
+from haystack import site
+from blog.models import Post
+
+def get_search_index_class():
+    if settings.HAYSTACK_SEARCH_AUTOUPDATE:
+        return indexes.RealTimeSearchIndex
+    else:
+        return indexes.SearchIndex
+
+class PostIndex(get_search_index_class()):
+    text = indexes.CharField(document=True, use_template=True)
+    name = indexes.CharField(model_attr='name')
+    body = indexes.CharField(model_attr='text')
+    date = indexes.DateField(model_attr='date')
+    is_draft = indexes.BooleanField(model_attr='is_draft')
+
+    def get_queryset(self):
+        return Post.objects.filter(date__lt=dt.now())
+
+site.register(Post, PostIndex)
+
diff -r 62e994990709 apps/search/__init__.py
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/apps/search/__init__.py Thu Dec 17 22:33:47 2009 +0100
@@ -0,0 +1,4 @@
+import haystack
+
+haystack.autodiscover()
+
diff -r 62e994990709 apps/search/urls.py
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/apps/search/urls.py Thu Dec 17 22:33:47 2009 +0100
@@ -0,0 +1,13 @@
+from datetime import datetime as dt
+from django.conf.urls.defaults import *
+from haystack.query import SearchQuerySet
+from haystack.views import SearchView
+
+sqs = SearchQuerySet().filter(date__lt=dt.now()).order_by('-date').filter(is_draft=False)
+
+urlpatterns = patterns('haystack.views',
+    url(r'^$', SearchView(
+        searchqueryset=sqs,
+    ), name='post_search'),
+)
+
diff -r 62e994990709 settings.py
--- a/settings.py Tue Dec 01 14:00:11 2009 +0200
+++ b/settings.py Thu Dec 17 22:33:47 2009 +0100
@@ -145,7 +145,20 @@ INSTALLED_APPS = (
     'openidconsumer',
     'openidserver',
     'revcanonical',
+    # 'tagging_autocomplete',
+    # 'haystack'
 )
+
+# set False if you have ./manage.py update_index in your cron tasks
+#HAYSTACK_SEARCH_AUTOUPDATE = True
+#HAYSTACK_SITECONF = 'apps.search' # don't change it
+#HAYSTACK_SEARCH_RESULTS_PER_PAGE = 21
+#HAYSTACK_SEARCH_ENGINE = '' # whoosh | solr | xapian | dummy
+# see http://haystacksearch.org/docs/tutorial.html#configuration
+#HAYSTACK_WHOOSH_PATH = ''
+#HAYSTACK_SOLR_URL = ''
+#HAYSTACK_XAPIAN_PATH = ''
+

APPEND_SLASH = False
REMOVE_WWW = True
diff -r 62e994990709 templates/search/indexes/blog/post_text.txt
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/templates/search/indexes/blog/post_text.txt Thu Dec 17 22:33:47 2009 +0100
@@ -0,0 +1,2 @@
+{{ object.text }}
+
diff -r 62e994990709 templates/search/search.html
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/templates/search/search.html Thu Dec 17 22:33:47 2009 +0100
@@ -0,0 +1,36 @@
+{% extends "base.html" %}
+{% load i18n %}
+{% load highlight %}
+
+{% block title %}{% trans "Search results for:" %} {{ query }}{% endblock %}
+
+{% block content %}
+
+<h1>{% trans "Search results for:" %} {{ query }}</h1>
+
+{% if not page.object_list %}
+    <p>{% trans "No results found" %}</p>
+{% endif %}
+
+{% if page.object_list %}
+    <ul class="search_results_list">
+    {% for result in page.object_list %}
+        <li>
+            <strong><a href="{{ result.object.get_absolute_url }}">{{ result.name }}</a></strong><br/>
+            {% highlight result.body with query max_length 256 html_tag "span" %}
+        </li>
+    {% endfor %}
+    </ul>
+
+    <div class="pagination" style="float:right">
+    {% if page.has_previous %}
+        <a href="{{ base }}?q={{ query|urlencode }}&amp;page={{ page.previous_page_number }}">&laquo; {% trans "Newer posts" %}</a>
+    {% endif %}
+    |
+    {% if page.has_next %}
+        <a href="{{ base }}?q={{ query|urlencode }}&amp;page={{ page.next_page_number }}">{% trans "Older posts" %} &raquo;</a>
+    {% endif %}
+    </div>
+{% endif %}
+
+{% endblock %}
diff -r 62e994990709 urls.py
--- a/urls.py Tue Dec 01 14:00:11 2009 +0200
+++ b/urls.py Thu Dec 17 22:33:47 2009 +0100
@@ -52,6 +52,7 @@ urlpatterns += patterns(
     url(r'^robots.txt$', include('robots.urls')),
     url(r'^feeds/', include('feed.urls')),
     url(r'^tagging_autocomplete/', include('tagging_autocomplete.urls')),
+    #url(r'^search/', include('search.urls')),
 )

if appcheck.watchlist:


regards,
Robert Gawron

RobertGawron

Dec 19, 2009, 5:16:46 AM
to byteflow-hackers
Hi byteflow hackers,

"After all I think it's a bug so I've sent mail to haystack mailing
group with a sample + bugfix proposal. The post is waiting for
moderation."

FYI, that really was a bug in haystack and it has been fixed.

http://groups.google.com/group/django-haystack/browse_thread/thread/adc1f2901873b265/ec38a0bb447ee69b?show_docid=ec38a0bb447ee69b

http://github.com/toastdriven/django-haystack/issues/closed/#issue/157

regards,
Robert Gawron

Alexander Solovyov

Dec 21, 2009, 6:38:54 AM
to byteflow...@googlegroups.com
On 2009-12-19, RobertGawron wrote:

> Hi byteflow hackers,

> http://groups.google.com/group/django-haystack/browse_thread/thread/adc1f2901873b265/ec38a0bb447ee69b?show_docid=ec38a0bb447ee69b

> http://github.com/toastdriven/django-haystack/issues/closed/#issue/157

Thanks, I pushed an update with the latest version of haystack a few minutes ago.

--
Alexander
