Thanks everyone. Querying Elasticsearch directly and/or writing a custom backend are both great ideas. I realized quickly that forking the wagtailsearch app was a fool's errand. There are a lot of ways to skin this cat, but here's the technique I came up with: a custom search app that reuses and modifies a small amount of Wagtail core. I can now not only return mixed results from Wagtail and non-Wagtail models, but also use a custom sub-template for each content type in the results, e.g. a Page-derived result can display seo_description, while a Profile result can show "about" text and a user avatar. Here's a little recipe/sketch in case it's useful to anyone in the future.
On your non-wagtail model, inherit from index.Indexed:
    from wagtail.wagtailsearch import index
    ...
    class Profile(models.Model, index.Indexed):
and define the search fields you want indexed:
        # Wagtail/elastic search
        search_fields = (
            index.SearchField('user', partial_match=True, boost=10),
            index.SearchField('about', partial_match=True, boost=10),
            index.SearchField('site_org_title', partial_match=True),
            index.SearchField('site_personal_title', partial_match=True),
        )
Also make sure your custom model has defined `get_absolute_url()` (you'll need it in the search results):
        def get_absolute_url(self):
            return reverse('people_profile_detail', args=[str(self.user.username)])
Edit a profile and test that the content is being added to the elastic index:
    from wagtail.wagtailsearch.backends import get_search_backend
    s = get_search_backend()
    from people.models import Profile
    s.search("foobar", Profile)
Create a new app in your project called "search":
    ./manage.py startapp search
and add it to INSTALLED_APPS.
In your main urls.py, override the default wagtail search with your own:
    from search import urls as search_urls
    url(r'^search/', include(search_urls)),
Modify your search/views.py to get results from other models and append them to the Page results. We use the "chain" function from itertools to combine multiple result sets into a single list. Note that we also attach an "appname_modelname" property (template_name) to each instance in the results, which we'll use to select a customized display template for each result.
    from itertools import chain
    from wagtail.wagtailsearch.backends import get_search_backend
    from people.models import Profile
    ...
        # Search
        if query_string != '':
            page_results = models.Page.search(
                query_string,
                show_unpublished=show_unpublished,
                search_title_only=search_title_only,
                extra_filters=extra_filters,
                path=path if path else request.site.root_page.path
            )

            # Also query non-wagtail models
            backend = get_search_backend()
            profile_results = backend.search(query_string, Profile)

            # Combine both result sets into one list (Profiles first, then Pages)
            search_results = list(chain(profile_results, page_results))

            # Append a template name to each element so we can render content types differently
            for result in search_results:
                meta = result.get_indexed_instance()._meta
                result.template_name = '{app}_{model}'.format(
                    app=meta.app_label,
                    model=meta.model_name
                )
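To see the chain-and-label step in isolation, here is a minimal sketch that runs without Wagtail or Django. The `fake_result` helper and the `label_results` function are hypothetical stand-ins I made up for illustration; they only mimic the `_meta.app_label` and `_meta.model_name` attributes that real model instances expose, and they skip the `get_indexed_instance()` call used in the view.

```python
from itertools import chain
from types import SimpleNamespace


def label_results(*result_sets):
    """Chain several result iterables and tag each item with 'app_model'."""
    combined = list(chain(*result_sets))
    for result in combined:
        meta = result._meta
        result.template_name = '{app}_{model}'.format(
            app=meta.app_label,
            model=meta.model_name,
        )
    return combined


def fake_result(app_label, model_name, title):
    # Hypothetical stand-in for a model instance; it only mimics `_meta`.
    return SimpleNamespace(
        _meta=SimpleNamespace(app_label=app_label, model_name=model_name),
        title=title,
    )


profile_results = [fake_result('people', 'profile', 'Jane')]
page_results = [
    fake_result('blog', 'blogpage', 'Hello'),
    fake_result('core', 'homepage', 'Home'),
]

labelled = label_results(profile_results, page_results)
print([r.template_name for r in labelled])
# ['people_profile', 'blog_blogpage', 'core_homepage']
```

The order of the arguments decides the order of the combined list, so whichever result set is passed first is displayed first.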
Now in your search_results.html, call different templates per content type in the results loop:
    {% for result in search_results %}
        <div class="panel panel-default">
            <div class="panel-body">
                {% with template_name=result.template_name|stringformat:"s"|add:".html" %}
                    {% include "search/includes/"|add:template_name %}
                {% endwith %}
                <p><small>page type: {{ result.template_name }}</small></p>
            </div>
        </div>
    {% empty %}
        <p>No results found</p>
    {% endfor %}
The individual result templates are standard stuff.
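One thing to watch with per-type includes is a content type that has no dedicated template yet. As a rough sketch (not part of the recipe above), the path could be built in Python together with a generic fallback. The `result_template_candidates` helper and the `default.html` fallback name are assumptions of mine; a list like this could then be handed to Django's `select_template()` from `django.template.loader`, which returns the first template that exists.

```python
def result_template_candidates(template_name, base='search/includes/',
                               fallback='default'):
    """Return template paths to try for one result, most specific first."""
    return [
        '{0}{1}.html'.format(base, template_name),
        # Hypothetical generic template used when no type-specific one exists.
        '{0}{1}.html'.format(base, fallback),
    ]


candidates = result_template_candidates('people_profile')
print(candidates)
# ['search/includes/people_profile.html', 'search/includes/default.html']
```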
This approach gives you a lot of control and flexibility, but it does raise interesting questions about how to combine results. In this example we end up showing all Profile results first, then the Page-derived content, because that is the order passed to chain(). If you wanted to interleave them, how would you order them, given that Profiles (probably) don't have a publish date and may not be considered as important? Every use case is different and depends on which fields are or are not available. This in turn explains why Wagtail only provides Page-based search by default: there are too many open questions once you venture outside of that.
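As one possible answer to the ordering question, here is a small sketch of a round-robin merge, which alternates between result sets instead of listing one type after the other. This is just one option under the assumption that each set is already ordered the way you want; the `interleave` function is a hypothetical helper, and plain strings stand in for real result objects.

```python
from itertools import zip_longest

# Sentinel used to pad the shorter result sets.
_MISSING = object()


def interleave(*result_sets):
    """Round-robin merge that preserves each set's own ordering."""
    merged = []
    for group in zip_longest(*result_sets, fillvalue=_MISSING):
        merged.extend(item for item in group if item is not _MISSING)
    return merged


pages = ['page-1', 'page-2', 'page-3']
profiles = ['profile-1']

print(interleave(pages, profiles))
# ['page-1', 'profile-1', 'page-2', 'page-3']
```

This keeps the top hit of every content type near the top of the list, at the cost of ignoring any notion of relative importance between types.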
I may go for a more refined approach in the future, but this works and is fairly clean.
./s