Cost of Delta indexing on Heroku

Tom Smyth

Sep 10, 2014, 10:54:24 AM
to thinkin...@googlegroups.com
Hi there. I'm wondering about the cost of delta indexing on Heroku. The Flying Sphinx add-on says the 'Ceramic' plan is the cheapest one to support delta indexing, but it's $55/mo vs $12/mo for the 'Wooden' plan. Is this because of the requirement for a worker dyno? That is to say:
  1. is the cost of the extra worker dyno mentioned here included in the $55/mo, or
  2. is the cost of the worker dyno on top of the $55, or
  3. can you actually do delta indexing using ts-resque-delta with the Wooden plan?

Thanks!

Pat Allan

Sep 10, 2014, 11:13:58 AM
to thinkin...@googlegroups.com
Hi Tom

The worker dyno is *not* included in the $55, and there’s no way to use delta indices with the Wooden plan.

I realise that means there’s a *significant* jump in cost if you want to use deltas - I spent a lot of time working through the numbers to make sure the servers would be powerful enough at each level and the plans reasonably priced when I first got Flying Sphinx off the ground, but I just couldn’t find a way to have the resources required for delta indexing at such a low price point.

An alternative, though, is to switch to real-time indices. They can be used with any plan level, the extra overhead from a Flying Sphinx perspective is minimal, and they *don’t* require a worker dyno. The one catch is that Flying Sphinx does not back up real-time data, as backups are tied to processing of indices, and that doesn’t apply with real-time indices. I do plan to fix this situation, it’s just a bit complex, but I’ll hopefully get to it soon. That said, crashes are reasonably infrequent (*touch wood*) on the Wooden plans, and almost non-existent on the higher plans.

… here’s hoping I’ve not pushed my luck too far with the server gods…

To recap: deltas with a Wooden plan are sadly not possible (as much as I wish they could be) but real-time indices are definitely worth looking into.

Any further questions, do let me know.

Kind regards,

— 
Pat

--
You received this message because you are subscribed to the Google Groups "Thinking Sphinx" group.
To unsubscribe from this group and stop receiving emails from it, send an email to thinking-sphi...@googlegroups.com.
To post to this group, send email to thinkin...@googlegroups.com.
Visit this group at http://groups.google.com/group/thinking-sphinx.
For more options, visit https://groups.google.com/d/optout.

Tom Smyth

Sep 11, 2014, 9:10:17 AM
to thinkin...@googlegroups.com
Interesting!

I saw real-time indexing but then I saw this on https://devcenter.heroku.com/articles/flying_sphinx#delta-indexing-ruby-only:

"Please note: at this point in time, Flying Sphinx does not yet support Sphinx’s realtime indices. If you’d like this feature, please contact Flying Sphinx support."

So is that note outdated? Or am I currently talking to Flying Sphinx support? :)

As for the backup situation, am I correct in hearing you as saying that I basically would need to run ts:generate each time the server starts up? That doesn't seem horrible. My current indices only take a few seconds to build. I'm new to Heroku though -- is there any way to automate that process? I imagine there would be...

Then, for the capabilities of realtime indices, I'm assuming they're basically a superset of regular indices? Put another way, we can use the same index file, but just add
:with => :real_time?

Thanks, Pat, for all you do.

Tom

Pat Allan

Sep 25, 2014, 2:45:11 AM
to thinkin...@googlegroups.com
Woah, this one slipped back in my queue (I’ve been in SF to speak at a conference, and my inbox is now well and truly out of control). Sorry Tom!

That note is indeed outdated (and yes, I run Flying Sphinx and maintain Thinking Sphinx).

You would only need to run ts:generate (well, fs:regenerate) if the Flying Sphinx server you’re hosted on goes down (which is noted on http://status.flying-sphinx.com). There have been a couple of issues lately, but for the most part it’s smooth sailing.

As for how real-time indices are set up - very similar to SQL-backed indices, but you’re dealing with methods, not columns. This means a few things: firstly, all method chains for fields and attributes should return a single value (unless it’s a multi-value attribute, then it should return an array). Defining methods to help the manual aggregation is recommended. For example: post has many comments, and you want to index the comment text in your post model:

    # in app/models/post.rb
    def comments_text
      comments.pluck(:text).join(' ')
    end

    # in app/indices/post_index.rb
    indexes comments_text

Attribute types need to be specified explicitly: the values come from methods rather than database columns, so the types cannot be determined automatically.

    has comment_ids, :type => :integer, :multi => true
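
To make the "single value for fields, array for multi-value attributes" rule concrete, here is a minimal plain-Ruby sketch. The `Comment` struct and the stripped-down `Post` class are hypothetical stand-ins for the ActiveRecord models (so it runs without Rails), and `map` stands in for `pluck`; only the shape of the return values matters here.

```ruby
# Hypothetical stand-ins for the ActiveRecord models, for illustration only.
Comment = Struct.new(:id, :text)

class Post
  attr_reader :comments

  def initialize(comments)
    @comments = comments
  end

  # Source for a field: many comment bodies collapsed into ONE string.
  def comments_text
    comments.map(&:text).join(' ')
  end

  # Source for a multi-value attribute: an array of integers.
  def comment_ids
    comments.map(&:id)
  end
end

post = Post.new([Comment.new(1, 'First!'), Comment.new(2, 'Nice post')])
puts post.comments_text   # one String, the shape `indexes comments_text` expects
p post.comment_ids        # an Array of Integers, matching :type => :integer, :multi => true
```

In a real model the same two methods would sit alongside the associations, and the index definition would simply name them.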

And yes, instead of :with => :active_record, use :with => :real_time
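
Putting those pieces together, a real-time index file might look roughly like the sketch below. It is a configuration fragment for a Rails app with Thinking Sphinx v3 loaded, not a standalone script; the `title` field is a hypothetical extra, and the `after_save` callback comes from the Thinking Sphinx documentation rather than from this thread (real-time indices are populated from Ruby, so records have to be pushed to Sphinx when they are saved).

```ruby
# app/indices/post_index.rb
ThinkingSphinx::Index.define :post, :with => :real_time do
  indexes title                # hypothetical plain field
  indexes comments_text        # helper method that returns a single string
  has comment_ids, :type => :integer, :multi => true
end

# app/models/post.rb (inside the Post class):
#   after_save ThinkingSphinx::RealTime.callback_for(:post)
```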

Any further questions, do ask - I’ll try to be more prompt in my responses!

— 
Pat

Tom Smyth

Sep 25, 2014, 7:55:37 AM
to thinkin...@googlegroups.com
Thanks so much Pat. When I get around to migrating to Heroku you may hear from me again :)

--
Tom Smyth
Worker-Owner, Sassafras Tech Collective
Specializing in innovative, usable tech for social change 
sassafras.coop · @sassafrastech