records stop being indexed every few days

David Krmpotic

Oct 25, 2014, 6:49:32 PM
to thinkin...@googlegroups.com
Hi, I use TS 3.1.1 and Sphinx 2.2.5 on the server and this is my index definition:

ThinkingSphinx::Index.define :post, with: :active_record, delta: true do
  indexes :text
  indexes :tags
  
  has :user_id
  has :created_at
end

Since upgrading both TS and Sphinx, I notice after every few days of webapp usage that new records are no longer being indexed.

I checked, and they really are not in the Sphinx post_core or post_delta indices.

After reindexing and restarting Sphinx, it's OK for a few days, then the same thing happens.

How can I troubleshoot this further?

thank you
david

Pat Allan

Oct 26, 2014, 12:01:52 AM
to thinkin...@googlegroups.com
Hi David

In recent versions of Thinking Sphinx (v3.1.1 definitely, and I think 3.1.0 too) guard files are put in place while indexing occurs, to avoid an index being processed multiple times at once. Unfortunately, these guard files aren't cleared out when an exception is raised during indexing... have a look in the folder of your index files, should be easy enough to spot.

In the upcoming v3.1.2 release, there'll be better logging to note if these guard files are blocking indexing requests, and they'll also be cleared out if an exception is raised.
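
Until then, a minimal Ruby sketch for spotting leftover guard files. It assumes they are named like ts-<index name>.tmp and sit directly in the folder with the index files; that naming is an assumption, so look at what is actually in your folder first. The helper name stale_guard_files and the 30-minute threshold are made up for illustration.

```ruby
require "fileutils"

# Assumption: guard files look like "ts-<index name>.tmp" and live in the
# same directory as the index files. Adjust GUARD_PATTERN if yours differ.
GUARD_PATTERN = "ts-*.tmp"

# Returns the guard files in +dir+ whose modification time is more than
# +max_age+ seconds in the past, i.e. files that have probably outlived
# the indexing run that created them.
def stale_guard_files(dir, max_age: 30 * 60, now: Time.now)
  Dir.glob(File.join(dir, GUARD_PATTERN)).select do |path|
    now - File.mtime(path) > max_age
  end.sort
end
```

Calling stale_guard_files("path/to/your/index/folder") returns the suspicious files; whether it is safe to delete them depends on no indexing run being in progress at the time.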

— 
Pat

--
You received this message because you are subscribed to the Google Groups "Thinking Sphinx" group.
To unsubscribe from this group and stop receiving emails from it, send an email to thinking-sphi...@googlegroups.com.
To post to this group, send email to thinkin...@googlegroups.com.
Visit this group at http://groups.google.com/group/thinking-sphinx.
For more options, visit https://groups.google.com/d/optout.

David Krmpotic

Oct 28, 2014, 3:20:54 PM
to thinkin...@googlegroups.com
Pat, thank you for the fast response...

Actually, in this case I'm at fault: because of my new deployment strategy, the Sphinx folder with the indices got stranded on each deploy. Now I want to symlink it, but I'm having some trouble with this.

The manual says I can use:

indices_location: "RAILS_ROOT/tmp/sphinx"

in thinking_sphinx.yml, but this doesn't get expanded in the generated Sphinx configuration files.

So how can I use RAILS_ROOT there, and is the manual outdated?

thank you

David Krmpotic

Oct 28, 2014, 3:22:57 PM
to thinkin...@googlegroups.com

david@eclipse:~/Projects/tb (master)$ bundle exec rake ts:configure
Generating configuration to /Users/david/Projects/tb/config/development.sphinx.conf
david@eclipse:~/Projects/tb (master)$ cat config/development.sphinx.conf | grep RAILS
  path = RAILS_ROOT/tmp/sphinx/post_core
  path = RAILS_ROOT/tmp/sphinx/post_delta

Pat Allan

Oct 28, 2014, 4:47:31 PM
to thinkin...@googlegroups.com
Hi David

Sorry, the docs there should be more clear - RAILS_ROOT is just a placeholder for people to put their own app directory in. It’s not a magic variable that gets replaced within the TS code. You *can* use ERB within thinking_sphinx.yml, but in this case that wouldn’t quite work, because you really shouldn’t have these files within the app’s current Rails.root - they need to be in a shared directory.

I would avoid the need for symlinks, and just use shared folders instead.
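
To illustrate the difference between a literal placeholder and ERB, here's a small Ruby sketch. It doesn't touch Thinking Sphinx at all; it just runs a YAML string through ERB and then the YAML parser, which is assumed to be roughly what happens to thinking_sphinx.yml. The helper load_settings and the app_root variable are hypothetical stand-ins.

```ruby
require "erb"
require "yaml"

# Renders any ERB tags in a YAML string, then parses the result.
# Hypothetical helper for illustration; not part of Thinking Sphinx.
def load_settings(yaml_text, app_root)
  rendered = ERB.new(yaml_text).result_with_hash(app_root: app_root)
  YAML.safe_load(rendered)
end

literal = <<~YAML
  production:
    indices_location: "RAILS_ROOT/tmp/sphinx"
YAML

with_erb = <<~YAML
  production:
    indices_location: "<%= app_root %>/tmp/sphinx"
YAML

LITERAL_SETTINGS = load_settings(literal, "/var/www/tb/current")
ERB_SETTINGS = load_settings(with_erb, "/var/www/tb/current")

puts LITERAL_SETTINGS["production"]["indices_location"]
# prints RAILS_ROOT/tmp/sphinx (the placeholder is left untouched)
puts ERB_SETTINGS["production"]["indices_location"]
# prints /var/www/tb/current/tmp/sphinx
```

The plain string passes through unchanged, which matches the RAILS_ROOT/... paths showing up in the generated config, while the ERB tag is evaluated before the YAML is parsed.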

Cheers

— 
Pat

David Krmpotic

Oct 28, 2014, 4:59:41 PM
to thinkin...@googlegroups.com
Pat,

Thank you for the fast response. I already have tmp linked to a shared directory on each release (I'm using https://github.com/mina-deploy/mina), and production.sphinx.conf is symlinked as well.

so:
/var/www/tb/current/tmp -> /var/www/tb/shared/tmp
/var/www/tb/current/config/production.sphinx.conf -> /var/www/tb/shared/config/production.sphinx.conf

Is this what you meant, or is there an even better way? I'm already using shared directories here and it works nicely.

In thinking_sphinx.yml I now use:

indices_location: "<%= Rails.root %>/tmp/sphinx"

regards,
david

Pat Allan

Oct 28, 2014, 5:02:21 PM
to thinkin...@googlegroups.com
Hi David

In this situation, I would do the following instead:

  indices_location: /var/www/tb/shared/sphinx

Thus, no symlinks, and the location for the Sphinx files wouldn’t change.

You’d want to do the same thing for log and pid files as well :)

Cheers

— 
Pat

David Krmpotic

Oct 28, 2014, 5:14:57 PM
to thinkin...@googlegroups.com
Hmm, what about my local machine then? The path there is different, and that's what was bothering me.

David Krmpotic

Oct 28, 2014, 5:19:18 PM
to thinkin...@googlegroups.com
Also, in the future I or someone else may deploy it into some other directory on their server. With mina, it's common practice to just set this:

set :shared_paths, ['config/database.yml', '.env', '.ruby-version', 'tmp', 'log', 'config/production.sphinx.conf']

PS: the pid and log files already work correctly because they are placed in the log directory, which is symlinked to shared/log.

Only the index wasn't OK sitting in db, along with the generated config file (which would be regenerated if missing, but it's still better now).

Pat Allan

Oct 28, 2014, 5:25:37 PM
to thinkin...@googlegroups.com
I’d only be putting in these custom settings for production/staging environments, and leave development using the defaults.
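
A sketch of what that could look like: thinking_sphinx.yml is keyed by environment, so only production gets custom paths and development has no entry at all. The Ruby below just parses such a file and looks up a setting with a fallback. The helper setting_for and the fallback value are made up for illustration; this is not how Thinking Sphinx resolves its defaults internally.

```ruby
require "yaml"

# Only production is listed; development is deliberately absent.
# Paths are the ones mentioned elsewhere in this thread.
CONFIG = YAML.safe_load(<<~YAML)
  production:
    indices_location: /var/www/tb/shared/sphinx
    log: /var/www/tb/shared/log/production.searchd.log
    pid_file: /var/www/tb/shared/log/production.sphinx.pid
YAML

# Hypothetical lookup: use the environment's own value if present,
# otherwise fall back to the given default.
def setting_for(config, environment, key, default)
  config.fetch(environment, {}).fetch(key, default)
end

puts setting_for(CONFIG, "production", "indices_location", "db/sphinx/production")
puts setting_for(CONFIG, "development", "indices_location", "db/sphinx/development")
```

Production picks up the shared path, while development falls through to whatever default is supplied, so nothing about the local machine has to be written into the file.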

As for shared paths, I guess you could symlink it - I think Sphinx will work fine in that scenario too. Not sure I’d put it in tmp from a semantics perspective, but maybe instead have db/sphinx symlinked as well.

— 
Pat

David Krmpotic

Oct 28, 2014, 5:31:29 PM
to thinkin...@googlegroups.com
I see. The only thing is that then I have duplication. I only want this:

set :deploy_to, '/var/www/tb'

to be specified in the deploy file and nowhere else.

So I'll keep it like this. Yes, your suggestion about db/sphinx is good, but I don't quite like it there because it would be the only thing of this kind that is not in either tmp or log. When I was looking for it earlier, I first checked those two places and only then found out that it actually lived in db. It could go either way, maybe I'll put it back there :)

THANK YOU
david

David Krmpotic

Oct 28, 2014, 5:35:06 PM
to thinkin...@googlegroups.com
Actually you're right, of course... if I keep it there, then I don't even have to add the indices_location setting etc.

I thought about this before but for some reason didn't think that I could symlink just db/sphinx, not the entire db directory.. not sure why I thought that.

ok, great, this is it..
regards
david

David Krmpotic

Nov 11, 2014, 8:06:02 AM
to thinkin...@googlegroups.com
Hi, it's me again...

I bumped into an issue with this recommended approach...

I get configuration like this:

  log = /var/www/tb/shared/log/production.searchd.log
  query_log = /var/www/tb/shared/log/production.searchd.query.log
  pid_file = /var/www/tb/shared/log/production.sphinx.pid
  workers = threads
  binlog_path = /var/www/tb/shared/tmp/binlog/production

which is good, but look at the indices path:

  path = /var/www/tb/releases/33/db/sphinx/production/post_core

I'm not sure why this expanded to point to a specific release instead of /var/www/tb/shared/db/sphinx/....

db/sphinx is symlinked to /var/www/tb/shared/db/sphinx/

So this setup works until I deploy more than 5 times and the release gets deleted (I keep the last 5 versions). After that, the indices path points to a non-existent folder.




Pat Allan

Nov 11, 2014, 9:17:14 PM
to thinkin...@googlegroups.com
Hi David

If you’re symlinking the directory, then that’s just for persisting files - it doesn’t influence the path that Thinking Sphinx generates (it’s still using the default, which has the release in the path). So, I think setting indices_location is the best approach here.
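
A small Ruby sketch of how the release number can end up in the path. It assumes the app root is resolved to its real path (so current becomes releases/33) before the default db/sphinx/<environment> is appended; that is an assumption based on the generated config above, not a statement about the Thinking Sphinx internals. The directory layout is a throwaway stand-in built in a temp dir, and generated_index_path is a made-up name.

```ruby
require "fileutils"
require "tmpdir"

# Builds a throwaway deploy layout under +base+ and returns the index path
# that results from resolving the app root first and appending afterwards.
def generated_index_path(base)
  shared  = File.join(base, "shared", "db", "sphinx")
  release = File.join(base, "releases", "33")
  current = File.join(base, "current")

  FileUtils.mkdir_p(shared)
  FileUtils.mkdir_p(File.join(release, "db"))
  File.symlink(shared, File.join(release, "db", "sphinx")) # db/sphinx -> shared
  File.symlink(release, current)                           # current -> release

  app_root = File.realpath(current) # symlink resolved: .../releases/33
  File.join(app_root, "db", "sphinx", "production", "post_core")
end

Dir.mktmpdir do |base|
  puts generated_index_path(base)
end
```

The files themselves land in the shared folder because of the db/sphinx symlink, but the path string still names the release, so it stops working once that release directory is removed. An absolute indices_location avoids the app root altogether.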

— 
Pat

David Krmpotic

Nov 12, 2014, 12:45:55 PM
to thinkin...@googlegroups.com
Pat,

all clear now... 

thank you again
david
