ELSA/Sphinx Issue: searchd not listening on port 9306, process is running.


Lancelot

Feb 5, 2015, 3:53:36 PM
to securit...@googlegroups.com
Bear with me, I'm only two weeks into SO, but loving it. Have not fully wrapped my head around ELSA\Sphinx\searchd\indexer relationships just yet, so could be missing something simple.

ELSA only shows data up to 1-27.

sostat shows syslog-ng running and listening on 514, mysql running and listening on 3306, but Sphinx running and NOT listening on 9306 (connection refused).

ps ax|grep searchd shows the process running
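A running process and a listening socket are separate things, so it can help to test the port directly. Below is a minimal sketch, not part of sostat or ELSA; the helper name `port_is_listening` is made up for illustration. It simply attempts a TCP connection, and the ports are the ones mentioned in this post.

```python
import socket


def port_is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Ports taken from the post: mysql on 3306, searchd on 9306 and 9312.
    for name, port in (("mysql", 3306), ("searchd", 9306), ("searchd", 9312)):
        state = "listening" if port_is_listening("127.0.0.1", port) else "NOT listening"
        print(f"{name} port {port}: {state}")
```

If searchd shows up in the process list while its port is reported as not listening, the daemon has probably failed during startup before binding, and its log is the next place to look.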

sphinx.conf looks good: listen = 0.0.0.0:9306:/mysql41 and listen = 0.0.0.0:9312

grep -a7 nodes /etc/elsa_web.conf shows sphinx_port correctly set to 9306

Running the indexer manually while pointing to sphinx.conf with --all showed some corruption, which mysqlcheck verified on syslogs_index_1.frm; I ran recover on that and then re-verified it was no longer in crashed/corrupt state.


Not sure where else to look, or what I might be misunderstanding.

Lancelot

Feb 5, 2015, 4:00:39 PM
to securit...@googlegroups.com
Forgot to attach sostat, now attached.
sostat-redacted.txt

Doug Burks

Feb 6, 2015, 8:09:56 AM
to securit...@googlegroups.com
You have lots of buffers in queue. My guess would be that you had a
power outage or other ungraceful shutdown which not only corrupted
sphinx but also mysql. Take a look at the logs in
/nsm/elsa/data/elsa/log/ for any additional clues and see if your
symptoms match this thread:
https://groups.google.com/d/topic/security-onion/O3uBjCR5jYk/discussion

On Thu, Feb 5, 2015 at 4:00 PM, Lancelot <rkfr...@gmail.com> wrote:
> Forgot to attach sostat, now attached.
>
> --
> You received this message because you are subscribed to the Google Groups "security-onion" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to security-onio...@googlegroups.com.
> To post to this group, send email to securit...@googlegroups.com.
> Visit this group at http://groups.google.com/group/security-onion.
> For more options, visit https://groups.google.com/d/optout.



--
Doug Burks
Need Security Onion Training or Commercial Support?
http://securityonionsolutions.com

Lancelot

Feb 6, 2015, 10:54:50 AM
to securit...@googlegroups.com
Thanks Doug! The searchd.log gave the hint:


[Thu Feb 5 22:57:20.997 2015] [25144] listening on all interfaces, port=9306
[Thu Feb 5 22:57:20.997 2015] [25144] listening on all interfaces, port=9312
[Thu Feb 5 22:57:21.020 2015] [25144] WARNING: index 'real_1': preload: invalid meta file /nsm/elsa/data/sphinx/real_1.meta; NOT SERVING
[Thu Feb 5 22:57:21.023 2015] [25144] WARNING: index 'real_2': preload: invalid meta file /nsm/elsa/data/sphinx/real_2.meta; NOT SERVING
[Thu Feb 5 22:57:24.971 2015] [25144] FATAL: invalid meta file /var/lib/sphinxsearch/data/binlog.meta

Removing binlog.meta solved the issue: sphinxsearch now starts up and is listening on 9306.
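For anyone scanning a long searchd.log for the same kind of hint, here is a minimal sketch that pulls out only the WARNING and FATAL entries. The regular expression is an assumption based on the line format shown in the excerpt above, and `find_problems` is a hypothetical helper name, not a Sphinx tool.

```python
import re

# Assumed line format, taken from the excerpt above:
# [timestamp] [pid] LEVEL: message   (LEVEL is absent on informational lines)
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+\[\s*(?P<pid>\d+)\]\s+"
    r"(?:(?P<level>WARNING|FATAL):\s+)?(?P<msg>.*)$"
)


def find_problems(lines):
    """Return (level, message) pairs for WARNING and FATAL log lines."""
    problems = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("level"):
            problems.append((match.group("level"), match.group("msg")))
    return problems


SAMPLE = """\
[Thu Feb 5 22:57:20.997 2015] [25144] listening on all interfaces, port=9306
[Thu Feb 5 22:57:20.997 2015] [25144] listening on all interfaces, port=9312
[Thu Feb 5 22:57:21.020 2015] [25144] WARNING: index 'real_1': preload: invalid meta file /nsm/elsa/data/sphinx/real_1.meta; NOT SERVING
[Thu Feb 5 22:57:21.023 2015] [25144] WARNING: index 'real_2': preload: invalid meta file /nsm/elsa/data/sphinx/real_2.meta; NOT SERVING
[Thu Feb 5 22:57:24.971 2015] [25144] FATAL: invalid meta file /var/lib/sphinxsearch/data/binlog.meta
"""

if __name__ == "__main__":
    for level, message in find_problems(SAMPLE.splitlines()):
        print(f"{level}: {message}")
```

Note that the log reports "listening on all interfaces" before the FATAL entry, so those two lines alone do not prove the daemon stayed up.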

I noticed in the ELSA web interface that I have 4 extra days' worth of data now: instead of stopping at 1-27 it stops at 2-1. sostat still shows all the buffer files. I have been patient for 90 minutes and nothing seems to have changed. There are 3862 buffer files. Thoughts on next steps to get those buffers read and inserted into SQL?

mysqlcheck -A shows all DBs as OK

Lancelot

Feb 6, 2015, 11:29:29 AM
to securit...@googlegroups.com
I'm not knowledgeable enough yet to know if there are any clues in here, but I notice the pid is there in some entries and null in others, and some have null for start/end. Is it best to just nuke ELSA and start fresh? I don't mind losing the data at this point if I can start fresh and good going forward. If this is the best route, what is the preferred method of dropping tables/deleting files?

http://i.imgur.com/hYwDwEV.png

Lancelot

Feb 6, 2015, 1:11:40 PM
to securit...@googlegroups.com
As a test I took a snapshot of the vm, truncated the buffers table, and manually removed the buffer files in /elsa/temp. Doesn't really seem like it helped, and the buffer seems to be filling up again. Where should I be looking next to discover why the buffer is not being read?

sostat ELSA:
http://i.imgur.com/P3cWHVt.png

node.log:
http://imgur.com/gF9yAnk

searchd.log looks good now

Heine Lysemose

Feb 6, 2015, 2:11:22 PM
to securit...@googlegroups.com

Could you post another sudo sostat-redacted... I'm curious about your stat on mysql...

Regards,
Lysemose

Lancelot

Feb 6, 2015, 3:20:10 PM
to securit...@googlegroups.com
Attached.
sostat-redacted2.txt

Lancelot

Feb 6, 2015, 5:58:37 PM
to securit...@googlegroups.com
I notice these errors in node.log:

ERROR [2015/02/06 22:54:20] /opt/elsa/web/../node//Indexer.pm (3028) Indexer::_get_index_schema 17508 [undef]


FATAL: failed to load header: failed to open /nsm/elsa/data/sphinx/temp_71.sph: No such file or directory.

Looking in that directory, I don't see any .sph files (but I do see all of the other .sp* files). At this point I'm very willing to start fresh on ELSA/Sphinx if someone would help me with the direction to do that.

Lancelot

Feb 9, 2015, 9:42:03 AM
to securit...@googlegroups.com
Any additional thoughts on this? I really don't mind starting over with ELSA and wiping all the data concerned in order to get it working. What are my options for going that route? I don't see any documentation on "rebooting ELSA/Sphinx/syslog."

Doug Burks

Feb 9, 2015, 4:46:49 PM
to securit...@googlegroups.com
You could try looking at /usr/bin/sosetup to see how it configures
ELSA. If all else fails, you can simply re-run Setup (note this will
wipe all of your config and data).

Lancelot

Feb 10, 2015, 10:12:36 AM
to securit...@googlegroups.com
Looks like I have this resolved. Thanks to everyone who tried to lend assistance.

For anyone else who runs into a similar issue, it appears there may be a way to start fresh with ELSA without resorting to the nuke option and re-running Setup. These are just my rookie thoughts, but here is what worked for me:

There looked to be a few ownership issues on ELSA files after I followed the KB on moving NSM/Mysql directories to other storage. I installed SO on another VM, ran through setup, and used that base image to compare against the permissions set on my NAS. Most were correct, but a few were incorrect in the /nsm/elsa/data/sphinx directory. I fixed the ownership issues, then in mysql I dropped the syslog_data database and re-created it, and restarted syslog-ng. Everything picked up (starting fresh) from there OK.

I will note I did this all in baby steps to start out with (dropping a few individual tables at a time, monitoring node.log, etc.). Each little step looked to get things churning along a bit better, but what I feel really did it was simply dropping the entire database and letting the index rebuild from scratch, as otherwise I always seemed to have a mismatch between the sphinx index and headers.
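The comparison against a known-good install can be scripted. This is a minimal sketch, assuming the reference tree from the fresh VM is reachable as a local path with its numeric ownership preserved; the function names are hypothetical and not part of Security Onion. It reports every path present in both trees whose uid, gid, or permission bits differ.

```python
import os
import stat


def ownership_map(root):
    """Map each relative path under root to (uid, gid, permission bits)."""
    result = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            info = os.lstat(full)  # lstat: describe symlinks, do not follow them
            rel = os.path.relpath(full, root)
            result[rel] = (info.st_uid, info.st_gid, stat.S_IMODE(info.st_mode))
    return result


def ownership_diff(reference_root, target_root):
    """Return (path, reference, target) for paths whose ownership or mode differ."""
    reference = ownership_map(reference_root)
    target = ownership_map(target_root)
    diffs = []
    for rel in sorted(reference.keys() & target.keys()):
        if reference[rel] != target[rel]:
            diffs.append((rel, reference[rel], target[rel]))
    return diffs


if __name__ == "__main__":
    import sys

    # Usage: script.py <reference_dir> <target_dir>
    # Both arguments are whatever local paths hold the two trees.
    if len(sys.argv) == 3:
        for rel, ref, tgt in ownership_diff(sys.argv[1], sys.argv[2]):
            print(f"{rel}: reference uid/gid/mode={ref[0]}/{ref[1]}/{ref[2]:o} "
                  f"target={tgt[0]}/{tgt[1]}/{tgt[2]:o}")
```

Numeric uids and gids are only comparable if both systems assign the same ids to the same accounts, so treat any reported difference as something to check by hand rather than as a fix list.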