barnyard failing to start upon pulled pork update

Stephane Chazelas

Apr 6, 2012, 11:37:37 AM
to securit...@googlegroups.com
Hiya,

Since the upgrade to 20120326, barnyard2 fails to start properly
after each pulledpork update.

As a workaround, I've updated pulledpork_update.sh to do a
full "service nsm restart" instead of just restarting snort and
barnyard2. That's working for now but it's not ideal.
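
For illustration, the change amounts to something like this at the end of
pulledpork_update.sh (just a sketch -- the exact restart commands in the
stock script vary between Security Onion versions):

# instead of restarting only the two affected processes, e.g.:
#   nsm_sensor_ps-restart --only-snort-alert
#   nsm_sensor_ps-restart --only-barnyard2
# restart the whole NSM stack:
service nsm restart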

I've started to try and find an explanation, but I don't
have any more time to look into it.

In case anyone has the same issue, they could start
investigating based on my findings below.

In short: snort segfaults, which it didn't do before (looks like
Issue 216), and barnyard2 seems to be reprocessing the same snort
entries it had already processed (though I'm not sure of that). It
submits something to snort_agent, snort_agent makes an INSERT SQL
query, and that fails because an INSERT with the same primary key
had already been done.

(from our internal issue report):

Stephane Chazelas 2012-04-02 10:22:46 BST

Barnyard fails with:

ERROR: sguil: Expected Confirm 1393328 and got: Failed to insert 1393328:
mysqlexec/db server: Duplicate entry '3-1393328' for key 'PRIMARY'

in the log file.

It then needs to be restarted, and even then it doesn't always work, or not for
long. Need to figure out what's going on.

Stephane Chazelas 2012-04-03 11:07:59 BST

Enabled mysql log. barnyard has failed again, will have a look.

Stephane Chazelas 2012-04-03 12:30:42 BST

ERROR: sguil: Expected Confirm 1395795 and got: Failed to insert 1395795:
mysqlexec/db server: Duplicate entry '3-1395795' for key 'PRIMARY'
# ls --full-time -l /var/log/nsm/onion-eth1/barnyard2.log
-rw-r--r-- 1 root root 1932 2012-04-03 07:01:59.431239042 +0000
/var/log/nsm/onion-eth1/barnyard2.log

This corresponds to when pulledpork was run:
Apr 3 07:01:01 onion CRON[31060]: (root) CMD ( date >>
/var/log/nsm/pulledpork.log ; /usr/local/bin/pulledpork_update.sh >>
/var/log/nsm/pulledpork.log)
Apr 3 07:01:54 onion kernel: [1514252.579857] snort[12703]: segfault at 0 ip
b6d4b64e sp bfcbae10 error 4 in libsf_engine.so.0.0.0[b6d49000+e000]

snort has been segfaulting since the upgrade to securityonion 20120326 (which
includes an update of snort to 2.9.2.1):

Program received signal SIGTERM, Terminated.
0xb770f416 in __kernel_vsyscall ()
(gdb) cont
Continuing.
[Thread 0xa5fb2b70 (LWP 3063) exited]

Program received signal SIGSEGV, Segmentation fault.
0xb6cf464e in ?? () from /usr/local/lib/snort_dynamicengine/libsf_engine.so
(gdb) bt
#0 0xb6cf464e in ?? () from /usr/local/lib/snort_dynamicengine/libsf_engine.so
#1 0x080b5de1 in ?? ()
#2 0x08076345 in ?? ()
#3 0x08076673 in ?? ()
#4 0x080791eb in ?? ()
#5 0x08079586 in ?? ()
#6 0xb7499bd6 in __libc_start_main () from /lib/tls/i686/cmov/libc.so.6
#7 0x0804ba91 in ?? ()
(gdb)

It's upon quitting that one of the threads segfaults.

Now, back to the duplicate mysql entry:

120403 7:01:59 1980 Query SELECT MAX(cid) FROM event WHERE sid=3
1980 Query INSERT INTO `event_onion-eth1_20120403` (sid,
cid, unified_event_id, unified_event_ref, unified_ref_time, signature,
signature_gen, signature_id, signature_rev, timestamp, priority, class,
status, src_ip, dst_ip, ip_proto, ip_ver, ip_hlen, ip_tos, ip_len, ip_id,
ip_flags, ip_off, ip_ttl, ip_csum) VALUES ('3', '1395795', '2128', '2128',
'2012-04-03 07:00:08', 'ET POLICY GNU/Linux APT User-Agent Outbound likely
related to package management', '1', '2013504', '2', '2012-04-03 07:00:08',
'1', 'policy-violation', '0', '', '', '', '', '', '', '', '',
'', '', '', '')

That's presumably after barnyard restart.

Before that we had had:

120403 5:15:59 1980 Query INSERT INTO `event_onion-eth1_20120403` (sid,
cid, unified_event_id, unified_event_ref, unified_ref_time, signature,
signature_gen, signature_id, signature_rev, timestamp, priority, class,
status, src_ip, dst_ip, ip_proto, ip_ver, ip_hlen, ip_tos, ip_len, ip_id,
ip_flags, ip_off, ip_ttl, ip_csum, src_port, dst_port) VALUES ('3', '1395795',
'2022', '2022', '2012-04-03 05:01:51', 'smtp: Attempted command buffer
overflow', '124', '1', '1', '2012-04-03 05:01:51', '1', 'attempted-admin',
'0', '405535850', '168430087', '6', '4', '5', '0', '1500', '15984',
'2', '0', '49', '56581', '51489', '25')

Connection 1980 is still open, so it's not barnyard's. In any case, barnyard
doesn't write to the securityonion_db database, only to snorby. Instead it sends
the information to the sguil snort_agent, which itself writes to the
securityonion_db database.

And we do see in snort_agent.log:
Sensor Data Rcvd: Failed sock5 1395795 {mysqlexec/db server: Duplicate entry
'3-1395795' for key 'PRIMARY'}

So I suspect barnyard just reports the error returned by snort_agent.

We need to figure out why barnyard resends those old events.

Stephane Chazelas 2012-04-03 13:40:33 BST

barnyard2 uses the /etc/nsm/onion-eth1/barnyard2.waldo file to record where it
left off.

# tail /var/log/nsm/onion-eth1/barnyard2.log

Using waldo file '/etc/nsm/onion-eth1/barnyard2.waldo':
spool directory = /nsm/sensor_data/onion-eth1
spool filebase = snort.unified2
time_stamp = 1333359583
record_idx = 8351
Opened spool file '/nsm/sensor_data/onion-eth1/snort.unified2.1333359583'
ERROR: sguil: Expected Confirm 1395795 and got: Failed to insert 1395795:
mysqlexec/db server: Duplicate entry '3-1395795' for key 'PRIMARY'

# od -tu4 /etc/nsm/onion-eth1/barnyard2.waldo
0000000 1836281391 1852142383 1601335155 1635017060
0000020 1768845103 1697476207 3238004 0
0000040 0 0 0 0
*
0002000 1919905395 1853173364 1701406313 12900
0002020 0 0 0 0
*
0004000 1333359583 8351
0004010

Looks like it failed to update it last time it was terminated.

If I increase the record_idx by 1, it still fails on the same 1395795.
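
For reference, a sketch of how the bump can be done: record_idx is the last
4-byte integer in the waldo (byte offset 2052, i.e. octal 0004004 in the od
dump above), little-endian on this i686 box. Stop barnyard2 first.

perl -e '
    open(my $fh, "+<", "/etc/nsm/onion-eth1/barnyard2.waldo") or die "$!";
    seek($fh, -4, 2);                   # record_idx = last 4 bytes of the file
    read($fh, my $buf, 4);
    seek($fh, -4, 2);
    print $fh pack("V", unpack("V", $buf) + 1);  # "V" = little-endian uint32
    close($fh);'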

Then I installed the snort unified perl module from
http://code.google.com/p/snort-unified-perl/source/browse/, adapted an example
to dump that unified2 file, and found:

8349, sensor_id=0, event_id=2128, tv_sec=1333436408, tv_usec=367304,
sig_id=2013504, sig_gen=1, sig_rev=2, class=33, pri=1, sip=168435326,
dip=2915181220, sp=47649, dp=80, protocol=6, pkt_action=0
8350, sensor_id=0, event_id=2129, tv_sec=1333433902, tv_usec=196531,
sig_id=2012647, sig_gen=1, sig_rev=3, class=33, pri=1, sip=168435337,
dip=3341802134, sp=47098, dp=80, protocol=6, pkt_action=0
8351, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333433902,
pkt_usec=196531, linktype=1, pkt_len=328
8352, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333433958,
pkt_usec=64095, linktype=1, pkt_len=328
8353, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333434013,
pkt_usec=977541, linktype=1, pkt_len=328
8354, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333434069,
pkt_usec=969139, linktype=1, pkt_len=328
8355, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333434125,
pkt_usec=882989, linktype=1, pkt_len=328
8356, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333434181,
pkt_usec=494996, linktype=1, pkt_len=328
8357, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333434237,
pkt_usec=461671, linktype=1, pkt_len=328
8358, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333434293,
pkt_usec=390564, linktype=1, pkt_len=328
8359, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333434303,
pkt_usec=220983, linktype=1, pkt_len=328
8360, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333434359,
pkt_usec=179561, linktype=1, pkt_len=328
8361, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333434415,
pkt_usec=126277, linktype=1, pkt_len=328
8362, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333434470,
pkt_usec=566322, linktype=1, pkt_len=328
8363, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333434526,
pkt_usec=629793, linktype=1, pkt_len=328
8364, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333434581,
pkt_usec=998943, linktype=1, pkt_len=328
8365, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333434637,
pkt_usec=442912, linktype=1, pkt_len=328
8366, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333434692,
pkt_usec=724264, linktype=1, pkt_len=328
8367, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333434748,
pkt_usec=664819, linktype=1, pkt_len=328
8368, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333434804,
pkt_usec=659636, linktype=1, pkt_len=328
8369, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333434860,
pkt_usec=560246, linktype=1, pkt_len=328
8370, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333434916,
pkt_usec=46858, linktype=1, pkt_len=328
8371, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333434971,
pkt_usec=682782, linktype=1, pkt_len=328
8372, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333435027,
pkt_usec=79405, linktype=1, pkt_len=328
8373, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333435082,
pkt_usec=927005, linktype=1, pkt_len=328
8374, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333435138,
pkt_usec=245034, linktype=1, pkt_len=328
8375, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333435194,
pkt_usec=258282, linktype=1, pkt_len=328
8376, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333435204,
pkt_usec=110455, linktype=1, pkt_len=328
8377, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333435259,
pkt_usec=648018, linktype=1, pkt_len=328
8378, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333435315,
pkt_usec=659047, linktype=1, pkt_len=328
8379, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333435371,
pkt_usec=360010, linktype=1, pkt_len=328
8380, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333435426,
pkt_usec=921458, linktype=1, pkt_len=328
8381, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333435482,
pkt_usec=886938, linktype=1, pkt_len=328
8382, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333435538,
pkt_usec=897847, linktype=1, pkt_len=328
8383, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333435594,
pkt_usec=868454, linktype=1, pkt_len=328
8384, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333435650,
pkt_usec=493944, linktype=1, pkt_len=328
8385, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333435705,
pkt_usec=798380, linktype=1, pkt_len=328
8386, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333435761,
pkt_usec=504688, linktype=1, pkt_len=328
8387, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333435816,
pkt_usec=931397, linktype=1, pkt_len=328
8388, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333435872,
pkt_usec=213809, linktype=1, pkt_len=328
8389, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333435928,
pkt_usec=170669, linktype=1, pkt_len=328
8390, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333435983,
pkt_usec=942001, linktype=1, pkt_len=328
8391, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333436040,
pkt_usec=27236, linktype=1, pkt_len=328
8392, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333436095,
pkt_usec=474657, linktype=1, pkt_len=328
8393, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333436105,
pkt_usec=60898, linktype=1, pkt_len=328
8394, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333436160,
pkt_usec=437769, linktype=1, pkt_len=328
8395, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333436216,
pkt_usec=307616, linktype=1, pkt_len=328
8396, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333436271,
pkt_usec=777718, linktype=1, pkt_len=328
8397, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333436327,
pkt_usec=180797, linktype=1, pkt_len=328
8398, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333436382,
pkt_usec=908078, linktype=1, pkt_len=328
8399, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333436438,
pkt_usec=324539, linktype=1, pkt_len=328
8400, sensor_id=0, event_id=2129, tv_sec=1333433902, pkt_sec=1333436494,
pkt_usec=189633, linktype=1, pkt_len=328

So all those records are based on the same event; note how event 2129 actually
comes before event 2128 (though the packets for this event are seen afterwards).

Stephane Chazelas 2012-04-04 13:34:09 BST

Recompiled snort with debugging.

Program received signal SIGTERM, Terminated.
0x08090251 in fpEvalRTN (rtn=0xad474b0, p=0xbfe16c80, check_ports=1) at
fpdetect.c:519
519 if(!rtn->rule_func->RuleHeadFunc(p, rtn, rtn->rule_func,
check_ports))
(gdb) cont
Continuing.
[Thread 0xa5b98b70 (LWP 17800) exited]

Program received signal SIGSEGV, Segmentation fault.
0xb6d1b701 in FreeOneRule (data=0xb6bb1100) at sf_snort_detection_engine.c:1033
1033 for (i = 0; rule->options[i] != NULL; i++)
(gdb) bt
#0 0xb6d1b701 in FreeOneRule (data=0xb6bb1100) at
sf_snort_detection_engine.c:1033
#1 0x080cd193 in DynamicRuleListFree (head=0xb875fa8) at sp_dynamic.c:284
#2 0x0807e1b7 in SnortCleanup (exit_val=0) at snort.c:3434
#3 0x0807dc00 in CleanExit (exit_val=0) at snort.c:3146
#4 0x0807e543 in SignalCheck () at snort.c:3588
#5 0x0807d99b in PacketLoop () at snort.c:2919
#6 0x08079f4b in SnortMain (argc=18, argv=0xbfe17764) at snort.c:764
#7 0x08079e39 in main (argc=18, argv=0xbfe17764) at snort.c:687
(gdb) p rule
$1 = (Rule *) 0xb6bb1100
(gdb) p *rule
$2 = {ip = {protocol = 6 '\006', src_addr = 0xad4 <Address 0xad4 out of
bounds>, src_port = 0xae2 <Address 0xae2 out of bounds>, direction = 0 '\000',
dst_addr = 0xae6 <Address 0xae6 out of bounds>, dst_port = 0xaf0 <Address
0xaf0 out of bounds>}, info = {genID = 3, sigID = 12636, revision = 4,
classification = 0xaf4 <Address 0xaf4 out of bounds>, priority = 0, message
= 0xb1c <Address 0xb1c out of bounds>, references = 0x207c, meta = 0x0},
options = 0x0, evalFunc = 0,
initialized = 0 '\000', numOptions = 0, noAlert = 0 '\000', ruleData = 0x0}
(gdb) p i
$3 = 0
(gdb) p rule->options
$4 = (RuleOption **) 0x0
(gdb) detach
Detaching from program: /usr/local/bin/snort, process 16667
(gdb)

Looks like some double free. Will try valgrind.

Can't use valgrind as it's getting nowhere.

--
Stephane

Doug Burks

Apr 6, 2012, 11:57:07 AM
to securit...@googlegroups.com
Hi Stephane,

If it looks like Issue 216, you might try modifying
pulledpork_update.sh to stop Snort BEFORE pulledpork downloads the new
ruleset. This should allow Snort to shut down cleanly without the
segfault and may prevent Barnyard from trying to re-process the same
events. This means that you will have a period of no detection until
Snort is restarted at the end of the script, but you may consider that
a better alternative than your current situation.
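
In other words, reorder things roughly like this (a sketch only -- the
actual commands and config paths in pulledpork_update.sh differ between
releases):

nsm_sensor_ps-stop --only-snort-alert       # stop Snort BEFORE fetching rules
pulledpork.pl -c /etc/nsm/pulledpork/pulledpork.conf   # download/apply the new ruleset
nsm_sensor_ps-restart --only-snort-alert    # restart Snort at the end
nsm_sensor_ps-restart --only-barnyard2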

Hope that helps!

Thanks,
Doug

--
Doug Burks | http://securityonion.blogspot.com
Don't miss SANS SEC503 Intrusion Detection In-Depth in
Augusta GA 6/11 - 6/16 | 10% discount for ISSA Members!
http://augusta.issa.org/drupal/SANS-Augusta-2012

scott runnels

Apr 6, 2012, 12:06:17 PM
to securit...@googlegroups.com
Stephane, 

I've seen the "Expected Confirm n and got: Failed to insert n:" when there's a problem with the spool file.  I will usually rm the corresponding spool file and barnyard2 will fire back up.  
I do, however, think there's an underlying problem and I'm currently trying to work through it on one of my production systems.  I don't believe the issue I'm seeing is related to 216 though.  

v/r
Scott

Stephane Chazelas

Apr 6, 2012, 12:17:03 PM
to securit...@googlegroups.com
2012-04-06 12:06:17 -0400, scott runnels:

> Stephane,
>
> I've seen the "Expected Confirm n and got: Failed to insert
> n:" when there's a problem with the spool file. I will
> usually rm the corresponding spool file and barnyard2 will
> fire back up.
>
> I do, however, think there's an underlying problem and I'm
> currently trying to work through it on one of my production
> systems. I don't believe the issue I'm seeing is related to
> 216 though.
[...]

Thanks Scott,

when you say "spool file", do you mean the unified2 snort log
output or something else?

In my case, it's a different unified2 log file every day, and it
happens consistently every time I call pulledpork_update.sh.

Note that 216 and this issue (barnyard crashing upon snort_agent
error) both appeared at the same time for me.

--
Stephane

scott runnels

Apr 6, 2012, 12:25:03 PM
to securit...@googlegroups.com
Hi Stephane,

The barnyard2 log file will look something like this:
>
> Opened spool file '/nsm/sensor_data/HOSTNAME-NIC/snort.unified2.1333642490'
> Waiting for new data
> Closing spool file '/nsm/sensor_data/HOSTNAME-NIC/snort.unified2.1333642490'. Read 11675 records
> Opened spool file '/nsm/sensor_data/HOSTNAME-NIC/snort.unified2.1333695725'
> Waiting for new data
> ERROR: sguil: Expected Confirm 98501 and got: Failed to insert 98501: mysqlexec/db server: Duplicate entry '3-98501' for key 'PRIMARY'

The last "Opened spool file…" line should indicate the unified2 log file that is having problems.

I'm about to do more testing by running pulledpork_update.

v/r
Scott

Stephane Chazelas

Apr 30, 2012, 9:27:13 AM
to securit...@googlegroups.com
2012-04-06 12:06:17 -0400, scott runnels:
> Stephane,
>
> I've seen the "Expected Confirm n and got: Failed to insert n:" when there's a problem with the spool file. I will usually rm the corresponding spool file and barnyard2 will fire back up.
> I do, however, think there's an underlying problem and I'm currently trying to work through it on one of my production systems. I don't believe the issue I'm seeing is related to 216 though.
>
[...]

I've not gotten to the bottom of this yet, but for me the issue
seems to be related to MySQL, and possibly to incorrect caching
of queries:

$ telnet localhost 8000
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
SidCidRequest onion-eth1
SidCidResponse 3 1552342


$ mysql --defaults-extra-file=/etc/mysql/debian.cnf securityonion_db
mysql> SELECT MAX(cid) FROM event WHERE sid=3;
+----------+
| MAX(cid) |
+----------+
| 1552342 |
+----------+
1 row in set (0.00 sec)

mysql> SELECT max(cid) FROM event WHERE sid=3;
+----------+
| max(cid) |
+----------+
| 1553339 |
+----------+
1 row in set (0.01 sec)


The same query except for the case of "max", and two different results.
(Presumably a stale cached result for this query is what leads to the
duplicate-entry INSERTs above, since this is the query sguild uses to find
the last cid.)

If I alter the query in any way, I get the correct answer, but
the same query as made by sguild returns the incorrect one:

mysql> SELECT MAX(cid) FROM event WHERE sid = 3;
+----------+
| MAX(cid) |
+----------+
| 1553339 |
+----------+
1 row in set (0.00 sec)

mysql> SELECT MAX(cid) FROM event WHERE sid=3;
+----------+
| MAX(cid) |
+----------+
| 1552342 |
+----------+
1 row in set (0.00 sec)

And it does return it very quickly, which makes me think of a caching issue.

How weird.
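
A quick way to double-check that it's the cache (SQL_NO_CACHE is a standard
MySQL hint that bypasses the query cache, so this should always return the
up-to-date value):

mysql> SELECT SQL_NO_CACHE MAX(cid) FROM event WHERE sid=3;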

--
Stephane

Stephane Chazelas

Apr 30, 2012, 9:35:06 AM
to securit...@googlegroups.com
2012-04-30 14:27:13 +0100, Stephane Chazelas:
[...]
> And it does return it very quickly which makes me think of a caching issue.
[...]

Indeed:

mysql> SELECT MAX(cid) FROM event WHERE sid=3;
+----------+
| MAX(cid) |
+----------+
| 1552342 |
+----------+
1 row in set (0.00 sec)

mysql> reset query cache;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT MAX(cid) FROM event WHERE sid=3;
+----------+
| MAX(cid) |
+----------+
| 1553339 |
+----------+
1 row in set (0.01 sec)


I'll add a "reset query cache" to the barnyard2 restart script
to see whether it helps with this issue.
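
Concretely, something like this just before barnyard2 is restarted (a
sketch; the credentials file is the same one used above):

mysql --defaults-extra-file=/etc/mysql/debian.cnf -e 'RESET QUERY CACHE;'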

But that MySQL /bug/ is a bit worrying, I have to say.

--
Stephane

Jed

Jun 22, 2012, 3:26:15 PM
to security-onion
Hi,

I have kind of the same problem. After the pulledpork update, barnyard
stops working.
I am using Suricata instead of Snort, so Issue 216 doesn't help me
(I don't download SO rules).

But after the pulledpork update, barnyard stops logging alerts to the
database.
It is still running, as is Suricata, but no alerts are logged.

I have looked at the barnyard log file (/var/log/nsm/sf-secu-onion-eth1/
barnyard2.log.20120622053159), and I have seen the following error
message: ERROR: Connecton closed by client

Doug Burks

Jun 26, 2012, 3:00:45 PM
to securit...@googlegroups.com
Hi Stephane,

Did adding "reset query cache" help?

I wonder if this is related:
"Transparent Caching – Caching is fully transparent to the
application, and what is very important it does not change semantics
of the queries – you always get actual query results. Really there are
some chevats – if you’re not using query_cache_wlock_invalidate=ON
locking table for write would not invalidate query cache so you can
get results even
if table is locked and is being prepared to be updated. So if you’re
using query cache in default configuration you can’t assume locking
table for write will mean no one will be able to read it – results
still can come from query cache unless you enable
query_cache_wlock_invalidate=ON."
http://www.mysqlperformanceblog.com/2006/07/27/mysql-query-cache/

Have you considered enabling query_cache_wlock_invalidate OR disabling
the query cache altogether (I wonder what kind of performance impact that
might have)?
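
For reference, both can be tried at runtime with standard MySQL settings
(and made permanent in my.cnf if they help):

mysql> SET GLOBAL query_cache_wlock_invalidate = ON;
mysql> SET GLOBAL query_cache_size = 0;  -- or disable the query cache entirely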

Thanks,
Doug

Stephane Chazelas

Jun 26, 2012, 3:28:55 PM
to securit...@googlegroups.com
2012-06-26 15:00:45 -0400, Doug Burks:
> Hi Stephane,
>
> Did adding "reset query cache" help?

Hi Doug,

yes, it helps: barnyard2 no longer crashes when it is restarted
by pulledpork_update. But it now crashes randomly with the same
kind of errors, and when that happens I can verify that the
query cache (for that particular query) is stale.

> I wonder if this is related:
> "Transparent Caching – Caching is fully transparent to the
> application, and what is very important it does not change semantics
> of the queries – you always get actual query results. Really there are
> some chevats – if you’re not using query_cache_wlock_invalidate=ON
> locking table for write would not invalidate query cache so you can
> get results even
> if table is locked and is being prepared to be updated. So if you’re
> using query cache in default configuration you can’t assume locking
> table for write will mean no one will be able to read it – results
> still can come from query cache unless you enable
> query_cache_wlock_invalidate=ON."
> http://www.mysqlperformanceblog.com/2006/07/27/mysql-query-cache/
>
> Have you considered enabling query_cache_wlock_invalidate OR disabling
> query cache altogether (I wonder what kind of performance impact that
> might have)?
[...]

That doesn't seem related. That says that locking a table
doesn't stop you from getting data from it through the query
cache (if I understand correctly). But in my case, the query
cache is plain stale long after the tables have been modified.

I didn't try disabling the query cache, as I expect there would
be a significant performance impact. But I'm pretty sure it
would fix the issue, as it's pretty clear what's happening.

The thing is, I suspect there's a problem in my setup. If the
query cache behaved like that, even occasionally, it would be a
widely known mysql issue.

That's why I asked here whether anybody had had the same issue,
and I've not had any feedback yet.

What I'll try to do at some point is wipe the mysql DBs and
restore them from a dump to see if that helps.
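
Roughly (a sketch, with sguild and the sensor processes stopped first; paths
as used earlier in this thread):

mysqldump --defaults-extra-file=/etc/mysql/debian.cnf securityonion_db > /tmp/securityonion_db.sql
mysql --defaults-extra-file=/etc/mysql/debian.cnf -e 'DROP DATABASE securityonion_db; CREATE DATABASE securityonion_db;'
mysql --defaults-extra-file=/etc/mysql/debian.cnf securityonion_db < /tmp/securityonion_db.sql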

--
Stephane