Yesterday the website had some problems: users just saw a blank screen
for a few hours, and eventually this changed to a 404. The
problem seems to be related to macid state loading at app start. I had
a large number of 0 byte checkpoint and event files (below).
In the server logs, what I was seeing was that patch-tag was starting,
segfaulting, and then restarting again. (It is run from inside a bash
while-true loop.)
When I moved the 0 byte state files to a safe place, the website
became usable again.
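
A minimal sketch of how the 0-byte files in a state directory could be
found before moving them aside. The name zeroByteFiles is made up for
illustration and is not part of happstack-state; getFileSize and
listDirectory assume a reasonably recent version of the directory package.

```haskell
import Control.Monad (filterM)
import System.Directory (doesFileExist, getFileSize, listDirectory)
import System.FilePath ((</>))

-- | Return the regular files directly inside 'dir' whose size is 0 bytes.
-- Hypothetical helper: it only inspects sizes, it does not move anything.
zeroByteFiles :: FilePath -> IO [FilePath]
zeroByteFiles dir = do
  names <- listDirectory dir
  let paths = map (dir </>) names
  -- Skip subdirectories; getFileSize is only meaningful for files here.
  files <- filterM doesFileExist paths
  filterM (fmap (== 0) . getFileSize) files
```

Calling zeroByteFiles on the state directory would give the list of
candidate checkpoint and event files to move out of the way.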
Any ideas what could be the underlying problem, or why moving the
checkpoint files away fixed things?
I am going to try more frequent checkpointing, mightybyte's
alternative macid library, which supposedly uses a factor of ten less
memory (at the cost of more CPU, iiuc), and maybe upgrade memory.
thartman@patch-tag:~/0bytestate>ls -lth /home/thartman/0bytestate/ | tac | grep -i 'check' -A2 | head -n25
-rw-r--r-- 1 root root 0 Feb 25 22:44 checkpoints-0000041820
-rw-r--r-- 1 root root 0 Feb 25 22:44 events-0000041821
-rw-r--r-- 1 root root 0 Feb 25 22:45 events-0000041822
--
-rw-r--r-- 1 root root 0 Mar 6 05:12 checkpoints-0000041914
-rw-r--r-- 1 root root 0 Mar 6 05:13 events-0000041915
-rw-r--r-- 1 root root 0 Mar 6 05:18 events-0000041916
--
-rw-r--r-- 1 root root 0 Mar 29 05:48 checkpoints-0000042021
-rw-r--r-- 1 root root 0 Mar 29 06:06 events-0000042024
-rw-r--r-- 1 root root 0 Mar 29 06:07 events-0000042029
--
-rw-r--r-- 1 root root 0 Jun 14 02:02 checkpoints-0000043624
-rw-r--r-- 1 root root 0 Jun 14 02:02 events-0000043625
-rw-r--r-- 1 root root 0 Jun 14 02:02 events-0000043626
--
-rw-r--r-- 1 root root 0 Jun 14 02:04 checkpoints-0000043634
-rw-r--r-- 1 root root 0 Jun 14 02:04 events-0000043635
-rw-r--r-- 1 root root 0 Jun 14 02:04 events-0000043636
-rw-r--r-- 1 root root 0 Jun 14 02:04 checkpoints-0000043636
-rw-r--r-- 1 root root 0 Jun 14 02:04 events-0000043637
-rw-r--r-- 1 root root 0 Jun 14 02:04 events-0000043638
--
-rw-r--r-- 1 root root 0 Jun 14 02:06 checkpoints-0000043655
-rw-r--r-- 1 root root 0 Jun 14 02:15 events-0000043656
--
Need somewhere to put your code? http://patch-tag.com
Want to build a webapp? http://happstack.com
Sweet! More people should do that! Otherwise we won't know what to fix!
> Yesterday the website had some problems: users just saw a blank screen
> for a few hours, and eventually this changed to a 404. The
> problem seems to be related to macid state loading at app start. I had
> a large number of 0 byte checkpoint and event files (below).
>
> Any ideas what could be the underlying problem, or why moving the
> checkpoint files away fixed things?
I don't think happstack-state does the ideal thing when writing
checkpoint files at the moment. It should probably write the
checkpoint to a temporary file, and then rename it to be a
checkpoint file once the write has completed successfully. I am
guessing that something funky happened that caused a checkpoint file
to go bad, and then it crashed when trying to read the 0-byte
checkpoint file from then on. So, going back to the last good state
fixed things? We should be able to test what happens if you have a
0-byte checkpoint file pretty easily.
I have added this to the 0.7 release on the roadmap
http://code.google.com/p/happstack/wiki/RoadMap?ts=1276523252&updated=RoadMap.
Though I would happily take a patch before then ;)
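
The write-to-a-temporary-file-then-rename idea above could look roughly
like this. It is only a sketch, not what happstack-state does: the name
atomicWriteFile and the "checkpoint.tmp" template are made up for
illustration. It assumes the temporary file lives in the same directory
as the target, so that the rename stays on one filesystem, and it does
not fsync, so it guards against half-written files rather than power loss.

```haskell
import Control.Exception (bracketOnError)
import qualified Data.ByteString as B
import System.Directory (removeFile, renameFile)
import System.FilePath (takeDirectory)
import System.IO (hClose, openBinaryTempFile)

-- | Write 'contents' to a temporary file next to 'target', then rename it
-- over 'target'. Readers see either the old file or the complete new one.
-- Hypothetical helper, not part of happstack-state.
atomicWriteFile :: FilePath -> B.ByteString -> IO ()
atomicWriteFile target contents =
  bracketOnError
    (openBinaryTempFile (takeDirectory target) "checkpoint.tmp")
    -- On any failure, close the handle and delete the partial temp file.
    (\(tmpPath, h) -> hClose h >> removeFile tmpPath)
    (\(tmpPath, h) -> do
        B.hPut h contents
        hClose h
        -- Only now does the data become visible under the real name.
        renameFile tmpPath target)
```

If the process dies mid-write, what is left behind is a stray temporary
file rather than a truncated checkpoint under the real name.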
> I am going to try more frequent checkpointing, mightybyte's
> alternative macid library, which supposedly uses a factor of ten less
> memory (at the cost of more CPU, iiuc), and maybe upgrade memory.
How does that work?
- jeremy
Oh, maybe you mean his compact-ixset stuff? Perhaps we should
make that more widely known/available?
- jeremy
Yes, I think that's what he means. I've mentioned it a couple times.
I only got the code compiling--never got around to testing it. That
work is also old now. About a month ago I upgraded my app to the
latest version of happstack and was quite pleased to discover that my
memory usage dropped by 50%! It might be better to rewrite
compact-ixset based on the most recent ixset that has this improved
behavior.