[ale] md2_raid1

drifter

Dec 13, 2009, 12:29:03 PM
to ALE
My box suddenly slowed down, so I checked with top, and a process called
md2_raid1 is using 20% of the CPU cycles and making it difficult to even
type this email. :( This box is running FC11.

-------------------------------------------------------------------------------------------

Next stop was the log file and this is at the end of /var/log/messages:

Dec 13 11:21:08 Sarge kernel: imklog 3.22.1, log source = /proc/kmsg started.
Dec 13 11:21:08 Sarge rsyslogd: [origin software="rsyslogd" swVersion="3.22.1" x-pid="1362" x-info="http://www.rsyslog.com"] (re)start
Dec 13 11:41:02 Sarge kernel: md: data-check of RAID array md2
Dec 13 11:41:02 Sarge kernel: md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
Dec 13 11:41:02 Sarge kernel: md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for data-check.
Dec 13 11:41:02 Sarge kernel: md: using 128k window, over a total of 202001152 blocks.
Dec 13 11:41:02 Sarge kernel: md: delaying data-check of md0 until md2 has finished (they share one or more physical units)
Dec 13 11:41:02 Sarge kernel: md: delaying data-check of md1 until md2 has finished (they share one or more physical units)

--------------------------------------------------------------------------------------------------
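
For a rough sense of how long the data-check in the log above could run, the figures the kernel printed are enough for a back-of-the-envelope estimate. This is only a sketch: it assumes md counts the array size in 1 KiB blocks and that the two speeds in the log act as the floor and the ceiling. The helper name is made up for illustration.

```python
# Rough data-check duration from the figures in the log above.
# Assumption: md reports the array size in 1 KiB blocks, so
# blocks / (KB per second) is approximately seconds.

def check_duration_seconds(total_blocks, speed_kb_per_sec):
    """Seconds needed to read total_blocks at a steady speed (hypothetical helper)."""
    return total_blocks / speed_kb_per_sec

TOTAL_BLOCKS = 202001152   # "over a total of 202001152 blocks"
MIN_SPEED = 1000           # "minimum _guaranteed_ speed: 1000 KB/sec/disk"
MAX_SPEED = 200000         # "but not more than 200000 KB/sec"

best_case_minutes = check_duration_seconds(TOTAL_BLOCKS, MAX_SPEED) / 60
worst_case_hours = check_duration_seconds(TOTAL_BLOCKS, MIN_SPEED) / 3600

print(f"at the ceiling: about {best_case_minutes:.0f} minutes")
print(f"at the floor:   about {worst_case_hours:.0f} hours")
```

That works out to somewhere between roughly 17 minutes and 56 hours. The real figure depends on how fast the disks are and how busy the box is, since the check only uses idle IO bandwidth.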

Does the OS suspect a problem with one or both of the hard drives?

Turned the box off last night so the uptime is less than 2 hours.

Another program, npviewer, also has been stealing cycles and I killed that
first, hoping that the box would return to normal. Didn't help much.
Checking in the log I see this line:

Dec 13 10:41:42 Sarge kernel: npviewer.bin[2584]: segfault at ff99cd48 ip ff99cd48 sp bfb6773c error 14

and I have no idea what it is trying to tell me. Note that this was about
an hour before the box started checking the raid array.

Hints would be appreciated.

Sean

_______________________________________________
Ale mailing list
A...@ale.org
http://mail.ale.org/mailman/listinfo/ale
See JOBS, ANNOUNCE and SCHOOLS lists at
http://mail.ale.org/mailman/listinfo

scott

Dec 13, 2009, 12:34:40 PM
to Atlanta Linux Enthusiasts - Yes! We run Linux!

Sounds like one of the two drives is having issues, and the software RAID1 driver is working to keep the two disks in sync (mirrored). I would look at the logs to see if either drive is reporting errors. Also see if you can run some SMART tools against the drives to see if they will tell you what is going on.

It might also be a bad cable, bad power to the drive, or a bad HD controller, but unless you had a power outage or took some sort of power spike, those are less likely. Cables and HD controllers generally don't fail without some help.

Thanks and good luck!!

drifter

Dec 13, 2009, 12:54:46 PM
to ALE
I just noticed an additional oddity.
TOP provides an overview of what the box is doing as well as the constantly
changing list of processes using cpu cycles.

TOP is reporting this:
top - 12:40:15 up 2:11, 4 users, load average: 1.37, 1.58, 1.62
Tasks: 168 total, 1 running, 167 sleeping, 0 stopped, 0 zombie
Cpu(s): 5.5%us, 10.8%sy, 0.0%ni, 83.5%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st
Mem: 1932224k total, 1170748k used, 761476k free, 100916k buffers
Swap: 4096552k total,

And I am wondering "What 4 users?"
Ought to be nobody but root and kilpatms owning running processes.
Now if I run 'users' it reports four instances of kilpatms. But I am only
logged in once.
Hmmmmm the more questions I ask, the more confused I get.
Maybe I should just stop asking questions. :)

Brian Pitts

Dec 13, 2009, 1:19:45 PM
to Atlanta Linux Enthusiasts - Yes! We run Linux!
On 12/13/2009 12:54 PM, drifter wrote:
> And I am wondering "What 4 users?"
> Ought to be nobody but root and kilpatms owning running processes.
> Now if I run 'users' it reports four instances of kilpatms. But I am only
> logged in once.
> Hmmmmm the more questions I ask, the more confused I get.
> Maybe I should just stop asking questions. :)

There are some other commands you can use to get more info. Try 'w' or
'who'.

--
All the best,
Brian Pitts

Jim Kinney

Dec 13, 2009, 1:26:25 PM
to Atlanta Linux Enthusiasts - Yes! We run Linux!
reboot
--
James P. Kinney III
Actively in pursuit of Life, Liberty and Happiness

Jim Kinney

Dec 13, 2009, 1:29:09 PM
to Atlanta Linux Enthusiasts - Yes! We run Linux!
cat /proc/mdstat will show what is going on with software RAID.
npviewer is the nspluginwrapper helper that runs browser plugins such as the (stupid system hog) Flash player.
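
The segfault line quoted in the original post can be decoded a little further. The number after "error" is the x86 page-fault error code, which the kernel prints in hex, and its low bits each have a meaning. A minimal sketch, assuming an x86 box (the 32-bit addresses in the log suggest it) and using a made-up helper name:

```python
# Decode the x86 page-fault error code from a kernel "segfault at ..." line.
# Assumption: x86 hardware. The kernel prints the code in hex, so the
# "error 14" in the log is 0x14, not decimal 14.

def decode_fault(code_hex):
    """Return plain-English facts about the fault (hypothetical helper)."""
    code = int(code_hex, 16)
    facts = [
        "protection violation" if code & 0x01 else "page not mapped",
        "write access" if code & 0x02 else "read access",
        "user mode" if code & 0x04 else "kernel mode",
    ]
    if code & 0x08:
        facts.append("reserved bit set in a page-table entry")
    if code & 0x10:
        facts.append("instruction fetch")
    return facts

print(decode_fault("14"))
```

For "error 14" this gives a user-mode instruction fetch from a page that was not mapped. Taken together with ip being the same as the faulting address, that reads like the plugin process jumping to a bogus address and crashing, which would be a plugin bug and probably unrelated to the RAID check.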

If the system is resyncing a raid mirror, let it alone.
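
To tell a resync from a routine check without eyeballing the file, the mdstat text can be summarized with a few regular expressions. This is a sketch only: the sample below is made up to resemble the three arrays in this thread (the device names sda/sdb and the progress figures are assumptions), and the function name is hypothetical.

```python
import re

# Hypothetical /proc/mdstat contents, invented to resemble the arrays in this thread.
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md2 : active raid1 sda3[0] sdb3[1]
      202001152 blocks [2/2] [UU]
      [==>..................]  check = 12.5% (25250144/202001152) finish=58.3min speed=50500K/sec

md1 : active raid1 sda2[0] sdb2[1]
      4096448 blocks [2/2] [UU]
        resync=DELAYED

md0 : active raid1 sda1[0] sdb1[1]
      200704 blocks [2/2] [UU]
        resync=DELAYED

unused devices: <none>
"""

def summarize(mdstat_text):
    """Map each md device to (member_status, activity) parsed from mdstat text."""
    result = {}
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            result[current] = ["unknown", "idle"]
            continue
        if current is None:
            continue
        # "[UU]" means both mirror halves are up; "_" marks a missing member.
        members = re.search(r"\[([U_]+)\]", line)
        if members:
            result[current][0] = members.group(1)
        progress = re.search(r"(check|resync|recovery|reshape)\s*=\s*([\d.]+)%", line)
        if progress:
            result[current][1] = f"{progress.group(1)} {progress.group(2)}%"
        elif "=DELAYED" in line:
            result[current][1] = "delayed"
    return {name: tuple(info) for name, info in result.items()}

for name, (members, activity) in summarize(SAMPLE_MDSTAT).items():
    print(name, members, activity)
```

On a live box the same function could be fed `open("/proc/mdstat").read()`. A "check" line with `[UU]` would point to a read-only verification pass over a healthy mirror, while "recovery" or an underscore in the member status would point to a rebuild after a member dropped out.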

Jeff Hubbs

Dec 13, 2009, 7:09:07 PM
to Atlanta Linux Enthusiasts - Yes! We run Linux!
Sounds like you have a syncing RAID1 md device. Run cat /proc/mdstat to
investigate.

David Tomaschik

Jan 31, 2010, 4:54:02 PM
to Atlanta Linux Enthusiasts - Yes! We run Linux!
drifter wrote:
> TOP is reporting this:
> top - 12:40:15 up 2:11, 4 users, load average: 1.37, 1.58, 1.62
>
> And I am wondering "What 4 users?"
> Ought to be nobody but root and kilpatms owning running processes.
> Now if I run 'users' it reports four instances of kilpatms. But I am only
> logged in once.
>
<snip>

I use a lot of terminals. My uptime reports:
16:53:00 up 5 days, 20:49, 11 users, load average: 0.67, 0.75, 0.41

The 11 users = 1 X login + 10 terminals open.
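
The same arithmetic can be checked against `who`, which prints one line per login session. A small sketch, using made-up `who` output for one desktop login plus three terminal windows (the tty names, times, and helper name are assumptions):

```python
from collections import Counter

# Hypothetical `who` output: one X login plus three terminal windows.
SAMPLE_WHO = """\
kilpatms tty1         2009-12-13 10:30 (:0)
kilpatms pts/0        2009-12-13 10:31 (:0.0)
kilpatms pts/1        2009-12-13 10:35 (:0.0)
kilpatms pts/2        2009-12-13 11:02 (:0.0)
"""

def count_sessions(who_text):
    """Count login sessions per user name, one session per non-empty line."""
    return Counter(line.split()[0] for line in who_text.splitlines() if line.strip())

sessions = count_sessions(SAMPLE_WHO)
print(sessions)                 # sessions per user
print(sum(sessions.values()))   # the figure top and uptime report as "users"
```

With this sample there is one user name but four sessions, which is how a single person ends up counted as "4 users".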

--
David Tomaschik, RHCE
Moderator, LinuxQuestions.org
http://www.tuxteam.com
da...@tuxteam.com [GPG: 0x6D428695]

