raidz1 problem after removing and inserting hard drive
 
Newsgroups: comp.sys.sun.admin
From: Scott <spack...@gmail.com>
Date: Thu, 1 Mar 2012 11:37:48 -0800 (PST)
Local: Thurs, Mar 1 2012 2:37 pm
Subject: raidz1 problem after removing and inserting hard drive
The setup is a Sun Ultra-45 connected via SCSI-LVD to a JetStor 416S JBOD
enclosure, which talks SCSI to the host and holds (12) SATA 1TB drives.

About six months ago the JBOD enclosure was used to build (1) raidz1 pool,
using all (12) drives.
# zpool upgrade -v
This system is currently running ZFS version 4

# cat /etc/release
Solaris 10 5/08 s10s_u5wos_10 SPARC

# uname -a
SunOS bahamas 5.10 Generic_127127-11 sun4u sparc SUNW, A70

Yesterday I started a zpool scrub on the pool.
About 20 minutes into it I pushed a drive enclosure release button,
ejecting a drive (c2t1d0).
I didn't realize the drive was a member of the raidz1 pool.
10 seconds later I re-inserted the drive.
ZFS started resilvering the pool, pushing the one spare drive into
service (c2t2d5).

This resilvering is taking some time; it is expected to finish late
today.

When I ran
# format -e
I could see the drive c2t1d0, but instead of the usual disk string,
"Hitachi-HDS721010KLA330-R001-931.51GB",
it showed a string indicating 0GB.

The physical path for c2t1d0 is
/pci@1e,600000/pci@0/pci@3/pci@0/scsi@8/sd@1,0

I was seeing a lot of kern.warning messages in /var/adm/messages:

WARNING: /pci@1e,600000/pci@0/pci@3/pci@0/scsi@8/sd@1,0 (sd6)
Corrupt label; wrong magic number
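
A rough way to tell whether these warnings are tapering off is to count them per device path. The sketch below assumes the two-line layout shown above (a WARNING line carrying the device path, followed by the "Corrupt label" line). The function name count_corrupt_label is made up for illustration.

```shell
# Hypothetical helper: count "Corrupt label" warnings per device path.
# Assumes each warning spans two lines, as in the excerpt above:
#   WARNING: <device path> (sdN)
#   Corrupt label; wrong magic number
count_corrupt_label() {
    awk '
        /WARNING:/ {
            # Remember the device path: the field right after "WARNING:".
            path = ""
            for (i = 1; i < NF; i++)
                if ($i == "WARNING:") { path = $(i + 1); break }
            next
        }
        /Corrupt label/ && path != "" { count[path]++; path = "" }
        END { for (p in count) print count[p], p }
    ' "$1"
}
```

Usage would be along the lines of `count_corrupt_label /var/adm/messages`, run a few minutes apart to compare the counts.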

The load on the system was up around 12, and the system was sluggish
to respond to keyboard and mouse.

I issued:
zpool status -x
and saw (roughly)
  raid-412S
  raidz1
  c2t0d0 online
  spare DEGRADED
    c2t1d0 UNAVAIL (sd6)
    c2t2d5 ONLINE
  c2t1d1 ONLINE
  c2t1d2 ONLINE
  c2t1d4 ONLINE
  c2t1d5 ONLINE
  c2t2d0 ONLINE
  c2t2d1 ONLINE
  c2t2d2 ONLINE
  c2t1d3 ONLINE
  c2t1d4 ONLINE
  spares
  c2t2d5 INUSE   (sd24)
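
On a pool with this many members, the unhealthy devices are easier to spot when the status text is filtered. The sketch below assumes device lines of the form "cNtNdN STATE ...", as in the rough listing above. The function name not_online_devices is hypothetical.

```shell
# Hypothetical helper: print pool members whose state is not ONLINE.
# Reads `zpool status` text on stdin; assumes device lines look like
# "<cNtNdN> <STATE> ...", as in the listing above.
# On Solaris 10, nawk or /usr/xpg4/bin/awk may be needed, since the
# legacy /usr/bin/awk may lack toupper().
not_online_devices() {
    awk '
        $1 ~ /^c[0-9]+t[0-9]+d[0-9]+/ {
            state = toupper($2)
            # INUSE and AVAIL describe hot spares, not failures.
            if (state != "ONLINE" && state != "INUSE" && state != "AVAIL")
                print $1, state
        }
    '
}
```

It would be fed with something like `zpool status raid-412S | not_online_devices`.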

After about a half-hour, and failing to get the /var/adm/messages
kern.warning messages to decrease in frequency, I issued:

# cfgadm -c unconfigure c2::dsk/c2t1d0

That succeeded: the drive was reported offline in /var/adm/messages,
and the kern.warning messages stopped.

I then tried:

# cfgadm -c configure c2::dsk/c2t1d0

and I get
cfgadm: Hardware specific failure: failed to configure SCSI device: I/O error

Using cfgadm with a -f option does not change the output.

When I issue:

# cfgadm -l c2::dsk/c2t1d0
I see
Ap_Id            Type     Receptacle     Occupant       Condition
c2::dsk/c2t1d0   disk     Connected      unconfigured   unknown
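
When retrying the configure step, the Occupant column is the field to watch. A minimal sketch for extracting it, assuming the five-column layout shown above (the function name occupant_state is made up):

```shell
# Hypothetical helper: print the Occupant column for one attachment point.
# Reads `cfgadm -l` text on stdin; assumes the five-column layout above.
occupant_state() {
    awk -v ap="$1" '$1 == ap { print $4 }'
}
```

For example, `cfgadm -l c2::dsk/c2t1d0 | occupant_state c2::dsk/c2t1d0` would print the occupant state for that attachment point.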

# zpool status
  pool: raid-412S
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist
        for the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-D3
 scrub: resilver in progress, 76.75% done, 5h27m to go
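
For a resilver that runs this long, the progress figures can be pulled out of the scrub line for periodic logging. This sketch assumes the exact wording printed above ("resilver in progress, N% done, XhYm to go"); other ZFS versions word this line differently. The function name resilver_progress is hypothetical.

```shell
# Hypothetical helper: pull "percent done" and "time to go" from the
# scrub line of `zpool status` (wording as printed by this ZFS version).
# Prints nothing if no resilver is in progress.
resilver_progress() {
    sed -n 's/^ *scrub: resilver in progress, \([0-9.]*\)% done, \([0-9hm]*\) to go.*/\1 \2/p'
}
```

Piping `zpool status raid-412S` through it yields the two fields separated by a space.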

Disk's label: Hitachi-HDS721010KLA330-R001-931.51GB
Hitachi Deskstar 1TB 7200rpm SATA 3.0Gb/s P/N 0A35155 Aug-2007
S/N PAG89X7E

Could I get some help on how to get the disk connected again?
At this time I don't think the disk could be burned out just because
I ejected it then inserted it back into the JBOD enclosure 10 seconds
later.


 
Newsgroups: comp.sys.sun.admin
From: groen...@cse.psu.edu (John D Groenveld)
Date: Fri, 2 Mar 2012 20:36:02 +0000 (UTC)
Local: Fri, Mar 2 2012 3:36 pm
Subject: Re: raidz1 problem after removing and inserting hard drive
In article <a60b306a-c4f3-40e7-86f7-291d248dc...@o16g2000yqg.googlegroups.com>,

Scott  <spack...@gmail.com> wrote:
>Yesterday I started a zpool scrub on the pool.
>About 20 minutes into it I pushed a drive enclosure release button,
>ejecting a drive (c2t1d0).

Whoops.

>I didn't realize the drive was a member of the raidz1 pool.
>10 seconds later I re-inserted the drive.
>ZFS started resilvering the pool, pushing the one spare drive into
>service (c2t2d5).

>This resilvering is taking some time; it is expected to finish late
>today.

Did it finish?

If so,
# zpool replace raid-412S c2t1d0 c2t2d5
And then add c2t1d0 back as your new spare:
# zpool add raid-412S spare c2t1d0

John
groenv...@acm.org


 
Newsgroups: comp.sys.sun.admin
From: Scott <spack...@gmail.com>
Date: Fri, 2 Mar 2012 17:51:24 -0800 (PST)
Local: Fri, Mar 2 2012 8:51 pm
Subject: Re: raidz1 problem after removing and inserting hard drive
On Mar 2, 12:36 pm, groen...@cse.psu.edu (John D Groenveld) wrote:

It finished (it took about 28 hours).  All lights are quiet on the
JBOD front :)

> If so,
> # zpool replace raid-412S c2t1d0 c2t2d5
> And then add c2t1d0 back as your new spare:
> # zpool add raid-412S spare c2t1d0

I could try that (thanks).
But c2t1d0, though visible at the "zpool status" level, is not there
at the "format -e" level.
To complicate matters further, it's listed as unconfigured at the cfgadm
level:
# cfgadm -l c2::dsk/c2t1d0
I see
Ap_Id            Type     Receptacle     Occupant       Condition
c2::dsk/c2t1d0   disk     Connected      unconfigured   unknown

I suppose I'm prejudiced a little by what I want to do vs. what you're
saying to do, because I'm thinking it was working this way before, so it
should work this way again.

Setting that aside for a moment, what I think I need to focus on is
getting the disk visible at a lower layer, so that it will show its face
at the format -e layer.
I am thinking that once I can get it to show up there, I can proceed
to issue zpool commands.
I can't see it at the prtconf -v layer either.
Does this sound correct?

Regards, Scott


 
Newsgroups: comp.sys.sun.admin
From: groen...@cse.psu.edu (John D Groenveld)
Date: Sat, 3 Mar 2012 14:09:02 +0000 (UTC)
Local: Sat, Mar 3 2012 9:09 am
Subject: Re: raidz1 problem after removing and inserting hard drive
In article <d9bd1fe7-51ae-44e6-966a-23d64ba80...@t16g2000yqt.googlegroups.com>,

Scott  <spack...@gmail.com> wrote:
>Setting that aside for a moment but staying at the lower layer, what I
>think I need to focus on is getting the
>disk visible at a lower layer, so that it will show its face at the
>format -e layer.
>I am thinking after I can get it to show up there then I can proceed
>to issue zpool commands.
>I can't see it either at the prtconf -v layer.
>Does this sound correct?

Which HBA are you using to connect to the JBOD?
$ prtconf -D

Perhaps there's a bug that's preventing you from configuring
c2::dsk/c2t1d0 without bouncing your host.

John
groenv...@acm.org


 
Newsgroups: comp.sys.sun.admin
From: Scott <spack...@gmail.com>
Date: Mon, 5 Mar 2012 18:51:52 -0800 (PST)
Local: Mon, Mar 5 2012 9:51 pm
Subject: Re: raidz1 problem after removing and inserting hard drive
The JetStor JBOD enclosure also does RAID, and it seems that the disk
I'm having problems with was uniquely configured to be a Volume, a RAID-0.
The rest of the disks are configured as "pass through devices".
(The enclosure takes SATA drives and presents them as SCSI-attached.)

I reconfigured the disk in question to be like the rest, then issued
# cfgadm -v -c configure c2

and got the device back, though under a different device tree:
old: /pci@1e,600000/pci@0/pci@3/pci@0/scsi@8/sd@1,0
new: /pci@1e,600000/pci@0/pci@3/pci@0/scsi@8/sd@0,1

It got a new device name.
old: /dev/dsk/c2t1d0
new: /dev/dsk/c2t0d1
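
The old and new names line up with the unit address at the end of each physical path: sd@1,0 is target 1, LUN 0 (c2t1d0), and sd@0,1 is target 0, LUN 1 (c2t0d1). A small sketch of that mapping follows. It assumes the unit address has the form target,lun in hexadecimal, which is the usual convention for sd nodes on a parallel SCSI bus but is not confirmed for this HBA. The function name sd_unit_to_ctd is made up.

```shell
# Hypothetical helper: derive the cNtNdN name from a controller name and
# a physical path ending in sd@<target>,<lun>.
# Assumption: target and lun in the unit address are hexadecimal.
# Written for a POSIX shell (ksh, bash), not the legacy Bourne /bin/sh.
sd_unit_to_ctd() {
    ctrl=$1
    unit=${2##*sd@}        # keep what follows the last "sd@", e.g. "1,0"
    target=${unit%%,*}     # part before the comma
    lun=${unit#*,}         # part after the comma
    lun=${lun%%:*}         # drop a minor-node suffix such as ":a"
    printf '%st%dd%d\n' "$ctrl" "0x$target" "0x$lun"
}
```

For instance, `sd_unit_to_ctd c2 /pci@1e,600000/pci@0/pci@3/pci@0/scsi@8/sd@0,1` gives the new device name.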

The data, including the four vdev labels, were still on the drive:
# zdb -l /dev/dsk/c2t0d1s0
(Lists 4 labels)
so it wouldn't allow me to
# zpool replace raid-412S c2t1d0 c2t0d1

I talked with tech support, who told me I needed to
read doc ID 1005473.1, which says you have to overwrite a used drive
with zeroes in order to use it.
Well, zeroing a 1TB drive with dd at a 1kB block size would take a very
long time, so I wrote a small script to overwrite just the four vdev
labels (two at the beginning of the drive and two at the end).
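
The script itself was not posted. What follows is only a sketch of how such a label overwrite could look, not the author's version. It assumes the standard ZFS on-disk layout: four labels of 256 KiB each, two at the start of the device and two at the end, with the end aligned down to a 256 KiB boundary. The function name and its arguments are hypothetical, and the caller must supply the device size in bytes. Running it against the wrong device destroys that device's ZFS identity.

```shell
# Hypothetical sketch: zero only the four ZFS vdev labels on a device.
# Assumes each label is 256 KiB, labels 0 and 1 sit at the start of the
# device, and labels 2 and 3 occupy the last two whole 256 KiB blocks.
# Written for a POSIX shell (ksh, bash), not the legacy Bourne /bin/sh.
zero_zfs_labels() {
    dev=$1            # device or slice, e.g. /dev/rdsk/c2t0d1s0
    size_bytes=$2     # size of that device in bytes (caller supplies it)
    label=262144      # 256 KiB per label
    blocks=$((size_bytes / label))   # whole 256 KiB blocks on the device
    [ "$blocks" -ge 4 ] || { echo "device too small" >&2; return 1; }

    # Labels 0 and 1: the first two 256 KiB blocks.
    dd if=/dev/zero of="$dev" bs="$label" count=2 conv=notrunc 2>/dev/null || return 1
    # Labels 2 and 3: the last two whole 256 KiB blocks.
    dd if=/dev/zero of="$dev" bs="$label" seek=$((blocks - 2)) count=2 conv=notrunc 2>/dev/null
}
```

Writing four 256 KiB blocks takes seconds rather than the hours a full-disk zero would need. Running `zdb -l` on the slice afterwards should show that the labels no longer unpack.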

I then could issue the above zpool replace command.
It's resilvering; probably in another 30 hours the pool will switch
from DEGRADED to ONLINE.

Thanks for the help.

Regards, Scott


 