identify locked files (dtrace?)


Philip Brown

Mar 13, 2015, 6:06:32 PM
Howdy folks,

We're having an issue on a primary NFS server of ours. We think that sometimes one or more of the 200 NFS clients goes nuts and requests a bunch of I/O in parallel.
The backend system is SAMFS, which automatically locks a file when a write happens.

Trouble is, the locks are backing up NFS usage so much that it's DoS'ing the NFS server!

So we need a way to identify what file(s) are getting all choked up while the system is live.

I've found assorted dtrace scripts, such as iofile.d, which says which files have been written to recently, and lockbyproc.d, which says which processes have locks.

Trouble is, in our case, it's all going to be ONE process on the server "nfsd_kproc".
So we need to be able to identify the file(s) that are locked.
(and ideally how many locks are hung up on the file)

Could anyone help us out please?

Andrew Gabriel

Mar 13, 2015, 6:27:06 PM
In article <b3d5f502-5696-4fcd...@googlegroups.com>,
Does this help?

echo ::lminfo | mdb -k

--
Andrew Gabriel
[email address is not usable -- followup in the newsgroup]

Philip Brown

Mar 13, 2015, 8:10:40 PM
On Friday, March 13, 2015 at 3:27:06 PM UTC-7, Andrew Gabriel wrote:
> In article <b3d5f502-5696-4fcd...@googlegroups.com>,
> Philip Brown <ph...@bolthole.com> writes:

> > So we need to be able to identify the file(s) that are locked.
> > (and ideally how many locks are hung up on the file)
> >
>
> Does this help?
>
> echo ::lminfo | mdb -k
>

Hi Andrew!


ERm...
i'm not sure.
seems to have truncated-to-uselessness data :(

...
4c0720c6b00 WR 0001 1031 mountd 4c07cb14480
/system/volatile/nf
4c07206fd80 WR 0021 492 ypbind 4c077693540
/system/volatile/da
4c0720b3580 WR 0021 492 ypbind 4c07770e580
/system/volatile/da
...

4c072294000 WR 0001 1034 nfsd 4c07d32da00
/system/volatile/nf



Is there a way for me to find out the filename?

Andrew Gabriel

Mar 13, 2015, 8:55:58 PM
In article <83d74d8d-7abb-45c0...@googlegroups.com>,
Philip Brown <ph...@bolthole.com> writes:
> On Friday, March 13, 2015 at 3:27:06 PM UTC-7, Andrew Gabriel wrote:
>> In article <b3d5f502-5696-4fcd...@googlegroups.com>,
>> Philip Brown <ph...@bolthole.com> writes:
>
>> > So we need to be able to identify the file(s) that are locked.
>> > (and ideally how many locks are hung up on the file)
>> >
>>
>> Does this help?
>>
>> echo ::lminfo | mdb -k
>>
>
> Hi Andrew!
>
>
> ERm...
> i'm not sure.
> seems to have truncated-to-uselessness data :(
>
> ...
> 4c0720c6b00 WR 0001 1031 mountd 4c07cb14480
> /system/volatile/nf
> 4c07206fd80 WR 0021 492 ypbind 4c077693540
> /system/volatile/da
> 4c0720b3580 WR 0021 492 ypbind 4c07770e580
> /system/volatile/da
> ...
>
> 4c072294000 WR 0001 1034 nfsd 4c07d32da00
> /system/volatile/nf
>
>
>
> Is there a way for me to find out the filename?

Urgh - prettiness over correctness :-(

Try:
echo ::lminfo | mdb -k | awk 'NF>5{print $6"::vnode2path"}' | mdb -k

Clearly there's a race here between the two mdb commands.
There's probably a way to do it in a single mdb pipeline, but I
don't know how to fish out just the 5th field using mdb dcmds.
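
A slightly more defensive form of the awk step above, as a sketch only. It keeps just the rows whose sixth field looks like a hex address, so a header row or a wrapped path line is not turned into an mdb command. The function name lminfo_to_vnode2path is made up for illustration, and the assumed column layout (vnode address in field 6) comes only from the sample output quoted in this thread.

```shell
# Hypothetical helper: read ::lminfo output on stdin and emit one
# "<vnode>::vnode2path" command per lock row.
# Assumption: the vnode address is the 6th whitespace-separated field,
# as in the sample output quoted in this thread.
lminfo_to_vnode2path() {
    awk 'NF >= 6 && $6 ~ /^[0-9a-f]+$/ { print $6 "::vnode2path" }'
}

# Intended use on a live system (the race between the two mdb runs remains):
#   echo ::lminfo | mdb -k | lminfo_to_vnode2path | mdb -k
```

This does nothing about the race itself; it only avoids feeding mdb lines that are not lock rows.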

Philip Brown

Mar 13, 2015, 9:19:45 PM
On Friday, March 13, 2015 at 5:55:58 PM UTC-7, Andrew Gabriel wrote:
> In article <83d74d8d-7abb-45c0...@googlegroups.com>,
>
>
> Try:
> echo ::lminfo | mdb -k | awk 'NF>5{print $6"::vnode2path"}' | mdb -k
>

heh.
I found a fancier version when I searched around for lminfo.

# from http://utcc.utoronto.ca/~cks/space/blog/solaris/ListingFileLocks?showcomments
echo '::walk lock_graph | ::print lock_descriptor_t l_vnode | ::vnode2path' | mdb -k | sort

which works great!.... on normal filesystems.
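
Since the original question also asked how many locks are held per file, the path list from that pipeline can be tallied with standard tools. A rough sketch: count_locks_by_path is a name invented here, and it assumes the input is one path per line.

```shell
# Hypothetical helper: read one path per line on stdin and print
# "count path" pairs, busiest file first.
count_locks_by_path() {
    sort | uniq -c | sort -rn
}

# Intended use on a live system:
#   echo '::walk lock_graph | ::print lock_descriptor_t l_vnode | ::vnode2path' |
#       mdb -k | count_locks_by_path
```

It only counts whatever the walker reports, so it is no better than the pipeline feeding it.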

Unfortunately, the backend for our NFS server is SAMFS, which is somehow immune to ::walk lock_graph
Samfs apparently has its OWNNNN special rwlock tree :(


It is visible with dtrace, I think, since one of the various dtrace sample scripts listed nfsd_kproc as having something.

Unfortunately...I don't know how to translate that stuff to *filenames*.

Sighhh..

Casper H.S. Dik

Mar 14, 2015, 6:06:14 AM
Philip Brown <ph...@bolthole.com> writes:

>ERm...
>i'm not sure.
>seems to have truncated-to-uselessness data :(

>...
>4c0720c6b00 WR 0001 1031 mountd 4c07cb14480
>/system/volatile/nf
>4c07206fd80 WR 0021 492 ypbind 4c077693540
>/system/volatile/da
>4c0720b3580 WR 0021 492 ypbind 4c07770e580
>/system/volatile/da
>...

>4c072294000 WR 0001 1034 nfsd 4c07d32da00
>/system/volatile/nf

>Is there a way for me to find out the filename?

Unfortunately, ::lminfo doesn't work with pipes, but you can try:

echo ::lminfo | mdb -k | awk '{print $6 "::print vnode_t v_path"}' | mdb -k

which is a shame because mdb should be able to do that without help
from awk.

As of Solaris 11.1 (I think, perhaps even in Solaris 11 FCS but certainly not
in the last OpenSolaris release) the v_path nodes are fairly reliable.

Casper

Casper H.S. Dik

Mar 14, 2015, 6:06:58 AM
and...@cucumber.demon.co.uk (Andrew Gabriel) writes:

>Clearly there's a race here between the two mdb commands.
>There's probably a way to do it in a single mdb pipeline, but I
>don't know how to fish out just the 5th field using mdb dcmds.

I don't think mdb wants to pipe ::lminfo.

Casper

Philip Brown

Mar 14, 2015, 3:52:10 PM
On Saturday, March 14, 2015 at 3:06:14 AM UTC-7, Casper H. S. Dik wrote:

>
> >Is there a way for me to find out the filename?
>
>
> echo ::lminfo | mdb -k | awk '{print $6 "::print vnode_t v_path"}' | mdb -k
>
> which is a shame because mdb should be able to do that without help
> from awk.
>
> As of Solaris 11.1 (I think, perhaps even in Solaris 11 FCS but certainly not
> in the last OpenSolaris release) the v_path nodes are fairly reliable.
>

Hmm. thanks for trying.
Unfortunately, it does not seem to show samfs locks :(

I guess there's some sort of samfs-internal-only rwlock thing.
All I know so far is that it involves samfs_rwlock_common()