Re: [Lustre-devel] Lustre dcache clean up and distributed locking


Fan Yong

Jan 22, 2012, 11:11:38 AM
to Andreas Dilger, haiyin...@emc.com, Bryon Neitzel, Xuezh...@emc.com, sfai...@emc.com, lustre...@lists.lustre.org
On 1/21/12 2:06 AM, Andreas Dilger wrote:
> Hi all, please enjoy your New Years holiday. This issue can wait until your return.
>
> Fan Yong,
> About DNE locking (I write this now only to avoid forgetting it, no immediate answer is needed) - in order to handle the split name/permission issue, the current plan is for the client to get the LOOKUP lock for the FID on both the master MDT and the remote MDT.
Hi Andreas,

Why two LOOKUP locks on two MDTs, rather than one LOOKUP lock on the
inode-holder MDT and a new name lock on the name-entry-holder MDT? Is
there any special advantage? I think both methods need client-side
support (for protocol changes).


Cheers,
Fan Yong
>
> If the link is removed from the master MDT, then the FID LOOKUP lock will be cancelled there, but if another hard link is removed from a different MDT then only the LOOKUP lock from the other MDT needs to be cancelled.
>
> This will be similar to Tao's proposal for a separate lock bit for the name, in the DNE case where the name is remote from the inode. It still suffers from some inefficiency in case of multiple hard links from the same master MDT to a remote inode (canceling the FID LOOKUP lock when unlinking one name will force it to be refetched for the other links), but since hard links are so rare this should not significantly impact performance.
>
> Cheers, Andreas
>
> On 2012-01-20, at 10:10, Fan Yong<yong...@whamcloud.com> wrote:
>> Excellent work. I just added some comments inside your document. Please
>> check.
>>
>>
>> Best Regards,
>> Fan Yong
>> Whamcloud, Inc.
>>
>> On 1/20/12 9:34 AM, haiyin...@emc.com wrote:
>>> Hi Andreas,
>>>
>>> Tao is on vacation now. It might be difficult for him to check emails due to limited internet access at his hometown.
>>> For urgent issues, you folks could send email to his gmail account berg...@gmail.com.
>>>
>>> Thanks,
>>> Haiying
>>>
>>> -----Original Message-----
>>> From: Andreas Dilger [mailto:adi...@whamcloud.com]
>>> Sent: January 20, 2012 1:39
>>> To: Peng, Tao
>>> Cc: faibish, sorin; Tang, Haiying; Liu, Xuezhao; lais...@whamcloud.com; yong...@whamcloud.com; gr...@whamcloud.com; e...@whamcloud.com
>>> Subject: Re: Lustre dcache clean up and distributed locking
>>>
>>> On 2012-01-17, at 3:21 AM, <tao....@emc.com> wrote:
>>>> Thanks Siyao and Oleg for answering my dcache revalidation question on lustre mailing list. I updated the design to reflect it.
>>>>
>>>> Please see attachment.
>>> Tao,
>>> Fan Yong is also taking a more detailed look at your document and will
>>> hopefully have a chance to reply before the New Year holidays.
>>>
>>> Also, we are just working on landing the patches to add support for Linux
>>> 2.6.38 to the Lustre client. One of the patches relates to the lockless
>>> dcache changes that were introduced in that kernel. If you are interested
>>> in reviewing this patch, and in becoming more familiar with the Lustre
>>> development process, you should visit http://review.whamcloud.com/1865 for the patch.
>>>
>>> You need to create an account in Gerrit using OpenID (Google, mostly), and
>>> an account in our bug tracking system (http://jira.whamcloud.com) if you
>>> haven't already.
>>>
>>>>> -----Original Message-----
>>>>> From: Andreas Dilger [mailto:adi...@whamcloud.com]
>>>>> Sent: Tuesday, January 17, 2012 4:16 PM
>>>>> To: Peng, Tao
>>>>> Cc: faibish, sorin; Tang, Haiying; Liu, Xuezhao; Lai Siyao; Fan Yong; Oleg Drokin; Eric Barton
>>>>> Subject: Re: Lustre dcache clean up and distributed locking
>>>>>
>>>>> On 2012-01-16, at 9:25 PM,<tao....@emc.com> wrote:
>>>>>> I finally started to work on Lustre dcache cleanup and locking. After reading Lustre, ocfs2 and VFS
>>>>> dcache related code, I came to a design for cleaning up Lustre dcache code and doing distributed
>>>>> locking under dcache. For distributed locking, the main idea is to add a new inode bitlock, the DENTRY
>>>>> lock, to protect just the valid dentry, instead of letting the LOOKUP lock serve multiple purposes, which
>>>>> leaves the client unable to know whether a file was deleted when the server cancels the LOOKUP lock.
>>>>> Instead, the client holds a PR mode DENTRY lock on all valid dentries, and when the server cancels it,
>>>>> the client knows the file was deleted.
>>>>>> For details, please see the attachments (I attached both Word and PDF versions as I am not sure
>>>>> which is more convenient for you :). And please help to review and comment. Thank you!
>>>>>
>>>>> Hi Tao,
>>>>> I'm passing this on to the engineers that are most familiar with the DLM and client dcache code in
>>>>> Lustre. After a quick first read through your document, your investigation of the client code is very
>>>>> insightful and looks like it will be able to remove some of the significant complexity that has grown
>>>>> over time in the llite code.
>>>>>
>>>>> Cheers, Andreas
>>>>> --
>>>>> Andreas Dilger Whamcloud, Inc.
>>>>> Principal Engineer http://www.whamcloud.com/
>>>>>
>>>>>
>>>> <dcache_distributed_locking-v2.docx>
>>> Cheers, Andreas
>>> --
>>> Andreas Dilger Whamcloud, Inc.
>>> Principal Engineer http://www.whamcloud.com/
>>>
>>>
>>>
>>>
>>>
>> <dcache_distributed_locking-v2_comment.docx>

_______________________________________________
Lustre-devel mailing list
Lustre...@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-devel

Andreas Dilger

Jan 22, 2012, 12:33:42 PM
to Fan Yong, haiyin...@emc.com, Bryon Neitzel, Xuezh...@emc.com, sfai...@emc.com, lustre...@lists.lustre.org
On 2012-01-22, at 9:11, Fan Yong <yong...@whamcloud.com> wrote:

> On 1/21/12 2:06 AM, Andreas Dilger wrote:
>>
>> Fan Yong,

>> About DNE locking - in order to handle the split name/permission issue, the current plan is for the client to get the LOOKUP lock for the FID on both the master MDT and the remote MDT.


> Hi Andreas,
>
> Why two LOOKUP locks on two MDTs, rather than one LOOKUP lock on the inode-holder MDT and a new name lock on the name-entry-holder MDT? Is there any special advantage? I think both methods need client-side support (for protocol changes).

The difference between the two approaches is relatively small in terms of the functional outcome, but in terms of implementation there is a bigger difference.

Two reasons for this:
- we already have LOOKUP locks, so fewer code changes are needed
- a single lock per FID is easier to manage than one lock per name

Given that chmod/chgrp/chown/ACL operations are much more common in practice than hard links, having a single LOOKUP lock is more efficient.

Even with hard links, if the client does a lookup on a link on the master, it can find the FID from the dirent and may have the inode data cached from the remote MDT.

I agree that name locks may have advantages in some cases, but I don't know if that is worth the extra complexity.
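Andreas's cost argument can be made concrete with a toy model (purely illustrative, not Lustre code): count the locks a client holds to cache a file's names and permissions, and the cached names lost when one name is unlinked, under each scheme. Since hard links (nlink > 1) are rare, the per-FID scheme wins on lock count and only loses in the uncommon unlink-of-a-hard-link case.

```c
#include <assert.h>

/* Toy model: a client caches all `nlink` names of a file plus its
 * permissions, under the two schemes discussed in this thread. */

/* Scheme A: one LOOKUP lock on the FID covers permissions and every name. */
static int locks_held_per_fid(int nlink)  { (void)nlink; return 1; }

/* Scheme B: one (new) name lock per link, plus the FID lock for permissions. */
static int locks_held_per_name(int nlink) { return nlink + 1; }

/* Unlinking one name under scheme A cancels the shared LOOKUP lock, so the
 * other cached names must be re-fetched; under scheme B only that name's
 * lock goes away and the remaining names stay cached. */
static int names_invalidated_by_unlink_per_fid(int nlink)  { return nlink - 1; }
static int names_invalidated_by_unlink_per_name(int nlink) { (void)nlink; return 0; }
```

For the common nlink == 1 case both schemes hold one name-visible lock's worth of state and invalidate nothing extra, which is why the simpler per-FID scheme is preferred above.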

Cheers, Andreas

tao....@emc.com

Jan 30, 2012, 2:44:52 AM
to lais...@whamcloud.com, gr...@whamcloud.com, haiyin...@emc.com, Xuezh...@emc.com, sfai...@emc.com, lustre...@lists.lustre.org

Hi Siyao,

 

Yes, ll_lookup_lock should be removable. And it looks like it will be handled by Yang Sheng in ‘LU-506 kernel: FC15 - Dcache RCU-walk changes’, as commented by Fan Yong (http://review.whamcloud.com/#patch,sidebyside,1865,2,lustre/include/linux/lustre_patchless_compat.h). I will leave it as is and let Sheng handle it.

 

Thanks,

Tao

 

From: Lai Siyao [mailto:lais...@whamcloud.com]
Sent: Friday, January 20, 2012 10:56 AM
To: Oleg Drokin
Cc: Andreas Dilger; Peng, Tao; faibish, sorin; Tang, Haiying; Liu, Xuezhao; Fan Yong; Eric Barton
Subject: Re: Lustre dcache clean up and distributed locking

 

Hi Tao,

 

The reason ll_lookup_lock was introduced is that in some kernels (e.g. 2.6.32 for RHEL6), d_rehash_cond() doesn't exist and only d_rehash() can be used, which requires unlocking dcache_lock before entry; that opens a race, so a global lock similar to dcache_lock is needed. The improvement that can be made here is that, where d_rehash_cond() is exported, the ll_lookup_lock lock/unlock can be a no-op; and in recent kernels, where dcache_lock has been removed, ll_lookup_lock should be removable as well.

 

I like most of the cleanups, but I don't see much benefit in introducing a DENTRY bit lock, because a dentry without a LOOKUP lock can hardly be used; inodebit lock coverage upgrade/downgrade would be more useful here.

 

Bye,

- Lai

On Fri, Jan 20, 2012 at 1:11 AM, Oleg Drokin <gr...@whamcloud.com> wrote:

Hello!


On Jan 17, 2012, at 3:16 AM, Andreas Dilger wrote:

> On 2012-01-16, at 9:25 PM, <tao....@emc.com> wrote:
>> I finally started to work on Lustre dcache cleanup and locking. After reading Lustre, ocfs2 and VFS dcache related code, I came to a design for cleaning up Lustre dcache code and doing distributed locking under dcache. For distributed locking, the main idea is to add a new inode bitlock, the DENTRY lock, to protect just the valid dentry, instead of letting the LOOKUP lock serve multiple purposes, which leaves the client unable to know whether a file was deleted when the server cancels the LOOKUP lock. Instead, the client holds a PR mode DENTRY lock on all valid dentries, and when the server cancels it, the client knows the file was deleted.
>>
>> For details, please see the attachments (I attached both Word and PDF versions as I am not sure which is more convenient for you :). And please help to review and comment. Thank you!

Thanks for looking into this.
It is indeed possible to add another bit for just the dentry, but the problem is that it would need to be granted in conjunction with
the LOOKUP bit anyway (along with the permission information); if we don't do this, then we would need to send an additional RPC to get
that information very shortly after lookup in many cases, and that would be pretty slow. Of course there is no way to have a different
lock mode for every bit, so in this scheme the lock modes would really be the same.
And the (currently) big problem with multiple-bit locks is that we don't have a way to cancel just a subset of a lock (or downgrade a lock's
coverage, I should say). This was planned quite a while ago, but somehow never made it to the top of the priority list, so it was never implemented.
There are minor comments on the logic flow after that too. E.g. locks in Lustre are released right away on the client side, but are just placed in the LRU and remain on the client in this unreferenced state. When there is a conflict, the server sends the AST RPC, and only then is the lock
actually released (cancelled) by the client. Currently the AST is not delivered for a lock that has non-zero usage counters.
Additionally, due to the way we implement locks, it is not possible to have a separate lock for every name of the file.
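The constraints above (all bits of a lock share one mode, cancellation is all-or-nothing, and ASTs are deferred while a lock is in use) can be sketched with a toy model. The bit names echo Lustre's MDS_INODELOCK_* idea, but the values, structures, and behaviour here are illustrative, not the real LDLM implementation:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative inodebits; treat the values as a sketch, not on-wire constants. */
#define IBIT_LOOKUP 0x01  /* name -> FID translation, permissions */
#define IBIT_UPDATE 0x02  /* inode attributes, directory contents */
#define IBIT_DENTRY 0x04  /* the proposed new bit */

enum lock_mode { LCK_NONE, LCK_PR, LCK_PW };

struct ibits_lock {
    uint64_t ibits;       /* all bits of the lock share...   */
    enum lock_mode mode;  /* ...this single lock mode        */
    int refcount;         /* non-zero => in use, AST deferred */
};

/* Granting: DENTRY would have to come together with LOOKUP anyway,
 * or the client needs a second RPC right after lookup. */
static void grant(struct ibits_lock *lck, uint64_t bits, enum lock_mode m)
{
    lck->ibits = bits;
    lck->mode = m;
    lck->refcount = 0;   /* dropped to the LRU immediately after use */
}

/* Cancellation is all-or-nothing: there is no way to cancel a subset
 * of the bits (no lock-coverage downgrade), and a busy lock makes the
 * server's AST wait. */
static int cancel(struct ibits_lock *lck)
{
    if (lck->refcount > 0)
        return -1;        /* in use: cancel deferred */
    lck->ibits = 0;
    lck->mode = LCK_NONE;
    return 0;
}
```

This is why adding a DENTRY bit alone does not buy much: cancelling it means cancelling the whole lock, LOOKUP bit included.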

For your cleanup list, I think it is good. Note that item 'c' was already implemented some time ago; you can see it in
https://bugzilla.lustre.org/show_bug.cgi?id=16417 and the patch is this: https://bugzilla.lustre.org/attachment.cgi?id=24200
It is probably somewhat stale now. It did not work correctly back at the time due to some other logic, but should work now.
You can read the comments in that bug for more info.

Bye,
   Oleg
--
Oleg Drokin
Senior Software Engineer
Whamcloud, Inc.

 

tao....@emc.com

Jan 30, 2012, 2:46:14 AM
to yong...@whamcloud.com, berg...@gmail.com, haiyin...@emc.com, br...@whamcloud.com, Xuezh...@emc.com, sfai...@emc.com, lustre...@lists.lustre.org
Hi Fan Yong,

Sorry for the late reply. I got your email earlier but didn't have a chance to read the document due to very limited internet access.

Thank you very much for taking a detailed look at the document and giving so many valuable comments.

About adding a new inode bitlock, I must admit that I missed the hardlink case. Although the lock mode can be changed to be the same as the LOOKUP lock's, I agree that an inode bitlock cannot sufficiently protect every name/dentry separately for hardlinked files. And I agree that, as you suggested in the document, adding a name-entry-based lock can solve it.

IIUC, to add a name-entry-based lock, both client and server need to be modified, and the LDLM needs to be changed to manage the new lock information. Like the new inode bitlock, the name-entry-based lock will also introduce interoperability issues as a result of the on-wire protocol change. And in a DNE environment, as Andreas explained, the original plan is to get LOOKUP locks on both the master MDT and the remote MDT; if we introduce a name-entry-based lock, the DNE case may need some changes as well.

I will take a closer look at the approach. In the meantime, could you please confirm whether it is the right way to go? The benefit of the above complexity seems to be better distributed dentry locking under the dcache framework.

What do others think? Andreas?

Thanks and Best Regards,
Tao

> -----Original Message-----
> From: Fan Yong [mailto:yong...@whamcloud.com]
> Sent: Saturday, January 21, 2012 1:11 AM
> To: Peng, Tao; berg...@gmail.com
> Cc: Tang, Haiying; adi...@whamcloud.com; faibish, sorin; Liu, Xuezhao; lais...@whamcloud.com;
> gr...@whamcloud.com; e...@whamcloud.com; Bryon Neitzel
> Subject: Re: Lustre dcache clean up and distributed locking
>
> Hi Tao,
>
> Excellent work. I just added some comments inside your document. Please check.
>
>
> Best Regards,
> Fan Yong
> Whamcloud, Inc.
>
> On 1/20/12 9:34 AM, haiyin...@emc.com wrote:
> > Hi Andreas,
> >
> > Tao is on vacation now. It might be difficult for him to check emails due to limited internet access
> at hometown.
> > For urgent issue, you folks could send email to his gmail account berg...@gmail.com.
> >
> > Thanks,
> > Haiying
> >
> > -----Original Message-----
> > From: Andreas Dilger [mailto:adi...@whamcloud.com]
> > Sent: 2012年1月20日 1:39
> > To: Peng, Tao
> > Cc: faibish, sorin; Tang, Haiying; Liu, Xuezhao;
> > lais...@whamcloud.com; yong...@whamcloud.com; gr...@whamcloud.com;
> > e...@whamcloud.com
> > Subject: Re: Lustre dcache clean up and distributed locking
> >
> > On 2012-01-17, at 3:21 AM, <tao....@emc.com> <tao....@emc.com> wrote:
> >> Thanks Siyao and Oleg for answering my dcache revalidation question on lustre mailing list. I
> updated the design to reflect it.
> >>
> >> Please see attachment.
> > Tao,
> > Fan Yong is also taking a more detailed look at your document and will
> > hopefully have a chance to reply before the New Year holidays.
> >
> > Also, we are just working on landing the patches to add support for
> > Linux
> > 2.6.38 for the Lustre client. One of the patches relates to the
> > lockless dcache changes that were introduced in that kernel. If you
> > are interested to review this patch, and become more familiar with the
> > Lustre development process, you should visit http://review.whamcloud.com/1865 for the patch.
> >
> > You need to create an account in Gerrit using OpenID (Google, mostly),
> > and an account in our bug tracking system (http://jira.whamcloud.com)
> > if you haven't already.
> >
> >>> -----Original Message-----
> >>> From: Andreas Dilger [mailto:adi...@whamcloud.com]
> >>> Sent: Tuesday, January 17, 2012 4:16 PM
> >>> To: Peng, Tao
> >>> Cc: faibish, sorin; Tang, Haiying; Liu, Xuezhao; Lai Siyao; Fan
> >>> Yong; Oleg Drokin; Eric Barton
> >>> Subject: Re: Lustre dcache clean up and distributed locking
> >>>
> >>> On 2012-01-16, at 9:25 PM, <tao....@emc.com> wrote:
> >>>> I finally started to work on Lustre dcache cleanup and locking.
> >>>> After reading Lustre, ocfs2 and VFS
> >>> dcache related code, I came to a design for cleaning up Lustre
> >>> dcache code and doing distributed locking under dcache. For
> >>> distributed locking, the main idea is to add a new inode bitlock
> >>> DENTRY lock to just protect valid dentry, instead of letting LOOKUP
> >>> lock handle multiple purpose, which makes client unable to know
> >>> whether file is deleted or not when server cancels LOOKUP lock. Instead, client holds PR mode
> DENTRY lock on all valid denties and when server cancels it, client knows the file is deleted.
> >>>> For details, please see the attachments (I attached both word and
> >>>> pdf versions as I am not sure if
> >>> which is more convenient for you:). And please help to review and comment. Thank you!
> >>>
> >>> Hi Tao,
> >>> I'm passing this on to the engineers that are most familiar with the
> >>> DLM and client dcache code in Lustre. After a quick first read
> >>> through your document, your investigation of the client code is very
> >>> insightful and looks like it will be able to remove some of the significant complexity that has
> grown over time in the llite code.
> >>>
> >>> Cheers, Andreas
> >>> --
> >>> Andreas Dilger Whamcloud, Inc.
> >>> Principal Engineer http://www.whamcloud.com/
> >>>
> >>>
> >> <dcache_distributed_locking-v2.docx>
> >
> > Cheers, Andreas
> > --
> > Andreas Dilger Whamcloud, Inc.
> > Principal Engineer http://www.whamcloud.com/
> >
> >
> >
> >
> >

Andreas Dilger

Jan 30, 2012, 6:27:52 PM
to tao....@emc.com, haiyin...@emc.com, br...@whamcloud.com, Xuezh...@emc.com, sfai...@emc.com, lustre...@lists.lustre.org
On 2012-01-30, at 12:46 AM, <tao....@emc.com> wrote:
> Sorry for the late reply. I got your email earlier but didn't have a chance to read the document due to very limited internet access.
>
> Thank you very much for taking a detailed look at the document and giving so many valuable comments.
>
> About adding a new inode bitlock, I must admit that I missed the hardlink case. Although the lock mode can be changed to be the same as the LOOKUP lock's, I agree that an inode bitlock cannot sufficiently protect every name/dentry separately for hardlinked files. And I agree that, as you suggested in the document, adding a name-entry-based lock can solve it.
>
> IIUC, to add a name-entry-based lock, both client and server need to be modified, and the LDLM needs to be changed to manage the new lock information. Like the new inode bitlock, the name-entry-based lock will also introduce interoperability issues as a result of the on-wire protocol change. And in a DNE environment, as Andreas explained, the original plan is to get LOOKUP locks on both the master MDT and the remote MDT; if we introduce a name-entry-based lock, the DNE case may need some changes as well.
>
> I will take a closer look at the approach. In the meantime, could you please confirm whether it is the right way to go? The benefit of the above complexity seems to be better distributed dentry locking under the dcache framework.
>
> What do others think? Andreas?

Tao, before we go down the road of implementing such a change, there would have to be clear benefits to performance and/or kernel integration to justify the interoperability issues and added code complexity.

The benefits of having a per-name lock, as I see them, are:
- can be cancelled separately from other name locks
- can be changed independently from the inode permissions (LOOKUP)
- potential ease of integration with the Linux dcache

The drawbacks of introducing a per-name lock:
- only useful for hard-linked files (uncommon case)
- added code complexity for a new lock type
- incompatibility with existing client/server
- additional code to handle incompatibility transparently
- lock enqueue may need extra RPC for each name lock (different resource)
- lock cancellation may need an extra RPC for each resource

The way I see it is that the DENTRY lock would need to be a different resource for each name on a file. The only way this is going to be useful AFAICS is if the client enqueues each DENTRY lock on a separate resource, so that it can be granted and cancelled independently of the LOOKUP lock (which is on the FID and would be common for all names that reference this inode). That would either need an extra RPC for every filename lookup (to enqueue both the DENTRY and LOOKUP locks) or a new LDLM_ENQUEUE RPC that can request and be granted two different locks (FID [seq, oid, ver, 0], and FID+name hash [seq, oid, ver, hash]) in a single RPC.
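The two resource namings described above ([seq, oid, ver, 0] for the FID lock versus [seq, oid, ver, hash] for a per-name lock) can be sketched as follows. The struct layout and helper names are hypothetical, following the thread's notation rather than the actual ldlm_res_id packing:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Sketch of an LDLM resource name: four 64-bit words, matching the
 * thread's [seq, oid, ver, hash] notation. */
struct res_id {
    uint64_t name[4];
};

/* Minimal FID: sequence, object id, version. */
struct lu_fid {
    uint64_t seq;
    uint32_t oid;
    uint32_t ver;
};

/* FID resource [seq, oid, ver, 0]: shared by all names of the inode,
 * used for the LOOKUP (permission) lock. */
static struct res_id fid_res(const struct lu_fid *fid)
{
    struct res_id r = { { fid->seq, fid->oid, fid->ver, 0 } };
    return r;
}

/* Per-name resource [seq, oid, ver, hash(name)]: distinct for each
 * link, so a DENTRY lock could be granted and cancelled independently
 * of the FID lock and of the other links' locks. */
static struct res_id name_res(const struct lu_fid *fid, uint64_t name_hash)
{
    struct res_id r = { { fid->seq, fid->oid, fid->ver, name_hash } };
    return r;
}

static int res_eq(struct res_id a, struct res_id b)
{
    return memcmp(a.name, b.name, sizeof(a.name)) == 0;
}
```

Because the two resources differ, a single enqueue cannot cover both today, which is exactly why either an extra RPC or a two-lock LDLM_ENQUEUE would be needed.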

So, the main question is how much better the client dcache/VFS integration will be with the DCACHE lock compared to the existing mechanism with the LOOKUP (FID) lock? Is it impossible (or very difficult) to use the existing mechanism for the new lockless dcache code? Or is it a matter of this being a somewhat cleaner implementation, but not critical for moving forward?

I'm aware that our dcache code has been too complex for some time, but I haven't looked recently to see how much of this is due to using/supporting old kernel APIs, and how much of it is due to the use of the LOOKUP lock on the FID protecting too much of the state on the client.

Cheers, Andreas

Fan Yong

Jan 31, 2012, 7:02:14 AM
to tao....@emc.com, haiyin...@emc.com, br...@whamcloud.com, Xuezh...@emc.com, sfai...@emc.com, lustre...@lists.lustre.org
On 1/31/12 7:27 AM, Andreas Dilger wrote:
> [...]
Hi Tao,

To introduce either a per-name lock or a DENTRY lock is mainly a way to
remove the Lustre-special flag "DCACHE_LUSTRE_INVALID" and thereby
simplify the Lustre code. But if such simplification causes more complex
LDLM logic, we have to consider whether removing "DCACHE_LUSTRE_INVALID"
is worth it. It is not impossible to get "DCACHE_LUSTRE_INVALID" accepted
by the Linux kernel. On the other hand, "DCACHE_LUSTRE_INVALID" can also
work together with the lockless dcache in new Linux kernels.

So to push the project forward, we can divide it into two parts
(according to your document): the Lustre dcache code cleanup, and the new
LDLM logic to remove "DCACHE_LUSTRE_INVALID". Before we reach an
agreement on the second part, we can start with the first part and clean
up the Lustre dcache code. What do you think?

Cheers,
--
Nasf

tao....@emc.com

Feb 3, 2012, 2:37:05 AM
to yong...@whamcloud.com, adi...@whamcloud.com, haiyin...@emc.com, br...@whamcloud.com, Xuezh...@emc.com, sfai...@emc.com, lustre...@lists.lustre.org
> -----Original Message-----
> From: Fan Yong [mailto:yong...@whamcloud.com]
> Sent: Tuesday, January 31, 2012 8:02 PM
> To: Peng, Tao
> Cc: Andreas Dilger; berg...@gmail.com; Tang, Haiying; faibish, sorin; Liu, Xuezhao;
> lais...@whamcloud.com; gr...@whamcloud.com; e...@whamcloud.com; br...@whamcloud.com; lustre-
> de...@lists.lustre.org
> Subject: Re: Lustre dcache clean up and distributed locking
>
Thank you, Andreas and Yong. I tend to agree with you that we should avoid the complexity if there are no obvious benefits. However, as we need to make sure the code is acceptable to the kernel community, I will try to consolidate a design for adding the new type of lock but just not implement it at this point. And if later we are asked (by kernel community developers) to do it, we will just need to do the implementation.

Please stay tuned.

Best,
Tao

Fan Yong

Feb 3, 2012, 11:17:39 AM
to tao....@emc.com, haiyin...@emc.com, br...@whamcloud.com, Xuezh...@emc.com, sfai...@emc.com, lustre...@lists.lustre.org

Hi Tao,

We have another way to remove the Lustre-special dentry flag
"DCACHE_LUSTRE_INVALID". To mark a dentry that is still hashed in the
dcache but no longer protected by a LOOKUP lock, we can add a new flag
in "ll_dentry_data" (pointed to by "dentry::d_fsdata"). Since that is
Lustre-private data, there is no need to worry about the new flag
conflicting with any standard VFS "dentry::d_flags" bit, and it may save
some of the effort of implementing the complex DENTRY lock in your
original design.
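That suggestion can be sketched in C as follows. The flag name LLD_LUSTRE_INVALID is taken from later in this thread; the field layout, the flag value, and the helper names are assumptions for illustration, not code from any Lustre release:

```c
#include <assert.h>

/* Lustre-private per-dentry state hung off dentry->d_fsdata, so no bit
 * in the VFS-owned dentry->d_flags is consumed. */
#define LLD_LUSTRE_INVALID 0x0001  /* still hashed, but LOOKUP lock gone */

struct ll_dentry_data {
    unsigned int lld_flags;
    /* ... other Lustre-private fields ... */
};

/* Stand-in for struct dentry; only d_fsdata matters for this sketch. */
struct dentry {
    void *d_fsdata;
};

/* Called when the server cancels the LOOKUP lock covering this name. */
static void ll_dentry_set_invalid(struct dentry *de)
{
    struct ll_dentry_data *lld = de->d_fsdata;
    lld->lld_flags |= LLD_LUSTRE_INVALID;
}

/* Checked from d_revalidate instead of a DCACHE_* bit. */
static int ll_dentry_is_invalid(const struct dentry *de)
{
    const struct ll_dentry_data *lld = de->d_fsdata;
    return (lld->lld_flags & LLD_LUSTRE_INVALID) != 0;
}

/* Cleared once a LOOKUP lock is re-acquired for the name. */
static void ll_dentry_set_valid(struct dentry *de)
{
    struct ll_dentry_data *lld = de->d_fsdata;
    lld->lld_flags &= ~LLD_LUSTRE_INVALID;
}
```

The point of the design is visible in the sketch: the VFS never sees the flag, so nothing needs to be accepted upstream.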

Cheers,
Nasf

Peng Tao

Feb 5, 2012, 8:50:16 AM
to Fan Yong, haiyin...@emc.com, br...@whamcloud.com, Xuezh...@emc.com, sfai...@emc.com, lustre...@lists.lustre.org
Brilliant! Then it is a Lustre-private flag and should not receive any
objection from the VFS layer.

Thanks,
Tao


tao....@emc.com

Feb 9, 2012, 2:15:55 AM
to yong...@whamcloud.com, haiyin...@emc.com, br...@whamcloud.com, Xuezh...@emc.com, sfai...@emc.com, lustre...@lists.lustre.org
Hi Fan Yong,

Thanks for your great suggestion. I updated the design doc with the following changes; please help to review. Thank you!

Changelog:
1. Macro cleanups are removed. I prefer to do them separately while cleaning up other obsolete macros.
2. Removed the DENTRY LDLM lock proposal. Because the DCACHE_LUSTRE_INVALID flag can be hidden from the generic VFS layer, it is no longer necessary to introduce such a complex change.
3. Added a section describing how to add an LLD_LUSTRE_INVALID flag to the ll_dentry_data structure, replacing the existing DCACHE_LUSTRE_INVALID flag.

Cheers,
Tao
dcache_cleanup_v3.docx

tao....@emc.com

Feb 9, 2012, 2:35:55 AM
to yong...@whamcloud.com, haiyin...@emc.com, br...@whamcloud.com, Xuezh...@emc.com, sfai...@emc.com, lustre...@lists.lustre.org
> -----Original Message-----
> From: Peng, Tao
> Sent: Thursday, February 09, 2012 3:16 PM
> To: 'Fan Yong'
> Cc: adi...@whamcloud.com; berg...@gmail.com; Tang, Haiying; faibish, sorin; Liu, Xuezhao;
> lais...@whamcloud.com; gr...@whamcloud.com; e...@whamcloud.com; br...@whamcloud.com; lustre-
> de...@lists.lustre.org
> Subject: RE: Lustre dcache clean up and distributed locking
>
>
> [...]
>
BTW, I notice that there are some conflicts with Siyao's latest RCU-walk patch proposal (http://review.whamcloud.com/#change,1865). The main conflict seems to be about how to handle parent UPDATE lock cancellation. I will update the doc to resolve the conflict once Siyao's new patchset is out.

Thanks,
Tao

Andreas Dilger

Feb 10, 2012, 3:37:25 PM
to <tao.peng@emc.com>, haiyin...@emc.com, br...@whamcloud.com, Xuezh...@emc.com, sfai...@emc.com, lustre...@lists.lustre.org
On 2012-02-09, at 12:15 AM, <tao....@emc.com> wrote:

> [...]

Hi Tao,
these cleanups look quite reasonable now. Will you be working on making
these changes for Lustre? The feature freeze for the Lustre 2.2 release
was Jan 7, so any non-bugfix changes will need to land in Lustre 2.3.

However, that doesn't prevent you from working on these patches and sending
them to Gerrit for inspection, testing, and getting them ready for landing.
You should base your patch on top of the most recent dcache patch in

http://review.whamcloud.com/1865

If you haven't looked already, please read the following wiki pages:

http://wiki.whamcloud.com/display/PUB/Submitting+Changes
http://wiki.whamcloud.com/display/PUB/Using+Gerrit

for details on how to submit patches to Gerrit for review/testing/landing.

Cheers, Andreas
--
Andreas Dilger Whamcloud, Inc.

Principal Lustre Engineer http://www.whamcloud.com/

Peng Tao

Feb 11, 2012, 8:40:36 AM
to Andreas Dilger, haiyin...@emc.com, br...@whamcloud.com, Xuezh...@emc.com, sfai...@emc.com, lustre...@lists.lustre.org
On Sat, Feb 11, 2012 at 4:37 AM, Andreas Dilger <adi...@whamcloud.com> wrote:
> On 2012-02-09, at 12:15 AM, <tao....@emc.com> <tao....@emc.com> wrote:
>> Fan Yong <yong...@whamcloud.com> wrote:
>>> We have another way to remove the Lustre-special dentry flag
>>> "DCACHE_LUSTRE_INVALID". To mark a dentry in the dcache that is no
>>> longer protected by a LOOKUP lock but not yet unhashed, we can add a new
>>> flag in "ll_dentry_data" (pointed to by "dentry::d_fsdata"). Since it is
>>> Lustre-private data, there is no need to worry about the new flag
>>> conflicting with any standard VFS "dentry::d_flags", and it may save the
>>> effort of implementing the complex DENTRY lock in your original design.
>>
>> Hi Fan Yong,
>>
>> Thanks for your great suggestion. I updated the design doc with the following changes; please help to review. Thank you!
>>
>> Changelog:
>> 1. Macro cleanups are removed. I prefer to do them separately while cleaning up other obsolete macros.
>> 2. Remove the DENTRY LDLM lock proposal. Because the DCACHE_LUSTRE_INVALID flag can be hidden from the generic VFS layer, such a complex change is no longer necessary.
>> 3. Add a section describing how to replace the existing DCACHE_LUSTRE_INVALID flag with an LLD_LUSTRE_INVALID flag in the ll_dentry_data structure.
>
> Hi Tao,
Hi Andreas,

> these cleanups look quite reasonable now.  Will you be working on making
> these changes for Lustre?  The feature freeze for the Lustre 2.2 release
> was Jan 7, so any non-bugfix changes will need to land for Lustre 2.3.
>

Yes, it is on my todo list. The plan is for me to first remove the
outdated configure macros and related code that exist to support kernels
older than 2.6.18-latest, and then work on the dcache cleanups on top of
the most recent dcache RCU-walk patch.

> However, that doesn't prevent you from working on these patches and sending
> them to Gerrit for inspection, testing, and getting them ready for landing.
> You should base your patch on top of the most recent dcache patch in
>
> http://review.whamcloud.com/1865
>
> If you haven't looked already, please read the following wiki pages:
>
> http://wiki.whamcloud.com/display/PUB/Submitting+Changes
> http://wiki.whamcloud.com/display/PUB/Using+Gerrit
>
> for details on how to submit patches to Gerrit for review/testing/landing.
>

Thanks for sharing the links. I will make sure all patches follow the
guidelines.

Best,
Tao

Yong Fan

Feb 11, 2012, 10:58:42 AM
to tao....@emc.com, haiyin...@emc.com, br...@whamcloud.com, Xuezh...@emc.com, sfai...@emc.com, lustre...@lists.lustre.org
Hi Tao,

The new design looks OK to me.
BTW, you removed the section about "ll_lookup_lock" from the first version of the document. So you will not change the related logic, right? I think this part should be fixed in new kernels.

Cheers,
Nasf

Peng Tao

Feb 11, 2012, 10:46:00 PM
to Yong Fan, haiyin...@emc.com, br...@whamcloud.com, Xuezh...@emc.com, sfai...@emc.com, lustre...@lists.lustre.org
Hi Yong,

On Sat, Feb 11, 2012 at 11:58 PM, Yong Fan <yong...@whamcloud.com> wrote:
> Hi Tao,
>
> The new design looks OK to me.

Thanks for reviewing!

> BTW, you removed the section about "ll_lookup_lock" from the first
> version of the document. So you will not change the related logic, right? I
> think this part should be fixed in new kernels.

"ll_lookup_lock" was removed in Siyao's latest dcache patch and
replaced by an inode lock, so I removed the section about
"ll_lookup_lock" from the design and will not change the related logic.

Cheers,
Tao

tao....@emc.com

Feb 22, 2012, 10:13:12 PM
to yong...@whamcloud.com, haiyin...@emc.com, br...@whamcloud.com, Xuezh...@emc.com, sfai...@emc.com, lustre...@lists.lustre.org
Hi all,

I cleaned up the design a bit more to reflect the changes in Siyao's latest dcache patch. Some of the previously mentioned dcache cleanups were removed because they are already addressed by the dcache scalability patch, so the remaining task is mainly hiding the DCACHE_LUSTRE_INVALID flag. Please see the attachment for details. I also uploaded it to LU-1113.
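For reference, the flag-hiding approach might look roughly like the following user-space sketch. Only ll_dentry_data, dentry::d_fsdata and LLD_LUSTRE_INVALID come from this thread; the struct layout and the d_lustre_* helper names are illustrative assumptions, not the actual patch:

```c
#include <assert.h>
#include <stddef.h>

/* Minimal mock of the VFS dentry: only the private-data pointer matters
 * for this sketch. */
struct dentry {
	void *d_fsdata;		/* points to ll_dentry_data */
};

/* Lustre-private flag: dentry has lost its LOOKUP lock but is not yet
 * unhashed.  Lives in private data, so it cannot collide with any VFS
 * DCACHE_* bit in dentry::d_flags. */
#define LLD_LUSTRE_INVALID	0x0001

struct ll_dentry_data {
	unsigned int lld_flags;
};

static struct ll_dentry_data *ll_d2d(struct dentry *de)
{
	return de->d_fsdata;
}

/* Mark the dentry invalid without touching VFS-visible d_flags. */
static void d_lustre_invalidate(struct dentry *de)
{
	ll_d2d(de)->lld_flags |= LLD_LUSTRE_INVALID;
}

/* Would be consulted from ->d_revalidate(): a dentry carrying the
 * private flag must be looked up again. */
static int d_lustre_invalid(struct dentry *de)
{
	return !!(ll_d2d(de)->lld_flags & LLD_LUSTRE_INVALID);
}

/* Clear the flag once a LOOKUP lock is re-acquired. */
static void d_lustre_revalidate(struct dentry *de)
{
	ll_d2d(de)->lld_flags &= ~LLD_LUSTRE_INVALID;
}
```

Since the generic VFS layer never sees this bit, no dcache core changes are required, which is the point of dropping the DENTRY LDLM lock proposal.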

Cheers,
Tao
dcache_cleanup_v4.docx