[GSOC] Bdbuf improvements

21 views
Skip to first unread message

Xiang Cui

unread,
Mar 29, 2012, 6:59:58 AM3/29/12
to rtems...@rtems.org
Hi

I'm interested in Bdbuf improvements as my GSOC project this year. The
main goal of the project is to change existing replacement strategy
(LRU) to more advanced cache algorithms(for example LIRS). The IO
performance will benefit from this change. It will also bring some API
change in Disk and Driver module. Here are some details and comparison
about the cache algorithms:

LIRS[1] means Low Inter-reference Recency Set, which is a cache
algorithm introduced by Song Jiang and Xiaodong Zhang in
SIGMETRICS'02. The reason for LRU to behave poorly is that LRU makes a
bold assumption a block that has not been accessed the longest would
wait for relatively longest time to be accessed again. This assumption
cannot capture the access patterns exhibited in these workloads with
weak locality.The algorithm takes into consideration the
Inter-Reference Recency of pages as the dominant factor for eviction.
The Inter-Reference Recency (IRR) of a page refers to the number of
other pages accessed between two consecutive references to that page.
It is assumed that if current IRR of a page is large, then the next
IRR of the block is likely to be large again and hence the page is
suitable for eviction as per Belady's MIN. It needs to be noted that
the page with high IRR selected for eviction may also have been
recently used[2].

Song Jiang concludes the LIRS in his presentation as :
Effectively use deeper access history without explicit regularity
detection and high cost operations.
Outperform exiting replacement policies.
Its implementation as simple as LRU.
Applicable to virtual memory and database buffer management.

LIRS is used by MySQL.

The Adaptive Replacement Cache (ARC) is an adaptive page replacement
algorithm developed at the IBM Almaden Research Center [3]. It seems
that the performance of ARC is better than LIRS, but the algorithm is
patented (by IBM).

There also may other advanced cache algorithms like CLOCK-Pro [4],CAR
(CLOCK with Adaptive Replacement )algorithms, 2Q algorithm and etc. In
CLOCK-PRO paper, it is indicated that CLOCK-PRO is superior than LIRS
and other buffer algorithms (including ARC)[5].


I am not certain whether the above algorithms are all appropriate for
Block Buffer. I also take a look at Linux file cache. It seems that
Linux does not use a such complex strategy[6]. I believe the advanced
cache algorithms will gain the performance in IO heavy application
with adding some complexity and overload as a trade off. Since the
LIRS is widely used in database system, I think it's appropriate for
the Block Device Buffer Management in RTEMS. To apply these
algorithms, the existing API also need some change.

As the direction of the wiki[7], here is my draft plan:
The very first thing is to create a set of benchmarks and the test
cases for Block Device Buffer Management.
Then I will implement the algorithms in C language as a demo. To apply
the algorithms to RTEMS, some API will be changed. So I will carefully
discuss the change in the community.
The next thing should be running the benchmarks and test cases. Memory
footprint and other overload will be compared between two algorithms
if possible.
If everything goes well the new cache algorithm will be added as an
experimental option. And more optimization will be added.

Last year I helped to add the file system POSIX compatibility test
suite to RTEMS. I have more free time than last summer, so I think
this project is OK for me. I just start to read the Block Device
Buffer Management source code. Any advice are welcomed. After
gathering the feedback, I will complete my proposal as soon as
possible.

[1]LIRS
http://www.cse.ohio-state.edu/hpcs/WWW/HTML/publications/papers/TR-02-6.pdf
[2]A Comparison of Page Replacement Algorithms
http://www.ijetch.org/papers/218-T697.pdf
[3]ARC: A Self-Tuning, Low overhead Replacement Cache
http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.13.5210
[4] Clock-Pro
http://www.cse.ohio-state.edu/hpcs/WWW/HTML/publications/abs05-3.html
[5]A discuss about buffer cache in postgresql DB maill list
http://www.mail-archive.com/pgsql-...@postgresql.org/msg173951.html
[6]Linux cache swap
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=blob_plain;f=mm/swap.c;hb=refs/heads/master
[7]Bdbuf improvements
http://www.rtems.org/wiki/index.php/Bdbuf_improvements
_______________________________________________
rtems-users mailing list
rtems...@rtems.org
http://www.rtems.org/mailman/listinfo/rtems-users

Sebastian Huber

unread,
Mar 30, 2012, 6:01:12 AM3/30/12
to rtems...@rtems.org
Hello Xiang Cui,

it is good to hear that you are interested in this topic. I think all RTEMS
users with block devices will profit from changes in this area.

On 03/29/2012 12:59 PM, Xiang Cui wrote:
[...]


> Last year I helped to add the file system POSIX compatibility test
> suite to RTEMS.

Thanks a lot, your tests were very helpful for my recent file system changes.

> I have more free time than last summer, so I think
> this project is OK for me. I just start to read the Block Device
> Buffer Management source code. Any advice are welcomed. After
> gathering the feedback, I will complete my proposal as soon as
> possible.

The most important part of the bdbuf documentation is the state machine here:

http://www.rtems.org/onlinedocs/doxygen/cpukit/html/group__rtems__bdbuf.html#details

This diagram is the specification. If you need more states or change
transitions, then do this first here. With this diagram you can also show the
correctness.

There are several tests for the bdbuf:

testsuites/libtests/block*

The test testsuites/libtests/block05 tests the state machine.

--
Sebastian Huber, embedded brains GmbH

Address : Obere Lagerstr. 30, D-82178 Puchheim, Germany
Phone : +49 89 18 90 80 79-6
Fax : +49 89 18 90 80 79-9
E-Mail : sebasti...@embedded-brains.de
PGP : Public key available on request.

Diese Nachricht ist keine geschäftliche Mitteilung im Sinne des EHUG.

Joel Sherrill

unread,
Mar 30, 2012, 10:34:17 AM3/30/12
to Sebastian Huber, rtems...@rtems.org
On 03/30/2012 05:01 AM, Sebastian Huber wrote:
> Hello Xiang Cui,
>
> it is good to hear that you are interested in this topic. I think all RTEMS
> users with block devices will profit from changes in this area.
>
> On 03/29/2012 12:59 PM, Xiang Cui wrote:
> [...]
>> Last year I helped to add the file system POSIX compatibility test
>> suite to RTEMS.
> Thanks a lot, your tests were very helpful for my recent file system changes.
Agreed. I only wish we had had more time to resolve the issues
you uncovered while you had time to be active.

Sebastian.. are there any issues outstanding with the
existing fstests? I haven't checked recently.


>> I have more free time than last summer, so I think
>> this project is OK for me. I just start to read the Block Device
>> Buffer Management source code. Any advice are welcomed. After
>> gathering the feedback, I will complete my proposal as soon as
>> possible.
> The most important part of the bdbuf documentation is the state machine here:
>
> http://www.rtems.org/onlinedocs/doxygen/cpukit/html/group__rtems__bdbuf.html#details
>
> This diagram is the specification. If you need more states or change
> transitions, then do this first here. With this diagram you can also show the
> correctness.
>
> There are several tests for the bdbuf:
>
> testsuites/libtests/block*
>
> The test testsuites/libtests/block05 tests the state machine.
>

This code is important and performance is important.

I don't know if it is listed in the project description but one of my
long time desires for RTEMS has been pluggable resource
scheduling algorithms. We now have pluggable CPU scheduling
algorithms. Having pluggable cache replacement algorithms
and disk access scheduling algorithms is in the same vein.

Just as preemptive priority based thread scheduling is the
right solution 90+% of the time, an algorithm like the
Adaptive Cache Replacement one is probably the right
solution most of the time. But these are embedded
systems with dedicated applications and usage patterns.
Perhaps in some applications, an algorithm with different
characteristics is a better choice.

So any work that makes the management algorithm
easily pluggable is desirable.

Disclaimer: My dissertation was on disk caching and disk
scheduling algorithms to avoid priority inversions. If you
want to avoid priority inversion, then your choice of
algorithm is different than if you solely focus on
minimizing cache misses.

--
Joel Sherrill, Ph.D. Director of Research& Development
joel.s...@OARcorp.com On-Line Applications Research
Ask me about RTEMS: a free RTOS Huntsville AL 35805
Support Available (256) 722-9985

Sebastian Huber

unread,
Mar 30, 2012, 10:48:43 AM3/30/12
to RTEMS
On 03/30/2012 04:34 PM, Joel Sherrill wrote:
> Sebastian.. are there any issues outstanding with the
> existing fstests? I haven't checked recently.

The fstests are fine. Some file systems (e.g. IMFS) have non-POSIX conform
lseek() and ftruncate().

Maybe if we have a budget for this, then I want to move the tests in a library
so file systems outside the RTEMS sources can be tested with them as well (e.g.
YAFFS2).

--
Sebastian Huber, embedded brains GmbH

Address : Obere Lagerstr. 30, D-82178 Puchheim, Germany
Phone : +49 89 18 90 80 79-6
Fax : +49 89 18 90 80 79-9
E-Mail : sebasti...@embedded-brains.de
PGP : Public key available on request.

Diese Nachricht ist keine geschäftliche Mitteilung im Sinne des EHUG.

Reply all
Reply to author
Forward
0 new messages