[ANNOUNCE] BFQ-v7r2 for 3.0.0 - 3.13.0, plus 3.10.8+

Paolo Valente

Feb 14, 2014, 1:03:43 PM

to bfq-i...@googlegroups.com

Hi,
you may have the impression that we have entered a frenetic
code-development phase with BFQ. Well, you are right. Why?

Because at last it seems that we have found the time to package BFQ for
submission to lkml. And, while breaking BFQ down into a set of logically
distinct patches (for review), we bumped into a small piece of BFQ
that we could not describe clearly. Eventually we realized,
embarrassingly, that we had let the heuristic for soft real-time
applications perform one important step in a confused way ...

So we fixed that, and here is the result: v7r2. Besides being
conceptually clear, the new version of this step also provides some small
additional benefits in terms of responsiveness and latency for soft
real-time applications. If you are interested in more details, I have
pasted an excerpt of the changelog at the end of this email.

Thanks to the indispensable contribution of Arianna, patches are again
available for a countless number of kernel versions.

Finally, I took the opportunity to further update the test-result page
with the results for v7r2, obtained using the just-released new version
of the benchmark suite:
http://www.algogroup.unimo.it/people/paolo/disk_sched/results.php
In particular, this page now also contains a link to an extra result
page, with responsiveness results for two more applications and, above
all, frame-drop-rate results for our video-playing tests. I do suggest
that you have a look at these results, if only to better know what you
are putting in your kernels :)

Paolo

Excerpt from the changelog:

BUGFIX/IMPROVEMENT. One of the requirements for an application to be
deemed soft real-time is that it issues its requests in batches, and
stops doing I/O for a well-defined amount of time before issuing a new
batch. Imposing this minimum idle time allows BFQ to filter out I/O-bound
applications that may otherwise be incorrectly deemed soft real-time
(under the circumstances described in detail in the comments to the
function bfq_bfqq_softrt_next_start()). Unfortunately, BFQ could
start counting this idle time from two different events: either from the
expiration of the queue, if all requests of the queue had already been
completed when the queue expired, or, if that condition did not
hold, from the first completion of one of the still outstanding requests.
In the second case, an application had a higher chance of being deemed
soft real-time.
Actually, there was no reason for this differentiated treatment. We
addressed this issue by defining more precisely the above requirement for
an application to be deemed soft real-time, and changing the code
accordingly: a well-defined amount of time must elapse between the
completion of *all the requests* of the current pending batch and the
issuing of the first request of the next batch (this is, in the end, what
happens with a true soft real-time application). This change further
reduced false positives and, as such, improved responsiveness and reduced
latency for actual soft real-time applications.
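
The difference between the two ways of counting the idle time can be
illustrated with a small Python sketch. This is not the BFQ kernel code
(which is written in C); the function names, the time units and the
sample numbers are all hypothetical, chosen only to mirror the wording of
the changelog excerpt above.

```python
from typing import Sequence


def old_idle_start(completions: Sequence[float], expiry: float) -> float:
    """Idle-time starting point before v7r2, as the changelog describes it.

    Hypothetical helper: 'completions' holds the completion time of every
    request in the batch, 'expiry' is the time at which the queue expired.
    """
    outstanding = [t for t in completions if t > expiry]
    if not outstanding:
        # All requests had already completed when the queue expired.
        return expiry
    # Otherwise: first completion among the still outstanding requests.
    return min(outstanding)


def new_idle_start(completions: Sequence[float]) -> float:
    """v7r2 rule: idle time starts once *all* requests of the batch are done."""
    return max(completions)


def deemed_soft_rt(idle_start: float, next_issue: float, min_idle: float) -> bool:
    """True if at least min_idle elapsed before the next batch was issued."""
    return next_issue - idle_start >= min_idle


# Made-up I/O-bound pattern: one request lingers until t=9, next batch at t=10.
completions = [1.0, 2.0, 9.0]
expiry, next_issue, min_idle = 1.5, 10.0, 5.0

old = deemed_soft_rt(old_idle_start(completions, expiry), next_issue, min_idle)
new = deemed_soft_rt(new_idle_start(completions), next_issue, min_idle)
print(old, new)  # prints "True False"
```

With these invented numbers, the old counting starts at the first
outstanding completion (t=2) and accepts the application, while the new
counting starts at the last completion (t=9) and filters it out.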

post-factum

Feb 15, 2014, 3:43:19 AM

to bfq-i...@googlegroups.com

Works OK for me. Thanks for the update.

On Friday, February 14, 2014 at 20:03:43 UTC+2, user paolo wrote:

Holger Hoffstätte

Feb 15, 2014, 3:55:17 AM

to bfq-i...@googlegroups.com

Great work as usual - so I did as you said, read the results and have a question. :)

The first test (Seagate HDD) shows that BFQ wins noticeably in 10r-seq and 5r5w-seq, probably because of its superior bandwidth allocation. However I'm curious about the (admittedly pathological!) 10r-rand case, where BFQ shows pretty much exactly ~50% bandwidth of the other schedulers. Any explanation for this? I thought that BFQ had a bimodal behaviour and was able to switch from bandwidth to time expenditure model for very seeky workloads (like CFQ), but that does not seem to happen here. Is this some fairness pressure at work?

Should this workload not converge at the same result for all schedulers since they all should (in theory) do "equally badly" with such a seeky test?

Thanks!
Holger

AnAkIn

Feb 15, 2014, 12:49:50 PM

to bfq-i...@googlegroups.com

Do the 3.13 patches currently work with 3.14rc2 as well? :)

--
You received this message because you are subscribed to the Google Groups "bfq-iosched" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bfq-iosched...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Arianna Avanzini

Feb 15, 2014, 2:02:56 PM

to bfq-i...@googlegroups.com

On Saturday, February 15, 2014 at 18:49:50 UTC+1, AnAkIn wrote:

> Do the 3.13 patches currently work with 3.14rc2 as well? :)

Hello,

thank you for your interest in BFQ. Actually, the 3.13 patches do not compile correctly on 3.14-rc2. To let them compile, an additional small patch must be applied on top of them.

I am attaching to this mail the patch needed to compile BFQ-v7r2 for 3.13 on a 3.14-rc2 kernel. It should apply without issues with both "git am" and "patch -p1" and let BFQ-v7r2 compile without warnings.

Please do not hesitate to report any issue you might encounter.

0001-block-bfq-Switch-from-BFQ-v7r2-for-3.13.0-to-BFQ-v7r2-for-3.14.0-rc2.patch

Paolo Valente

Feb 15, 2014, 2:28:31 PM

to bfq-i...@googlegroups.com

On Feb 15, 2014, at 09:55, Holger Hoffstätte wrote:

>
> Great work as usual - so I did as you said, read the results and have a question. :)
>
> The first test (Seagate HDD) shows that BFQ wins noticeably in 10r-seq and 5r5w-seq, probably because of its superior bandwidth allocation. However I'm curious about the (admittedly pathological!) 10r-rand case, where BFQ shows pretty much exactly ~50% bandwidth of the other schedulers.

Good point! Thanks for highlighting it.

> Any explanation for this? I thought that BFQ had a bimodal behaviour and was able to switch from bandwidth to time expenditure model for very seeky workloads (like CFQ), but that does not seem to happen here.

Actually, it happens.

> Is this some fairness pressure at work?

Exactly. BFQ cares more than the other schedulers about fairness, even for processes doing random I/O. As confirmed by some quick tests and investigations just done by Arianna, the fairness-related step of BFQ that causes this lower throughput on rotational disks is that BFQ always guarantees a minimum disk-idling time for these processes too. If you are curious and want to better understand why this lowers throughput, I will be glad to provide you with more details.
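
As a rough intuition only, the cost of a guaranteed idle window can be put into a back-of-the-envelope model: if every random request costs some service time on the disk and is followed by some idle wait, throughput shrinks in proportion to the added wait. The Python sketch below is an assumption-laden illustration, not a description of BFQ internals. The function name, the 4 KiB request size and the 8 ms figures are invented, and the thread does not state that this is the exact mechanism behind the ~50% result.

```python
def throughput_kib_s(request_kib: float, service_ms: float, idle_ms: float) -> float:
    """KiB/s when each request costs service_ms of disk work plus idle_ms of waiting.

    Hypothetical toy model: it ignores request reordering, caching and
    anything else a real disk or scheduler does.
    """
    return request_kib * 1000.0 / (service_ms + idle_ms)


# Invented numbers: 4 KiB random reads, 8 ms per request on a rotational disk.
no_idling = throughput_kib_s(4.0, 8.0, 0.0)    # no wait between requests
with_idling = throughput_kib_s(4.0, 8.0, 8.0)  # idle wait as long as the service time

print(with_idling / no_idling)  # prints 0.5
```

In this toy model, an idle wait comparable to the per-request service time is enough to halve the throughput.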

Anyway, it would be relatively easy to relax this constraint for these processes and get the same, still extremely low, throughput as the other schedulers. The problem is that it is hard, in general, to assess whether such a gain in terms of throughput would be worth the price to pay in terms of fairness. In particular, if throughput is an issue, then the (only) actual solution to get a non-negligible throughput with such a workload seems to be to switch to a flash-based device.

We have added further investigations on this topic to our TODO list.

Thanks,
Paolo

>
> Should this workload not converge at the same result for all schedulers since they all should (in theory) do "equally badly" with such a seeky test?
>
> Thanks!
> Holger

AnAkIn

Feb 15, 2014, 6:29:21 PM

to bfq-i...@googlegroups.com

Thanks, it works with this new patch!

I hope it'll be merged into the Linux kernel :)

Holger Hoffstätte

Feb 14, 2014, 1:35:01 PM

to bfq-i...@googlegroups.com

On 02/14/14 19:03, Paolo Valente wrote:
> Finally, I took the opportunity to further update the test-result page
> with the results for v7r2, obtained using the just-released new version
> of the benchmark suite:
> http://www.algogroup.unimo.it/people/paolo/disk_sched/results.php
> In particular, this page now also contains a link to an extra result
> page, with responsiveness results for two more applications and, above
> all, frame-drop-rate results for our video-playing tests. I do suggest
> you to have a look at these results, if only to better know what you are
> putting in your kernels :)

Great work as usual - so I did as you said, read the results and have a
question. :)

The first test (Seagate HDD) shows that BFQ wins noticeably in 10r-seq
and 5r5w-seq, probably because of its superior bandwidth allocation.
However I'm curious about the (admittedly pathological!) 10r-rand case,
where BFQ shows pretty much ~50% bandwidth of the other schedulers. Any

Paolo Valente

Mar 2, 2014, 11:32:40 AM

to bfq-i...@googlegroups.com

Hi,
you may think we have calmed down on working on BFQ for a while, at last ...

Well, you are wrong. We have concluded that BFQ cannot stand the affront of achieving half the very low throughput of the other schedulers with random workloads, even if this happens because BFQ favors fairness and latency guarantees over poor throughput.

So, since a few days after Holger highlighted the loss of throughput with these workloads, we have been working on some 'magic' solution that preserves both guarantees and throughput with random workloads.

We already have something that seems to start working ... However, as you can imagine, the operation we are trying to perform is not easy at all, so it is still too early to declare that we have succeeded.

Paolo

Josh King

Mar 3, 2014, 10:51:36 AM

to bfq-i...@googlegroups.com

Look forward to testing it!

Josh

On 03/02/2014 10:32 AM, Paolo Valente wrote:
> Hi,
> you may think we have calmed down on working on BFQ for a while, at last ....
>
> Well, you are wrong. We have concluded that BFQ cannot stand the affront of achieving half the very low throughput of the other schedulers with random workloads. Even if it happens because BFQ privileges fairness and latency guarantees over poor throughput.
>
> So we have been working on a some 'magic' solution that preserves both guarantees and throughput with random workloads, since a few days after Holger highlighted a loss of throughput with the latter workloads.
>
> We have already something that seems to start working ... However, as you can imagine, the operation we are trying to perform is not easy at all, so it is still early to declare that we have succeeded.

AnAkIn

Mar 6, 2014, 12:06:49 PM

to bfq-i...@googlegroups.com

Great :)

Will you try to get it mainlined for Linux 3.15?
