With the recent interest in varying the cpu schedulers in Linux, this set of patches provides a modular framework for adding multiple boot-time selectable cpu schedulers. William Lee Irwin III came up with the original design and I based my patchset on that.
This code was designed to touch as few files as possible, be completely arch-independent, and allow extra schedulers to be coded in by only touching Kconfig, scheduler.c and scheduler.h. It should incur no overhead when run and will allow you to compile in only the scheduler(s) you desire. This allows, for example, embedded hardware to have a tiny new scheduler that takes up minimal code space.
This works by taking all functions that will be common to all scheduler designs and moving them from kernel/sched.c into kernel/scheduler.c.
It then adds the scheduler driver struct, which is a set of pointers to the functions that need per-scheduler versions. The struct is defined in include/linux/scheduler.h.
kernel/sched.c remains as the default cpu scheduler in the same place, to minimise the size of the patch set and keep it as portable as possible.
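To make the driver-struct idea concrete, here is a minimal userspace sketch of it. Everything in it is an assumption made for illustration: the field names, the two toy schedulers, select_scheduler() and the common_*() wrappers are hypothetical and are not the identifiers used in the patches.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily simplified task; the real task_struct is far larger. */
struct task {
    int static_prio;
};

/* Sketch of a scheduler driver: one pointer per scheduler-specific operation.
 * Field names are illustrative only. */
struct sched_drv {
    const char *name;
    void (*init_runqueues)(void);
    void (*enqueue)(struct task *p);
    struct task *(*pick_next)(void);
};

/* Toy "default" scheduler: remembers only the most recently queued task. */
static struct task *default_slot;
static void default_init(void) { default_slot = NULL; }
static void default_enqueue(struct task *p) { default_slot = p; }
static struct task *default_pick(void) { return default_slot; }

/* Toy "prio" scheduler: remembers the task with the lowest static_prio. */
static struct task *prio_best;
static void prio_init(void) { prio_best = NULL; }
static void prio_enqueue(struct task *p)
{
    if (prio_best == NULL || p->static_prio < prio_best->static_prio)
        prio_best = p;
}
static struct task *prio_pick(void) { return prio_best; }

static const struct sched_drv default_sched = {
    .name = "default", .init_runqueues = default_init,
    .enqueue = default_enqueue, .pick_next = default_pick,
};
static const struct sched_drv prio_sched = {
    .name = "prio", .init_runqueues = prio_init,
    .enqueue = prio_enqueue, .pick_next = prio_pick,
};

/* Only the schedulers compiled in would appear in this table. */
static const struct sched_drv *const sched_table[] = { &default_sched, &prio_sched };

/* The active driver; common code only ever calls through this pointer. */
static const struct sched_drv *sched_drvp = &default_sched;

/* Pick a driver by name, as a boot parameter handler might.
 * Returns 0 on success, -1 if no such scheduler was compiled in. */
static int select_scheduler(const char *name)
{
    for (size_t i = 0; i < sizeof(sched_table) / sizeof(sched_table[0]); i++) {
        if (strcmp(sched_table[i]->name, name) == 0) {
            sched_drvp = sched_table[i];
            return 0;
        }
    }
    return -1;
}

/* Common (scheduler-independent) entry points. */
static void common_enqueue(struct task *p)
{
    assert(sched_drvp != NULL);
    sched_drvp->enqueue(p);
}
static struct task *common_schedule(void)
{
    assert(sched_drvp != NULL);
    return sched_drvp->pick_next();
}
```

The point of the sketch is that the common code never names a particular scheduler; it only dereferences the active driver, so adding a scheduler means adding one more table entry.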
All variables of the task_struct that could be unique to a different scheduler are now in a private struct held within a union in task_struct. rt_priority and static_priority are kept global for userspace interface and for the possibility of adding run-time switching later on.
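As a rough sketch of that layout (not the actual definitions from the patches), the private data can be pictured like this. The struct names, their members and the sdu field name are hypothetical; only rt_priority and static_prio as shared fields follow the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical private state for the default scheduler. */
struct default_sched_data {
    int prio;
    unsigned long sleep_avg;
    unsigned int time_slice;
};

/* Hypothetical private state for another scheduler, e.g. staircase. */
struct staircase_sched_data {
    int prio;
    unsigned long burst;
    unsigned int slice;
};

struct task {
    /* Kept global: used by the userspace interface and shared by all schedulers. */
    int static_prio;
    unsigned long rt_priority;

    /* Only one scheduler is active per boot, so the private data can share storage. */
    union {
        struct default_sched_data dflt;
        struct staircase_sched_data staircase;
    } sdu;
};

/* Each scheduler touches only its own union member. */
static void default_init_task(struct task *p)
{
    p->sdu.dflt.prio = p->static_prio;
    p->sdu.dflt.sleep_avg = 0;
    p->sdu.dflt.time_slice = 100;
}

static void staircase_init_task(struct task *p)
{
    p->sdu.staircase.prio = p->static_prio;
    p->sdu.staircase.burst = 0;
    p->sdu.staircase.slice = 10;
}
```

With a union, the task structure grows only by the size of the largest private struct, no matter how many schedulers are compiled in.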
The main disadvantage of this design is that there will (initially) be a lot of code duplication by different scheduler designs in their own private space. This means that if a new scheduler uses the same smp balancing algorithm as the default scheduler, its private copy will need to be updated to keep in sync with changes to the default scheduler. The duplication can be avoided in some cases: if, for example, you modified just the dynamic priority component of the current scheduler and left the runqueue and task_struct the same, you could make the new scheduler depend on the default scheduler and point most functions to that one.
However, the same disadvantage can be a major advantage. Because so much of the scheduler is privatised, wildly different designs can be plugged in without any reference to the number of runqueues. Frame schedulers could be plugged in, as could shared runqueues (e.g. on NUMA designs), and we could even keep new balancing code in a "developing" arm of the scheduler that can be booted by testers and so on.
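The "point most functions to the default scheduler" case might look roughly like the following sketch. The driver fields, the sleep_bonus member and the "flat" variant are invented for illustration and do not come from the patches.

```c
#include <assert.h>

/* Hypothetical simplified task. */
struct task {
    int static_prio;
    int sleep_bonus;
};

/* Hypothetical two-operation driver, just enough to show the sharing. */
struct sched_drv {
    const char *name;
    int (*effective_prio)(const struct task *p);
    unsigned int (*task_timeslice)(const struct task *p);
};

/* Default scheduler: dynamic priority is static priority minus a bonus. */
static int default_effective_prio(const struct task *p)
{
    return p->static_prio - p->sleep_bonus;
}

static unsigned int default_task_timeslice(const struct task *p)
{
    (void)p;
    return 100;
}

/* Variant scheduler: changes only the dynamic priority calculation. */
static int flat_effective_prio(const struct task *p)
{
    return p->static_prio;
}

static const struct sched_drv default_sched = {
    .name = "default",
    .effective_prio = default_effective_prio,
    .task_timeslice = default_task_timeslice,
};

static const struct sched_drv flat_sched = {
    .name = "flat",
    .effective_prio = flat_effective_prio,       /* the one private function */
    .task_timeslice = default_task_timeslice,    /* reused from the default scheduler */
};
```

A variant built this way carries no duplicate of the shared functions, so changes to them in the default scheduler are picked up automatically, at the cost of a build dependency on the default scheduler.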
What is left to do is add a per-scheduler entry into /sys which can be used to modify the unique controls each scheduler has, and write up some documentation for this and staircase.
Anyway the patches will follow shortly, and then (not surprisingly) a port of the staircase scheduler to plug into this framework which can also be used as an example for others wishing to code up or port their schedulers.
While I have tried to build this on as many configurations as possible, I am sure breakage will creep in given the type of modification so I apologise in advance.
Patches for those who want to download them separately here: