That was the topic of the second assignment for DISO (Diseño e implementación de Sistemas Operativos, or Operating Systems Design and Implementation). After implementing an idea for how to do it, we had to run two instances of a program, change the priority of one of them, and verify that one printed its pid more times than the other.
The problem statement was clear and provided quite a bit of guidance. We had also attended a lab class, so no major complications should have arisen. That didn't quite turn out to be the case, though.
It would be really nice to know how the Linux kernel people debug it. Pablo Pessolani, the theory professor and head of DISO, explained to us that the technique consists of using the very well-known printf function, as simple as that. After thinking about it for a bit, the truth is that there doesn’t seem to be any other alternative. Following the same strategy, that’s how we tested the modified code of Minix 2.0.2 to complete the second assignment.
However, I was still curious about how Linux hackers test their modifications. I thought they might have some more comfortable and adequate way. So I looked into what techniques they use, out of simple curiosity, and the discovery surprised me a bit.
Pablo told us that he has written drivers for Linux, so if he says the printf function is used, we have to believe him. And that's the case: debugging is done with traces, or at least that's one of the simplest ways to do it. Since the Linux kernel doesn't link against the standard C library, glibc, it uses its own function for the same purpose, called printk. It can be used almost anywhere in the kernel, except early in the boot process, before the console has been initialized.
I got this information from this article. I love the conclusion that the mentioned text reaches:
As a conclusion, let me mention a suggestive reflection that Robert Love makes in his excellent book: when we start hacking the kernel, we will often use the printf() function instead of printk(), we won’t be able to avoid the natural tendency from many years of using that function in our user space programs. But someday, writing a simple user program, we will make the opposite mistake, we will use printk() instead of printf(). When that day comes, we can say that we are true kernel hackers.
kernel-labs.org is a highly recommended site for searching information, in Spanish, about the Linux kernel.
Getting back on topic, and remembering that limitation of using printk in Linux, perhaps Minix has something similar. I remember once touching a part of the Memory Manager, adding a printf, and the kernel simply wouldn't boot. I needed to see the ready process list (rdy_head[USER_Q]), since I didn't understand why something wasn't working. So, in the code that displays that list, I called the printf function only when the sh process was running. It worked.
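The trick can be sketched in plain C. The struct fields and function name below are illustrative, not the actual Minix 2.0.2 identifiers:

```c
/* Sketch of the conditional-trace trick: only dump the ready queue
   when the running process is "sh", so the kernel isn't flooded with
   output (and doesn't print before the console is usable).
   Names here are illustrative, not real Minix 2.0.2 identifiers. */
#include <stdio.h>
#include <string.h>

struct proc {
    char p_name[16];              /* process name, e.g. "sh" */
    struct proc *p_nextready;     /* next process in the ready queue */
};

/* Print the ready list only when the shell is running.
   Returns the number of entries printed (0 when silent). */
int trace_ready_queue(const struct proc *current, const struct proc *rdy_head)
{
    int printed = 0;
    if (strcmp(current->p_name, "sh") != 0)
        return 0;                 /* stay silent for everyone else */
    for (const struct proc *p = rdy_head; p != NULL; p = p->p_nextready) {
        printf("ready: %s\n", p->p_name);
        printed++;
    }
    return printed;
}
```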
Minix has a ready process queue, as I said before. Each process's information structure has a pointer to the next ready process, so the queue is accessible from an array: rdy_head[USER_Q] is the head of the user queue, rdy_head[SERVER_Q] is a pointer to the first ready server (the Memory Manager, the File System, etc.), and the same goes for rdy_head[TASK_Q], but for tasks. In effect, Minix does use priorities at a coarse level: it first checks whether any tasks are ready; if not, it runs the ready servers; and only if no servers are ready does it move on to user programs. That's where we had to implement a priority scheduler, since all Minix does, once the timeslice of the running process ends, is put it at the end of the queue and run the next one.
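That three-level pick order can be sketched as a simplified user-space model. The queue macros mirror Minix's names, but the function body is illustrative:

```c
/* Sketch of Minix's coarse priority levels: tasks run before servers,
   servers before user processes. Simplified user-space model; only
   the rdy_head[] idea is taken from Minix. */
#include <stddef.h>

#define TASK_Q   0    /* kernel tasks: highest level */
#define SERVER_Q 1    /* servers: MM, FS, ... */
#define USER_Q   2    /* ordinary user processes */
#define NQ       3

struct proc { struct proc *p_nextready; };

struct proc *rdy_head[NQ];    /* head of each ready queue */

/* Pick the next process: tasks first, then servers, then users. */
struct proc *pick_proc(void)
{
    for (int q = TASK_Q; q < NQ; q++)
        if (rdy_head[q] != NULL)
            return rdy_head[q];
    return NULL;    /* nothing ready: idle */
}
```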
The assignment statement told us to add a field, priority, to the process structure, to store the priority of each process, which could later be modified through a system call. How would you implement such functionality without starving the other processes? That is, so that everyone receives at least some CPU time, despite the existence of higher priority processes, while honoring the right of some to run for a longer period of time.
The professor told us that, for our idea, a single field (priority) wasn’t enough. So we added another one, called penalty. We came up with the idea of penalizing processes in the following way: Initialize all processes, including those forked from high-priority processes, with the priority field at 15 (the lowest priority), and the penalty field at zero. When a process’s timeslice ends, its priority is added to the penalty field, and it’s repositioned in the ready process list, sorted by both values: priority and penalty. That is, a process with priority 10 and penalty 0, upon finishing execution, would have a penalty of 10, it would be repositioned in the list, and if it still has the highest priority (remember they’re ordered by the result of priority + penalty) it continues having the CPU, and when its quantum expires again it would have a value of 20 in its penalty field. This way, at some point it would have a priority + penalty value greater than the next ready process, which would effectively cause a change in the next process to run. This initial idea was going to work, but not without some tweaks.
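Here is a minimal user-space sketch of that idea. The names (weight, enqueue_sorted, quantum_expired) are illustrative, not identifiers from our Minix code:

```c
/* Sketch of the priority+penalty scheme described above: when a
   quantum expires, the priority is added to the penalty and the
   process is reinserted in a queue ordered by priority + penalty
   (lower is better; priority 1 is highest, 15 lowest).
   All names are illustrative. */
#include <stddef.h>

struct proc {
    long priority;             /* 1 = highest, 15 = lowest */
    long penalty;              /* grows each time the quantum ends */
    struct proc *p_nextready;  /* next process in the ready queue */
};

static long weight(const struct proc *p) { return p->priority + p->penalty; }

/* Insert p into the queue keeping it sorted by priority + penalty. */
static void enqueue_sorted(struct proc **head, struct proc *p)
{
    while (*head != NULL && weight(*head) <= weight(p))
        head = &(*head)->p_nextready;
    p->p_nextready = *head;
    *head = p;
}

/* Called when the running process exhausts its timeslice. */
void quantum_expired(struct proc **head, struct proc *p)
{
    p->penalty += p->priority;   /* penalize by its own priority */
    enqueue_sorted(head, p);
}
```

For example, a process with priority 10 and penalty 0 has weight 10; after one quantum its weight becomes 20, so a waiting process with priority 15 and penalty 0 now outranks it.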
One problem that can arise is that the penalty field reaches the limit of the long data type. In that case, if I'm not mistaken, the penalty would wrap around to the smallest value (which is negative), and the process would always be placed at the beginning of the queue, monopolizing the processor for a very long time. Using an unsigned long for the penalty field wouldn't help either: it would merely wrap around to zero instead of going negative.
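A simple guard against that wrap-around can be sketched in portable C. The function name is hypothetical; in practice, a scheduler could trigger its global penalty reset whenever the check fires (strictly speaking, signed overflow in C is undefined behavior, so the check must happen before the addition):

```c
/* Sketch of an overflow guard for the penalty accumulation.
   Signed overflow is undefined behavior in C, so test before adding.
   The function name is illustrative, not from our Minix code. */
#include <limits.h>
#include <stdbool.h>

/* True if penalty + priority would exceed what a long can hold. */
bool penalty_would_overflow(long penalty, long priority)
{
    return penalty > LONG_MAX - priority;
}
```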
Another thing: what happens if the penalty field is around, on average, 10000, and a new process is created? The answer is that the new process (with penalty equal to zero) would have the CPU until the sum of its base priority plus the penalty exceeds 10000. At some point the effect would end, but it’s annoying, obviously. The solution (there are others, but this is the one we chose) is to reset all penalty fields of all processes to zero, under certain conditions. This cannot be done randomly, because if the list consists of processes with the following priorities: 1 – 1 – 5 – 15 – 1, and all penalties are reset to zero, that last process with priority 1 wouldn’t be getting CPU time as it should.
The sufficient condition (the tests seemed to indicate that we weren’t wrong about this) is the following: if a process, when its timeslice ends, is placed at the end of the queue, and the next process has a lower priority, then all penalty fields are set to zero.
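That rule can be sketched like this, reading "lower priority" as a numerically higher priority value (since 15 is the lowest priority). All names are illustrative:

```c
/* Sketch of the reset rule: after a timeslice expires and the process
   is reinserted, if it landed at the tail of the queue and the new
   head has a lower priority than it (a higher numeric value), zero
   every penalty field. Names are illustrative, not Minix code. */
#include <stddef.h>
#include <stdbool.h>

struct proc {
    long priority;             /* 1 = highest, 15 = lowest */
    long penalty;
    struct proc *p_nextready;
};

/* True when the reset rule fires: expired is last and outranked. */
bool should_reset_penalties(const struct proc *head,
                            const struct proc *expired)
{
    if (expired->p_nextready != NULL)   /* not at the end of the queue */
        return false;
    return head != NULL && head != expired &&
           head->priority > expired->priority;
}

/* Zero the penalty of every process in the ready queue. */
void reset_penalties(struct proc *head)
{
    for (struct proc *p = head; p != NULL; p = p->p_nextready)
        p->penalty = 0;
}
```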
In the end, there are many drawbacks and things to think about. Maybe we missed some case that causes an error, but our implementation, at the time of the assignment submission, seemed to work.
You can download the source code of the modified files, plus the document we submitted, from the apUnTN page (a project by some classmates from the university, where you can find several assignments from various courses), at this link.