As a headcase, in my spare time (among other things) I’m writing an operating system kernel. There isn’t much to show at the moment because I’m still digging into the boot process of x86 systems [1]. To commit what I’ve learned so far, I’ll explain the first simple but really important steps of booting a trivial kernel.
For illustration I’m gonna use a “Hello world” kernel written in NASM assembly (grab the source from GitHub):
global start ; the entry symbol for ELF
MAGIC_NUMBER equ 0x1BADB002 ; define the magic number constant
FLAGS equ 0x0 ; multiboot flags
CHECKSUM equ -MAGIC_NUMBER ; calculate the checksum
; (magic number + checksum + flags should equal 0)
section .text ; start of the text (code) section (no trailing colon, or NASM names the section ".text:")
align 4 ; the multiboot header must be 4 byte aligned
dd MAGIC_NUMBER ; write the magic number to the machine code,
dd FLAGS ; the flags,
dd CHECKSUM ; and the checksum
start: ; the loader label (defined as entry point in linker script)
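mov ebx, 0xb8000 ; VGA text buffer base
mov ecx, 80*25 ; console size in character cells
; Clear screen
mov edx, 0x0020 ; space symbol (0x20) on black background
clear_loop:
dec ecx ; step to the previous cell (sets ZF when we reach cell 0)
mov [ebx + ecx*2], dx ; each cell is 2 bytes: character, then attribute
jnz clear_loop ; mov leaves the flags alone, so this tests the dec result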
; Print red 'A'
mov ax, (4 << 8) | 0x41 ; 'A' symbol (0x41) printed in red (0x4)
mov [ebx], ax ; write a single 2-byte cell at the top-left corner
.loop:
jmp .loop ; loop forever
This kernel works with the VGA text buffer: it clears the screen of old BIOS messages and prints a capital ‘A’ in red. After that, it just loops forever.
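To make the magic constants in the listing less magic, here is a minimal sketch in C of how a text-mode cell is packed. It assumes the standard 80x25 VGA text mode, where every cell is two bytes: the character in the low byte and the attribute (background in the high nibble, foreground in the low nibble) in the high byte. The names vga_cell and vga_offset are made up for this illustration and are not part of the kernel.

```c
#include <stdint.h>

enum { VGA_COLS = 80, VGA_ROWS = 25 };

/* Pack one text-mode cell: character in the low byte, attribute in the high byte. */
uint16_t vga_cell(char ch, uint8_t fg, uint8_t bg)
{
    uint8_t attr = (uint8_t)(((bg & 0x0F) << 4) | (fg & 0x0F));
    return (uint16_t)(((uint16_t)attr << 8) | (uint8_t)ch);
}

/* Byte offset of the cell at (row, col), relative to the buffer base (0xb8000). */
uint32_t vga_offset(uint32_t row, uint32_t col)
{
    return (row * VGA_COLS + col) * 2;
}
```

With these helpers, vga_cell('A', 4, 0) gives the same value as the (4 << 8 | 0x41) expression in the assembly, and vga_cell(' ', 0, 0) gives the 0x0020 used for clearing.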
Compile it with:
nasm -f elf32 kernel.S -o kernel.o
nasm generates an object file, which is NOT suitable for execution as is: its addresses still need to be relocated from base address 0x0, its sections combined with sections from other object files, external symbols resolved, and so on. This is the job of the linker.
When you compile a userspace application, gcc invokes the linker for you with a default linker script. For kernel-space code, though, you must provide your own linker script that tells the linker where to put the various sections of the code. Our kernel has only a .text section, no stack or heap, and the multiboot header is hardcoded into the .text section, so the linker script is pretty simple:
ENTRY(start) /* the name of the entry label */
SECTIONS {
. = 0x00100000; /* the code should be loaded at 1 MB */
.text ALIGN (0x1000) : /* align at 4 KB */
{
*(.text) /* all text sections from all files */
}
}
I’ve already touched on linking in the Restricting program memory article. Basically, we’re saying “Start our code at 1MiB and put the .text section at the beginning with 4K alignment. The entry point is start”.
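If the alignment part feels hand-wavy, here is a small sketch in C of the arithmetic I assume ALIGN (0x1000) boils down to: rounding an address up to the next multiple of a power of two. The helper name align_up is hypothetical; this is not how ld implements it, just the same calculation.

```c
#include <stdint.h>

/* Round addr up to the next multiple of align (align must be a power of two). */
uint32_t align_up(uint32_t addr, uint32_t align)
{
    return (addr + align - 1) & ~(align - 1);
}
```

Since 0x00100000 is already a multiple of 0x1000, align_up(0x00100000, 0x1000) leaves it untouched, so .text really does start at exactly 1MiB.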
Link like this:
ld -melf_i386 -T link.ld kernel.o -o kernel
And run the kernel directly with QEMU:
qemu-system-i386 -kernel kernel
You should see the red ‘A’ in the top-left corner of the screen.
When a computer is powered up, the CPU starts executing code at its “reset vector”. For modern x86 processors it is 0xFFFFFFF0. At this address the motherboard places a jump instruction to the BIOS code. The CPU is in “Real mode”: 16-bit addressing with segmentation (up to 1MiB), no protection, no paging.
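To illustrate what “16-bit addressing with segmentation” means, here is a minimal sketch in C of real-mode address translation: the 16-bit segment is shifted left by 4 bits and added to the 16-bit offset. The function name real_mode_phys is made up for this example.

```c
#include <stdint.h>

/* Real-mode translation: physical = segment * 16 + offset. */
uint32_t real_mode_phys(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

For example, segment 0xB800 with offset 0 gives 0xB8000, the same VGA buffer address our kernel writes to. The highest reachable address, 0xFFFF:0xFFFF, lands at 0x10FFEF, just past the 1MiB mark.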
The BIOS does all the usual work: it scans for devices, initializes them and finds a bootable device. Once a bootable device is found, the BIOS passes control to the bootloader on that device.
The bootloader loads the rest of itself from disk (if it is multi-stage), finds the kernel and loads it into memory. In the dark old days every OS had its own format and rules, so there was a variety of incompatible bootloaders. But now there is the Multiboot specification, which gives your kernel some guarantees and amenities in exchange for complying with the specification and providing a Multiboot header.
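Relying on the Multiboot specification is a big deal because it makes life MUCH easier: a Multiboot-compliant bootloader hands control to your kernel with the CPU already in 32-bit protected mode, the A20 line enabled and paging turned off, so you don’t have to do that dance yourself.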
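In general, booting a Multiboot-compliant kernel is simple, especially if it’s in ELF format: the kernel image only has to carry a Multiboot header, which the bootloader recognizes by its magic number (0x1BADB002). In our kernel’s text section we’ve done it: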
MAGIC_NUMBER equ 0x1BADB002 ; define the magic number constant
FLAGS equ 0x0 ; multiboot flags
CHECKSUM equ -MAGIC_NUMBER ; calculate the checksum
; (magic number + checksum + flags should equal 0)
section .text ; start of the text (code) section (no trailing colon, or NASM names the section ".text:")
align 4 ; the multiboot header must be 4 byte aligned
dd MAGIC_NUMBER ; write the magic number to the machine code,
dd FLAGS ; the flags,
dd CHECKSUM ; and the checksum
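We didn’t specify any flags because we don’t need anything from the bootloader, like memory maps and stuff, and the bootloader doesn’t need anything from us because we’re in ELF format. For other formats you must supply the load address in the Multiboot header. The Multiboot header itself is pretty simple: three required 32-bit fields (magic, flags and checksum), followed by optional fields that only matter when the corresponding flag bits are set.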
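Here is a rough sketch in C of the checksum rule and of how a bootloader could locate the header. It is based on my reading of the Multiboot specification (header within the first 8192 bytes of the image, 32-bit aligned, magic + flags + checksum equal to zero modulo 2^32). It is not GRUB’s actual code, and the names multiboot_checksum, read_le32 and find_multiboot_header are made up for the illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define MULTIBOOT_MAGIC      0x1BADB002u
#define MULTIBOOT_SEARCH_LEN 8192u  /* the header must lie in the first 8 KiB */

/* Checksum such that magic + flags + checksum == 0 (mod 2^32). */
uint32_t multiboot_checksum(uint32_t flags)
{
    return (uint32_t)0 - (MULTIBOOT_MAGIC + flags);
}

/* Read a little-endian 32-bit value regardless of host byte order. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Return the byte offset of the first valid header, or -1 if there is none. */
long find_multiboot_header(const uint8_t *image, size_t len)
{
    size_t limit = len < MULTIBOOT_SEARCH_LEN ? len : MULTIBOOT_SEARCH_LEN;

    /* Required fields: magic at +0, flags at +4, checksum at +8. */
    for (size_t off = 0; off + 12 <= limit; off += 4) {
        uint32_t magic    = read_le32(image + off);
        uint32_t flags    = read_le32(image + off + 4);
        uint32_t checksum = read_le32(image + off + 8);

        if (magic == MULTIBOOT_MAGIC &&
            (uint32_t)(magic + flags + checksum) == 0)
            return (long)off;
    }
    return -1;
}
```

With flags set to 0, multiboot_checksum(0) is simply the two’s complement of the magic number, which is exactly what CHECKSUM equ -MAGIC_NUMBER computes in the assembly.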
Now let’s boot our kernel like serious guys.
First, we create an ISO image with the help of grub2-mkrescue. Create a hierarchy like this:
isodir/
└── boot
├── grub
│ └── grub.cfg
└── kernel
Where grub.cfg is:
menuentry "kernel" {
multiboot /boot/kernel
}
And then invoke grub2-mkrescue:
grub2-mkrescue -o hello-kernel.iso isodir
And now we can boot it on any PC-compatible machine, or just in QEMU:
qemu-system-i386 -cdrom hello-kernel.iso
We’ll see the grub2 menu, where we can select our “kernel” and see the red ‘A’ letter.
Isn’t it great?
[1] My brain hurts: all this real/protected mode, A20 line, segmentation, etc. stuff is so quirky. I hope ARM booting is not that complicated.