The next version of Multiboot Specification (draft)
This is a draft of the next version of Multiboot Specification, the proposal for the boot sequence standard. The final document will be reformatted in Texinfo and rewritten with stricter wording.
NOTE: discussion of the Multiboot Specification takes place on the grub-devel mailing list. If you have an idea or a question, please post to the list rather than adding a comment here.
See Multiboot Specification for the details of the current version.
Introduction to Multiboot Specification
Background
Every operating system ever created tends to have its own boot loader. Installing a new operating system on a machine generally involves installing a whole new set of boot mechanisms, each with completely different install-time and boot-time user interfaces. Getting multiple operating systems to coexist reliably on one machine through typical chaining mechanisms can be a nightmare. There is little or no choice of boot loaders for a particular operating system -- if the one that comes with the operating system doesn't do exactly what you want, or doesn't work on your machine, you're screwed.
While we may not be able to fix this problem in existing commercial operating systems, it shouldn't be too difficult for a few people in the free operating system communities to put their heads together and solve this problem for the popular free operating systems. That's what this specification aims for. Basically, it specifies an interface between a boot loader and an operating system, such that any complying boot loader should be able to load any complying operating system. This specification does not specify how boot loaders should work -- only how they must interface with the operating system being loaded.
Although the previous version of Multiboot Specification succeeded in addressing this problem, we found another problem -- every architecture requires its own boot loader. Different boot loaders have different interfaces and functions, so users cannot carry their experience from one architecture over to another, and operating systems end up reimplementing bootstrap code from scratch for every architecture. This indicates that we need a single specification of a boot protocol that spans multiple architectures as well as operating systems. Thus we decided to write up a new generation of Multiboot Specification with portability in mind.
Target Architectures
This specification is designed to support multiple architectures. For now, PowerPC and x86-based architectures are supported, but it is not difficult to extend this specification to other architectures.
Target Operating Systems
This specification is targeted toward free 32-bit and 64-bit operating systems that can be fairly easily modified to support the specification without going through lots of bureaucratic rigmarole. The particular free operating systems that this specification is being primarily designed for are Linux, FreeBSD, NetBSD, Mach, and VSTa. It is hoped that other emerging free operating systems will adopt it from the start, and thus immediately be able to take advantage of existing boot loaders. It would be nice if commercial operating system vendors eventually adopted this specification as well, but that's probably a pipe dream.
Boot Sources
It should be possible to write compliant boot loaders that load the OS image from a variety of sources, including floppy disk, hard disk, and across a network.
Disk-based boot loaders may use a variety of techniques to find the relevant OS image and boot module data on disk, such as by interpretation of specific file systems (e.g. the BSD/Mach boot loader), using precalculated block lists (e.g. LILO), loading from a special boot partition (e.g. OS/2), or even loading from within another operating system (e.g. the VSTa boot code, which loads from DOS). Similarly, network-based boot loaders could use a variety of network hardware and protocols.
It is hoped that boot loaders will be created that support multiple loading mechanisms, increasing their portability, robustness, and user-friendliness.
Boot Time Operating System Configuration
It is often necessary for one reason or another for the user to be able to provide some configuration information to an operating system dynamically at boot time. While this specification should not dictate how this configuration information is obtained by the boot loader, it should provide a standard means for the boot loader to pass such information to the operating system.
Simplifying Operating System Development
OS images should be easy to generate. Ideally, an OS image should simply be an ordinary 32-bit or 64-bit executable file in whatever file format the operating system normally uses. It should be possible to nm or disassemble OS images just like normal executables. Specialized tools should not be required to create OS images in a special file format. If this means shifting some work from the operating system to a boot loader, that is probably appropriate, because all the memory consumed by the boot loader will typically be made available again after the boot process is completed, whereas every bit of code in the OS image typically has to remain in memory forever. On the PC, the operating system should not have to worry about getting into 32-bit mode initially, because mode switching code generally needs to be in the boot loader anyway in order to load operating system data above the 1MB boundary, and forcing the operating system to do this makes creation of OS images much more difficult.
Unfortunately, there is a horrendous variety of executable file formats even among free Unix-like PC-based operating systems -- generally a different format for each operating system. Most of the relevant free operating systems use some variant of a.out format, but some are moving to ELF. It is highly desirable for boot loaders not to have to be able to interpret all the different types of executable file formats in existence in order to load the OS image -- otherwise the boot loader effectively becomes operating system specific again.
This specification adopts a compromise solution to this problem. Multiboot-compliant OS images always contain a magic Multiboot header (see OS image format), which allows the boot loader to load the image without having to understand numerous a.out variants or other executable formats. This magic header does not need to be at the very beginning of the executable file, so kernel images can still conform to the local a.out format variant in addition to being Multiboot-compliant.
Boot Modules
Many modern operating system kernels, such as those of VSTa and Mach, do not by themselves contain enough mechanism to get the system fully operational: they require the presence of additional software modules at boot time in order to access devices, mount file systems, etc. While these additional modules could be embedded in the main OS image along with the kernel itself, and the resulting image be split apart manually by the operating system when it receives control, it is often more flexible, more space-efficient, and more convenient to the operating system and user if the boot loader can load these additional modules independently in the first place.
Thus, this specification should provide a standard method for a boot loader to indicate to the operating system what auxiliary boot modules were loaded, and where they can be found. Boot loaders don't have to support multiple boot modules, but they are strongly encouraged to, because some operating systems will be unable to boot without them.
Terminology
- must
- We use the term must, when any boot loader or OS image needs to follow a rule -- otherwise, the boot loader or OS image is not Multiboot-compliant.
- should
- We use the term should, when any boot loader or OS image is recommended to follow a rule, but it doesn't need to follow the rule.
- may
- We use the term may, when any boot loader or OS image is allowed to follow a rule.
- boot loader
- Whatever program or set of programs loads the image of the final operating system to be run on the machine. The boot loader may itself consist of several stages, but that is an implementation detail not relevant to this specification. Only the final stage of the boot loader -- the stage that eventually transfers control to an operating system -- must follow the rules specified in this document in order to be Multiboot-compliant; earlier boot loader stages may be designed in whatever way is most convenient.
- OS image
- The initial binary image that a boot loader loads into memory and transfers control to start an operating system. The OS image is typically an executable containing the operating system kernel.
- OS module
- Other auxiliary files that a boot loader loads into memory along with an OS image, but does not interpret in any way other than passing their locations to the operating system when it is invoked.
- Multiboot-compliant
- A boot loader or an OS image which follows the rules defined as must is Multiboot-compliant. When this specification specifies a rule as should or may, a Multiboot-compliant boot loader/OS image doesn't need to follow the rule.
Operating System Image Format
An OS image may be an ordinary executable file in the standard format for that particular operating system. It should be linked at a load address which avoids reserved areas of the physical address space. It should not use shared libraries or other fancy features.
An OS image must contain an additional header called Multiboot header, besides the headers of the format used by the OS image. The Multiboot header must be contained completely within the first 8192 bytes of the OS image, and must be 64-bit aligned. In general, it should come as early as possible, for example embedded in the beginning of the text segment after the real executable header.
The header must start with a 32-bit magic number 0xe85250d6 in native endian (i.e. "d6 50 52 e8" in little endian, and "e8 52 50 d6" in big endian).
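The search a boot loader performs for the header can be sketched as follows. This is only an illustration under stated assumptions: the function name mb_find_header is hypothetical, alignment is taken relative to the start of the image file, the loader is assumed to run in the same byte order as the OS image, and only the magic and flags words are checked for fitting inside the first 8192 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MB_HEADER_MAGIC 0xe85250d6u /* magic value given in the text */
#define MB_SEARCH_LIMIT 8192u       /* header must lie within these bytes */
#define MB_HEADER_ALIGN 8u          /* 64-bit alignment */

/* Hypothetical helper: returns the byte offset of the Multiboot header
 * within `image`, or -1 if no header is found. */
long mb_find_header(const unsigned char *image, size_t image_len)
{
    assert(image != NULL || image_len == 0);

    size_t limit = image_len < MB_SEARCH_LIMIT ? image_len : MB_SEARCH_LIMIT;

    /* Step through 64-bit aligned offsets; magic + flags need 8 bytes. */
    for (size_t off = 0; off + 8 <= limit; off += MB_HEADER_ALIGN) {
        uint32_t magic;
        memcpy(&magic, image + off, sizeof magic); /* native-endian read */
        if (magic == MB_HEADER_MAGIC)
            return (long)off;
    }
    return -1;
}
```

A loader that may run in a different byte order than the image would have to compare against the byte sequences listed above instead of relying on a native read.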
"Native endian" is ambiguous on architectures that support both modes of execution, such as PowerPC, MIPS, and ARM.
A 32-bit number which specifies flags must follow this magic number. The flags are categorized into four classes.
Bits 0-7 indicate flags which request required features and have the same effect on all architectures. If a boot loader does not fulfill the requirement for any of these bits, the boot loader must notify the user and fail in loading the OS image.
Bits 8-15 indicate flags which request optional features and have the same effect on all architectures. Even if a boot loader does not fulfill the requirement for any of these bits, the boot loader may simply ignore them and proceed as usual.
Bits 16-23 indicate flags which request required features and are specific to one or more architectures. If a boot loader does not fulfill the requirement for any of these bits, the boot loader must notify the user and fail in loading the OS image.
Bits 24-31 indicate flags which request optional features and are specific to one or more architectures. Even if a boot loader does not fulfill the requirement for any of these bits, the boot loader may simply ignore them and proceed as usual.
All undefined flags should be set to zero for future use.
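The four flag classes reduce to two bit masks, one for required and one for optional features. The sketch below shows how a boot loader might decide whether it has to refuse an image; the macro and function names are hypothetical, and `supported` stands for whatever set of flag bits a particular loader implements.

```c
#include <assert.h>
#include <stdint.h>

#define MB_REQUIRED_MASK 0x00ff00ffu /* bits 0-7 and 16-23 */
#define MB_OPTIONAL_MASK 0xff00ff00u /* bits 8-15 and 24-31 */

/* The two classes together cover all 32 bits without overlap. */
static_assert((MB_REQUIRED_MASK ^ MB_OPTIONAL_MASK) == 0xffffffffu,
              "flag classes must partition the flags word");

/* Required flag bits that the loader does not support.
 * A non-zero result means the loader must notify the user and fail. */
uint32_t mb_unsupported_required(uint32_t flags, uint32_t supported)
{
    return flags & MB_REQUIRED_MASK & ~supported;
}

/* Optional flag bits that the loader does not support and may ignore. */
uint32_t mb_ignored_optional(uint32_t flags, uint32_t supported)
{
    return flags & MB_OPTIONAL_MASK & ~supported;
}
```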
Bits 0-7 are all undefined.
Bit 8 requests that the address fields, which are defined below, are valid. If this bit is set, a boot loader must use these addresses instead of obtaining information from the executable format to load the OS image. This information may not be provided if the OS image is in ELF format, but it must be provided if the image is in a.out format or in any other format. A boot loader must be able to load images that either are in ELF format or contain the address information embedded in the Multiboot header. It may directly support other executable formats, such as particular a.out variants.
Bits 9-15 are all undefined.
Bits 16-31 may be defined in architecture-specific sections.
Address fields may follow the flags. These fields are valid only if bit 8 in the flags is set. If the bit is clear and a boot loader does not support the OS image directly, the boot loader must fail in loading the OS image.
Each of the address fields is a 32-bit or 64-bit integer, depending on the architecture. If an architecture-specific section does not define an additional method to specify the address size, a boot loader must use the natural address size defined for each architecture.
These address fields are defined:
- header_addr
- The address corresponding to the beginning of the Multiboot header -- the physical memory location at which the magic value is supposed to be loaded. This field serves to synchronize the mapping between OS image offsets and physical memory addresses.
- load_addr
- The physical address of the beginning of the text segment. The offset in the OS image file at which to start loading is defined by the offset at which the header was found, minus (header_addr - load_addr). load_addr must be less than or equal to header_addr.
- load_end_addr
- The physical address of the end of the data segment. (load_end_addr - load_addr) specifies how much data to load. This implies that the text and data segments must be consecutive in the OS image; this is true for existing a.out executable formats. If this field is zero or omitted, the boot loader assumes that the text and data segments occupy the whole OS image file.
- bss_end_addr
- The physical address of the end of the bss segment. The boot loader initializes this area to zero, and reserves the memory it occupies to avoid placing boot modules and other data relevant to the operating system in that area. If this field is zero or omitted, the boot loader assumes that no bss segment is present.
- entry_addr
- The physical address to which the boot loader should jump in order to start running the operating system.
They must appear in this order with no padding. They must be used to load an OS image whose format is not supported by a boot loader. This is historically called "a.out kludge", but can be used for any other format, including a raw binary.
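The arithmetic implied by the address fields can be sketched as follows. This is an illustration only: it assumes a 32-bit address size, the struct and function names are hypothetical, and the consistency checks are additions that the text does not mandate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Address fields in the order given above (32-bit address size assumed). */
struct mb_addr_fields {
    uint32_t header_addr;
    uint32_t load_addr;
    uint32_t load_end_addr;
    uint32_t bss_end_addr;
    uint32_t entry_addr;
};

/* Hypothetical result: what to read from the file and what to zero. */
struct mb_load_plan {
    uint32_t file_offset; /* offset in the image file to start loading */
    uint32_t load_size;   /* bytes to copy to load_addr */
    uint32_t bss_size;    /* bytes to zero right after the loaded data */
};

/* header_offset: file offset at which the Multiboot header was found.
 * file_size:     total size of the OS image file.
 * Returns false if the fields are inconsistent with each other. */
bool mb_plan_load(const struct mb_addr_fields *a, uint32_t header_offset,
                  uint32_t file_size, struct mb_load_plan *out)
{
    assert(a != NULL && out != NULL);

    if (a->load_addr > a->header_addr)
        return false;
    uint32_t delta = a->header_addr - a->load_addr;
    if (delta > header_offset)
        return false; /* loading would start before the file begins */
    uint32_t file_offset = header_offset - delta;
    if (file_offset > file_size)
        return false;

    uint32_t load_size;
    if (a->load_end_addr == 0) {
        /* Zero: text and data occupy the rest of the image file. */
        load_size = file_size - file_offset;
    } else {
        if (a->load_end_addr < a->load_addr)
            return false;
        load_size = a->load_end_addr - a->load_addr;
        if (load_size > file_size - file_offset)
            return false;
    }

    uint32_t data_end = a->load_addr + load_size;
    if (data_end < a->load_addr)
        return false; /* address wrap-around */

    uint32_t bss_size = 0; /* zero bss_end_addr: no bss segment */
    if (a->bss_end_addr != 0) {
        if (a->bss_end_addr < data_end)
            return false;
        bss_size = a->bss_end_addr - data_end;
    }

    out->file_offset = file_offset;
    out->load_size = load_size;
    out->bss_size = bss_size;
    return true;
}
```

For example, a header found at file offset 0x1010 with header_addr 0x100010 and load_addr 0x100000 gives a starting file offset of 0x1000.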
Flags and fields specific to i386-pc
Bit 16 indicates that the video mode table defined in Boot Information Format must be available, and a boot loader must interpret video fields. The video fields follow the flags or the address fields if present. Each field is 32-bit, and they must appear in this order:
- mode_type
- 0 for linear graphics mode or 1 for EGA-standard text mode. Everything else is reserved for future expansion. Note that the boot loader may set a text mode, even if this field contains 0.
- width
- The number of columns. This is specified in pixels in a graphics mode, and in characters in a text mode. The value zero indicates that the OS image has no preference.
- height
- The number of lines. This is specified in pixels in a graphics mode, and in characters in a text mode. The value zero indicates that the OS image has no preference.
- depth
- The number of bits per pixel in a graphics mode, and zero in a text mode. The value zero indicates that the OS image has no preference.
Bits 17-31 are all undefined.
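Where the video fields sit inside the header depends on whether the address fields precede them. The sketch below computes that offset under one reading of the text, namely that the five address fields are physically present exactly when bit 8 is set; the draft does not state this explicitly. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MB_FLAG_ADDR_FIELDS (1u << 8)  /* address fields are valid */
#define MB_FLAG_VIDEO       (1u << 16) /* video fields must be interpreted */

/* The four 32-bit video fields, in the order listed above. */
struct mb_video_fields {
    uint32_t mode_type;
    uint32_t width;
    uint32_t height;
    uint32_t depth;
};

/* Byte offset of the video fields from the start of the Multiboot header,
 * or 0 if bit 16 is clear. addr_bytes is the address size, 4 or 8. */
size_t mb_video_offset(uint32_t flags, size_t addr_bytes)
{
    assert(addr_bytes == 4 || addr_bytes == 8);

    if (!(flags & MB_FLAG_VIDEO))
        return 0;

    size_t off = 8; /* magic + flags */
    if (flags & MB_FLAG_ADDR_FIELDS)
        off += 5 * addr_bytes; /* assumed: present only when bit 8 is set */
    return off;
}
```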
Flags and fields specific to x86_64-pc
Bit 16 has the same definition as i386-pc.
Bit 17 indicates that a boot loader must define the address size as 64-bit. All the address fields must be interpreted as 64-bit integers, and the Multiboot information must pass 64-bit addresses. If this bit is not set, a boot loader must define the address size as 32-bit and 64-bit for ELF32 and ELF64, respectively. If the OS image is not in ELF format and the bit is not set, a boot loader must define the address size as 32-bit.
Bits 18-31 are all undefined.
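The rule for choosing the address size on x86_64-pc can be written as a small decision function. This is a sketch; the enum and function names are hypothetical, and detecting whether an image is ELF32 or ELF64 is left to the caller.

```c
#include <assert.h>
#include <stdint.h>

#define MB_FLAG_ADDR64 (1u << 17) /* bit 17: force 64-bit address size */

/* Hypothetical classification of the OS image by the caller. */
enum mb_image_kind { MB_IMAGE_OTHER, MB_IMAGE_ELF32, MB_IMAGE_ELF64 };

/* Returns the address size in bits that the boot loader must use. */
unsigned mb_x86_64_addr_bits(uint32_t flags, enum mb_image_kind kind)
{
    assert(kind == MB_IMAGE_OTHER || kind == MB_IMAGE_ELF32 ||
           kind == MB_IMAGE_ELF64);

    if (flags & MB_FLAG_ADDR64)
        return 64;
    /* Bit 17 clear: ELF64 implies 64-bit, everything else 32-bit. */
    return kind == MB_IMAGE_ELF64 ? 64 : 32;
}
```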
Machine State
The machine state right after an OS image is booted is architecture-dependent. Generally, a register contains a magic number 0x36d76289 in native format which indicates that the OS image is booted by a Multiboot-compliant boot loader, and another register contains an address pointing to a Multiboot information structure defined below. The other registers and controllers must be in a state from which an OS image can successfully initialize the system in any way it chooses.
This statement needs clarification.
i386-pc
This section should be identical to the current version, except that EAX must contain 0x36d76289 in little endian.
This conflicts with boot semantics of Open Firmware on i386 (where EAX contains client interface callback pointer).
x86_64-pc
This follows the machine state on i386-pc. All additional registers are undefined.
PowerPC
Initial register state:
r3 | 0x36d76289 |
r4 | pointer to Multiboot Information structure |
On systems booting from Open Firmware, the following registers are also defined:
r5 | client interface callback pointer as specified by firmware |
All other registers are undefined.
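On the OS side, the first step after entry is typically to check the magic number before trusting the information pointer. The sketch below assumes that an architecture-specific assembly stub has already moved the two register values into C arguments; the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MB_HEADER_MAGIC 0xe85250d6u /* found in the OS image header */
#define MB_BOOT_MAGIC   0x36d76289u /* passed in a register at entry */

/* The two magic numbers are distinct and must not be confused. */
static_assert(MB_HEADER_MAGIC != MB_BOOT_MAGIC, "magic numbers differ");

/* Returns `info` if the image was started by a Multiboot-compliant boot
 * loader, or NULL otherwise. `magic` and `info` are assumed to come from
 * the registers named in this section (for example r3 and r4 on PowerPC). */
const void *mb_info_if_valid(uint32_t magic, const void *info)
{
    if (magic != MB_BOOT_MAGIC)
        return NULL; /* not booted through this protocol */
    return info;
}
```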
Multiboot Information Format
Upon entry to the operating system, a register, which is defined in Machine State, must contain the physical address of a Multiboot information data structure, through which a boot loader communicates vital information to an OS image. An OS image may use or ignore any parts of the structure as it chooses; all information passed by the boot loader is advisory only.
The Multiboot information structure and its related substructures may be placed anywhere in memory by the boot loader (with the exception of the memory reserved for the OS image and modules, of course). It is the operating system's responsibility to avoid overwriting this memory until it is done using it.
The format of the Multiboot information structure is a list of tags, and each tag follows this structure:
+-----+--------+-------+
| key | length | value |
+-----+--------+-------+
Each field must be 32-bit or 64-bit aligned, depending on the architecture.
- key
- a 32-bit integer which specifies the type of this tag.
- length
- a 32-bit integer which specifies the size of this tag in bytes, including the key and length fields.
- value
- opaque data which depends on the type of this tag.
length may be larger than the size of value; this may occur if a tag is extended in a future Multiboot version. In that case, the data after value must be ignored by the OS.
All integers must be in the natural endianness of the OS image.
Again, "natural endianness" is too vague.
Tags are categorized as common or architecture-specific items. Keys which are less than 0x10000 indicate common tags which are available to all architectures, while keys which are equal to or greater than 0x10000 indicate items specific to one or more architectures.
These sections below describe defined items. An OS image must ignore items when it does not support them. This allows new tags to be added to the specification in the future without breaking older operating systems.
In each sub-section, the structure for a given key is specified. When a value has two numbers delimited by /, the first number is used for 32-bit and the second for 64-bit. Unless explicitly specified, every integer is 32-bit or 64-bit, depending on the architecture.
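An OS image might walk the tag list roughly as sketched below. Several points are assumptions rather than statements of this draft: a 32-bit layout is used, length is taken to cover the key and length fields as in the general tag description, each tag is assumed to start at the next 32-bit boundary, and the caller supplies the total size of the structure. The per-tag length values listed in the following sub-sections do not all appear to follow the same convention, so the stride computation in particular should be treated as provisional. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MB_TAG_END      0xFFFFu  /* key of the End tag */
#define MB_TAG_ARCH_MIN 0x10000u /* keys from here on are arch-specific */

struct mb_tag_header {
    uint32_t key;
    uint32_t length; /* assumed to include this 8-byte header */
};

typedef void (*mb_tag_fn)(uint32_t key, const unsigned char *value,
                          size_t value_len, void *ctx);

/* Visits every tag in buf[0..buf_len) up to the End tag.
 * Returns the number of tags visited, or -1 if the list is malformed
 * or has no End tag. */
int mb_walk_tags(const unsigned char *buf, size_t buf_len,
                 mb_tag_fn fn, void *ctx)
{
    assert(buf != NULL || buf_len == 0);

    size_t off = 0;
    int count = 0;

    while (off <= buf_len && buf_len - off >= sizeof(struct mb_tag_header)) {
        struct mb_tag_header h;
        memcpy(&h, buf + off, sizeof h);

        if (h.key == MB_TAG_END)
            return count; /* stop before looking at the End tag's length */
        if (h.length < sizeof h || h.length > buf_len - off)
            return -1;

        if (fn)
            fn(h.key, buf + off + sizeof h, h.length - sizeof h, ctx);
        count++;

        /* Advance to the next 32-bit aligned tag (assumed alignment). */
        off += ((size_t)h.length + 3u) & ~(size_t)3u;
    }
    return -1;
}

/* Example callback: counts common tags and ignores arch-specific ones. */
void mb_count_common(uint32_t key, const unsigned char *value,
                     size_t value_len, void *ctx)
{
    (void)value;
    (void)value_len;
    if (key < MB_TAG_ARCH_MIN)
        ++*(int *)ctx;
}
```

A callback that does not recognize a key simply returns, which matches the rule that unsupported items are ignored.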
Common Tags
Start
+-----+--------+------+--------+
| key | length | size | number |
+-----+--------+------+--------+
This tag specifies extra information about the Multiboot information structure. This item must be the first tag in the list.
- key
- 0x0001
- length
- 8 / 16
- size
- the byte size of the whole Multiboot information structure, including this tag itself.
- number
- the number of items in the Multiboot information structure, including this tag itself. (To be implemented in GRUB 2)
End
+-----+--------+
| key | length |
+-----+--------+
This tag marks the end of the Multiboot information structure. This tag must be the last tag in the list.
- key
- 0xFFFF
- length
- 0
Boot Loader Name
+-----+--------+------+
| key | length | name |
+-----+--------+------+
This tag specifies the name and version of a boot loader used to boot up an OS image. One tag of this type must be present. This tag should contain a string that enables operating systems to distinguish between different boot loaders and different versions of the same boot loader.
- key
- 0x0002
- length
- the length of name, including the last NUL character.
- name
- a C-style NUL-terminated string indicating the name of a boot loader.
Module
+-----+--------+------+------+---------+
| key | length | addr | size | cmdline |
+-----+--------+------+------+---------+
This tag specifies an OS module. This item may appear zero or more times. The order of modules is not guaranteed but a boot loader should put the items in the same order as the user specifies to the boot loader, if the modules are specified by the user.
- key
- 0x0003
- length
- 12 / 24 + the length of cmdline, including the last NUL character.
- addr
- the starting address of a module.
- size
- the size of a module in bytes.
- cmdline
- a C-style NUL-terminated string specified to a module.
Memory Map
+-----+--------+------+------+------+
| key | length | addr | size | type |
+-----+--------+------+------+------+
This tag specifies a memory region which may or may not be available to an OS image. This item must appear one or more times. The order of memory maps is not guaranteed but a boot loader should sort the items based on the starting addresses.
Tags of this type should be omitted on architectures where the OS is able to retrieve this information from firmware. (Doing so will encourage OS portability across boot loaders, and simplify GRUB development and maintenance.)
- key
- 0x0004
- length
- 20 / 24
- addr
- the starting address of a memory region in 64-bit, regardless of the natural word size of an underlying architecture.
- size
- the size of a memory region in bytes in 64-bit, regardless of the natural word size of an underlying architecture.
- type
- the type of a memory region. 1 indicates that this memory region is freely available to an OS image. If an unknown value is encountered here, the tag must be ignored.
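Once the memory map tags have been collected, an OS image might total the usable memory as sketched below. The struct is an in-memory representation chosen for this example (it is not the on-wire tag layout), and the names are hypothetical. Regions whose type is anything other than 1 are skipped, which also covers unknown types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MB_MEM_AVAILABLE 1u /* region is freely available to the OS image */

/* One memory map entry after parsing; addr and size are always 64-bit. */
struct mb_mem_region {
    uint64_t addr;
    uint64_t size;
    uint32_t type;
};

/* Sums the sizes of all regions marked as available. */
uint64_t mb_total_available(const struct mb_mem_region *regions, size_t n)
{
    assert(regions != NULL || n == 0);

    uint64_t total = 0;
    for (size_t i = 0; i < n; i++) {
        if (regions[i].type == MB_MEM_AVAILABLE)
            total += regions[i].size;
    }
    return total;
}
```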
FIXME: is it better to define more types? if so, what types?
Symbol Table
FIXME: define a symbol table tag
i386-pc Tags
FIXME: define boot device, drive information, config table, apm table, and vbe information tags.
x86_64-pc Tags
FIXME
PowerPC Tags
No additional tags are needed for PowerPC.
Examples
XXX
The change log of this specification
XXX