The Linux Microcode Update Driver

Some of the recent Intel microprocessors have the capability of correcting specific hardware bugs by loading a sequence of bits called a "microcode update" into the CPU. This feature is available on all processors in the Intel P6 family, including Pentium Pro, Celeron, Pentium II, Pentium III, Pentium II Xeon, Pentium III Xeon, and the newly released Pentium 4. This feature is applicable to both single-processor and multi-processor (SMP) systems.

The loading of a microcode update is usually delegated to the BIOS but can also be performed by the operating system without needing to run in a special mode or reboot after the update is done. It is also possible (and quite common) to have the BIOS apply a microcode update to some revision level and later have the OS upgrade it to a newer revision.

Support for microcode update on the P6 family processors was added to the Linux kernel in February 2000, in version 2.3.46. Support for Pentium 4 microcode updates was added to Linux 2.4.0-test12 in December of the same year. The driver (P6 family only) has been backported to Linux 2.2 but has lagged behind the “officially-supported” 2.3/2.4 version for quite a while. A set of fixes (including Pentium 4 support) has been sent by the author to Alan Cox and is likely to go into the official 2.2 series in 2.2.19, which had not been released at the time of this writing. The other Unix-like IA32 operating systems known to the author that support microcode update (on P6 family only) are SCO OpenServer 5.0.6 and SCO UnixWare 7.1.1. The Linux implementation was written from scratch in the author’s spare time and was not based on any Unix or non-Unix version.

This article explains the design and implementation of the Linux microcode update device driver as present in Linux 2.4.0-test12 kernel. The source code of the driver is found in the arch/i386/kernel/microcode.c file and will be referred to throughout this article. As is obvious from the pathname, the driver is architecture-specific, so its source file is kept in the arch/i386 directory.

Hardware Specification

Luckily, all of the specs and documentation for the microcode update feature of the IA32 architecture are available at no cost to developers. The task at hand is to select the microcode update matching the parameters of the CPU and then to write it to the CPU, unless the CPU already contains microcode with a revision level greater than or equal to that of the update. This is a very high-level, simplified picture of what we want to achieve: real systems may have more than one CPU and, although this is rare, these CPUs may differ, in which case the correct microcode must be selected for each processor.

We have left several important questions unanswered above. What are the CPU characteristics that uniquely identify it? How does one query if the CPU already has microcode, and what version it is? How does one go about writing the microcode to the CPU? And finally, what is the physical layout of the microcode data and how should it be validated before use?

The characteristics that uniquely identify an IA32 processor are its family, model, stepping, and processor flags. The first three are available in the cpu_data[] array of struct cpuinfo_x86 structures as fields x86, x86_model, and x86_mask respectively. These structures are initialized at boot time. The value of the processor flags is obtained from a special model-specific register (MSR) called MSR_IA32_PLATFORM_ID in certain processor models; it is assumed to be 0 in others.

To ask the processor what microcode revision it has, we read from another MSR called MSR_IA32_UCODE_REV. To write microcode update to the processor, we write to an MSR called, not surprisingly, MSR_IA32_UCODE_WRITE. The actual numeric values of these symbolic constants are given in the header file include/asm-i386/msr.h, as shown below:

#define MSR_IA32_PLATFORM_ID 0x17
#define MSR_IA32_UCODE_WRITE 0x79
#define MSR_IA32_UCODE_REV   0x8B

The official names for 0x79 and 0x8B are IA32_BIOS_UPDATE_TRIG and IA32_BIOS_SIGN_ID, but these are too long and misleadingly refer to the BIOS, so the author has chosen shorter and more meaningful symbolic names for the Linux implementation.

To read and write an MSR, we use the rdmsr and wrmsr instructions, which under Linux are nicely wrapped into macros in include/asm-i386/msr.h. (See Listing One.)

Listing One: rdmsr and wrmsr Macros

#define rdmsr(msr,val1,val2) \
    __asm__ __volatile__("rdmsr" \
        : "=a" (val1), "=d" (val2) \
        : "c" (msr))
#define wrmsr(msr,val1,val2) \
    __asm__ __volatile__("wrmsr" \
        : /* no outputs */ \
        : "c" (msr), "a" (val1), "d" (val2))

The only unusual thing about the macros in the listing is that rdmsr() modifies the parameters directly, i.e. without using pointer indirection. This may look strange in the code if one forgets that we are dealing with inline assembly code. So, code like this:

int x, y;
rdmsr(0x17, x, y);

sets x and y to the values of the registers %eax and %edx after the rdmsr instruction has been executed.
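The by-name assignment can be tried in user space with a stand-in macro. The sketch below is only an illustration: fake_rdmsr() and combine_halves() are hypothetical names, and the macro splits a caller-supplied 64-bit value instead of executing the privileged rdmsr instruction.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for rdmsr(): it does not touch any MSR. It only
 * shows that a macro can assign to its operands by name, without pointers.
 */
#define fake_rdmsr(value64, lo, hi)                     \
    do {                                                \
        (lo) = (uint32_t)((value64) & 0xffffffffu);     \
        (hi) = (uint32_t)((uint64_t)(value64) >> 32);   \
    } while (0)

/* Recombine the low (%eax) and high (%edx) halves into one 64-bit value. */
static uint64_t combine_halves(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | lo;
}
```

Calling fake_rdmsr(v, lo, hi) with two plain uint32_t variables fills both of them, which mirrors how x and y are filled in the example above.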

The physical layout of microcode data is described by the struct microcode structure declared in the header file include/asm-i386/processor.h:

struct microcode {
    unsigned int hdrver;      /* Update header version */
    unsigned int rev;         /* Microcode revision */
    unsigned int date;        /* Update creation date */
    unsigned int sig;         /* Processor signature */
    unsigned int cksum;       /* Checksum */
    unsigned int ldrver;      /* Loader version */
    unsigned int pf;          /* Processor flags */
    unsigned int reserved[5];
    unsigned int bits[500];   /* Actual microcode */
};

The size of struct microcode is 2048 bytes, of which 2000 bytes are the actual microcode bits, held in the bits field and sent directly to the processor. hdrver is the version number of the update header and is presently set to 1. rev is the microcode update revision. It is compared with the revision of the microcode on the CPU and is used to make sure that the update was successful. The date field is the update creation date in binary format (e.g. 18/07/98 is 0x07181998). The sig field contains the processor family, model, and stepping, which were discussed above. The cksum field is the checksum of the entire struct microcode. The checksum is correct if the sum of all 512 double words of struct microcode is zero. ldrver is the version number of the loader needed to correctly load this update. This microcode driver supports only ldrver version 1. pf stands for the processor flags, and the last five integers are reserved for future expansion.
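The layout and the checksum rule can be checked with a small user-space sketch. It uses fixed-width types in place of unsigned int, and the helper names (microcode_sum, microcode_seal, microcode_date) are hypothetical; they are not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same field order as the article's struct microcode. */
struct microcode {
    uint32_t hdrver;       /* Update header version */
    uint32_t rev;          /* Microcode revision */
    uint32_t date;         /* Update creation date, e.g. 0x07181998 */
    uint32_t sig;          /* Processor signature */
    uint32_t cksum;        /* Checksum */
    uint32_t ldrver;       /* Loader version */
    uint32_t pf;           /* Processor flags */
    uint32_t reserved[5];
    uint32_t bits[500];    /* Actual microcode */
};

/* 12 header words + 500 data words = 512 double words = 2048 bytes. */
_Static_assert(sizeof(struct microcode) == 2048, "unexpected chunk size");

/* Sum all 512 double words; a valid chunk sums to zero modulo 2^32. */
static uint32_t microcode_sum(const struct microcode *m)
{
    uint32_t words[sizeof(*m) / sizeof(uint32_t)];
    uint32_t sum = 0;
    size_t i;

    memcpy(words, m, sizeof(words));
    for (i = 0; i < sizeof(words) / sizeof(words[0]); i++)
        sum += words[i];
    return sum;
}

/* Hypothetical helper: pick cksum so that the whole chunk sums to zero. */
static void microcode_seal(struct microcode *m)
{
    m->cksum = 0;
    m->cksum = 0u - microcode_sum(m);
}

/* Decode the date digits: 0x07181998 gives month 7, day 18, year 1998. */
static void microcode_date(uint32_t date, unsigned *month,
                           unsigned *day, unsigned *year)
{
    *month = ((date >> 28) & 0xf) * 10 + ((date >> 24) & 0xf);
    *day   = ((date >> 20) & 0xf) * 10 + ((date >> 16) & 0xf);
    *year  = ((date >> 12) & 0xf) * 1000 + ((date >> 8) & 0xf) * 100
           + ((date >> 4) & 0xf) * 10 + (date & 0xf);
}
```

A chunk passes validation when microcode_sum() returns 0. Flipping any single bit in the chunk makes the sum non-zero, which is what lets the driver reject corrupted data before it reaches the CPU.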

On an SMP system, we must follow the procedure for updating the microcode for each processor separately, using a different microcode update for “mixed-stepping” SMP systems.

Note that the microcode update is lost on CPU reset, i.e. there is no persistent storage to hold the microcode data on the CPU itself. This should not be confused with the persistent storage maintained by the BIOS, which holds the microcode update to be applied to the CPU on each reboot. So effectively, if anything goes wrong during driver development and we upload the wrong microcode, a simple (cold) reboot will reset the state of all processors to “no update present” (i.e. “update revision is 0”).

With all this in mind we can now proceed to the much more interesting issues of the actual device driver design and implementation.

Design and Implementation of the Microcode Driver

Every time one thinks of writing a device driver for some piece of hardware, one ought to ask oneself if it is possible to do the same in user space. Writing kernel code should be avoided unless absolutely needed, either for performance reasons or because the task at hand requires privileges not available to processes executing in user mode. In the case of a microcode update, we need read and write access to model-specific registers. It would also probably be a good idea to enforce validation of the microcode before sending it to the CPU in kernel mode. So it would seem that we do need a proper kernel space device driver.

Obviously, any device driver needs some user space program to interact with it, even if it is one of the existing programs like dd(1) or cat(1). It is therefore important to decide early on how much work is to be done in the kernel and how much in user space.

For the Linux implementation of the driver, the author decided to perform all the work of selecting the appropriate microcode chunk, checking the revision, validating the checksum, and applying the microcode in kernel space. The only work left for user space is to convert the microcode from the format supplied by Intel to one that is easier to manipulate in the kernel and to control the kernel driver via the ioctl(2) system call.

This design choice is not the only possible one. To give an example, I will mention a non-Linux implementation of the microcode update feature, namely that of SCO UnixWare 7.x. Since the UnixWare kernel allows the running process to bind itself to the current CPU so that it can safely operate on the data structures corresponding to that CPU (knowing that it will never be scheduled to run on a different CPU), it is possible to implement microcode update almost entirely in user space. I say “almost” because the ability to execute privileged instructions, such as reading and writing MSRs, is still restricted to the code running in the kernel. Under Linux, there is no way to bind a user process to a given CPU (the scheduler does try to avoid shifting processes between CPUs, but this is not guaranteed), so one is faced with the choice of either using multiple device minor numbers corresponding to different CPUs or using a single minor number but doing all the work in the kernel. The former approach is used by the cpuid and msr Linux drivers. Note that the Linux msr driver did not exist at the time that the microcode driver was written.


The first thing a driver needs to do is register itself with the kernel by occupying a well-known major number that was assigned to it. Since our driver does not need more than one minor number, it may well be a misc driver, which means that it occupies a major number shared by other misc drivers but is given a private minor number reserved for it. Also, if devfs is available, one could make use of it to create a virtual regular file in its namespace with the name cpu/microcode that would correspond to a full pathname of /dev/cpu/microcode. In this case, the only reason to use devfs is that a regular file has an attribute that device special files do not: the size of the file. We can store useful information in the size of the /dev/cpu/microcode file. For example, if the size of the file is 0 then no microcode has been applied, and if the size is N*2048 bytes then microcode has been applied to all N CPUs on the system.

Listing Two shows the code for the microcode_init() function. The module_init(microcode_init) line, when expanded, turns the microcode_init() function into init_module() if the driver is compiled as a module. Otherwise, it adds the pointer to microcode_init() into the .initcall.init ELF section, which contains initialization functions of various subsystems and is processed by the init/main.c:do_initcalls() function at boot time.

Listing Two: The microcode_init() Function

static devfs_handle_t devfs_handle;

static int __init microcode_init(void)
{
    int error;

    error = misc_register(&microcode_dev);
    if (error)
        printk(KERN_WARNING
            "microcode: can't misc_register on minor=%d\n",
            MICROCODE_MINOR);

    devfs_handle = devfs_register(NULL, "cpu/microcode",
            /* flags, device number, and mode arguments are missing from this copy */
            &microcode_fops, NULL);
    if (devfs_handle == NULL && error) {
        printk(KERN_ERR "microcode: failed to devfs_register()\n");
        goto out;
    }
    error = 0;
    printk(KERN_INFO
        "IA-32 Microcode Update Driver: v%s <tigran@veritas.com>\n",
        MICROCODE_VERSION); /* argument name assumed; missing from this copy */

out:
    return error;
}


As we can see from Listing Two, the microcode_init() function attempts to register a misc device on minor MICROCODE_MINOR (184) and a virtual regular file in the devfs namespace under the name cpu/microcode with credentials root:root and permissions 0600. If both of these methods fail then the initialization is considered to have failed; otherwise it succeeds. This implies that if devfs is available then the microcode driver can be accessed either as a misc device or as a regular file. Since both methods point to the same file_operations structure (microcode_fops), there are no issues with coherency. Or more precisely, the issues are exactly the same as when dealing with access by multiple processes to a normal device driver. To serialize access to the device, the driver uses a mutual exclusion primitive called a kernel read-write semaphore:

/* read()/write()/ioctl() are serialized on this */
static DECLARE_RWSEM(microcode_rwsem);

We do a down_write() on the semaphore in the write() and ioctl() methods and a down_read() in the read() method. This means that multiple processes can simultaneously read the device but only one process at a time can write or issue control requests to it. Also, there can be no readers while a process is writing or issuing ioctl requests.


Deregistration is dealt with by the microcode_exit() function that is invoked on module removal if the driver is compiled as a module and is ignored when the driver is compiled statically. You can see this in the microcode_exit() function below:

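static void __exit microcode_exit(void)
{
    misc_deregister(&microcode_dev);
    devfs_unregister(devfs_handle);
    if (mc_applied)
        kfree(mc_applied);
    printk(KERN_INFO "IA-32 Microcode Update Driver v%s unregistered\n",
        MICROCODE_VERSION); /* argument name assumed; missing from this copy */
}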

Note that it is not a bug in microcode_exit() to call both misc_deregister() and devfs_unregister(), even though only one of them may have succeeded, because their internal implementation deals correctly with this case. The mc_applied variable holds a copy of the microcode chunks that were successfully applied and must be freed on driver unload. It is allocated on demand in the microcode_write() routine when one tries to send the microcode to the CPUs. It is left allocated if the update fails.

Updating Microcode

The procedure of updating microcode on the CPU is considered privileged and must be reserved only for the superuser. There are many ways to deal with security under Linux. One way is to view security as a “policy,” thus leaving it entirely up to the user space. Of course, user space must abide by the constraints of the usual Unix permission semantics. In this way of doing things, the driver does not need to deal with any permission issues but simply relies on the system administrator to correctly set the permissions on the device node file /dev/cpu/microcode. This is adequate because the mknod(2) system call is restricted to the superuser when used for creating character or block special files. The second way is to view security as a matter of ensuring the sanity of kernel data structures. This implies that whatever the permissions are on the device node, the driver must prohibit access to processes it believes are insufficiently privileged for the operation in question. The microcode driver uses the latter approach; that is, its microcode_open() routine checks permissions, as shown in Listing Three.

Listing Three: Checking Permissions

static int microcode_open(struct inode *unused1, struct file *unused2)
{
    return capable(CAP_SYS_RAWIO) ? 0 : -EPERM;
}

The CAP_SYS_RAWIO capability is the same as is required for accessing /dev/kmem or /proc/kcore files and seems a sensible one for restricting access to /dev/cpu/microcode.

The microcode_write() routine performs the following steps:

  1. Tests the length of the write request: if not a multiple of 2048 bytes then fail with EINVAL.
  2. Obtains the microcode_rwsem semaphore in exclusive (WRITE) mode.
  3. Allocates mc_applied, a buffer of N*2048 bytes (using kmalloc() with GFP_KERNEL) to store the applied microcode, where N is the number of CPUs on the system. This allocation request may fail, in which case we release the semaphore and return ENOMEM to the user.
  4. Allocates a kernel buffer (using vmalloc()) large enough to hold the user-supplied sequence of microcode chunks. If this request fails, we return to user space without freeing mc_applied, hoping that it may be needed later. The vmalloc() function is preferred over the kmalloc() function because the buffers may be very large (on the order of 100-200K), and we do not need a physically contiguous area but only a virtually contiguous one; so vmalloc() can suffice.
  5. Copies the entire sequence of microcode chunks from the user into the vmalloc()ed buffer. If the user-supplied buffer is invalid (e.g. NULL) then the routine fails with EFAULT.
  6. Attempts to perform the microcode update by calling do_microcode_update(). On success, sets the file size of the devfs virtual file to reflect the currently held microcode data. Otherwise returns with EIO.
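The steps above can be sketched in user space as follows. This is a simplified model, not the driver's code: malloc() stands in for both kmalloc() and vmalloc(), memcpy() for the copy from user space, and the names sketch_write, fake_do_update, applied, and update_should_fail are hypothetical. Rejecting a zero-length write and returning ENOMEM in step 4 are assumptions as well.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK_SIZE 2048          /* sizeof(struct microcode) in the article */

static unsigned char *applied;   /* plays the role of mc_applied */
static int update_should_fail;   /* lets a caller simulate a failed update */

/* Stand-in for do_microcode_update(); the real one talks to the CPUs. */
static int fake_do_update(const unsigned char *chunks, size_t len)
{
    (void)chunks;
    (void)len;
    return update_should_fail ? -1 : 0;
}

/* Returns len on success or a negative errno value, like a write() method. */
static long sketch_write(const void *user_buf, size_t len, unsigned ncpus)
{
    unsigned char *kbuf;
    int err;

    /* Step 1: only whole chunks are accepted (len == 0 rejected by assumption). */
    if (len == 0 || len % CHUNK_SIZE != 0)
        return -EINVAL;

    /* Step 2, taking the semaphore, is omitted in this single-threaded sketch. */

    /* Step 3: allocate the "applied" buffer on demand. */
    if (!applied) {
        applied = malloc((size_t)ncpus * CHUNK_SIZE);
        if (!applied)
            return -ENOMEM;
    }

    /* Step 4: temporary buffer for the caller's chunks. */
    kbuf = malloc(len);
    if (!kbuf)
        return -ENOMEM;          /* applied stays allocated for a later try */

    /* Step 5: copy the chunks in; a NULL source models a bad user pointer. */
    if (!user_buf) {
        free(kbuf);
        return -EFAULT;
    }
    memcpy(kbuf, user_buf, len);

    /* Step 6: attempt the update and map any failure to EIO. */
    err = fake_do_update(kbuf, len);
    free(kbuf);
    return err ? -EIO : (long)len;
}
```

Note how the sketch keeps the applied buffer allocated on every error path after step 3, matching the behavior described in step 4; in the driver it is released later through the ioctl or on module unload.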

Now we get to the heart of the driver contained in the routines do_microcode_update() and do_update_one(). The former deals with the need to update microcode on each CPU of an SMP system and also collects the return status from the operation on each CPU and turns them into a single integer return code. The do_update_one() function performs the actual update on a single CPU.

To communicate the status of the microcode update operation from each CPU, we use an array of struct update_req structures with one element per CPU. Each element contains two fields, err and slot. If err is 0 then the microcode update succeeded on the corresponding CPU; otherwise it failed. If err is 0 then the slot field contains the index of the chunk of microcode from the sequence supplied by the user.

The do_microcode_update() routine assumes that the entire operation has failed if it failed on at least one CPU. If the update succeeded on all CPUs, then do_microcode_update() copies the corresponding struct microcode chunks into the slots of the mc_applied buffer previously allocated for this.

In order to ensure that do_update_one() is invoked on every CPU, the do_microcode_update() function makes use of a special Linux/SMP kernel facility named smp_call_function(), which uses the Intel interprocessor interrupt mechanism. The arguments to smp_call_function() are the pointer to the function to be called, an argument to be passed to that function (NULL in our case), and two flags saying “keep retrying until ready” and “wait until function has completed on other CPUs,” both of which we set to 1.

The reason that do_microcode_update() also needs to call do_update_one() directly is that smp_call_function() has the semantics of “call this on all CPUs except the current one.”
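The control flow of the last few paragraphs can be modeled in a single-threaded user-space sketch. Everything here is a simulation under stated assumptions: fake_call_on_others() merely loops over CPU numbers where the kernel would send interprocessor interrupts, NCPUS is fixed at 4, and fake_update_one(), fail_cpu, and sketch_update_all() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NCPUS      4             /* assumed CPU count for the sketch */
#define CHUNK_SIZE 2048

struct update_req {
    int err;    /* 0 on success, non-zero on failure */
    int slot;   /* index of the applied chunk, valid only if err == 0 */
};

static struct update_req req[NCPUS];
static int current_cpu;          /* which simulated CPU is "running" */
static int fail_cpu = -1;        /* set to a CPU number to simulate a failure */

/* Stand-in for do_update_one(): CPU n pretends to pick chunk n. */
static void fake_update_one(void *unused)
{
    (void)unused;
    if (current_cpu == fail_cpu)
        return;                  /* leaves err at its non-zero default */
    req[current_cpu].err = 0;
    req[current_cpu].slot = current_cpu;
}

/* Models smp_call_function(): runs fn on every CPU except the caller's. */
static void fake_call_on_others(void (*fn)(void *), void *arg, int self)
{
    int cpu;

    for (cpu = 0; cpu < NCPUS; cpu++) {
        if (cpu == self)
            continue;
        current_cpu = cpu;
        fn(arg);
    }
    current_cpu = self;
}

/* Returns 0 only if every CPU succeeded; then fills applied[] per CPU. */
static int sketch_update_all(const unsigned char *chunks,
                             unsigned char *applied, int self)
{
    int cpu, error = 0;

    for (cpu = 0; cpu < NCPUS; cpu++)
        req[cpu].err = 1;        /* pessimistic default */

    fake_call_on_others(fake_update_one, NULL, self);
    fake_update_one(NULL);       /* the caller's own CPU, called directly */

    for (cpu = 0; cpu < NCPUS; cpu++)
        error += req[cpu].err;
    if (error)
        return -1;               /* one failure fails the whole operation */

    for (cpu = 0; cpu < NCPUS; cpu++)
        memcpy(applied + (size_t)cpu * CHUNK_SIZE,
               chunks + (size_t)req[cpu].slot * CHUNK_SIZE, CHUNK_SIZE);
    return 0;
}
```

Without the direct fake_update_one(NULL) call, the caller's own entry in req[] would keep its failure default and the whole update would always be reported as failed.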

The first task of do_update_one() is to check if the CPU is capable of a microcode update. The CPU is considered incapable if it is not a genuine Intel processor, its family value is less than 6 (386, 486, and Pentium chips fall into this category), or if it is an IA64 processor (Itanium).

do_update_one() calculates the signature of the CPU by combining the family, model, and stepping in a single integer variable sig:

sig = c->x86_mask + (c->x86_model<<4) + (c->x86<<8);

If the CPU model is greater than or equal to 5, or its family number is greater than 6 (indicating a Pentium 4 chip), then the processor flags are obtained from the MSR_IA32_PLATFORM_ID register and, after some conversion, stored in the local variable pf.
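Both values can be computed by small pure functions. The signature packing follows the expression above. The processor-flags conversion is an assumption on my part: it takes the 3-bit platform ID from bits 50 to 52 of the 64-bit MSR (bits 18 to 20 of its high word) and turns it into a one-bit mask. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Pack stepping, model, and family into one signature value. */
static uint32_t cpu_signature(uint32_t family, uint32_t model,
                              uint32_t stepping)
{
    return stepping + (model << 4) + (family << 8);
}

/*
 * Assumed conversion: msr_hi is the high 32 bits of MSR_IA32_PLATFORM_ID.
 * Platform ID n (0..7) becomes a mask with only bit n set.
 */
static uint32_t processor_flags(uint32_t msr_hi)
{
    return 1u << ((msr_hi >> 18) & 7);
}
```

For example, family 6, model 8, stepping 3 gives the signature 0x683, and a platform ID of 0 gives processor flags of 1.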

The function then iterates through the array of struct microcode elements, which were copied from the user, and compares the values of the signature and processor flags with those of the current CPU. If they match, the current microcode revision is obtained from the CPU by reading the MSR_IA32_UCODE_REV register. If the CPU’s current microcode revision is greater than or equal to the revision of struct microcode, the update is refused and a kernel message is printed. Otherwise, a checksum is calculated on the entire struct microcode. If the checksum matches, the microcode->bits[] array is written to the processor via the MSR_IA32_UCODE_WRITE register.
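The selection logic of this paragraph can be sketched as a pure function over an array of chunks. This is a model under assumptions, not the driver's code: chunks are treated as raw arrays of 512 double words, the search stops at the first chunk whose signature and flags match exactly, and pick_chunk(), chunk_sum(), and the PICK_* result codes are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CHUNK_WORDS 512          /* 2048 bytes / 4 */

/* Word offsets of the header fields, in struct microcode order. */
enum { W_HDRVER, W_REV, W_DATE, W_SIG, W_CKSUM, W_LDRVER, W_PF };

/* Hypothetical result codes for "nothing to load". */
enum { PICK_NONE = -1, PICK_NOT_NEWER = -2, PICK_BAD_SUM = -3 };

/* Sum of all double words in one chunk; zero means the checksum is good. */
static uint32_t chunk_sum(const uint32_t *w)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i < CHUNK_WORDS; i++)
        sum += w[i];
    return sum;
}

/*
 * Returns the index of the chunk to write to the CPU, or a negative code.
 * cpu_rev is the revision the CPU reports for its current microcode.
 */
static int pick_chunk(const uint32_t *chunks, size_t nchunks,
                      uint32_t sig, uint32_t pf, uint32_t cpu_rev)
{
    size_t i;

    for (i = 0; i < nchunks; i++) {
        const uint32_t *w = chunks + i * CHUNK_WORDS;

        if (w[W_SIG] != sig || w[W_PF] != pf)
            continue;                 /* meant for a different CPU */
        if (cpu_rev >= w[W_REV])
            return PICK_NOT_NEWER;    /* CPU already has this or newer */
        if (chunk_sum(w) != 0)
            return PICK_BAD_SUM;      /* corrupted data, do not load */
        return (int)i;
    }
    return PICK_NONE;
}
```

The order of the checks mirrors the text: match first, then the revision comparison, and the checksum only for a chunk that would actually be loaded.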

The cpuid instruction, which serializes execution, is issued between the wrmsr and the following rdmsr so that the two cannot be reordered. The last thing do_update_one() does is re-read the current revision from the MSR_IA32_UCODE_REV register and print a message indicating a successful update. The revision number contained in this message (hopefully) coincides with the revision of the microcode we have just written to the CPU. The update_req->err field is then set to 0 and update_req->slot to the index of the microcode in the array of chunks, and the function returns.


We conclude our consideration of the microcode driver by describing its ioctl() method. The driver (at the time of this writing) implements a single ioctl called MICROCODE_IOCFREE, the purpose of which is to free the kernel buffer that holds the last applied microcode (or that was lazily allocated but never filled because the last update failed).

Listing Four shows that mutual exclusion between the ioctl() and write() functions is achieved through the use of the microcode_rwsem semaphore. The typical use of this ioctl would be to free the buffers if the attempt to update microcode has failed and no further attempt is intended. Note also that freeing the buffers here implies resetting the size of the virtual devfs file /dev/cpu/microcode to 0 to prevent processes from attempting to read data that is not there. Also, ENODATA is chosen as the error to return from ioctl() request when no buffer is currently allocated and thus nothing can be freed.

Listing Four: The microcode_ioctl() Function

static int microcode_ioctl(struct inode *inode, struct file *file,
        unsigned int cmd, unsigned long arg)
{
    switch(cmd) {
        case MICROCODE_IOCFREE:
            down_write(&microcode_rwsem);
            if (mc_applied) {
                int bytes = smp_num_cpus * sizeof(struct microcode);

                devfs_set_file_size(devfs_handle, 0);
                kfree(mc_applied);
                mc_applied = NULL;
                printk(KERN_INFO "microcode: freed %d bytes\n", bytes);
                mc_fsize = 0;
                up_write(&microcode_rwsem);
                return 0;
            }
            up_write(&microcode_rwsem);
            return -ENODATA;

        default:
            printk(KERN_ERR "microcode: unknown ioctl cmd=%d\n", cmd);
            return -EINVAL;
    }
    return -EINVAL;
}

There you have it. If you haven’t already done it by now, go get a copy of a 2.4.0-test12 or greater kernel and check out the implementation for yourself. If you are interested in diving deeper into this subject, there are other resources available that can give you more information. We’ve listed some of them in the Resources box. Have fun updating your microcode.


Tigran Aivazian is the author and maintainer of the Linux IA32 microcode update device driver and BFS filesystem driver. He is a random kernel hacker who likes fixing anything he finds broken. He can be reached at tigran@veritas.com.
