Down Memory Lane, Part One

Most embedded devices boot from flash memory and have data resident on flash-based storage. Here’s how to use flash devices while embedding Linux.
Today, Linux has penetrated the embedded market and is no longer just a desktop operating system. Linux avatars manifest in personal digital assistants (PDAs), music players, set-top boxes, cell phones, stereo components, and even medical-grade devices.
When you flick the power switch of such a self-contained device, it’s more than likely that it boots from flash memory. Moreover, when you click some buttons to save data, it’s also highly likely that your data is persisted in flash memory.
The Memory Technology Devices (MTD) subsystem of the kernel interfaces your system with the various flavors of flash memory found in portable devices. In this month’s “Gearheads” column, let’s use the example of a Linux handheld to learn about MTD.

The Linux Memory Technology Devices Subsystem

Flash memory is rewritable storage that doesn’t need a power supply to retain information. Flash memory banks are usually organized into sectors, and unlike conventional storage, a write to a flash address must be preceded by an erase of the location. However, erases can be performed only at the granularity of individual sectors.
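To make the erase-before-write rule concrete, here is a minimal user-space sketch of how a flash sector behaves. It is a toy model, not driver code: the sector size, the sector count, and all function names are made up for illustration, and it assumes the usual behavior of flash parts, where programming can only clear bits and only an erase sets them again.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 16u  /* hypothetical, tiny for illustration */
#define NUM_SECTORS 4u   /* hypothetical */

static uint8_t flash[NUM_SECTORS * SECTOR_SIZE];

/* Erase works on a whole sector: every bit in it returns to 1. */
void flash_erase_sector(unsigned sector)
{
    assert(sector < NUM_SECTORS);
    memset(&flash[sector * SECTOR_SIZE], 0xFF, SECTOR_SIZE);
}

/* Programming can only clear bits (1 -> 0); it never sets them. */
void flash_program_byte(unsigned addr, uint8_t value)
{
    assert(addr < sizeof flash);
    flash[addr] &= value;
}

uint8_t flash_read_byte(unsigned addr)
{
    assert(addr < sizeof flash);
    return flash[addr];
}
```

In this model, programming 0x0F and then 0xF0 into the same byte without an intervening erase leaves 0x00 behind, which is why a rewrite has to erase the whole sector first.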
Given its properties, flash memory is best used with device drivers and filesystems that are specially designed to suit it. The kernel’s MTD subsystem provides support for flash and similar solid-state storage.
The Linux MTD layer, shown in Figure One, consists of the following:
*The MTD core, which is an infrastructure consisting of library routines and data structures used by the rest of the MTD subsystem;
*Map drivers that decide what the processor ought to do when it receives requests for accessing flash;
*Chip drivers that know about the commands required to talk to the flash chip;
*A set of user modules, which interact with user space programs;
*And drivers for certain specific flash devices.
To clarify the internals of Linux MTD and to demystify commonly used flash terminology, let’s look at a sample Linux handheld, as shown in Figure Two.
The device’s flash chip is labeled a NOR device. Flash memory chips generally come in two flavors, NOR and NAND. NOR is the variety usually found on embedded devices, while NAND is the kind normally found inside solid-state mass storage devices, such as USB pen drives. NOR flash chips are connected to the processor via address and data lines like normal RAM; NAND flash is interfaced using a slim sequential interface. Code resident on NOR flash can be executed directly, while that stored on NAND flash must be copied to RAM before execution. NAND flash has the added limitation that it restricts the number of consecutive writes to a flash sector before an erase is necessitated.

Creating the MTD Map

To MTD-enable your device, your first task is to tell MTD where to map your handheld’s flash memory in the processor’s address space. Also, since you cannot use disk partitioning tools like fdisk on flash devices, you must inform MTD about flash partitions that you want to create. These are accomplished by writing a map driver.
The flash layout used in Figure Two has to be translated to an mtd_partition structure. The device in the figure has three flash partitions, one each for the bootloader, the kernel, and the root filesystem. The bootloader partition starts from the top of the flash, the kernel partition begins at offset MY_KERNEL_START, and the root filesystem starts at offset MY_FS_START. (Some devices have additional partitions for bootloader parameters and extra filesystems.) The bootloader and kernel are better off residing on read-only partitions to avoid unexpected damage, while the filesystem partition is usually flagged read-write.
Listing One contains the corresponding mtd_partition definition.
Listing One: A Memory Technology Devices (MTD) partition map corresponding to the device shown in Figure Two

static struct mtd_partition coolpda__partitions[] = {
  {
    .name       = "coolpda_btldr",               /* Bootloader partition */
    .size       = MY_KERNEL_START,               /* Ends where the kernel partition begins */
    .offset     = 0,                             /* Start from the top of the flash */
    .mask_flags = MTD_WRITEABLE                  /* Read-only partition */
  }, {
    .name       = "coolpda_krnl",                /* Kernel partition */
    .size       = MY_FS_START - MY_KERNEL_START, /* Ends where the filesystem begins */
    .offset     = MTDPART_OFS_APPEND,            /* Start immediately following the bootloader partition */
    .mask_flags = MTD_WRITEABLE                  /* Read-only partition */
  }, {
    .name       = "coolpda_fs",                  /* Filesystem partition */
    .size       = MTDPART_SIZ_FULL,              /* Use up the rest of the flash */
    .offset     = MTDPART_OFS_APPEND             /* Start immediately following the kernel partition */
  }
};
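To see how the symbolic "append" offset and "use the rest" size in the partition map turn into concrete numbers, consider the following user-space sketch. It is not the MTD core's code: OFS_APPEND, SIZ_FULL, struct part, and resolve_layout() are stand-ins invented for this example, and their values are assumptions chosen only to make the sketch work.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for MTDPART_OFS_APPEND and MTDPART_SIZ_FULL; the values
 * are assumptions made for this sketch only. */
#define OFS_APPEND UINT32_MAX
#define SIZ_FULL   0u

struct part {
    const char *name;
    uint32_t size;
    uint32_t offset;
};

/* Turn symbolic offsets and sizes into absolute ones.
 * Returns 0 on success, -1 if a partition does not fit. */
int resolve_layout(struct part *p, size_t n, uint32_t flash_size)
{
    uint32_t cur = 0; /* end of the previous partition */

    assert(p != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (p[i].offset == OFS_APPEND)
            p[i].offset = cur;          /* follow the previous partition */
        if (p[i].offset > flash_size)
            return -1;
        if (p[i].size == SIZ_FULL)
            p[i].size = flash_size - p[i].offset; /* rest of the flash */
        if (p[i].size > flash_size - p[i].offset)
            return -1;
        cur = p[i].offset + p[i].size;
    }
    return 0;
}
```

With a hypothetical 16 MB chip, a 256 KB bootloader, and a 1,792 KB kernel, the filesystem partition resolves to offset 0x200000 and size 0xE00000.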

Now that the partition data structure is populated, you can write a basic map driver for the example handheld. First, register your map driver with MTD as shown in Listing Two.
Listing Two: Registering the map driver

static struct device_driver coolpda_map_driver = {
  .name  = "coolpda",
  .probe = coolpda_mtd_probe,
  /* .remove  = ..., destroy entry point */
  /* .suspend = ..., power management related */
  /* .resume  = ..., power management related */
  /* .bus     = ..., bind to platform driver */
};

/* Driver Initialization */
static int __init coolpda_mtd_init(void)
{
  return driver_register(&coolpda_map_driver);
}

The probe routine, coolpda_mtd_probe(), forms the crux of the map driver. First, the routine probes the flash via the chip driver layer, since only the chip driver knows how to query the chip and elicit the command-set required to access it. The chip layer can try different permutations of bus widths and interleaves while querying.
In this example, two 8-bit flash banks are connected in parallel to fill the 16-bit processor bus width, so you have a two-way interleave.
/* Defined in linux/mtd/mtd.h */
struct mtd_info *coolpda_mtd;

/* Defined in linux/mtd/map.h */
struct map_info *coolpda_map;

/* Populate coolpda_map with information
 * like bus width and entry points for
 * performing basic read/write on flash.
 * See drivers/mtd/maps/sa1100-flash.c for
 * an example
 */

/* .. */

/* Probe using the CFI chip driver */
coolpda_mtd = do_map_probe("cfi_probe", coolpda_map);
(Don’t worry if cfi_probe sounds esoteric — it’s discussed shortly.) Next, coolpda_mtd_probe() registers the mtd_partition structure populated earlier with the MTD core:
add_mtd_partitions(coolpda_mtd, coolpda__partitions,
                   3 /* there are 3 partitions */);
Now MTD knows how to access your flash device and knows how it’s organized. When you boot the kernel with your map driver compiled in, user space applications can see your bootloader, kernel, and filesystem partitions as /dev/mtd0, /dev/mtd1, and /dev/mtd2, respectively. So, to update the kernel image on the flash device, you can use dd to transfer the image as usual, provided the target partition is writable (the kernel partition in Listing One is flagged read-only) and has been erased first:
$ dd if=bzImage of=/dev/mtd1 

The Chip Driver

As you must’ve noticed, the flash chip used by the handheld in Figure Two is labeled CFI-compliant. CFI is an acronym for the Common Flash Interface, a specification designed to do away with the need for separate drivers to support chips from different vendors. Software can query CFI-compliant flash chips and automatically detect configurations, timing parameters, and the command-set used for communication.
According to the CFI specification, software must write 0x98 to location 0x55 within flash memory to initiate a query. Look at Listing Three to see how MTD implements CFI query.
Listing Three: A code snippet to query Common Flash Interface-compliant flash

/* Snippet from cfi_probe_chip() defined in
 * drivers/mtd/chips/cfi_probe.c, modified for
 * simplicity
 */

/* Defined in include/linux/mtd/cfi.h */
struct cfi_private *cfi;

/* .. */

/* Ask the device to enter query mode by
 * sending 0x98 to offset 0x55
 */
cfi_send_gen_cmd(0x98, 0x55, base, map,
                 cfi, cfi->device_type, NULL);

/* If the device did not return the ASCII characters
 * 'Q', 'R' and 'Y', the chip is not CFI-compliant
 */
if (!qry_present(map, base, cfi)) {
  return 0;
}

/* Elicit chip parameters and the command-set,
 * and populate the cfi structure
 */
cfi_chip_setup(map, cfi);

/* .. */
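If you want to experiment with the query handshake without hardware, the following toy model may help. It is a sketch under simplifying assumptions: a single 8-bit chip with no interleave, and a made-up struct toy_chip with toy_* helpers that exist only in this example. The query command and address come from the text above; the placement of the 'Q', 'R', 'Y' signature at offsets 0x10 through 0x12 follows the CFI specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CFI_QUERY_CMD  0x98  /* query command */
#define CFI_QUERY_ADDR 0x55  /* address the command is written to */
#define CFI_QRY_OFFSET 0x10  /* 'Q','R','Y' live at 0x10..0x12 */
#define TOY_SIZE       0x20  /* hypothetical size of the toy tables */

/* Hypothetical model of one 8-bit chip, no interleave. */
struct toy_chip {
    int cfi_capable;             /* does the chip understand queries? */
    int query_mode;              /* set once the query command arrives */
    uint8_t array[TOY_SIZE];     /* normal (erased) array contents */
    uint8_t cfi_table[TOY_SIZE]; /* what the chip shows in query mode */
};

void toy_chip_init(struct toy_chip *c, int cfi_capable)
{
    assert(c != NULL);
    for (unsigned i = 0; i < TOY_SIZE; i++) {
        c->array[i] = 0xFF;
        c->cfi_table[i] = 0;
    }
    c->cfi_capable = cfi_capable;
    c->query_mode = 0;
    c->cfi_table[CFI_QRY_OFFSET + 0] = 'Q';
    c->cfi_table[CFI_QRY_OFFSET + 1] = 'R';
    c->cfi_table[CFI_QRY_OFFSET + 2] = 'Y';
}

/* A command write: only the query command is modeled here. */
void toy_chip_write(struct toy_chip *c, unsigned addr, uint8_t val)
{
    if (c->cfi_capable && addr == CFI_QUERY_ADDR && val == CFI_QUERY_CMD)
        c->query_mode = 1;
}

uint8_t toy_chip_read(const struct toy_chip *c, unsigned addr)
{
    if (addr >= TOY_SIZE)
        return 0xFF;
    return c->query_mode ? c->cfi_table[addr] : c->array[addr];
}

/* Same shape as the probe in Listing Three: send the query,
 * then look for the signature. Returns 1 if the chip answers. */
int toy_cfi_probe(struct toy_chip *c)
{
    toy_chip_write(c, CFI_QUERY_ADDR, CFI_QUERY_CMD);
    return toy_chip_read(c, CFI_QRY_OFFSET + 0) == 'Q' &&
           toy_chip_read(c, CFI_QRY_OFFSET + 1) == 'R' &&
           toy_chip_read(c, CFI_QRY_OFFSET + 2) == 'Y';
}
```

A chip initialized with cfi_capable set to 0 ignores the command and keeps returning array data, so the probe reports failure, which is the case Listing Three handles by returning 0.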

Once you’ve added the map driver and chosen the right chip driver, you can let higher layers use the flash memory.

Flash Filesystems on Linux

User space applications that perform file I/O must view the flash device as if it were a disk. The MTD layer that achieves this is called “User Modules” and is shown in Figure One. The components constituting this layer are:
*mtdblock. This is a simple block driver that emulates a hard disk over flash memory. You can put any filesystem, say ext3, over the emulated flash disk, as mtdblock hides complicated flash access procedures (like preceding a write by an erase of the corresponding sector) from higher layers. The device nodes created are named /dev/mtdblock0, /dev/mtdblock1, and so on. Hence, to mount a filesystem present in flash partition 2 onto the /mnt directory, you can use mount /dev/mtdblock2 /mnt.
*The Flash Translation Layer (FTL) and the NAND Flash Translation Layer (NFTL). These layers perform a transformation called wear leveling. Flash memory sectors can withstand only a finite number of erase operations (on the order of 100,000). Wear leveling prolongs flash life by distributing memory usage across the chip. Both FTL and NFTL provide pseudo-devices (like mtdblock), over which you can put normal filesystems. The corresponding device nodes are named /dev/nftla0, /dev/nftla1, and so forth. Certain algorithms used in these modules are patented, so there can be restrictions on usage.
*The Journaling Flash File System (JFFS). JFFS Version 2 (called JFFS2) is currently in use, while JFFS3 is under development. Unlike mtdblock or NFTL, you don’t need to use other filesystems in tandem with JFFS. JFFS was originally written for NOR flash chips, but JFFS2 support for NAND devices is already part of the 2.6 kernel.
*mtdchar. This component presents a linear view of the underlying flash device, rather than the block-oriented view required by filesystems. Device nodes created by mtdchar are named /dev/mtd0, /dev/mtd1, etc. To put an ext2 filesystem on partition 2, use mke2fs /dev/mtd2.
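The idea behind the wear leveling mentioned in the list above can be sketched in a few lines. This is deliberately not the FTL or NFTL algorithm; it is a naive "pick the least-erased free block" policy, and struct wear_map, the block count, and the function names are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_BLOCKS 8u /* hypothetical erase-block count */

struct wear_map {
    uint32_t erase_count[NUM_BLOCKS]; /* erases seen by each block */
    uint8_t  in_use[NUM_BLOCKS];      /* nonzero if the block holds data */
};

/* Pick the free block that has been erased the fewest times.
 * Returns the block index, or -1 if every block is in use. */
int pick_block(const struct wear_map *w)
{
    int best = -1;

    for (size_t i = 0; i < NUM_BLOCKS; i++) {
        if (w->in_use[i])
            continue;
        if (best < 0 || w->erase_count[i] < w->erase_count[best])
            best = (int)i;
    }
    return best;
}

/* Erase the chosen block (counted here) and hand it out for new data. */
int allocate_block(struct wear_map *w)
{
    int b = pick_block(w);

    if (b >= 0) {
        w->erase_count[b]++;
        w->in_use[b] = 1;
    }
    return b;
}

/* Mark a block as free again once its data is obsolete. */
void release_block(struct wear_map *w, int b)
{
    assert(b >= 0 && (unsigned)b < NUM_BLOCKS);
    w->in_use[b] = 0;
}
```

Allocating and releasing a block sixteen times in a row with this policy erases each of the eight blocks exactly twice, instead of erasing one block sixteen times.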
Since JFFS2 is considered to be the best suited filesystem for flash memory, let’s take a closer look at it.
Normal Linux filesystems are designed for desktop systems that are shut down gracefully. JFFS2 is designed for embedded systems where power failure can occur abruptly, and where the storage device can tolerate only a finite number of erases.
With a conventional filesystem on an emulated block device, changing data in a flash sector means saving the sector’s current contents in RAM, erasing the sector, and writing the updated contents back. If the power plug is pulled during the slow erase process, the entire contents of that sector are lost. JFFS2 circumvents this problem by using a log-structured design. New data is appended to the data that is already present on flash. Each JFFS node contains metadata to track disjoint file locations. Memory is periodically reclaimed using garbage collection. Because of this design, new writes do not have to go through a save-erase-write cycle, which improves power-down reliability. The log structure also increases flash life span by spreading out writes.
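The log-structured approach is easier to picture with a small model. The sketch below is not JFFS2’s on-flash format; struct log_node, struct flash_log, the tiny capacity, and the log_* functions are hypothetical names used only to show appending, "newest version wins" lookups, and garbage collection.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LOG_CAPACITY 8u  /* hypothetical, tiny log */
#define DATA_LEN     16u

struct log_node {
    uint32_t ino;     /* which file the node belongs to */
    uint32_t version; /* a higher version supersedes a lower one */
    char data[DATA_LEN];
};

struct flash_log {
    struct log_node nodes[LOG_CAPACITY];
    unsigned used;         /* next free slot; nodes are only appended */
    uint32_t next_version;
};

/* Append a new version of a file; existing nodes are never modified.
 * Returns 0 on success, -1 when the log is full and needs GC. */
int log_write(struct flash_log *log, uint32_t ino, const char *data)
{
    assert(log != NULL && data != NULL);
    if (log->used == LOG_CAPACITY)
        return -1;
    struct log_node *n = &log->nodes[log->used++];
    n->ino = ino;
    n->version = ++log->next_version;
    strncpy(n->data, data, DATA_LEN - 1);
    n->data[DATA_LEN - 1] = '\0';
    return 0;
}

/* Scan the log and return the newest node for ino, or NULL. */
const struct log_node *log_read(const struct flash_log *log, uint32_t ino)
{
    const struct log_node *latest = NULL;

    for (unsigned i = 0; i < log->used; i++) {
        const struct log_node *n = &log->nodes[i];
        if (n->ino == ino && (!latest || n->version > latest->version))
            latest = n;
    }
    return latest;
}

/* Garbage collection: copy the newest node of each file aside, then
 * replace the log with the live nodes. Returns the slots reclaimed. */
unsigned log_gc(struct flash_log *log)
{
    struct log_node live[LOG_CAPACITY];
    unsigned kept = 0;

    for (unsigned i = 0; i < log->used; i++)
        if (log_read(log, log->nodes[i].ino) == &log->nodes[i])
            live[kept++] = log->nodes[i];

    unsigned reclaimed = log->used - kept;
    memcpy(log->nodes, live, kept * sizeof live[0]);
    log->used = kept;
    return reclaimed;
}
```

Because an update only appends a node, an interrupted write leaves the older version of the file intact; the obsolete node is discarded later, when garbage collection runs.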
To create a JFFS2 image of a filesystem, go to the root of the corresponding directory structure, and use mkfs.jffs2:
$ mkfs.jffs2 -e MY_ERASE_SIZE -o ../jffs2.img
Supplying an accurate value for MY_ERASE_SIZE makes JFFS2 more efficient since the garbage collection algorithm is dependent on sector size.
Filesystem images are usually created on a host machine where you do cross-development and are then transferred to the appropriate flash partition on the target device via a suitable download mechanism (such as a serial port, USB, or Ethernet).

Configuring the Kernel for MTD

To enable MTD in your kernel, choose the appropriate configuration options. For the flash chip in Figure Two, you should set the following options:
CONFIG_MTD_COOLPDA_MAP is the option you added to enable the map driver you wrote earlier.
You can reduce kernel footprint by eliminating redundant probing. Since the example device has two parallel 8-bit banks sitting on a 16-bit physical bus (thus resulting in a 2-way interleave), you can optimize using additional options:
CONFIG_MTD_CFI_B2 enables a bus width of 2, while CONFIG_MTD_CFI_L2 sets an interleave of 2.
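To see why the bus width and interleave matter to the chip layer, consider what has to happen to a command on the example’s two-way interleaved bus: each chip must see the command byte on its own data lanes, and chip-relative offsets must be stretched to bus offsets. The sketch below illustrates that idea only; build_bus_cmd() and build_bus_addr() are hypothetical helpers, not kernel functions, and they assume 8-bit chips on a bus of at most 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Replicate an 8-bit command once per interleaved chip so that
 * every chip on the bus sees it on its own data lanes.
 * Assumes 8-bit chips and an interleave of 1 to 4. */
uint32_t build_bus_cmd(uint8_t cmd, unsigned interleave)
{
    uint32_t word = 0;

    assert(interleave >= 1 && interleave <= 4);
    for (unsigned i = 0; i < interleave; i++)
        word |= (uint32_t)cmd << (8 * i);
    return word;
}

/* Scale a chip-relative offset to a byte offset on the bus: with
 * 8-bit chips, each chip address step spans `interleave` bus bytes. */
uint32_t build_bus_addr(uint32_t chip_offset, unsigned interleave)
{
    assert(interleave >= 1 && interleave <= 4);
    return chip_offset * interleave;
}
```

Under these assumptions, the CFI query command 0x98 becomes the 16-bit word 0x9898 on the example device, and chip offset 0x55 becomes bus offset 0xAA. Fixing the geometry at configuration time spares the kernel from trying other combinations while probing.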
You’ll want the filesystem partition in Figure Two to be mounted as the root device during boot up. For that, modify your bootloader to append root=/dev/mtdblock2 to the command line string that it passes to the kernel.

Execute In Place

On legacy embedded devices, the bootloader copies the kernel image from flash to RAM prior to booting it. With Execute In Place (XIP), you can run the kernel directly from flash. Since you do away with the extra step of copying the kernel to RAM, your kernel boots faster. The flip side is that your flash memory requirement increases since the kernel has to be stored uncompressed. Before deciding to go the XIP route, also be aware that the slower instruction fetch times from flash will have an impact on runtime performance.

Looking at the Sources

In the kernel tree, the drivers/mtd/ directory contains sources for the MTD layer. Map and chip drivers live in the drivers/mtd/maps/ and drivers/mtd/chips/ subdirectories, respectively. Most MTD data structures are defined in headers present in include/linux/mtd/.
The Linux MTD project page at http://www.linux-mtd.infradead.org/ has FAQs, various pieces of documentation, and information for subscribing to related mailing lists. The web site also contains a Linux MTD, JFFS HOWTO that provides insights into JFFS2 design.
To get the hang of porting bootloaders to boot Linux from flash-based devices, browse through the sources of embedded bootloaders like u-boot (http://sourceforge.net/projects/u-boot), BLOB (http://www.lart.tudelft.nl/lartware/blob), and redboot (http://sources.redhat.com/redboot/).

Sreekrishnan Venkateswaran has been working for IBM India for about ten years. His recent projects include porting Linux to a pacemaker programmer and writing firmware for a lung biopsy machine. You can reach Krishnan at krishhna@gmail.com.
