For the sake of our discussion, we assume that all memory is random access, but not all memory is writable by the CPU. SRAM needs no support circuitry: simply wire it up to the processor and use it. DRAM, on the other hand, requires external hardware to refresh it periodically so that the internal capacitors hold their charge. DRAM technology is much cheaper, but it is also slower and requires additional hardware to keep it running (the DRAM controller mentioned earlier).
SRAM is simple but has a higher cost per byte of storage. It is typically used in systems that require small amounts of memory or in systems that need a small amount of fast writable memory like a cache. The big advantage of flash memory over EPROM is that it is in-system programmable, which means that no separate device is needed to modify its contents.
The architecture of flash memory comes in a few varieties, and although modern flash memory is in-system programmable, it is still not as convenient as using RAM. Writing to the individual bytes changes only some of the bits within each byte to a zero state. Each operation (erase, write, and so on) except read is performed with a special programming algorithm. This algorithm is unique enough that it does not interfere with the typical interaction between the CPU and the memory.
Flash memory is quickly becoming the standard nonvolatile memory choice for new designs. One weakness is that flash supports only a limited number of erase/write cycles. Usually the limit is high (100,000 or 1,000,000 cycles), but, nevertheless, it must be considered in the design. Still Others There are several other types of memory, most of which are some derivative of one or more of the above types. These other standards are not as popular, but they typically satisfy some niche in the market.
Some devices actually have the battery built into the plastic; others are nonvolatile simply because the hardware design has battery backup protecting the device; still others provide some type of automatic backup of RAM to on-chip flash when power is removed.
Access is slow, but physical size is extremely small because there is no address or data bus. The functions covered here are, in order of importance: a power monitor for reset pulse generation, a watchdog timer, a power monitor for SRAM nonvolatility, and a time-of-day clock.
Typical requirements on a reset input line of a processor are that it be held in a constant active (usually low) state for some minimum duration. During a momentary power interruption, the supply level could dip enough to cause the CPU to go insane, but not enough to cause the RC circuit to pull the RESET line low enough to bring the CPU out of its insane state.
While the RC reset mechanism works when the power is cycled cleanly, it can cause problems when power is momentarily interrupted. In some respects, the RC combination is an analog solution for a digital problem. Fortunately, there are components out there that solve it digitally! There are several different components available that monitor the supply voltage and automatically generate a clean reset pulse into the CPU when the supply drops below a certain level. Watchdog Timer If the software stops responding or attending to the task at hand, the watchdog timer detects that something is amiss and resets the software automatically.
The system might stop responding as a result of any number of difficult-to-detect hardware or firmware defects. A glitch might, for example, cause the CPU to jump into the middle of some function; when that function completes, it returns to the wrong spot, leaving the system utterly confused. Runaway pointers (firmware) or a glitch on the data bus (hardware) can cause similar crashes.
The watchdog timer is a great protector. The typical watchdog (see Figure 1.) is a simple re-triggerable timer: if it is not toggled within a specified period, it pulses one of its output pins. Consequently, if the firmware does not keep the watchdog input line toggling at the specified rate, the watchdog assumes that the firmware has stopped working, complains, and causes the CPU to be restarted. When the application is operating normally, it periodically resets the WDT by toggling its input.
Battery-Backed SRAM Not all systems need to maintain the integrity of their SRAM when power is turned off, but this requirement is becoming more and more common because the components that provide that capability are getting cheaper and easier to use. Not too long ago, embedded systems used an arrangement of discrete components to determine which voltage was higher (power supply or battery) and properly steer the higher supply to the power pin of the SRAM or the whole system.
Now a handful of companies provide nonvolatile SRAM modules that have the battery and the power supply monitoring circuitry built right into the part. These parts are guaranteed to retain data for up to 10 years (with certain restrictions regarding the actual amount of time the internal battery is powering the SRAM, of course). Modules with built-in batteries are often available in versions that are pin-compatible with standard SRAM chips.
Time-of-Day Clock If you need time of day in your system, then you need a battery and a time-of-day chip. An exception to this case is if the embedded system knows that it has an external device from which it can get the current time after being reset. Serial Port Drivers Many embedded systems use serial ports as an interface to the outside world.
The serial device on an embedded system has two portions: the protocol and the physical interface. The protocol portion takes care of start and stop bits, bits per character, the width of each bit (based on a configured baud rate), and the conversion of the serial bit stream into a parallel byte stream easily digested by the CPU. The physical interface takes care of converting voltage levels on the CPU to the voltage levels needed by the interface.
Embedded systems use two fundamentally different transmission mechanisms for their serial ports: single wire and differential drive transmission. Single Wire Data Transfer Single wire systems dedicate one wire to each direction of data transfer and use a common ground reference.
While simpler than differential drive transmission, single wire transfer has limitations with regard to transmission speed and the length of the wire from sender to receiver. The most common single wire standard is RS-232, which is by far the most common serial communications standard in the industry.
Differential Drive Interface Differential drive is not as widely used. It requires a few more wires, but it increases the drive length substantially and supports transmission speeds in excess of 1 Mbps. A differential drive interface can drive faster signals through greater distances because of its inherent noise immunity.
These wires are wrapped around each other, which minimizes interference. The receiver determines the state of the signal based on the voltage difference between the two wires. The most common embedded interfaces that use this technique are RS-422 and RS-485. RS-422 is a differential drive replacement for RS-232. With no other changes in firmware, an RS-232 interface could be replaced with RS-422, and its maximum line drive and line speed would be increased.
RS-485 adds the ability to have more than a single transmitter and receiver on the connection. RS-485 is commonly used on factory floor networks because of its noise immunity and its ability to connect multiple devices.
Thus, all induced transmission-line noise is cancelled at the receiver. Ethernet Like the serial port, the Ethernet interface is partitioned into two layers: protocol and physical.
The physical layer consists of two blocks: a PHY and a transformer. It is becoming more common to see the PHY and Ethernet controller integrated into one device, but the transformer is still separate; hence, the Ethernet interface can consist of two or three distinct devices.
The Ethernet controller is the portion of the interface that does the packet-level work. For outgoing packets, the Ethernet controller calculates the CRC, transfers data from memory to the PHY, adds padding to small packets, and interrupts the CPU to indicate that the packet has been sent. The PHY takes care of the lowest level of the interface protocol; it is responsible for parameters, like bit rate, that are specific to the physical environment. The transformer provides isolation and electrical conversion of the signals passed over the cable.
Flash Device Options All flash devices are structured as a number of sectors. On some devices, the sectors are all the same size; on others the sector sizes vary. Some have features that allow the firmware to lock a sector so that it cannot be erased unless a physical connection is inserted onto the hardware.
Some devices have reset input lines, and others do not. Densities vary from 64KB to 8MB in a single flash device. Flash Locking Facilities Erasing or writing flash memory involves a special, nontrivial algorithm. Thus, it is relatively safe to assume that this algorithm will not be executed accidentally.
However, it is still nice to have the option to protect certain sectors from misbehaving code. Many of the available flash devices have the ability to protect one or more sectors from write operations. Some of these devices allow you to protect a specified sector or group of sectors by placing the device in an external programming device and applying a high voltage to one of the pins.
Others have a more flexible configuration that uses an external write protect pin and a lock sequence. In this latter type of device, a sector is write-protected or locked by executing a specific command sequence.
Once locked, a sector can only be modified after first being unlocked. This process makes it even more difficult to corrupt a sector accidentally. Locking can be used to assure that some very basic boot code is always available for the CPU regardless of what happens to the programming of the other sectors. If this safeguard is still not enough, an alternate technique can prevent a sector from being modified until a power cycle occurs.
This method is typically accomplished by enabling the write-protect pin and then initiating the lock sequence to a particular device sector. Because there is no unlock sequence that works while the write-protect pin is enabled, and the write-protect pin cannot change state until the next hard reset, the sector is protected from all erroneous writes except those that happen immediately upon boot. Bottom-Boot and Top-Boot Flash Devices Some devices are organized as a few small sectors at the bottom of the address space, followed by large sectors that fill the remaining space.
Since boot code is typically placed in small sectors, flash memory is sometimes described as bottom-boot or top-boot, depending on where the smaller sectors are located. Ultimately, one sector contains the memory space that the CPU accesses as a result of a power cycle or reset (boot).
This sector is usually referred to as the boot sector. Because some CPUs have a reset vector that is at the top of memory space and others have a reset vector at the bottom of memory space, the flash devices come in bottom-boot and top-boot flavors. A processor that boots to the top of memory space would probably use a top-boot device, and a processor that boots to the bottom of its memory space would be more suited to a bottom-boot device.
When the boot sector of the flash device exists in the boot-time address space of the CPU, it can be protected by making the boot sectors unmodifiable. Since only a small amount of code is typically needed to provide a basic boot, there is little wasted space if you dedicate a small sector to an unmodifiable boot routine.
This makes it possible to keep as much of the flash space as possible in-system reprogrammable, while retaining the security of knowing that if all of the reprogrammable flash were accidentally corrupted, the system would still be able to boot through the small amount of code in the unmodifiable boot sector. In most systems, peripherals share the data and address buses with memory. Thus, understanding the protocol for these buses is important to understanding much of the hardware.
This section explains, in general terms, how the CPU uses the address and data buses to communicate with other parts of the system. To make the discussion more concrete, I describe the operation in terms of the hypothetical machine detailed in the simplified schematics in Figure 1. The result is only slightly more detailed than the typical functional block diagram, but it is also representative of the portion of a real schematic that you would need to understand to work with most embedded processors.
If you can identify the control, data, and address lines in your system, you probably know all you need to know about how to read a schematic.
In this schematic, the signals have been grouped to show how they relate to the various system buses. Notice that nearly all of the CPU pins are dedicated to creating these buses. There are also two other blocks of components on this page: the clock and the reset circuit.
The clock can be a crystal, or it can be a complete clock circuit, depending on the needs of the CPU. The memory devices connect directly to the system buses. Because each device is only 32K, each uses only 15 address lines. Notice how each device is activated by a separate chip select.
The CPU in this design uses 16-bit addresses but transfers data eight bits at a time. Thus, it has a 16-bit address bus and an 8-bit data bus. Using these 16 bits, the processor can address a 64K memory space. In this simple design, the majority of the CPU pins are committed to creating the address and data buses. Since each memory component houses only 32K of address space, the memory chips have only 15 address lines.
In this design, the low-order 15 address bits are directly connected to these 15 lines on the memory components. If the CPU did not provide conveniently decoded chip select lines, we could have used the high-order bit of the address bus and some additional logic (called address decode logic) to activate the appropriate memory device.
Whenever the CPU wants to read or write a particular byte of memory, it places the address of that byte on the address lines. If the address is 0x0000, the CPU would drive all address lines to a low-voltage, logic 0 state. When a device is not selected, it is in a high-impedance (electrically disconnected) state. Two more control lines on the CPU, read and write, control how a selected device connects to the data bus. Each hexadecimal digit represents four bits; thus, 0xBE represents the eight bits 10111110. Note that the number of hexadecimal digits implies the size of the bus.
The four hexadecimal digits in 0x26A4, on the other hand, suggest that the address bus is 16 bits wide. Because of this implicit relationship between the hex representation and the bus size, it is accepted convention to pad addresses and data values with zeros so that all bits in the bus are specified. For example, when referencing address 1 in a machine with 16-bit addresses, one would write 0x0001, not 0x1.
If you compare this diagram to the schematic, you can see that the only additional connections on the flash device are for power and ground. The CPU uses the read and write signals to control the output drivers on the various memory and peripheral devices, and thus controls the direction of the data bus.
The CPU-to-flash device interaction can be summarized with the following steps:
1. CPU places the desired address on the address pins.
2. CPU brings the read line active.
3. CPU brings the appropriate chip select line active.
4. The flash device places the data located at the specified address onto the data bus.
5. CPU reads in the data that has been placed on the bus by the flash device.
6. CPU releases the chip select line and processes the data.
This sequence of steps allows the CPU to retrieve the bytes from memory that ultimately become instructions. The SRAM interface is identical except that a different chip select line is activated. The different chip select lines are configured so that each line is active for a 32K address space.
A write access is essentially the same thing, except that now the write line is used and the data flows from CPU to memory, not memory to CPU. You have a certain number of address bits (dependent on the actual size of the device), 8 data bits (16 or 32 if you were using a different device), and a few control lines (read, write, and chip select). All that remains in the schematic is the serial port. In other words, you now understand the fundamentals of a simple microprocessor-based hardware design!
The Power and Pitfalls of Cache For standard programming environments, cache is a blessing. It provides a real speed increase for code that is written to use it properly (refer to Figure 1.1).
Caching takes advantage of a phenomenon known as locality of reference: over short periods of time, most programs concentrate their memory accesses within a relatively small region of memory. The ability to pull that small block of memory into a faster memory area is a very effective way to speed up what would otherwise be a relatively slow rate of memory access.
Cache is a fast chunk of memory placed between the memory access portion of the CPU and the real memory system in order to improve the effective access time between the CPU and external memory. There are different levels of cache; the fastest, usually called level 1 cache, is located on the CPU chip itself.
There are several different types of cache implementations. Discussion of these various implementations is beyond the scope of this text.
Most CPUs maintain separate caches for instructions (the I-cache) and for data (the D-cache). As the names imply, the two different caches are used for the two different types of CPU memory access: accesses to instruction space and accesses to data space, respectively.
Caches are divided into these two major types because of the difference in the way the CPU accesses these two areas of memory: the CPU both reads and writes data space, but it only fetches (reads) from instruction space. The main limitation that cache puts on typical high-level system programmers is that it can be dangerous to modify instruction space (self-modifying code). This is because any modification to memory space is done through a data access, so the D-cache is involved in the transaction; hence, the instruction destined for physical memory can reside in the D-cache for some undetermined amount of time after the code that actually made the access has completed.
Cache increases performance by allowing the CPU to fetch frequently used values from fast, internal cache instead of from slower external memory.
However, because the cache control mechanism makes different assumptions about how data and instruction spaces are manipulated, self-modifying code can create problems. Cache can also create problems if it masks changes in peripheral registers. Step B shows the transfer of the contents of that D-cache location to physical memory. If the sequence of events were guaranteed to be A-B-C-D, then everything would work fine.
However, this sequence cannot be guaranteed, because that would eliminate the efficiency gained by using the cache. The whole point behind cache is to attempt to eliminate the B and C steps whenever possible.
The ultimate result is that the instruction fetch may be corrupted as a result of skipping step B, step C, or both. For embedded systems, the problem just gets worse. Understanding the above problem makes the secondary problems fairly clear; notice the additional data paths in Figure 1. Additional complexities become apparent. In many systems, DMA and cache are independent of each other. The data cache is likely to be unaware of memory changes due to DMA transfers, which means that if the data cache sits between the CPU and this memory space, more inconsistencies must be dealt with.
Most of the prior problems are solved through good hardware and firmware design. The initial issue of I-cache and D-cache inconsistency can be resolved by invoking a flush of the data cache and an invalidation of the instruction cache. A flush forces the content of the data cache out to the real memory.
An invalidation empties the instruction cache so that future CPU requests retrieve a fresh copy of the corresponding memory. Also, there are some characteristics of cache that can help resolve these problems. For example, a write-through data cache ensures that data written by the CPU is placed in cache but also passes through the cache to physical memory immediately; hence, it really only provides a speed improvement for data reads.
Also, a facility in some CPUs called bus snooping helps with the memory inconsistency issues related to DMA and cache. Bus snooping hardware detects when DMA is used to access memory space that is cached and automatically invalidates the corresponding cache locations.
It is very common for a CPU to restrict caching to certain specific cacheable regions of memory, rather than designating its entire memory space cacheable. The bottom line is that the firmware developer must be aware of these hardware and firmware capabilities and limitations in order to deal with these complexities efficiently.
Summary While embedded systems come in a fascinating array of variations, at the lowest hardware levels they usually have many general similarities.
Memory systems interface with the CPU via address and data buses.