Using Nucleus SE

Colin Walls, in Embedded RTOS Design, 2021

Using a debugger

Debugging tools that are designed specifically for embedded applications have been with us for more than 30 years now and have, hence, become very sophisticated. The key characteristic of an embedded application, as compared with a desktop program, is that every embedded system is different (but one PC looks very much like every other). The trick with a good embedded debugger is for it to be flexible and customizable enough to accommodate such variability in requirements from one user to another. The customizability of a debugger is manifest in various forms, but there is generally some scripting capability. It is this facility in particular which may be exploited to make a debugger perform well with a kernel-based application. I will review some of the possibilities here.

It is important to note that a debugger is typically a family of tools, not just a single program. A debugger may have different modes of operation whereby it can assist with development of code on a simulated target or with real target hardware.

Read full chapter

URL:

https://www.sciencedirect.com/science/article/pii/B9780128228517000205

MATLAB® Debugging, Profiling, and Code Indentation

Munther Gdeisat, Francis Lilley, in Matlab by Example, 2013

8.3.2 The Conditional Breakpoint Debugging Tool

We can use the MATLAB debugging tool Set/Modify Conditional Breakpoint to check the execution of the program when n=0. Using the mouse, go to line 15 and highlight n. Go to the Menu→Debug→Set/Modify Conditional Breakpoint… as shown in the following figure.

The window shown pops up. To set the condition n=0, type n==0 in the text box.

Using the keyboard, press F5 to run the code. MATLAB starts executing the script M-file and stops at line 15 when n is equal to 0.

Check the value of n. Is the value of n=0 as expected?

Note here that the color of a Conditional Breakpoint is yellow.

Press F10. MATLAB then jumps from line 15 to line 18. Lines 16 and 17 have not been executed, which is correct and as expected. This is because of the presence of the if statement, so the code is only executed if n is not equal (~=) to 0. The content of the summation variable has therefore not changed when n=0. So it can be seen that the program works fine for the case when n=0.
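The excerpt does not reproduce the script being debugged, so the following is only a minimal sketch in C of the same idea: a loop whose body is guarded against n equal to 0, plus a hand-written check that plays the role of the conditional breakpoint n==0. The names guarded_sum, conditional_break, and breakpoint_hits are hypothetical, and the assumption that the guarded statement accumulates 1/n is ours, not the book's.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counter: how many times the "breakpoint" condition held. */
static int breakpoint_hits = 0;

/* Stand-in for the debugger stopping; a real debugger would halt here. */
static void conditional_break(int condition)
{
    if (condition) {
        breakpoint_hits++;
    }
}

/* Sums 1/n over the samples, skipping n == 0 (assumed loop body). */
double guarded_sum(const int *samples, size_t count)
{
    assert(samples != NULL || count == 0);
    double summation = 0.0;
    for (size_t i = 0; i < count; i++) {
        int n = samples[i];
        conditional_break(n == 0);   /* fires only when n == 0 */
        if (n != 0) {
            summation += 1.0 / n;    /* skipped when n == 0 */
        }
    }
    return summation;
}
```

A debugger's conditional breakpoint evaluates the same kind of expression each time the line is reached, without requiring any change to the source.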


URL:

https://www.sciencedirect.com/science/article/pii/B9780124052123000086

A Hardware-Software Co-simulator for Embedded System Design and Debugging

A. Ghosh, ... O. Yamamoto, in Readings in Hardware/Software Co-Design, 2002

2 PREVIOUS WORK

In [7], a debugging tool for embedded system software is presented. The software is cross-compiled for the embedded processor and then executed on a model of the system. The system is modeled completely in hardware and simulated using a hardware simulator. During simulation, which may take several days, all interaction between the processor model and the surrounding hardware is logged. After simulation, the designer switches to a software debugging environment on a host workstation where the code is compiled for the host and re-linked to pseudo hardware drivers that interact with the logged information. The primary advantage of this approach is that during debugging, software can run at the host computer speed. However, when a bug is fixed, the entire system may have to be re-simulated, thereby increasing the debugging time. Further, during debugging, there is no way of interactively affecting system behavior by feeding the system a different set of inputs. In our opinion, such a debugger has limited usefulness.

An interesting approach presented in [1] is based on distributed communicating processes modeling hardware and software. Software is run on a host workstation and all interactions with hardware are replaced by remote procedure calls to a hardware simulator process. The main drawback of this approach is that there is no notion of timing accuracy as neither the software execution speed nor the interface between hardware and software are accurately modeled.

The Poseidon co-simulator is described in [4]. An event driven simulator is used to co-ordinate the execution of a hardware and a software simulator. The processor simulator is tied closely to the DLX microprocessor [4] model. There is no special handling of standard peripherals and little information regarding the debugging environment, simulation speed and accuracy is available.

In [6] the use of Ptolemy [2] in hardware-software co-design for a digital signal processing (DSP) application is described. The emphasis in [6] is on the use of the capabilities of Ptolemy for heterogeneous simulation and code synthesis for single and multiple processors. After code generation and hardware synthesis, co-simulation is performed using the hardware simulator Thor [13] and a simulator for the digital signal processor DSP56000. It is our belief that, though what is described here in terms of the backplane and what is provided by Ptolemy may be similar in principle, Ptolemy does not address the efficiency issues related to hardware-software co-simulation, especially the simulation of processors and peripherals. From [6], few details are available regarding speed of simulation, accuracy, the way standard peripherals are handled, and the debugging environment.

The use of virtual instruments was introduced in [3] in the context of simulation of hardware systems. Currently, the tool described in [3] does not have any capabilities for hardware-software co-simulation. Use of a simulation backplane in mixed mode simulation is described in [10] and similar backplanes for the integration of hardware simulators are commercially available.


URL:

https://www.sciencedirect.com/science/article/pii/B9781558607026500521

Debugging

Mark Siegesmund, in Embedded C Programming, 2014

Real-Time Issues

In the world of computing, debugging tools like breakpoints and single stepping are basic and commonly used debugger features. For some embedded programs, however, they cannot be used. Here are some simple examples to demonstrate the issues:

Controller for window blinds. If you hit a breakpoint while closing, the motor will just keep running and there is no program running to stop it.

TV remote receiver. Hit a breakpoint and it will stop the code but that will not stop or slow down the transmitter. You can examine the data from the first break but there will be no way to continue.

HVAC motor speed control. The program may need to respond to many interrupts every second just to keep the motor operating correctly. One break and the motor breaks.

You will find many more examples similar to the above. Virtually every program that is in active communication with other devices will have this problem. It is not uncommon for multi-processor systems to use one processor to shut down the whole system if it appears another processor has stopped responding.

This is not to say it is impossible to set a breakpoint. You can modify the devices the program talks to and use hardware simulators instead of real hardware for dangerous interfaces. There will be a moderate amount of work for some capability, but you should be aware that many of the problems you need to debug will only happen with real hardware and even then probably infrequently.

In addition to data streaming, the following few sections describe other debugging techniques that can help in situations where you can't use breakpoints.
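One common way to observe a program that must not be halted is to record events in memory and inspect them afterwards. The sketch below is an illustration under our own assumptions, not code from the book: the names trace_log, trace_recent, and TRACE_CAPACITY are hypothetical, and the buffer is not protected against concurrent use from interrupt handlers.

```c
#include <assert.h>
#include <stdint.h>

#define TRACE_CAPACITY 8u   /* hypothetical size; must be a power of two */

typedef struct {
    uint16_t event_id;   /* which point in the code was reached */
    uint32_t value;      /* one word of context, e.g. a sensor reading */
} trace_entry_t;

static trace_entry_t trace_buf[TRACE_CAPACITY];
static uint32_t trace_count = 0;   /* total events ever recorded */

/* Record an event without stopping the program; the oldest entries
 * are overwritten once the buffer is full. */
void trace_log(uint16_t event_id, uint32_t value)
{
    trace_entry_t *slot = &trace_buf[trace_count & (TRACE_CAPACITY - 1u)];
    slot->event_id = event_id;
    slot->value = value;
    trace_count++;
}

/* Fetch the i-th most recent event (0 = newest).
 * Returns 1 on success, 0 if that event is no longer (or not yet) held. */
int trace_recent(uint32_t i, trace_entry_t *out)
{
    assert(out != 0);
    uint32_t held = trace_count < TRACE_CAPACITY ? trace_count : TRACE_CAPACITY;
    if (i >= held) {
        return 0;
    }
    *out = trace_buf[(trace_count - 1u - i) & (TRACE_CAPACITY - 1u)];
    return 1;
}
```

Each call costs a few memory writes, so the motor or receiver keeps running; the buffer can be read out later, for example when the system is idle or after a fault.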


URL:

https://www.sciencedirect.com/science/article/pii/B9780128013144000259

The Nucleo-F411RE Development Board

Dogan Ibrahim, in ARM-Based microcontroller projects using MBED, 2019

5.2.6 The ST-LINK/V2-1

The ST-LINK/V2-1 programming and debugging tool is integrated in the Nucleo boards and it makes the boards mbed enabled. The ST-LINK/V2-1 supports only SWD for the STM32 devices. It does not support the single wire interface module (SWIM) interface, and the minimum supported application voltage is limited to 3 V. The ST-LINK/V2-1 supports a virtual COM port interface on USB, USB software re-enumeration, a mass storage interface on USB, and USB power management requests for more than 100 mA power on USB.

In order to program the board, we have to plug in two jumpers on connector CN2 as shown in Fig. 5.7. Table 5.1 shows the connector CN4 configurations.


Fig. 5.7. Connector CN2.

Table 5.1. Connector CN4 Configurations

| Jumper | Function | Default State | Description |
|---|---|---|---|
| JP1 | ST-LINK RST | ON[1-2] | Reset MCU |
| | | OFF | Normal use |
| JP2/JP3 | Ground | OFF | Ground probe |
| JP4 | nRST | ON | ST-LINK can reset MCU |
| | | OFF | ST-LINK cannot reset MCU |
| JP5 | 5 V Power selection | ON[1-2] | 5 V from ST-LINK |
| | | ON[3-4] | 5 V from VIN |
| | | ON[5-6] | 5 V from E5V |
| | | ON[7-8] | 5 V from USB_CHARGE |
| | | OFF | No 5 V power (use 3.3 V) |
| JP6 | Current measurement | ON[1-2] | No current measurement |
| | | OFF | Current measurement mode |
| JP7 | VDD_MCU = 3.3 V | ON[1-2] | VDD_MCU voltage = 3.3 V |
| | | ON[2-3] | VDD_MCU voltage = 1.8 V |
| | | OFF | No VDD_MCU |
| JP8 | VDD_IN_SMPS | ON[1-2] | 1.1 V ext SMPS input power |
| | | OFF | Ext SMPS not powered |
| CN2 | | ON[1-2], ON[3-4] | ST-LINK enable for debugger |
| | | OFF | ST-LINK enabled for ext CN2 connector |

Before connecting the Nucleo-64 board to a Windows PC, a driver for the ST-LINK/V2-1 must be installed. This can be downloaded from the following site. You will have to register at the site so that you can download the driver. At the time of writing this book, the driver was called en.stsw-link009.zip:

http://www.st.com/en/development-tools/stsw-link009.html#getsoftware-scroll


URL:

https://www.sciencedirect.com/science/article/pii/B9780081029695000057

PIC Programming

Martin Bates, in Interfacing PIC Microcontrollers (Second Edition), 2014

Assignments 2

2.1 LED2 Simulation

Use the MPLABX debugging tools to single step the program LED2 and observe the changes in the MCU registers. Operate the simulated inputs to enable the output count to Port B. Set a break point at the output instruction and run one loop at a time, checking that Port B is incremented. Use the stopwatch to measure the loop time. Comment out the delay routine call in the source code, reassemble and check that the delay does not execute, and note the effect on the loop time. Reinstate the delay, change the delay count to 03 and note the effect on the loop time.

2.2 LED2 Modification

Study program LED2 in MPLABX. Modify the source code to light only the least significant LED and then rotate it through each bit so that the output port appears to scan at a visible rate. Add code to detect the high bit in the carry flag and reverse the direction of travel at each end so the scanning is continuous from end to end.


URL:

https://www.sciencedirect.com/science/article/pii/B9780080993638000029

The Final Phases of Embedded Design

Tammy Noergaard, in Embedded Systems Architecture (Second Edition), 2013

12.1.4 Debugging Tools

Aside from creating the architecture, debugging code is probably the most difficult task of the development cycle. Debugging is primarily the task of locating and fixing errors within the system. This task is made simpler when the programmer is familiar with the various types of debugging tools available and how they can be used (the type of information shown in Table 12-1).

Table 12-1. Debug Tools

Columns: Tool Type; Debugging Tools; Descriptions; Examples of Uses and Drawbacks.

Tool type: Hardware

In-circuit emulator (ICE)
Description: Active device replaces microprocessor in system
Uses and drawbacks:
- typically most expensive debug solution, but has a lot of debugging capabilities
- can operate at the full speed of the processor (depends on ICE) and to the rest of the system it is the microprocessor
- allows visibility and modifiability of internal memory, registers, variables, etc., real-time
- similar to debuggers, allows setting breakpoints, single stepping, etc.
- usually has overlay memory to simulate ROM
- processor dependent

ROM emulator
Description: Active tool replaces ROM with cables connected to dual port RAM within ROM emulator, simulates ROM. It is an intermediate hardware device connected to the target via some cable (i.e., BDM) and connected to the host via another port
Uses and drawbacks:
- allows modification of contents in ROM (unlike a debugger)
- can set breakpoints in ROM code and view ROM code real-time
- usually doesn't support on-chip ROM, custom ASICs, etc.
- can be integrated with debuggers

Background debug mode (BDM)
Description: BDM hardware on board (port and integrated debug monitor into master CPU) and debugger on host, connected via a serial cable to BDM port. The connector on cable to BDM port, commonly referred to as wiggler. BDM debugging sometimes referred to as On-Chip Debugging (OCD)
Uses and drawbacks:
- usually cheaper than ICE, but not as flexible as ICE
- observe software execution unobtrusively in real-time
- can set breakpoints to stop software execution
- allows reading and writing to registers, RAM, I/O ports, etc.
- processor/target dependent, Motorola proprietary debug interface

IEEE 1149.1 Joint Test Action Group (JTAG)
Description: JTAG-compliant hardware on board
Uses and drawbacks:
- similar to BDM, but not proprietary to specific architecture (is an open standard)

IEEE-ISTO Nexus 5001
Description: Options of JTAG port, Nexus-compliant port, or both, several layers of compliance (depending on complexity of master processor, engineering choice, etc.)
Uses and drawbacks:
- offers scalable debug functions depending on level of compliance of hardware

Oscilloscope
Description: Passive analog device that graphs voltage (on vertical axis) versus time (on horizontal axis), detecting the exact voltage at a given time
Uses and drawbacks:
- monitor up to two signals simultaneously
- can set a trigger to capture voltage given specific conditions
- used as voltmeter (though a more expensive one)
- can verify circuit is working by seeing signal over bus or I/O ports
- capture changes in a signal on I/O port to verify segments of software are running, calculate timing from one signal change to next, etc.
- processor independent

Logic analyzer
Description: Passive device that captures and tracks multiple signals simultaneously and can graph them
Uses and drawbacks:
- can be expensive
- typically can only track two voltages (VCC and ground); signals in-between are graphed as either one or the other
- can store data (whereas only storage oscilloscopes can store captured data)
- two main operating modes (timing, state) to allow triggers on changes of states of signal (i.e., high-to-low or low-to-high)
- capture changes in a signal on I/O port to verify segments of software are running, calculate timing from one signal change to next, etc. (timing mode)
- can be triggered to capture data from a clock event off the target or an internal logic analyzer clock
- can trigger if processor accesses off-limits section of memory, writes invalid data to memory, or accesses a particular type of instruction (state mode)
- some will show assembly code, but usually cannot set break point and single-step through code using analyzer
- logic analyzer can only access data transmitted externally to and from processor, not the internal memory, registers, etc.
- processor independent and allows view of system executing in real time with very little intrusion

Voltmeter
Description: Measures voltage difference between two points on circuit
Uses and drawbacks:
- to measure for particular voltage values
- to determine if circuit has any power at all
- cheaper than other hardware tools

Ohmmeter
Description: Measures resistance between two points on circuit
Uses and drawbacks:
- cheaper than other hardware tools
- to measure changes in current/voltage in terms of resistance (Ohm's law: V = IR)

Multimeter
Description: Measures both voltage and resistance
Uses and drawbacks:
- same as volt and ohm meters

Tool type: Software

Debugger
Description: Functional debugging tool
Uses and drawbacks (depends on the debugger; in general):
- loading/single-stepping/tracing code on target
- implementing breakpoints to stop software execution
- implementing conditional breakpoints to stop if particular condition is met during execution
- can modify contents of RAM, typically cannot modify contents of ROM

Profiler
Description: Collects the timing history of selected variables, registers, etc.
Uses and drawbacks:
- capture time-dependent (when) behavior of executing software
- capture execution pattern (where) of executing software

Monitor
Description: Debugging interface similar to ICE, with debug software running on target and host. Part of monitor resides in ROM of target board (commonly called debug agent or target agent), and a debugging kernel on the host. Software on host and target typically communicate via serial or Ethernet (depends on what is available on target).
Uses and drawbacks:
- similar to print statement but faster, less intrusive, works better for soft real-time deadlines, but not for hard real-time
- similar functionality to debugger (breakpoints, dumping registers and memory, etc.)
- embedded OSs can include monitor for particular architectures

Instruction set simulator
Description: Runs on host and simulates master processor and memory (executable binary loaded into simulator as it would be loaded onto target) and mimics the hardware
Uses and drawbacks:
- typically does not run at exact same speed of real target, but can estimate response and throughput times by taking into consideration the differences between host and target speeds
- verify assembly code is bug free
- usually doesn't simulate other hardware that may exist on target, but can allow testing of built-in processor components
- can simulate interrupt behavior
- capture variable, memory and register values
- more easily port code developed on simulator to target hardware
- will not precisely simulate the behavior of the actual hardware in real-time
- typically better suited for testing algorithms rather than reaction to events external to an architecture or board (waveforms and such need to be simulated via software)
- typically cheaper than investing in real hardware and tools

Tool type: Manual

Readily available, free or cheaper than other solutions, effective, simpler to use but usually more highly intrusive than other types of tools; not enough control over event selection, isolation, or repeatability. Difficult to debug real-time system if manual method takes too long to execute.

Print statements
Description: Functional debugging tool, printing statements inserted into code that print variable information, location in code information, etc.
Uses and drawbacks:
- to see output of variables, register values, etc. while the code is running
- to verify segment of code is being executed
- can significantly slow down execution time
- can cause missed deadlines in real-time system

Dumps
Description: Functional debugging tool that dumps data into some type of storage structure at runtime
Uses and drawbacks:
- same as print statements but allows faster execution time in replacing several print statements (especially if there is a filter identifying what specific types of information to dump or what conditions need to be met to dump data into the structure)
- see contents of memory at runtime to determine if any stack/heap over-runs

Counters/timers
Description: Performance and efficiency debugging tool in which counters or timers reset and incremented at various points of code
Uses and drawbacks:
- collect general execution timing information by working off system clock or counting bus cycles, etc.
- some intrusiveness

Fast display
Description: Functional debugging tool in which LEDs are toggled or simple LCD displays present some data
Uses and drawbacks:
- similar to print statement but faster, less intrusive, working well for real-time deadlines
- allows confirmation that specific parts of code are running

Output ports
Description: Performance, efficiency, and functional debugging tool in which output port toggled at various points in software
Uses and drawbacks:
- with an oscilloscope or logic analyzer, can measure when port is toggled and get execution times between toggles of port
- same as above but can see on oscilloscope that code is being executed in first place
- in multitasking/multithreaded system assign different ports to each thread/task to study behavior

As seen from some of the descriptions in Table 12-1, debugging tools reside and interconnect in some combination of standalone devices, on the host, and/or on the target board.
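The "output ports" and "counters/timers" entries of Table 12-1 can be illustrated with a short sketch. It is written under stated assumptions: debug_port is a plain variable standing in for a memory-mapped GPIO register, toggle_count exists only so the behavior can be checked without an oscilloscope, and checksum_task is a hypothetical piece of code under test.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a memory-mapped GPIO output register (hypothetical). */
static volatile uint32_t debug_port = 0;
/* Counts edges so the sketch can be checked without an oscilloscope. */
static uint32_t toggle_count = 0;

/* Flip one pin of the debug port and count the edge. */
static void debug_pin_toggle(unsigned pin)
{
    assert(pin < 32u);
    debug_port ^= (1u << pin);
    toggle_count++;
}

/* Hypothetical task under test; pin 0 is high for exactly its duration. */
uint32_t checksum_task(const uint8_t *data, uint32_t len)
{
    uint32_t sum = 0;
    debug_pin_toggle(0);            /* rising edge: task entered */
    for (uint32_t i = 0; i < len; i++) {
        sum += data[i];
    }
    debug_pin_toggle(0);            /* falling edge: task finished */
    return sum;
}
```

On real hardware the width of the pulse on pin 0, measured with an oscilloscope or logic analyzer, would give the execution time of the task; assigning a different pin to each task makes their interleaving visible.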

A Quick Comment on Measuring System Performance with Benchmarks

Aside from debugging tools, once the board is up and running, benchmarks are software programs that are commonly used to measure the performance (latency, efficiency, etc.) of individual features within an embedded system, such as the master processor, the OS, or the JVM. In the case of an OS, for example, performance is measured by how efficiently the master processor is utilized by the scheduling scheme of the OS. The scheduler needs to assign the appropriate time quantum—the time a process gets access to the CPU—to a process, because if the time quantum is too small, thrashing occurs.

The main goal of a benchmark application is to represent a real workload to the system. There are many benchmarking applications available. These include EEMBC (Embedded Microprocessor Benchmark Consortium) benchmarks, the industry standard for evaluating the capabilities of embedded processors, compilers, and Java; Whetstone, which simulates arithmetic-intensive science applications; and Dhrystone, which simulates systems programming applications, used to derive MIPS introduced in Section II. The drawbacks of benchmarks are that they may not be very realistic or reproducible in a real-world design that involves more than one feature of a system. Thus, it is typically much better to use real embedded programs that will be deployed on the system to determine not only the performance of the software, but the overall system performance.

In short, when interpreting benchmarks, ensure you understand exactly what software was run and what the benchmarks did or did not measure.

Some of these tools are active debugging tools and are intrusive to the running of the embedded system, while other debug tools passively capture the operation of the system with no intrusion as the system is running. Debugging an embedded system usually requires a combination of these tools in order to address all of the different types of problems that can arise during the development process.

Based on the article "Firmware Basics for the Boss," Jack Ganssle, Embedded Systems Programming, February 2004.

Real-World Advice

The Cheapest Way To Debug

Even with all the available tools, developers should still try to reduce debugging time and costs, because (1) the cost of bugs increases the closer to production and deployment time the schedule gets, and (2) the cost of a bug is logarithmic (it can increase 10-fold when discovered by a customer versus if it had been found during development of the device). Some of the most effective means of reducing debug time and cost include:

Not developing too quickly and sloppily. The cheapest and fastest way to debug is to not insert any bugs in the first place. Fast and sloppy development actually delays the schedule with the amount of time spent on debugging mistakes.

System inspections. These include hardware and software inspections throughout the development process that ensure that developers are designing according to the architecture specifications and any other standards required of the engineers. Code or hardware that doesn't meet standards will have to be "debugged" later if system inspections aren't used to flush them out quickly and cheaply (relative to the time spent debugging and fixing all that much more hardware and code later).

Don't use faulty hardware or badly written code. A component is typically ready to be redesigned when the responsible engineer is fearful of making any changes to the offending component.

Track the bugs in a general text file or using one of the many bug tracking off-the-shelf software tools. If components (hardware or software) are continually causing problems, it may be time to redesign that component.

Don't skimp on the debugging tools. One good (albeit more expensive) debugging tool that would cut debug time is worth more than a dozen cheaper tools that, without a lot of time and headaches, can barely track down the type of bugs encountered in the process of designing an embedded system.

And finally what I (the author of this book) believe is one of the best methods by which to reduce debug times and costs: read the documentation provided by the vendor and/or responsible engineers first, before trying to run or modify anything. I have heard many, many excuses over the years—from "I didn't know what to read" to "Is there documentation?"—as to why an engineer hasn't read any of the documentation. These same engineers have spent hours, if not days, on individual problems with configuring the hardware or getting a piece of software running correctly. I know that if these engineers had read the documentation in the first place, the problem would have been resolved in seconds or minutes—or might not have occurred at all.

If you are overwhelmed with documentation and don't know what to read first, anything titled along the lines of "Getting Started…," "Booting up the system…," or "README" are good indicators of a place to begin. ☺ Moreover, take the time to read all of the documentation provided with any hardware or software to become familiar with what type of information is there, in case it's needed later.


URL:

https://www.sciencedirect.com/science/article/pii/B9780123821966000121

Pitfalls and Issues of Manycore Programming

Ami Marowka, in Advances in Computers, 2010

3.4 Locks and Deadlocks

Manycore programming uses multiple threads to access shared-memory objects. To avoid unpredictable situations such as race conditions, mutual exclusion techniques are used to impose constraints on the order that threads access a shared-memory location. An exclusive access to a shared object can be guaranteed by using locks, also called mutexes. Listing 4 illustrates a simple example of using OpenMP locks for protecting simultaneous access to a shared variable (count).

There are many types of mutexes, each suited to different situations and each incurring a different amount of overhead. There are low-level primitive locking mechanisms (semaphores, condition variables, and mutexes) and high-level locking mechanisms (recursive mutexes, read-write mutexes, spin mutexes, queuing mutexes, and monitors). For example, OpenMP supports the locking constructs lock–unlock, atomic, single, and critical. Lock–unlock and critical constructs are used for protecting coarse-grained critical sections, whereas atomic is applied to a single assignment statement to protect a single shared variable. Figure 10 is a bar chart of the OpenMP locking construct overhead measured by running the EPCC microbenchmarks [59] with two threads on three different platforms (Intel Pentium D, Intel Core 2 Duo, and Intel Core 2 Quad) [49]. Analysis of the results leads to the conclusion that the critical directive costs less than the lock–unlock pair and that, since the overhead of the atomic directive is negligible, atomic is recommended, where possible, in place of the critical or lock–unlock directives.

Fig. 10. OpenMP locking overheads of two threads on Intel Pentium D, Intel Core 2 Duo, and Intel Core 2 Quad machines.
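To make the difference between the atomic and critical constructs concrete, here is a minimal sketch that protects the same shared counter both ways. It is not the chapter's Listing 4, and the function names count_with_atomic and count_with_critical are hypothetical. The code uses only OpenMP pragmas, so a compiler invoked without OpenMP support (for example, GCC without -fopenmp) ignores them and runs the loops serially.

```c
#include <assert.h>

/* Increment a shared counter n times using the atomic construct. */
long count_with_atomic(long n)
{
    assert(n >= 0);
    long count = 0;                 /* shared by all threads in the loop */
    #pragma omp parallel for
    for (long i = 0; i < n; i++) {
        #pragma omp atomic
        count++;                    /* single update, protected atomically */
    }
    return count;
}

/* Same work, protected by a critical section instead. */
long count_with_critical(long n)
{
    assert(n >= 0);
    long count = 0;
    #pragma omp parallel for
    for (long i = 0; i < n; i++) {
        #pragma omp critical
        {
            count++;                /* one thread at a time enters here */
        }
    }
    return count;
}
```

Both functions return n; they differ only in the overhead paid per increment, which is what the microbenchmark figures compare.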

Another example is the Intel TBB, which offers enhanced mutexes called mutex, spin_mutex, queuing_mutex, and atomic. For example, a task invoking a request to lock on spin_mutex waits (spins) until it can acquire the lock. It is very fast in low-contention scenarios and incurs very low overhead, as shown in Fig. 11. Queuing_mutex is a less desirable locking mechanism because it incurs more overhead, though it offers fairness and avoids starvation by enforcing a FIFO policy on the arriving locking requests [42].

Fig. 11. TBB Locking overheads of two threads on Intel Pentium D, Intel Core 2 Duo, and Intel Core 2 Quad machines.

Improper use of locks may cause many problems that are very difficult to detect without a sophisticated debugging tool. One such situation is known as deadlock. A deadlock is a situation in which a task A is waiting to acquire a lock on a shared object r1 locked by a task B, while holding a lock on a shared object r2 requested by task B. Since both tasks are blocked, each waiting for the other to release the object it holds, and neither volunteers to be the first to release its object, program execution is stuck. There are four conditions that lead to a deadlock situation:

Exclusiveness: Exclusive assignment of an object to a task.

Multilock: Allowing a task to acquire a lock on one object while locking another object.

Ownership: A locked object can be released only by the task that holds it.

Cycling: A task wants to acquire a lock on an object held by another task, which in turn wants to acquire a lock on an object held by the first task.

Deadlocks can be avoided by breaking any one of these conditions.

Listing 29—An example of a deadlock caused by an incorrect locking hierarchy.

#include <stdio.h>
#include <omp.h>

int globalX = 0;
int globalY = 0;

/* The locks are global so that work0() and work1() can both see them. */
omp_lock_t lck0;
omp_lock_t lck1;

/* Acquires lck0 first, then lck1. */
int work0(void)
{
    omp_set_lock(&lck0);
    globalX++;
    omp_set_lock(&lck1);
    globalY++;
    omp_unset_lock(&lck1);
    omp_unset_lock(&lck0);
    return 0;
}

/* Acquires the same locks in the opposite order: the deliberate bug. */
int work1(void)
{
    omp_set_lock(&lck1);
    globalX++;
    omp_set_lock(&lck0);
    globalY++;
    omp_unset_lock(&lck0);
    omp_unset_lock(&lck1);
    return 0;
}

int main(int argc, char *argv[])
{
    omp_init_lock(&lck0);
    omp_init_lock(&lck1);

    /* Each section may run on a different thread. */
    #pragma omp parallel sections
    {
        #pragma omp section
        work0();
        #pragma omp section
        work1();
    }

    printf("TOTAL = (%d,%d)\n", globalX, globalY);
    omp_destroy_lock(&lck0);
    omp_destroy_lock(&lck1);
    return 0;
}

The above example (Listing 29) illustrates the potential for deadlock because of an incorrect locking hierarchy. The two threads in this program invoke two functions (work0 and work1) that attempt to acquire two locks (lck0 and lck1) in reverse order for exclusive access to two global variables (globalX, globalY). If both threads obtain only the first critical section (an access to globalX), deadlock occurs because the second critical section (an access to globalY) never becomes available. The deadlock is avoided if one of the threads acquires both critical sections. This nondeterministic behavior of thread execution can lead to situations where potential deadlocks lie dormant in the code for a long time without causing any damage until the day they suddenly appear for a moment and then disappear again. For example, Edward Lee reported on a case where a deadlock appeared 4 years after the application was launched [60].

One way to avoid deadlocks is to impose an ordering (eliminating cycles in the resource acquisition graph) on the locks and demand that all threads acquire their locks in the same order. Other techniques to prevent deadlock are timer-attached mutex and exception-aware mutex. In a timer-attached mutex a timer is attached to the mutex, thus guaranteeing that the mutex will be released after a predetermined time if a release operation has not been invoked before. An exception-aware mutex is a technique that ensures that a mutex gets released when an exception occurs.
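The lock-ordering rule described above can be sketched as a helper that always acquires two locks in one global order, here by ascending address. This is an illustration under our own assumptions: demo_lock_t is a hypothetical spinlock built on C11 atomic_flag rather than an OpenMP or TBB type, and lock_pair and unlock_pair are names invented for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Minimal spinlock (hypothetical helper type, not an OpenMP or TBB API). */
typedef struct {
    atomic_flag held;
} demo_lock_t;

static void demo_lock(demo_lock_t *l)
{
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire)) {
        /* busy-wait until the holder clears the flag */
    }
}

static void demo_unlock(demo_lock_t *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Acquire two distinct locks in one global order (lowest address first),
 * regardless of the order in which the caller names them. */
void lock_pair(demo_lock_t *a, demo_lock_t *b)
{
    assert(a != b);
    if ((uintptr_t)a > (uintptr_t)b) {
        demo_lock_t *tmp = a;
        a = b;
        b = tmp;
    }
    demo_lock(a);
    demo_lock(b);
}

/* Release both locks; release order does not affect deadlock freedom. */
void unlock_pair(demo_lock_t *a, demo_lock_t *b)
{
    demo_unlock(a);
    demo_unlock(b);
}
```

If every thread takes its locks through lock_pair, two threads that name the locks in opposite orders still acquire them in the same order, so the cycle in the resource acquisition graph cannot form.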


URL:

https://www.sciencedirect.com/science/article/pii/S0065245810790021

Computing Platforms

Marilyn Wolf, in Computers as Components (Third Edition), 2012

4.5.5 Debugging Techniques

A good deal of software debugging can be done by compiling and executing the code on a PC or workstation. But at some point it inevitably becomes necessary to run code on the embedded hardware platform. Embedded systems are usually less friendly programming environments than PCs. Nonetheless, the resourceful designer has several options available for debugging the system.

The USB port found on most evaluation boards is one of the most important debugging tools. In fact, it is often a good idea to design a USB port into an embedded system even if it will not be used in the final product; USB can be used not only for development debugging but also for diagnosing problems in the field or field upgrades of software.

Another very important debugging tool is the breakpoint. The simplest form of a breakpoint is for the user to specify an address at which the program's execution is to break. When the program counter (PC) reaches that address, control is returned to the monitor program. From the monitor program, the user can examine and/or modify CPU registers, after which execution can be continued. Implementing breakpoints does not require using exceptions or external devices.

Programming Example 4.1 shows how to use instructions to create breakpoints.

Programming Example 4.1

Breakpoints

A breakpoint is a location in memory at which a program stops executing and returns to the debugging tool or monitor program. Implementing breakpoints is very simple: you replace the instruction at the breakpoint location with a subroutine call to the monitor. In the following code, to establish a breakpoint at location 0x40c in some ARM code, we've replaced the branch (B) instruction normally held at that location with a subroutine call (BL) to the breakpoint handling routine:

When the breakpoint handler is called, it saves all the registers and can then display the CPU state to the user and take commands.

To continue execution, the original instruction must be put back in the program. If the breakpoint can be erased, the original instruction can simply be restored and control returned to that instruction. This will normally require fixing the subroutine return address, which will point to the instruction after the breakpoint. If the breakpoint is to remain, then the original instruction can be restored and a new temporary breakpoint placed at the next instruction (taking jumps into account, of course). When the temporary breakpoint is reached, the monitor puts back the original breakpoint, removes the temporary one, and resumes execution.
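The bookkeeping described above can be illustrated with a toy model in Python. This is a sketch under loud assumptions, not ARM code and not the monitor from the example: the three-instruction machine (ADD, JMP, HALT), the BKPT marker, and the class ToyMonitor are all hypothetical. It shows a breakpoint that remains in place: on resume the monitor restores the original instruction, plants a temporary breakpoint at the instruction that will execute next (following a jump if necessary), and re-arms the original breakpoint when the temporary one is reached.

```python
class ToyMonitor:
    """Toy machine with instruction-replacement breakpoints (hypothetical)."""

    BKPT = ("BKPT",)  # stands in for the subroutine call to the monitor

    def __init__(self, program):
        self.mem = list(program)   # instruction memory
        self.pc = 0                # program counter
        self.acc = 0               # single accumulator register
        self.saved = {}            # breakpoint address -> original instruction
        self.temp = None           # (temp address, instruction there, address to re-arm)
        self.stopped_at = None     # user breakpoint we are currently stopped at

    def set_breakpoint(self, addr):
        if addr not in self.saved:
            self.saved[addr] = self.mem[addr]
            self.mem[addr] = self.BKPT

    @staticmethod
    def _next_addr(addr, instr):
        # Take jumps into account when locating the next instruction.
        return instr[1] if instr[0] == "JMP" else addr + 1

    def run(self):
        """Run until a user breakpoint (return its address) or HALT (return None)."""
        if self.stopped_at is not None:
            # Resume: restore the original instruction, plant a temporary breakpoint.
            original = self.saved[self.pc]
            self.mem[self.pc] = original
            if original[0] != "HALT":
                nxt = self._next_addr(self.pc, original)
                self.temp = (nxt, self.mem[nxt], self.pc)
                self.mem[nxt] = self.BKPT
            self.stopped_at = None
        while True:
            instr = self.mem[self.pc]
            if instr == self.BKPT:
                if self.temp is not None and self.temp[0] == self.pc:
                    temp_addr, temp_original, rearm = self.temp
                    self.mem[temp_addr] = temp_original  # remove temporary breakpoint
                    self.mem[rearm] = self.BKPT          # put back original breakpoint
                    self.temp = None
                    if self.pc not in self.saved:
                        continue                          # resume transparently
                self.stopped_at = self.pc
                return self.pc
            if instr[0] == "HALT":
                return None
            if instr[0] == "ADD":
                self.acc += instr[1]
            self.pc = self._next_addr(self.pc, instr)


PROGRAM = [("ADD", 1), ("ADD", 10), ("JMP", 4),
           ("ADD", 100), ("ADD", 1000), ("HALT",)]
```

For example, after ToyMonitor(PROGRAM).set_breakpoint(1) the first call to run() stops at address 1 and the second runs to HALT, leaving the breakpoint at address 1 armed and the JMP at address 2 intact. A real monitor would additionally save the registers and fix up the return address, which this model omits.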

The Unix dbx debugger shows the program being debugged in source code form, but that capability is too complex to fit into most embedded systems. Very simple monitors will require you to specify the breakpoint as an absolute address, which requires you to know how the program was linked. A more sophisticated monitor will read the symbol table and allow you to use labels in the assembly code to specify locations.

LEDs as debugging devices

Never underestimate the importance of LEDs (light-emitting diodes) in debugging. As with serial ports, it is often a good idea to design in a few to indicate the system state even if they will not normally be seen in use. LEDs can be used to show error conditions, when the code enters certain routines, or to show idle time activity. LEDs can be entertaining as well—a simple flashing LED can provide a great sense of accomplishment when it first starts to work.

When software tools are insufficient to debug the system, hardware aids can be deployed to give a clearer view of what is happening when the system is running. The microprocessor in-circuit emulator (ICE) is a specialized hardware tool that can help debug software in a working embedded system. At the heart of an in-circuit emulator is a special version of the microprocessor that allows its internal registers to be read out when it is stopped. The in-circuit emulator surrounds this specialized microprocessor with additional logic that allows the user to specify breakpoints and examine and modify the CPU state. The CPU provides as much debugging functionality as a debugger within a monitor program, but does not take up any memory. The main drawback to in-circuit emulation is that the machine is specific to a particular microprocessor, even down to the pinout. If you use several microprocessors, maintaining a fleet of in-circuit emulators to match can be very expensive.

The logic analyzer [Ald73] is the other major piece of instrumentation in the embedded system designer's arsenal. Think of a logic analyzer as an array of inexpensive oscilloscopes: the analyzer can sample many different signals simultaneously (tens to hundreds) but can display only 0, 1, or changing values for each. All these logic analysis channels can be connected to the system to record the activity on many signals simultaneously. The logic analyzer records the values on the signals into an internal memory and then shows the results on a display once the memory is full or the run is aborted. The logic analyzer can capture thousands or even millions of samples of data on all of these channels, providing a much larger time window into the operation of the machine than is possible with a conventional oscilloscope.

A typical logic analyzer can acquire data in either of two modes that are typically called state and timing modes. To understand why two modes are useful and the difference between them, it is important to remember that a logic analyzer trades reduced resolution on the signals for the longer time window. The measurement resolution on each signal is reduced in both voltage and time dimensions. The reduced voltage resolution is accomplished by measuring logic values (0, 1, x) rather than analog voltages. The reduction in timing resolution is accomplished by sampling the signal, rather than capturing a continuous waveform as in an analog oscilloscope.

State and timing mode represent different ways of sampling the values. Timing mode uses an internal clock that is fast enough to take several samples per clock period in a typical system. State mode, on the other hand, uses the system's own clock to control sampling, so it samples each signal only once per clock cycle. As a result, timing mode requires more memory to store a given number of system clock cycles. On the other hand, it provides greater resolution in the signal for detecting glitches. Timing mode is typically used for glitch-oriented debugging, while state mode is used for sequentially oriented problems.
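The trade-off between the two modes can be made concrete with a small Python sketch. Everything here is an assumption chosen for illustration: the 100 ns system clock, the 30 ns glitch, the ten-fold oversampling, and the function names are hypothetical and do not describe any particular instrument. State mode takes one sample per system clock cycle, while timing mode takes several samples per cycle with the analyzer's own faster clock.

```python
def make_glitchy_signal(glitch_start_ns, glitch_end_ns):
    """Return a signal that idles at 0 and is 1 only during a short glitch."""
    def signal(t_ns):
        return 1 if glitch_start_ns <= t_ns < glitch_end_ns else 0
    return signal


def sample_state_mode(signal, clock_period_ns, n_cycles):
    """One sample per system clock cycle, taken at the clock edge."""
    return [signal(cycle * clock_period_ns) for cycle in range(n_cycles)]


def sample_timing_mode(signal, clock_period_ns, n_cycles, samples_per_cycle):
    """Several samples per system clock cycle, using a faster internal clock."""
    step_ns = clock_period_ns // samples_per_cycle
    return [signal(i * step_ns) for i in range(n_cycles * samples_per_cycle)]


# Assumed scenario: 100 ns system clock, glitch between 230 ns and 260 ns.
CLOCK_PERIOD_NS = 100
signal = make_glitchy_signal(230, 260)

state_samples = sample_state_mode(signal, CLOCK_PERIOD_NS, n_cycles=4)
timing_samples = sample_timing_mode(signal, CLOCK_PERIOD_NS, n_cycles=4,
                                    samples_per_cycle=10)

# The glitch falls between clock edges, so state mode never sees it.
glitch_seen_in_state_mode = 1 in state_samples
glitch_seen_in_timing_mode = 1 in timing_samples
```

With these numbers the state-mode capture holds 4 samples and misses the glitch, whereas the timing-mode capture holds 40 samples for the same four clock cycles and records the glitch in three of them. That is the memory-for-resolution trade described in the text.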

The internal architecture of a logic analyzer is shown in Figure 4.24. The system's data signals are sampled at a latch within the logic analyzer; the latch is controlled by either the system clock or the internal logic analyzer sampling clock, depending on whether the analyzer is being used in state or timing mode. Each sample is copied into a vector memory under the control of a state machine. The latch, timing circuitry, sample memory, and controller must be designed to run at high speed because several samples per system clock cycle may be required in timing mode. After the sampling is complete, an embedded microprocessor takes over to control the display of the data captured in the sample memory.

Figure 4.24. Architecture of a logic analyzer.

Logic analyzers typically provide a number of formats for viewing data. One format is a timing diagram format. Many logic analyzers allow not only customized displays, such as giving names to signals, but also more advanced display options. For example, an inverse assembler can be used to turn vector values into microprocessor instructions. The logic analyzer does not provide access to the internal state of the components, but it does give a very good view of the externally visible signals. That information can be used for both functional and timing debugging.

URL:

https://www.sciencedirect.com/science/article/pii/B9780123884367000040

Software Development Tools for Embedded Systems

Catalin Dan Udma , in Software Engineering for Embedded Systems, 2013

Debugging tools using Eclipse and GDB

In this section we present an example of how to use free open-source software to obtain an integrated development and debugging tool providing standard debug capabilities for Linux user-space application debug and Linux kernel debug. We will use the following open-source software:

GDB/GDBserver for low-level support for debugging the applications on the embedded target. We have seen above how to download, compile, configure and use GDB for our target.

KGDB – this is a kernel functionality that allows the kernel to be debugged over a serial line or Ethernet from a remote host. The remote host uses GDB for connecting to the running kernel on the target through the interface provided by KGDB.

Eclipse – Eclipse is an open-source community focused on building an open development platform for extensible frameworks, tools and run-times for debugging, deploying and managing software. Eclipse also provides the graphical user interface for GDB: source view in editor, debug window with stack frames, memory view, register view, variables and many others.

In our example we will use the Eclipse IDE for C/C++ Developers (http://www.eclipse.org/downloads/).

Linux application debug with GDB

In this example we will describe how to set up the debugging environment for Linux user-space application debug. There are some preconditions that have to be met before starting the environment configuration:

GDB and GDBserver have to be compiled and configured for the embedded target. The GDBserver will be manually started on the target and the GDB will run on the host, compiled for cross-platform debugging.

Eclipse IDE for C/C++ Developers has to be installed on the host computer.

Eclipse IDE has the necessary support to do the Linux application debug using GDB and GDBserver and we will present how to configure the debug launch of the Eclipse.

The Eclipse project for debugging the Linux application can be created, for example, in either of the following ways:

using the Eclipse wizard for creating a new project: the option "Cross-Compile Project" allows compiling the application using the cross-build tool chain, so that the application is compiled to run on the embedded target;

importing an already compiled target application.

In the "Debug Configuration" submenu, we will use the "C/C++ Remote Application" launch configuration. The "Debugger" configuration settings are presented in Figure 16.5.

Figure 16.5. Eclipse Linux application debug.

The preferred launcher should be "GDB (DSF) Manual Remote Debugging Launcher" to debug an application that was manually started on the remote target under control of the GDB debugger integrated using the Debugger Services Framework (DSF).

In the "Main" tab, the cross-platform GDB debugger is set as the GDB debugger, for example powerpc-linux-gdb or arm-linux-gdb.

In the same tab, set the location of the GDB command file, or initialization file. The file should contain target-specific settings: for example setting the target root file system.

In the "Shared Libraries" tab we can add the shared libraries to be debugged along with the application. The option "Load shared library symbols automatically" should be enabled.

In the "Connection" tab set the IP address of the target where the GDBserver has been started and the listening port of the GDBserver.

Linux kernel debug with KGDB

In the same way as for the Linux application debug, for the Linux kernel debug GDB should be configured for cross-platform debugging of the target and the Eclipse IDE for C/C++ Developers should be installed on the host computer.

For kernel debugging, KGDB should be used, instead of GDBserver. Debugging the kernel is not an easy task and it assumes a very good understanding of the kernel. We will present only the configuration steps for getting started with KGDB debugging. KGDB is enabled in the Linux kernel using the standard Linux configuration tool, "make menuconfig". The following items should be enabled in the "Kernel Hacking" submenu:

kernel debugging

compile the kernel with debug info

KGDB – kernel debugging with remote GDB and select one of the options: KGDB over serial or KGDB over Ethernet.

The boot loader (e.g., u-boot) passes the KGDB parameters to the kernel for the serial or Ethernet connection. These parameters can be checked or changed in the running Linux kernel by accessing the files:

/sys/module/kgdboc/parameters/kgdboc

/sys/module/kgdboe/parameters/kgdboe

The option kgdbwait causes KGDB to wait for a GDB connection in the early kernel boot stage. The kernel stops in the kernel_init() function and waits for the GDB connection from the host computer. For early kernel debug, KGDB support should be compiled inside the kernel, not as a module.

In the Eclipse IDE, the project for debugging the Linux kernel should be created by importing the vmlinux kernel file from the location where the kernel has been compiled.

The standard GDB support in Eclipse does not allow, for example, connection to the target using a UDP connection, as required for KGDB over Ethernet. For this purpose, we propose to use the "Eclipse C/C++ GDB Hardware Debugging" extension available from Indigo – http://download.eclipse.org/releases/indigo.

In the "Debug Configuration" submenu, we will use the "GDB Hardware Debugging" launch configuration. The configuration settings are presented in Figure 16.6.

Figure 16.6. Eclipse Linux kernel debug.

In the Debugger tab, the settings are similar to the launch used for Linux application debug:

the cross-platform GDB tool should be set, for example powerpc-linux-gdb, arm-linux-gdb;

the JTAG settings (Use remote target) are disabled. The connection to the target, based on KGDB, is set in the Startup tab.

In the Startup tab the initialization commands allow connecting to the target using, for example, a UDP connection, using the GDB command target remote udp:<target IP addr>.

URL:

https://www.sciencedirect.com/science/article/pii/B9780124159174000165