How does Bound-T work, and how is it implemented? This page tells you a bit about the internals of Bound-T:

- The overall architecture of Bound-T.
- How this architecture is implemented.
- Acknowledgements for tools, programs, and libraries that Bound-T uses.

The "Analysis Process" chapter of the Bound-T Reference Manual goes into more detail about the analysis methods.

The decription below focuses on the worst-case execution time (WCET) analysis. Stack usage analysis is a part of WCET analysis, but Bound-T can also be configured as a pure stack-usage analyser.

Here we sketch the architecture of Bound-T, concentrating on what Bound-T does and dividing it as follows:

- an overview of the inputs, the processing, and the outputs of Bound-T;
- how Bound-T constructs the flow-graphs and the call-graph of the program under analysis;
- modelling and analysing the computations in the target program and deducing loop iteration bounds;
- finding an upper bound on the execution time of each instruction in the program; and
- calculating an overall WCET bound from the control-flow analysis, loop bounds, and instruction execution times.

As presently implemented, Bound-T is a classical computer program in the sense that it reads some inputs, performs some processing on the inputs, and emits some outputs.

The **inputs** to Bound-T are the *target program* to be
analysed, as a compiled and linked (executable) file; the command-line
*arguments and options*; and the user-written *assertions*
in one or more files of text.

The **processing** falls into five main phases, each
supported by the relevant assertions:

- Extracting the code of the target program from the executable
file, organising it into the
*control-flow graphs*for each subprogram, and linking the flow-graphs by the*call-graph*. - Modelling and analysing the
*computations*performed by the instructions in the target program. A necessary part of the computation model is an analysis of the*stack usage*. Several types of data-flow analysis are used, the most powerful (and time-consuming) being based on a decision procedure for Presburger Arithmetic. - Using the computation model to study the data-dependent
*flow of control*and thereby find the*possible execution paths*. The main point is finding*loop bounds*: bounds on the number of iterations of each loop in the code. - Modelling the
*time taken*to execute each instruction in the target program, and thus finding an upper bound on the execution time of each basic block in the flow-graph of each subprogram. - Calculating the
*worst-case execution time*as constrained by the loop bounds and assertions. This phase uses the*Implicit Path Enumeration Technique*(IPET) which applies Integer Linear Programming to maximise the execution time.

The **outputs** from Bound-T are simple representations of the
results of the analysis: text on the standard output channel giving
the computed *WCET bounds* and *stack usage* bounds, and
text files that represent the *flow-graphs* and the
*call-graph* in a form that can be viewed with the "dot"
graph-drawing tool from the GraphViz kit.

The subsections below go into each processing phase in some more detail.

The first analysis phase in Bound-T constructs the flow-graphs of the subprograms to be analysed and the call-graph that links the flow-graphs. The input is the program's code in the executable file.

The target program is usually given as a compiled and linked
executable file, for example in ELF form. Such files contain the
initial *memory image* of the program, as loaded into the
target processor's memory (or memories) before the processor is
booted. The memory image contains the code (machine instructions)
and initial values for initialised data variables. However, the
initial values of variables are usually not placed (loaded)
directly into the variables (data memory locations); rather, the
startup code copies the initial values from their load-time
locations (often in the program memory) to the variables.

The executable files generally (and hopefully) also contain
information that connects the machine-level program, where parts are
numerically identified by code and data addresses, to the symbolically
identified source-code entities such as subprograms, variables, and
source-code files. This information is necessary for source-level
debugging of the program and is therefore usually called *debugging
information*. Bound-T uses the debugging information to accept input
(especially assertions) written in source-code terms and to display
outputs in source-code terms.

One of the main assumptions that Bound-T relies on is that the
program code given in the memory image is *immutable*, that is, it
does not change when the program is executed. Self-modifying
programs are not in scope for Bound-T.

Bound-T starts its analysis from the entry point of the *root
subprogram* named on the command line. Bound-T looks up this
address in the target program's memory image, reads the machine
code (binary form) of the instruction at that address, decodes
the instruction to find out what it does, and inserts a (model of)
the instruction in the flow-graph under construction for this
subprogram. If the instruction is not a branch (jump) instruction
Bound-T goes on to the next address and reads, decodes, and models
the next instruction in the same way.

When Bound-T finds a branch instruction there may be two, or
sometimes more, possible "next addresses" and their next instructions.
Bound-T creates a *branch point* in the flow-graph and follows all
the paths. Likewise, when Bound-T finds a return instruction, there is
no next instruction (in this subprogram) and the exploration of this
execution path stops with the return instruction, creating a *return
point* — a childless node — in the flow-graph.

When Bound-T finds a *call* instruction it creates a special
kind of node in the flow-graph to represent the execution of the called
subprogram (the *callee*) and then goes on with the construction of
the flow-graph of the calling subprogram (the *caller*) from the
return point of the call instruction. Bound-T also adds to the
call-graph a node that represents this call, and includes the callee in
the analysis. That is, Bound-T later analyses the code of the callee in
the same way, starting from the callee's entry point and analysing
instructions one at a time to build the flow-graph of the callee.

Constructing flow-graphs instruction by instruction in the above way
is generally easy as long as each instruction specifies its possible
successors *statically*. This is true even if the condition that
selects one of these successors at run time depends dynamically on the
run-time values of variables, as happens in conditional branches.

However, many programs contain branch instructions where the actual
address of the successor — the *branch target* — is
computed dynamically at run time. For example, many instruction sets
contain a register-indirect branch instruction where the target address
is taken from a register.

When Bound-T finds such a *dynamic branch* instruction, it
suspends the exploration of this execution path and thus leaves the
flow-graph incomplete. Later data-flow analysis, in particular the
constant propagation and the Presburger analysis, may model the
computation of the branch target well enough to find all the possible
target addresses and thus *resolve* the dynamic branch. If so,
Bound-T resumes the construction of the flow-graph from these resolved
target addresses, and hopefully completes it. However, when the
subprogram has several dynamic branches, Bound-T may find *more*
dynamic branches and thus require several cycles of flow-graph
construction interleaved with data-flow analysis until the flow-graph is
completed.

The target address of a call instruction can also be computed
dynamically, resulting in a *dynamic call*.

Programs typically use dynamic branches or calls for two purposes:

- for switch-case statements that are implemented by means of constant tables of static addresses, and
- for calling subprograms via pointers (function pointers).

Bound-T is usually able to resolve the switch-case kind of dynamic branch by analysing the local code to find the bounds of the address table, and then reading the contents of the address table from the memory image. This of course assumes that the address table is constant during execution.

Bound-T usually fails to resolve calls via pointers, because the run-time value of the pointer is usually set at some distant point in the program, and Bound-T does not do the necessary global analysis to find the possible pointer values. Thus you must usually find the possible callees manually and list them in an assertion.

One of the main problems in program analysis is *data-dependent
control flow* &mdash that is, conditional branches. For WCET
analysis, in particular, the conditional branches that create loops are
very important, because an upper bound on the number of loop iterations
must be known before any WCET bound can be found. While you can, and
often must, write assertions to set loop bounds, most WCET tools try to
find loop bounds automatically, by analysing the computations that
decide when the loop terminates.

When Bound-T decodes instructions and builds the flow-graphs, it also
creates a model of the computational effect of each instruction. In this
model, the effect of an instruction is a set of *assignments* of
values to *data cells* — registers, flags, memory locations.
The assigned value is defined by an *arithmetic expression* that
uses arithmetic *operations* such as addition and subtraction,
logical bit-level operations such as bitwise "and", "or", "xor" or shift
and rotate operations. The *operands* in the expressions are
constants or data cells.

When Bound-T adds a conditional branch to a flow-graph, as an edge
between two nodes of the flow-graph, it models the logical
*condition* for the branch as a *Boolean expression*, similar
to an arithmetic expression but returning a Boolean value.

These models of the computation turn the flow-graphs into full models of the program under analysis. In principle, the flow-graphs could be simulated and would then perform the same computation as the original program. However, in practice some computations are not modelled, or are modelled approximately. For example, Bound-T does not model floating-point computations because loop bounds seldom depend on floating-point values.

There are several knotty problems in the modelling of the computations, for example data cells that overlap, dynamically computed pointers to data cells, the differences between signed and unsigned computations, and the modelling of underflows and overflows. A deeper discussion of these problems does not fit on this page (not even in the margin :-).

Given a flow-graph with the model of the computation, Bound-T applies several forms of data-flow analysis to simplify the model and deduce constraints on the values of variables. These analyses include:

- Constant propagation.
- Value-origin analysis.
- Presburger Arithmetic analysis.

*Constant propagation* simplifies arithmetic and Boolean
expressions by replacing data-cell operands with their values, when the
values are statically known, and evaluating operations that have
constant operands. For example, if a basic block consists of two
instructions, where the first instruction assigns the value 14 to
register `r2`, and the second instruction assigns the expression
`r2+7` to register `r4`, constant propagation simplifies
the expression `r2+7` to `14+7` and then evaluates the sum
to give `21`. After constant propagation, the model of the second
instruction simply assigns 21 to `r4`.

*Value-origin analysis* applies to every use (occurrence) of a
data cell as an operand in an expression and identifies the instructions
that can have defined (assigned) the value of the data cell that is used
in this expression. This is also called "def-use" analysis or "value
numbering". For example, if a subprogram saves and restores a
"callee-save" register by first pushing it onto the stack, then popping
it off the stack before returning, value-origin analysis can show that
the popped value originates in the pushed value, and thus that the value
of the register on return is the same as the value on entry, so the
register's value is preserved over any call of this subprogram.

The *Presburger Arithmetic* (PA) analysis models the
*relations* between data-cell values, even when the actual values
are not statically known. For example, if an instruction assigns the
expression `2*r2+7` to `r4`, where `r2` and
`r4` are registers, the PA analysis records that the relation
`r4=2*r2+7` holds from this instruction on, until either
`r2` or `r4` is changed by a new assignment. Each
assignment implies a relation between data-cell values. The PA analysis
propagates these relations over the whole subprogram. The resulting
relation set can be queried in various ways to reveal bounds on
data-cell values and other things.

The Bound-T manuals tend to call the PA analysis "arithmetic analysis".

Bound-T uses the Presburger Arithmetic analysis to identify loop
*induction variables*, which are data cells that change (increase
or decrease) by a known *step* on every repetition of the loop. The
value of an induction variable on iteration number *n* can thus be
computed as the initial value plus *n* times the step. Combining
this formula with the Presburger Arithmetic model of the loop's
termination condition can give bounds on the iteration count *n*,
thus bounds on the number of times the loop is repeated.

After the control-flow and loop-bounds analysis, the second main problem in WCET analysis is to find upper bounds on the execution time of all instructions in the target program. This depends on the instructions and on the particular target processor or target system and can be very difficult for complex processors with several levels of cache, external memory banks, busses with several competing masters, and so on.

The target processors that Bound-T currently supports are simple enough that most instructions have a fixed execution time — a fixed number of clock cycles. The time can depend on the kind of instruction, for example a load instruction or an addition instruction, and on the kinds of the operands, for example registers or memory cells, but the execution time usually does not depend on the values of the operands, the address of the instruction, or the earlier history of execution.

For processors with pipelines, such as the SPARC/ERC32 or the ADSP-21020 (SHARC), Bound-T models the pipeline and includes the effect of pipeline "stalls" (interlocks) in the timing. This is aided by Bound-T's unusual view of control-flow graphs. In the usual view, each instruction occurs once in the flow-graph, and the flow-graph is actually a graph of the possible values of the Program Counter. Each PC value identifies a single instruction. In Bound-T, however, the flow-graph is a graph of the possible states of the pipeline, and a pipeline state (and thus a flow-graph node) identifies a set or sequence of instructions, one in each stage of the pipeline. This makes it easier to locate the points in the flow-graph where the pipeline may stall.

For processors that support external memory, Bound-T may let the user choose a number of wait-state cycles to be assumed for memory accesses. Some versions of Bound-T allow different wait-state choices for different sorts of memory accesses: code fetch, stack data read or write, other data read or write.

At present, Bound-T's most complex model of instruction timing is used for the SPARC/ERC32 target. This processor has separate computation units for integer computations and for floating-point computations — the Integer Unit (IU) and Floating-Point Unit (FPU). The IU and FPU can run in parallel. When the processor fetches and decodes a floating-point instruction, the FPU starts to execute this instruction. Meanwhile, the IU can fetch, decode, and execute any number of integer instructions. When the IU encounters a new FPU instruction, or an instruction that needs the results of the earlier FPU instruction, the IU stalls until the FPU has finished the earlier instruction. Bound-T models the worst-case effect of this IU/FPU synchronisation by distributing stall delays in the flow-graph according to an heuristic but safe procedure, guided by an ILP optimisation.

For any target processor, the end result of this processing phase in Bound-T is an upper bound on the execution time of each basic block and each control-flow edge in the flow-graphs.

Bound-T uses the *Implicit Path Enumeration Technique* (IPET) to
compute an upper bound on the WCET of each subprogram, once all loops
have iteration bounds and the execution times of all basic blocks and
control-edge edges are bounded. IPET uses an *Integer Linear
Programming* formulation to maximise the execution time bound over
all possible execution paths (see the TECS survey paper for
references on IPET and the
Wikipedia entry for more about ILP).

IPET considers the total execution time of a flow-graph as a function
of the number of times each part of the flow-graph (each basic block and
each edge) is executed. Let *x _{p}* stand for the number of
times the flow-graph part

The problem is thus to find the maximum value of *T(x)* over all
values of *x* that are possible for one execution of the
flow-graph. The *x* variables are constrained by the structure of
the flow-graph and by the results of the control-flow analysis. Let the
entry point of the flow-graph be the target of a notional "entry edge"
*e* that represents a call to the subprogram, and let each return
point in the flow-graph be the source of a notional "return edge" that
represents a return from the subprogram. Then:

*x*_{p}__>__0 for all*p*.*x*= 1._{e}- The sum
*x*over all return edges_{r}*r*is 1. - For every basic block
*b*:*x*equals the sum of_{b}*x*over all edges_{in}*in*that enter*b*, and also equals the sum of*x*over all edges_{out}*out*that leave*b*. - For every loop in the flow-graph, the upper bound on the number of loop iterations implies a constraint on the number of times the loop body is executed, relative to the number of times the loop is started.
- Other constraints on
*x*may be derived by flow analysis (infeasible paths, for example) or come from user-written assertions that constrain the number of executions of specific parts of the flow-graph, for example calls to other subprograms.

Given the expression for *T(x)* and the affine constraints on
*x* an ILP solver like LP_Solve can compute the
maximum of *T(x)*, which is the desired overall WCET bound. The ILP
solver also gives a value of *x* that reaches this maximum, which
is not necessarily unique, but gives some idea of the contribution of
each part of the flow-graph to the overall WCET bound.

Bound-T applies IPET to each subprogram's flow-graph separately, traversing the call-graph in a bottom-up direction from the leaf subprograms to the root subprogram.

Applying IPET per subprogram, and not to the whole extended
flow-graph at once, has the advantage of keeping the ILP problems small,
but the disadvantage of preventing any inter-subprogram flow
constraints. In partial mitigation, Bound-T performs
*context-specific* analyses for subprograms. Thus, the loop bounds
in a subprogram can depend on its parameters, and Bound-T can find
different loop bounds and different WCET bounds for this subprogram in
different contexts (different call paths).

Calculating an overall stack-usage bound from the stack-usage bounds
of each subprogram is much simpler. Again proceeding bottom-up in the
call-graph, Bound-T computes the stack usage for a subprogram call by
adding the *local stack height* at the call instruction to the
total stack usage of the callee. The total stack usage of a non-leaf
(calling) subprogram is computed as the maximum of its local stack
height and the stack usages of its calls. When the root subprogram is
reached we have both a stack usage bound for the root, including its
callees, and a call-path that reaches this stack-usage (if each
subprogram on the call path reaches its local worst-case stack
usage).

Stack usage can also be context-specific, although this is much rarer than contex-specific WCET bounds.

As in architecture, so also in implementation is Bound-T a classical
computer program. Bound-T is compiled and linked from manually written
source code into a monolithic executable file. The only external,
dynamically linked or activated components are the operating system
libraries and the two supporting tools: the Omega Calculator program
(*oc*) for solving problems in Presburger Arithmetic, and the
LP_Solve program (*lp_solve*) for solving Integer Linear
Programming problems.

Bound-T is written in the Ada language and compiled with the GNU Ada compiler GNAT. The main development platform is Debian Linux which has excellent Ada support thanks to Ludovic Brenta and others.

For many, many reasons, both low-level and high-level. To take a
fairly trivial low-level point as a first example, analysis and
simulation of binary machine code requires well-defined and portable
binary integer types, of various bit-widths. Ada is excellent at this,
as its *modular types* can be defined to have exactly the number of
bits desired.

On a slightly higher level, one of the main steps in porting Bound-T
to a new target processor is to define an internal, abstracted
representation of the instruction set: the types of instructions,
operands, and other things that make up the instruction set. The strong
support in Ada for *enumeration types* and structured types such as
arrays and records is very helpful here. In particular, arrays indexed
by enumeration values, and discriminated records with variant forms, are
often a close fit to this problem domain.

On an intermediate level, the Ada feature of *generic
programming* lets the programmer write a complex, general algorithm
once, and then instantiate it for the particular data types and
operations required by a given problem. For example, Bound-T contains a
generic package for *least-fixpoint iteration*, a very common and
powerful tool in program analysis. Different instances of this generic
package implement different phases of the analysis, such as constant
propagation and value-origin analysis.

Finally, to take an example from the highest level, Ada makes a clear
*separation* between the *interface* of a module, and the
*implementation* of the module. For Bound-T, in particular, this
helps to separate the general, target-independent modules from the
target-specific modules.

Bound-T relies mostly on the 1995 Ada standard but increasingly uses the features added in the 2005 standard.

Object-oriented coding, in the form of Ada's *tagged types*, is
used more and more. We find it to be a good new way to reduce
dependencies between modules without hampering further development and
refinement of each module. The exception-handling feature of Ada is also
useful here, in addition to its primary (and important) role of catching
programming errors and errors or unforeseen properties of the input
data.

The *stream-oriented Ada I/O packages* are exactly what we need
for reading compiled and linked executables, for example ELF or COFF
files. The ability to override the stream input operations for scalar
types lets us support both little-endian and big-endian files very
easily, as well as compute check-sums on the fly. For structured types
(arrays and records) Ada automatically constructs input operations by
combining the input operations of the components of the structured type.
This avoids a lot of manual programming and greatly reduces the risk of
errors in such code.

One feature of Ada that Bound-T does not yet use is *tasking*:
the ability to divide the program into concurrently executing tasks and
thus, for example, make good use of the several CPU cores now often
available in workstations. However, static program analysis can be very
time consuming — for Bound-T, this holds especially for the
Presburger Arithmetic analysis. In the future, we may well use tasking
to spread the computations over all the available cores.

Bound-T relies heavily on modularity through the Ada *package*
feature. Counting both the general parts of Bound-T and the
target-specific parts for all current targets, Bound-T contains a total
of some 480 Ada packages. These packages can be grouped into four
subsets as follows:

- Some 150 general, target-independent packages that form the core of Bound-T and are used for every target processor.
- Some 80 library-type packages, mainly for reading various kinds of executable files and their debugging information. The libraries needed for a given target processor depend on the executable-file formats commonly used with that processor. For example, programs for the Intel-8051 processor often use a variant of the Intel AOMF-51 format, while programs for ARM7TDMI favour the ELF format.
- About 10 library-type packages that encapsulate functions of the operating system of the host PC (workstation) on which Bound-T runs: Linux, MS-Windows, Mac OS-X.
- Typically about 40 target-specific packages for each target processor.

A separate Bound-T executable is compiled for each target processor. Target-specific packages define fundamental data types for the target processor, such as the code- and data-address types, the structure of the pipeline state that identifies nodes in the flow-graphs, and the set of data storage cells. These target-specific types and their operations are used in the general, target-independent packages that build flow-graphs, analyse arithmetic expressions of data cells, and so on.

Source-level portability to the most common host PC (workstation) types is an important goal for Bound-T. The Ada language has worked very well for this: only a very small part of Bound-T is host-specific. The main host-specific code implements the child-process creation and communication through which Bound-T uses its support programs: the Omega Calculator and LP_Solve. This host-specific code exists in two versions, one for Unix/Linux/Mac and one for MS-Windows. Each version consists of about 300 Ada declarations and statements, an insignificant amount compared to the 51,000 declarations and statements in the core Bound-T (the general, target-independent packages and the library packages) which is portable to all these hosts. Not to mention a similar amount of host-portable but target-specific code.

Bound-T was first developed at Space Systems Finland Ltd (SSF) with support from the European Space
Agency (ESA/ESTEC). Free software has
played an important role: we are grateful to Ada Core Technology and GNU for the GNAT compiler, to William Pugh
and others for the Omega system, to Michel
Berkelaar for the LP_Solve
program, to Mats Weber, Ted Dennison and EPFL-DI-LGL for Ada programming
component libraries, and to Marc Criley for the *XML EZ_Out*
package that helps to generate XML outputs. Call-graphs and flow-graphs
from Bound-T are displayed with the *dot* tool from AT&T Bell
Laboratories.