CProver manual
|
This is a tool for generating harnesses, that is, functions that instrument another function under analysis, by setting up an appropriate environment before calling them.
This is useful when trying to analyse an isolated unit of a program without having to manually construct an appropriate environment.
For a given C program - program.c
- to generate a harness for a function test_function
, we have to do the following:
We have two types of harnesses we can generate for now:
function-call
harness, which automatically synthesises an environment without having any information about the program state.memory-snapshot
harness, which loads an existing program state (in form of a JSON file) and selectively havoc-ing some variables.The harness generator can either produce the harness (i.e., the function environment) as C code, or a full GOTO binary. C code is generated when the output file name ends in “.c”.
For the most basic use of the function-call
harness generator, imagine that we have the following C program:
We first need to generate a GOTO binary.
Then we can call goto-harness
on the generated GOTO binary to get a new C file that contains the harness function:
The options we pass to goto-harness
are:
harness-function-name
is the name of the function generated by the harness generator (this needs to be specified for all harness generators).harness-type
is the type of harness to generate (call-function
is the function-call harness generator)function
is the name of the function that gets instrumentedWhat comes out of this is a C file that looks like this:
After you have generated the harness file, you have two choices. The first one is to pass it along with the original file containing the function to be harnessed to CBMC for analysis, something that would look like this:
Or, you could choose to turn it into another goto binary with goto-cc
, to be served as input to another instrumentation tool (like goto-instrument
).
The example above demonstrates the simplest case, which is roughly the same as the entry point cbmc
automatically generates for functions. However, the function-call
harness can also non-deterministically initialise global variables, array and struct elements. Consider this more complicated example:
Now we’ll try to find the bug in check_list
by generating a harness for it.
We have 3 new options here:
max-nondet-tree-depth
is the maximum extent to which we will generate and initialize non-null pointers - in this case, this means generating lists up to length 4min-null-tree-depth
this is the minimum depth at which pointers can be nullpointers for generated values - in this case, this sets the minimum length for our linked lists to one.nondet-globals
is non-deterministically initialising global variables.CBMC version 5.11 (cbmc-5.11-1523-g419a958-dirty) 64-bit x86_64 linux [...] ** Results: example.c function initialize_global [initialize_global.unwind.0] line 17 unwinding assertion loop 0: SUCCESS example.c function initialize_other [initialize_other.unwind.0] line 32 unwinding assertion loop 0: SUCCESS example.c function check_list [check_list.assertion.1] line 42 assertion list_parameter != global_linked_list: SUCCESS [check_list.unwind.0] line 50 unwinding assertion loop 0: SUCCESS [check_list.assertion.2] line 55 assertion global_cursor->value == parameter_cursor->value: FAILURE ** 1 of 5 failed (2 iterations) VERIFICATION FAILED
We also have support for arrays (currently only for array function parameters, globals and struct members are considered for the future).
Take this example of an implementation of an is_prefix_of
function that should return true if the first string parameter prefix
is a prefix of the second one string
.
We can compile and run it like this:
We have the additional option associated-array-size
here. This is in the format <array-parameter-name>:<array-size-parameter-name>
and will cause the array parameter with namearray-parameter-name
to be initialised as an array, with the parameter array-size-parameter-name
holding its size (it should have some integral type like int
or size_t
).
Running the example shows the bug highlighted in the comments:
[...] [is_prefix_of.pointer_dereference.6] line 14 dereference failure: pointer outside object bounds in prefix[(signed long int)i]: FAILURE
By default, arrays are created with a minimum size of 1 and a maximum size of 2. These limits can be set with the --min-array-size
and --max-array-size
options.
If you have a function that takes an array parameter, but without an associated size parameter, you can also use the --treat-pointer-as-array <parameter-name>
option.
If you want to non-deterministically initialise a pointer as a C string (character array with last element ‘’\0'`) then you can do that like this:
Then call the following:
Note that C strings are supported by the same mechanism behind the non-deterministic initialisation of pointers and arrays, so the same command line arguments apply, in particular --associated-array-size
.
This will result in:
[...] ** Results: main.c function function [function.assertion.1] line 5 assertion pointer[size-1] == '\0': SUCCESS ** 0 of 1 failed (1 iterations) VERIFICATION SUCCESSFUL
The function-call
harness is used in situations in which we want to analyze a function in an arbitrary environment. If we want to analyze a function starting from a real program state, we can use the memory-snapshot
harness instead.
The snapshot of the program state of interest may be taken at a particular program location within a function (using the memory-analyzer
tool). In that case we want to generate a harness that behaves as if execution starts at a particular program location. The initial program location can be specified via the options --initial-goto-location <function>[:<location-number>]
or --initial-source-location <file>:<line-number>
.
Say we want to check the assertion in the following code:
Assume we are interested in the code represented by the clip()
function and its effect on the assertion below. To that end, we want to take a memory snapshot after the calls to get_random_value()
and get_one()
.
In order to take the snapshot with memory-analyzer
, we need to first compile the program which goto-gcc
, which produces a binary containing both native machine code and the corresponding goto program:
Then we can execute the program with memory-analyzer
and take a snapshot at the specified breakpoint. The variables to be included in the snapshot need to be specified via the --symbols
option.
We then generate a harness with goto-harness
that behaves as if execution started from the state given by the memory snapshot at the specified program location. We further overapproximate the value returned by get_random_value()
by havocking the variable x
.
We can now verify the resulting goto program with cbmc
:
This will result in:
[...] ** Results: main.c function main [main.assertion.1] line 50 assertion y + z <= 100: SUCCESS ** 0 of 1 failed (1 iterations) VERIFICATION SUCCESSFUL
Last modified: 2024-12-16 13:58:47 -0800