CProver manual
|
This document describes the semantics and usage of memory-related and pointer-related primitives in CBMC.
When CBMC analyzes a program, by default it uses the architectural parameters of the platform it is running on. That is, on a 64-bit system, CBMC will treat pointers as having 64 bits. This can be changed by various options (see section “C/C++ frontend options” in the output of cbmc --help
).
Memory is represented in CBMC as a set of objects. Each object represents a contiguous sequence of bytes and is identified via a numeric object ID. For example, assuming integers of width 4 and chars of width 1, a global integer variable would correspond to an object of size 4, and memory allocated via malloc(10)
would correspond to an object of size 10.
A pointer then consists of two parts: the upper n bits form the object ID, and the remaining bits form the offset. The object ID part holds the ID of the object the pointer is pointing to, and the offset part holds the byte offset within that object. The offset is signed.1 The null pointer is the pointer with object ID 0 and offset 0. CBMC uses 8 bits by default to represent the object ID. This can be changed via the --object-bits <n>
option.
There are three primitives which directly operate on the value of a pointer:
__CPROVER_size_t __CPROVER_POINTER_OBJECT(const void *p)
__CPROVER_ssize_t __CPROVER_POINTER_OFFSET(const void *p)
_Bool __CPROVER_same_object(const void *p, const void *q)
The primitive __CPROVER_POINTER_OBJECT(p)
retrieves the object ID part of a pointer, and the primitive __CPROVER_POINTER_OFFSET(p)
retrieves the offset part of a pointer. The __CPROVER_same_object(p, q)
primitive simply compares the object IDs of the two given pointers. That is, it is true if and only if __CPROVER_POINTER_OBJECT(p) == __CPROVER_POINTER_OBJECT(q)
. It is always valid to apply these three primitives to a pointer (i.e., they do not have any special preconditions).
Seeing that pointers consist of an object ID and an offset, it remains to describe how CBMC assigns object IDs to memory objects (such as local variables or malloced memory). CBMC deterministically assigns consecutive object IDs to memory objects as it encounters them. For example:
Here the pointers p
and q
would contain consecutive IDs in their object ID parts (as retrieved by __CPROVER_POINTER_OBJECT()
). Due to the deterministic assignment of object IDs, bugs that can only be exposed with specific pointer values cannot be found by CBMC. For example:
CBMC will report verification successful for this code snippet (assuming it assigns an object ID other than 0x0F to the malloced memory). However, assuming that malloc()
could allocate memory at any address, the assertion could fail.
Moreover, CBMC does not reuse object IDs for malloced memory. For example:
CBMC would report verification successful on this code snippet. However, assuming that malloc()
could reuse deallocated addresses, the assertion could fail.
The memory objects in CBMC are independent of each other. That is, for example, when incrementing a pointer past the bounds of an object, the pointer will never point into another memory object (such as could happen when running on a real machine). To verify that pointers stay within the bounds of their pointees, the CBMC option --pointer-overflow-check
can be used.
CBMC ships a model of malloc
that seeks to emulate the behaviour of the C standard library. This model is configurable to suit the assumptions the software under scrutiny may be making. One common assumption, matched by CBMC’s default configuration, is that dynamic memory allocation always succeeds and malloc
never returns a NULL
pointer. Code making such an assumption will look as follows:
This extends to the case of malloc(0)
, and CBMC returns a valid pointer to an object. The size of that object is zero, implying that any attempt to read from or write to this object will result in an out-of-bounds access. The ensuing undefined behaviour can be detected by running CBMC with --pointer-check
.
In an actual execution, however, memory allocation may fail for a number of reasons and malloc
would return a NULL pointer. CBMC’s model can, therefore, also be configured to fail allocating memory when the requested allocation size is larger than representable under CBMC’s object-offset model (as described in Memory and pointers in CBMC), or even non-deterministically fail (for any size). Any such failure can either result in calls to malloc
returning NULL
, or reporting such a call as a failed property. The following command line options facilitate the above failure configurations:
Flag | Check |
---|---|
--malloc-fail-null | return NULL when emulating an allocation failure |
--malloc-may-fail | non-deterministically fail to allocate |
Note that the use of --malloc-may-fail
also requires --malloc-fail-null
. The following code example demonstrates the effect of these options:
Under CBMC’s default model of malloc
, the else
branch is unreachable. When running CBMC with --malloc-fail-null --malloc-may-fail
, p
would non-deterministically be set to NULL
, making all branches in the above code reachable.
These malloc failure options need to be set when the C library model is added to the program. Typically this is upon invoking CBMC, but if the user has chosen to do so via goto-instrument (using goto-instrument --add-library
), then the malloc failure mode needs to be specified with that goto-instrument
invocation, i.e., as an option to goto-instrument
.
In verification tools, uninitialized variables are typically treated as having a nondeterministic value. Programs can thus be verified on a set of potential inputs. For example:
Here, the value of i
is nondeterministically chosen from all the possible integer values, and then constrained to negative values via the assumption. In CBMC, like uninitialized integers, uninitialized pointers are treated as having a nondeterministic value. That is, the value of the pointer itself is nondeterministically chosen, though no memory is allocated. Therefore, pointers should be explicitely initialized to ensure that they are backed by valid memory.
In this section, we describe further memory primitives of CBMC. Above, we have already encountered the primitives __CPROVER_POINTER_OBJECT(p)
, __CPROVER_POINTER_OFFSET(p)
, and __CPROVER_same_object(p, q)
. These primitives just retrieve the object ID part or offset part of a pointer, or compare the object ID parts of two pointers. It is always valid to apply these primitives to a pointer (i.e., they do not have any special preconditions).
In the following, we need the concept of a valid pointer. A pointer is valid if it points to a live memory object. That is, it points to the start or to somewhere within the sequence of bytes that makes up the memory object.
Conversely, a pointer is invalid if it is null, uninitialized, points to deallocated dynamic memory, points to an out-of-scope local variable, or has a value that does not point to (dynamically, automatically, or statically) allocated memory, or is out of bounds of the memory object it points to (i.e., the memory object identified by __CPROVER_POINTER_OBJECT(p)
).
The primitives below have unspecified semantics on pointers that are neither null nor valid. CBMC has an option --pointer-primitive-check
(see section Detecting potential misuses of memory primitives below) to check that pointers used in the primitives are either null or valid.
The following primitive can be used to retrieve the size of the memory object a pointer points to:
__CPROVER_size_t __CPROVER_OBJECT_SIZE(const void *p)
If p
is the null pointer, the primitive returns 0. If p
is valid, the primitive returns the size of the memory object the pointer points to. Otherwise, the semantics is unspecified. In particular, it is valid to apply this primitive to a pointer that points to within a memory object (i.e., not necessarily to the start). The result is the same as if the pointer would point to the start of the memory object (i.e., would have offset 0).
The following primitive can be used to check whether a pointer points to dynamic (malloced, heap) memory:
_Bool __CPROVER_DYNAMIC_OBJECT(const void *p)
If p
is the null pointer, the primitive returns false. If p
is valid, the primitive returns true if the pointer points to dynamically allocated memory, and false otherwise. If p
is neither null nor valid, the semantics is unspecified. Like __CPROVER_OBJECT_SIZE()
, it is valid to apply the primitive to pointers that point to within a memory object.
The following two primitives can be used to check whether there is a memory segment starting at the given pointer and extending for at least the given number of bytes:
_Bool __CPROVER_r_ok(const void *p, size_t size)
_Bool __CPROVER_w_ok(const void *p, size_t size)
At present, both primitives are equivalent as all memory in CBMC is considered both readable and writeable. The primitives return true if p
points to a live object and the object that p
points into extends to at least size
more bytes. Else, an assertion encompassing the primitive will be reported to fail. Do not use these primitives in assumptions when p
can be not valid as such use can yield spurious verification results.
As described above, the primitives listed in the Memory Primitives section require a pointer that is either null or valid to have well-defined semantics. CBMC has the option --pointer-primitive-check
to detect potential misuses of the memory primitives. It checks that the pointers that appear in the following primitives are either null or valid:
__CPROVER_OBJECT_SIZE
__CPROVER_DYNAMIC_OBJECT
__CPROVER_r_ok
__CPROVER_w_ok
The following three primitives have well-defined semantics even on invalid pointers. Thus, they have been excluded from the --pointer-primitive-check
option.
__CPROVER_POINTER_OBJECT
__CPROVER_POINTER_OFFSET
__CPROVER_same_object
Using them on invalid pointers, however, may still be unintended in user programs.
1 Pointers with negative offsets never point to memory objects. Negative values are used internally to detect pointer underflows.
Last modified: 2024-11-20 06:00:32 -0800