research-article

Open access

The Concept of Class Invariant in Object-oriented Programming

Authors:

Bertrand Meyer,

Alisa Arkadova,

Alexander KogtenkovAuthors Info & Claims

Formal Aspects of Computing, Volume 36, Issue 1

Article No.: 5, Pages 1 - 38

https://doi.org/10.1145/3626201

Published: 20 March 2024 Publication History

PDF eReader

Abstract

Class invariants—consistency constraints preserved by every operation on objects of a given type—are fundamental to building, understanding, and verifying object-oriented programs. For verification, however, they raise difficulties, which have not yet received a generally accepted solution. The present work introduces a proof rule meant to address these issues and allow verification tools to benefit from invariants.

It clarifies the notion of invariant and identifies the three associated problems: callbacks, furtive access, and reference leak. As an example, the 2016 Ethereum DAO bug, in which $50 million was stolen, resulted from a callback invalidating an invariant.

The discussion starts with a simplified model of computation and an associated proof rule, demonstrating its soundness. It then removes one by one the three simplifying assumptions, each removal raising one of the three issues and leading to a corresponding adaptation to the proof rule. The final version of the rule can tackle tricky examples, including “challenge problems” listed in the literature.

1 Class Invariants

The concept of the invariant, a property preserved by various operations, plays an important role in many areas of science, particularly mathematics and physics. Computing is no exception. Invariants arise on both sides of the basic computing duality: algorithms and data. For algorithms, reasoning on loops relies on the notion of loop invariant, a property obtained upon loop initialization then preserved by every subsequent iteration [19, 20, 25]. For data, reasoning on object structures relies on the notion of class invariant, obtained upon object creation and then preserved by every subsequent operation on an object. While the general concept of class invariant is clear, going back to a 1972 article [27], there is as yet no widely accepted semantic specification, as exists for loop invariants. The purpose of this article is to provide such a specification.

1.1 Basics

Class invariants exist because the values, or “fields,” making up objects are not arbitrary but often constrained by consistency properties. In an object representing a company, assets equal equity plus liabilities (“accounting equation” [58]). Every operation must preserve this property: If it modifies (say) the equity, then it must update at least one of the other two fields to ensure that it holds again.

In object-oriented (OO) programming, which structures programs based on the types of their data, or classes, such a property will be part of the invariant for the corresponding COMPANY class, part of its contract [39, 40]. Each operation of the class must, in addition to satisfying its own contract—assuming its precondition on entry, ensuring its postcondition on exit—preserve the invariant, part of the contract of the class, which in effect is “and”-ed to both its pre- and postconditions. Verification of OO programs relies on these rules.

This basic idea is straightforward, captured by a Groucho Marx quip in a cult scene of A Night at the Opera [36]: “That’s in EVERY contract: it’s called a sanity clause.” The more respectable software name for the “sanity clause” is the concept of class invariant, leading to an Invariant Hypothesis (Section 2.2), which makes it possible to reason about programs manipulating data structures. Three phenomena, made possible by the constructs of actual programming languages, can invalidate the Invariant Hypothesis, seeming to justify Chico Marx’s final retort in the scene: “There ain’t no sanity clause.” They are as follows:

—

Callbacks, which can find an object in a state not satisfying the invariant.

—

Furtive access to objects that are temporarily inconsistent.

—

Reference leak, arising from the presence of references (or pointers) and the associated phenomenon of aliasing, which may render an object inconsistent through operations on other objects.

Restrictions on the programming language yield a “Simple Model” of computation (Section 3), with a simple proof rule guaranteeing the Invariant Hypothesis. These restrictions are not acceptable for realistic programming, but removing them can (Section 4) raise the risks just cited. To remove these risks, Sections 5 to 8 successively drop the restrictions, updating the proof rule to preserve the Invariant Hypothesis. The final rule is suitable for verifying actual, unrestricted OO programs and, we hope, restore the faith in a sanity clause.

1.2 Contributions and Limitations

This work builds on the rich literature on class invariants in the past half-century and particularly the past decade (Section 11). Two distinctive features are as follows:

—

Proof rule. Many published discussions propose a “methodology” for handling class invariants. We feel that a more rigorous approach is needed in the form of an inference rule, as used in axiomatic (Hoare–Dijkstra-style) semantics to characterize all programming constructs, as in the use of loop invariants in the inference rule for loops.

—

No programmer annotations. Most approaches require programmers to add special elements, not affecting the program’s behavior, to help verification. In contrast, the proof rule of this article applies to programs as programmers would normally write them, without any such additions.

Examples of annotations (in approaches reviewed in Section 11) include wrap/unwrap instructions to express that at a given point objects satisfy, or not, their invariants; and “ownership” type annotations to express that certain objects (such as a list element) can only be accessed through others (such as a list header). They help the verification tools but complicate the programmers’ task and can turn them away from verification. Transferring as much of the burden as possible from programmers to the tools, the intent of the present work, is part of a general effort to make verification part of a normal programming process.

Other contributions include the following:

—

A theoretical analysis of the invariant concept (starting with Section 3). To our knowledge, the literature does not so far include a proper explanation of that concept including its subtleties.

—

A general model of OO computation enabling formal reasoning (again starting with Section 3).

—

As part of this model, the notion of a “Global Consistency Property,” an overall invariant property of the state of the entire heap.

—

A classification of the issues (Section 4), not always properly identified in the literature.

—

A correctness condition, key to avoiding wrap/unwrap annotations: Objects must satisfy their own invariants before a qualified call (Section 5).

—

The use of information hiding mechanisms and specifically of selective exports to address furtive access (Section 6). To our knowledge, no approach so far has made the connection between invariant issues and export specifications. Section 7 removes any undue burden on programmers.

—

A simple solution (Section 8) to the tricky issue of reference leak, non-optimal but easy to implement in a prover—along with a more general solution that is harder to integrate in a modular tool.

—

Proofs of soundness for the techniques described (Sections 5 to 8).

—

On the notation side, a concise new way (Section 2.4) of expressing inference rules of axiomatic semantics, making it easy to compare candidate rules.

Limitations are discussed in Section 12; in particular, the rule does not yet cover inheritance, which will be addressed in a follow-up, and the soundness proof has not yet been mechanized.

1.3 Progression and Preview

The main theoretical result of this work is a proof rule, whose final versions appear in Sections 8.1 to 8.3. The discussion starts from a simple version (reflecting the intuitive notion of invariant), which holds in the Simple Model but does not handle the three obstacles that arise in practice; it then refines it in subsequent sections to overcome these obstacles one after the other. The goal of this step-by-step approach is to let the reader understand the precise justification for all components of the rule and the subtleties of the concepts involved.

For the reader who would like to have a preview of the overall result, the key inference rule derived from this discussion is (Section 8.3)

characterizing the semantics of a call x.r (a) to a routine r with body body $_r$ and formal arguments f. Pre $_r$ and Post $_r$ , applicable to both f and the call’s actual arguments a, are r’s pre- and postconditions. INV denotes the invariant of respective classes (client and supplier); it can be applied to specific objects, as in x.INV. INV_LOC is the part of INV that only depends on an object’s fields, not on other objects. Finally, INV / r is the result of “slicing” an invariant by a routine: removing all clauses of INV that have more visibility, in the sense of information hiding, than r. (Section 7 will introduce a small optional optimization allowing slightly more liberal versions of the conditions INV / r and a.INV / r.)

This rule has variants and complements (for initialization, for pure queries) but is the basis for proofs of the literature’s “challenge problems” in Section 10.

1.4 Terminology

Presentations of object-oriented programming vary in their terminology, often influenced by a specific programming language. The present article uses the following consistent terminology, going back to Reference [38]:

—

An object is a runtime instance of a class. (“Class” is a static concept, denoting a part of the program text, and “object” a dynamic concept.)

—

Type means the same as “class” for this discussion, so that “an object of type T” means the same as “instance of T,” and a program may declare a variable “of type T” to specify that its runtime values will be references to instances of T. (“Type” is in fact a more general concept, particularly in the presence of generic classes, but the differences do not matter for this article. More generally, typing, static or dynamic, plays no role in the discussion.)

—

An object is a collection of individual values or fields. In the corresponding class, each field is defined by an attribute (sometimes also called field, but it is better to distinguish the static and dynamic concepts).

—

Classes define operations, or features, on their instances. A feature can be a command, also known as a procedure, which can change one or more objects but does not return a result, or a query, which returns information about the object. (In C++, the term for “feature” is “member.”)

—

A query (such as “obtain a bank account’s balance”) can be implemented as an attribute (look up the balance field in the object) or a function (define an algorithm to compute the balance from successive deposits and withdrawals).

—

A routine, or “method”, is an algorithm applicable to the instances of a class; routines cover both procedures and functions.

We distinguish between the “class invariant,” a consistency constraint defined in a class text, applying to all instances of the class, and an “object invariant,” its application to a given instance. (For no apparent reason, some work cited in Section 11 uses “object invariant” to mean “class invariant,” even though the latter term has been around for 50 years.) Note that a class invariant, and corresponding object invariants for instances, denote a property of the fields of a single object, even though some of these fields can be references to other objects and the invariant may involve queries on them. A property that describes the combined consistency of several objects is called a multi-object invariant.

2 Working with Invariants

Class invariants are a tool for reasoning about the correctness of programs manipulating possibly complex runtime data structures. We review the concept, first informally then through an initial proof rule that captures the essential simplicity and elegance of the class-invariant concept.

Fig. 1.

2.1 Working with Heaps

A runtime object structure is made of objects with mutual references and is called a heap. Figure 1 shows an illustration of a small part of a possible heap. Objects are, as noted, made of “fields.” Some fields, such as age, are basic values such as integers and strings, while others, such as spouse, are references to other objects. We say that such an object is attached to the reference as represented in the program; for example, the bottom object in Figure 1 is attached to the city field of the top-left object. The objects in the figure represent three persons and one city; some fields have been left blank.

A “qualified call” x.r (a) executes a routine r on a target object denoted by x, with (optional) arguments a. We may view the life of a single object S during execution as a sequence of such calls, illustrated by Figure 2. After an initial creation, through a “constructor” or “creation procedure” called make in the figure, the object executes successive operations (q, r, t) as a result of qualified calls from client objects (which will appear in a more complete version of the figure, Figure 3 in Section 2.8). Figure 2 shows the successive states (S $_1$ , S $_2$ , S $_3$ , ...) of the object: after creation (make) and then before and after every qualified call (to routines q, r, t, ...).

Fig. 2.

Every object is (Section 1.4) an instance of a class defined in the program text. The objects in Figure 1 are assumed to be instances of classes PERSON and CITY. Such classes typically specify consistency constraints on their instances, which make up the class invariant. For example (Section 1.1), the invariant of a COMPANY class may specify assets = equity + liabilities. In Figure 1, the invariant of class PERSON may include city = spouse.city (I live in the same city as my spouse) and spouse.spouse = Current (the spouse of my spouse is myself, with Current denoting the current object as explained below).

2.2 The Invariant Hypothesis and the Scandalous Obligation

Invariants such as the ones cited are crucial for reasoning about programs. In spite of the name, a class invariant does not need to hold at all times; it can be temporarily violated during a routine’s execution, as at the marked point in r’s execution in Figure 2. But it must hold in states (S $_1$ , etc.) where client objects can start a new call. An object that satisfies its invariant is said to be consistent.

This fundamental property is the “Invariant Hypothesis”:

(“Initial” because a slightly revised version will appear in Section 6.5.) The “target” in x.r (a) is the object denoted by x. A “consistent” object satisfies the invariant of its class. The Invariant Hypothesis enables both program construction and verification to rely on the class invariant, by assuming that every object satisfies it whenever it is accessible from the rest of the system (in states S $_1$ , S $_2$ ...). We deduce this property simply by showing that every creation procedure (constructor) of the class ensures it (in state S $_1$ in Figure 2) and every exported feature (every feature r that clients can use, in calls of the form x.r (a)) preserves it.

This mode of reasoning takes after mathematical induction (of which invariants, whether for classes or loops, are the application to computing). To deduce a property P of all natural numbers it suffices to prove that $P (0)$ holds and that $P (n + 1)$ follows from $P (n)$ for any n. Then we can assert, for example, $P (386)$ . We do not need to go back to $P (385)$ , then $P (384)$ , and so on.

To understand the importance of the Hypothesis, one may consider what happens if it does not hold. The invariant would just be a clause repeated explicitly in the pre- and postconditions of every routine, which would be expressed as Pre $\ \wedge \$\) INV and Post $\ \wedge \$\) INV, where INV captures common elements. But then every caller would have to establish x.INV before every call x.r (a). We call this transfer of responsibility from a class to its clients the Scandalous Obligation.

The Scandalous Obligation would render the notion of invariant largely useless (it is the equivalent of forcing the proof of $P (386)$ to go over all previous values). Also note that the invariant may include secret (private) features, which clients cannot directly affect. The rest of this article shows how to ensure the Invariant Hypothesis, which unleashes the induction-like power of the invariant concept by enabling clients always to assume, prior to a call, that the supplier object is consistent (satisfies its invariant). A model for OO computation, such as those appearing in Sections 3 to 8, will only be considered sound if it guarantees the Hypothesis.

An issue that does not arise with mathematical induction is that the Invariant Hypothesis only talks about the start of qualified calls: What about their end? As Figure 2 suggests, nothing happens to an object between the end of a qualified call of target S and the start of the next call on S, but something could happen to other objects that invalidates S’s invariant (reference leak, studied in Section 8). Separately from the Invariant Hypothesis, the postconditions of all rules will ensure that objects are consistent on call completion too.

We now turn to a formal expression of these concepts.

2.3 Background: A Classical Proof Rule for Routine Calls

Axiomatic reasoning on programs uses “Hoare-style” inference rules such as this one for non-OO routine calls (ignoring recursion and termination) [26],

Such a rule is a proof technique: Below the line appears a property of a construct, here the call r (a); the rule states that we may deduce it by proving the property above the line, applicable to the body of r (f denoting its formal arguments). The properties involved are “Hoare triples” of the form {P} A {Q}, meaning that executing A in a state satisfying P will yield one satisfying Q.

Classic states that the effect of a call is that of executing the body after substituting actual for formal arguments. It captures the role of routines (methods): abstracting a computation by giving it a name and parameterizing it.

In an OO context, Classic remains applicable to unqualified calls r (a), executed on the current object (see 2.11). Also, if we ignore the invariant, to qualified calls:

The No-Invariant rule is conceptually the same as Classic, but takes into account the role, in OO programming, of the target of a qualified call, here x, which it simply treats as if it were an extra argument.

2.4 Notation

The rest of the discussion will introduce variants of the rule No-Invariant. Since they have many elements in common, we can rely on a concise tabular notation to avoid repetitions and emphasize the distinctive novelty of each variant. This notation expresses No-Invariant as the highlighted part of the following table (the rest provides explanations, not repeated in later rules):

Since the context is always the same, the notation further drops r (i.e., Pre stands for Pre $_r$ and so on). On the supplier side, INV will stand for the invariant of class SC; on the client side, for the invariant of class CC. Since x is of type SC, x.INV stands for the invariant of SC applied to x.

2.5 Sound But Useless Invariant Integration

No-Invariant does not recognize the role of the invariant; the author of the code would have to add it manually to Pre and Post for every exported routine r. Rule Pointless is a first attempt at integrating the invariant explicitly:

(Multiple assertions in a box, such as INV and Pre in BS $_1$ and BS $_2$ , are to be combined by conjunction. In mnemonics for entries, BS $_1$ , and so on, B stands for before and A for after, S for supplier and C for client.)

This rule and the ones that follow only talk about preservation of the invariant for an object that already satisfies it as a result of proper initialization. Section 9.1 will introduce a creation rule covering initialization.

2.6 The Invariant Preservation Property

The reason for calling the previous rule Pointless is that it fails to take advantage of the invariant. It is in fact the same as No-Invariant, with pre- and postconditions extended with the invariant.

The disappointment is clause BC, requiring clients to obtain x.INV prior to calls. This approach suffers from the Scandalous Obligation (2.2), losing the basic idea of invariants expressed by the Invariant Hypothesis. To satisfy the Hypothesis, we need a rule in which the invariant for the target object:

—

Will hold after the call (as represented by the presence of x.INV in AC).

—

May be assumed to hold before the call (BC), without transferring the responsibility to the caller.

Under such assumptions, one may prove the correctness of any call, in the form

with no x.INV on the precondition side, simply by having proved once and for all, independently of any callers, the correctness of the routine, in the form

Proving a class correct means proving the IPP (plus initialization, see 9.1) for its exported routines, which then suffices (this is what the Invariant Hypothesis means) for proving the correctness of any call (CCP, without x.INV on the left).

2.7 The Ideal Rule

Removing x.INV in BC gives an “ideal” rule, which satisfies the Invariant Hypothesis and in a simpler world would be the final word:

(Since the Pre and Post parts are not specific to OO programming and apply equally to all rule variants, we omit them from now on.) The novelty is the highlighted empty BC entry, which does not contain the requirement x.INV on the caller. If the Ideal rule applies, then the callers to a routine satisfying the CCP:

—

Never have to worry about guaranteeing x.INV on entry (as this property was ensured on creation, then preserved by previous operations); that is, are spared the Scandalous Obligation.

—

Are always able to rely on x.INV on exit (since re-establishing this sanity clause is part of the contract of r).

The Ideal rule is the direct formal expression of the Invariant Hypothesis, and the formal embodiment of the notion of class invariant. (The antecedent of the rule—BS and AS—expresses the Invariant Preservation Property.) A “Simple Model” of computation, described in the following section, satisfies it. The vagaries of actual programming in realistic languages will threaten to bring back the Scandalous Obligation and require adjustments to the rule.

2.8 The Runtime Model

To understand how the Ideal rule applies and where it requires adaptation, one must put the lifecycle of a single object (Figure 2) in the broader context of an entire OO system’s execution, involving any number of objects (as in the heap of Figure 1). Figure 3 includes, along with the original supplier object S, the history—after initialization—of two of its client objects, C1 and C2.

Fig. 3.

As in the earlier figure, S executes routines q, r...; Figure 3 shows that these executions result from qualified calls coming from clients C1 and C2. C1 calls q on S, and at some later time C2 calls r on S.

2.9 Current Object, Active Objects, Active Routines

At any point during execution, one of the objects is the “current object,” known as Current (or “this” in languages such as Java). Each execution step is either

—

An internal operation, which only affects the state of the current object. An example is an assignment modifying one of the object’s fields.

—

An external operation, in the form of a qualified call x.r (a), which triggers a routine on another object, denoted by x. That object becomes current for the duration of the routine’s execution.

Execution starts with a “root” object, the first “current.” As a result of qualified calls, other objects become current. At any time, objects are partitioned into two categories (Figure 4): objects in the call chain, starting from the root down to ending with the current object, called active; all others, called inactive. (In concurrent programming, execution may include more than one root and more than one call chain.) Note that an object S is active if it has started and not yet completed a qualified call x.r (a). We call r the “routine of” the active object; r and other routines along the call chain are the “active routines.” Note that the current object is active, although the semantic rules will treat it differently from other active objects. Inactive objects appear in the lower part of Figure 4.

As suggested in that figure, fields of objects in any category may be references to objects of any other.

Fig. 4.

2.10 Consistent and Inconsistent Objects

We saw in Section 2.2 that during execution some objects may temporarily become inconsistent (violate their invariants). The objects in Figure 3 go from consistent state to consistent state (blue squares), traversing inconsistent states in between. In particular, internal operations often break the invariant: with the invariant clause assets = equity + liabilities, after a change to equity the invariant no longer holds. The routine must restore it by changing one or both other fields.

The key formal property of an OO computational model is the specification, represented by a thick black rectangle in Figure 4, of which objects must be consistent. It will be known as a Global Consistency Property or GCP. Figure 4 illustrates version GCP-0, whose definition in Section 3.2 below will state that among non-current objects only active objects are permitted to be inconsistent. They include the current object, which in all models can be inconsistent, since it may be executing an internal operation. Later models with their associated proof rules, representing increasingly sophisticated forms of OO computation, use different variants, GCP-1 to GCP-3. For each variant, we will need to prove two results:

—

A Global Consistency Lemma stating that every runtime operation preserves the relevant GCP.

—

A Soundness Theorem (following from the Lemma), stating that the rule’s conclusion follows from its antecedent, and that the Invariant Hypothesis holds (every qualified call finds its target in a consistent state).

(A Global Consistency Property is a global invariant on the heap. To avoid confusion, the discussion will reserve “invariant” for class and object invariants.)

2.11 Qualified and Unqualified Calls

Along with qualified calls x.r (a), OO languages support unqualified calls r (a), which apply a routine r to the current object. To avoid any confusion (sometimes found in the literature), note that the rules on class invariants apply to qualified calls only. An unqualified call r (a) can be understood (“inlined”) as the sequence of instructions making up the body of r (accounting for possible recursion), and is not subject to the invariant: It may not assume it on entry and does not have to ensure it on exit. Accordingly, the applicable rule for such a call is the traditional non-OO call rule: Classic from Section 2.3, applied to the current object. What determines the rule to apply is not whether r is exported, but whether a given call is qualified, x.r (a), or unqualified, r (a). (Note that Current.r (a) and r (a), which perform the same computation, are bound by different rules: the first is subject to the invariant, the second is not.)

A routine r involved in an unqualified call can, of course, include a qualified call, but that call is only relevant for the process of proving r itself correct.

3 The Simple Model

What determines whether the Ideal rule and Invariant Hypothesis hold is both what properties invariants may state about objects and how operations (in the computational model presented above) may change them. The “Simple Model” of this section meets the rule and the hypothesis by severely restricting the expressive power of OO programs. Its proof of soundness in the rest of the section corresponds to proofs about simplified OO models in the literature.

3.1 The Restrictions

The Simple Model makes three assumptions:

Assumptions in the Simple Model
[SM1] There is no callback. (“No-callback”.)
[SM2] Invariant Preservation Property (2.6): every routine that can be used in a
qualified call preserves the class invariant (“Invariant preservation”.)
[SM3] Every invariant is a function of the object’s fields only. (“Invariant locality.”)

A “callback,” “direct,” or “indirect” is a qualified call whose target is (respectively) the current object or another active object. (An illustration of an indirect callback will appear in Figure 8, Section 6.7.) Assumption SM1 rules out both kinds.

Assumption SM2, Invariant Preservation, expresses the invariant concept in its simplest form. It is guaranteed by the antecedent part of the Ideal rule (Section 2.7), BS and AS (Invariant Preservation Property as seen in Section 2.6).

SM3, Invariant Locality, excludes any invariant clause x.some_property where x is a reference to another object: the invariant depends on the fields of a single object. A clause such as assets = equity + liabilities, involving only integer fields, satisfies this condition, but not any of the example invariants on PERSON, such as spouse.spouse = Current, which involves another object. These examples indicate that the restriction is severe.

A formal OO framework consists of a programming language (with any associated restrictions), a computational model, and Hoare-style inference rules for qualified calls (and initialization as will be added in Section 9.1). We say that the framework is sound if the computational model satisfies the rules.

We will now prove that the Simple Model with the Ideal rule is sound.

3.2 Global Consistency Property

The Simple Model’s key property is that it preserves Global Consistency, version GCP-0 (Section 2.10, see Figure 4): at all times, all inactive objects are consistent.

Global Consistency Lemma: in the Simple Model, all runtime events preserve GCP-0.

Proof: While specifying the full semantics of a program’s execution requires a formal model of the entire language, the analysis of invariant-related behavior need only consider runtime events of three kinds (recognizable in Figure 2):

—

Internal operation: perform an internal operation on the current object.

—

Start-of-call: start a qualified call x.r (a).

—

End-of-call: terminate such a call, returning to the calling object and routine.

The following table characterizes each in terms of (1) whether it affects which object is “current” and (2) whether it affects the invariant of any object.

Event type	Which object	Possibly invalidated
	becomes current?	invariants
Internal operation	No change (current	Invariant of
	remains current)	current object
Start of call x.r (a)	Object attached to x	None
End of call	Previously current object	None

Events of each kind preserve GCP-0:

Internal operation: Because of invariant locality (assumption SM3 in the Simple Model), the only invariant that the operation could invalidate is the invariant of the current object (highlighted entry). GCP-0, which does not require the current object to be consistent, is not affected.

Start-of-call: The target (the object attached to x) becomes current. The previous current and all other active objects up the call chain remain active. Since no object becomes newly inactive, GCP-0, which only constrains inactives, continues to hold. Note that the target (the new current) is consistent anyway, since it was inactive by the assumption SM1 of no callbacks.

End-of-call: Two objects are involved, the target of the call, say, S, and the previously current object, C. S becomes inactive again (otherwise the call would have been a callback, violating SM1). As we just saw in G2, S was consistent before the call. Thanks to SM2 (invariant preservation), it is now consistent again. C becomes current and hence does not have to satisfy its invariant under GCP-0.

This proof required all three assumptions SM1, SM2, and SM3 of the Simple Model. Subsequent sections will adapt it as we remove them one by one.

In passing, we have proved (discussion of G2 and G3) that the Simple Model satisfies the Consistent Target Theorem (CTT): On entry and exit of a qualified call, the target is consistent.

3.3 The Simple Model Is Sound

Simple Model Theorem: For the Simple Model, the Ideal rule is sound.

Proof: The property to establish is that if a routine has been proved correct (antecedent of the rule, with INV in both the pre- and postconditions of the routine, entries BS and AS), the postcondition x.INV (entry AC) will hold upon completion of any qualified call. This property immediately follows from the CTT theorem above, which also implies that the Invariant Hypothesis holds.

This result justifies leaving the BC entry blank in the rule: The client does not need to worry about x.INV before the call, avoiding the Scandalous Obligation.

An important observation applies to proofs of soundness for Ideal and its followers. All of them have the postconditions INV for the supplier (entry AS) and correspondingly x.INV for the client (AC). Unlike Pointless, however, they do not include the precondition x.INV for the client (entry BC); Ideal itself has nothing there. If we are able to prove the Invariant Hypothesis, then we guarantee that x.INV holds on start-of-call and hence can simply apply Pointless to derive the postcondition AC. As a result, to prove such a model sound it suffices to prove that its satisfies the Invariant Hypothesis.

4 Threats to the Invariant Hypothesis

Three phenomena can invalidate the properties of the Simple Model in the world of real OO programming: callbacks, furtive access, and reference leak. They are of a different nature (although sometimes commingled in the literature) and subject to different remedies. Each results from renouncing one of the assumptions of the Simple Model. Callbacks and furtive access are closely related; reference leak is of a different nature, arising from the use of references to objects in OO programming and the resulting possibility of aliasing.

Callback: If we drop assumption SM1, then the routine of a qualified call (such as q in the call S.q (a), executed on behalf of C1 in Figure 3) may execute another qualified call with the same target (a call C1.v (x)). Then the execution may find that object (here C1) in an inconsistent state. The callback can be indirect (to a caller’s caller and so on up the call chain). As to the practicality of this issue, recall that the 2016 $50-million Ethereum cryptocurrency heist resulted from an exploited consistency violation in a callback [53].

Furtive access: While normally every routine r in a call x.r (a) must preserve the invariant, it is often necessary in practice, as an intermediate step, to allow a call to a utility routine that does not preserve it, violating SM2 and, as a consequence, breaking the Invariant Hypothesis.

Reference leak: If we drop SM3 (invariant locality), even if every operation on an object A preserves its invariant, then an operation on another object B, to which A holds a reference, may break it.

4.1 The Marriage Example

A simple example, introduced in Reference [41], is subject to all three issues (other examples appear in Section 10). It involves a single class PERSON and a single routine of interest, marry. The class has attributes

where spouse is declared detachable as it can be a void (or “null”) reference. It has the following invariant:

expressing the relationship between the attributes. ( $\Rightarrow$ denotes implication, and a name before a colon in an assertion clause, such as distinct, is a tag identifying the clause.) is_married could be a function (returning true if and only if spouse $\ne \$\) Void), but defining it as an attribute helps reveal potential problems.

4.2 Recursive Marriage and Callbacks

Reference [41] analyzes the difficulty of writing a procedure marry (other: PERSON) that will result in setting the spouse field of the current object to other and its is_married field to True. The procedure has as a precondition that both the current object and other are unmarried. A straightforward recursive implementation, where marry calls other.marry (Current), will not work, because this call will violate the second precondition. The following variant avoids this problem:

The call other.spouse is, as marked, a callback. It finds the original object inconsistent, violating married_iff_has_spouse, since spouse is not Void but is_married is still False to satisfy the precondition. No reshuffling of instructions will (to our knowledge) satisfy both the preconditions and the invariant. All code is available in the article’s repository [3] for the reader to try out.

4.3 Non-recursive Marriage and Furtive Access

After callbacks, furtive access. In line with Reference [41] it is possible to write a non-recursive version with auxiliary routines:

both visible, for obvious reasons of information hiding, only to PERSON (that is, not usable from other classes). The callback goes away, but furtive access arises, in the form of violated invariants, for example if we write marry as follows (pre- and postconditions are as with the preceding version and omitted for brevity):

Before instruction 4, the clause married_iff_has_spouse of other’s object invariant does not hold, violating the Invariant Hypothesis. With the elementary semantics of invariants as suggested so far, such a violation will arise with any reordering of the four: As soon as one of instructions 2 and 4 has been executed, other will violate its clause married_iff_has_spouse (and executing the other one of these instructions would be the only way to restore it).

The invariant at issue here is not the invariant of the current object but the invariant of other. It is frustrating that these violations cause trouble, because they are all intuitively harmless: As in an internal (unqualified) call, the intermediate states of other can be inconsistent as long as the final one is consistent. In software verification, however, there is no simple equivalent of “trust me, I know what I am doing, everything will be fine in the end!”

4.4 Reference Leak

Fig. 5.

Finally, reference leak. In the same example, assume that we somehow have obtained a solution for callbacks and furtive access and a suitable version of marry. A reference leak will occur, illustrated in Figure 5, if some code using references Alice, Bob, and Charles to distinct objects performs

where divorce simply sets is_married to false and spouse to Void (its code appears in Section 10.1). Calling divorce is necessary, because marry’s precondition. After these calls, Alice.spouse is Charles, but Bob.spouse is still Alice, so Bob.spouse.spouse is Charles, causing Bob to violate spouse.spouse = Current.

The code is intuitively buggy: divorce should reset the is_married and spouse fields of the previous spouse. But divorce satisfies its contract (with obvious pre- and postconditions), and in particular it preserves the class invariant for the target object, here Alice. The scary part is that one may break another object’s invariant, here Bob’s, without performing any operation on it. Any proof technique relying on a simple invariant concept (as in the Simple Model) will miss the bug as all routines satisfy their contracts.

Stating that the code is “buggy” means, more formally, that it violates the Invariant Hypothesis. The violation will only cause trouble, however, in the next qualified call, if any, of target Bob. This property is the crux of the reference leak problem: It does not immediately manifest itself but results in a corrupted data structure, with some sitting ducks—inconsistent objects—ready to fall prey to the next qualified call.

The rest of this article develops a sound proof rule addressing all three issues.

5 Addressing Callbacks: Wrap Up Before Calls

We need a rule that accounts for callbacks. Most published solutions require programmer annotations, which—as noted—limits their practicality. Here the proof rule will integrate the sanitization of callbacks. The GCP becomes stronger (before being weakened again in the next step, Section 6).

5.1 Wrapping Up before Going Out

The basic observation is that prior to executing a qualified call an object must make itself amenable to callbacks, meaning that it should guarantee its invariant. Hence the next version of the rule:

For clarity, occurrences of INV in the statement of this rule refers to the corresponding class: S for the supplier side, C for the client side. The new element, highlighted in BC, is INV $_C$ . The use of C instead of S and of INV rather than x.INV is not a typo: We are talking of the invariant of the calling object. Compare with the presence in Pointless, at the same position BC, of x.INV, meaning x.INV $_S$ , imposing a Scandalous Obligation on the caller. Here, by using INV $_C$ , the object takes care of ensuring its invariant before starting a qualified call (like a person who cleans up the house before going out on a walk, just in case someone comes to visit during that time).

The requirement of wrapping up before a qualified call will be loosened in Section 6, and further in Section 7, which introduces a “smart” version of the invariant (removing the wrapping-up requirement if no callbacks may occur).

In some methodologies, as noted, ensuring the invariant before an outgoing call is the task of a special wrap instruction (Section 11). The Callbacks rule achieves the wrapping-up without imposing an annotation on programmers.

5.2 Computational Model for Handling Callbacks

The computational model corresponding to rule Callbacks, which we call the Callback Model, uses a new version GCP-1 of Global Consistency.

Compared to GCP-0 from the Simple Model, GCP-1 restricts the set of objects that may violate their invariants, now limited to at most one object, the current object. Figure 6 (which drops inter-object references, not needed for the discussion) is the counterpart for GCP-1 to Figure 4 for GCP-0.

Fig. 6.

The Global Consistency Lemma now states that all operations, assuming SM2 and SM3 (but no longer SM1, invariant preservation) preserve GCP-1. As before (Section 3.2), each step demonstrates this property for one of the three types of event:

Internal operation: Same reasoning as for GCP-0: Since we still assume invariant locality (SM3), the only object that may become inconsistent is the current object, which GCP-1 does not require to be consistent.

Start-of-call: Here the reasoning changes from GCP-0. We no longer assume SM1 (absence of callbacks) but now, remarkably, all objects will be consistent, ensuring GCP-1: the previous current C thanks to precondition INV $_C$ in entry BC; all other objects, which were already consistent by GCP-1. In particular, the target, which is the same as C in the case of a direct callback, is consistent.

End-of-call: The call’s target S satisfies its invariant thanks to the postcondition x.INV $_S$ in AC; all other objects satisfy theirs, so GCP-1 is preserved. Note that the previous current, which becomes current again, might be the same as S (direct callback) or another previously active object (indirect callback), and is consistent in all cases.

In passing we have proved an extreme form of the GCP, the Quasi-Constant Consistency Theorem (QCCT-1): In the Callback Model, in all states other than immediately after an internal operation, all objects are consistent. (Steps G2 and G3 of the proof showed that consistency holds of all objects after start-of-call and end-of-call, and internal events are the only other case.)

QCCT-1 is the closest we will ever come to a literal reading of the word “invariant” as a property that always (in this case, almost always) holds.

To prove the soundness of the Callback model for rule Callbacks, it suffices (see Section 3.3) to note that it satisfies the Invariant Hypothesis. That property is part of QCCT-1: The target is consistent after start-of-call. (QCCT-1 also tells us that the target is consistent after end-of-call.)

6 Addressing Furtive Access: the Slicing Model

Furtive access arises because of calls such as other.set_married (in non-recursive marriage, Section 4.3), where other does not satisfy its invariant, but where we do not expect it to; in fact, the call is expressly intended as a step toward re-establishing that invariant. It is paradoxical and frustrating to be punished with an invariant violation just as we are working toward the invariant.

Although qualified, such calls are conceptually comparable to internal, unqualified calls r (a), which can break the invariant of an object as part of a strategy to restore it. Legitimate cases of furtive access arise when two or more objects are involved in a similar strategy, trying to restore each other’s invariants through steps that may temporarily violate some of them. The Observer example (Sections 10.3 and 10.4) is typical, involving a subject and observer objects.

The Slicing Model, which will be described now, is a small update to the Callback model, addressing furtive access. The key observation for obtaining an annotation-free solution is that in practice legitimate cases of furtive access arise for calls to selectively exported features.

6.1 Selective Export Mechanisms

One of the object-oriented principles is information hiding (introduced by Parnas [50] in the same year as the first article mentioning class invariants [27]), which lets each class define which of its features and other properties it makes available (“exports”) to others (its clients). Many OO languages offer ways to fine-tune this decision through selective export, whereby a class exports specific features to specific clients; examples are the “friends” mechanism of C++ and “internal” accessibility in C#. In Eiffel, one may declare a feature in a clause starting with feature {A, B, &} to specify that it is only exported (or “visible” per the terminology defined below in Section 6.3) to classes A, B, … and their descendants.

Previous work seems to have missed the relationship of selective exports to the furtive access problem of class-invariant semantics, and its potential role in solving the problem without requiring additional programmer annotations.

6.2 Taking Advantage of Selected Exports

Observation of utility routines such as set_married and set_spouse and their counterparts in other furtive-access-prone code (such as the Observer pattern) reveals that in properly written code they are not publicly exported. Instead, they are selectively exported to the relevant class, in this case PERSON itself. As a consequence, while a call Alice.marry_stepwise (Bob) (Section 4.3) may appear in any class (since all versions of marry should be publicly exported), other.set_married is only permitted in class PERSON (and descendants).

Using information hiding criteria in the proof rule combines design principles and correctness concerns. The reasoning is that a routine should not have to account for invariant clauses of visibility higher than its own. All three invariant clauses of PERSON involve public features, spouse and is_married, of higher availability than set_spouse and set_married; they should not have to preserve these clauses. The situation with examples such as Observer is similar.

6.3 Slicing: Definitions

As a reminder, an invariant is made of a sequence of clauses; the one for PERSON (Section 4.1) has three clauses (here without tags): spouse $\ne$ Current, is_married = (spouse $\ne$ Void) and is_married $\Rightarrow$ (spouse.spouse = Current). The invariant is understood as the sequential conjunction (“and then”) of these clauses.

Regardless of the language mechanism for selective exports, the visibility V (r)of a feature r of a class C is defined as the set of classes that may use r as clients, in qualified calls x.r (a) with x of type C. The example has assumed so far (Section 4) that the features of class PERSON have full visibility (they are exported to all classes) except for two of them, exported only to class PERSON itself: V (set_married) = V (set_spouse) = {PERSON}. Visibility gives a preorder relation (reflexive and transitive) on the features of a class: we write r $\le$ g to express that V (r) $\subseteq$ V (g) (g is exported to the same classes as r, and possibly to some additional ones).

The queries of an invariant clause are the queries that appear in it either by themselves or as targets of calls. For example, the queries of

a and b.c

are a and b (but not c, which is the feature of a qualified call, not a feature of the invariant expression itself).

Visibility, defined above for individual features, also applies to invariant clauses. The visibility V (CL)of an invariant clause CL is the intersection $\cap$ V (q) of the visibilities of all its queries q. The preorder relation also follows: CL1 $\le$ CL2 means that $\cap$ V (q1) $\subseteq$ $\cap$ V (q2) for queries q1 of CL1 and q2 of CL2; in other words, for any class C to which all queries of CL1 is exported, all queries of CL2 are also exported to C. The preorder relation can similarly hold between invariant clauses and features: CL $\le$ r means that V (CL) $\subseteq$ V (r), that is to say, r is exported to all classes to which all queries of CL are exported.

Finally, if INV is the invariant of a class, then its slicing by a feature r of that class, written INV / r, is INV restricted to its clauses CL such that CL $\le$ r. In other words, INV / r is the part of INV that has no more visibility than r.

The clauses of the invariant of class PERSON (repeated at the beginning of this subsection) all have spouse or is_married among their features, and hence have full visibility, since these features are public (exported to all classes). Since marry, in all its versions, is also public, INV / marry is the full INV. However, INV / set_spouse and INV / set_married are empty invariants, since the given features have lesser visibility (they are exported to PERSON only). In the examples summarized in Section 10, sliced invariants are in-between these extremes.

The following Slicing Implication Theorem holds: If r $\le$ g, then INV / g $\Rightarrow$ INV / r. Proof: for any clause CL of INV / r, CL $\le$ r by the definition of slicing. Hence CL $\le$ g, implying—again by the definition of slicing, applied now to INV / g—that CL is one of the clauses of INV / g. So all the clauses of INV / r are also clauses of INV / g. As a consequence, remembering that an invariant denotes the conjunction of its clauses, INV / g implies INV / r.

A caveat about slicing is that some invariant clauses may need reinforcing to remain defined. For example, a good prover will accept an invariant clause c involving 1 / y if earlier clauses state x = y and x > 0; slicing x off removes these clauses and makes c potentially undefined, hence useless for verification. It is the programmer’s responsibility to make sure that all invariant clauses are meaningful, pre- and post-slicing (here, precede c’s condition by y $\ne$ 0 $\Rightarrow$ ).

6.4 Extended-active Objects

Active objects have been defined as the objects in the call chain (Figures 4 and 6). Each is executing “its call” to “its routine,” r, s, t in the figures. These routines can have arguments of reference types. The term extended-active objects will cover active objects plus those attached to arguments of calls in the call chain. (Remember that an object is “attached to” a reference if that reference points to it.) For brevity, “inactive” will from now on denote all non-extended-active objects (in the thick rectangle at the bottom of Figure 7). Note that the current object is active and hence extended-active, but all the properties below involving extended-active objects will be restricted to non-current ones.

As illustrated in Figure 7 and explained next, in the Slicing Model all non-current extended-active objects (those in the large dashed rectangle) will satisfy their invariants sliced by the routine of the respective call.

Fig. 7.

6.5 Relative Consistency and the Slicing Model

Consider a non-current extended-active object, with invariant INV. It either started the execution of a qualified call x.r (a) or is one of its arguments a. It is partially consistent if it satisfies INV / r: the invariant sliced by the current routine. Obviously, consistent implies partially consistent. A non-current object is relatively consistent if it is either consistent or partially consistent.

GCP-2, the new variant of the Global Consistency Property (Figure 7), differs only slightly from GCP-1 (compare to Figure 6): All non-current objects are relatively consistent. In particular, any (non-current) object in the call chain or attached to an argument is partially consistent (it satisfies INV / r for its routine r). All other non-current objects are fully consistent.

The Invariant Hypothesis (Section 2.2) similarly becomes the Relative Invariant Hypothesis, stating that at the start of any call, the target satisfies its relative invariant (its invariant sliced by the feature of the call).

6.6 The Slicing Rule

Notation: L.INV, for a list L, is the conjunction of x.INV for all elements x of L (in the rule, L is a list of formal or actual arguments, f or a). Here is the rule:

(We no longer need C and S subscripts for INV.) Notes on rule Sliced:

—

Preconditions on both the target (BS $_1$ , BC $_1$ ) and arguments (BS $_2$ , BC $_2$ ) are both sliced.

—

Postconditions are sliced on the target (AS $_1$ , AC $_1$ ), indicating that the routine need only preserve the relevant part of the invariant, to handle furtive access. An argument, however, denotes an object that may become inactive after the call, rejoining the lower part of Figure 7, and hence must be fully consistent on routine exit, ready to serve later as the target of a call to a routine of arbitrary visibility. The proof will formalize this analysis.

—

The Slicing Implication Theorem (6.3) indicates that some preconditions (BS, BC) are weaker than their counterparts in Callbacks. The effect for a selectively exported routine is to make the proof of the routine’s correctness harder (it assumes less) and the proof of a particular call easier (the object only needs to wrap itself up for its sliced invariant, not the full one). This property is what makes it possible to handle tricky cases such as Observer or marriage without annotations.

6.7 Two Programming Language Conditions

The soundness of the Sliced rule requires two conditions on programs, to be enforced by the programming language.

Export Consistency Condition: in any qualified call x.s (a) appearing in a routine r, s $\le$ r.

In other words, a routine may only perform qualified calls to routines with the same visibility or less. In the example of Figure 8, visibility along the routine sequence p, q, t, r can only remain the same or decrease. This reasonableness rule (do not rise above your own rank) can be compared to the practice of Web browsers such as Firefox, which let users start a new public or private window from a public window, but, from a private window, only a private one.

Fig. 8.

Under the Export Consistency Condition, the following Safe Indirect Callback Theorem holds: In an indirect callback of routine r, the routine q of the target has the same visibility as r. (See Figure 8.) Proof: from the Export Consistency Condition, r $\le$ q and q $\le$ r. (From Section 6.3, “ $\le$ ” on features is only a preorder, but on the associated visibility sets such as V (r) the relation is antisymmetric, since it is simply “ $\subseteq$ ”.)

The second language condition needed for the soundness of the Slicing Model is the Selective Export Call Rule: in a call x.r (a) where r is selectively (rather than fully) exported, the target x must be a formal argument of the enclosing routine. (In other words, x may not be an attribute or a local variable.) This restriction does not affect the actual power of the programming language, because it is always possible to wrap a call in a routine. Informal examination of code using selective exports shows that most selective-export calls are indeed on arguments and that only a small amount of tweaking may be needed, as in two of the challenge-problem examples, marriage (Section 4.1) and circular list (Section 10.2), whose verification led to introducing an extra argument.

The soundness of the Slicing Model also assumes that a routine cannot change the values of reference variables passed to it as actual arguments in a call. In other words, such arguments are passed by value. Many modern OO languages (such as Eiffel and Java) satisfy this common-sense condition, but others may violate it, requiring adaptations of the model presented here.

6.8 Handling Furtive Access Correctly: The Slicing Rule Is Sound

As with earlier rules, the core of the soundness proof for Sliced is the proof of the Global Consistency Lemma, here stating the preservation of GCP-2 (all objects except the current one are relatively consistent) through events of all three kinds:

Internal operation: The reasoning for Callbacks (GCP-1) applies: Since internal operations only affect the current object, and we still assume SM3 (invariant locality), they preserve GCP-2. Note the importance of the rule that routines cannot change the value of their reference arguments (Section 6.7): without it, an assignment f := Current to a formal argument f, modifying the corresponding actual, could make an extended-active object inconsistent.

Start-of-call: The following objects are involved:

–

(G2-A) The target of the call. It becomes current and hence does not affect GCP-2.

–

(G2-B) The client (the initiator of the call), previously current. It remains active and, thanks to the call’s precondition INV / r (entry BC $_1$ ), satisfies the sliced invariant, as required of active objects under GCP-2.

–

(G2-C) Objects attached to arguments of the call (a in the caller, f in the target). They become extended-active and hence must satisfy the invariant sliced by the routine being called. That is exactly what the precondition AS $_2$ and its counterpart BC $_2$ enforce: a.INV / r.

End-of-call: the following objects are involved:

–

(G3-A) The target. It ceases to be current. From entry AS $_1$ it satisfies INV / r. If (outside of the case of a callback, in which it remains active) it could become inactive, then that would not be enough, since inactive objects must satisfy the full invariant under GCP-2. The Selective Export Call Rule (Section 6.7) saves us: By requiring the target of a selective-export call to be a formal argument of the calling routine, it guarantees that the target remains extended-active (attached to an argument of an active routine), for which the sliced invariant suffices.

This property is key to the soundness of the Slicing Model, enabling “furtive access” calls to use the sliced invariant on both entry and return. The proofs of such challenge examples as marriage and Observer (Section 10.2) would be impossible (and the software incorrect) if furtive-access routines had to restore the full invariant.

–

(G3-B) The client. It becomes current again, hence incurs no obligation.

–

(G3-C) Objects attached to arguments of the call. During the call they were extended-active and only had to be partially consistent. On end-of-call they cease to be extended-active (except if active or attached to an argument higher up in the call chain) and must become fully consistent to preserve GCP-2. That is exactly what the postconditions AS2 and its counterpart AC $_2$ guarantee: a.INV, using the unsliced INV.

Steps G2-A and G3-B involve an object that GCP-2 allows to be inconsistent, because it is current. It is in fact relatively consistent in both cases, yielding an updated form QCCT-2 of the Quasi-Constant Consistency Theorem: In the Slicing Model, in all states other than immediately after an internal operation, all objects are relatively consistent. Proof:

—

In start-of-call (G2-A), the new current object is the target, which may be: inactive, satisfying the full INV; attached to an argument, hence relatively consistent by precondition BC $_2$ ; in the case of a direct callback, the previous current object, which is relatively consistent thanks the “self-wrapping-up” precondition BS $_1$ ; or (in the remaining case, indirect callback) an object up the call chain—active, not just extended-active—which is relatively consistent as a consequence of the Safe Indirect Callback Theorem (6.7).

—

In end-of-call (G3-B), the new current object is the client, which becomes active and hence, by preservation of GCP-2, relatively consistent.

As a consequence of QCCT-2, the Slicing Model satisfies the Invariant Hypothesis and hence (see 3.3) is sound.

7 Programming Style and Smart Invariants

The rules seen so far place two limitations on programmers: the obligation for an object to wrap itself up before a qualified call (Section 5) and the prohibition of qualified calls to routines of greater visibility (Selective Export Call Rule, 6.1). They may appear unduly constraining for practical programming. It is possible to relax them considerably, removing much of the burden on programmers.

Both of the restrictions cited arise from the need to protect against callbacks. In practice, many calls x.r (a) are callback-free, written CBF (r), meaning that r and any routine it calls directly or indirectly do not call back into the current object. For example, r may be a routine from a reusable library (as in x.extend (a), which adds an element a to the end of a list x), written long before the client code and hence certain not to call it. (The present work assumes that no arguments are function pointers or their counterparts such as C# delegates and Eiffel agents.) It may be possible for the verifying tools to ascertain CBF (r) in such cases through modular analysis (by making sure that any compiled routine carries information about which routines it may call outside of its own package).

If it can be ascertained that a call is callback-free, the rules simplify:

—

Objects no longer need to wrap themselves up before a qualified call.

—

The Selective Export Call Rule is no longer required.

Under these changes, the proof of the soundness of the Sliced Model (Section 6.8) remains applicable with only two updates:

—

Partially relax the Global Consistency Property (GCP-2): Remove the requirement of partial consistency on objects (such as the first three at the top of Figure 7) in a call chain that is guaranteed callback-free. Like the current object, they are not subject to any conditions. Requirements on objects attached to arguments remain.

—

Step G2-B showed that the client, which “remains active,” still “satisfies the sliced invariant” thanks to the call’s precondition. That precondition may not hold any more, but we no longer need active objects to satisfy the sliced invariant, since the only reason was to handle callbacks correctly.

Everything else remains. To reflect this change in the rule, it suffices to make a small adaptation to the definition of slicing. Consider INV / r, the local invariant INV sliced by routine r. (We call INV “local,” because it is applied to the current object, to distinguish it from the “remote” invariant x.INV, handled next.) Let INV / r be the sliced local invariant as defined earlier (Section 6.3). Thanks to the preceding observations the rules involving the sliced local invariant remain sound if we replace it by the smart sliced local invariant, defined as

\begin{equation*} \text{CBF (r)} \ {\vee} \ \underline {\mathsf{INV} /\mathsf{r}} \end{equation*}

In other words, the usual invariant by default, but True if the verification tools can guarantee that the call will not cause a callback. Using the smart local invariant is only an optimization: wherever the tools cannot guarantee CBF (r), the standard sliced local invariant remains applicable.

With the smart local invariant and the loosened Selective Export Call Rule, the impact on programming style is small. Where a constraint remains, because calls cannot be guaranteed callback-free, it is easy to justify it by the potentially catastrophic consequences of an incorrect callback, evidenced by the DAO hack.

We may apply a similar small optimization to remote sliced invariants applied to arguments, such as a.INV / r in the caller-side precondition BC $_2$ of Sliced. That condition is only necessary for arguments a that are neither current nor active, a condition we write NCA (a). Let a.INV / r be the sliced remote invariant as defined earlier (Section 6.3). The rules involving the sliced remote invariant remain sound if we replace it by the smart sliced remote invariant, defined as

\begin{equation*} \text{NCA (a)} \ {\vee} \ (\underline {\mathsf{a.INV} / \mathsf{r}} \end{equation*}

In practice, NCA may be more delicate to ascertain than CBF, leading to using plain a.INV. The smart version highlights a particular difficulty of calls, familiar to anyone having tried to verify or just debug OO programs: passing to another object a reference to oneself (Current, “this”) reeks of trouble. The preconditions ensure the key property (giving some insight into conditions governing the practice of OO programming): You may only pass yourself as argument if you are sure you are consistent—and so are other active objects up the call chain.

The verification of the challenge examples reported in Section 10 used the plain invariants, not the “smart” versions.

8 Addressing Reference Leaks

Rule Sliced dropped assumption SM1 (no callback) and relaxed SM2 (invariant preservation) from the Simple Model. There remains to remove SM3, invariant locality, stating that the invariant only involves the current object’s fields.

Soundness proofs so far have relied on this assumption by considering that the only way to invalidate an object’s invariant is by modifying one of its fields; of the three kinds of event (Section 3.3), only “internal operations” can do this. Without invariant locality, such an operation can also invalidate the invariant of other objects that depend on it. Considering dependents leads to new versions of the rule, “strong” and “weak.” They differ in terms of both implementation (in a verifier) and underlying effect on consistency:

—

The strong version is more precise but (in the absence of annotations) requires global analysis, whereas all the previous rules are modular (meaning that they can be verified class by class). The weak version, in its two variants, is modular but less powerful.

—

The strong version retains GCP-2 (all non-current objects are relatively consistent). Operations can therefore assume the invariant (possibly sliced) on entry. With the weak versions, some dependent objects satisfy only its “local” part; calls on these objects can no longer expect the full invariant.

—

The weak versions, then, have a “lazy” approach to consistency: the object structure may contain inconsistent objects. The inconsistency will be detected (in static or dynamic verification) only in case of a later attempt to use them. Absent such calls, it will remain. The approach remains sound: all objects actually used satisfy their (relative) invariants. It is a matter for discussion (in the spirit of “if a tree falls unheard and unseen in the forest”) whether inconsistencies in unused objects are a reason to worry.

8.1 Strong Rule

We need a precise definition of the notion of dependents of an object. At any time during execution, the dependent set of an object S, Dep (S), is the set of other objects C whose invariants have a clause x.some_property where x is a field of C attached to S. In Figure 5, for example, Dep (Alice) = {Bob, Charles}.

In reference to the program text, we may use Dep (x) to denote Dep (S) for the object S attached to x at runtime (empty if x is Void). If we could determine that set, then the corresponding Strong proof rule would be the same as Sliced with a clause on dependents added above and below (highlighted).

Soundness: GCP-2 remains. So does the previous proof of soundness (Section 6.8), with Invariant Locality relaxed into “an object’s invariant only involves fields of the object itself and its dependents.”

In effect, the Strong rule addresses—rather than the classical notion of class invariant—the notion of multi-object invariant (Section 1.4) by extending the invariant of an object with the invariants of all its dependents. Implementing this rule into a verifying tool requires computing the Dep sets, which in the absence of programmer annotations implies an analysis that is global (on the entire program) rather than modular (class by class). As a conjecture, a modular implementation may be applicable to classes whose invariants satisfy specific patterns (see the “mirroring” suggestion in Section 11.14 of Reference [40]).

8.2 Weak Rule, Call-side

Weak-C (“C” for “Client” or “caller”) puts the burden on callers:

This rule relies on splitting any invariant into two parts: INV_LOC and INV_REM, for “local” and “remote.” INV_REM contains the clauses that have a subexpression of the form b.some_property where b denotes a reference field. In PERSON (4.1), INV_LOC includes two of the three clauses distinct and married_iff_has_spouse; the third one, reciprocal: is_married $\:\Rightarrow \:$ (spouse.spouse = Current), goes into INV_REM (because of spouse.spouse).

We say that an object is locally consistent if it satisfies the local part of its invariant. (Local consistency could also be partial in the sense of slicing as defined in Section 6.5, but will only be applied in the unsliced case.)

The weak variant is based on the observation that the Invariant Hypothesis would hold if the invariant consisted of INV_LOC only; the risk of reference leak comes from INV_REM, which GCP-2 does not allow us to assume on start-of-call. GCP-3, the new GCP, is (change from GCP-2 highlighted):

—

All non-current active objects are relatively consistent.

—

All inactive objects are locally consistent (with the unsliced invariant).

The inactive part of the heap (bottom part of Figure 7 and predecessors) consists of objects that are “just sitting there,” ready to serve as target of future calls. In GCP-2 and previous models, they were fully consistent, in anticipation of ensuring the Invariant Hypothesis in those calls. With GCP-3 this massive guarantee of consistency no longer holds: We must cope with the possible presence of inactive objects that are only locally consistent. Under Weak-C, callers can only rely on the local consistency of these objects, and must ensure the rest.

Soundness: The Invariant Hypothesis continues to hold for the local part of the invariant and for the remote part is ensured by the caller (entry BC $_3$ ).

Rule Weak-C, by reintroducing a limited form of Scandalous Obligation (a partial return to Pointless of Section 2.5), makes the client responsible for invariant elements that could have been messed up by a “man in the middle” such as Bob in the marriage case. The next rule frees the client from the Scandalous Obligation by making life harder for the supplier instead.

8.3 Weak Rule, Supply-side

In rules governing routine calls, we may always move an obligation from the client’s to the supplier’s precondition. Rule Weak-S (“S” for “Supplier”) transfers responsibility for the remote part of the invariant from the client (as in Weak-C) to the supplier. Removing INV_REM on one side means adding INV_LOC to the other, maintaining the Invariant Hypothesis and hence soundness: A criticism of this rule is that it hands over some of the responsibility from the verifier to the programmer. For each routine:

—

It is not possible to assume INV_REM on entry (only INV_LOC). That situation seems inevitable in the absence of an analysis that would compute (or approximate) the Dep sets.

—

As a consequence, the routine may have to do more work, since it is still required to satisfy the full invariant on exit (AS).

—

Note that there is no change to the precondition on the caller side (BC). We make it harder for the routine, forcing it to handle objects that do not satisfy their multi-object requirements (since the remote part of the invariant does not hold), but there is no corresponding extra work for the caller, and nothing the caller can do to help.

The approach is “lazy,” as noted, in that it may leave inconsistent objects around. This possibility causes no threat to program correctness, since subsequent operations on such an object will have to restore consistency. With the strong approach such a situation would not arise, but the price to pay is that operations on an object must also take care of updating other objects whose consistency depends on it.

While Strong remains a possible choice, Weak-S appears, in spite of the preceding objections, to be the best available rule for proving the correctness of object-oriented programs and the main result of the present work. It was given in the usual style of Hoare inference rules (without the abbreviations introduced in the last sections) in the Preview Section (1.3) and is used for the proofs of example programs in Section 10.

9 Initialization, Queries, and Concurrency

The discussion so far has addressed commands (procedures): operations that can modify an existing object, causing transitions between consistent states S $_i$ of its lifecycle as illustrated in Figs 2 and 3. We also need rules for two separate cases: a creation operation, which does not assume an existing object but creates a new one, and a query, which instead of modifying an existing object, returns information about it. The following discussion will also mention the effect of concurrency and summarize the applicable rules.

9.1 Initialization

Weak-S and predecessors govern a qualified call x.r (a) where x denotes an existing object that, per the Invariant Hypothesis, satisfies the invariant (at least its local component). Such rules correspond to the induction step in a proof by induction. We also need an initialization rule, corresponding to the base step.

The initialization mechanism is part of object creation, variously written:

—

create x.make (a) in Eiffel, where make is a creation procedure.

—

x = new SC (a) in Java, C# and C++, for a class SC with an appropriate “constructor” determined by the types of arguments in a.

The goal of object creation is to obtain an object satisfying its invariant. In adapting the preceding rules, it suffices to remove any property of the object itself on the precondition side—INV for the supplier (BS) and x.INV for the client side (BC), which makes no sense for a creation instruction as it creates a new object from scratch. (Of course any specific precondition of the creation procedure is still applicable; remember that the tabular notation used, since Section 2.7 ignores such routine-specific pre- and postconditions, but they reappear in the rule when we express it in full, in the classical notation of Hoare logic used for example for Weak-S in 1.3.) The postcondition side remains, as well as the conditions on arguments. The creation variant of Weak-S, applicable to an instruction create x.make (a), is Weak-S-Creation:

DEF $_S$ denotes the property that fields of the new object have default values (typically zero for integers, etc.). Including DEF $_S$ is appropriate for languages such as Eiffel, Java, and C#, which specify default initialization rules; for C++, which does not, the BS $_1$ entry remains blank. Since in a creation instruction there is no prior “current”, the rule needs neither slicing nor (as a consequence) the “smart” versions of the invariants (Section 7): A just created object must always satisfy the full invariant.

9.2 Pure Queries

A query is an operation returning information about an object. It can be an attribute, giving the value of a field of the object, or a function, returning the result of a computation on those fields.

It is common programming practice (violating the “Command-Query Separation” design principle, References [39, 40]) to let functions change objects. Then the above rules for procedures are applicable to functions, with postconditions extended by properties of function results. The question remains of pure queries, which cannot change the state, and include attributes and side-effect-free functions.

The postcondition side of previous rules, AS and AC, is irrelevant for a routine that cannot change any object. Pure-query rules correspondingly have only one column (BS and BC), the same as in the corresponding rules for procedures.

The results of this simplification (with no new elements, just a removal of the postcondition column in the general rules) are given here for the record:

Applying these rules assumes that the verifier can ascertain that a query is pure, either through a language rule or by static analysis. Both approaches have received considerable attention but fall beyond the scope of the present work.

9.3 Concurrency

This discussion has assumed a sequential programming context. In an unbridled concurrent framework, for example using multithreading, there is no hope for a meaningful notion of class invariant, since furtive access is ever looming. The risk of an Ethereum-DAO-like failure is consequently high.

Preserving the notion of class invariant, and the associated reasoning capabilities, requires a more controlled concurrent mechanism. In SCOOP [18], all access to shared objects is exclusive; then the entire reasoning and rules of this article apply unchanged.

9.4 Retained Rules

For expository purposes, this article has introduced the rules as successive versions. The final result, however, is precisely defined. The rules to be applied for verification are as follows:

—

For commands (procedures), Weak-S (expressed in full in traditional Hoare-style inference-rule notation in the preview, Section 1.3).

—

For creation procedures, Weak-S-Creation.

—

For pure queries, choice (per the criteria discussed in Section 9.2) of programming style: Weak-C-pure or Weak-S-pure.

10 Challenge Problems and Solutions

The following discussion reports on the application of the rule to a number of problems widely considered in the literature to embody the most tricky challenges to invariant-based verification. All raise invariant violations when handled with a simple notion of invariant.

Some have been verified before, but only with extra programmer annotations. The versions used here are annotation-free, although some have required a modicum of tweaking of the code, as is often the case for verification. All source code for the examples is available (with better formatting than the compact forms given below to save space) in a GitHub repository [3].

As the ongoing integration of the proof rule into a proof tool (AutoProof [21]) is not complete, verification has consisted of one or both of

—

Manual application of the rules.

—

Dynamic testing, taking advantage of Eiffel’s assertion constructs (including class invariants as well as routine pre- and postconditions), and the mechanisms in the associated development environment (EiffelStudio) for monitoring these assertions at runtime, widely used for testing and debugging.

While not as final as mechanized static proofs of correctness, these approaches are meaningful, since:

—

The examples, while tricky, are small, making manual proofs credible.

—

The state space (for cases that could trigger violations) is typically small, so dynamic monitoring can cover all relevant scenarios.

These efforts should be viewed as stepping stones toward the goal of mechanized static proofs of correctness.

10.1 Marriage, Non-recursive

The class text with contracts is the following (in the interest of space the formatting does not respect indentation rules, and the creation procedure is omitted):

(Routine divorce has an argument—which must equal to spouse—because of the Selective Export Call Rule (Section 6.7), but it is easy to wrap it into a routine with no argument, as done with insert_right in the next example.) The proof appears as (a), (b), and (c) below. For conciseness, such proofs use a tabular notation, with one row per goal to be proven. A checkmark $\checkmark$ signals an available fact. A numeric prefix refers to a line number in the code: for the preceding code, 4.pre $_3$ denotes the precondition clause pre $_3$ of the feature (set_spouse) called at line 4. If there is no such prefix, then the fact is a contract element or an AS/BS component from the proof rule; for example, AS $_1$ .o.inv $_1$ states that inv $_1$ holds for o, satisfying entry AS $_1$ of the rule.

10.2 Circular Doubly-linked List

The full program text appears below. The body of make only consists of assignments and trivially establishes both its postcondition and the class invariant.

Verification of set_left and set_right is straightforward too—the assignments establish the postconditions, and the preconditions establish the invariants on exit. The generic parameter G represents the type of list elements.

Note that routine insert_between_two is “wrapped” by a public routine insert_right as the extra arguments of the first should be in internally called (secret) routine, so that the preconditions will be guaranteed.

The proofs of insert and remove appear below.

10.3 Observer

Class OBSERVER is a simple implementation of the Observer pattern.

The proof of OBSERVER.update is immediate: The postcondition follows from the assignment, and the AS $_1$ .Current.inv $_3$ obligation from assuming BS $_1$ .Current.inv $_3$ .

10.4 Subject

The code for the Subject in the Observer pattern is as follows:

Proving make and update_observer is trivial: All the proof goals imposed by the proof rule directly follow from these features’ preconditions and implementations, so we do not present them.

11 Previous Work

The first presentation of class invariants is a single mention of “invariant of the class” in Hoare’s 1972 “correctness of data representations” [27]. Experimental 1970s languages included representation invariants: Alphard [59] provides both abstract invariants (defined on the underlying Abstract Data Type) and representation invariants (on a specific implementation). OO design provides the opportunity to use as many levels as needed instead of just two. Abstraction and representation become relative concepts; each class inherits its parents’ invariants, adding its own. The development of the concept of class invariant for OO appeared in 1985 in References [37, 38] and other Eiffel-related publications. References [39, 40] (1988, 1997) furthered the concepts. Other formalisms with invariant support include JML [13] and Spec# [11].

The problem of precise semantics for invariants has attracted considerable attention. Reference [49] introduced a reachability predicate for linear lists. Reference [54] developed an invariant-based tool to analyze Java code for such defects as illegal dereferencing. Reference [35] exploited design patterns to validate OCL [57] invariants in C++.

The issue of reference leak, as analyzed in the present work, arises when operations on an object may invalidate the invariant of another object. One category of solutions has received particular attention: ownership techniques, in which some objects are “owned” by others, with the rule that any operation on an owned object must go through its owner, so as to control invariant change. Programmers are responsible for declaring ownership relations in the code. This idea has many variants:

—

The methodology of Reference [28] uses the notion of “vulnerable” objects, meaning objects that may be invalidated by operations on instances of another class. The code must declare logical variables to identify vulnerable objects.

—

The Universe Type System [45] distributes objects into ownership contexts limiting updates. Reference [10] introduced an ownership-based mechanism that includes boolean states in which the invariant is known to hold.

—

Later work went further through a friendship protocol to allow state dependence across ownership boundaries [12], a case of cooperation-based approaches. References [43, 44], also known as visibility-based [46], formalized and proved sound in Reference [48].

—

The Boogie methodology [31] refined the ideas by removing bounds on the number of objects in an ownership context and letting objects change their contexts. Reference [30] extended the methodology to the multithreaded case by protecting object structures from race conditions. Another extension [32] handles invariants over static fields. An extension [33] adds yet another modifier, additive, specifying which fields may be mentioned in subclass invariants.

—

In an ownership context, Reference [47] used ghost fields to handle reentrant method calls and shared references. Reference [46] combined ownership and visibility techniques for modular specification and verification of layered structures. New extensions to the Boogie approach use history invariants—two-state invariants describing the evolution of data values—to verify variations of the Observer pattern; one of the latest, semantic collaboration [51], enhances the methodology with such ghost fields as subjects and observers.

—

Reference [34] introduces two sets of objects: those in the validity invariant must be valid before and after, unlike those in the validity effect. On exit, re-validation is only needed for the intersection. Later work [29] generates the sets automatically from visibility modifiers.

—

Reference [4] addresses invariant issues in a language with contracts, treating inter-object relations as first-class citizens. Verification relies on a “Matreshka Principle” to guarantee absence of transitive call-backs and restore a visible-states semantics for multi-object invariants [5]. The methodology involves annotations to declare relationships and their participants.

—

References [6, 7, 8] introduce region logic supporting heap-local reasoning about mutation and separation via ghost fields and variables of type “region” consisting of finite sets of object references. Reference [9] adapted region logic to JML and extended it with separating conjunction [52].

—

Reference [15] proposed two-state locally checked invariants, reducing verification of a concurrent program to checking that they hold for every update.

In addition to handling reference leak, the approaches cited generally address furtive access through special instructions wrap and unwrap (sometimes pack and unpack) to specify when objects must satisfy their invariants and when they are permitted to violate them, for example inside the execution of a routine. Such instructions have no effect on program execution; in other words, they are intended for the verifier, not the compiler.

Reference [17] proposes seven verification parameters and conditions on them to guarantee soundness; analyzing existing approaches in terms of the proposed semantics. Reference [42] was an initial iteration toward the present work.

Unlike the static approaches reviewed so far, other work performs dynamic verification of programs, requiring execution:

—

Reference [22] proposed a runtime verification scheme guaranteeing that any invariant violation is detected exactly where it occurs, and proved its correctness. The scheme automatically tracks dependencies between objects.

—

Reference [23] presented another framework for runtime-invariant checking.

—

Reference [14], an extension to Pex [55], synthesizes parameterized unit tests [56] from multi-object invariants.

The Ethereum DAO bug [16], due to a callback [53], gave rise to invariant-based methodologies for smart contracts [1, 2, 24]. Reference [1] adds a type invariant to smart contracts. Reference [24] introduces the notion of Effectively Callback Free objects (as in Section 7 of the present work). [2] extended the concept and provided a static analysis technique to ensure effective-callback-freedom.

12 Limitations, Future Work, and Conclusion

The preceding discussion does not consider inheritance. Together with associated techniques of polymorphism and dynamic binding, inheritance is an integral part of object-oriented programming. Under inheritance, invariants accumulate [37, 39]. We surmise that the rules given will naturally extend to inheritance, but the present study has not yet explored this matter in depth. Also, not considered in the current state of this work include reflection and mechanisms for passing routines as arguments to other routines, as either plain function pointers (in C++) or more abstract values such as Eiffel’s agents and C#’s delegates. Such arguments can cause callbacks, which the analysis must detect.

The “weak” rules currently retained for addressing reference leak require explicit enforcement of the “remote” part of the invariant, preferably on the supplier side, and may require some fine-tuning of existing code for verification. The “strong” version of the rules would avoid these problems, but making them convenient in practice raises the question of how to compute the dependent sets statically, efficiently, and modularly—a question that remains open.

Proofs of properties of the underlying theory (consistency of successive versions of Global Consistency, and associated theorems) have been conducted manually as reported above, in the usual style of mathematical proofs, and have been checked carefully, but have not been subject to an automated theorem prover. As to the application of these rules to programs, the process of integration into an automatic prover, AutoProof, is not complete.

While the rules do not assume any specific typing system, or even that the language is statically typed, they do assume that it is reasonably well behaved, for example that it prohibits a routine from modifying its own arguments (Section 6.7). It may be possible to generalize the approach to arbitrary, undisciplined languages, but we have not attempted such a feat.

The Selective Export Call Rule (Section 6.1), requiring the targets of calls to selectively exported routines to be formal arguments, may require some wrapping effort, as in the verification of divorce in 10.1 and insert_right in 10.2. As discussed in Section 7, the burden appears acceptable in practice, and disappears if the analysis can detect the absence of callbacks.

As mentioned in the discussion of the Callback Model (end of Section 5), it would be useful to conduct an empirical study of a large OO code base, using a prover such as AutoProof, to find out whether most qualified calls to non-library routines do already wrap up the client object (make it consistent) first, meeting the INV or INV / r part of the call’s precondition.

With these qualifications, the rules as they stand provide an answer to the goals set at the beginning of this article: opening up an effective way to use class invariants, based on an actual axiomatization, not just a “methodology”; identifying the precise obstacles (three distinct threats) to the soundness of the basic invariant concept; and providing solutions to all of them, without requiring of the programmer any annotation burden.

The rules give practicing software developers a basis for making the fundamental notion of class invariant a key tool of their work, helping them benefit from the full power of OO concepts. Our own experience as programmers shows that extensive use of invariants leads to better programs, easier to write, understand and get right (and debug when not). We are, as noted, integrating the rules of this article into an automatic prover, part of a determined effort to restore the belief in a sanity clause.

Acknowledgments

We are grateful to the referees for many detailed and perceptive comments that led to considerable improvements of the article in successive revisions. Iulian Ober, now with ISAE-Supaéro in Toulouse, made important suggestions on an early version of the work. A presentation of an intermediate stage of the research, at a meeting of the IFIP WG2.3 working group in Villebrumier, elicited fruitful insights from Gary Leavens and other members of the group, many of whom have themselves made important contributions to the topic over the years. The work benefited from the “semantic collaboration” concepts [51] integrated into the original AutoProof and discussions with its coauthors: Carlo Furia, Nadia Polikarpova (who in an informal discussion brought up the observation that selective exports could help) and Julian Tschannen. Li Huang from Constructor Institute participated in a number of discussions. Alexandr Naumchev, formerly from Constructor Institute, was involved in the first version.

References

[1]

M. Anthony Aiello, Johannes Kanig, and Taro Kurita. 2020. Call me back, I have a type invariant. In Formal Methods. FM 2019 International Workshops. Springer, 325–336.

Google Scholar

[2]

Elvira Albert, Shelly Grossman, Noam Rinetzky, Clara Rodríguez-Núñez, Albert Rubio, and Mooly Sagiv. 2020. Taming callbacks for smart contract modularity. Proc. ACM Program. Lang. 4, OOPSLA (2020), 30. DOI:

Digital Library

Google Scholar

[3]

Alisa Arkadova, Alexander Kogtenkov, Alexandr Naumchev, and Bertrand Meyer. 2021-2022. Source Code and Other Supporting Material for This Article. Retrieved from https://github.com/alicealice19/invariant_paper

Google Scholar

[4]

Stephanie Balzer and Thomas R. Gross. 2010. Modular reasoning about invariants over shared state with interposed data members. In Proceedings of the 4th ACM SIGPLAN Workshop on Programming Languages Meets Program Verification (PLPV ’10). ACM Press, 49.

Digital Library

Google Scholar

[5]

Stephanie Balzer and Thomas R. Gross. 2011. Verifying multi-object invariants with relationships. In ECOOP ’11, Mira Mezini (Ed.), Vol. 6813. Springer, 358–382.

Google Scholar

[6]

Anindya Banerjee and David A. Naumann. 2013. Local reasoning for global invariants, part II: Dynamic boundaries. J. ACM 60, 3 (June2013), 1–73.

Digital Library

Google Scholar

[7]

Anindya Banerjee, David A. Naumann, and Stan Rosenberg. 2008. Regional logic for local reasoning about global invariants. In ECOOP ’08, Jan Vitek (Ed.), Vol. 5142. Springer, 387–411.

Google Scholar

[8]

Anindya Banerjee, David A. Naumann, and Stan Rosenberg. 2013. Local reasoning for global invariants, part I: Region logic. J. ACM 60, 3 (June2013), 1–56.

Digital Library

Google Scholar

[9]

Yuyan Bao and Gary T. Leavens. 2018. A methodology for invariants, framing, and subtyping in JML. In Principled Software Development, Peter Müller and Ina Schaefer (Eds.). Springer, 19–39.

Crossref

Google Scholar

[10]

Mike Barnett, Robert DeLine, Manuel Fähndrich, K. Rustan M. Leino, and Wolfram Schulte. 2004. Verification of object-oriented programs with invariants. J. Obj. Technol. 3, 6 (2004), 27.

Crossref

Google Scholar

[11]

Mike Barnett, K. Rustan M. Leino, and Wolfram Schulte. 2004. The Spec# programming system: An overview. In CASSIS ’04. Springer, 49–69.

Google Scholar

[12]

Mike Barnett and David A. Naumann. 2004. Friends need a bit more: Maintaining invariants over shared state. In Mathematics of Program Construction, Dexter Kozen (Ed.). Springer, 54–84.

Google Scholar

[13]

Lilian Burdy, Yoonsik Cheon, David R. Cok, Michael D. Ernst, Joseph R. Kiniry, Gary T. Leavens, K. Rustan M. Leino, and Erik Poll. 2005. An overview of JML tools and applications. Int. J. Softw. Tools Technol. Transf. 7, 3 (June2005), 212–232.

Crossref

Google Scholar

[14]

Maria Christakis, Peter Müller, and Valentin Wüstholz. 2014. Synthesizing parameterized unit tests to detect object invariant violations. In Software Engineering and Formal Methods, Dimitra Giannakopoulou and Gwen Salaün (Eds.), Vol. 8702. Springer, 65–80.

Crossref

Google Scholar

[15]

Ernie Cohen, Michał Moskal, Wolfram Schulte, and Stephan Tobies. 2010. Local verification of global invariants in concurrent programs. In Computer Aided Verification, Tayssir Touili, Byron Cook, and Paul Jackson (Eds.). Springer, 480–494.

Google Scholar

[16]

Phil Daian. 2016. Analysis of the DAO Exploit. Retrieved from https://hackingdistributed.com/2016/06/18/analysis-of-the-dao-exploit/

Google Scholar

[17]

S. Drossopoulou, A. Francalanza, P. Müller, and A. J. Summers. 2008. A unified framework for verification techniques for object invariants. In ECOOP ’08, Jan Vitek (Ed.), Vol. 5142. Springer, 412–437.

Google Scholar

[18]

Eiffel Software. Concurrent Programming with SCOOP. Retrieved from https://www.eiffel.org/doc/solutions/Concurrent_programming_with_SCOOP

Google Scholar

[19]

Robert W. Floyd. 1967. Assigning meanings to programs. Math. Aspects Comput. Sci. 19, 19-32 (1967), 1.

Google Scholar

[20]

Carlo A. Furia, Bertrand Meyer, and Sergey Velder. 2014. Loop invariants: Analysis, classification, and examples. Comput. Surv. 46, 3 (Jan.2014), 1–51.

Digital Library

Google Scholar

[21]

Carlo A. Furia, Martin Nordio, Nadia Polikarpova, and Julian Tschannen. 2017. AutoProof: Auto-active functional verification of object-oriented programs. Int. J. Softw. Tools Technol. Transf. 19, 6 (Nov.2017), 697–716.

Digital Library

Google Scholar

[22]

Madhu Gopinathan and Sriram K. Rajamani. 2008. Runtime monitoring of object invariants with guarantee. In Runtime Verification, Martin Leucker (Ed.), Vol. 5289. Springer, 158–172.

Google Scholar

[23]

Michael Gorbovitski, Tom Rothamel, Yanhong A. Liu, and Scott D. Stoller. 2008. Efficient runtime invariant checking: A framework and case study. In ISSTA ’08–WODA ’08. ACM Press, 43.

Google Scholar

[24]

Shelly Grossman, Ittai Abraham, Guy Golan-Gueta, Yan Michalevsky, Noam Rinetzky, Mooly Sagiv, and Yoni Zohar. 2018. Online detection of effectively callback free objects with applications to smart contracts. Proc. ACM Program. Lang. 2 (Jan.2018), 1–28.

Digital Library

Google Scholar

[25]

C. A. R. Hoare. 1969. An axiomatic basis for computer programming. Commun. ACM 12, 10 (Oct.1969), 576–580. DOI:

Digital Library

Google Scholar

[26]

C. A. R. Hoare. 1971. Procedures and parameters: An axiomatic approach. In Symposium on Semantics of Algorithmic Languages, E. Engeler (Ed.), Vol. 188. Springer, 102–116.

Google Scholar

[27]

C. A. R. Hoare. 1972. Proof of correctness of data representations. Acta Inf. 1, 4 (1972), 271–281.

Digital Library

Google Scholar

[28]

Kees Huizing, Ruurd Kuiper, and SOOP. 2000. Verification of object oriented programs using class invariants. In Fundamental Approaches to Software Engineering, Vol. 1783. Springer, 208–221.

Crossref

Google Scholar

[29]

Stefan Huster, Patrick Heckeler, Hanno Eichelberger, Jürgen Ruf, Sebastian Burg, Thomas Kropf, and Wolfgang Rosenstiel. 2014. More flexible object invariants with less specification overhead. In Software Engineering and Formal Methods, Dimitra Giannakopoulou and Gwen Salaün (Eds.), Vol. 8702. Springer, 302–316.

Google Scholar

[30]

B. Jacobs, K.R.M. Leino, F. Piessens, and W. Schulte. 2005. Safe concurrency for aggregate objects with invariants. In SEFM ’05. IEEE, 137–146.

Google Scholar

[31]

K. Rustan M. Leino and Peter Müller. 2004. Object invariants in dynamic contexts. In ECOOP ’04, Martin Odersky (Ed.). Springer, 491–515.

Google Scholar

[32]

K. Rustan M. Leino and Peter Müller. 2005. Modular verification of static class invariants. In FM 2005: Formal Methods, John Fitzgerald, Ian J. Hayes, and Andrzej Tarlecki (Eds.). Springer, 26–42.

Google Scholar

[33]

K. Rustan M. Leino and Angela Wallenburg. 2008. Class-local object invariants. In ISEC ’08. ACM Press, 57.

Digital Library

Google Scholar

[34]

Yi Lu, John Potter, and Jingling Xue. 2007. Validity invariants and effects. In ECOOP ’07, Erik Ernst (Ed.), Vol. 4609. Springer, 202–226.

Google Scholar

[35]

Brian A. Malloy and James F. Power. 2006. Exploiting design patterns to automate validation of class invariants. Softw. Test. Verif. Reliabil. 16, 2 (Jun.2006), 71–95.

Digital Library

Google Scholar

[36]

Marx Brothers. 1935. A Night at the Opera. https://youtu.be/G_Sy6oiJbEk?t=228. (1935).

Google Scholar

[37]

Bertrand Meyer. 1985. Eiffel: A Language for Software Engineering. Technical Report TR-CS-85-19. University of California, Santa Barbara.

Google Scholar

[38]

Bertrand Meyer. 1988. Eiffel: A language and environment for software engineering. J. Syst. Softw. 8, 3 (Jun.1988), 199–246.

Digital Library

Google Scholar

[39]

Bertrand Meyer. 1988. Object-oriented Software Construction (1st ed.). Prentice-Hall.

Google Scholar

[40]

Bertrand Meyer. 1997. Object-oriented Software Construction (2nd ed.). Prentice Hall.

Digital Library

Google Scholar

[41]

Bertrand Meyer. 2005. The dependent delegate dilemma. In Engineering Theories of Software Intensive Systems, Manfred Broy, Johannes Grünbauer, David Harel, and Tony Hoare (Eds.), Vol. 195. Springer, 105–118.

Google Scholar

[42]

Bertrand Meyer. 2021. Class invariants: Concepts, problems, solutions. arXiv: 1608.07637. Retrieved from https://arxiv.org/abs/1608.07637

Google Scholar

[43]

Ronald Middelkoop, Cornelis Huizing, Ruurd Kuiper, and Erik Luit. 2006. Cooperation-based invariants for oo languages. Electr. Not. Theor. Comput. Sci. 160 (Aug.2006), 225–237.

Crossref

Google Scholar

[44]

Ronald Middelkoop, Cornelis Huizing, Ruurd Kuiper, and Erik J. Luit. 2008. Invariants for non-hierarchical object structures. Electr. Not. Theor. Comput. Sci. 195 (Jan.2008), 211–229.

Digital Library

Google Scholar

[45]

Peter Müller. 2002. Modular Specification and Verification of Object-oriented Programs. Springer, Berlin.

Crossref

Google Scholar

[46]

Peter Müller, Arnd Poetzsch-Heffter, and Gary T. Leavens. 2006. Modular invariants for layered object structures. Sci. Comput. Program. 62, 3 (Oct.2006), 253–286.

Digital Library

Google Scholar

[47]

David A. Naumann. 2005. Assertion-based encapsulation, object invariants and simulations. In Formal Methods for Components and Objects, Frank S. de Boer, Marcello M. Bonsangue, Susanne Graf, and Willem-Paul de Roever (Eds.). Springer, 251–273.

Digital Library

Google Scholar

[48]

David A. Naumann and Mike Barnett. 2006. Towards imperative modules: Reasoning about invariants and sharing of mutable state. Theor. Comput. Sci. 365, 1-2 (Nov.2006), 143–168.

Digital Library

Google Scholar

[49]

Greg Nelson. 1983. Verifying reachability invariants of linked structures. In POPL ’83. ACM Press, 38–47.

Digital Library

Google Scholar

[50]

David L. Parnas. 1972. On the criteria to be used in decomposing systems into modules. Commun. ACM 16, 12 (Dec.1972), 1053–1058.

Digital Library

Google Scholar

[51]

Nadia Polikarpova, Julian Tschannen, Carlo A. Furia, and Bertrand Meyer. 2014. Flexible invariants through semantic collaboration. In FM 2014: Formal Methods, Cliff Jones, Pekka Pihlajasaari, and Jun Sun (Eds.). Springer, 514–530.

Google Scholar

[52]

J. C. Reynolds. 2002. Separation logic: A logic for shared mutable data structures. In LICS ’02. IEEE, 55–74.

Google Scholar

[53]

Emin Gün Sirer. 2016. Reentrancy Woes in Smart Contracts. Retrieved from https://hackingdistributed.com/2016/07/13/reentrancy-woes/

Google Scholar

[54]

S. Skevoulis and Xiaoping Jia. 2000. Generic invariant-based static analysis tool for detection of runtime errors in Java programs. In TOOLS-Pacific ’00. IEEE, 36–44.

Google Scholar

[55]

Nikolai Tillmann and Jonathan de Halleux. 2008. Pex–white box test generation for .NET. In Tests and Proofs, Bernhard Beckert and Reiner Hähnle (Eds.). Springer, 134–153.

Google Scholar

[56]

Nikolai Tillmann and Wolfram Schulte. 2005. Parameterized unit tests. ACM SIGSOFT Softw. Eng. Not. 30, 5 (Sept.2005), 253–262.

Digital Library

Google Scholar

[57]

Jos B. Warmer and Anneke G. Kleppe. 2003. The Object Constraint Language: Getting your Models Ready for MDA (2nd ed.). Addison-Wesley.

Google Scholar

[58]

Wikipedia. Accounting Equation. Retrieved from https://en.wikipedia.org/wiki/Accounting_equation

Google Scholar

[59]

W. A. Wulf, R. L. London, and M. Shaw. 1976. An introduction to the construction and verification of alphard programs. IEEE Trans. Softw. Eng. SE-2, 4 (Dec.1976), 253–265.

Digital Library

Google Scholar

Recommendations

What Is Object-Oriented Programming?

The meaning of the term 'object oriented' is examined in the context of the general-purpose programming language C++. This choice is made partly to introduce C++ and partly because C++ is one of the few languages that supports data abstraction, object-...
A Domain-Theoretic Model Of Nominally-Typed Object-Oriented Programming

The majority of contemporary mainstream object-oriented (OO) software is written using nominally-typed OO programming languages. Extant domain-theoretic models of OOP developed to analyze OO type systems miss crucial features of these mainstream OO ...
Enforcing object protocols by combining static and runtime analysis
OOPSLA '08: Proceedings of the 23rd ACM SIGPLAN conference on Object-oriented programming systems languages and applications

In this paper, we consider object protocols that constrain interactions between objects in a program. Several such protocols have been proposed in the literature. For many APIs (such as JDOM, JDBC), API designers constrain how API clients interact with ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

Formal Aspects of Computing Volume 36, Issue 1

March 2024

194 pages

EISSN:1433-299X

DOI:10.1145/3613521

Editor:
Jim Woodcock
University of York, UK

Issue’s Table of Contents

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 20 March 2024

Online AM: 24 January 2024

Accepted: 23 September 2023

Revised: 07 July 2023

Received: 17 December 2021

Published in FAC Volume 36, Issue 1

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Research-article

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

0
Total Citations
677
Total Downloads

Downloads (Last 12 months)677
Downloads (Last 6 weeks)172

Reflects downloads up to 28 Sep 2024

Other Metrics

View Author Metrics

Citations

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Get Access

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Abstract

1 Class Invariants

1.1 Basics

1.2 Contributions and Limitations

1.3 Progression and Preview

1.4 Terminology

2 Working with Invariants

2.1 Working with Heaps

2.2 The Invariant Hypothesis and the Scandalous Obligation

2.3 Background: A Classical Proof Rule for Routine Calls

2.4 Notation

2.5 Sound But Useless Invariant Integration

2.6 The Invariant Preservation Property

2.7 The Ideal Rule

2.8 The Runtime Model

2.9 Current Object, Active Objects, Active Routines

2.10 Consistent and Inconsistent Objects

2.11 Qualified and Unqualified Calls

3 The Simple Model

3.1 The Restrictions

3.2 Global Consistency Property

3.3 The Simple Model Is Sound

4 Threats to the Invariant Hypothesis

4.1 The Marriage Example

4.2 Recursive Marriage and Callbacks

4.3 Non-recursive Marriage and Furtive Access

4.4 Reference Leak

5 Addressing Callbacks: Wrap Up Before Calls

5.1 Wrapping Up before Going Out

5.2 Computational Model for Handling Callbacks

6 Addressing Furtive Access: the Slicing Model

6.1 Selective Export Mechanisms

6.2 Taking Advantage of Selected Exports

6.3 Slicing: Definitions

6.4 Extended-active Objects

6.5 Relative Consistency and the Slicing Model

6.6 The Slicing Rule

6.7 Two Programming Language Conditions

6.8 Handling Furtive Access Correctly: The Slicing Rule Is Sound

7 Programming Style and Smart Invariants

8 Addressing Reference Leaks

8.1 Strong Rule

8.2 Weak Rule, Call-side

8.3 Weak Rule, Supply-side

9 Initialization, Queries, and Concurrency

9.1 Initialization

9.2 Pure Queries

9.3 Concurrency

9.4 Retained Rules

10 Challenge Problems and Solutions

10.1 Marriage, Non-recursive

10.2 Circular Doubly-linked List

10.3 Observer

10.4 Subject

11 Previous Work

12 Limitations, Future Work, and Conclusion

Acknowledgments

References

Index Terms

Recommendations

What Is Object-Oriented Programming?

A Domain-Theoretic Model Of Nominally-Typed Object-Oriented Programming

Enforcing object protocols by combining static and runtime analysis

Comments

Information

Published In

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Contributors

Other Metrics

Bibliometrics

Article Metrics

Other Metrics

Citations

View options

PDF