1 Introduction
Background. Choreographies, broadly construed, are coordination plans for concurrent and distributed systems [W3C
2004; Object Management Group
2011]. Examples of choreographies include distributed authentication protocols [OpenID Foundation
2014; Sporny et al.
2011], cryptographic protocols [Diffie and Hellman
1976], and multiparty business processes [Object Management Group
2011; W3C
2004]. In software development, programmers use choreographies to agree on the interactions that communicating endpoints should enact to achieve a common goal; then, each endpoint can be programmed independently. The success of this development process hinges on achieving
choreography compliance: when all endpoints are run together, they interact as defined by the choreographies agreed upon [Montesi
2023].
Achieving choreography compliance is hard, because of some usual suspects of concurrent and distributed programming: predicting how multiple programs will interact at runtime is challenging [O’Hearn
2018], and mainstream programming languages do not adequately support programmers in reasoning about coordination in their code [Lu et al.
2008; Leesatapornwongsa et al.
2016]. Additionally, choreographies are complex. At a minimum, choreographies define the expected communication flows among their roles (abstractions of endpoints, like “Alice,” “Bob,” “Buyer,” etc.) [Intl. Telecommunication Union
1996]. However, they often include computational details of arbitrary complexity, for example, pre- or post-processing of data (encryption, validation, anonymisation, etc.), state information, and decision procedures to choose among alternative behaviours. These computational details are essential parts of many protocols, ranging from security protocols to business processes. Figure
1 depicts the common situation where a programmer tries to ensure choreography compliance through their subjective interpretation of the choreography and the manual coding of endpoints.
In response to the challenge of choreography compliance, researchers investigated methods to relate choreographies to endpoint programs—many are reviewed in Ancona et al. [
2016]; Hüttel et al. [
2016]. Initially, these methods focused on simple choreographic languages without computation that were inspired by, e.g., communicating automata, process calculi, and session types [Qiu et al.
2007; Bravetti and Zavattaro
2007; Basu et al.
2012; Honda et al.
2016]. Some ideas developed for choreographies were later adopted in the paradigm of
choreographic programming [Montesi
2013; Carbone and Montesi
2013], where choreographies are written in a Turing-complete programming language that allows for defining arbitrary computation at endpoints. Thanks to the capability of combining computation with coordination, choreographic programming languages can capture realistic protocols that include data manipulation and decision procedures, including encryption strategies (as in security protocols), retry strategies (as in transport protocols), and marshalling (as in application protocols). In choreographic programming, compliance is obtained by construction: given a choreography, a compiler automatically translates it to a set of compliant endpoint implementations. Choreographic programming has been shown to have promising potential in multiple contexts, including information flow [Lluch-Lafuente et al.
2015], distributed algorithms [Cruz-Filipe and Montesi
2016], cyber-physical systems [López et al.
2016; López and Heussen
2017], and self-adaptive systems [Dalla Preda et al.
2017].
The problem. Current approaches to choreographic programming come at a significant cost to modularity, and it remains unclear how the benefits of this paradigm can be applied to mainstream software development.
Typically, the endpoint code that implements a choreography comes in libraries that developers can use in their applications through an Application Programming Interface (API) [Murty
2008; Wilder
2012; Atzori et al.
2010]. For example, a library that implements a choreography for user authentication might provide a method
\(\texttt {authenticate}\) that a web service can invoke to run (its part of) the protocol in collaboration with a connected client. The implementation of
\(\texttt {authenticate}\) might involve a series of communications between the client and the web service. Potentially, a third-party identity provider might be involved as well, and messages might include passwords that should be hashed. Usually, and ideally, all these details are hidden from the API (the signature of the method) exposed to the developer of the web service. This abstraction allows the developer to minimise coupling between their implementation of the functionalities offered by the web service and the choreography used for authentication, bringing the expected benefit: the library implementing the choreography and the rest of the code of the web service can be updated independently and recombined, as long as the API provided by the former remains compatible with the one expected by the latter. This benefit is key to large-scale software development, and it is an initial assumption for modern development practices like microservices and DevOps [Dragoni et al.
2017].
Unfortunately, previous frameworks for choreographic programming do not support modular software development [Montesi
2013; Dalla Preda et al.
2017]. The code that these frameworks generate for each endpoint is a “black box” program without an API: it can be executed, but it is not designed to be composed by programmers within larger codebases. Thus, these frameworks fall short of providing the aforementioned benefit of modularity. Furthermore, the common scenario of programming an endpoint that participates in multiple choreographies is not supported, and neither is programming an endpoint where participating in a choreography is only part of what the endpoint does.
A major factor that makes modularity challenging is that current choreographic languages are based on behavioural models (process calculi, communicating automata, etc.), which makes translating choreographies into libraries based on mainstream abstractions (data, functions, objects, etc.) nontrivial. For the simpler setting of choreographic languages without computation, we know that choreographies can be translated to libraries that offer a “fluid” API. For instance, Scalas et al. [
2017] produce object-oriented libraries whose APIs enforce the invocation of
\(\texttt {send}\) and
\(\texttt {receive}\) methods in the right order: If an endpoint should send, receive, and then send, then the developer will be forced to write something like
\(\texttt {o.send(..).receive(..).send(..)}\). However, this approach leaks the structure of the communication flow implemented by the library; thus, future versions of the library that adopt a different structure (e.g., an unnecessary communication might be removed, or some communications might be bunched together for efficiency) would require rewriting the application that uses the library. Furthermore, this approach does not let the choreography programmer decide how the generated API will look; thus, we cannot use this method to generate drop-in replacements for existing libraries.
In summary, we still have to discover how the principles of choreographic programming can be applied to mainstream programming. This article aims to fill this gap.
This article. We present Choral, a new choreographic programming language that supports modularity and is based on mainstream programming concepts (from object-oriented programming). To demonstrate applicability, Choral is compatible with Java, but our ideas apply to most statically typed, object-oriented languages.
The fulcrum of Choral is a new interpretation of choreographies that builds naturally on top of existing language abstractions: Choral is an
object-oriented choreographic programming language, where choreographies are classes and their instances are objects. The starting point for this interpretation is a generalisation of the key idea found in Lambda 5 [Murphy VII et al.
2004], the model that inspired the research line on multitier programming [Neubauer and Thiemann
2005; Cooper et al.
2006; Serrano et al.
2006; Murphy VII et al.
2007; Weisenburger et al.
2020]. In Lambda 5, each data type is located at a (single) place, which enables reasoning on spatially distributed computation. Choral generalises these types from single to
multiple locations, which allows us to express that an object is implemented choreographically: Choral objects have types of the form
T@(A1, ..., An), where
\(\texttt {T}\) is the usual interface of the object, and
A1, ..., An are the roles that collaboratively implement the object.
As an example, consider the case of a multiparty choreography for distributed authentication, where a service authenticates a client via a third-party identity provider. We can define such a choreography as a Choral class with type DistAuth@(Client, Service, IP) (IP is short for identity provider). The class can implement methods that involve multiple roles. For example, it can offer a method \(\texttt {authenticate}\) with the following signature.
Invoking \(\texttt {authenticate}\) with some \(\texttt {Credentials}\) located at Client returns an authorisation token at Service (\(\texttt {Optional}\) since authentication can fail), denoting the movement of data.
We leverage our object-oriented interpretation of choreographies to develop a methodology for choreography compliance that supports modularity and is compatible with mainstream programming. We depict this methodology in Figure
2. Given the code of a Choral class with some roles, a compiler produces a compliant-by-construction software library in pure Java for each role (“coordination code” in Figure
2): each library contains the local implementation of what its role should do to execute the choreography correctly. These libraries offer Java APIs derived from the source choreographies, which reveal only the details pertaining to the implemented role. When a software developer programs an endpoint that should engage in a choreography, they can just take the library compiled for the role that their endpoint needs to play and use the library through its Java API. Those APIs allow developers to modularly compose multiple libraries within their code (“local code” in Figure
2), thus gaining the ability to participate in multiple choreographies.
The Java code compiled from the method \(\texttt {authenticate}\) for the roles Client and Service has the following signatures, respectively.
The compiled signatures follow the principle that we have just mentioned—that is, for each role, only what pertains to that role is reported. This is why the signature compiled for
Client has a parameter of type
\(\texttt {Credentials}\) and return type
\(\texttt {Unit}\): in the source choreography, the parameter is located at
Client whereas the return type is located at another role.
1 Following the same principle, the signature for
Service has return type
\(\texttt {Optional}< \texttt{AuthToken}>\) and no parameters.
Contributions. We outline our main contributions.
Language. We present Choral, the first choreographic programming language based on mainstream abstractions and interoperable with a mainstream language (Java). The key novelty of the Choral language is that data types are higher-kinded types parameterised on roles. We leverage this feature to bring the key aspects of choreographies to object-oriented programming (spatial distribution, interaction, and knowledge of choice). Choral is also the first truly higher-order choreographic language, where choreographies passed as arguments can carry state and invocations of higher-order choreographies require no centralised coordination [Demangeon and Honda
2012; Dalla Preda et al.
2017].
Integrating object-oriented principles with choreographies brings key benefits in the other direction too, in the sense that we gain a much richer choreographic language than the state-of-the-art. By using subtyping, we can define abstract APIs for choreographies that can be implemented in different ways and communication behaviours, bringing the usual substitution principle for objects to choreographies. Method overloading allows us to specialise computation based on the locations of arguments—which is a new dimension for overloading. Semantic parametricity enables the writing of reusable choreographies that treat uniformly parameters that implement a shared API. These features allow us to generalise choreographies from assuming a fixed communication primitive to user-definable communication methods, thus freeing Choral from commitments to any communication technology or middleware. Furthermore, we can define in Choral the first hierarchy of “channel types” for choreographies, which can be used to represent at the level of types the topological assumptions of a choreography. In general, users are free to define other ‘channel types’ to support more topologies.
We implement a type checker that, in addition to the expected checks for an object-oriented language, detects coding errors related to roles. For example, our checker can rule out computation at a role that erroneously accesses data at another role without proper communication. This makes distribution errors manifest to the programmer. Our typing also supports the reuse of all existing Java classes and interfaces in Choral, because every such type can be lifted to a Choral type located at a single role.
In Choral, choreographies are concrete software artefacts. This poses questions such as what code should go in these artefacts, what code should remain local, and how these two parts of software should interact through APIs. We elicit these questions and report on our experience in addressing them throughout the article, where they become relevant.
Compiler. We implement a compiler that translates Choral source into Java libraries that comply with the source choreography: the code generated for each role performs the actions prescribed by the choreography. With our compiler, the programmer of the choreography is in control of the generated APIs: the APIs for each role follow the same structure found in the choreography, but where all data types and parameters in method signatures that do not pertain to the role are omitted, as we exemplified previously for the \(\texttt {authenticate}\) method.
Testing. We present the first testing tool for choreographic programming. Since choreography implementations are spread over multiple components (one for each role), testing choreographies can be difficult, because it calls for integration testing. Our testing tool leverages Choral to write ‘choreographic tests’ that look like simple unit tests at the choreographic level, but are then compiled to integration tests that integrate the respective implementations of all roles.
Evaluation. We explore the expressivity of Choral with use cases of different kinds, covering security protocols, cyber-physical systems connected to the cloud, and parallel algorithms. For these use cases, we discuss relevant code excerpts and development techniques induced by Choral.
We then move from our examples to real-world comparisons.
First, we show how Choral can be used to transition existing Java programs to choreographies. We reimplement in Choral the Java implementation of Karatsuba’s algorithm for fast multiplication, which yields a parallel implementation. Furthermore, we reimplement a clone of Twitter developed by the Spring team. Implementing this system in Choral requires identifying roles, which makes the original monolithic application much more modular.
Second, we compare Choral to a popular framework for concurrent programming in Java based on actors (Akka). In particular, we identify the key differences in the development processes and resulting architectures induced by the two technologies. We find that Choral provides concrete benefits in avoiding subtle concurrency bugs.
Third, we carry out a quantitative evaluation of how Choral impacts software development and runtime performance. Thanks to its choreographic approach, Choral consistently leads to smaller codebases. Moreover, the impact of our compiler on the speed of the edit-compile cycle is negligible (milliseconds). Finally, we show that the runtime performance of the code generated by Choral is not worse (and often better) than that of alternative implementations in Java and Akka.
Outline. We overview Choral with simple examples in Section
2, and give more realistic use cases in Section
3. The syntax and implementation of Choral are discussed in Section
4, and testing in Section
5. We evaluate Choral in depth in Section
6. Related and future work are discussed in Section
7. We draw conclusions in Section
8.
2 Choral in Practice
We start with an overview of the key features of Choral. First, spatial distribution: the expression of computation that takes place at different roles (Section
2.1). Second, interaction: the coding of data exchanges between roles (Section
2.2). Third, knowledge of choice: the coordination of roles to choose between alternative behaviours (Section
2.3).
The Choral language is quite big. Its usefulness depends on the capability to produce software libraries whose APIs look like “idiomatic” Java APIs, so we chose to incorporate a substantial set of features, which would commonly be considered necessary to use and produce Java APIs: Choral supports classes, interfaces, generics, inheritance, and method overloading. APIs generated by Choral support lambda expressions, in the sense that Java programmers can pass lambda expressions as arguments to our APIs. (Just as in Java, Choral sees these arguments as objects.) Supporting the Java syntax for lambda expressions inside Choral programs is not necessary for our objective, since they can be encoded as objects, so we leave it to future work on ergonomics.
In the rest of this section, we explain the key aspects of Choral by assuming that the reader is familiar with Java. The reader can assume that language constructs that have the same syntax as Java behave as expected (modulo our additions, which we explain in the text).
2.1 Roles and Data Types
Hello roles. All values in Choral are distributed over one or more roles, using the
\(\texttt {@}\)-notation seen in Section
1. The degenerate case of values involving one role allows Choral to reuse existing Java classes and interfaces, lifted mechanically to Choral types and made available to Choral code. For example, the literal
"Hello from A"@A is a string value
"Hello from A" located at role A. Code involving different roles can be freely mixed in Choral, as in the following snippet.
The code above defines a class, \(\texttt {HelloRoles}\), parameterised over two roles, A and B. At line 3, we assign the string "Hello from A" located at A ("Hello from A"@A) to variable \(\texttt {a}\) of type “String at A” (String@A). At line 4, we do the same for a string located at B. Then, at line 5, we print variable \(\texttt {a}\) by using the \(\texttt {System}\) object at A (System@A); at line 6, we do the same for variable \(\texttt {b}\) at role B.
Roles are part of data types in Choral, adding a new dimension to typing. For example, the statement String@A a = "Hello from B"@B would be ill-typed, because the expression on the right returns data at a different role from that expected by the left-hand side.
Formally, in Choral,
\(\texttt {String}\) is a type of a higher kind (or type constructor): it takes a role to return a type of the same kind of Java types [Moors et al.
2008]. The code
String@A represents the instantiation of
\(\texttt {String}\) at role
A. Any Java type is automatically liftable to a Choral type with a single role parameter by following the same reasoning, enabling interoperability. Type constructors in Choral are not limited to a single role in general. We are going to see examples with multiple roles and more complex type parameters later in this section.
From Choral to Java. Given class \(\texttt {HelloRoles}\), the Choral compiler generates for each role parameter a Java class with the behaviour for that role, in compliance with the source class. Here, the Java class for role A is \(\texttt {HelloRoles_A}\) and the class for B is \(\texttt {HelloRoles_B}\).
Each generated class contains only the instructions that pertain to that role. If Java developers want to implement the behaviour of method \(\texttt {sayHello}\) for a specific role of the \(\texttt {HelloRoles}\) choreography, say A, then they just need to invoke the generated \(\texttt {sayHello}\) method in the respective generated class (\(\texttt {HelloRoles_A}\)). If all Java programs interested in participating in \(\texttt {HelloRoles}\) do that, then their resulting global behaviour complies by construction with the source choreography.
Notice that the code compiled for
A and
B will not interact and can therefore proceed fully concurrently, because the choreography does not prescribe so. In general, choreographic programming languages are expected to generate code that interacts only to enact the communications that the programmer specified in the choreography [Montesi
2023]. Choral follows this adequacy principle. We discuss how to program interactions in Section
2.2.
Distributed state. Fields of Choral classes carry state and can be distributed over different roles. For example, a class \(\texttt {DPair}\) can define a distributed structure storing two values at different roles.
Class
\(\texttt {DPair}\) is distributed between roles
A and
B and has two fields,
\(\texttt {left}\) and
\(\texttt {right}\). In general, every class or interface in Choral is always parameterised on at least one role and, hence, it is a type constructor. The class is also parameterised on two data types,
\(\texttt {L}\) and
\(\texttt {R}\), which exemplifies our support for generics [Naftalin and Wadler
2006]. At line 1,
L@C specifies that
\(\texttt {L}\) is expected to be a data type parameterised over a single role, abstracted by
C; similarly for
R@D. Naming the role parameters of
\(\texttt {L}\) and
\(\texttt {R}\) does not add any information in this particular example (we only need to know that they have one parameter). However, naming role parameters in generics is useful for expressing type bounds in
extends clauses, as discussed later in this section. Choral interprets role parameter binders as in Java generics: the first appearance of a parameter is a binder, while subsequent appearances of the same parameter are bound. Observe that the scope of role parameters
C and
D is limited to the declaration of the type parameters
\(\texttt {L}\) and
\(\texttt {R}\), respectively—we use distinct names exclusively for readability. At lines 2 and 3, we have the two fields,
\(\texttt {left}\) and
\(\texttt {right}\), respectively, located at
A and
B as stated by the types
L@A and
R@B, the constructor is at line 4, while accessors to the two fields are at lines 5 and 6.
Data structures like
\(\texttt {DPair}\) are useful when defining choreographies where the data at some role needs to correlate with data at another role, as with distributed authentication tokens. We apply them to a use case in Section
3.1.
2.2 Interaction
Choral programs become interesting when they contain interaction between roles—otherwise, they are simple interleavings of local independent behaviours by different roles, as in \(\texttt {HelloRoles}\).
Choreography models typically come with some fixed primitives for interaction, e.g., sending a value from a role to another over a channel [Carbone et al.
2012]. Thanks to our data types parameterised over roles, Choral is more expressive: We can
program these basic building blocks and then construct more complex interactions compositionally. This allows us to be specific about the requirements of choreographies regarding communications, leading to more reusable code. For instance, if a choreography needs only a directed channel, then our type system can see by subtyping that a bidirectional channel is also fine.
Directed data channels. We start our exploration of interaction in Choral from simple, directed channels for transporting data. In Choral, such a channel is just an object that takes data from one place to another. We can specify this behaviour as an interface.
A \(\texttt {DiDataChannel}\) is the interface of a directed channel between two roles, abstracted by A and B, for transmitting data of a given type, abstracted by the type parameter \(\texttt {T}\), from A to B (hence the number of role parameters in \(\texttt {T}\)). Data transmission is performed by invoking the only method of the interface: \(\texttt {com}\), which takes any value of a subtype of \(\texttt {T}\) located at A, S@A, and returns a value of type S@B. The type parameter \(\texttt {S}\) of method \(\texttt {com}\) has \(\texttt {T}\) as the upper bound (we can read the expression S@Dextends T@D as “for any role D, S@Dextends T@D”) and allows us to transmit data of types more specific than \(\texttt {T}\) without losing type information (as it would be if the signature of \(\texttt {com}\) was simply T@B com(T@A m)).
Parameterising data channels over the type of transferrable data (\(\texttt {T}\)) is important in practice for channel implementors, because they often need to deal with data marshalling. Choral comes with a standard library that offers implementations of our channel APIs for a few common types of channels, e.g., TCP/IP sockets supporting JSON objects and shared memory channels. Users can provide their own implementations.
Using a \(\texttt {DiDataChannel}\), we can write a simple method that sends a string notification from a Client to a Server and logs the reception by printing on screen.
Note that \(\texttt {String}\) is a valid instantiation of \(\texttt {T}\) of \(\texttt {DiDataChannel}\), because we lift all Java types as Choral types parameterised over a single role.
Alien data types. Compiling \(\texttt {DiDataChannel}\) to Java poses an important question: what should be the return type of method \(\texttt {com}\) in the code produced for role A? Since the return type does not mention A (we say that it is alien to A), a naïve answer to this question could be void, as follows.
It turns out that this solution does not work well with expressions that compose multiple method calls, including chaining like \(\texttt {m1(e1 ,e2) .m2(e3)}\) and nesting like \(\texttt {m1(m2(e))}\). As a concrete example, consider a simple round-trip communication from A to B and back.
Method \(\texttt {roundTrip}\) takes two channels, \(\texttt {chAB}\) and \(\texttt {chBA}\), which are directed channels, respectively, from A to B and from B to A. The method sends the input \(\texttt {msg}\) from A to B and back by nested \(\texttt {com}\)s and returns the result at A.
A structure-preserving compilation of method \(\texttt {roundTrip}\) for role A would be as follows.
Observe how the inner method call, \(\texttt {chAB. com}< \texttt{T}> \texttt{(msg)}\), should return something, such that it can trigger the execution of the outer method call to receive the response. Therefore, the \(\texttt {com}\) method of \(\texttt {DiDataChannel_A}\) cannot have void as the return type.
Programming language experts have probably guessed by now that the solution is to use unit values instead of void. Indeed, Choral defines a singleton type \(\texttt {Unit}\) (a final class) that our compiler uses instead of void to obtain Java code whose structure resembles its Choral source code.
We now show the Java code produced by our compiler from \(\texttt {DiDataChannel}\) for both A and B.
Given these interfaces, the compilation of \(\texttt {roundTrip}\) for role A is well-typed and correct Java code. Without our usage of \(\texttt {Unit}\), we would be forced to modify the structure of the compiled code wrt to its source. We chose in favour of our solution, because preserving structure makes it easier to read and debug the compiled code (especially when comparing it to the source choreography), and also makes our compiler simpler.
The users of our compiled libraries are not forced to pass \(\texttt {Unit}\) arguments to methods. Indeed, for methods like \(\texttt {com}\) of \(\texttt {DiDataChannel_B}\), our compiler generates corresponding ‘courtesy methods’ that take no parameters and inject \(\texttt {Unit}\)s automatically.
Bidirectional channels. An immediate generalisation of directed data channels brings us to bidirectional data channels, specified by \(\texttt {BiDataChannel}\).
A \(\texttt {BiDataChannel}\) is parameterised over two types: \(\texttt {T}\) is the type of data that can be transferred from A to B and, vice versa, \(\texttt {R}\) is the type of data that can be transferred in the opposite direction. This is obtained by multiple type inheritance: \(\texttt {BiDataChannel}\) extends \(\texttt {DiDataChannel}\) in one and the other direction, which allows us to modularly use a bidirectional data channel in code that has the weaker requirement of a directed data channel in one of the two supported directions. Distinguishing the two parameters \(\texttt {T}\) and \(\texttt {R}\) is useful for protocols that have different types for requests and responses, like HTTP. Extending \(\texttt {DiDataChannel}\) twice does not result in any clashes, since A and B play different roles in each supertype. This “twin” inheritance results in the overload of method \(\texttt {com}\), one for each communication direction supported by the channel. This overload does not result in any clash in the compilation, as illustrated by the code generated for A (the code generated for B is symmetric).
We discuss more types of channels (including symmetric channels) in Section
2.4 and provide more details on inheritance and overloading in Section
4.
Forward chaining. We use bidirectional channels to define a choreography for remote procedure calls, called \(\texttt {RemoteFunction}\), which leverages the standard Java interface \(\texttt {Function}< \texttt{T,R}>\).
In our experience with programming larger Choral programs (as those in Section
3), we found it rather natural to compose method invocations that transfer data, as in line 8 of
\(\texttt {RemoteFunction}\). In these chains, we read data transfers from right to left (innermost to outermost invocation), but most choreography models in the literature use a left-to-right notation (as in “Alice sends 5 to Bob”). To make Choral closer to that familiar choreographic notation, we borrow the forward chaining operator
> > from F# [Petricek and Skeet
2009], so that
exp > > obj::method is syntactic sugar for
\(\texttt {obj.method(exp)}\). For example, we can rewrite method
\(\texttt {call}\) of
\(\texttt {RemoteFunction}\) as follows, which is arguably more readable and recovers a more familiar choreographic notation.
Using Choral libraries. As mentioned for Channels, when we compile the \(\texttt {RemoteFunction}\) class above, we obtain two Java classes: a \(\texttt {RemoteFunction_Client}\), which sends some data to the Server for processing and waits for its response, and a \(\texttt {RemoteFunction_Server}\), which, upon reception, feeds the data into a \(\texttt {Function}\) and sends back to the Client its result.
The \(\texttt {RemoteFunction_Server}\) is an interesting example of how users interact with Choral libraries. The code (snippet) generated from Choral is:
A user of the \(\texttt {RemoteFunction_Server}\) can interact in the choreography by providing the definition of the \(\texttt {Function}\) at the creation of the object. In general, this is how we expect users to integrate Choral-generated code with their “local code,” i.e., code parametric to the choreography that users can implement locally, without any coordination with the other participants (save the APIs induced by Choral-generated code). For example, the snippet below is from a Java class that uses \(\texttt {RemoteFunction_Server}\) to provide a remote procedure for checking if an integer is even.
Here, at line 2 (second argument of the constructor), we provide the definition of the \(\texttt {Function}\) using Java Lambdas functional syntax.
2.3 Knowledge of Choice
Knowledge of choice is a hallmark challenge of choreographies: when a choreography chooses between two alternative behaviours, roles should coordinate to ensure that they agree on which behaviour should be implemented [Castagna et al.
2011].
We exemplify the challenge with the following code, which implements the consumption of a stream of items from a producer A to a consumer B.
Method \(\texttt {consumeItems}\) takes a channel from A to B, an iterator over a collection of items at A, and a consumer function for items at B. Role B works reactively, where its consumer function is invoked whenever the stream of A produces an element: If the iterator can provide an item (line 3), then it is transmitted from A to B, consumed at B, and the method recurs to consume the other items (line 4).
The reader familiar with choreographies should recognise that this method implementation is wrong, due to (missing) knowledge of choice: The information on whether the if-branch should be entered or not is known only by A (since it evaluates the condition), so B does not know whether it should run lines 4 and 5 (receive, consume, and recur), or do nothing and terminate.
In choreographic programming, knowledge of choice is typically addressed by equipping the choreography language with a “selection” primitive to communicate constants drawn from a dedicated set of “labels” [Carbone and Montesi
2013; López et al.
2016]. This makes it possible for the compiler to build code that can react to choices made by other roles, inspired by a theoretical operator known as merging [Carbone et al.
2012]. In Choral, we adapt this practice to objects. Notably, Choral is expressive enough that we do not need to add a dedicated primitive or a dedicated set of labels.
We define a method-level annotation @SelectionMethod, which developers can apply only to methods that can transmit instances of enumerated types between roles (the compiler checks for this condition). For example, we can specify a directed channel for sending such enumerated values with the following \(\texttt {DiSelectChannel}\) interface.
Our compiler assumes that implementations of methods annotated with @SelectionMethod return at the receiver the same value given at the sender. (This is part of the contract for channels, and it is a standard assumption in implementations of choreographies.)
Typically, channels used in choreographies are assumed to support both data communications and selections. We can capture this functionality with \(\texttt {DiChannel}\) (directed channel), a subtype of both \(\texttt {DiDataChannel}\) and \(\texttt {DiSelectChannel}\) (we include inherited methods for convenience).
Using \(\texttt {DiChannel}\)s, we can update \(\texttt {consumeItems}\) to respect knowledge of choice.
Differently from the previous, broken implementation of
\(\texttt {consumeItems}\), now role
A sends a selection of either
\(\texttt {GO}\) or
\(\texttt {STOP}\) to
B. Role
B can now inspect the received enumerated value to infer whether it should execute the code for the if- or the else-branch of the conditional. This information is exploited by our static analyser to check that
\(\texttt {consumeItems}\) respects knowledge of choice, and also by our compiler to generate code for
B that reacts correctly to the choice performed by
A. (We provide a more extensive example, including the code compiled for the receiver, in Section
3.1.)
Our compiler supports three features to make knowledge of choice flexible. First, our check for knowledge of choice works with arbitrarily nested conditionals. Second, knowledge of choice can be propagated transitively. Say that a role
A makes a choice that determines that two other roles
B and
C should behave differently, and
A informs
B of the choice through a selection. Now either
A or
B can inform
C with a selection, because our compiler sees that
B now possesses knowledge of choice. Third, knowledge of choice is required only when necessary: If
A makes a choice and another role, say
B, does not need to know, because it performs the same actions (e.g., receiving an integer from
A) in both branches, then no selection is necessary. We explain the technicalities of this check in Section
4.
2.4 The Family of Choral Channels
Choral types give us a new way to specify requirements on channels that prior work implicitly assumed, leading to the definition of a family of channel interfaces diagrammed in Figure
3.
From the left-most column in Figure
3, at the top, we find
\(\texttt {DiDataChannel}\), representing a directed channel parameterised over
\(\texttt {T}\) (the type of data that can be sent). We obtain
\(\texttt {BiDataChannel}\), a bidirectional data channel, by extending
\(\texttt {DiDataChannel}\) once for each direction: ① it binds the role parameters of one extension in the same order given for the role parameters of
\(\texttt {BiDataChannel}\), giving us a direction from
A to
B and ② it binds the role parameters of the other extension in the opposite way, giving us a direction from
B to
A. The result is that
\(\texttt {BiDataChannel}\) defines two
\(\texttt {com}\) methods: one transmitting from
A to
B, the other from
B to
A. The last lines in ① and ② in Figure
3 complete the picture: the first generic data type
\(\texttt {T}\) binds data from
A to
B, the second generic data type
\(\texttt {R}\) binds data from
B to
A. The
\(\texttt {SymDataChannel}\) in Figure
3, by extending the
\(\texttt {BiDataChannel}\) interface and binding the two generic data types
\(\texttt {T}\) and
\(\texttt {R}\) with its only generic data type
\(\texttt {T}\), defines a bidirectional data channel that transmits one type of data, regardless of its direction.
The right-most vertical hierarchy in Figure
3 represents channels supporting selections and it follows a structure similar to that of data channels. A
\(\texttt {DiSelectChannel}\) is a directed selection channel and a
\(\texttt {SymSelectChannel}\) is the bidirectional version—there is no
\(\texttt {BiSelectChannel}\) since both directions exchange the same enumerated types.
The vertical hierarchy in the middle column of Figure
3 is the combination of the left-most and right-most columns. Interface
\(\texttt {DiChannel}\) is a directed channel that supports both generic data communications and selections.
\(\texttt {BiChannel}\) is its bidirectional extension (③ and ④ in Figure
3), and
\(\texttt {SymChannel}\) is the symmetric extension of
\(\texttt {BiChannel}\). The snippet below contains the definition of the interface
\(\texttt {BiChannel}\).
This definition means that for any pair of distinct roles C, D and for any pair of types \(\texttt {S}\), \(\texttt {P}\) (with one role parameter), BiChannel@(C, D)<S,P> is a subtype of DiChannel@(C,D)<S>, DiChannel@(D,C)<P>, BiDataChannel@(C,D)<S,P>, and BiSelectChannel@(C,D).
Implementing Choral channels. Our channel interfaces can be implemented directly in Choral or in Java. We exemplify the latter case, which lets us also show how one can carry out this task by fully leveraging the existing Java language and ecosystem.
Let us say we want to implement a symmetric channel over TCP/IP that a client (say A) is going to use for transmitting strings to a server (say B). We can implement this channel by writing a class \(\texttt {TCPClientChannel}\) that implements the interface generated for A from \(\texttt {SymChannel}\) (instantiating its generic with \(\texttt {String}\)). The snippet below shows the structure of such an implementation.
The \(\texttt {TCPClientChannel}\) uses \(\texttt {SocketChannel}\) from the Java standard library. The class also offers a factory method for connecting to a remote address in the standard way. This method creates a channel that the user can pass as an argument to the code generated from a choreography. In particular, the \(\texttt {TCPClientChannel}\) is designed to provide a conventional socket set-up experience for Java programmer, where the expected contract relies on opening a connection based on a \(\texttt {SocketAddress}\). Implementing the server-side counterpart of \(\texttt {TCPClientChannel}\) is straightforward and follows the same approach.
2.5 Handling Exceptions
Typically, choreographic languages assume reliable communications [Carbone et al.
2008; Carbone and Montesi
2013; Dalla Preda et al.
2017]. The only exception is the language theory in Montesi and Peressotti [
2017], which shows that one can relax this assumption, by allowing the choreographic language to handle local exceptions. In Choral, we follow the same strategy, which we briefly illustrate here.
Choral can invoke Java code, which might raise an exception. Plain Java code is always located at one role, and therefore the same holds for exceptions (exceptions are “local”). We exemplify how we treat exceptions with the following choreography, where a role B uses the Java standard library to save on disk some text communicated from another role A.
Above, we start by communicating the text to be saved from A to B (line 4). We then declare at B a \(\texttt {result}\) variable (line 5), which will store either a success or error message—that B later communicates to A (line 12). At line 7, B attempts to save the received text.
This choreography might incur execution errors related to communication or file writing. Exceptions encapsulate these errors. For example, the invocation of method \(\texttt {writeString}\) might throw an \(\texttt {IOException}\) located at B, which we handle with the try-catch block at lines 6–11.
Method \(\texttt {com}\) can throw exceptions too, depending on the implementation of channel \(\texttt {ch}\). Channels for remote communication (e.g., based on TCP/IP sockets) in the Choral library use the following strategy: the sender attempts at sending a message until its network stack accepts the task (using exponential backoff and bound by a maximal number of attempts, to guarantee termination); likewise, the receiver attempts at receiving until a timeout expires. If the sender ultimately fails at relaying the message to its local network stack, then the channel throws a \(\texttt {SendException}\) at the sender (A in our example). If the receiver timeouts before receiving a message, then the channel raises a \(\texttt {TimeoutException}\) at the receiver (B in our example).
As a design choice, we left the exceptions of method
\(\texttt {com}\) unchecked. The idea is that the implementation of channels should do their best to deliver messages, and when this is not possible the local code that uses the code generated from a choreography should deem the execution of the choreography unsuccessful. The local code is free to catch these exceptions and attempt recovery, for example by executing the choreography again (as in actor frameworks [Wyatt
2013]).
However, our implementation of \(\texttt {com}\) in the Choral standard library is just a default; Choral does not hardcode any communication semantics. The user is free to implement alternative communication methods that expose an API, which the caller choreography might use to handle network errors, e.g., lost messages. For instance, we might account for lossy communications between A and B within the above choreography as follows.
Channel \(\texttt {chAB}\) is now a \(\texttt {LossySymDataChannel}\) and, in addition to method \(\texttt {com}\), it offers also method \(\texttt {lossyCom}\). The latter does not throw exceptions in case of communication failures but rather returns an \(\texttt {Optional}\) value that contains the received value in case of success or is empty otherwise.
Choosing which errors a choreography should deal with and which errors should be raised as unrecoverable exceptions to the local code is a design trade-off that derives from the usual tension between robustness versus simplicity. This trade-off is typical of coordination protocols and exists independently from Choral, which is why we decided to leave the programmer free to navigate this spectrum. Gathering from our own experience with Choral, we recognised the following design principles. Protocols whose design assumes a reliable network layer should not deal with communication errors within the choreography (e.g., the Diffie-Hellman protocol for key exchange, which we briefly describe below and that we implemented for our evaluation in Section
6). Contrarily, choreographies implementing protocols designed to deal with network errors should specify the handling of those errors (e.g., implementations of objects dedicated to data transfer, like our channels). Choral is quite flexible regarding these aspects. A channel API can offer methods that both raise exceptions—like method
\(\texttt {com}\), meaning that the communication is essential—or wrap failures in data types (raising no exceptions)—like method
\(\texttt {lossyCom}\), meaning that the communication is not essential and that the choreography can handle internally the failure. The programmer can use the different methods within the same choreography to pinpoint which communications are deemed essential and which are not.
2.6 What Goes in a Choreography?
We have just looked at the design issue of deciding whether to deal with errors in the choreography or in the local code that uses the (communication code compiled from the) choreography. This is an instance of the more general issue of protocol design: what should be part of a choreography? This issue exists even when designing choreographies informally (without Choral), because one needs to choose what details are fixed in the protocol and what is instead left to the discretion of the local code. There is no one-size-fits-all solution, since these choices are influenced by the concrete use case that the choreography deals with.
Consider, for example, the widely adopted Diffie-Hellman protocol for cryptographic key exchange [Diffie and Hellman
1976]. Integral parts of the protocol specify both computation and communication. In the protocol, two parties, e.g.,
Alice and
Bob, use two pairs of keys (a private and a public one) to generate a shared secret, which they can later use for symmetric encryption. Formally, let
p be a prime number and
g be a primitive root modulo
p,
\({\color{magenta}{sA}}\) be a secret key held by
Alice, and
\({\color{magenta}{sB}}\) be a secret key held by
Bob. First,
Alice computes her public key
\({\color{green}{pA}} = (g^{\color{magenta}{sA}} \bmod p)\) and, likewise,
Bob computes his public key
\({\color{green}{pB}} = (g^{\color{magenta}{sB}} \bmod p)\). Then,
Alice and
Bob exchange their public keys, which they can use to generate their shared secret
s:
Since the computations performed by
Alice and
Bob are essential to the protocol, any faithful Choral implementation shall include those details too: doing otherwise would mean implementing a different protocol. The following is a snippet of the implementation that we have written for our evaluation in Section
6 (with variables renamed to match our description above).
Differently from these computational details, the Diffie-Hellman protocol does not fix the implementation of the channel used to communicate data. It is therefore reasonable that a Choral implementation of the protocol is parameterised on this implementation—we reflected this condition by having \(\texttt {channel}\) as a parameter of the method that contains our code above.
An example of a choreography where the definition of computation is completely abstracted away is the
\(\texttt {consumeItems}\) method in Section
2.3. The choreography fixes the coordination between the participants, but not how they produce or consume the data to be exchanged. The latter is to be defined by either local code or another choreography that invokes
\(\texttt {consumeItems}\).
In general, how much computation should be defined in a choreography forms a spectrum. A “good” choreographic programming language should thus give freedom to define or abstract away computation at will. Choral provides this capability through the standard facilities of object-oriented programming (parameters, inheritance, etc.).
4 Implementation
We discuss the main elements of the implementation of Choral. First, we show its syntax and comment on the main differences with Java’s. Then, we present the Choral type checker, including examples of the main errors related to roles that it detects and related error messages. Finally, we describe the key components of the Choral compiler.
4.1 Language
Figure
5 displays the grammar of Choral; dashed underlines denote
optional terms and solid overlines denote sequences of terms of the same sort. We omit syntax for packages and imports, which is as in Java. Reserved identifiers like
super and
this are considered identifiers in the grammar, but our compiler treats them like their Java counterparts. The key syntactic novelties are
underlined; they consist of (i) syntax for declaring and instantiating role parameters and (ii) the forward chaining operator
>> (cf. Section
2).
Role parameters have a separate namespace and always appear in expressions like @(A1,...,An) that follow the name of a class, interface, enum, or type parameter e.g., DiChannel@(A,B). Moreover, role parameters are introduced only by the declaration of a type (e.g., class Foo@(A,B)) or a type parameter (e.g., < T@(A,B) extends Foo@(A,B) & Bar@(B,A)>) and their scope is limited to the defining type, similar to type parameters in Java. The snippet below contains an example of shadowing of role parameters; for each use of role A, we show its binding site with an arrow.
The Choral type checker covers all common Java type errors (illegal type conversions, access to type members, etc.), as exemplified below. Indeed, when checking a Choral program with exactly one role parameter, the Choral type checker acts exactly like its Java counterpart.
Roles. The novelties compared to Java compilers emerge when two or more roles are involved. In these settings, programmers can make new kinds of errors that are specifically about the misuse of role-parameterised types—therefore, these errors are pertinent to Choral. In many of the examples discussed so far, we can think of role parameters as Java generics. Although this is a reasonable approximation, some care is necessary when handling type instantiation due to some substantial differences between role and type parameters.
One type of these errors is that data types are instantiated using incompatible roles. Instances of the same type with different role parameters represent values located at different roles, which restricts their usage.
The order in which roles appear carries meaning, since role parameters are positional—like type parameters in Java generics.
Differently from Java generics, role parameters cannot appear multiple times in the same type, since this corresponds to requiring that the same participant plays multiple roles in the same choreography. In the snippet below, A must play both the sender and receiver for the directed channel \(\texttt {c}\).
Forbidding role aliasing is an established restriction in choreographic programming, since aliasing introduces self-communication, which would potentially break deadlock-freedom and the capability to produce separate code for each role (unless roles are provably not-aliased).
Subtyping. Choral types form a hierarchy defined following the same principles used by Java. This hierarchy is used to check if values are compatible type as expected.
Classes and interfaces define their supertypes by extending and implementing other classes and their interfaces with the same set of roles. This restriction provides a substitution principle that elicits all roles involved in a choreography.
In some cases, “hidden roles” in choreographies might be useful, e.g., to add external auditing or data replication as an extension of an existing choreography. Unfortunately, this introduces security concerns (channels may have hidden bystanders) or complex communication semantics (what is the meaning of sending a ReplicatedList@(A,B) over a channel expecting a List@A?). These are general open problems for choreographies, left to future work.
Cyclic inheritance is not allowed and the type checker does not discriminate over role parameters. As an example, consider the \(\texttt {SymChannel}\) interface; given its symmetric nature, one might be tempted to force this equality by having SymChannel@(A,B) subtype SymChannel@(B,A).
However, allowing declarations like the one above in Choral would result in cyclic inheritance errors in Java, as exemplified by the following “manual” compilation of the code above.
To have channels that are instances of both SymChannel@(A,B) and SymChannel@(B,A) one needs to define a subtype of both as in the snippet below.
By returning an instance of the same interface but with the roles flipped, the method \(\texttt {flip()}\) introduced by the interface PeerChannel@(A,B), prescribes that the roles A and B are interchangeable peers.
Finally, primitive types (int@A, bool@A, etc.) follow the same rules of Java for subtyping, conversions, autoboxing, and autounboxing (when roles match, otherwise the compiler will return a role mismatch error).
Overloading. The Choral type checker refines overload equivalence: it can discriminate overloaded methods by considering roles. For example, m(Char@B x) and m(Char@A x) can be distinguished, because one parameter is located at A whereas the other at B. However, we need to be careful with preventing potential clashes in the compiled Java code. Consider the following snippet and error message.
The last two signatures are distinguishable in Choral, since each method has different parameter types. However, this information is only available to role A, while the projection of both signatures at role B coincide (they would both be void m()). This is an instance of knowledge of choice but, differently from conditionals, it cannot be addressed locally (within the class/interface), because extending classes may introduce new branches and new points of choice by overriding and overloading, as in the example below.
Exceptions. Like every other type lifted from Java, exceptions are located at one role. This design choice allows us to preserve the expected type hierarchy in the generated code and have \(\texttt {java.lang.Exception}\) as the supertype of all exceptions. The Choral compiler then enforces that a try-catch block is located at exactly one role.
Allowing multiple roles in the mechanics of exceptions introduces a knowledge-of-choice situation where all roles need to obtain information about which handler to execute, if any, and when. The general problem of exceptions and their choreographic handling has been investigated in some theories, but all models proposed so far assume reliable communications among all roles and either rely on specialised orchestration primitives in the target language (i.e., some form of middleware) [Carbone et al.
2008; Carbone
2009; Fowler et al.
2019] or synthesise new communications for recovery [Neykova and Yoshida
2017].
4.2 Compiler
The Choral compiler consists of several steps organised in a pipeline, which we illustrate in Figure
6. From left to right, the first step is (as expected) parsing the input Choral source code to obtain an Abstract Syntax Tree (AST)—the chain operator
>> is desugared in this step. Next, we perform type checking as previously discussed. This step also annotates the nodes in the AST with type information, which is used in the following projection step. The next step—projection—transforms this annotated AST into a collection of Choral ASTs, each representing the implementation of a single role. At this stage, all types are located at exactly one role, representing the fact that all code is fully local. Finally, in the last step, we output Java code by erasing all role annotations.
We discuss the most important aspects of projection. For clarity, we model and present it as a partial function from (well-typed) Choral terms to Java code: the projection of a Choral term
Term on a role
A, written
\({\color{skyblue}{(\!|}}Term{\color{skyblue}{|\!)}}^{\tt{\color{skyblue}{A}}}\), is a Java term that implements the behaviour of
A in
\({Term}\). Intuitively, this passage covers the last two steps in Figure
6. The full definition of projection is given in Appendix
A.
The projection of a Choral class, interface, or enum generates a corresponding Java term for each role parameter. If there are two or more roles, then each Java artefact name is suffixed with the role that it implements, e.g., the Java class compiled from class Foo(A,B) for role A is called \(\texttt {Foo_A}\). If the Choral class has exactly one role, then we use the same name, e.g., class \(\texttt {Integer@A}\) becomes class \(\texttt {Integer}\)—this practice minimises the friction of integrating Java types within Choral.
The projection \({\color{skyblue}{(\!|}}TE{\color{skyblue}{|\!)}}^{\tt{\color{skyblue}{A}}}\) of a type expression \({TE}\) at a role A is recursively defined below—we use the auxiliary function roleName(id,i) to retrieve the name of the ith role parameter from the definition of id.
The projection \({\color{skyblue}{(\!|}}Exp{\color{skyblue}{|\!)}}^{\tt{\color{skyblue}{A}}}\) of an expression \({Exp}\) at role A is defined following a similar intuition: it is a recursive stripping of role information as long as A occurs in the type of \({Exp}\) or any of its subterms (written A \(\in \mathrm{rolesOf}(\mathit {Exp})\)), otherwise it is the only instance of the singleton \(\texttt {Unit}\) (stored in its static field \(\texttt {id}\)), as illustrated by the cases of static field access and constructor invocation below:
The projection
\({\color{skyblue}{(\!|}}Stm{\color{skyblue}{|\!)}}^{\tt{\color{skyblue}{A}}}\) of a statement
\(\mathit {Stm}\) at
A is defined following the above intuition, save for the cases of conditionals and selections, which require care to address knowledge of choice (cf. Section
2.3). Specifically, the rule for projecting
\(\texttt {if}\) statements: for the role evaluating the guard (read from its type), it preserves the conditional; for all other roles, the
\(\texttt {if}\) disappears and it is replaced by the projection of the guard (since it might have side effects) followed by the
merging \({\color{red}{\sqcup}}\) of the projections of the bodies of the two branches and the projection of the continuation
\(\mathit {Stm}\).
The merge operator
\(\mathit {Stm}{\color{red}{\sqcup}}\mathit {Stm}^{\prime }\) is a partial operator that tries to combine branching code [Carbone et al.
2012], which we adapt to Java for the first time. Essentially, given two Java terms, merging recursively requires them to be equivalent
unless they are switch statements. Appendix
A contains the full definition of merging. Here, we report its most interesting case, merging switch statements:
Above, the merging of two switch statements is a switch whose guard is the merging of the original guards (\(\mathit {Exp}{\color{red}{\sqcup}}\mathit {Exp}^{\prime }\)). In the merging, for each case present in both the input switches (\(\mathit {id}_a, \ldots , \mathit {id}_x\)), we get a case in the result whose body merges the respective bodies of the original cases; all cases that are not shared are simply put in the result as they are (the lists of cases \(\overline{\hbox{$\texttt {case}$}. \mathit {id}_y. \hbox{${-> }$} . \mathit {Stm}_y }\) from the first and \(\overline{\hbox{$\texttt {case}$}. \mathit {id}_z: \mathit {Stm}_z }\) from the second).
An example of the result of merging was presented for
\(\texttt {DistAuth_Client}\) in Section
3.1, where the cases for
\(\texttt {OK}\) and
\(\texttt {KO}\) are combined from the respective projections for
Client of the two branches in the source choreographic conditional evaluated by
IP. These cases are produced by the rule for projecting selections, which applies to statements of the form
\(\mathit {Exp}\hbox{$\texttt {;}$}. \mathit {Stm}\) when
\(\mathit {Exp}\) calls (possibly in a chain call) a method annotated with
@SelectionMethod. (Our type checker checks that these annotations are used only for methods that take enumerated types as parameters, cf. Section
2.3.) For compactness, let
\(S = \overline{{Exp}.}{\langle \overline{{TE}}\rangle }{id}_1(\mathit {id}_2@{\tt{\color{skyblue}{A}}}^{\prime }.{id}_3)\) where
@SelectionMethod \(\in \texttt {annotations}(\mathit {id}_1)\)For the recipient of the selection (first case), the statement becomes a switch on the projection of the \(\mathit {Exp}\)ression that will receive the selection, while the projection of the continuation \(\mathit {Stm}\) becomes the body of the corresponding case in the argument. The projection for the other roles (second case) is standard, projecting the \(\mathit {Exp}\)ression followed by the projection of the continuation \(\mathit {Stm}\).
Our implementation of merging is smart enough to deal with some “non-effectful” usages of \(\texttt {Unit}\). For instance, consider the following choreography:
If we project it at a role different from
A, say
B, then we obtain the code
Unit.id(Unit.id) for the then-branch, and
\([blank]\) for the (missing) else-branch. These fragments are not mergeable, but our compiler uses a
unit-normalising operator, given in Appendix
A, which transforms also the first fragment into
\([blank]\) by removing the irrelevant usages of
\(\hbox{$\texttt {Unit}$}\).
5 Testing
Testing implementations of choreographies is challenging, since the distributed programs of all participants need to be integrated (integration testing). We introduce ChoralUnit, a testing tool that enables the writing of integration tests as simple unit tests for choreographic classes.
Writing tests. Following standard practice in object-oriented languages and inspired by JUnit, tests in ChoralUnit are defined as methods marked with a
@Test annotation [Hamill
2004; Murphy VII et al.
2004]. For example, we can define the following unit test for the
\(\texttt {VitalsStreaming}\) class from Section
3.2.
The test method \(\texttt {test1}\) checks that data is pseudonymised correctly by \(\texttt {VitalsStreaming}\). Test methods must be annotated with @Test, be static, have no parameters, and return no values.
At lines 4 and 5, we create a channel between the Device and the Gatherer by invoking the \(\texttt {TestUtils .newLocalChannel}\) method, which is provided by ChoralUnit as a library to simplify the creation of channels for testing purposes. This method returns an in-memory channel, which both Device and Gatherer will find by looking it up in a shared map under the key "VST_channel 1". Thus, both roles must have the same key in their compiled code, which is guaranteed, here, by the fact that the expression "VST_channel 1"@[Device,Gatherer] is syntactic sugar for "VST_channel 1"@Device, "VST_channel 1"@Gatherer.
At lines 6 and 7, we create an instance of \(\texttt {VitalsStreaming}\) (the choreography we want to test). We use a \(\texttt {FakeSensor}\) object to simulate a sensor that sends some data containing sensitive information (omitted). We then invoke the \(\texttt {gather}\) method, passing an implementation of a consumer that checks whether the data received by the Gatherer has been pseudonymised correctly.
Given a class like \(\texttt {VitalsStreamingTest}\), ChoralUnit compiles it by invoking our compiler with a special flag (\(\texttt {-annotate}\)). This flag makes the compiler annotate each generated Java class with an @Choreography annotation that contains the name of its source Choral class and the role that the Java class implements.
When the compilation is finished, we can invoke ChoralUnit to run the tests from the class \(\texttt {VitalsStreamingTest}\). Once launched, the tool finds all the Java classes with an @Choreography annotation whose \(\texttt {name}\) value corresponds to \(\texttt {VitalsStreamingTest}\). By construction, each discovered class has a method with the same name, corresponding to the namesake method from the source Choral test class (\(\texttt {test1}\), in our example). ChoralUnit exhaustively runs all the tests found in the test class. Namely, for each @Test-annotated method and for each class generated from the Choral source, ChoralUnit starts a thread that runs the local implementation of that method implemented by that class.
In our example, \(\texttt {VitalsStreamingTest}\) is compiled to a class for Device and another for Gatherer, each with a \(\texttt {test1}\) method. Thus, ChoralUnit starts two threads, one running \(\texttt {test1}\) of the first generated Java class and the other running \(\texttt {test1}\) of the second generated Java class.
Multiparty assertions. In the previous example, we have written an assertion (in
\(\texttt {PseudoChecker}\)) that checks a condition at a single role (
Gatherer). Sometimes, it is useful to assert conditions that involve multiple roles. A typical example is testing the correct implementation of protocols that aim at making two parties agree on a symmetric cryptographic key, like the Diffie-Hellman protocol [Diffie and Hellman
1976]. In particular, after running the protocol, the two participants (say
A and
B) should have the same key. We can express this assertion as follows.
Above, method \(\texttt {assertEquals}\) of class \(\texttt {Assert2}\) uses the channel \(\texttt {chAB}\) to communicate the key at A (\(\texttt {keyA}\)) from A to B, and then checks locally at B that it is equal to the key at B (\(\texttt {keyB}\)). If the check fails, then an assertion error is raised at B.
Class \(\texttt {Assert2}\) can be user-defined, and likewise, developers can define classes that allow for assertions that involve more roles (e.g., \(\texttt {Assert3}\), \(\texttt {Assert4}\), etc.). In these implementations, the user can also freely code different protocols for communicating the data among the participants.
7 Related Work, Discussion, and Future Work
Choral is a
choreographic programming language, in that it makes the flow of interactions and their related computations manifest from a global viewpoint [Montesi
2013]. While Choral suffices already in tackling different kinds of use cases, as we have discussed in this article, the literature on choreographies is vast. Choral includes and generalises many features found in previous work on choreographies. There are other features that we have not considered in this work, along with open problems that we pointed out when appropriate in the previous sections. We discuss related work and other potential future developments of Choral in the rest of this section.
Previous implementations of choreographic programming. The idea of synthesising local participant specifications that comply with choreographies has been a hot research topic for more than 20 years, and work in this line of research is typically based on automata or process calculi abstractions [Alur et al.
2000; Qiu et al.
2007; Basu et al.
2012; Honda et al.
2016; Autili et al.
2018]. Previous implementations of choreographic programming consist of Chor [Carbone and Montesi
2013] and AIOCJ [Dalla Preda et al.
2017], which are based on process calculi and generate executable Jolie code. Compared to them, Choral solves the modularity problems mentioned in the Introduction, by revisiting choreographies under the light of mainstream abstractions. Another advantage is that the types of channels needed by a choreography are made explicit and can be user-defined [Qiu et al.
2007; Carbone et al.
2012; Carbone and Montesi
2013; Honda et al.
2016].
Other approaches to spatially distributed programming. The types that support our choreography-as-objects interpretation have been inspired by ideas found in modal logics for mobile ambients [Cardelli and Gordon
2000] and, later, in the line of work on multitier programming [Murphy VII et al.
2004; Neubauer and Thiemann
2005; Cooper et al.
2006; Serrano et al.
2006; Murphy VII et al.
2007; Liu et al.
2009; Weisenburger et al.
2018]. In the words of Murphy VII et al. [
2004], these works represent other approaches to “spatially distributed computation.” For example, in the most recent incarnation of multitier programming (ScalaLoci, by Weisenburger et al. [
2018]), a distributed application is essentially defined as a single program that composes different functions, each localised at a single participant. A function can then invoke special primitives to request remote computation by another participant, whose implementation must always be ready for such requests [Weisenburger et al.
2020]. Differently from choreographies, this makes the flow of communications implicit and dependent on an underlying middleware—indeed, multitier programming was not designed to address the problems of defining choreographies and addressing choreography compliance as an aim. Choral generalises data types localised at a single participant to data types localised at many participants (roles), which enables our novel development process for choreography-compliant libraries. Castro-Perez and Yoshida [
2020] explored the parallelisation of a simple multitier first-order functional language, for which they can infer abstract (computation is not included) choreographies of the communication flows that these programs can enact; Choral could be a candidate implementation language for this kind of models. Fowler et al. [
2019] explored the idea of incorporating simple choreographic languages (without computation) to ensure that multitier code between two participants enacts specific protocols.
Choral’s clear relation to the ideas found in logics for mobile ambients has already proven useful. In particular, Giallorenzo et al. [
2021] use Choral to kickstart an investigation of the links and differences between choreographic and multitier programming, by taking Choral and ScalaLoci as representative languages. After identifying the core abstractions that differentiate the two approaches, the authors provide algorithms for translating Choral code into ScalaLoci code and vice versa. Going from a multitier program to a choreographic one requires synthesising one of the many possible protocols (a choreography) that implements the necessary communications to execute the multitier program (which does not specify this aspect). This connection paves the way for joint research and cross-fertilisation between the two communities [Giallorenzo et al.
2021].
Higher-order choreographies. Interpreting choreographies as objects enables, for the first time, higher-order composition of choreographies that carry state (the fields of the objects): stateful choreographies (objects) can be passed as arguments. Stateful choreographies have been investigated before without higher-order composition—see, for example, the works by Carbone et al. [
2012]; Chen and Honda [
2012]; Cruz-Filipe and Montesi [
2020]. Demangeon and Honda [
2012] studied how parameters that abstract choreographies can be expanded syntactically, but their choreographies cannot carry state and there must be a role that acts as an orchestrator to “enter” into a choreography (whereas in Choral, control is fully distributed). In the setting of multitier programming, Weisenburger and Salvaneschi [
2019] introduced a module system to write multitier programs as compositions of submodules. Differently from Choral and the work by Demangeon and Honda [
2012], their approach requires fixing roles statically, whereas in our case roles can be freely instantiated at runtime—for example, our merge sort example in Section
3.3 exploits this feature when roles are exchanged in recursive calls. Thus, our new data types might be interesting in the setting of multitier programming too.
Choral’s principles in other settings. The core idea in Choral’s design is to have data types at multiple locations (the roles of the choreography): \(\texttt {T@(A_1, ... A_n)}\). This information is the compass that guided us in lifting the various aspects of Java to the choreographic level and in designing our notion of projection. We believe that this idea can be easily applied to other object-oriented languages—C#, Kotlin, Scala, and others—by extending their types in the same way and retracing our steps.
The applicability of this core idea goes even beyond object-oriented languages. Choral’s first technical report and release [Giallorenzo et al.
2020; Choral Development Team
2020] have already inspired the investigation of theories and implementations of functional choreographic programming languages [Cruz-Filipe et al.
2022;,
2023a; Hirsch and Garg
2022; Shen et al.
2023; Graversen et al.
2023]. These studies confirm the generality of our approach: just like extending Java types with roles yields an object-oriented choreographic programming language (Choral), doing the same to the
\(\lambda\)-calculus supports the development of a theory of functional choreographic programming [Cruz-Filipe et al.
2022;,
2023a].
Safety and liveness. The theory of choreographic languages comes with strong safety and liveness properties, which stem from the high-level abstractions provided by choreographies and their projections to distributed code [Montesi
2023]. Under the same assumptions made (sometimes implicitly) in these proofs, Choral promises the traditional safety and liveness properties of choreographic programming languages. We discuss these properties and how they relate to Choral’s modularity and flexibility wrt communication middleware. Furthermore, in the end, we report useful references and outline future research for the formalisation of these promises.
Choreographic programming languages guarantee
communication safety. That is, processes never try to interact by performing incompatible communication actions—if one action is “send,” then the other is a “receive,” and the type of the sent message is always (a subtype of) the one expected by the receiver [Montesi
2023; Carbone et al.
2012]. Proofs of this result rely on the assumption that communications is reliable (messages are never lost, duplicated, or corrupted) or at least that message exchange primitives adopt suitable best-effort and timeout strategies [Montesi and Peressotti
2017; Montesi
2023]. Furthermore, communication primitives should respect a locality principle: messages should be dispatched to their intended recipients—and vice versa, receiving from a role gives a message that was sent by that role [Montesi
2013; Giallorenzo et al.
2018]. In Choral, like in any previous implementation of choreographic languages, these assumptions form a contract for the implementation of channels. This contract must be manually enforced, or communication methods might be “wrong.” A trivial example is a channel for communicating integers that always returns the constant
\(\texttt {1}\) at the receiver. In the future, it would be interesting to explore the formalisation of contracts for channel middleware and libraries and the development of verified implementations.
Another property of choreographic programming—and probably the most known—is
deadlock-freedom by design: the code compiled from a choreography is deadlock-free, because it is not possible to write mismatched communications in choreographies [Carbone and Montesi
2013; Dalla Preda et al.
2017; Hirsch and Garg
2022; Montesi
2023; Jongmans and van den Bos
2022]. The relevant assumptions, here, are that foreign code (in our case, Java code) used within a choreography always terminates, and that communication never blocks the sender or receiver indefinitely. In the real world, these assumptions usually do not hold, so it is important to adopt timeout and supervision mechanisms to avoid divergence and deal with failures—as exemplified in Sections
2.5 and
3.2.
Under the same assumptions for deadlock-freedom, a choreography that is tail-recursive gives the stronger liveness property of
starvation-freedom (or livelock-freedom): no role ever gets stuck, or “starves” [Montesi
2023]. (In deadlock-freedom, it is sufficient that some part of the system keeps running [Kobayashi
2000].) The same holds for Choral.
Last, choreographic programming languages typically guarantee
race-freedom, provided that no role uses the same channel in parallel via internal threads [Honda et al.
2016; Carbone et al.
2012; Montesi
2023; Lanese et al.
2008]—unless the underlying middleware can disambiguate messages, but this is not often the case. Choral gives finer control over race freedom, thanks to its expressive (Channel) types (cf. Section
2.4). For example, given a bidirectional channel, we can pass it as a directed channel in one direction and as a directed channel in the other direction to two separate threads. The API of directed channels would then allow the two threads to use the channel, respectively, only for sending or for receiving, which is safe to do in parallel. This control over channel usage was found to be useful in the choreographic implementation of full-duplex asynchronous communication (a pattern where roles can freely interleave different requests and responses in both directions), as found in the IRC client-server protocol. Lugovic and Montesi [
2023] present a detailed discussion of this pattern and an interoperable implementation of IRC in Choral.
Since channel implementations play a key role in the properties discussed above, these implementations should ideally be simple, well-tested, or even verified. Choral’s features allow for encapsulating more sophisticated interaction behaviour as choreographies. Lugovic and Montesi [
2023] started this activity, offering a reusable Choral library for programming asynchronous reactive protocols.
While, in principle, composing choreographies in Choral preserves safety and liveness, one must pay attention to the interplay with local code. Most notably, local code is free to compose and intertwine multiple instances of the same or different choreographies, as we exemplified in Section
3.2. This flexibility makes it possible to write two local programs that instantiate two choreographies and try to communicate with each other in incompatible orders, share the same channel across different instances of the same choreography running in parallel, and so on. Doing so can create communication errors and make choreographies trivially timeout and fail. This problem can be tackled in several ways, which we leave to future work. Briefly, one option is to devise middleware for making local programs agree on which choreography they want to engage in at any time. One such middleware can avoid some incompatibilities, create dedicated channels, or, in general, throw a runtime exception in case of disagreements. Other potential solutions include devising static or runtime checks for how local code composes instances of choreographies. Options for these checks encompass constraints such as keeping the graph of connections among local programs acyclic; making (projections of) choreographies that run in parallel have dedicated channels; ensuring that call to different methods of different choreographic objects follow compatible orderings; and/or synthesising an overall choreography that describes the collective behaviour of how local programs combine their (sub)choreographies, thereby providing a witness of safety and liveness [Montesi and Yoshida
2013; Coppo et al.
2016; Voinea et al.
2020; Lange et al.
2015; Carbone and Montesi
2013; Honda et al.
2016; Jacobs et al.
2022; Cruz-Filipe et al.
2017]. Some of these methods could be applied via structured concurrency libraries backed by middleware.
“Bad” programming of code that interacts with projections of choreographies is not a new issue, as it was already present in previous implementations of choreographic languages [Scalas et al.
2017; Dalla Preda et al.
2017; Carbone and Montesi
2013; Montesi
2013]. In fact, this is a general issue when composing code from libraries that engage in communications, even if these are not generated from choreographies. At the very least, Choral already guarantees type safety: the APIs of choreography projections are always respected.
Formalising Choral’s ideas, implementation, and guarantees is an interesting area of research. As already mentioned, the principles of compiling terms and data types equipped with roles as in Choral have been studied in the simple setting of the
\(\lambda\)-calculus [Cruz-Filipe et al.
2022;,
2023a; Graversen et al.
2023]. The proofs in these models are not machine-checked. Mechanisation could be achieved by building on the existing formalisations of the foundations of choreographic languages made with theorem provers [Cruz-Filipe et al.
2021b;,
2021a;,
2023; Castro-Perez et al.
2021; Pohjola et al.
2022; Hirsch and Garg
2022]. These efforts are still focused on choreographic programming languages that are far less expressive than Choral, which by contrast is a large programming language that inherits all the complexities of both choreographic and object orientation.
Selection inference. Choral requires the programmer to insert the necessary selections to achieve knowledge of choice (Section
2.3). Developing techniques for inferring these selections automatically is an ongoing research topic. Typically, these techniques either modify the source choreography to include extra selections or inject hidden communications in the generated endpoint code [Lanese et al.
2013; Jongmans et al.
2015; Basu and Bultan
2016; Dalla Preda et al.
2017; Cruz-Filipe and Montesi
2020]. However, there is no silver bullet.
(1)
In general, it is unfeasible to detect automatically what the optimal selection strategy is. This is a problem for both approaches (modifying the source choreography or injecting hidden communications in the generated code). Say that A needs to inform B and C of a choice by using point-to-point channels. Should A send the first selection to B or to C? That might depend on whether B has a longer task to perform in reaction to the selection compared to C, or vice versa. (Whichever has the longest task to start should get the selection first, to increase parallelism.) Then, what if multiple channels are available? For example, if A shares a fast channel with B but not with C, and B shares a fast channel with C, then it might be good that A informs B and subsequently B informs C (instead of A informing C directly). These issues become even more sophisticated when considering choreographies with more complex network topologies, scatter-gather channels, recursion, and so on.
(2)
If a compiler injects hidden communications in the generated code, then the source choreography program does not faithfully represent the communications enacted by the system any more. This makes the choreography less useful when reasoning about, for example, efficiency—network communications like selections are especially a huge performance factor—and security—hidden extra communications might leak information in ways not intended by the designer of the original protocol.
Since both issues are still the object of active investigation, we decided that the first version of Choral should be a base that future work can use to explore their design spaces. A promising compromise could be a hybrid, assisted way. That is, the programmer should be able to write a choreography including some selections deemed important, but also potentially missing some necessary other selections; then, a tool should detect the missing selections and propose a solution. The programmer could thus decide whether to accept the proposal or improve it manually to achieve their requirements.
In general, we believe that there is a lot of potential in future research on how to optimise communications in Choral. New algorithms might leverage annotations of channels, static analysis, and profiling data. Some algorithms might choose simpler approaches at the expense of parallelism, whereas others might take a more decentralised approach to spreading knowledge of choice to favour parallelism or energy saving (in the Internet of Things, spreading battery consumption evenly or in a focused way might be an advantage depending on the scenario).
Expressivity. We discuss a few interesting directions for future work regarding the expressivity of Choral and its type system. In general, we believe that our choreographic interpretation of object-oriented programming (OOP) allows for importing established techniques from type theory to reason statically about roles in useful ways.
Choral can capture a variety of interaction patterns, like scatter-gather and producers-consumers with races. Nevertheless, there are cases where our types can be coarse. For instance, the following could be an interface of a channel where two receivers, B and C, race to consume a message.
The return type of method \(\texttt {com}\) above guarantees that both receivers will have a value of type \(\texttt {Optional}< \texttt{S}>\) located at them. However, depending on the behaviour that the programmer wishes to model, the interface above could be an over-approximation. For example, implementations that simply discard the message sent from A or that deliver the message to both B and C will satisfy the return type. Currently, we cannot express the type of a channel that forbids such implementations, e.g., a channel guaranteeing that exactly one between B and C will obtain the message. This means that, in such cases, we have to “pollute” the continuation of the choreography with local checks at both potential receivers.
A way to achieve a more specific type for races could be to extend Choral types with existential quantification over role parameters. For example, we could write
to express an instance of \(\texttt {S}\) located at some role D in the list [B,C] (i.e., S@D can be either S@B or S@C). With this type, we can write a more specific signature: method \(\texttt {com}\) returns a value of type S@B or of type S@C, but we cannot statically know which of the two types.
Although there is some work on the use of existential quantification in simple choreography languages [Jongmans and Yoshida
2020], its application and integration with a general-purpose language like Choral poses some challenges and design choices. For instance, should roles that lose the race in method
\(\texttt {com}\) be blocked? If so, then is this specific to this method or the standard interpretation of every method with an existential return type? These and similar questions beg for a thorough investigation and go beyond Choral. In fact, a satisfactory and general handling of races in choreographic programming languages is still missing. In addition to the ideas just proposed, useful inspiration to address this aspect might come from nondeterministic choice in choreographies [Lanese et al.
2008; Montesi
2023] and choreographic languages for the verification of message-passing parallel programs [Vasconcelos et al.
2022].
Another limitation of the current type system is that the number of role parameters of a choreography is fixed. This limitation is common to many choreographic languages. [Deniélou and Yoshida
2011] developed a theory for parameterising choreographies over “collections of roles,” whose sizes are determined at runtime. All roles in the same collection must be treated uniformly (e.g., broadcast). We can import that feature to Choral by allowing for role parameters to be collections. For example, we could prefix a role parameter declaration with
\(\texttt {*}\), as in *
Ds, to specify that it is a collection of roles. Then, we can write the type of a channel for broadcasting data from
A to all roles in the collection
Ds as follows.
The method \(\texttt {com}\) for this channel should take a value of type S@A and return a value of type S@D for every role D in the collection Ds. This would require investigating how Choral can be extended with types for distributed data collections, as well.
Error handling. Choral supports exception handling at a single role, which can then propagate errors to others via knowledge of choice. However, in our experience, it is more convenient to represent failures in return types, like we did in Section
3.1 by using
\(\texttt {Optional}\). The channel APIs that we showed in this article are implemented by performing automatic retries. These APIs also have equivalent versions that wrap results in
\(\texttt {Result}\) objects—essentially sum types of the transmitted value type and an error type, as in Go and Rust. Choosing among these implementations is up to the choreography programmer, and programmers might also devise channel implementations with their strategies (e.g., exponential backoff with a bound on the number of retries). Our compiler can, in principle, be extended to synthesise coordination for distributed exceptions, theorised by Carbone et al. [
2008].
Type and role inference. Choral’s intended audience consists of developers familiar with OOP and, more specifically, Java. For example, our syntax extends that of Java and our library of channels follows common OOP practices, like coding to interfaces [Gamma et al.
1995]. Our design choices rely heavily on generics to achieve reusability, similar to what is done in the standard Java Collections framework [Naftalin and Wadler
2006]. However, this comes at the cost of requiring that programmers have experience with generics and additional parameters in code.
Standard remedies to the verbosity of generics include shorthands like the diamond notation and type inference [Stadelmeier et al.
2022]. We plan to lift these features to Choral and expect that the standard solutions adopted for Java will be applicable in scenarios where roles are known, without major modifications. For example, with such facilities, we could rewrite method
\(\texttt {consumeItems}\) from Section
2.3 by removing all generic instantiations in its body, as follows.
Similar to generics, role parameterisation adds crucial flexibility at the cost of added verbosity. It would therefore be interesting to explore inference mechanisms for role parameters, as well, in order the lighten the syntax of Choral even further. At a basic level, programmers would be able to omit roles when these can be unambiguously determined from the context, e.g., in assignments and some method invocations. The next snippets exemplify the potential gain in simplicity from the current situation (left) to one with this feature (right).
We conjecture that this feature can be achieved by machinery similar to that already used for inferring generic parameters, because role information is available from type declarations and the typing context without ambiguity.
At a more advanced level, we could allow programmers to delegate to the Choral compiler decisions on the placement of data and computation. Consider the snippet below.
Since \(\texttt {x}\) and \(\texttt {y}\) reside at different roles, the location of \(\texttt {z}\) is ambiguous. Even more, performing the addition at Line 2 requires communication, which the compiler would need to infer and inject by appropriately using channels available from the context. This faces similar challenges to the ones previously discussed for selection inference. We believe that exploring methods for the synthesis of communication strategies in choreographies is an interesting research line in general, with a scope that goes beyond that of Choral’s.
Other features from Java. There are other features provided by Java (or other object-oriented languages) that Choral could benefit from. We do not discuss them in detail, because we do not expect that lifting them to choreographies would pose significant challenges. For example, we believe that adding wildcard types (\(\texttt {?}\)) would be a natural adaptation of Java’s mechanism.
The patterns and libraries that support idiomatic Java programming, like streams (from the package \(\texttt {java .util .stream}\)), are immediately available in Choral thanks to our lifting of Java types to Choral types with a single role. In some cases, it can be interesting to generalise these patterns to multiple roles. For example, a possible interpretation of a “choreographic stream” at two roles A and B could be that of a stream of elements distributed at these roles (Stream@(A, B)< T@(C, D)>). Methods for stream operations would then take choreographies at A and B as input. Whether these choreographies would perform communications between the two roles or not would be left to the implementor and is irrelevant to the implementation of the stream.
Asynchronous programming. The choreographies that we presented here use channel APIs as if they were blocking. This does not mean that an endpoint must dedicate a thread for participating in a choreography: future versions of Java will include fibers and the asynchronous execution of blocking APIs (reactor pattern) [OpenJDK
2020]. Choral is compatible with this direction. Should programmers want to program a choreography explicitly for asynchronous execution by using continuation-passing style, our channel APIs can be extended to take choreographic continuations as parameters.
Fluid APIs from choreographic specifications. As we mentioned and discussed in Section
1, previous work explored the automatic generation of fluid APIs that enforce the code to follow a choreographic specification [Scalas et al.
2017]. Such a choreographic language cannot include computation, so it cannot express any of our use cases, and its approach does not support modularity and API control, as we discussed more in detail in the Introduction. Thus, Choral brings two improvements. First, our APIs are more reusable: they change only if the source API is changed, not if the communication behaviour of a method is simply updated. Second, the APIs of our compiled Java code are more idiomatic: they are plain object APIs that look like the typical task-oriented APIs distributed by cloud vendors [Murty
2008; Wilder
2012], which makes Choral a candidate drop-in replacement for current development practices.
Choreography-based verification and testing. Choreographic languages that are less expressive than Choral (e.g., they cannot include computation) have been used also to verify that interactions among objects respect a protocol. This is obtained by statically checking method invocations, either by using typestates [Kouzapas et al.
2018] or model checking [Scalas et al.
2019]. As noted by Hirsch and Garg [
2022], choreographic programming offers a simpler development method. Indeed, verification approaches require the programmer to design both a choreographic specification and then manually take care of writing a correct implementation of the projection of each role. On the contrary, choreographic programming (and hence Choral) generates the latter automatically. Additionally, since our approach is based on compilation instead of verification, we can provide a more expressive choreographic language and apply formal reasoning techniques at the level of choreographies as in, e.g., Jongmans and van den Bos [
2022]; Cruz-Filipe et al. [
2023b]. Because choreographies syntactically elicit interactions, carrying out verification at this level avoids the cost of reconstructing interactions from endpoint implementations, which usually leads to a combinatorial explosion of cases.
Choreographic languages without computation have been used also in a tool for testing abstract specifications of components given as labelled transition systems [Coto et al.
2021]. The purpose, there, is to test that the communication behaviour of a component (given as a labelled transition system) complies with a choreography. By contrast, our testing tool is the first aimed at testing the functional correctness of a choreography and its generated implementation. Choral’s capability of expressing internal computation is important to this end, since it allows us to write arbitrary checks on the distributed state of the system.