The VERDICT framework is described below in detail, including the workflow, model construction process, and various features of the tool-suite.
5.1. VERDICT Workflow
An overview of the VERDICT workflow is illustrated in
Figure 1. VERDICT is developed as a plugin to the Open Source AADL Tool Environment (OSATE). The user captures an architecture model using AADL that represents the high-level functional components of the system along with the data flow between them. Additional security and safety models are required to perform MBAAS and ACFG; similarly, additional behavior models are needed to perform CRV. The input models are translated into a VERDICT internal data model, which is then further translated into inputs for MBAAS, ACFG, and CRV. From within the OSATE environment, the user invokes each one of these via a drop-down menu. Each functionality comes with a configuration panel allowing users to set parameters for analysis.
For MBAAS and ACFG, the system engineer needs to identify and specify the mission requirements, safety and cyber requirements, and cyber and safety relations of the system. The cyber and safety relations define how vulnerabilities and failures propagate through the system. The cyber requirements of the system are defined in terms of the confidentiality, integrity, and availability of outputs of the top-level system; the safety requirements are defined in terms of the integrity and availability of outputs of the top-level system. For safety analysis, MBAAS performs fault-tree analysis to calculate the probability of system failure based on the error events and propagations defined in the system. For security analysis, MBAAS analyzes the architecture to identify cyber vulnerabilities and recommend defenses. These defenses are recommendations either to improve the resiliency of existing components, such as controlling access and encrypting communication links, or to add new components to reduce dependence on a specific source of information. By running MBAAS, the designer is able to identify whether the system may fail to achieve its goals due to subcomponent vulnerabilities. Specifically, MBAAS helps the designer identify such vulnerabilities and further suggests alternate design options. Implementing defenses often has associated costs. Given the cost of each defense, MBAAS can synthesize an optimal solution: a minimal set of defenses with minimal implementation cost. In addition, the assurance case fragments generation capability can automatically assemble the MBAAS assurance evidence into Goal Structuring Notation (GSN) form for certification purposes.
Once the architectural analysis is complete, VERDICT supports refinement of the architecture model with behavioral information using AGREE. The VERDICT CRV back-end tool performs a formal analysis of the updated model with respect to formal cyber properties to identify vulnerabilities to cyber-threat effects. Specifically, CRV performs architectural analysis by focusing on the behavioral information of the system and its subcomponents. The behavioral information (i.e., details of how inputs are transformed into outputs) is expressed as “assume-guarantee contracts” in AGREE; the system architecture (defined in AADL), goals, behavioral information, and information related to threats are then fed as inputs to CRV. CRV analyzes and reports whether the goals/requirements are satisfiable even in the presence of threats. If the goals/requirements are satisfiable, CRV returns a merit assignment result (i.e., a set of components that played a vital role in goal satisfaction). If the requirements/goals are unsatisfiable, then CRV returns a blame assignment result (i.e., a set of components that played a vital role in requirement violation). The CRV capability provides an additional depth of analysis of a model that includes behavioral details of the architectural component models, which helps to identify design mistakes early in the development process. Once the CRV analysis is complete, the developer can create a detailed implementation. MBAAS and CRV work collaboratively to analyze the system design for resiliency: MBAAS analyzes the system design using minimal information, while CRV captures behavioral information to validate requirements that need such detailed behavioral information for their analysis.
5.2. Model Construction in VERDICT
We will use a delivery drone to demonstrate how to construct an AADL model that is analyzable by VERDICT. The AADL source code is publicly available on the GitHub repository [
28]. The drone system architecture is shown in
Figure 2. The drone is part of a delivery system that consists of a van with packages to be delivered and one or more delivery drones. After the van arrives at a location close to multiple delivery sites, each delivery drone is initialized with its current position and delivery location, and the package to be delivered is loaded. After a delivery drone is launched, it uses the inputs from the GPS and IMU to navigate to the delivery location. When the drone reaches the delivery location, it uses its Camera to capture an image of the receiving site to confirm that the site is free of obstacles and that it is safe for the package to be dropped off. For a high-value package, the Delivery Planner will use the Radio component to get confirmation from the operator in the van. If there are no obstacles on the receiving site and confirmation (if needed) is received from the operator, then the Delivery Planner will activate the Delivery Item Mechanism to drop off the package. The delivery drone also needs to avoid certain airspace, for example over airports, schools, and government buildings. The Global Positioning System (GPS) and Inertial Measurement Unit (IMU) are subcomponents of Guidance, Navigation and Control (GNC) to illustrate model hierarchy.
5.2.1. Modeling System Architecture in AADL
The VERDICT tool-suite supports a core subset of AADL constructs shown in
Table 2, which can be extended as needed.
To analyze the system design of the delivery drone using VERDICT, the system designer will need to model the system architecture in AADL, specify the meta-level and defense properties of the system and its subcomponents, and encode safety and cyber requirements at the system and subcomponent level in VERDICT annex. Since we will model only a conceptual view of the drone system, each subcomponent will be modeled as either a system type or a system implementation.
Figure 3 shows a snippet of the AADL code for the top-level delivery drone system type and its implementation (some implementation and connection details are omitted to save space). The system type has a features section with data port (
inports and
outports) declarations. The system implementation encompasses various subcomponent instances and internal connections. Each subcomponent is an instance of a system type or implementation.
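For concreteness, the following is a minimal AADL sketch of this pattern. It is not the exact text of Figure 3; the data types, the launch_pos_in port, and the subcomponent names shown are illustrative assumptions based on the description above.
system DeliveryDroneSystem
	features
		launch_pos_in : in data port Position.impl;
		delivery_status : out data port DeliveryStatus;
		actuation_out : out data port Actuation;
end DeliveryDroneSystem;

system implementation DeliveryDroneSystem.Impl
	subcomponents
		deliveryPlanner : system DeliveryPlanner;
		deliveryItemMechanism : system DeliveryItemMechanism;
	connections
		c1 : port deliveryPlanner.delivery_cmd -> deliveryItemMechanism.delivery_cmd_in;
		c2 : port deliveryItemMechanism.delivery_status_out -> delivery_status;
end DeliveryDroneSystem.Impl;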
5.2.2. VERDICT Properties
VERDICT analyzes a system architecture model based on a set of built-in meta-level properties. The properties reflect various design properties of the system. The complete set of VERDICT properties is summarized on the wiki page of the VERDICT repository². There are six types of VERDICT properties: (1) mandatory properties; (2) port properties; (3) connection properties; (4) component properties; (5) connection cyber defense properties; and (6) component cyber defense properties. Mandatory properties are those that must be set for every system for VERDICT to perform analysis. If no value is set, default values (corresponding to the least secure settings) are used for those properties. The only port property is probe; it signifies to CRV that the port is not part of the original architecture but is included for the convenience of CRV’s reasoning. When a system is modeled hierarchically, a probe port allows CRV to look inside a component’s behavior without needing to expose the behavior completely at a higher level. Connection properties apply to connections but are optional for performing analysis; if not applied, the VERDICT tool-suite defaults to the conservative values as defined. Component properties apply to systems but are optional for performing analysis; if not applied, the VERDICT tools default to the conservative values as defined. The connection and component cyber defense properties are of enumerated integer type {0, 3, 5, 7, 9}, which describes the implementation rigor of a defense, with 0 being the lowest rigor and 9 the highest. VERDICT properties are set in the “features” (port) section of a component type and in the “subcomponents” and “connections” sections of a component implementation. The AADL code snippet in
Figure 3 shows the port, component, and connection property associations in the delivery drone model in blue.
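As a hedged illustration of where such associations go, a sketch follows; the property-set name VERDICT_Properties and the individual property names are representative assumptions for the property types described above (the authoritative list is on the repository wiki).
deliveryItemMechanism : system DeliveryItemMechanism {
	-- a component property (illustrative name)
	VERDICT_Properties::insideTrustedBoundary => true;
};
c1 : port deliveryPlanner.delivery_cmd -> deliveryItemMechanism.delivery_cmd_in {
	-- a connection cyber defense property at rigor level 7 (illustrative name)
	VERDICT_Properties::deviceAuthentication => 7;
};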
5.2.3. Modeling Cyber-Security and Safety Aspects of System Architecture in VERDICT Annex
The cyber-security and safety aspects of the system are modeled using the VERDICT annex introduced in
Section 3. Cyber, safety, and mission requirements may only be declared in a VERDICT annex within the top-level system type. Cyber and safety requirements can be aggregated and associated with a particular mission requirement. Cyber and safety relations may only be declared within a subcomponent system type. They describe the vulnerability flow and failure flow, respectively, between the inputs and outputs of an individual component within the system. Safety relations model how faults (error events) or erroneous inputs affect the Integrity (I) and Availability (A) of the outputs. Cyber relations model how cyber-attacks affect the Confidentiality (C), Integrity (I), and Availability (A) of the outputs.
Figure 4 shows an example of a mission requirement in the VERDICT annex. Cyber requirements “CyberReq01” and “CyberReq02” and a safety requirement “SafetyReq01” are aggregated to support the mission requirement “MReq01”.
Figure 5 shows an example of defining a cyber requirement in the VERDICT annex. Cyber requirements must only be declared at the topmost system level of the AADL project. It is recommended to enter a message of the form “The loss of <Confidentiality | Integrity | Availability> of the subject variable shall be <None | Minor | Major | Hazardous | Catastrophic>” in the “description” field. For example, “The loss of Integrity of the estimated position signal input of the Navigator shall be Hazardous”. The following list shows the acceptable likelihood of successful attack for each severity level: Catastrophic = 1e-9; Hazardous = 1e-7; Major = 1e-5; Minor = 1e-3; None = 1e-0.
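A hedged sketch of these annex constructs follows. The CyberReq01 fields are taken from the security analysis discussion later in this section; the MReq01 description is illustrative, and the concrete syntax is our approximation of the VERDICT annex grammar.
annex verdict {**
	MissionReq {
		id = "MReq01";
		description = "The drone shall deliver packages while preserving mission safety and security";
		reqs = "CyberReq01", "CyberReq02", "SafetyReq01";
	};
	CyberReq {
		id = "CyberReq01";
		description = "The drone shall be resilient to loss of ability to deliver a package to the appropriate consumer location";
		condition = actuation_out:I or actuation_out:A or delivery_status:I or delivery_status:A;
		severity = Hazardous;
		cia = I;
	};
**};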
Cyber relations are used to map component vulnerability of inputs to outputs. VERDICT requires the user to declare the cyber relations for each component in the AADL model. Cyber relations represent the relationship of the input and output signals of a component. Cyber relations are defined in the declaration section of the component type in AADL using the VERDICT annex.
Figure 6 shows two cyber relations for a DeliveryItemMechanism component. The cyber relations specify that the integrity and availability of
delivery_status_out is impacted by the integrity and availability of
delivery_cmd_in, respectively.
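A hedged sketch of what these two relations look like in the annex (concrete syntax approximated from the description above):
annex verdict {**
	CyberRel "delivery_status_out_I" = delivery_cmd_in:I => delivery_status_out:I;
	CyberRel "delivery_status_out_A" = delivery_cmd_in:A => delivery_status_out:A;
**};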
Figure 7 shows an example of a safety requirement in the VERDICT annex. Safety requirements must only be declared at the top-most system level of the AADL project.
Figure 8 shows a loss of availability (LOA) error event and a safety relation for the
DeliveryItemMechanism component in the VERDICT annex. The probability of the loss of availability event is 1.0e-8. The safety relation describes that the availability of
delivery_status_out is affected by the occurrence of LOA event of the component or if the availability of
delivery_cmd_in is affected.
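A hedged sketch of the event and safety relation just described (syntax approximated; the event identifier is illustrative):
annex verdict {**
	Event {
		id = "loa_event";
		probability = 1.0e-8;
	};
	SafetyRel "delivery_status_out_A" = happens("loa_event") or delivery_cmd_in:A => delivery_status_out:A;
**};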
5.2.4. Modeling System Behavior and Cyber Properties in AGREE Annex
Once the system architecture and the associated meta-data, which are necessary for making the system-under-test amenable to MBAAS’ architecture-level security and safety analysis, have been specified, the next step is to add the system’s behavior-level information and guarantees. This is critical for enabling the more fine-grained analysis performed by CRV, compared to the conservative analysis performed by MBAAS. This behavior-level information includes two types of information, namely, the assume-guarantee style contracts for each component as well as the formal functional properties that the system should verifiably maintain even under attack. To explain the process of populating the architectural model of the system with behavioral information, we will use the delivery-drone model presented in
Figure 2.
Specifying behavioral information of the system: In the first step, we need to add behavioral information for the components in the architectural model of our delivery drone system. This includes abstractly specifying the intended functionality of each component. The component-level behavior is abstract in the sense that it does not necessarily capture all the fine-grained behavioral details of the component in question. As an example, for the analysis performed by CRV of an unmanned aerial vehicle (UAV) system, it may be sufficient to just model that the UAV moves from one waypoint to another without capturing details of how it precisely navigates. Abstractly capturing the component-level behavior also allows CRV to take advantage of automated formal reasoning techniques, such as model checking, while avoiding the scalability challenges often incurred by such formal techniques (i.e., state-space explosion). For behavioral specification, we use AGREE contracts that allow us to choose a level of abstraction in the specification that is sufficient to prove the system-level cyber-resilience property (as opposed to providing a complete and detailed specification). At a high level, this functional behavior for each component simply expresses temporal constraints between the values of outports and the values of inports. Sometimes these constraints may also need to be expressed in terms of the internal state of a component. In that case, relevant information on the internal states may be provided using virtual output ports that we refer to as probes. Note that these virtual output ports are introduced for reasoning purposes and do not need to be realized in real-life deployments of the system.
The overall system is assumed to run on a universal base clock that represents the smallest time span the system can distinguish. Note that the restriction to a synchronous model of computation is intentional: this model is better suited to translation into a programming language because it more naturally matches the behavior of a computer program. Moreover, this restriction can usually be overcome by faithfully simulating asynchronous systems in the synchronous model in a variety of ways.
To recap, the DeliveryItemMechanism component in our Delivery Drone model has the following inports and outports:
delivery_cmd_in: inport representing a command received by the delivery mechanism
delivery_status_out: outport representing the current status of the delivery
package_is_secure: outport representing whether the package is secure or not
Specifying the relation between outports and inports (and/or the internal states) depends on the level of abstraction we want for this component, which in turn hinges on the properties we want to verify. The rule of thumb, however, is to start at the highest level of abstraction and then try to verify the properties. If we need to specify more about this component to prove the desired system-level properties, we can refine its specification later. Assume that at least the following information is known about the DeliveryItemMechanism component:
1. The component should accept two commands: release a package and abort a delivery.
2. The component must be in one of four possible states: the delivery has not started, it is in progress, it has been completed, or it has failed.
3. Initially, the delivery has not started.
4. If a delivery command is issued, the delivery status must become different from not started.
5. If no command is issued or an abort command is received, then the delivery status is reset to not started.
This is a minimal level of information one can anticipate about the expected behavior of the
DeliveryItemMechanism component. For formalizing (1) and (2) in AADL, we define two enumeration types shown in
Figure 9:
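The following sketch shows what such declarations look like using the standard AADL Data_Model property set; the type and literal names follow the surrounding discussion and may differ slightly from the actual Figure 9.
data PackageDeliveryCommand
	properties
		Data_Model::Data_Representation => Enum;
		Data_Model::Enumerators => ("NO_OPERATION", "RELEASE_PACKAGE", "ABORT_DELIVERY");
end PackageDeliveryCommand;

data DeliveryStatus
	properties
		Data_Model::Data_Representation => Enum;
		Data_Model::Enumerators => ("NOT_STARTED", "IN_PROGRESS", "COMPLETED", "FAILED");
end DeliveryStatus;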
Notice that the enumeration type PackageDeliveryCommand has a value NO_OPERATION, which allows us to distinguish the case where no command is issued, as required by item (5). Alternatively, one could use event ports to implicitly model the case where no signal is present and keep the enumeration type with only two values. This modeling choice depends on whether an explicit value is needed to reason about other parts of the system.
To state aspect (3) formally, we add the following guarantee stating that the delivery status is equal to
NOT_STARTED initially, as it is safe to assume that the system starts in a proper initial state. We achieve this by using an auxiliary AGREE operator
InitiallyX whose definition is shown in
Figure 10.
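A hedged sketch of the operator and the guarantee follows; the node body is the natural definition of “true only initially”, and the enumeration literal notation is an assumption.
annex agree {**
	node InitiallyX(x : bool) returns (y : bool);
	let
		-- equals x at the initial step, and true at every later step
		y = x -> true;
	tel;

	guarantee "Initially, the delivery has not started":
		InitiallyX(delivery_status_out = DeliveryStatus.NOT_STARTED);
**};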
The
InitiallyX operator accepts a Boolean expression as input and evaluates to true only if the Boolean expression is true initially. Note that the infix initialization operator -> is natively supported in the AGREE language. An expression of the form e1 -> e2 evaluates to the value of expression e1 initially and to the value of e2 at all later steps of the system’s execution. To formally capture aspect (4), we add the following guarantee, in which
release_cmd is an auxiliary variable that we have introduced for readability purposes and is shown in
Figure 11.
Similarly, we capture aspect point (5) with the following guarantee that uses two new auxiliary variables shown in
Figure 12.
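A hedged sketch of the two guarantees follows; the auxiliary variable release_cmd is named in the text, while the other variable names and the exact temporal phrasing are our assumptions.
annex agree {**
	eq release_cmd : bool = (delivery_cmd_in = PackageDeliveryCommand.RELEASE_PACKAGE);
	eq abort_cmd : bool = (delivery_cmd_in = PackageDeliveryCommand.ABORT_DELIVERY);
	eq no_cmd : bool = (delivery_cmd_in = PackageDeliveryCommand.NO_OPERATION);

	guarantee "A delivery command moves the status away from NOT_STARTED":
		release_cmd => not (delivery_status_out = DeliveryStatus.NOT_STARTED);

	guarantee "No command or an abort command resets the status to NOT_STARTED":
		(no_cmd or abort_cmd) => (delivery_status_out = DeliveryStatus.NOT_STARTED);
**};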
So far, we have only added constraints about the delivery_status_out outport. Any value that satisfies those constraints will be considered valid during the analysis performed by CRV. This also means that since we have not added any constraints regarding the signal package_is_secure, CRV is free to assume that it can take any value allowed by its type. We want to emphasize that there are situations when one needs to add more detail to the specification to capture the behavior of a component more precisely for a finer-grained analysis. Such a situation can be observed in the specification of the DeliveryPlanner component. The AGREE specification of this component relies on an abstract representation of the component internal states (using the notion of modes) to specify its functionality.
Specifying desired formal property of the system: The next step is to review the list of safety functional requirements for the system that may affect its integrity and formalize them as formal cyber-resiliency properties in the AGREE language. For instance, consider that we are given the following cyber-requirement for DeliveryDroneSystem that regulates when a package could be released:
The drone never initiates package release to an off-limits location (labeled P7). To formalize the cyber-requirement P7, we must first identify the components and ports of the system that are relevant to the description of the property. In our example, the
DeliveryPlanner is the component that issues the command to release a package by setting the output port
delivery_cmd to
RELEASE_PACKAGE, and the
DeliveryItemMechanism is the component that receives the command and proceeds with the delivery. Moreover, to know where the drone should release the package, the
DeliveryPlanner reads the delivery location from the input port
bus1 through the Connector component when the drone is in the van, and then it passes this value to the Navigation component. In addition, we also need to know when a location is off-limits. For that, we must define a new predicate over locations that evaluates to true only if the location is within a restricted area. The specific definition of the predicate is irrelevant to the analysis and could have been left abstract as an uninterpreted predicate. However, AGREE does not allow the user to declare an uninterpreted predicate, so one needs to fix an arbitrary definition. The following is an example definition of the predicate in which X_LOW, X_HIGH, Y_LOW, and Y_HIGH are fixed constants of type real (
Figure 13).
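A hedged sketch of such a definition, using a simple rectangular region; the constant values are arbitrary placeholders.
annex agree {**
	const X_LOW : real = 0.0;
	const X_HIGH : real = 100.0;
	const Y_LOW : real = 0.0;
	const Y_HIGH : real = 100.0;

	node IN_RESTRICTED_AREA(x : real, y : real) returns (b : bool);
	let
		-- true only when the point (x, y) lies inside the fixed rectangle
		b = (X_LOW <= x and x <= X_HIGH) and (Y_LOW <= y and y <= Y_HIGH);
	tel;
**};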
Once we have defined the predicate, we can express the cyber-property with the guarantee in
Figure 14, in which
started (defined below) is an auxiliary predicate that returns true if and only if the delivery has started.
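A hedged sketch of the guarantee and the started predicate; the field access on the delivery location and the exact status test are assumptions, and the accessibility of delivery_location is addressed next.
annex agree {**
	eq started : bool = not (delivery_status_out = DeliveryStatus.NOT_STARTED);

	guarantee "P7: the drone never initiates package release to an off-limits location":
		started => not IN_RESTRICTED_AREA(delivery_location.x, delivery_location.y);
**};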
The above guarantee needs access to the delivery_location information, which is a problem because the delivery_location port is not accessible from the Navigation interface. For such cases, the interfaces of Navigation and the DeliveryDroneSystem can be extended with a probe signal that is not part of the actual architecture of the system and is used only for specification and verification purposes. To identify the new port as a probe, the user can set the VERDICT property “
probe” to
true for the new port as shown in
Figure 15.
Note that it may be convenient to add a prefix to the port name to make it clear that the new port is a probe. This is only a modeling practice and is not required for the analysis. If the delivery_status port of the DeliveryItemMechanism component were also not accessible, we could proceed similarly by introducing a new probe signal for that information.
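A hedged sketch of such a probe declaration in a features section; the property-set name and the data type are assumptions, while the probe property itself is described above.
probe_delivery_location : out data port Position.impl {
	-- marks this port as a probe: used for verification only,
	-- not realized in the deployed architecture
	VERDICT_Properties::probe => true;
};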
5.3. Model-Based Architectural Analysis and Synthesis (MBAAS)
The MBAAS tool consists of two functionalities: analysis and synthesis. The analysis takes in architecture models and mission, safety, and cyber-resiliency requirements, then generates fault and attack-defense trees with resiliency metrics. For the attack-defense tree analysis, the tool first leverages the security threat evaluation and mitigation (STEM) tool to identify possible CAPEC attacks on, and NIST-800-53 controls (defenses) for, various components of the system based on annotated VERDICT properties. This information is fed into SOTERIA++ to calculate the likelihood of successful attacks and determine the success or failure of cyber requirements. For safety analysis, the tool calculates the probability of system failure based on the error events, safety relations, and requirements. Conversely, the synthesis functionality focuses on cyber-security: it uses the attack-defense tree information along with the cyber-security requirements as inputs and generates defense property associations that meet predefined resiliency design constraints. The MBAAS tool enables the system engineer to model components and then synthesize architectures that meet both safety (based on fault tree analysis) and security (based on attack-defense tree analysis) design goals.
5.3.1. Analysis of the Safety of System Architectures (SOTERIA++)
Safety and security have traditionally been handled separately—these topics are discussed in their own professional communities; businesses set up different departments to handle product safety and product security. Analyses have also traditionally been done separately and sequentially, where a system will normally first undergo safety analysis and then security. There is great benefit in analyzing safety and security simultaneously as they have undeniable interdependencies. For instance, consider an everyday example of an exit door: a locked door is secure but does not necessarily provide a safe egress in an emergency. Recognizing the ever-increasing complexity of cyber-physical systems, we developed a tool that reports on both safety and security of a system. In this section, we cover the foundation of safety analysis of system architectures.
MBAAS extends the original framework developed under a program called Safe and Optimal Techniques Enabling Recovery, Integrity, and Assurance (SOTERIA) [
29]. The original work was limited to safety analysis. We extended the framework to include a security aspect (described in the next section), hence the name SOTERIA++. The overall philosophy is a compositional, model-based framework for better security and safety analysis, shifting the engineer’s focus away from generating artifacts to expressing the properties that they want in a system design. One such artifact for measuring safety is fault tree analysis (FTA). It is commonly used to examine the combinations of undesired events that result in the system’s inability to perform a mission under identified hazards. These hazards impact the loss of availability (inability to perform a function) and loss of integrity (inability to perform a function correctly). Fault trees—though powerful—are difficult to construct by hand. According to Banach et al. [
30], “The manual construction of fault trees relies on the ability of the safety engineer to understand and foresee the system behavior … it is a time consuming and error-prone activity.” In the product lifecycle, FTA is used in multiple stages of development. At the early stages, it is used to draft candidate architectures; at later stages, after a design is created, it is used to verify compliance with qualitative and quantitative safety objectives. Fault trees must be kept up-to-date with inevitable updates to the architecture. The SOTERIA++ framework, which automatically generates the fault tree, supports such modifications and continual evaluation of complex system architectures. Furthermore, the SOTERIA++ framework is compositional in that fault propagation is defined at the component level, making it easy to maintain the architectural model.
Regarding the modeling of faults, there exists an Error Modeling Annex (EMV2) created by Julien Delange and Peter Feiler from Carnegie Mellon University to address the gap between safety requirements specification and systems engineering. According to EMV2’s documentation, one of its objectives is “to automate safety analysis methods by supporting them through analyzable architecture fault models”. This is done by annotating the AADL model with hazards, fault propagations, failure modes and effects due to failures, as well as compositional fault behavior specifications. The steps to perform safety analysis are well documented on GitHub and elsewhere [
31,
32,
33]. EMV2 is very expressive, covering fault modeling at three levels: (1) error propagation—for each component, the user can specify the outgoing and incoming error types; (2) error behavior—for each component, the user can specify error events and how they, when coupled with incoming error propagations, affect the error state, and under what conditions outgoing error propagations occur (these behaviors are specified via reusable error behavior state machines); and (3) compositional error behavior—for each component with subcomponents, the user can specify under what conditions, in terms of subcomponent error states, the component is in a particular error state. The composite error behavior specification must be consistent with the component error behavior. The expressivity of EMV2 gives the tool the ability to generate many types of artifacts to support different certification documents, such as Functional Hazard Assessment (FHA), Fault Tree Analysis (FTA), and Failure Mode and Effect Analysis (FMEA).
We considered EMV2 and chose not to integrate it into VERDICT; instead, we developed our own representation in the VERDICT annex. In MBAAS, our focus was on generating fault trees, and while EMV2 is very expressive, we found that only a small subset of the information required for EMV2 is needed. The information needed to generate fault trees is essentially (1) a set of safety formulas for each subsystem, (2) a set of probabilities for the internal events of each subsystem, and (3) a set of safety requirements for the top-level system describing what must hold for that system’s outputs. As an experiment, we created a side-by-side comparison of the information needed for EMV2 to generate fault trees and the information needed for VERDICT to do the same. We created a simple AADL architecture with three sensors and a voter, expressed in 31 lines of code. When annotated with EMV2, the architecture grew to 129 lines, while the model annotated with the VERDICT annex for safety was only 85 lines long. In addition to conciseness, we made a further observation about EMV2 concerning usability: we always had to create a system that extends a previous system, because it was not possible to express “properties” and “component error behavior” in the same code block. For these reasons, we opted not to adopt EMV2 into VERDICT.
In addition to conciseness and usability, another reason we opted for our SOTERIA framework is that EMV2 requires users to provide a composite error behavior at the system level, which specifies “the logic in a fault tree” according to Delange et al. [
17]. SOTERIA does not require such information from the user. It generates the fault tree from the safety relations specified at the component level as described in
Section 5.2.3.
Here is an illustration of the safety analysis of the Delivery Drone System presented in
Figure 2. The safety requirement presented in
Figure 7 says, “
The loss of actuation shall be less than 1e-7 pfh.” More formally, this means that on the Delivery Drone System the probability of losing the output
actuation_out must be less than 1e-7. Having modeled compositionally the safety relations of each component of the Delivery Drone (such as the one illustrated in
Figure 8 for the
DeliveryItemMechanism) and having modeled the component connections of the Delivery Drone System, the fault tree for
SafetyReq01 is illustrated in
Figure 16.
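For background, the quantitative evaluation over the generated fault tree follows standard fault-tree arithmetic (this is general FTA background rather than a SOTERIA++-specific definition): assuming independent basic events $e_1, \dots, e_n$,

$P_{\mathrm{OR}} = 1 - \prod_{i=1}^{n} \bigl(1 - P(e_i)\bigr) \approx \sum_{i=1}^{n} P(e_i)$ for small probabilities, and $P_{\mathrm{AND}} = \prod_{i=1}^{n} P(e_i)$.

For instance, in a hypothetical reduction where the loss of availability of actuation_out comes down to an OR over the LOA event of Figure 8 (probability 1.0e-8) and one other independent basic event, also of probability 1.0e-8, the top-level probability would be approximately 2.0e-8, which satisfies the 1e-7 target of SafetyReq01.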
5.3.2. Analysis and Synthesis of the Security for System Architectures
The assessment of the security of cyber-physical systems typically takes the form of an analysis of the risk that identified attacks will overcome mitigations within the system to cause undesired effects. A successful assessment can provide a blueprint for a secure architecture but must satisfy a few key concerns to be successful. First, a sufficient set of possible attacks must be identified to adequately cover the set of actual attacks that the system will experience when fielded. Second, appropriate mitigations must be determined and properly implemented to prevent these attacks from succeeding in causing the undesired effects. Third, the coverage of the attacks and sufficiency of the mitigations must be measured and communicated to the designer to compare the relative strength of alternative architectures and find the best solution. In addition to these significant elements of the analysis, synthesis can help the designer find the most secure architectural alternative faster. The following sections describe how MBAAS can be used to support each of these important aspects of a security assessment.
Security Threat Evaluation and Mitigation (STEM)
STEM’s approach depends on an appropriate library of threats, as controls will only be selected if they are useful in mitigating attacks that have a defined effect on the system under consideration. Threats are identified in terms of MITRE’s CAPEC [
34]. The use of a common threat library has two benefits. First, use of a common library incorporates a more diverse collection of attacks by sourcing information from a broad community. Second, a common library encourages threat information sharing using a common classification system and language. The controls are from NIST’s Security and Privacy Controls 800-53 [
35] and provide a set of security controls originally intended for federal information systems and organizations. However, these controls are now commonly applied to systems in domains outside of traditional information systems. Controls and enhancements are continually updated—there are 863 controls and enhancements as of revision four. Detailed mapping between the CAPECs, NIST controls and the architectural level component properties can be found in [
36].
MITRE CAPEC defines a few classifications of attack patterns in its attack pattern hierarchy. Currently, not all levels can be readily translated to high-assurance embedded systems. These classifications include Category, which generally groups attacks by some common characteristics; Meta Attack Pattern, defined as “a decidedly abstract characterization of a specific methodology or technique used in an attack”; Standard Attack Pattern, which is “focused on a specific methodology or technique used in an attack”; and Detailed Attack Pattern, which “provides a low level of detail, typically leveraging a specific technique and targeting a specific technology and expresses a complete execution flow”.
Attacks defined at the Standard Attack Pattern and the Detailed Attack Pattern levels are tied to specific technologies and techniques used. As the basis of this hierarchy is a list of known enterprise IT vulnerabilities and weaknesses, these would not be readily translatable and often do not apply, and therefore were excluded from consideration in STEM. Instead, attacks at the Meta Attack Pattern level were used. In this way, STEM can consider abstract classes of attacks as they might be applied to embedded system architectures and create a rule set for the application of these attacks. For example, the Standard Attack Pattern “ICMP Flood” often does not apply to embedded systems, but the more abstract Meta Attack Pattern “Flooding” can be applied to incoming connections of any type. From this attack pattern, defenses can be prescribed to handle excessive traffic on the connection.
As an example, detailed CAPECs related to memory corruption are not included in the Meta Attack Pattern library. There is a total of 61 Meta Attack Pattern CAPECs, but not all of them are relevant to the embedded systems of interest to STEM. Thirty-seven Meta Attack Pattern CAPECs that are relevant to embedded systems were identified and have been incorporated into STEM. Mitigations are linked to CAPECs so that controls are only suggested if they are useful in mitigating attacks that have a defined effect on the system under consideration. The mapping between CAPECs, VERDICT cyber defense properties, and NIST-800-53 controls is shown in
Table A1 in
Appendix A adapted from [
36]. CAPECs that are mitigated by the same defense have been grouped together. Note that most of the CAPECs are mapped to a single defense, but some CAPECs require a combination of two defenses. Each defense is, in turn, mapped to a combination of more detailed NIST Controls. The detailed description of the NIST Controls can be found on the NIST website [
37].
STEM uses a semantic model and rules to identify the vulnerabilities and defenses for an attack scenario. The model and rules in STEM are authored in Semantic Application Design Language (SADL) [
38]. SADL is a controlled-English language that maps directly into the Web Ontology Language (OWL) [
39]. It also contains additional constructs to support rules, queries, testing, debugging, and maintaining of knowledge bases. Besides being a language, SADL is also an integrated development environment (IDE) for building, viewing, exercising, and maintaining semantic models over their lifecycle. The SADL grammar is implemented in Xtext [
40], which provides the features of a modern IDE, including semantic coloring, hyperlinking, outlining, and content assistance. The SADL grammar falls into two parts: declarations of the ontological concepts of a domain (classes, properties, individuals, property restrictions, and imports), and expressions that may appear in queries, rules, and tests.
As an illustration, consider CAPEC-28, which is Fuzzing. The NIST controls for it are SI-10 (Information Input Validation) and SI-10-5 (Restrict Inputs to Trusted Sources and Approved Formats). If CAPEC-28 impacting integrity (“CAPEC-28I” in the rule below) is applicable, the mitigation (“applicableCM” in the rule below) is InputValidation. CAPEC-28 is applicable to a component if the component involves software, is inside the trusted boundary, and the data received is untrusted. This is captured as a rule in SADL, shown in
Figure 17.
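Paraphrased in first-order terms (our rendering of the rule’s shape, not the exact SADL text of Figure 17, and with predicate names that are merely shorthand for the conditions above):

$\mathit{software}(c) \land \mathit{insideTrustedBoundary}(c) \land \mathit{untrustedDataReceived}(c) \Rightarrow \mathit{applicable}(c, \text{CAPEC-28I}) \land \mathit{applicableCM}(c, \text{InputValidation})$

where $c$ ranges over system components.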
Attack-Defense Tree Analysis of System Architectures (SOTERIA++)
Attack trees are similar in nature to fault trees—they use the same logic gates, and both propagate events (attack steps or component failures, respectively) from leaf nodes up to the root node. An attack tree is a disciplined and formal way of capturing attacks in a tree structure: the root node is the attack goal, and the details on how to achieve that goal are the leaf nodes. Analysis of attack trees produces a logical, hierarchical, and graphic decomposition of attack paths and the conditions necessary for a threat to be realized. While they provide a formal and methodical way of thinking about and describing threats, building attack trees is a largely manual activity. Our philosophy, once again, is to provide a compositional, model-based framework that shifts the engineer’s focus away from generating and maintaining these artifacts. SOTERIA++ generates attack trees automatically; more specifically, it generates attack-defense trees, which we describe in more detail below.
Attack trees are one weapon among many, but one that is widely used in industry and accepted by agencies and certification authorities. Beyond being widely accepted, attack trees have many foundational aspects that make them a sound security artifact. Attack trees were first conceived in the information security industry in 1999, credited to Bruce Schneier [
41], but they are generally applicable and are not restricted to analyzing information systems. Attack trees are focused on defining an attack and refining it from the attacker’s point of view. They integrate well with their ancestor, the fault tree, which is focused on an undesired system failure as its top-level event. Marrying the two provides a system design tool that analyzes both safety and security. There are several recent works extending attack trees. One is a multi-parameter attack tree that accounts for multiple interdependent parameters in each node [
42]. Another is the inclusion of countermeasures within the attack tree, called an attack-defense tree [
43,
44]. According to the authors, an attack-defense tree is a better representation of a system than an attack tree because the latter only captures attack scenarios and does not model the interaction between attacks and the defenses that could be put in place to guard against them. More importantly, system security is constantly evolving—as better control measures are put in place, more sophisticated attacks are implemented. Therefore, modeling only attacks without considering the defenses in place is very limiting. Guided by the formalisms introduced in [
43,
44], we extended their concepts to include guidelines and considerations from DO-326A and DO-356A so that the terminology used in the tree is relevant. Furthermore, we precisely defined the qualitative and quantitative aspects of the attack-defense tree: just as fault trees are rooted in the theory of probability, we wanted our attack-defense trees to be grounded in mathematical theory. The theoretical foundations are covered in [
45], so only a short summary is provided in the following paragraphs.
One of the terms used in DO-356 is “likelihood of successful attack”. The top node of an attack-defense tree represents an attacker’s goal. The quantitative measure for fault trees is probability; the quantitative measure we defined for the attack-defense tree is the likelihood of successful attack. Likelihood is not a probability and does not follow a probability distribution. Ref. [
46] distinguishes the terms in this way: “Probability attaches to possible results; likelihood attaches to hypotheses … The results to which probabilities attach are mutually exclusive and exhaustive; the hypotheses to which likelihoods attach are often neither”. Another interesting statement from this article is that, “Because we generally do not entertain the full set of alternative hypotheses …, the likelihoods that we attach our hypotheses do not have any meaning in and of themselves; only the relative likelihoods—that is, the ratio of two likelihoods—have meaning.” VERDICT reports to the user likelihoods for each cyber requirement that can be compared across the system architecture being analyzed. The user should not attempt to compare the safety requirement probabilities against the cyber requirement likelihoods.
An attack-defense tree is made up of two types of nodes: attack nodes and defense nodes. The attack nodes come from STEM: based on component properties annotated on the AADL system components and connections (see
Figure 3), STEM outputs CAPEC attacks. These attacks are parallel to the undesired events in safety analysis. In safety, undesired events such as loss of availability or integrity of a component are identified by an engineer either through lab tests or manufacturer specifications and specified in the VERDICT annex as Event (see
Figure 8). In security, the attacks are identified by STEM and are always given a value of 1 for the likelihood of successful attack. Assigning a number to the strength of an attack is quite difficult, and any such number will only hold true for a short period of time. According to Javaid et al. [
47], “more emphasis should be put on countermeasures for threats”. Therefore, in SOTERIA++ we assume a worst-case number for attacks and focus on assigning a score for the defenses.
STEM also returns suggested defenses for each attack. The user specifies which of these suggested defenses have been implemented in the actual system by annotating the AADL component with the right properties (see
Figure 3, “Cyber Defense and Mitigation DAL”). The VERDICT annex allows the user to specify the Design Assurance Level (DAL), which indicates the level of rigor applied to implementing the mitigation on the system. Raising the DAL is how the likelihood of successful attack gets lowered.
Here is an illustration of the security analysis of the Delivery Drone System presented in
Figure 2. The security requirement presented in
Figure 5 says, “
The drone shall be resilient to loss of ability to deliver a package to the appropriate consumer location.” More formally, this is a loss of integrity concern on the Delivery Drone System, as specified by the field “cia=I”. Furthermore, any of the following system outputs can contribute to this loss: loss of integrity of
actuation_out, loss of availability of
actuation_out, loss of integrity of
delivery_status, or loss of availability of
delivery_status. Finally, the severity of not meeting this requirement is labeled as “
Hazardous”, which equates to 1e-7. This means that identified attacks that impact
CyberReq01 must be mitigated with enough rigor to lower the likelihood of successful attack to 1e-7 or less. Having compositionally modeled the cyber relations of each component of the Delivery Drone (such as the ones illustrated in
Figure 6 for the
DeliveryItemMechanism) and having modeled the component connections of the Delivery Drone System, the attack-defense tree snippet for
CyberReq01 is illustrated in
Figure 18.
Synthesis of Optimal Defenses for Cyber-Vulnerabilities of System Architectures
The goal of model-based architecture synthesis is to synthesize a minimal set of defenses, each at a Design Assurance Level (DAL), that satisfies all the cyber requirements while minimizing implementation cost. There are several operating modes for the synthesis tool based on its input. Synthesis can ignore the existing implemented defenses and yield a globally optimal defense solution, or it can find a locally optimal defense solution that takes the existing implemented defenses into account. Synthesis with implemented defenses has two modes, depending on whether the existing implemented defenses satisfy all the cyber requirements or not. The idea of synthesis is to transform the attack-defense tree into a defense-only tree, where each defense node is associated with a user-specified cost. The defense tree with costs is then encoded as logical formulas in MaxSMT.
To run synthesis, the user needs to provide an implementation cost model for all possible combinations of component, defense property, and DAL. The cost model may be specified in linear scaling mode: designers provide a scaling factor, and the cost for a <component-defense-DAL> triple is the DAL multiplied by the scaling factor, and is therefore monotonically increasing with respect to DAL (i.e., a higher DAL demands a higher cost in general). The cost model is specified in an XML file, which must be named “costModel.xml” and placed at the top level of the AADL project directory. In addition, the interpreter of the cost model abides by the following rules:
By default (no costs specified), a scaling factor of 1 is used for all <component-property-DAL> triples.
Unless a scaling factor for the generic cases is specified as shown in item #2 below, a scaling factor of 1 will be used for all unspecified costs.
More specific costs take precedence over less specific costs. For instance, costs specified in item #4 below take precedence over costs specified in item #2 and #3; and costs specified in item #3 take precedence over costs specified in item #2.
We will use the delivery drone example to demonstrate the XML syntax to encode different scenarios under linearly scaling cost mode.
1. Default mode (a scaling factor of 1 will be used for all <component-defense-DAL> triples):
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<cost-model> </cost-model>
2. A scaling factor of 2.0 for all <component-defense-DAL> triples:
<cost>2.0</cost>
3. A scaling factor of 3.0 for all defense properties of the actuation subcomponent of DeliveryDroneSystem.Impl (note that the triple colon ":::" is used to concatenate the component implementation with the subcomponent instance; also, the XML field keyword for connections is likewise component):
<cost component="DeliveryDroneSystem.Impl:::actuation">3.0</cost>
4. A scaling factor of 4.0 for the "antiJamming" defense property of the camera subcomponent of DeliveryDroneSystem.Impl:
<cost component="DeliveryDroneSystem.Impl:::camera" defense="antiJamming">4.0</cost>
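Putting the fragments together, a complete costModel.xml combining scenarios 2 through 4 might look as follows (a sketch; it assumes the fragments above nest directly under the cost-model root element, as in scenario 1):
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<cost-model>
	<!-- generic scaling factor for all otherwise unspecified triples -->
	<cost>2.0</cost>
	<!-- all defense properties of the actuation subcomponent -->
	<cost component="DeliveryDroneSystem.Impl:::actuation">3.0</cost>
	<!-- the antiJamming defense property of the camera subcomponent -->
	<cost component="DeliveryDroneSystem.Impl:::camera" defense="antiJamming">4.0</cost>
</cost-model>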
When existing implemented defenses are ignored, synthesis produces a globally optimal defense solution that satisfies all cyber requirements. The solution is globally optimal with respect to the set of NIST-800-53 controls supported by VERDICT. It is a minimal (though not necessarily unique) set of component-defense-DAL triples. The synthesis problem is reduced to MaxSMT and solved with the Z3 solver [48]. The Satisfiability Modulo Theories (SMT) task is to check the satisfiability of first-order formulas containing constraints from various theories such as arithmetic, arrays, and strings; MaxSMT extends the SMT task to solve optimization problems over SMT formulas. We give a high-level overview of the encoding algorithm. To obtain a globally optimal defense solution, all applicable defenses are generated by STEM (every attack has a defense in STEM), and costs are provided by the user. We create an SMT variable v_p_d for each component-defense pair (p, d). Note that an individual component-defense pair may occur multiple times in the attack-defense tree, but every occurrence is represented by the same SMT variable. We define two functions, f_AD and f_D, that inductively transform an attack-defense tree into SMT encodings (prefix notation is used), where AND/OR_AD(x_1, ..., x_k) and AND/OR_D(y_1, ..., y_k) denote attack nodes and defense nodes, respectively, and a, t, and C(p, d, l) denote a CAPEC attack, a defense tree, and the cost corresponding to component p, defense d, and DAL l, respectively. The DAL l is chosen based on the severity in the cyber requirement: 0 for None (no effect); 3 for Minor; 5 for Major; 7 for Hazardous; and 9 for Catastrophic.
- f_AD(AND_AD(x_1, ..., x_k)) => (or f_AD(x_1) ... f_AD(x_k))
- f_AD(OR_AD(x_1, ..., x_k)) => (and f_AD(x_1) ... f_AD(x_k))
- f_AD(ATTACK(a, p, t, l)) => f_D(p, t, l)
- f_D(p, AND_D(y_1, ..., y_k), l) => (and f_D(p, y_1, l) ... f_D(p, y_k, l))
- f_D(p, OR_D(y_1, ..., y_k), l) => (or f_D(p, y_1, l) ... f_D(p, y_k, l))
- f_D(p, DEFENSE(d), l) => (>= v_p_d C(p, d, l))
The objective function to be minimized by the MaxSMT solver is the sum of all implementation costs. It is important to note that when encoding the attack-defense nodes, we flip the AND and OR nodes. This transposition follows from the definitions, but the intuitive explanation is that attack-defense nodes are effectively the duals of their defense node counterparts: the attack-defense nodes are written from the perspective of the attacker, while the defense nodes are written from the perspective of the defender. From the model returned by the solver, the DAL for each component p and defense d is recovered using the inverse function C^-1(c) = max{l | C(p, d, l) = c}. Essentially, the inverse selects the maximum DAL attainable at the given cost.
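To make the encoding concrete, consider a small hypothetical fragment: a cyber requirement of Hazardous severity (so l = 7) is threatened by a single jamming-style attack on the camera, mitigated by the lone defense antiJamming, under the scenario-4 cost model above (scaling factor 4.0, so C(camera, antiJamming, 7) = 7 × 4.0 = 28.0). The attack and defense names are illustrative. The encoding unfolds as:
f_AD(ATTACK(a, camera, DEFENSE(antiJamming), 7))
  => f_D(camera, DEFENSE(antiJamming), 7)
  => (>= v_camera_antiJamming 28.0)
The MaxSMT objective minimizes v_camera_antiJamming, so the optimal model assigns it 28.0, and the inverse function C^-1(28.0) = 7 tells the designer to implement antiJamming on the camera at DAL 7.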
A globally optimal defense solution for the delivery drone example, as returned by synthesis, is shown in
Figure 19. There are five columns in the result panel: (1) Designer Action shows what the designer has to do; (2) Component/Connection shows the component or connection on which the designer has to take the action; (3) Defense Property shows the defense property that the designer has to act on; (4) Target DAL shows the DAL at which the designer has to implement the defense property; and (5) Target Cost shows the cost of implementing the defense property.
When existing implemented defenses are taken into account, synthesis produces a locally optimal solution; its behavior depends on whether the implemented defenses already satisfy the cyber requirements. If they do not, synthesis selects a minimal set of component-defense-DAL triples to implement or upgrade in order to satisfy all cyber requirements. An example of such a synthesis solution is shown in Figure 20. This mode treats the already-implemented defenses as free for the purposes of minimizing total cost, although the output of the tool still reports their costs. The solution is minimal among the solutions that can be obtained without removing or downgrading any defenses. Note that there may be already-implemented defenses that are unnecessary; removing them (as recommended by the other mode below) would yield a solution with smaller cost. In this case, an additional SMT constraint is introduced to encode the cost of the implemented DAL as a lower bound for each component and defense: (>= v_p'_d' C(p', d', l')), where l' is the implemented DAL for the component-defense pair (p', d').
Conversely, if the implemented defenses already satisfy the cyber requirements, synthesis performs additional cost reduction: it selects a maximal set of component-defense-DAL triples to remove or downgrade while still satisfying the cyber requirements. In essence, the already-implemented defenses are no longer treated as free. This solution is minimal among solutions that can be obtained without implementing or upgrading any defenses; it is only locally minimal because a cheaper solution, if one exists, would require removing or downgrading some defenses while implementing or upgrading others. In this case, an additional SMT constraint is introduced to encode the cost of the implemented DAL as an upper bound for each component and defense: (<= v_p'_d' C(p', d', l')), where l' is the implemented DAL for the component-defense pair (p', d'). An example of the synthesis solution for this case is shown in
Figure 21.
5.4. Assurance Case Fragments Generation
An assurance case is an argument-based documentation artifact that can provide explicit assurance that a system satisfies certain desired safety, security, or reliability properties. Assurance cases have been widely used for system certification in industry, as they are easy for domain experts and certifiers to understand, and they provide structured and irrefutable arguments about system requirements. We have, therefore, incorporated into the VERDICT toolchain the functionality of generating both safety and security assurance case fragments automatically, providing system designers with an option to quickly assemble the assurance evidence in an industry-accepted format for certification.
Multiple standards exist for assurance case representation such as CAE (Claim, Argument and Evidence) [
49], GSN (Goal Structuring Notation) [
50,
51], and SACM (Structured Assurance Case Metamodel) [
52]. However, all approaches involve three basic elements: claims, arguments, and evidence, where the arguments justify how the evidence satisfies the claims. In VERDICT, we use the GSN notation for representing assurance case fragments because it provides a graphical representation of the elements, which can be more easily analyzed by human designers and certifiers than non-graphical representations. The GSN standard contains four principal elements that are connected to create a structured argument in the form of a graph, as shown in
Figure 22. They are:
Goals represent the requirements a system is expected to satisfy.
Solutions are the lowest-level evidence that may be used to claim that a goal has been satisfied.
Strategies state how goals are satisfied by sub-goals or solutions.
Contexts contain contextual information and can be used with goals, solutions, or strategies.
5.4.1. GSN Assurance Case Fragments in VERDICT
A meaningful and complete GSN fragment should include the four components mentioned above to form a comprehensible and structured argument about the requirements of a system. In the context of the VERDICT toolchain, a system can have three types of requirements—mission requirements, cyber requirements, and safety requirements. The information present in the corresponding constructs in the VERDICT annex for these requirements is used to populate the goals, strategies, and contexts of a GSN fragment. Currently, the assurance case tool only supports evidence generated by MBAAS, which is available in the form of XML outputs from SOTERIA++. We explain below how this information is used to generate the various components of the VERDICT GSN fragments.
Goals are requirements specified in the VERDICT annex. The highest-level goals are the mission requirements, which are supported by cyber and safety requirements. The goal statement is the statement declared in the description construct of a requirement.
Strategies are used to connect each goal with its sub-goals or to the SOTERIA++ solutions that prove or disprove them. For mission-level goals, the strategies argue correctness by validity of sub-goals. For cyber-level and safety-level goals, the strategies argue correctness by SOTERIA++ analysis of attack-defense trees and fault-trees respectively.
Contexts are provided in VERDICT as follows:
For mission-level goals, a context states what aspect of the system the requirement is concerned with.
For strategies of mission-level goals, a context describes the cyber and safety requirements specified in the reqs construct of the mission requirements.
For cyber-level and safety-level goals, contexts are provided for the ports specified in the condition construct of the requirements.
For strategies of cyber-level and safety-level goals, a context provides reference to the information present in the condition, severity, and targetprobability constructs of the requirements and another context provides reference to the VERDICT properties associated with the system model.
Solutions: two types of evidence are provided by VERDICT and used as solutions in the VERDICT GSN: the SOTERIA++ attack-defense tree analysis results for cyber requirements, and the SOTERIA++ fault-tree analysis results for safety requirements.
Extensions to the GSN standard
When flaws in a system model cause one or more requirements to fail during the analysis, the solutions that are generated from SOTERIA++ cannot support the goals that they are expected to support. Under such circumstances, complete assurance case fragments cannot be generated for the higher-level goals that are supported by lower-level goals that have failed. However, even the information about failed goals can be presented in a graphical manner that is beneficial to system designers. We have extended the VERDICT GSN by providing colored indications about the success or failure of goals, strategies, and solutions in a GSN. The rules that dictate the color of a node are given below:
If the SOTERIA++ output for a cyber requirement has computed likelihood > acceptable likelihood, then the solution has failed and is colored red. Otherwise it is colored green.
If the SOTERIA++ output for a safety requirement has computed probability > acceptable probability, then the solution has failed and is colored red. Otherwise it is colored green.
If a strategy is supported by a solution, it bears the same color as the solution.
If a strategy is supported by one or more goals, it is colored green if and only if all supporting goals are green. Otherwise it is colored red.
A goal bears the same color as the supporting strategy.
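These rules amount to a simple bottom-up color propagation over the GSN graph. The following minimal sketch captures the idea in Python; the helper names are hypothetical and not VERDICT's actual implementation, which operates on the full GSN data structures.

```python
# A minimal sketch of the coloring rules above (hypothetical helpers).

def solution_color(computed, acceptable):
    """A solution fails (red) when the computed likelihood/probability
    exceeds the acceptable threshold; otherwise it succeeds (green)."""
    return "red" if computed > acceptable else "green"

def strategy_color(child_colors):
    """A strategy supported by a single solution inherits its color; a
    strategy supported by goals is green iff all supporting goals are
    green. Both cases reduce to the same check."""
    return "green" if all(c == "green" for c in child_colors) else "red"

def goal_color(strategy):
    """A goal bears the same color as its supporting strategy."""
    return strategy

# Example: a cyber requirement whose computed likelihood exceeds the
# acceptable likelihood turns its solution, strategy, and goal red.
sol = solution_color(computed=1e-5, acceptable=1e-7)   # "red"
assert goal_color(strategy_color([sol])) == "red"
```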
We refer to a GSN graph in which all goals, strategies, and solutions are green as a complete assurance case fragment. If a GSN graph has at least one red node, it is an incomplete assurance case fragment, but can still be useful to designers as it provides a clear visual indication of which goals and sub-goals have failed, allowing the designers to easily isolate, locate, and mitigate flaws in a model.
The GSN notation allows additional supporting elements for incorporating information in assurance case fragments. We have also extended the VERDICT annex with two optional String constructs called justification and assumption. In a GSN graph, justifications and assumptions are represented by oval nodes.
5.4.2. An Illustration of Assurance Case Fragments Generation in VERDICT
The ACFG capability of the VERDICT tool-suite generates three types of artifacts for each GSN fragment: (1) a dot file; (2) an SVG file for the GSN graphs; and (3) an XML file with the GSN data embedded. The dot file is used to render the SVG file, which displays the GSN assurance case fragments visually. The XML file is used for data exchange with other assurance case tools. An SVG graph displaying the assurance case fragments for the mission requirement MReq01 of the delivery drone example is shown in Figure 23.
Figure 23. A GSN example generated by VERDICT for the mission requirement MReq01. The example shows three branches, with one failing branch colored red and two succeeding branches colored green.
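As a rough illustration of the dot artifact mentioned above, the following sketch writes a tiny GSN fragment in Graphviz dot form. The node names and labels are hypothetical; the shapes follow GSN conventions (boxes for goals, parallelograms for strategies, circles for solutions), and the colors follow the rules described in the previous subsection.

```python
# Hypothetical dot output for a one-branch GSN fragment. Rendering to
# SVG could then be done with Graphviz, e.g. `dot -Tsvg gsn.dot -o gsn.svg`.

GSN_DOT = """
digraph GSN {
  G1 [shape=box,           color=green, label="Goal: MReq01"];
  S1 [shape=parallelogram, color=green, label="Strategy: argue by\\nsub-goal validity"];
  G2 [shape=box,           color=green, label="Goal: CyberReq01"];
  E1 [shape=circle,        color=green, label="Solution:\\nSOTERIA++ output"];
  G1 -> S1 -> G2 -> E1;
}
"""

with open("gsn.dot", "w") as f:
    f.write(GSN_DOT)
```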
In addition, ACFG can generate fine-grained views of the assurance case fragments based on user settings. Examples of GSN security assurance case fragments for the cyber requirement CyberReq02 (top) and the deliveryItemMechanism component (bottom) are shown in Figure 24.
Figure 24. A fine-grained view of assurance case fragments for cyber requirement CyberReq02 and for the deliveryItemMechanism component. Each solution node in the graph is clickable and redirects to the corresponding evidence file.
5.5. Cyber Resiliency Verifier (CRV)
In our context, we say a system is cyber-resilient with respect to a given set of desired functional properties and threat models only if the system can ensure the satisfaction of these properties even when it is susceptible to the attacks considered in the threat models. The Cyber Resiliency Verifier (CRV) enhances the VERDICT tool's architecture-level security and safety analysis with the capability of formally reasoning about a system design's cyber-resiliency. More precisely, given a system design and a list of functional properties (expressed in some formal logic) that the system architect expects the design to satisfy, CRV checks whether the design is resilient to integrity attacks, that is, whether the stipulated functional properties can be guaranteed even when some components and channels of the system are vulnerable to integrity attacks (e.g., logic bomb, network spoofing, remote code execution, control-flow integrity violation). In our context, a system design consists of not only the architecture of the system (i.e., components and their interconnections) but also the behavioral information of each system component. Once a system design along with formal properties has been added to the system design model (or, simply, system model) under analysis, as discussed before, CRV allows one to analyze the system both in a benign case and under a library of different threat models. The library currently encompasses eight threats: logic bomb, insider threat, location spoofing, network injection, hardware trojans, outside user threat, remote code injection, and software virus/malware/worm/trojan, but can be extended as needed. If CRV can identify an execution under which one of the desired functional properties is violated, either in the benign case or in the adversarial case, it generates as evidence a counterexample, which is a sequence of execution steps of the system that demonstrates the violation. In addition, it generates further diagnostic information in the form of blame assignment (when a counterexample is discovered) and merit assignment (when no counterexample exists). These two additional pieces of diagnostic information can be used by the designer to refine the design to avoid violating a desired property and hence make the system more cyber-resilient.
Given a set of threat models, the cyber-resiliency analysis performed by CRV on the original design model can be reduced to a model checking problem on an instrumented model, where the instrumented model enhances the original one with integrity attacks drawn from the input threat models. Our automatic instrumentation process is enabled by the novel notion of attack effects instead of concrete attacks. By capturing attack effects that influence a system component's/channel's integrity, one can abstractly group together multiple attacks. In addition, one can simulate attack effects by carefully placing non-determinism. As an example, assume a system design contains two components A and B such that A takes its input from the environment, A's outputs are fed into B's inputs, and B's outputs are considered to be system outputs. Now, if we consider that A is vulnerable to a control-flow integrity attack (e.g., buffer overflow, ROP), then component A can behave arbitrarily according to the adversary's choice. To capture such integrity attacks on A's behavior, in our instrumented model we can simply introduce non-determinism in the logical channel that carries information from A's outputs to B's inputs. In short, this can be viewed as simulating a lossy channel that does not always deliver A's outputs to B's inputs faithfully (i.e., the lossy channel can modify A's output arbitrarily). As a side note, we introduce an enable switch, in the form of a Boolean variable, for each of the non-determinisms we place. When the Boolean variable is set to true, the channel acts as a lossy channel, whereas a false value indicates that the channel carries information faithfully. These switches are treated as model parameters (i.e., symbolic values) in the blame assignment analysis, as will be made clear later.
We essentially use non-determinism to consider all possible attack strategies modifying the behavior of A (e.g., after the ROP attack, we do not know what code the attacker would run in place of A's original code) and, in turn, use the model checker as an attacker to find plausible attacker strategies that violate the desired functional properties. In a typical model checking context, an attacker can only control the system inputs; we extend this influence to the internal communication between components and claim that this is sufficient to simulate integrity attacks on components and channels.
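The following minimal sketch illustrates the channel instrumentation in Python, assuming a simple step-based execution model. The names are hypothetical; the real instrumentation operates on the translated AADL/AGREE model, where the adversary's choice is a symbolic non-deterministic value explored by the model checker rather than a random one.

```python
import random

def instrumented_channel(a_output, switch_enabled, havoc):
    """Deliver A's output to B faithfully, unless the enable switch is
    on, in which case the 'lossy channel' may replace it with an
    arbitrary, adversary-chosen value (stood in here by `havoc`)."""
    return havoc() if switch_enabled else a_output

# One execution step: B receives either A's true output or an
# attacker-controlled value, depending on the enable switch.
a_out = 42
b_in = instrumented_channel(a_out, switch_enabled=True,
                            havoc=lambda: random.randint(-2**31, 2**31 - 1))
```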
Advantage of CRV’s Analysis: Considering the design model enables CRV to identify or rule out security vulnerabilities of a system more precisely than what can be determined at the architectural level alone. For example, suppose a system uses a GPS component for navigation. By only inspecting the architecture, the best we could do is identify that the system is possibly vulnerable to a GPS spoofing attack. Even if the system uses a robust GPS sensor to resist spoofing attacks, the lack of behavioral details in the architecture means we would not be able to rule out such attacks. Thanks to its access to the design model, however, CRV is able to rule out such false positives. Furthermore, it may identify vulnerabilities missed by inspecting only the system architecture. For instance, suppose a system includes other location sensors (e.g., a LIDAR) together with the GPS sensor and uses a majority voting scheme to rule out spoofing attacks, under the assumption that an attacker cannot simultaneously compromise all the positioning sensors on board. Such a system will be deemed resistant to spoofing attacks by architecture analysis tools. However, the way this security-enhancing mechanism is designed can harbor logical errors that still enable spoofing attacks. For instance, the majority voting scheme in the above scenario could have a vulnerability that, under certain conditions, makes it adjudicate the spoofed location value to be the correct value. CRV can identify such a situation through a formal analysis of the components’ behavior.
Another way that CRV adds to the analysis is in the threat model it uses. More precisely, architectural analysis tools only consider previously documented attacks as part of their threat model. In contrast, CRV groups possible attacks based on their effects on the system and reasons about them collectively using an abstract threat effect model. For example, buffer/heap overflow, SQL injection, and return-oriented-programming (ROP) attacks, in the worst case, all provide an attacker with the ability to run malicious code in a system without prior proper privileges. CRV will thus consider “running malicious code” as a possible effect instead of considering all known attacks that achieve this effect. Such an approach has a number of clear benefits. First, the number of effects one must consider for covering a wide variety of attacks is substantially smaller than the number of concrete attacks. Second, this approach can also capture unknown or even future attacks, provided these attacks have an effect already captured in the threat effect model.
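A toy illustration of this grouping is sketched below; the table entries are hypothetical examples drawn from the discussion above, not CRV's actual threat effect library, which is defined over the model rather than as a lookup table.

```python
# Hypothetical mapping from abstract effects to the concrete attacks
# they subsume, following the "running malicious code" example above.
THREAT_EFFECTS = {
    "run_malicious_code": ["buffer_overflow", "heap_overflow",
                           "sql_injection", "rop"],
    "spoof_sensor_input": ["gps_spoofing", "network_injection"],
}

def effects_covering(attack):
    """An attack (even a previously unknown one) is covered as long as
    its worst-case effect already appears in the effect model."""
    return [e for e, attacks in THREAT_EFFECTS.items() if attack in attacks]
```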
Detailed CRV Workflow: CRV is meant to be used in the system design phase. Its workflow is depicted in
Figure 25. The system architect/designer is responsible for providing the system design model (refer to 1) as input to CRV. The designer is also responsible for choosing an appropriate threat effect model (refer to 2) from a library of such models under which to carry out the necessary formal resiliency analysis. The CRV workflow intentionally separates the system’s design model from the adversarial threat effect models. This decoupling enables the system designer to develop the design model without having to include aspects of the system’s security properties. Given the system design model, the chosen threat effect model, as well as a number of desired cyber-resiliency properties for the system, the threat model instrumentor (refer to 3) first automatically composes the two models to obtain a threat-instrumented model (refer to 4). Then, the satisfaction of the cyber-resiliency properties (refer to 5) on the threat-instrumented model is checked by the reasoning engine powered by the Kind 2 model checker (refer to 6) [53]. Cyber-resiliency properties are provided by the designer and expressed in past-time linear temporal logic. The analysis artifacts produced as feedback (refer to 7) by CRV include a counterexample, localized diagnostic information for tracing possibly vulnerable components in the system (blame assignment), and the identification of components that may have played a critical role in verifying the cyber-resiliency requirement even in an adversarial environment (merit assignment).
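For illustration only (this is a made-up property, not one of the paper's actual requirements), a past-time LTL property over the delivery drone's signals could be written as follows, where H is the "historically" operator and O is "once":

```latex
% Hypothetical past-time LTL property:
% "At every step, if a delivery command is issued now, then the
%  destination has never been off-limits so far."
\[
  \mathbf{H}\,\big(\mathit{delivery\_cmd} \;\Rightarrow\; \neg\,\mathbf{O}\,\mathit{off\_limits}\big)
\]
```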
We envision that one carries out the analysis of the system design with respect to the formal properties in the benign case before moving on to the adversarial case. This recommendation ensures that the system verifiably maintains the required functional properties when there is no adversarial interference. If this analysis step were skipped, and one directly started the analysis with respect to some threat models in the CRV library, then, should CRV generate a counterexample, it would be difficult for the user to pinpoint whether the violation is due to adversarial interference or to a design flaw. Once users have verified their design in the benign case, when they move on to the adversarial case and observe a counterexample, they can be sure that the violation is due to adversarial interference.
5.5.1. Blame Assignment
When the instrumented model is fed into the model checker during the CRV workflow, for each formal property the model checker can come back with one of three possible answers: Safe (the model checker was able to prove that the model satisfies the property); Unsafe (the model does not satisfy the property); or Unknown (the model checker timed out and could neither prove nor disprove the property). For each property determined to be unsafe, in addition to the attack trace, CRV can also generate information about the misbehaving components that may have contributed to the violation. We call this functionality blame assignment. CRV supports a locally optimized and a globally optimized form of blame assignment, both of which are achieved by posing a series of queries to the backend model checker.
Technically, the blame assignment problem is an optimization problem that requires minimizing the number of enabled switches (i.e., the number of vulnerable components/channels) required to obtain a counterexample. This can be viewed as identifying a minimum cut set of enabled switches that is sufficient for demonstrating a violation of the property in question.
Suppose during instrumentation we introduce non-determinism in n channels, each controlled by a unique enable switch (i.e., a Boolean variable), and after analysis the model checker discovers a counterexample containing l steps of the system execution. The locally optimized form of blame assignment checks whether it is possible to generate an l-step counterexample with fewer than n enable switches turned on. Note that the counterexample observed during the minimization step of locally optimized blame assignment may not be identical to the one observed during the initial model checking; it just needs to be of the same length l. The globally optimized form of blame assignment is similar to the locally optimized one with one exception: there is no restriction that the counterexample length be equal to the original one obtained during model checking. The insight is that it may be possible to generate a counterexample with a smaller number of enable switches turned on (i.e., fewer vulnerable components/channels) when there is no restriction that the counterexample obtained during this analysis be of the same length as the one obtained during the original model checking.
The algorithm: We now present our algorithm for finding the minimum cut set, first in the global optimization setting and then in the local one. In the original model checking problem, all switch parameters are set to true. For blame assignment, however, all switch parameters are left unset. Solving the blame assignment problem then corresponds to finding a truth assignment to these parameters that produces a violation of the property with the smallest number of parameters set to true. In other words, let S and O denote the total number of enable switches and the number of those that are turned on, respectively. We want to minimize the value of O while still violating the property. For this, we pose a series of model checking queries with the following two restrictions added to the original model checking problem:
(R1) each switch should maintain the same value during the whole execution (that is, it cannot switch between on and off);
(R2) O is bounded above by some constant C (O ≤ C).
Initially, we choose C = S − 1 and keep decreasing the value of C as long as the model checker returns unsafe for the enhanced model with the two restrictions above. Once we find the smallest value of O, we identify from the attack trace the components whose enable switches are turned on and return the corresponding compromised components in the blame assignment.
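The loop below sketches this global minimization in Python. The `check_unsafe` oracle is a hypothetical stand-in for a model checking query on the instrumented model with restrictions (R1) and (R2); the result object and its `switches_on` field are likewise assumptions for illustration.

```python
def global_blame_assignment(num_switches, check_unsafe):
    """Find the smallest set of enabled switches that still admits a
    property violation, by tightening the bound C while the model
    checker keeps returning unsafe."""
    best = None
    c = num_switches - 1
    while c >= 0:
        result = check_unsafe(max_switches_on=c)  # query with (R1), (R2)
        if result is None:       # Safe: no violation with <= c switches on
            break
        best = result            # Unsafe: remember the trace, tighten C
        c -= 1
    if best is None:
        return None              # blame cannot be reduced below all n switches
    # Components whose enable switches are on in the minimal trace.
    return [i for i, on in enumerate(best.switches_on) if on]
```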
In the global optimization setting, the attack trace corresponding to the smallest blame assignment set can be longer than the original trace, i.e., the one found by the model checker before the optimization. If, on the other hand, the designer is interested in attack traces of minimal length, it is possible to ask CRV for a locally optimized blame assignment. In this setting, we generate a bounded model checking problem [54] to find attack traces of the same length as the original trace, but we force a MaxSMT solver to minimize the number of parameters set to true by setting the negations of the Boolean switch parameters as soft constraints. This produces the desired effect if that trace is already of minimum length.
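A minimal sketch of this local optimization using Z3's Optimize engine is shown below. The actual CRV backend is Kind 2; Z3 is used here only to illustrate soft constraints, and the `unroll` function is a hypothetical stand-in for the l-step bounded model checking formula of the instrumented model conjoined with the negated property.

```python
from z3 import Bool, Not, Optimize, sat, is_true

def local_blame_assignment(switch_names, unroll, length):
    opt = Optimize()
    switches = [Bool(name) for name in switch_names]
    # Hard constraint: an l-step execution violating the property exists.
    opt.add(unroll(switches, length))
    # Soft constraints: prefer each switch off. MaxSMT maximizes the
    # number of satisfied soft constraints, i.e., minimizes switches on.
    for s in switches:
        opt.add_soft(Not(s))
    if opt.check() == sat:
        m = opt.model()
        return [s for s in switches if is_true(m[s])]  # minimal enabled set
    return None  # no l-step violation exists
```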
5.5.2. Merit Assignment
For merit assignment, which is relevant when the system satisfies the given cyber-resiliency properties even under attack, CRV can provide traceability information between those properties and design elements such as assumptions, guarantees, or other behavioral constraints. Specifically, CRV uses a new capability of Kind 2 for identifying minimal sets of design elements, known as Minimal Inductive Validity Cores (IVCs) [49], which are sufficient to prove the given set of cyber-resiliency properties. This functionality can be enabled as a post-analysis in the CRV Settings panel. When CRV can prove that one or more properties are satisfied by the model, the user can view the inductive validity core of each satisfied property: it shows the components involved in the satisfaction of the property and, for each component, which assumptions and guarantees were used in the proof.
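One simple way to approximate a minimal core is deletion-based minimization, sketched below. The `proves` oracle is a hypothetical stand-in for a model checker query asking whether a subset of design elements suffices to prove the property; Kind 2's actual IVC algorithms are more sophisticated, and this procedure yields a minimal (not necessarily minimum) core.

```python
def minimize_core(elements, proves):
    """Shrink a sufficient set of design elements (assumptions and
    guarantees) to a minimal one: drop each element in turn and keep
    the reduction if the remaining elements still prove the property."""
    core = list(elements)
    for e in list(core):
        trial = [x for x in core if x != e]
        if proves(trial):
            core = trial
    return core  # removing any single remaining element breaks the proof
```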
5.5.3. An Illustrative Example Using CRV
We now describe how a designer/user would interact with the CRV functionality of VERDICT on the delivery drone system in three cases: benign case (no threats against the system), threat effect case (CRV threats against the system), and mitigation case (a mitigation plan is implemented in the system).
Once we have extended the system design with behavioral information as well as the formal properties of interest, the first thing a user has to check is that the system design model satisfies the properties in a benign scenario, that is, without considering the effects of any threat model. To do that, the user needs to make sure that no CRV threat model is selected and then invoke the Cyber Resiliency Verifier (CRV). This initiates the translation of the AADL model to the VERDICT internal data model and then, finally, to the input to CRV. CRV will find a scenario where property P7 is violated. Since we know that cannot be the case in the benign scenario, one can add the assumption shown in Figure 26 to the DeliveryDroneSystem specification.
Since no threat is enabled in this case, property P7 is proved valid, as shown in Figure 27.
Next, we apply all CRV threats to the model. In this case, CRV finds a scenario where property P7 is violated: CRV has detected a possible violation of property P7 due to a Logic Bomb that may be hidden in the logic of the DeliveryPlanner. Moreover, the blame assignment analysis identifies a minimal set of critical ports, namely dest_location and delivery_cmd, that are sufficient for carrying out the Logic Bomb attack, as shown in Figure 28. In addition, one can view the counterexample to display the trace that leads the system to the violation of the property. Examining the trace, one can observe that the property might be violated if the delivery location provided through bus1 is off-limits.
Once the vulnerable components and ports are identified, the user can consider multiple possibilities to address the root cause of the attack. Let us assume the designer decides to place a runtime monitor called “PositioinRuntimeMonitor” between the DeliveryPlanner and the DeliveryItemMechanism that continuously checks whether the delivery location is off-limits, and raises a warning flag when that is the case and a command for releasing a package is issued. The AADL modeling of “PositioinRuntimeMonitor” is shown in Figure 29.
With this runtime monitor in place, the DeliveryItemMechanism can be extended to use the warning flag to prevent the delivery of a package in a restricted area. For that, one needs to add a new input port warning_flag that reads the signal from the runtime monitor, and also add a new guarantee to the DeliveryItemMechanism to ignore the command when the signal is true. The definition is shown in Figure 30.
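As a rough sketch of this mitigation logic: in the actual model it is expressed as AGREE assume-guarantee contracts in AADL, so the Python below is only an executable paraphrase, with the Boolean semantics assumed from the description above.

```python
def position_runtime_monitor(dest_off_limits: bool, delivery_cmd: bool) -> bool:
    """Raise the warning flag when a release command is issued for an
    off-limits delivery location."""
    return dest_off_limits and delivery_cmd

def delivery_item_mechanism(delivery_cmd: bool, warning_flag: bool) -> bool:
    """Guarantee: a package is released only if a command is issued and
    the warning flag is not raised."""
    return delivery_cmd and not warning_flag

# With the monitor in place, a release command for an off-limits
# location never results in a package release.
assert delivery_item_mechanism(True, position_runtime_monitor(True, True)) is False
```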
A new analysis of the system confirms that the applied fix is sufficient to rule out the Logic Bomb attacks, as shown in Figure 31.
Moreover, the Merit Assignment post-analysis can assure the user that the runtime monitor indeed plays an important role in the satisfaction of property P7 in the adversarial environment. The result is shown in Figure 32.
In fact, thanks to the implemented mitigation, the assumption “Delivery locations set through bus1 are never off-limits locations”, which the user added earlier during the analysis of the benign scenario, is no longer required.