Formal Methods for Rust

How to use the slides - Full screen (new tab) Slides Content --- title: Introduction to Formal Methods for Rust description: Introductory lesson on formal methods for Rust verification duration: 60 minutes ---

Introduction to Formal Methods for Rust

Outline

Intro to Formal Methods
Landscape of Techniques for Rust
Focus on Kani: Bounded Model Checker
Applications to Substrate

Introduction to Formal Methods

Story Time!

---v

Ariane 5 Rocket - Flight 501

in 1996, the launcher rocket disintegrated 39 secs after take-off.
Failure: An overflow, caused by a conversion from 64-bit to 16-bit floating point
Mistake: reusing inertial reference platform of Ariane-4, where overflow cannot happen due to different operational conditions
Cost: $500M payload, $8B development program

Notes:

Link to article: (https://www-users.cse.umn.edu/~arnold/disasters/ariane.html)

---v

Software Correctness is very important

Program testing can be used to show the presence of bugs,
but never to show their absence!

--Edgard Dijkstra--

Hence, the necessity to go beyond testing

---v

Formal Methods to the Rescue!

Given a system (code) and Specification (behavior), verify/prove correctness with reasonable mathematical guarantees.
Traditionally, costs and efforts were justifiable in safety-critical software like avionics, nuclear reactors, medical imaging, etc.
however, things have changed ...

Notes:

this is how Formal Methods were motivated; to prove the absence of Bugs! A bit of fear-mongering in my opinion.

---v

It is no longer Rocket Science!

AWS formally verifies Key-Value storage nodes in Amazon S3 (Rust Implementation).
Meta detects resource leaks and race conditions in Android apps
Uber uses static analysis to find Null-pointer exceptions
Ethereum's Beacon chain and Tendermint consensus formally verified for safety and liveness guarantees

Notes:

Personally think of formal methods as a more systematic way of detecting bugs.
Ideally, verifying if your property holds on all possible inputs.

---v

Formal Methods Today

From being theoretical research interests
to delivering practical cost-effective tools

goals more focused, promises less lofty
verification tools more efficient
combination of analysis techniques

Notes:

Limiting attention to a particular class of bugs, resource leaks, data-races, etc.
Drastic Speed-up in Underlying Constraint-Solver engines. For example, Z3 by microsoft, can solve constraints with billions of variables.
Unified theory with blurring lines; Combining both static and dynamic techniques.

---v

More like Light-weight Formal Methods

Rigorously detecting bugs → proving overall correctness of system.
Developer-centric Usability (e.g. workflow integration)

Notes:

Realized the importance of Developer experience.
No more obscure logic that the developer has to learn to write specifications.
You will see how intuitive it is to verify code.

---v

Formal Methods ↔ Blockchains

Hammer finally found the nail!

Lot at stake, justifies the cost and efforts
Business logic is compact and modular, within limits

Note:

Reputation along with money at stake.
A simple android app has 100k java classes. Techniques are not scalable on large codebases.
Complexity of runtime business logic is magnitude lower. Lot of interest in Smart Contract verification.
Check out Certora, Echidna, Securify, and more here

---v

Key Takeaways

Formal Methods are...

Not a Panacea but can improve software quality
Getting more and more accessible
Useful for increasing reliability and security of blockchains

Notes:

Great blog that explains the trade-offs between soundness and tractability

Tools Landscape

Notes:

Links to listed tools

---v

Tools Landscape

Quint/ State-Right (Model-checkers)

Humongous effort modelling the system & specifying properties
Abstraction gap
Reason about complex properties: safety & liveness of consensus mechanism

Notes:

Design-level, verifying protocol design.
Always a discrepancy in your model and actual code.
Safety: nothing bad ever happens; no two honest nodes agree on different state
Liveness: something good eventually happens; eventually 2/3rds reach consensus

---v

Tools Landscape

Static Analyzers

Code-level
Information/ dataflow properties; access control for code;
Specify expected behavior (properties). Roundtrip property: decode (encode (x)) == x
Default checks: bugs like arithmetic overflow, out-of-bound access panics

Notes:

Eg. for code access control: ensure that certain sensitive parts of runtime are only accessible by Root origin
MIRAI is developed by Meta uses technique called abstract interpretation; specifically useful for detecting panics statically and information flow properties
Kani: we will dive deeper soon

---v

Tools Landscape

Linters

Code-level
Checks for code smells
Other syntactic Properties

Notes:

Substrace is a linter specifically for Substrate
Flowistry allows you to track dependency between variables; slices only the relevant portion for a given location.

Our Focus: Kani

Kani: Model Checking tool for Rust

Open-source Rust verifier by AWS
Underlying technique used: Bounded Model Checking
Can be used to prove:
- Absence of arithmetic overflows
- Absence of runtime errors (index out of bounds, panics)
- User Specified Properties (enhanced PropTesting)
- Memory safety when using unsafe Rust
Provides a concrete test-case triggering the bug if verification fails

Notes:

Link to Bounded Model Checking paper for interested folks here. For example when you are accessing/modifying mutable static variable

---v

Lets see some Magic first

Demo of the Rectangle-Example

---v

Proof Harness

#![allow(unused)]
fn main() {
use my_crate::{function_under_test, meets_specification, precondition};

#[kani::proof]
fn check_my_property() {
   // Create a nondeterministic input
   let input = kani::any();

   // Constrain it according to the function's precondition
   kani::assume(precondition(input));

   // Call the function under verification
   let output = function_under_test(input);

   // Check that it meets the specification
   assert!(meets_specification(input, output));
}
}

Kani tries to prove that all valid inputs produce outputs that meet specifications, without panicking.
Else, Kani generates a trace that points to the failure.

---v

Property: decode(encode(x)) == x

Test

#![allow(unused)]
fn main() {
#[cfg(test)]
fn test_u32 {
  let val: u16 = 42;
  assert_eq!(u16::decode(&mut
    val.encode()[..]).unwrap(), val)
}
}

fixed value 42

Fuzzing

#![allow(unused)]
fn main() {
#[cfg(fuzzing)]
fuzz_target!(|data: &[u8]|) {
  let val = u16::arbitrary(data);
  assert_eq!(u16::decode(&mut
    val.encode()[..]).unwrap(), val)
}
}

multiple random values of u16

Kani Proof

#![allow(unused)]
fn main() {
#[cfg(kani)]
#[kani::proof]
fn proof_u32_roundtrip {
  let val: u16 = kani::any();
  assert_eq!(u16::decode(&mut
    val.encode()[..]).unwrap(), val)
}
}

verifies exhaustively all values of u16

---v

Under the Hood: Bounded Model Checking

Idea:

Search for counterexamples in (bounded) executions paths
However, this search is an NP-hard problem

Method:

Efficiently reduce problem to a Constraint Satisfaction (SAT) problem
verification reduced to problem of searching satisfiable assignment to a SAT formula.
leverages highly optimized SAT solvers making the search tractable.

Notes:

Kani uses miniSAT as the backend engine; a lot of other verification tools use Z3 solver.

---v

Translation to constraints

Code

#![allow(unused)]
fn main() {
fn foo(x: i32) -> i32 {
    let mut y: i32 = 8;
    let mut w: i32 = 0;
    let mut z: i32 = 0;
    if x != 0 {
        z -= 1;
    } else {
        w += 1;
    }
    assert!(z == 7 || w == 9);
    w+z
}
}

Constraints

#![allow(unused)]
fn main() {
y = 8,
z = x? y-1: 0,
w = x? y+1: 0,
z != 7 /\ w != 9 (negation of the assert condition)
}

Constraints fed into a Solver (minisat)
For no value of x the constraints hold $\implies$ Assert conditions verified
Else the solver found a failing test (counterexample)

---v

How does it handle loops?

Bounded in BMC to the rescue!
Loops are unwound up to a certain bounded depth $k$, else the verification does not terminate.
Determining the sweet-spot $k$ is a trade-off between tractability and verification confidence .

---v

Demo: Unwinding Loops

#![allow(unused)]
fn main() {
fn initialize_prefix(length: usize, buffer: &mut [u8]) {
    // Let's just ignore invalid calls
    if length > buffer.len() {
        return;
    }

    for i in 0..=length {
        buffer[i] = 0;
    }
}

#[cfg(kani)]
#[kani::proof]
#[kani::unwind(1)] // deliberately too low
fn check_initialize_prefix() {
    const LIMIT: usize = 10;
    let mut buffer: [u8; LIMIT] = [1; LIMIT];

    let length = kani::any();
    kani::assume(length <= LIMIT);

    initialize_prefix(length, &mut buffer);
}
}

---v

Dealing with Loops: Summary

Process:

Start with unwinding $k$ times
If no bug is found, increase $k$ until either:
- A bug is found
- verifier times-out
- predetermined upper-bound $N$ for $k$ is reached

---v

Implementing Arbitrary for custom type

#![allow(unused)]
fn main() {
use arbitrary::{Arbitrary, Result, Unstructured};

#[derive(Copy, Clone, Debug)]
pub struct Rgb {
    pub r: u8,
    pub g: u8,
    pub b: u8,
}

impl<'a> Arbitrary<'a> for Rgb {
    fn arbitrary(u: &mut Unstructured<'a>) -> Result<Self> {
        let r = u8::arbitrary(u)?;
        let g = u8::arbitrary(u)?;
        let b = u8::arbitrary(u)?;
        Ok(Rgb { r, g, b })
    }
}
}

Exercise

Verify Fixed-width & Compact Encoding for integer types in SCALE.

Open Ended properties!

RoundTrip: Decode (Encode (x)) == x
DecodeLength(x) == Decode(x).length()
EncodeAppend(vec,item) == Encode(vec.append(item))
......

Notes:

Potentially, we might play around with a few of these properties during a workshop this weekend.

More Verification

Less Bugs

Questions

Polkadot Blockchain Academy