Formal Methods for Rust
How to use the slides - Full screen (new tab)
Introduction to Formal Methods for Rust
Outline
- Intro to Formal Methods
- Landscape of Techniques for Rust
- Focus on Kani: Bounded Model Checker
- Applications to Substrate
Introduction to Formal Methods
Story Time!
---v
Ariane 5 Rocket - Flight 501
- in 1996, the launcher rocket disintegrated 39 secs after take-off.
- Failure: An overflow, caused by a conversion from 64-bit to 16-bit floating point
- Mistake: reusing inertial reference platform of Ariane-4, where overflow cannot happen due to different operational conditions
- Cost:
$
500M payload,$
8B development program
Notes:
Link to article: (https://www-users.cse.umn.edu/~arnold/disasters/ariane.html)
---v
Software Correctness is very important
Program testing can be used to show the presence of bugs,
but never to show their absence!--Edgard Dijkstra--
Hence, the necessity to go beyond testing
---v
Formal Methods to the Rescue!
- Given a system (code) and Specification (behavior), verify/prove correctness with reasonable mathematical guarantees.
- Traditionally, costs and efforts were justifiable in safety-critical software like avionics, nuclear reactors, medical imaging, etc.
- however, things have changed ...
Notes:
this is how Formal Methods were motivated; to prove the absence of Bugs! A bit of fear-mongering in my opinion.
---v
It is no longer Rocket Science!
- AWS formally verifies Key-Value storage nodes in Amazon S3 (Rust Implementation).
- Meta detects resource leaks and race conditions in Android apps
- Uber uses static analysis to find Null-pointer exceptions
- Ethereum's Beacon chain and Tendermint consensus formally verified for safety and liveness guarantees
Notes:
- Personally think of formal methods as a more systematic way of detecting bugs.
- Ideally, verifying if your property holds on all possible inputs.
---v
Formal Methods Today
From being theoretical research interests
to delivering practical cost-effective tools
- goals more focused, promises less lofty
- verification tools more efficient
- combination of analysis techniques
Notes:
- Limiting attention to a particular class of bugs, resource leaks, data-races, etc.
- Drastic Speed-up in Underlying Constraint-Solver engines. For example, Z3 by microsoft, can solve constraints with billions of variables.
- Unified theory with blurring lines; Combining both static and dynamic techniques.
---v
More like Light-weight Formal Methods
- Rigorously detecting bugs → proving overall correctness of system.
- Developer-centric Usability (e.g. workflow integration)
Notes:
- Realized the importance of Developer experience.
- No more obscure logic that the developer has to learn to write specifications.
- You will see how intuitive it is to verify code.
---v
Formal Methods ↔ Blockchains
Hammer finally found the nail!
- Lot at stake, justifies the cost and efforts
- Business logic is compact and modular, within limits
Note:
- Reputation along with money at stake.
- A simple android app has 100k java classes. Techniques are not scalable on large codebases.
- Complexity of runtime business logic is magnitude lower. Lot of interest in Smart Contract verification.
- Check out Certora, Echidna, Securify, and more here
---v
Key Takeaways
Formal Methods are...
- Not a Panacea but can improve software quality
- Getting more and more accessible
- Useful for increasing reliability and security of blockchains
Notes:
- Great blog that explains the trade-offs between soundness and tractability
Tools Landscape
Notes:
Links to listed tools
---v
Tools Landscape
Quint/ State-Right (Model-checkers)
- Humongous effort modelling the system & specifying properties
- Abstraction gap
- Reason about complex properties: safety & liveness of consensus mechanism
Notes:
- Design-level, verifying protocol design.
- Always a discrepancy in your model and actual code.
- Safety: nothing bad ever happens; no two honest nodes agree on different state
- Liveness: something good eventually happens; eventually 2/3rds reach consensus
---v
Tools Landscape
Static Analyzers
- Code-level
- Information/ dataflow properties; access control for code;
- Specify expected behavior (properties). Roundtrip property: decode (encode (x)) == x
- Default checks: bugs like arithmetic overflow, out-of-bound access panics
Notes:
- Eg. for code access control: ensure that certain sensitive parts of runtime are only accessible by Root origin
- MIRAI is developed by Meta uses technique called abstract interpretation; specifically useful for detecting panics statically and information flow properties
- Kani: we will dive deeper soon
---v
Tools Landscape
Linters
- Code-level
- Checks for code smells
- Other syntactic Properties
Notes:
- Substrace is a linter specifically for Substrate
- Flowistry allows you to track dependency between variables; slices only the relevant portion for a given location.
Our Focus: Kani
Kani: Model Checking tool for Rust
- Open-source Rust verifier by AWS
- Underlying technique used: Bounded Model Checking
- Can be used to prove:
- Absence of arithmetic overflows
- Absence of runtime errors (index out of bounds, panics)
- User Specified Properties (enhanced PropTesting)
- Memory safety when using unsafe Rust
- Provides a concrete test-case triggering the bug if verification fails
Notes:
Link to Bounded Model Checking paper for interested folks here. For example when you are accessing/modifying mutable static variable
---v
Lets see some Magic first
Demo of the Rectangle-Example
---v
Proof Harness
#![allow(unused)] fn main() { use my_crate::{function_under_test, meets_specification, precondition}; #[kani::proof] fn check_my_property() { // Create a nondeterministic input let input = kani::any(); // Constrain it according to the function's precondition kani::assume(precondition(input)); // Call the function under verification let output = function_under_test(input); // Check that it meets the specification assert!(meets_specification(input, output)); } }
- Kani tries to prove that all valid inputs produce outputs that meet specifications, without panicking.
- Else, Kani generates a trace that points to the failure.
---v
Property: decode(encode(x)) == x
Test
#![allow(unused)] fn main() { #[cfg(test)] fn test_u32 { let val: u16 = 42; assert_eq!(u16::decode(&mut val.encode()[..]).unwrap(), val) } }
fixed value 42
Fuzzing
#![allow(unused)] fn main() { #[cfg(fuzzing)] fuzz_target!(|data: &[u8]|) { let val = u16::arbitrary(data); assert_eq!(u16::decode(&mut val.encode()[..]).unwrap(), val) } }
multiple random values of u16
Kani Proof
#![allow(unused)] fn main() { #[cfg(kani)] #[kani::proof] fn proof_u32_roundtrip { let val: u16 = kani::any(); assert_eq!(u16::decode(&mut val.encode()[..]).unwrap(), val) } }
verifies exhaustively all values of u16
---v
Under the Hood: Bounded Model Checking
Idea:
- Search for counterexamples in (bounded) executions paths
- However, this search is an NP-hard problem
Method:
- Efficiently reduce problem to a Constraint Satisfaction (SAT) problem
- verification reduced to problem of searching satisfiable assignment to a SAT formula.
- leverages highly optimized SAT solvers making the search tractable.
Notes:
Kani uses miniSAT as the backend engine; a lot of other verification tools use Z3 solver.
---v
Translation to constraints
Code
#![allow(unused)] fn main() { fn foo(x: i32) -> i32 { let mut y: i32 = 8; let mut w: i32 = 0; let mut z: i32 = 0; if x != 0 { z -= 1; } else { w += 1; } assert!(z == 7 || w == 9); w+z } }
Constraints
#![allow(unused)] fn main() { y = 8, z = x? y-1: 0, w = x? y+1: 0, z != 7 /\ w != 9 (negation of the assert condition) }
- Constraints fed into a Solver (minisat)
- For no value of
x
the constraints hold $\implies$ Assert conditions verified - Else the solver found a failing test (counterexample)
---v
How does it handle loops?
- Bounded in BMC to the rescue!
- Loops are unwound up to a certain bounded depth $k$, else the verification does not terminate.
- Determining the sweet-spot $k$ is a trade-off between tractability and verification confidence .
---v
Demo: Unwinding Loops
#![allow(unused)] fn main() { fn initialize_prefix(length: usize, buffer: &mut [u8]) { // Let's just ignore invalid calls if length > buffer.len() { return; } for i in 0..=length { buffer[i] = 0; } } #[cfg(kani)] #[kani::proof] #[kani::unwind(1)] // deliberately too low fn check_initialize_prefix() { const LIMIT: usize = 10; let mut buffer: [u8; LIMIT] = [1; LIMIT]; let length = kani::any(); kani::assume(length <= LIMIT); initialize_prefix(length, &mut buffer); } }
---v
Dealing with Loops: Summary
Process:
- Start with unwinding $k$ times
- If no bug is found, increase $k$ until either:
- A bug is found
- verifier times-out
- predetermined upper-bound $N$ for $k$ is reached
---v
Implementing Arbitrary for custom type
#![allow(unused)] fn main() { use arbitrary::{Arbitrary, Result, Unstructured}; #[derive(Copy, Clone, Debug)] pub struct Rgb { pub r: u8, pub g: u8, pub b: u8, } impl<'a> Arbitrary<'a> for Rgb { fn arbitrary(u: &mut Unstructured<'a>) -> Result<Self> { let r = u8::arbitrary(u)?; let g = u8::arbitrary(u)?; let b = u8::arbitrary(u)?; Ok(Rgb { r, g, b }) } } }
Exercise
Verify Fixed-width & Compact Encoding for integer types in SCALE.
Open Ended properties!
- RoundTrip:
Decode (Encode (x)) == x
DecodeLength(x) == Decode(x).length()
EncodeAppend(vec,item) == Encode(vec.append(item))
- ......
Notes:
- Potentially, we might play around with a few of these properties during a workshop this weekend.