← All posts

Cure v0.33.0 :: Formalisation

by Aleksei Matiushkin

release formalisation match pickup specification semantics

A small language pays dearly for staying small. Every release adds a construct, and every construct has a paragraph in the tutorial, an example in the README, three asides in old commit messages, and a hundred quiet assumptions in the type checker that someone -- the author, a reviewer, a future self -- promised to write down later. Later is now.

v0.32.0 was the trust release: proof-carrying packages, cross-language ADT export, REPL snapshots, narrative STORY.md. The compiler caught up with the ecosystem. v0.33.0 turns the lens inward. With the surrounding machinery stable, there is no excuse left to keep deferring the one piece of paperwork that has been on the backlog since v0.18.0: a formal specification of the two language primitives at the heart of every Cure program.

Two constructs, one mental model

Cure has exactly two branching constructs -- match and pickup -- and they are governed by sentences that differ in one word:

match walks the clauses and picks up the first one whose pattern matches the scrutinee.

pickup walks the clauses and picks up the first one whose guard is true.

The whole release exists to make those two intuitions mechanically precise. Each construct now has its own version-1.0.0 specification: docs/MATCH.md and docs/PICKUP.md. Both ship in mix.exs docs.extras, both follow RFC 2119, both end in a soundness proof sketch.

What "formalisation" means here

A formal specification is not a tutorial with more headings. It is a contract a third party can read, implement against, and test. Both documents follow the same five-stratum layout:

  1. Surface. Introduction, conformance terminology (RFC 2119 MUST / MUST NOT / SHALL / SHOULD / MAY), lexical syntax, EBNF grammar.
  2. Semantics. Static semantics (typing rules, exhaustiveness, reachability, scoping, refinement-type interaction, decidability), dynamic semantics (evaluation order, side-effect observability, exceptions, divergence), and operational semantics (big-step and small-step rules, evaluation contexts, determinism, progress, preservation, equivalence, confluence, cost and memory models).
  3. Surface conventions. Formatter rules (alignment, comment fidelity, idempotence, round-trip, performance bounds, plugin interface, editor-folding integration), interaction with the companion construct, worked examples.
  4. Theory and machinery. Algebraic laws, exception and divergence contracts, tail-position behaviour, lowering / compilation rules, constant-folding obligations, macro hygiene and quote interaction.
  5. Closure. Diagnostic catalogue with stable identifiers, non-goals, stability and versioning policy, summary.

Eleven appendices follow each:

  • A. Acceptance test corpus -- runnable inputs with expected outputs and expected diagnostics.
  • B. Glossary.
  • C. Per-document change log (with 1.0.0 -- Initial release and the new v0.33.0 publication line).
  • D. Index of normative requirements -- one bullet per MUST clause.
  • E. Reference implementation sketch.
  • F. Worked examples / migration walkthroughs.
  • G. Style guide and idioms.
  • H. Anti-patterns.
  • I. Reserved future syntax.
  • J. Soundness proof sketch.
  • K. Bibliography and related work.
  • L. Open questions and future directions.
  • M. Specification metadata and colophon.

The implementation already honours every clause of both documents. v0.33.0 is the moment the contract becomes reviewable, not the moment the behaviour changes.

pickup, made precise

pickup is the only way in Cure to branch on a free-standing boolean condition. It replaced if / elif / else in an earlier release and ships now with a complete operational story:

pickup
  status >= 500 -> :server_error
  status >= 400 -> :client_error
  status >= 300 -> :redirect
  status >= 200 -> :ok
  else          -> :informational

The headline guarantees a reader can rely on:

  • Total by construction. The grammar requires the block to terminate in else -> e (canonical) or in a literal-true guard (alternative form, normalised by the formatter). A well-typed pickup cannot fail with a "no clause matched" condition at runtime. Compare this with match: non-exhaustive match is only a warning, and the non-covered case raises case_clause at runtime. With pickup, totality is syntactically guaranteed.
  • Strict Bool typing. Every guard MUST type to Bool. No truthy / falsy coercion. A non-Bool guard is rejected with E-PICKUP-GUARD-TYPE.
  • Source-order short-circuit. Once a guard yields true, no subsequent guard runs. The order is contractual, not an optimisation; reordering by the compiler is admissible only when the rearrangement is statically constant.
  • Refinement-context strengthening. Inside the i-th branch the refinement context is strengthened with g_i ∧ ¬g_1 ∧ ... ∧ ¬g_{i-1}. Inside e_else it is strengthened with the conjunction of every preceding negation. v0.30.0's path-sensitive refinement flow finally has a normative description.
  • Tail-position guarantee. A branch right-hand side is in tail position with respect to pickup iff pickup is itself in tail position. The classical loop n acc -> pickup ; n == 0 -> acc ; else -> loop(n-1, acc+n) shape is properly tail-recursive.

The static semantics formalise the typing rules:

                       Γ ⊢ e_else : τ
   ─────────────────────────────────────────────  (T-Pickup-Else)
        Γ ⊢ pickup (else -> e_else) : τ

   Γ ⊢ g : Bool       Γ ⊢ e : τ_1
   Γ ⊢ pickup C : τ_2     τ = τ_1 ⊔ τ_2
   ─────────────────────────────────────────────  (T-Pickup-Cons)
        Γ ⊢ pickup (g -> e ; C) : τ

The dynamic semantics formalise selection as five small-step rules (SP-Guard / SP-Hit / SP-Skip / SP-Else / SP-Guard-Raise) and five big-step rules (P-Else / P-Hit / P-Skip / P-Guard-Raise / P-Guard-Diverge). Appendix J discharges the standard Progress and Preservation argument: a well-typed pickup either terminates with a value of the join type, raises an exception, or diverges. Stuck states are unreachable.

The formatter chapter is twenty-five clauses long: alignment of arrows, normalisation of true -> to else ->, comment fidelity, idempotence (format(format(s, c), c) = format(s, c)), round-trip (formatted source re-parses byte-identically), performance bounds (O(N + M log M) time, O(N) space), and a final formatter grammar in EBNF.

The migration story is concrete. The legacy if / elif / else chain is removed, and the cure rewrite if-to-pickup tool migrates surviving code mechanically:

-- Before (no longer accepted):
let label =
  if   score >= 90 then "A"
  elif score >= 80 then "B"
  elif score >= 70 then "C"
  else                   "F"

-- After:
let label = pickup
              score >= 90 -> "A"
              score >= 80 -> "B"
              score >= 70 -> "C"
              else        -> "F"

Post-migration, the parser rejects if with E-IF-REMOVED and a fix-it pointing at the rewriter.

The diagnostic catalogue is closed at thirteen codes covering every failure mode: E-PICKUP-NO-ELSE, E-PICKUP-ELSE-NOT-LAST, E-PICKUP-MULTIPLE-ELSE, E-PICKUP-GUARD-TYPE, E-PICKUP-BRANCH-MISMATCH, E-IF-REMOVED, plus the warnings W-PICKUP-UNREACHABLE, W-PICKUP-DEAD-ELSE, W-PICKUP-EFFECTFUL-GUARD, and the formatter hints H-PICKUP-PREFER-ELSE, H-PICKUP-DEGENERATE, H-PICKUP-LINE-TOO-LONG, H-PICKUP-COMMENT-RELOCATED.

Reserved future syntax is fenced explicitly: pickup as which, pickup with shared_env, pickup async, and trailing where blocks are all reserved at version 1.0.0 and rejected by a 1.0.0 conforming parser. They exist so future revisions can add them without breaking source.

match, made precise

match shares the mental model. The specification covers the same ground from the structural-dispatch side:

  • The full pattern sub-grammar. Literal patterns, wildcard, silent bindings (_name), variable patterns, tuple / list / map / record / ADT constructor patterns, repeated variables, the pin operator (^x), and binary-segment patterns with the full Erlang-style specifier chain.
  • Maranget-style exhaustiveness. The compiler's coverage analysis follows the structure of Maranget's Warnings for Pattern Matching (2007). The specification mandates the algorithm's shape (column specialisation, default matrix, recursion on ADT constructors) while leaving early termination on the first witness as implementation-defined.
  • Refinement narrowing. Each branch's right-hand side is type-checked under a refinement context strengthened by the structural assertion that the scrutinee matches the clause's pattern, the negation of every preceding pattern, and the user-written guard. Pattern-bound variables therefore carry refinement information drawn from both structural and predicate constraints.
  • Pattern surface positions. Patterns are not confined to match. The same grammar appears in multi-clause function heads, let bindings, fn parameters, comprehension generators, and try ... catch. The specification cross-references every position through the central pattern sub-grammar.
  • Soundness. Appendix J sketches the standard Progress and Preservation argument over the small-step rules SM-Scrut / SM-Hit / SM-Skip-Pat / SM-Skip-Guard / SM-NoClause; the corollaries record totality (statically exhaustive blocks cannot reach case_clause) and pattern purity (matching itself contributes no observable effects).

Twelve diagnostic codes are catalogued: E004, E021 -- E025, E031 -- E034, plus the warnings W-MATCH-UNREACHABLE and W-MATCH-COMMENT-RELOCATED. Every code carries a description, an example, and fix guidance. Numeric and descriptive aliases exist for cross-spec consistency: E033 / E-MATCH-BRANCH-MISMATCH, E024 / E-MATCH-PIN-UNBOUND, and so on.

Reserved future syntax (Appendix I) names the candidates: or-patterns (Red() | Green() | Blue() -> :primary), view patterns (view(reverse, [last | _]) -> last), range patterns (0..12 -> :child), dependent / type-level patterns, and as-patterns ([h | _] as whole). All are rejected at v1.0.0 and reserved for future revisions.

Where the documents land

Both specifications are wired into the mix.exs docs.extras list, so HexDocs renders them alongside docs/PATTERNS.md, docs/BINARIES.md, and the rest of the language references.

docs/LANGUAGE_SPEC.md's ## Pattern Matching section now opens with a normative pointer to docs/MATCH.md. A new ## Conditional Dispatch (pickup) section opens with a pointer to docs/PICKUP.md and recapitulates the mental model in a single paragraph for inline readers.

The website grows a new /pickup page that mirrors the existing /match shape -- grammar, totality requirement, evaluation order, scoping, refinement narrowing, tail-position guarantee, migration from if, formatter conventions, diagnostic catalogue, idioms. Both pages cross-link the formal specifications. The roadmap entry on /roadmap marks v0.33.0 as the most recent implemented release.

Numbers

  • No new modules. No compiler code changes; v0.33.0 is a documentation and metadata release.
  • No new error codes. The catalogue is unchanged.
  • No language-surface behaviour changes. Code that compiled and ran under v0.32.0 compiles and runs identically under v0.33.0.
  • Two new HexDocs extras: docs/MATCH.md and docs/PICKUP.md.
  • One new website page: /pickup.
  • One updated website page: /match (leading and trailing pointers to the new normative documents).
  • Cross-references in docs/LANGUAGE_SPEC.md for both constructs.

Why now

Cure's compiler has matured to the point where a third party could plausibly write an alternative implementation. The package registry exists. Proof-carrying packages exist. Cross-language type export exists. The one missing piece was a normative description of the two constructs without which no Cure program does anything at all. With v0.33.0, that piece lands.

The specifications are deliberately conservative. Every clause describes behaviour the implementation has shipped for several releases. The non-goals sections name the things both documents deliberately do not commit to (truthy / falsy coercion, pattern-level effects, fall-through, implicit terminators). The reserved-syntax appendices fence off the obvious future extensions so they can land in minor revisions without breaking source. The soundness proof sketches discharge their obligations from the typing rules.

If you want a one-line summary, here it is, twice:

match walks the clauses and picks up the first one whose pattern matches the scrutinee.

pickup walks the clauses and picks up the first one whose guard is true.

Everything else in either document exists to make those intuitions mechanically precise.

What's next

The long-horizon items carry over. The Cure-native notebook (.cnb format, Livebook-style runner, inline FSM diagrams, live type hints) moves from v0.33.0's slot to v0.34.0; the groundwork is already in place via Cure.Doc.Mermaid, Cure.Doc.HtmlGenerator, and the Cure.REPL evaluation pipeline. Typed hot code upgrades (cure release --upgrade-from, SMT-verified @migration functions, E057 / E058) keep their cycle. Automatic PGO instrumentation, the TypeScript and Rust emitters for cure export-types, and broader IDE support (Helix, Zed, an upgraded VS Code extension) remain on the roadmap.

The repository lives at github.com/am-kantox/cure-lang. The new on-disk references are docs/MATCH.md and docs/PICKUP.md. The full CHANGELOG records every touch point.