← All posts

Cure v0.21.0 :: Through the Segments

by Aleksei Matiushkin

release binaries bitstring pattern destructuring let adt exhaustiveness derive formatter

v0.20.0 was The Shape of Things—the release where we took the AST scaffolding that the language had been quietly missing and made it first-class: plain # comments as nodes, full Erlang-style segment grammar on <<...>>, a Wadler/Inspect.Algebra-style pretty printer behind cure fmt --algebra, and a structural refinement narrower that could surface disjoint-tag and literal-equality witnesses. The four tracks all landed, but user-visible language features had to wait: the scaffolding didn't yet carry any weight.

v0.21.0—Through the Segments—is where the segments start to carry weight. The headline is the one v0.20.0 set up for us: full Erlang-style destructuring of binaries in match arms, multi-clause function heads, and let bindings, with a PatternChecker-driven exhaustiveness pass that surfaces as E031. On the way there we also closed three language gaps that surfaced during the v0.20.0 cycle: ADT constructor payloads carrying function types, multi-line type ADT declarations, and deep destructuring on the LHS of let. The algebra pretty-printer becomes the default cure fmt, and @derive grows three new targets (Functor, Monoid, JSON) on top of the existing Show/Eq/Ord trio.

Binaries

If you had tried to destructure a binary in v0.20.0, you would have seen the compiler accept the syntax but then refuse to compile the body:

fn first_byte(buf: Bitstring) -> Int =
  match buf
    <<b, _rest::binary>> -> b
    <<>> -> 0
# error: undefined variable 'b'

The segment AST was there—v0.20.0 had already wrapped every <<...>> element in {:bin_segment, meta, [value]} nodes with full specifier metadata—but no one had wired it into the type checker’s bind_pattern_vars/3. So the variable that the segment introduced never made it into the pattern’s scope. Same story for let <<a, rest::binary>> = buf and for multi-clause function heads that tried to pattern-match on binary parameters.

v0.21.0 adds the missing clauses:

defp bind_pattern_vars(env, {:literal, meta, segments}, _type) do
  case Keyword.get(meta, :subtype) do
    :bytes when is_list(segments) ->
      Enum.reduce(segments, env, fn seg, e ->
        bind_pattern_vars(e, seg, bin_segment_type(seg))
      end)

    _ ->
      env
  end
end

defp bind_pattern_vars(env, {:bin_segment, _meta, [inner]}, type) do
  bind_pattern_vars(env, inner, type)
end

The bin_segment_type/1 helper reads the segment’s :type meta and maps it to the scalar the segment binds: integer or an explicit size(n) yields Int, float yields Float, utf8/utf16/utf32 yield Char (the code point), and binary/bytes/bitstring/bits all yield Bitstring. That last one is deliberately conservative: a rest::binary tail at the end of a pattern could in principle carry a byte_size(rest) == byte_size(scrutinee) - sum_of_preceding_sizes refinement, but the SMT translator does not yet have the arithmetic to consume one. The hook for that lives in Cure.Types.PatternRefinement.narrow/2, which v0.21.0 extends with a {:literal, [subtype: :bytes], segments} branch that walks the same segment children and collects bindings separately from any scrutinee narrowing—so when the SMT side catches up, the compiler will already know which segments to index.
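Spelled out as code, that mapping is roughly the following (a sketch: the internal type atoms and the :integer default are assumptions based on the description above, not the compiler's literal source):

defp bin_segment_type({:bin_segment, meta, _children}) do
  case Keyword.get(meta, :type, :integer) do
    :float -> :Float
    t when t in [:utf8, :utf16, :utf32] -> :Char
    t when t in [:binary, :bytes, :bitstring, :bits] -> :Bitstring
    # :integer, or an explicit size(n) with integer semantics
    _ -> :Int
  end
end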

The same bind_pattern_vars/3 now runs over every clause parameter in check_multi_clause/7, which previously only extracted bare-variable params and silently dropped every structured shape. That single change makes binary function heads, ADT-head dispatch, tuple-head clauses, and record-head clauses work consistently.
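The shape of that change in check_multi_clause/7 can be sketched like this (the param/type pairing is illustrative; only the routing through the full engine is the point):

# Before: clause params were scanned for bare variables only,
# so structured heads bound nothing.
# After: every param runs through the full pattern engine.
env =
  params
  |> Enum.zip(param_types)
  |> Enum.reduce(env, fn {param, type}, e ->
    bind_pattern_vars(e, param, type)
  end)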

Binary exhaustiveness

Once binary destructuring works, the natural follow-up question is "does my match cover every possible input?" v0.21.0 adds a dedicated pass:

@spec check_binary_exhaustiveness(term(), [tuple()]) :: check_result()
def check_binary_exhaustiveness(scrutinee_type, patterns) do
  cond do
    not bitstring_scrutinee?(scrutinee_type) ->
      :exhaustive

    Enum.any?(patterns, &top_level_wildcard?/1) ->
      :exhaustive

    true ->
      has_empty? = Enum.any?(patterns, &empty_binary_pattern?/1)
      has_tail? = Enum.any?(patterns, &binary_with_open_tail?/1)

      cond do
        has_empty? and has_tail? -> :exhaustive
        has_tail? and not has_empty? -> {:non_exhaustive, ["<<>>"]}
        has_empty? and not has_tail? -> {:non_exhaustive, ["<<_, _rest::binary>>"]}
        true -> {:non_exhaustive, ["<<>>", "<<_, _rest::binary>>"]}
      end
  end
end

The heuristic keeps its footprint small and honest: on a Bitstring-typed scrutinee, the set of arms is exhaustive if there’s at least one wildcard, or if both the empty-binary case (<<>>) and an open-ended tail case (a binary pattern whose last segment is ::binary/::bits/::bitstring/::bytes with no :size specifier) are present. Otherwise the compiler prints a concrete witness string such as "<<>>" or "<<_, _rest::binary>>" under the new E031 error code. The report is a warning, not an error—the match compiles and raises {:case_clause, value} at runtime if the uncovered input shows up, matching Erlang’s semantics.
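The two predicates the heuristic leans on can be sketched as follows (names match the snippet above and the meta keys follow the {:bin_segment, meta, [value]} convention, though the real helpers may differ in detail):

defp empty_binary_pattern?({:literal, meta, segments}) do
  Keyword.get(meta, :subtype) == :bytes and segments == []
end

defp empty_binary_pattern?(_), do: false

defp binary_with_open_tail?({:literal, meta, segments}) do
  Keyword.get(meta, :subtype) == :bytes and is_list(segments) and
    segments != [] and open_tail?(List.last(segments))
end

defp binary_with_open_tail?(_), do: false

defp open_tail?({:bin_segment, meta, _value}) do
  Keyword.get(meta, :type) in [:binary, :bits, :bitstring, :bytes] and
    not Keyword.has_key?(meta, :size)
end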

The hook lives right next to the existing nested-exhaustiveness check in do_infer({:pattern_match, ...}), so the same match can trigger E025 (nested non-exhaustive) and E031 (binary non-exhaustive) independently.
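In practice that looks like this (Cure source; the exact warning wording is paraphrased):

fn checksum(buf: Bitstring) -> Int =
  match buf
    <<b, rest::binary>> -> b + checksum(rest)
# warning E031: binary match not exhaustive; uncovered: <<>>

Adding a <<>> -> 0 arm, or a wildcard, silences the warning.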

let destructuring

Until v0.21.0 the :assignment branch of the type checker called a tiny three-clause bind_pattern/3 that only handled bare-variable LHS:

defp bind_pattern(env, &lbrace;:variable, _, "_"&rbrace;, _type), do: env
defp bind_pattern(env, &lbrace;:variable, _, name&rbrace;, type), do: Env.extend(env, name, type)
defp bind_pattern(env, _, _type), do: env

That meant let Ok(x) = parse(input) parsed fine but then failed type-checking with undefined variable 'x', because bind_pattern happily dropped the ADT constructor pattern on the floor.

v0.21.0 routes :assignment through the same bind_pattern_vars/3 engine that match arms use. let Ok(x) = expr, let %[a, b] = pair, let [h | t] = xs, let Point{x, y} = p, and let <<tag, _::binary>> = buf all bind the right variables with the right narrowed types. Non-exhaustive patterns (a let Ok(x) = expr on a Result(T, E)-typed RHS, for example) emit a dedicated warning under E034. The binding still compiles, and Erlang’s = raises at runtime on a failed match; setting partial: true on the assignment metadata suppresses the warning for tooling that knows the pattern is acceptable by construction.

There’s a deliberate fast path preserved here: the trivial let x = 42 case is still a no-op in the exhaustiveness gate, so plain bindings pay no cost they didn't pay before.
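Put together, a hypothetical handler (parse and encode are illustrative stand-ins, not stdlib functions) now reads the way you'd expect:

fn handle(input: String) -> Int =
  let Ok(x) = parse(input)          # warns under E034: Err(_) uncovered
  let <<tag, _rest::binary>> = encode(x)
  tag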

Multi-line type ADT declarations

Writing type Shape = Circle(Int) | Square(Int) | Triangle(Int, Int, Int) on one line was always fine. Writing it on multiple lines—the way every OCaml, Haskell, or Rust developer instinctively reaches for—was not:

type Shape =
  | Circle(Int)
  | Square(Int)
  | Triangle(Int, Int, Int)
# error: unexpected token :bar at line 3, col 3

The lexer was emitting an :indent token after =, and parse_type_def/1 had no idea what to do with it: it called parse_type_variant/1 immediately, which tried to read the :indent token as a constructor name and produced a phantom variant whose name was the indent’s column number. Then the real | separators dangled loose and the next call to parse_expr rejected them.

v0.21.0 absorbs the optional wrapping :indent/:dedent pair inside parse_type_def/1, allows an optional leading :bar before the first variant, and has parse_more_variants/1 skip newlines before peeking for the next |. Both of these layouts now parse to the same AST as the single-line form:

type Shape =
  | Circle(Int)
  | Square(Int)
  | Triangle(Int, Int, Int)

type Shape =
  Circle(Int)
  | Square(Int)
  | Triangle(Int, Int, Int)

E033 catches the still-invalid shapes (continuation lines at the same indent as type, or a leading | followed directly by a :dedent).

ADT function-type payloads

Along the way, we validated that ADT constructor payloads already accept arbitrary type expressions including function arrows. parse_type_variant/1 routes payload parsing through parse_type_param_list/1 -> parse_type_expr/1, and parse_type_expr/1 has recognised (A, B) -> C and A -> B since the effect-system work in v0.15.0. So

type Callback = On(Int -> Int) | Off
type Transform = Morph((Int, Int) -> Int) | Id

parses and type-checks end-to-end; pattern matching binds the function payload to a variable that can be called from inside the arm. E032 is reserved for the case where the payload expression isn't a valid type.
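Consuming such a payload is ordinary pattern matching; run here is a hypothetical consumer of the Callback type above:

fn run(cb: Callback, x: Int) -> Int =
  match cb
    On(f) -> f(x)
    Off -> x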

Algebra formatter as default

The v0.20.0 post ended with: "the plan for v0.21.0 is to promote --algebra to the default once it has had one full release of shake-down." v0.21.0 honours that plan. cure fmt now runs the algebra formatter by default; cure fmt --safe opts into the legacy byte-level formatter as an escape hatch for sources that trip the algebra formatter’s round-trip check; cure fmt --check routes through the algebra formatter so CI agrees with interactive use; and cure fmt --algebra stays around as an explicit synonym for the default.

@derive(Functor, Monoid, JSON)

Cure.Types.Derive.derive/3 gains three new targets to extend the Show/Eq/Ord trio from v0.19.0:

  • :functor emits fmap(x, f) that applies f to every field of the record. Intended for single-parameter records like rec Box(A)\n value: A, where it lines up with the canonical Functor instance; works on any record.
  • :monoid emits combine(a, b) that pairwise combines each field with the polymorphic <> operator. Users supply empty/0 separately—the derivation intentionally does not guess a neutral element.
  • :json emits to_json(x) that renders a record as a JSON object whose keys are the field names and whose values come from the per-field to_json/1 dispatch. from_json/1 is reserved for a future release.

The existing @derive(Show, Eq, Ord) wiring from v0.19.0 (signature collection, codegen expansion pass) carries these new targets without any further plumbing.
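A hypothetical record pulling in all three new targets at once (the layout follows the rec Box(A) sketch above; the comments paraphrase the bullet list):

@derive(Functor, Monoid, JSON)
rec Box(A)
  value: A

# fmap(box, f)   # new Box with f applied to value
# combine(a, b)  # new Box with a.value <> b.value
# to_json(box)   # {"value": ...}, the field rendered via to_json/1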

By the numbers

  • 4 new error-catalog codes (E031-E034) with examples and fix guidance.
  • 5 new example files under examples/, one per v0.21.0 surface: binary_destructuring.cure, adt_fn_payload.cure, multi_line_adt.cure, let_destructuring.cure, json_derive.cure.
  • 1 new authoritative doc (docs/BINARIES.md) plus updates to docs/LANGUAGE_SPEC.md (multi-line ADT, function-type payloads, let destructuring, binary patterns) and docs/TUTORIAL.md (new Chapter 11 "Binary parsing"; FSMs move to Chapter 12).
  • 1078 tests pass (up from 1050; 3 doctests + 1075 tests); mix credo --strict: 0 issues across 137 source files; mix cure.check.stdlib: 25/25; mix cure.check.examples: 40/40.

What’s next (v0.22.0)

v0.22.0 is scoped to three language-surface closers that finish the work v0.20.0 and v0.21.0 started:

  • Multi-statement lambda bodies. Brace-delimited (fn (x) -> { stmt1; stmt2; final }) and end-terminated (fn (x) ->\n stmt1\n stmt2\nend) block forms for anonymous functions embedded in argument lists—where the existing indented-block form is not usable—will land in v0.22.0.
  • Binary comprehension generators. for <<b <- buf>> is still tokenised as a less-than comparison inside the <<...>> literal. A dedicated binary-generator path in parse_generator_or_filter/1 will unlock this shape.
  • Full byte_size arithmetic through rest::binary tails. v0.21.0 binds rest to plain Bitstring; v0.22.0 emits a proper byte_size(rest) == byte_size(scrutinee) - sum_of_preceding_sizes refinement once the SMT translator grows the arithmetic.

Looking further out (v0.23.0)

The package-registry story has been resliced to keep v0.22.0 focused. Now targeting v0.23.0:

  • Remote package-registry index service, publication signing, and Hex.pm cross-publishing. The HTTP-based read-only index protocol, Ed25519 archive signing, and cure publish --hex export path all roll into v0.23.0 so the next release can stay a pure language-surface release.

Getting started

git clone https://github.com/am-kantox/cure-lang.git
cd cure
mix deps.get && mix test
mix escript.build
./cure version            # Cure 0.21.0
./cure fmt lib/           # algebra formatter by default

The repository is at github.com/am-kantox/cure-lang. The v0.20.0 AST polished every edge; v0.21.0 puts that polish to work.