Cure v0.21.0 :: Through the Segments
by Aleksei Matiushkin
v0.20.0 was The Shape of Things—the release where we took the
AST scaffolding that the language had been quietly missing and made
it first-class: plain # comments as nodes, full Erlang-style
segment grammar on <<...>>, a Wadler/Inspect.Algebra-style
pretty printer behind cure fmt --algebra, and a structural
refinement narrower that could surface disjoint-tag and
literal-equality witnesses. The four tracks all landed, but
user-visible language features had to wait: the scaffolding didn't
yet carry any weight.
v0.21.0—Through the Segments—is where the segments start
to carry weight. The headline is the one v0.20.0 set up for us:
full Erlang-style destructuring of binaries in match arms,
multi-clause function heads, and let bindings, with a
PatternChecker-driven exhaustiveness pass that surfaces as
E031. On the way there we also closed three language gaps that
surfaced during the v0.20.0 cycle: ADT constructor payloads
carrying function types, multi-line type ADT declarations, and
deep destructuring on the LHS of let. The algebra pretty-printer
becomes the default cure fmt, and @derive grows three new
targets (Functor, Monoid, JSON) on top of the existing
Show/Eq/Ord trio.
Binaries
If you had tried to destructure a binary in v0.20.0 you would have seen the compiler accept the syntax but then refuse to compile the body:
fn first_byte(buf: Bitstring) -> Int =
  match buf
    <<b, _rest::binary>> -> b
    <<>> -> 0
# error: undefined variable 'b'
The segment AST was there—v0.20.0 had already wrapped every
<<...>> element in {:bin_segment, meta, [value]} nodes with
full specifier metadata—but no one had wired it into the type
checker’s bind_pattern_vars/3. So the variable that the segment
introduced never made it into the pattern’s scope. Same story for
let <<a, rest::binary>> = buf and for multi-clause function
heads that tried to pattern-match on binary parameters.
v0.21.0 adds the missing clauses:
defp bind_pattern_vars(env, {:literal, meta, segments}, _type) do
  case Keyword.get(meta, :subtype) do
    :bytes when is_list(segments) ->
      Enum.reduce(segments, env, fn seg, e ->
        bind_pattern_vars(e, seg, bin_segment_type(seg))
      end)

    _ ->
      env
  end
end

defp bind_pattern_vars(env, {:bin_segment, _meta, [inner]}, type) do
  bind_pattern_vars(env, inner, type)
end
The bin_segment_type/1 helper reads the segment’s :type meta
and maps it to the scalar the segment binds: integer or an
explicit size(n) yields Int, float yields Float,
utf8/utf16/utf32 yield Char (the code point), and
binary/bytes/bitstring/bits all yield Bitstring. That
last one is deliberately conservative: a rest::binary tail at
the end of a pattern could in principle carry a
byte_size(rest) == byte_size(scrutinee) - sum_of_preceding_sizes
refinement, but the SMT translator does not yet have the
arithmetic to consume one. The hook for that lives in
Cure.Types.PatternRefinement.narrow/2, which v0.21.0 extends with
a {:literal, [subtype: :bytes], segments} branch that walks the
same segment children and collects bindings separately from any
scrutinee narrowing—so when the SMT side catches up, the
compiler will already know which segments to index.
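As a sketch (the exact clause shapes in the compiler may differ), the specifier-to-scalar mapping described above reads:

```elixir
# Sketch of the mapping from segment :type meta to the bound scalar;
# clause layout is illustrative, not the compiler's literal source.
defp bin_segment_type({:bin_segment, meta, _children}) do
  case Keyword.get(meta, :type, :integer) do
    # :integer is also the default when only an explicit size(n) is given
    :integer -> :Int
    :float -> :Float
    t when t in [:utf8, :utf16, :utf32] -> :Char
    t when t in [:binary, :bytes, :bitstring, :bits] -> :Bitstring
  end
end
```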
The same bind_pattern_vars/3 now runs over every clause
parameter in check_multi_clause/7, which previously only
extracted bare-variable params and silently dropped every
structured shape. That single change makes binary function heads,
ADT-head dispatch, tuple-head clauses, and record-head clauses
work consistently.
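For illustration, assuming Erlang-style clause-per-head syntax, a recursive binary walk can now live directly in the heads (hypothetical example, not from the shipped test suite):

```cure
fn sum_bytes(<<>>) -> Int = 0
fn sum_bytes(<<b, rest::binary>>) -> Int = b + sum_bytes(rest)
```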
Binary exhaustiveness
Once binary destructuring works, the natural follow-up question is "does my match cover every possible input?" v0.21.0 adds a dedicated pass:
@spec check_binary_exhaustiveness(term(), [tuple()]) :: check_result()
def check_binary_exhaustiveness(scrutinee_type, patterns) do
  cond do
    not bitstring_scrutinee?(scrutinee_type) ->
      :exhaustive

    Enum.any?(patterns, &top_level_wildcard?/1) ->
      :exhaustive

    true ->
      has_empty? = Enum.any?(patterns, &empty_binary_pattern?/1)
      has_tail? = Enum.any?(patterns, &binary_with_open_tail?/1)

      cond do
        has_empty? and has_tail? -> :exhaustive
        has_tail? and not has_empty? -> {:non_exhaustive, ["<<>>"]}
        has_empty? and not has_tail? -> {:non_exhaustive, ["<<_, _rest::binary>>"]}
        true -> {:non_exhaustive, ["<<>>", "<<_, _rest::binary>>"]}
      end
  end
end
The heuristic keeps its footprint small and honest: on a
Bitstring-typed scrutinee, the set of arms is exhaustive if
there’s at least one wildcard, or if both the empty-binary case
(<<>>) and an open-ended tail case (a binary pattern whose last
segment is ::binary/::bits/::bitstring/::bytes with no
:size specifier) are present. Otherwise the compiler prints a
concrete witness string such as "<<>>" or
"<<_, _rest::binary>>" under the new E031 error code. The
report is a warning, not an error—the match compiles and raises
{:case_clause, value} at runtime if the uncovered input shows up,
matching Erlang’s semantics.
The hook lives right next to the existing nested-exhaustiveness
check in do_infer({:pattern_match, ...}), so the same match
can trigger E025 (nested non-exhaustive) and E031 (binary
non-exhaustive) independently.
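For instance, a match that handles the open tail but forgets the empty binary now draws a witness (the exact warning text here is illustrative):

```cure
fn head(buf: Bitstring) -> Int =
  match buf
    <<b, _rest::binary>> -> b
# warning E031: binary match may not be exhaustive, missing: <<>>
```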
let destructuring
Until v0.21.0 the :assignment branch of the type checker
called a tiny three-clause bind_pattern/3 that only handled
bare-variable LHS:
defp bind_pattern(env, {:variable, _, "_"}, _type), do: env
defp bind_pattern(env, {:variable, _, name}, type), do: Env.extend(env, name, type)
defp bind_pattern(env, _, _type), do: env
That meant let Ok(x) = parse(input) parsed fine, compiled fine,
and then failed type-checking with undefined variable 'x',
because bind_pattern happily dropped the ADT constructor pattern
on the floor.
v0.21.0 routes :assignment through the same
bind_pattern_vars/3 engine that match arms use: let Ok(x) = expr, let %[a, b] = pair, let [h | t] = xs,
let Point{x, y} = p, and let <<tag, _::binary>> = buf all bind
the right variables with the right narrowed types. Non-exhaustive
patterns (a let Ok(x) = expr on a Result(T, E)-typed RHS, for
example) emit a dedicated warning under E034. The binding still
compiles, and Erlang’s = raises at runtime on a failed match;
setting partial: true on the assignment metadata suppresses the
warning for tooling that knows the pattern is acceptable by
construction.
There’s a small piece of path-preservation at play: the trivial
let x = 42 path is still a no-op in the exhaustiveness gate, so
plain bindings pay no cost they didn't pay before.
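A hypothetical session exercising the new bindings:

```cure
let Ok(x) = parse(input)         # warns E034: Err(_) not covered
let <<tag, rest::binary>> = buf  # tag: Int, rest: Bitstring
let Point{x, y} = p              # x and y bound from the record fields
```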
Multi-line type ADT declarations
Writing type Shape = Circle(Int) | Square(Int) | Triangle(Int, Int, Int)
on one line was always fine. Writing it on multiple lines—the
way every OCaml, Haskell, or Rust developer instinctively reaches
for—was not:
type Shape =
  | Circle(Int)
  | Square(Int)
  | Triangle(Int, Int, Int)
# error: unexpected token :bar at line 3, col 3
The lexer was emitting an :indent token after =, and
parse_type_def/1 had no idea what to do with it: it called
parse_type_variant/1 immediately, which tried to read the
:indent token as a constructor name and produced a phantom
variant whose name was the indent’s column number. Then the
real | separators dangled loose and the next call to
parse_expr rejected them.
v0.21.0 absorbs the optional wrapping :indent/:dedent pair
inside parse_type_def/1, allows an optional leading :bar
before the first variant, and has parse_more_variants/1 skip
newlines before peeking for the next |. Both of these layouts
now parse to the same AST as the single-line form:
type Shape =
  | Circle(Int)
  | Square(Int)
  | Triangle(Int, Int, Int)

type Shape =
  Circle(Int)
  | Square(Int)
  | Triangle(Int, Int, Int)
E033 catches the still-invalid shapes (continuation lines at the
same indent as type, or a leading | followed directly by a
:dedent).
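For example, variants left at the same indent as the type keyword remain rejected (the error text here is illustrative):

```cure
type Shape =
| Circle(Int)
| Square(Int)
# error E033: variant lines must be indented past 'type'
```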
ADT function-type payloads
Along the way, we validated that ADT constructor payloads already
accept arbitrary type expressions including function arrows.
parse_type_variant/1 routes payload parsing through
parse_type_param_list/1 -> parse_type_expr/1, and
parse_type_expr/1 has recognised (A, B) -> C and A -> B since
the effect-system work in v0.15.0. So
type Callback = On(Int -> Int) | Off
type Transform = Morph((Int, Int) -> Int) | Id
parses and type-checks end-to-end; pattern matching binds the
function payload to a variable that can be called from inside the
arm. E032 is reserved for the case where the payload expression
isn't a valid type.
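A sketch of the end-to-end flow (names hypothetical):

```cure
type Callback = On(Int -> Int) | Off

fn run(cb: Callback, n: Int) -> Int =
  match cb
    On(f) -> f(n)
    Off -> n
```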
Algebra formatter as default
The v0.20.0 post ended with: "the plan for v0.21.0 is to promote
--algebra to the default once it has had one full release of
shake-down." v0.21.0 honours that plan. cure fmt now runs the
algebra formatter by default; cure fmt --safe opts into the
legacy byte-level formatter as an escape hatch for sources that
trip the algebra formatter’s round-trip check; cure fmt --check
routes through the algebra formatter so CI agrees with interactive
use; and cure fmt --algebra stays around as an explicit synonym
for the default.
@derive(Functor, Monoid, JSON)
Cure.Types.Derive.derive/3 gains three new targets to extend the
Show/Eq/Ord trio from v0.19.0:
- :functor emits fmap(x, f), which applies f to every field of the record. Intended for single-parameter records like rec Box(A) with a value: A field, where it lines up with the canonical Functor instance; it works on any record.
- :monoid emits combine(a, b), which pairwise combines each field with the polymorphic <> operator. Users supply empty/0 separately; the derivation intentionally does not guess a neutral element.
- :json emits to_json(x), which renders a record as a JSON object whose keys are the field names and whose values come from the per-field to_json/1 dispatch. from_json/1 is reserved for a future release.
The existing @derive(Show, Eq, Ord) wiring from v0.19.0
(signature collection, codegen expansion pass) carries these new
targets without any further plumbing.
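A hypothetical record pulling the new targets in alongside the existing trio:

```cure
@derive(Show, Eq, Functor, Monoid, JSON)
rec Box(A)
  value: A

# fmap(box, f), combine(a, b), and to_json(box) are now derived;
# empty/0 for the Monoid side is supplied by the user.
```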
By the numbers
- 4 new error-catalog codes (E031-E034) with examples and fix guidance.
- 5 new example files under examples/, one for each v0.21.0 surface: binary_destructuring.cure, adt_fn_payload.cure, multi_line_adt.cure, let_destructuring.cure, json_derive.cure.
- 1 new authoritative doc (docs/BINARIES.md), plus updates to docs/LANGUAGE_SPEC.md (multi-line ADT, function-type payloads, let destructuring, binary patterns) and docs/TUTORIAL.md (new Chapter 11, "Binary parsing"; FSMs move to Chapter 12).
- 1078 tests pass (up from 1050; 3 doctests + 1075 tests); mix credo --strict: 0 issues across 137 source files; mix cure.check.stdlib: 25/25; mix cure.check.examples: 40/40.
What’s next (v0.22.0)
v0.22.0 is scoped to three language-surface closers that finish the work v0.20.0 and v0.21.0 started:
- Multi-statement lambda bodies. Brace-delimited (fn (x) -> { stmt1; stmt2; final }) and end-terminated (fn (x) ->\n stmt1\n stmt2\nend) block forms for anonymous functions embedded in argument lists—where the existing indented-block form is not usable—will land in v0.22.0.
- Binary comprehension generators. for <<b <- buf>> is still tokenised as a less-than comparison inside the <<...>> literal. A dedicated binary-generator path in parse_generator_or_filter/1 will unlock this shape.
- Full byte_size arithmetic through rest::binary tails. v0.21.0 binds rest to plain Bitstring; v0.22.0 emits a proper byte_size(rest) == byte_size(scrutinee) - sum_of_preceding_sizes refinement once the SMT translator grows the arithmetic.
Looking further out (v0.23.0)
The package-registry story has been resliced to keep v0.22.0 focused. Now targeting v0.23.0:
- Remote package-registry index service, publication signing, and Hex.pm cross-publishing. The HTTP-based read-only index protocol, Ed25519 archive signing, and cure publish --hex export path all roll into v0.23.0 so the next release can stay a pure language-surface release.
Getting started
git clone https://github.com/am-kantox/cure-lang.git
cd cure
mix deps.get && mix test
mix escript.build
./cure version # Cure 0.21.0
./cure fmt lib/ # algebra formatter by default
The repository is at github.com/am-kantox/cure-lang. v0.20.0 polished every edge of the AST; v0.21.0 puts that polish to work.