← All posts

Cure v0.20.0 :: The Shape of Things

by Aleksei Matiushkin

release formatter algebra comments bitstring bin_segment refinement pattern

v0.19.0 was "Bring the Furniture"—proofs, record defaults, derive, property-based testing, a lazy iterator protocol, the first half of a package registry. It was a release about shipping new things.

v0.20.0 is about the shape those things have inside the compiler. Four coupled tracks land together. None of them are user-visible language features in the usual sense: they are the scaffolding that makes the next generation of features cheap to build. Plain # comments are now first-class AST nodes, binary literals and patterns grow full Elixir-style segment specifiers, a Wadler/Inspect.Algebra-style pretty-printer sits behind cure fmt --algebra, and a new Cure.Types.PatternRefinement module narrows scrutinee types through constructor and literal-equality witnesses.

Plain # comments in the AST

Until v0.20.0, Cure.Compiler.Lexer discarded every plain # comment. Only ## and fenced ### ... ### doc comments survived as :doc_comment tokens. That was fine for compilation—comments have no runtime meaning—but it made an AST-based formatter lossy by construction: once you lifted a file into the parser, every inline annotation was gone.

v0.20.0 adds an opt-in preserve_comments: true option to Cure.Compiler.Lexer.tokenize/2. When set, plain # comments emit :line_comment tokens. The parser threads those tokens into the AST as

{:comment, [line: 12, col: 3], "explain foo"}

nodes inside the enclosing program, container body, or block body. There is a subtlety here worth calling out: the lexer emits :line_comment before the :indent token of the block the comment belongs to. A naive parser would scoop the comment up as a top-level node and leave the block empty. parse_definition_block now collects those leading comments and routes them back inside the block, so the comment above fn foo ends up a sibling of fn foo in the module body, not a sibling of the module itself.

The rest of the compiler treats comment nodes as trivia: Cure.Compiler.Codegen strips them from block bodies before generating Erlang forms, and Cure.Types.Checker.do_infer reports :unit for them so a trailing comment never steals a block's result type. If you are building a linter, a documentation extractor, or something that wants to preserve comments, you can opt in via the lexer flag without affecting any existing pipeline.

Full bitstring segments

Cure has had <<>> binary literals for a long time, but every element was just an expression: no sizes, no types, no endianness, nothing. v0.20.0 brings the full Erlang/Elixir-style segment grammar:

<<tag::utf8, size::16, payload::binary-size(size), rest::binary>>

The lexer picks up :: as a new :colon_colon token, distinct from the type-annotation : (Cure already reserved single colon for type annotations, so reusing it for segment specifiers would have been ambiguous). The parser wraps every <<...>> element in

&lbrace;:bin_segment, [type: :utf8, size: &lbrace;...&rbrace;, ...], [value]&rbrace;

Segment metadata carries :type (:integer, :float, :bits, :bitstring, :bytes, :binary, :utf8, :utf16, :utf32), :signedness (:signed, :unsigned), :endianness (:big, :little, :native), :size (an AST node—any expression), and :unit (integer). Specifiers chain with -: ::binary-size(n). A bare integer is shorthand for size(n), so <<x::8>> is just <<x::size(8)>>. Defaults mirror Erlang: integer-unsigned-big-size(8)-unit(1), with utf8/utf16/utf32 providing their own implicit size.

Cure.Compiler.Codegen and Cure.Compiler.PatternCompiler translate the segment AST into full Erlang :bin_element forms with the correct size/type/unit/sign/endian tuples, in both construction and pattern positions. Cure.Compiler.Printer round-trips segments back to surface syntax.

This is deliberately the groundwork for the v0.21.0 headline feature: Erlang-style destructuring of binaries in match expressions, function heads, let bindings, and comprehension generators. The segment AST and the compilers land now; the type-checker narrowing through size(n) (so that rest in <<n::8, rest::binary-size(n)>> picks up byte_size(rest) == n) and exhaustiveness analysis for binary patterns are the v0.21.0 work on top.

cure fmt --algebra

Cure’s formatter has always been two things: a safe byte-level pass (Cure.Compiler.Formatter) that normalises whitespace without re-parsing, and an opt-in aggressive AST printer (Cure.Compiler.Printer) that round-trips through the parser and strips anything the parser cannot see (including plain # comments).

The byte-level pass is conservative but rigid: it cannot re-flow long argument lists onto multiple lines, cannot align -> arrows in pattern arms, cannot make any layout decision that depends on the structure of the code. The aggressive printer can, but at the cost of losing comments.

v0.20.0 ships the missing third option. Cure.Compiler.Algebra is a new document module modelled on Elixir’s Inspect.Algebra (which in turn follows Wadler’s "A prettier printer"):

alias Cure.Compiler.Algebra

doc =
  Algebra.group(
    Algebra.concat([
      Algebra.string("f("),
      Algebra.nest(
        Algebra.concat([
          Algebra.break(""),
          Algebra.fold(args_docs, Algebra.concat(Algebra.string(","), Algebra.break()))
        ]),
        2
      ),
      Algebra.break(""),
      Algebra.string(")")
    ])
  )

Algebra.format(doc, 80)

It provides the standard primitives: empty/0, string/1, concat/2, concat/1, nest/2, break/0, break/1, line/0, line/1, space/0, space/2, group/1, fold/2, and a best-fit format/2. A group fits flat when its content stays within the remaining width; otherwise every soft break inside renders as a newline at the current indent.

Cure.Compiler.AlgebraFormatter converts the Cure AST (including comment nodes) into an algebra document. Rather than rewrite every node’s surface syntax from scratch, the v0.20.0 renderer delegates per-node rendering to the existing Cure.Compiler.Printer and owns the outer layout: standalone {:comment, ...} nodes render as real # ... lines in source order; top-level definitions are separated by blank lines; module/protocol/trait/fsm/proof containers re-walk their body so nested comments surface in the right places.

Cure.Compiler.Formatter.format_algebra/2 lexes with preserve_comments: true, parses, renders, and then round-trip-validates the result: the formatted buffer must re-parse to an AST structurally equal to the input modulo comment placement and position metadata. If verification fails, the original source is returned unchanged—so the formatter is non-destructive by construction.

Opt in with:

cure fmt --algebra

The byte-level safe formatter stays the default for v0.20.0. The plan for v0.21.0 is to promote --algebra to the default (keeping --safe as the escape hatch) once it has had one full release of shake-down.

Structural refinement narrowing

The fourth track is quiet compared to the others, but it is the most important for the long-term dependent-type story.

Cure.Types.PatternRefinement.narrow/2 takes a pattern AST and a scrutinee type and returns {bindings, narrowed_scrutinee}. Two kinds of information come back:

Literal-equality witnesses. A sub-pattern that is a literal means the matched value is that literal along the arm. So

narrow(&lbrace;:literal, [subtype: :integer], 0&rbrace;, :int)
# => &lbrace;%&lbrace;&rbrace;, &lbrace;:refinement, :int, "__value__",
#       &lbrace;:binary_op, [operator: :==, ...],
#        [&lbrace;:variable, [], "__value__"&rbrace;,
#         &lbrace;:literal, [subtype: :integer], 0&rbrace;]&rbrace;&rbrace;&rbrace;

narrows :int to {x: Int | x == 0}. Inside a tuple pattern with a literal slot, the corresponding element of the narrowed scrutinee type picks up the refinement; variables in other slots still bind to their original element type.

Disjoint-tag witnesses. A constructor pattern (Ok(v), Error(e)) or a map pattern with a literal :kind tag narrows the scrutinee to a tagged variant:

narrow(&lbrace;:function_call, [name: "Ok"], [&lbrace;:variable, [], "v"&rbrace;]&rbrace;, :any)
# => &lbrace;%&lbrace;"v" => :any&rbrace;, &lbrace;:variant, :ok, []&rbrace;&rbrace;

The module integrates with the existing Cure.Types.Refinement representation so every narrowed type is something the SMT translator already understands.

bind_pattern_vars/3 in Cure.Types.Checker keeps its existing precise element typing for tuples/lists/records/maps -- schema-aware binding in bind_record_pattern/3 already gives better per-field types than the standalone narrower. But Cure.Types.PatternRefinement is now available as a separate module for callers that want the narrowed scrutinee: future releases will route do_infer({:pattern_match, ...}) through it so match-arm bodies can take advantage of the narrowing, and the path-sensitive refinement pass can propagate disjoint-tag witnesses across subsequent expressions in the same branch.

What’s next (v0.21.0)

The headline v0.21.0 feature is full Erlang-style destructuring of binaries in match expressions, function heads, let, and comprehension generators. The groundwork is in place: segment AST, codegen, PatternCompiler, surface syntax, Printer. v0.21.0 adds type-checker narrowing through binary patterns (propagating size(n) into refinements on rest) and exhaustiveness analysis for binary patterns.

Other v0.21.0 items: the remote package-registry index service, publication signing, and Hex.pm cross-publishing (deferred from v0.20.0); type-directed @derive extensions (Functor, Monoid, JSON); and promotion of cure fmt --algebra to the default formatter.

By the numbers

  • 3 new Elixir modules (Cure.Compiler.Algebra, Cure.Compiler.AlgebraFormatter, Cure.Types.PatternRefinement); major extensions to the lexer, parser, codegen, pattern compiler, printer, formatter, type checker, and CLI.
  • 1016 tests pass (up from 970); mix credo --strict: 0 issues across 136 source files; mix cure.check.stdlib: 24/24; mix cure.check.examples: 30/30.

Getting started

git clone https://github.com/am-kantox/cure-lang.git
cd cure
mix deps.get && mix test
mix escript.build
./cure version            # Cure 0.20.0
./cure fmt --algebra lib/

The repository is at github.com/am-kantox/cure-lang. The furniture came in with v0.19.0. v0.20.0 sanded every edge.