Cure v0.20.0 :: The Shape of Things
by Aleksei Matiushkin
v0.19.0 was "Bring the Furniture"—proofs, record defaults, derive, property-based testing, a lazy iterator protocol, the first half of a package registry. It was a release about shipping new things.
v0.20.0 is about the shape those things have inside the
compiler. Four coupled tracks land together. None of them are
user-visible language features in the usual sense: they are the
scaffolding that makes the next generation of features cheap to
build. Plain # comments are now first-class AST nodes, binary
literals and patterns grow full Elixir-style segment specifiers, a
Wadler/Inspect.Algebra-style pretty-printer sits behind
cure fmt --algebra, and a new Cure.Types.PatternRefinement
module narrows scrutinee types through constructor and
literal-equality witnesses.
Plain # comments in the AST
Until v0.20.0, Cure.Compiler.Lexer discarded every plain #
comment. Only ## and fenced ### ... ### doc comments survived
as :doc_comment tokens. That was fine for compilation—comments
have no runtime meaning—but it made an AST-based formatter
lossy by construction: once you lifted a file into the parser,
every inline annotation was gone.
v0.20.0 adds an opt-in preserve_comments: true option to
Cure.Compiler.Lexer.tokenize/2. When set, plain # comments
emit :line_comment tokens. The parser threads those tokens into
the AST as
{:comment, [line: 12, col: 3], "explain foo"}
nodes inside the enclosing program, container body, or block body.
There is a subtlety here worth calling out: the lexer emits
:line_comment before the :indent token of the block the
comment belongs to. A naive parser would scoop the comment up as a
top-level node and leave the block empty. parse_definition_block
now collects those leading comments and routes them back inside
the block, so the comment above fn foo ends up a sibling of fn foo in the module body, not a sibling of the module itself.
The rest of the compiler treats comment nodes as trivia:
Cure.Compiler.Codegen strips them from block bodies before
generating Erlang forms, and Cure.Types.Checker.do_infer reports
:unit for them so a trailing comment never steals a block's
result type. If you are building a linter, a documentation
extractor, or something that wants to preserve comments, you can
opt in via the lexer flag without affecting any existing pipeline.
Full bitstring segments
Cure has had <<>> binary literals for a long time, but every
element was just an expression: no sizes, no types, no endianness,
nothing. v0.20.0 brings the full Erlang/Elixir-style segment
grammar:
<<tag::utf8, size::16, payload::binary-size(size), rest::binary>>
The lexer picks up :: as a new :colon_colon token, distinct
from the type-annotation : (Cure already reserved single colon
for type annotations, so reusing it for segment specifiers would
have been ambiguous). The parser wraps every <<...>> element in
{:bin_segment, [type: :utf8, size: {...}, ...], [value]}
Segment metadata carries :type (:integer, :float, :bits,
:bitstring, :bytes, :binary, :utf8, :utf16, :utf32),
:signedness (:signed, :unsigned), :endianness (:big,
:little, :native), :size (an AST node—any expression),
and :unit (integer). Specifiers chain with -:
::binary-size(n). A bare integer is shorthand for size(n), so
<<x::8>> is just <<x::size(8)>>. Defaults mirror Erlang:
integer-unsigned-big-size(8)-unit(1), with utf8/utf16/utf32
providing their own implicit size.
Cure.Compiler.Codegen and Cure.Compiler.PatternCompiler
translate the segment AST into full Erlang :bin_element forms
with the correct size/type/unit/sign/endian tuples, in both
construction and pattern positions. Cure.Compiler.Printer
round-trips segments back to surface syntax.
This is deliberately the groundwork for the v0.21.0 headline
feature: Erlang-style destructuring of binaries in match
expressions, function heads, let bindings, and comprehension
generators. The segment AST and the compilers land now; the
type-checker narrowing through size(n) (so that rest in
<<n::8, rest::binary-size(n)>> picks up byte_size(rest) == n)
and exhaustiveness analysis for binary patterns are the
v0.21.0 work on top.
cure fmt --algebra
Cure’s formatter has always been two things: a safe byte-level
pass (Cure.Compiler.Formatter) that normalises whitespace without
re-parsing, and an opt-in aggressive AST printer
(Cure.Compiler.Printer) that round-trips through the parser and
strips anything the parser cannot see (including plain #
comments).
The byte-level pass is conservative but rigid: it cannot re-flow
long argument lists onto multiple lines, cannot align -> arrows
in pattern arms, cannot make any layout decision that depends on
the structure of the code. The aggressive printer can, but at the
cost of losing comments.
v0.20.0 ships the missing third option. Cure.Compiler.Algebra is
a new document module modelled on Elixir’s Inspect.Algebra (which
in turn follows Wadler’s "A prettier printer"):
alias Cure.Compiler.Algebra
doc =
Algebra.group(
Algebra.concat([
Algebra.string("f("),
Algebra.nest(
Algebra.concat([
Algebra.break(""),
Algebra.fold(args_docs, Algebra.concat(Algebra.string(","), Algebra.break()))
]),
2
),
Algebra.break(""),
Algebra.string(")")
])
)
Algebra.format(doc, 80)
It provides the standard primitives: empty/0, string/1,
concat/2, concat/1, nest/2, break/0, break/1, line/0,
line/1, space/0, space/2, group/1, fold/2, and a
best-fit format/2. A group fits flat when its content stays
within the remaining width; otherwise every soft break inside
renders as a newline at the current indent.
Cure.Compiler.AlgebraFormatter converts the Cure AST (including
comment nodes) into an algebra document. Rather than rewrite every
node’s surface syntax from scratch, the v0.20.0 renderer delegates
per-node rendering to the existing Cure.Compiler.Printer and
owns the outer layout: standalone {:comment, ...} nodes render
as real # ... lines in source order; top-level definitions are
separated by blank lines; module/protocol/trait/fsm/proof
containers re-walk their body so nested comments surface in the
right places.
Cure.Compiler.Formatter.format_algebra/2 lexes with
preserve_comments: true, parses, renders, and then
round-trip-validates the result: the formatted buffer must
re-parse to an AST structurally equal to the input modulo comment
placement and position metadata. If verification fails, the
original source is returned unchanged—so the formatter is
non-destructive by construction.
Opt in with:
cure fmt --algebra
The byte-level safe formatter stays the default for v0.20.0. The
plan for v0.21.0 is to promote --algebra to the default
(keeping --safe as the escape hatch) once it has had one full
release of shake-down.
Structural refinement narrowing
The fourth track is quiet compared to the others, but it is the most important for the long-term dependent-type story.
Cure.Types.PatternRefinement.narrow/2 takes a pattern AST and a
scrutinee type and returns {bindings, narrowed_scrutinee}. Two
kinds of information come back:
Literal-equality witnesses. A sub-pattern that is a literal means the matched value is that literal along the arm. So
narrow({:literal, [subtype: :integer], 0}, :int)
# => {%{}, {:refinement, :int, "__value__",
# {:binary_op, [operator: :==, ...],
# [{:variable, [], "__value__"},
# {:literal, [subtype: :integer], 0}]}}}
narrows :int to {x: Int | x == 0}. Inside a tuple pattern
with a literal slot, the corresponding element of the narrowed
scrutinee type picks up the refinement; variables in other slots
still bind to their original element type.
Disjoint-tag witnesses. A constructor pattern (Ok(v),
Error(e)) or a map pattern with a literal :kind tag narrows
the scrutinee to a tagged variant:
narrow({:function_call, [name: "Ok"], [{:variable, [], "v"}]}, :any)
# => {%{"v" => :any}, {:variant, :ok, []}}
The module integrates with the existing
Cure.Types.Refinement representation so every narrowed type is
something the SMT translator already understands.
bind_pattern_vars/3 in Cure.Types.Checker keeps its existing
precise element typing for tuples/lists/records/maps --
schema-aware binding in bind_record_pattern/3 already gives
better per-field types than the standalone narrower. But
Cure.Types.PatternRefinement is now available as a separate
module for callers that want the narrowed scrutinee: future
releases will route do_infer({:pattern_match, ...}) through it
so match-arm bodies can take advantage of the narrowing, and the
path-sensitive refinement pass can propagate disjoint-tag
witnesses across subsequent expressions in the same branch.
What’s next (v0.21.0)
The headline v0.21.0 feature is full Erlang-style destructuring of
binaries in match expressions, function heads, let, and
comprehension generators. The groundwork is in place: segment AST,
codegen, PatternCompiler, surface syntax, Printer. v0.21.0
adds type-checker narrowing through binary patterns (propagating
size(n) into refinements on rest) and exhaustiveness analysis
for binary patterns.
Other v0.21.0 items: the remote package-registry index service,
publication signing, and Hex.pm cross-publishing (deferred from
v0.20.0); type-directed @derive extensions (Functor, Monoid,
JSON); and promotion of cure fmt --algebra to the default
formatter.
By the numbers
- 3 new Elixir modules (
Cure.Compiler.Algebra,Cure.Compiler.AlgebraFormatter,Cure.Types.PatternRefinement); major extensions to the lexer, parser, codegen, pattern compiler, printer, formatter, type checker, and CLI. - 1016 tests pass (up from 970);
mix credo --strict: 0 issues across 136 source files;mix cure.check.stdlib: 24/24;mix cure.check.examples: 30/30.
Getting started
git clone https://github.com/am-kantox/cure-lang.git
cd cure
mix deps.get && mix test
mix escript.build
./cure version # Cure 0.20.0
./cure fmt --algebra lib/
The repository is at github.com/am-kantox/cure-lang. The furniture came in with v0.19.0. v0.20.0 sanded every edge.