Cure v0.31.0 :: Specialise & Steer
by Aleksei Matiushkin
v0.30.0 admitted that we did not trust each other with the full
picture, so it gave us john -- the panoramic diagnostic that just
shows everything. With the operator caught up to the language,
v0.31.0 can finally turn to the language itself, and to a corner of
it that has been tagged ## Future since v0.24.0.
This release is about not paying.
The pact at the heart of polymorphism
fn id(x: T) -> T = x is one of the most beautiful five-token
programs you can write. It also lies a little. The compiler accepts
id once and for all, but every call site that uses id pays the
price of the lie: a generic dispatch, a generic body, a generic
sub-expression that has to keep its options open in case T turns
out to be the next thing.
In practice, T is almost never the next thing. It is Int here and
String there, and the compiler is the only one in the room
pretending otherwise. Specialisation is the polite way to stop
pretending.
Monomorphisation
Cure.Optimizer.Monomorphise is a new MetaAST pass wired in front of
Cure.Optimizer.Inline. It walks the module body, identifies
function_def nodes whose declared signature mentions
{:type_var, _}, and for every call site whose arguments pin those
variables to concrete types, it synthesises a specialised clone:
mod Specialise
fn id(x: T) -> T = x
fn use_int() -> Int = id(42)
fn use_str() -> String = id("hi")
After the optimiser has finished, the module also exports id/1
unchanged -- out-of-unit callers, the protocol registry, and dynamic apply/3` keep working exactly as before -- and privately holds two
fresh definitions:
id__223a2c(x) -> x. %% T -> Int
id__17522e(x) -> x. %% T -> String
The mangled suffix is the lower-case 6-character base-16 of
:erlang.phash2/1 of the canonicalised substitution. The hash is
stable across recompiles: the same (name, substitution) pair always
maps to the same suffix, so cross-build artifacts and tooling stay in
sync.
The two id__* functions look pointless on their own -- they are
verbatim copies of the original body -- and that is the whole
point. The inliner, which runs immediately afterwards, has been
waiting for this. It now sees a one-expression non-recursive function
with a concrete signature and a single call site, and devours each
clone in turn. The codegen quietly emits a per-module
-compile({nowarn_unused_function, ...}). attribute for the
:specialised_from clones so erl_lint does not complain about the
ghosts the inliner just laid to rest.
The pact also applies to taxes
The "what if it's something else" tax shows up in BEAM bytecode, in JIT decisions, and -- the part I care about most -- in dependent types. A specialised body with concrete refinements is a body the SMT solver can actually reason about. v0.31.0 is the scaffolding; v0.32.0 will start extracting interest payments from it.
Budget
Specialisation is so seductive that it will happily blow up your bytecode if you let it. The pass therefore caps the number of unique specialisations at 16 per source function. When the cap is exceeded, the first 16 clones survive, the rest fall back to the original generic, and the type checker emits E064 Monomorphisation Budget Exhausted with a count of how many clones it dropped on the floor.
Tune it per project:
[compiler]
monomorph_budget = 32
Or one-off:
$ cure compile --monomorph-budget=32 lib/
Disable it entirely with --no-monomorphise. The pass is on by
default whenever --optimize is in effect, which means the rest of
the optimiser sees specialised signatures wherever the inputs are
concrete enough.
Profile-guided optimisation
Specialisation is a good story when it can find a story to tell. Some
functions are polymorphic-ish in shape but always called with a
single concrete type; some are absolutely monomorphic and never hot;
some are hot enough that the inliner's default ast_size <= 5 cap is
embarrassingly small. The optimiser does not know which is which from
source alone. It needs evidence.
Cure.PGO is the evidence pipeline. It has two halves.
Collection
Cure.PGO.Recorder is a GenServer over a public ETS table.
The tick/* operations are direct ETS writes; they are also no-ops
when the recorder is not running, so test fixtures and library users
that import instrumented code do not have to care.
Cure.PGO.Recorder.tick({MyMod, :foo, 1})
Cure.PGO.Recorder.tick_with_us({MyMod, :bar, 0}, 250)
Cure.PGO.Recorder.tick_branch({MyMod, :baz, 1}, _site_id = 7)
Cure.PGO.Recorder.tick_smt({MyMod, :quux, 2}, _queries = 4, 1500)
Cure.PGO.Recorder.flush/1 serialises one Cure.PGO.Profile per
module to _build/cure/pgo/<mod>.profile as a compressed Erlang
external term. Each entry stores :erlang.phash2/1 of the live
function_def meta as :def_hash; the loader uses it to detect
stale profiles whose underlying source has drifted.
The CLI plumbing wraps the recorder lifecycle:
$ cure run --record-profile lib/main.cure
$ cure profile show --top 5
PGO summary (_build/cure/pgo)
hot: 3 function(s)
cold: 41 function(s)
threshold: 0.8
$ cure profile clear
Consumption
Cure.PGO.load("_build/cure/pgo") reads every *.profile and
returns a %Cure.PGO{} summary partitioning per-MFA entries into a
hot set (top 80% of total cost, with the threshold tunable) and a
cold set. The optimiser pipeline takes that summary as a keyword
option:
{:ok, pgo} = Cure.PGO.load("_build/cure/pgo")
Cure.Optimizer.optimize(ast, pgo: pgo, module: :"Cure.MyMod")
Cure.Optimizer.Inline then classifies each function_def:
- Hot. Size cap relaxes to
<= 12. The inliner becomes more aggressive. - Cold. Size cap tightens to
<= 2. Effectively disables inlining; the call site keeps its frame. - Default. The pre-v0.31.0 cap of
<= 5stays put.
Cure.SMT.Solver.check_sat/3 accepts the same hint:
- Hot. Doubles the Z3 timeout (
6 s) and emits(set-option :smt.arith.solver 6)so flaky:unknowns on the hot path get a second look with the stronger arithmetic engine. - Cold. Halves the timeout (
1.5 s) and conservatively returns:saton:unknown. Cold-path queries can be wrong, but they are not allowed to be slow. - Default. Pre-v0.31.0 behaviour, unchanged.
Strictly opt-in
PGO is not a default. If you compile without --pgo, Cure.PGO is
never consulted, even when _build/cure/pgo/ exists and is full.
This is deliberate. The rule of thumb I have lived by since the
v0.23.0 transparency-log story is that nothing should silently
change the bytecode that ships. PGO is exactly the kind of feature
that wants to break that rule. v0.31.0 says no.
$ cure compile lib/ # never touches profiles
$ cure compile --pgo lib/ # the only way in
Pattern-aware SMT, just enough to grow
Cure.SMT.Translator.generate_query/3 learnt a pgo_hint parameter:
:hot | :cold | :default. Hot queries gain the
(set-option :smt.arith.solver 6) prelude before everything else.
Cold and default queries are byte-identical to v0.30.x.
The richer trigger emission (forall ... :pattern (...)) for
refinement obligations is the natural next step, and it stays on the
roadmap. v0.31.0 ships the seam, the budget, the timeout machinery,
and the cold-path "may hold" semantics. Triggers move when there are
real queries to attach them to.
ASCII diagrams
cure top (v0.27.0) gives you a runtime snapshot. cure doc ships
Mermaid for FSMs, supervisors, and applications. There is one shape
left between them: the static, copy-pasteable terminal panel, which
is exactly what you want when your terminal cannot render Mermaid,
when you are reviewing a PR with git show, or when you are
explaining a supervision tree on a phone call.
Cure.Doc.Ascii reuses the FSM / sup / app traversals from the
v0.27.0 Mermaid emitter and replaces the renderer with Unicode
box-drawing:
$ cure draw lib/turnstile.cure
fsm Turnstile
=============
*──> Locked
states:
▢ Locked
▢ Unlocked
transitions:
Locked ──[coin]──> Unlocked
Unlocked ──[push]──> Locked
Supervisors render as a vertical tree (├── / └──) with each
child's id, module, and restart policy. Applications render as a
panel with vsn, root supervisor, and declared applications. A
--filter fsm|sup|app flag lets cure draw ignore the other
container kinds when a file declares more than one.
What I left on the table
Two items moved out of v0.31.0 deliberately.
Typed hot code upgrades. A typed .appup story
(cure release --upgrade-from, SMT-verified @migration functions,
E057 / E058) was on the original v0.31.0 manifest. The appup /
relup machinery is large enough, and fragile enough, that I want to
give it a release of its own rather than smuggle a half-finished
version into the optimisation release. Rescheduled for v0.32.0.
Automatic codegen instrumentation. v0.31.0 ships the data layer,
the recorder, and the consumer; the codegen pass that injects a
Cure.PGO.Recorder.tick/1 call at every function entry is the next
step. Until it lands, users who want full PGO drive ticks manually,
extend Cure.Profiler to bridge events into the recorder, or
hand-craft profiles for high-traffic modules.
Numbers
- Two new modules:
Cure.Optimizer.Monomorphise,Cure.PGO,Cure.PGO.Profile,Cure.PGO.Recorder,Cure.Doc.Ascii. Plus surgical extensions toCure.Optimizer,Cure.Optimizer.Inline,Cure.SMT.Translator,Cure.SMT.Solver,Cure.Compiler,Cure.Compiler.Codegen, andCure.CLI. - New error code
E064 Monomorphisation Budget Exhausted. - Two new docs:
docs/MONOMORPHISATION.mdanddocs/PGO.md, both wired into the Hex documentation extras. - New example:
examples/specialise.cure-- a small showcase that exercises four distinct substitutions acrossid/1andpair_first/2. - 1691 tests passing. Zero credo issues across 275 source files in strict mode. Zero warnings.
What's next
The long-horizon list got shorter and a little sharper.
- Typed hot code upgrades.
cure release --upgrade-fromto emit a typed.appupwith SMT-verified state migrations (@migration, E057 / E058). Rescheduled from v0.31.0; the appup / relup machinery deserves its own cycle. - Automatic PGO instrumentation. Inject
Cure.PGO.Recorder.tick/1at every compiled function's entry socure run --record-profileproduces useful profiles by default, not by hand. - Pattern-trigger SMT encoding. v0.31.0 ships the
pgo_hintparameter and the hot-path arithmetic-solver switch; richerforall ... :patternemission for refinement obligations is the follow-up. - Cross-module monomorphisation. v0.31.0 specialises per compilation unit. A whole-program pass that threads concrete substitutions across modules is the obvious next step once we have a project-wide call graph.
- Broader IDE support. First-class Helix and Zed configurations. An upgraded VS Code extension that tracks the current LSP surface.
- REPL-level hot reload. Recompile-and-rebind on every file save
for long-running REPL sessions, closing the loop between editing
.curefiles and the running REPL environment.
The repository lives at
github.com/am-kantox/cure-lang.
The new on-disk references are
docs/MONOMORPHISATION.md
and docs/PGO.md.
The full
CHANGELOG
lists every touch point.