cure_lexer (cure v0.1.0)
Cure Programming Language - Lexer
The lexer module provides tokenization services for Cure source code, converting raw text into a structured stream of tokens for the parser. It supports all Cure language constructs including keywords, operators, literals, and comments.
Features
- Position Tracking: Every token includes precise line and column information
- String Interpolation: Support for #{expr} string interpolation syntax
- Multi-character Operators: Recognition of operators like ->, |>, ::, etc.
- Comprehensive Literals: Numbers, strings, atoms, and boolean values
- Keywords: All Cure language keywords including FSM constructs
- Error Recovery: Detailed error reporting with location information
 
Token Types
The lexer recognizes the following token categories:
- Keywords: def, fsm, match, when, etc.
- Identifiers: Variable and function names
- Literals: Numbers, strings, atoms, booleans
- Operators: +, ->, |>, ::, ==, etc.
- Delimiters: (), [], {}, ',', ';', etc.
- Comments: Line comments starting with #
String Interpolation
Supports string interpolation with #{expression} syntax:
"Hello #{name}!"  % Tokenized as interpolated stringError Handling
Error Handling
All tokenization errors include precise location information:
{error, {Reason, Line, Column}}
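For orientation, a minimal end-to-end sketch using only the functions documented below (tokenize/1 and token_type/1); the output comment mirrors the tokenize/1 example later in this page:

case cure_lexer:tokenize(<<"def add(x, y) = x + y">>) of
    {ok, Tokens} ->
        %% Reduce each token to its type atom for a quick overview
        [cure_lexer:token_type(T) || T <- Tokens];
    {error, {Reason, Line, Column}} ->
        io:format("lex error ~p at ~p:~p~n", [Reason, Line, Column])
end.
% => [def, identifier, '(', ...]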
Summary
Functions
Extract the type from a token record.
Tokenize a string of Cure source code into a list of tokens.
Tokenize a Cure source file.
Functions
-spec token_type(#token{type :: atom(), value :: term(), line :: integer(), column :: integer()}) -> atom().
Extract the type from a token record.
This utility function extracts the token type, which is useful for pattern matching and categorizing tokens in the parser.
Arguments
Token - Token record to extract type from
Returns
- Atom representing the token type (e.g., def, identifier, number)
Examples
Token = #token{type = identifier, value = <<"add">>, line = 1, column = 5},
token_type(Token).
% => identifier
KeywordToken = #token{type = def, value = def, line = 1, column = 1},
token_type(KeywordToken).
% => def
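As a further sketch (the token list below is built inline purely for illustration), token_type/1 composes naturally with list comprehensions when categorizing tokens:

Tokens = [#token{type = def, value = def, line = 1, column = 1},
          #token{type = identifier, value = <<"add">>, line = 1, column = 5}],
%% Keep only identifier tokens by checking their extracted type
[T || T <- Tokens, token_type(T) =:= identifier].
% => [#token{type = identifier, value = <<"add">>, line = 1, column = 5}]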
  -spec tokenize(binary()) -> {ok, [#token{type :: atom(), value :: term(), line :: integer(), column :: integer()}]} | {error, term()}.
Tokenize a string of Cure source code into a list of tokens.
This is the main entry point for lexical analysis. It processes the entire input and returns a list of tokens with position information.
Arguments
Source - Binary containing Cure source code to tokenize
Returns
- {ok, Tokens} - Successful tokenization with list of token records
- {error, {Reason, Line, Column}} - Tokenization error with location
- {error, {Error, Reason, Stack}} - Internal error with stack trace
Token Record
Each token is a record with fields:
- type - Token type atom (e.g., identifier, number, def)
- value - Token value (e.g., variable name, number value)
- line - Line number (1-based)
- column - Column number (1-based)
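For callers that need these fields directly, a minimal sketch (the include path is an assumption; use whichever header actually defines the #token{} record in your build):

%% Hypothetical include; the real header name may differ.
-include("cure_lexer.hrl").

%% Collect {Line, Column} pairs from a token list, e.g. for diagnostics.
positions(Tokens) ->
    [{T#token.line, T#token.column} || T <- Tokens].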
Examples
tokenize(<<"def add(x, y) = x + y">>).
% => {ok, [
%      #token{type=def, value=def, line=1, column=1},
%      #token{type=identifier, value= <<"add">>, line=1, column=5},
%      #token{type='(', value='(', line=1, column=8},
%      ...
%    ]}
tokenize(<<"invalid \xff character">>).
% => {error, {invalid_character, 1, 9}}
Error Types
- invalid_character - Unrecognized character in input
- unterminated_string - String literal without closing quote
- invalid_number_format - Malformed numeric literal
- unterminated_comment - Block comment without proper termination
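To make the error contract concrete, a hedged sketch of dispatching on the documented reasons (the function name and message strings are illustrative only):

describe_lex_error({invalid_character, Line, Col}) ->
    io_lib:format("unrecognized character at ~p:~p", [Line, Col]);
describe_lex_error({unterminated_string, Line, Col}) ->
    io_lib:format("string not closed, started at ~p:~p", [Line, Col]);
describe_lex_error({invalid_number_format, Line, Col}) ->
    io_lib:format("malformed number at ~p:~p", [Line, Col]);
describe_lex_error({unterminated_comment, Line, Col}) ->
    io_lib:format("block comment not terminated, started at ~p:~p", [Line, Col]);
describe_lex_error(Other) ->
    io_lib:format("lexer error: ~p", [Other]).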
-spec tokenize_file(string()) -> {ok, [#token{type :: atom(), value :: term(), line :: integer(), column :: integer()}]} | {error, term()}.
Tokenize a Cure source file.
This is a convenience function that reads a file from disk and tokenizes its contents. It handles file I/O errors and passes the content to the main tokenization function.
Arguments
Filename - Path to the .cure source file to tokenize
Returns
- {ok, Tokens} - Successful tokenization with list of token records
- {error, {file_error, Reason}} - File I/O error
- {error, {Reason, Line, Column}} - Tokenization error with location
Examples
tokenize_file("examples/hello.cure").
% => {ok, [#token{type=def, ...}, ...]}
tokenize_file("nonexistent.cure").
% => {error, {file_error, enoent}}
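Putting the documented return shapes together, a small hedged sketch of a caller that distinguishes I/O failures from lexical errors (the wrapper name and result tuples are illustrative):

load_tokens(Filename) ->
    case cure_lexer:tokenize_file(Filename) of
        {ok, Tokens} ->
            {ok, Tokens};
        {error, {file_error, Reason}} ->
            %% e.g. enoent, eacces - the file could not be read
            {error, {io, Filename, Reason}};
        {error, {Reason, Line, Column}} ->
            %% Lexical error inside the file, with 1-based position
            {error, {lex, Filename, Reason, Line, Column}}
    end.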