Syntax Reference¶

This document explains how Temper code is parsed. Readers can use temper repl to get feedback on how the language interprets a piece of code. Especially useful is the describe REPL command, which lets you view a snapshot of the compilation state at various processing stages.

Temper's syntax should be familiar to users of “C-like” languages: languages that use {...} around blocks and semicolons (;) to separate computational steps. It is most similar to TypeScript; types follow names (name: Type) with a colon in between. But its syntax differs from JS/TS in the details.

Some differences include:

Temper uses let for named function declarations so that there is no confusion about when a named function's name is visible in the surrounding scope. Temper is a macro language so this is important when macros can operate on declarations.
Temper allows interpolation into any string, as in "chars${expr}". For backwards compatibility, JavaScript could only allow interpolation into back-tick strings (`chars${expr}`).
Temper has substantially different syntax for function expressions (fn (x: Int): Int { x + 1 } instead of (x: Int): Int => x + 1) and function types (fn <T> (T): T instead of <T>(T) => T).
Temper's import and export syntax, which allows connecting modules together, is different.
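
As a minimal sketch of the first three differences, using only forms quoted in the list above. The names increment, addOne, and greeting are placeholders chosen for illustration:

```temper
// Named function declaration: introduced with `let`.
let increment(x: Int): Int { x + 1 }

// Function expression: introduced with `fn`, here bound to a name.
let addOne = fn (x: Int): Int { x + 1 };

// Interpolation works in an ordinary double-quoted string.
let greeting = "Hello, ${ "world" }!";
```

Pasting lines like these into temper repl is a quick way to check how each one is actually parsed.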

The grammar below explains the main syntactic categories. It's meant to be advisory, to help learners discover features by following grammatical threads.

It is not an exact grammar. Temper has a three-stage parse: lexical analysis, operator precedence grouping, tree building. This grammar is derived from the tree builder which operates on a stream of tokens after an operator precedence parser has inserted synthetic parentheses into the token stream and after some other token level rewriting operations.

Since Temper is a macro language, some language features that would have separate syntactic paths in a non-macro language are instead implemented as macros; they parse as regular function calls, but those functions are macros that apply at a later compilation stage. For example, if is a macro, so there is no dedicated syntax for if statements below.

Structure of a file¶

Syntax for Root¶

The root of a Temper module is a sequence of top-levels followed by an end of file marker.

Syntax for TopLevels¶

Top-levels are separated by semicolons in a module body or block.

Syntax for TopLevel¶

A top-level is roughly a whole declaration or expression. Temper is an expression language, so most statement-like constructs can also nest.

Syntax for Garbage¶

Syntax for TopLevelNoGarbage¶

Syntax for TrailingSemi¶

Semicolons (;) are allowed at the end of a block or module body. An expression followed by a semicolon is not implicitly the result of the containing block or module.

Trailing semicolons are never inserted.
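
To illustrate the effect of a trailing semicolon, here is a sketch with two hypothetical functions. doSomethingWith is a placeholder, not a built-in:

```temper
// No trailing semicolon: `x + 1` is the result of the block.
let increment(x: Int): Int { x + 1 }

// Trailing semicolon: the call is evaluated for its side effect,
// but its value is not the result of the block.
let report(x: Int): Void { doSomethingWith(x); }
```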

Statements¶

Syntax for Stmt¶

Statements are not a large syntactic category, but they include labeled statements (like someName: /* loop or block */) and jumps (break, continue, return, etc.), which go to the end or beginning of a containing statement and which may refer to a label.

Besides those, any expression may appear in statement position.
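
A sketch of a labeled statement with jumps that refer to its label. The while loop and the helper functions hasMore, isDone, and advance are assumptions made for illustration:

```temper
search: while (hasMore()) {
  if (isDone()) {
    break search;      // jumps to the end of the labeled statement
  }
  advance();           // an ordinary expression in statement position
  continue search;     // jumps back to the beginning of the labeled loop
}
```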

Syntax for Nop¶

A semicolon used to separate statements. Since our parser is built around an operator precedence parser, and semicolon is a low precedence operator, this grammar consumes them, but does not require them.

Not all semicolons need to appear explicitly in program text.

Automatic Semicolon Insertion¶

Semicolons are inserted in the following places:

After a } that ends a line, except before a close bracket or an operator token that is not prefix.
Before a { that starts a line, except after an open bracket or an operator token that is not postfix.

This is more conservative than semicolon insertion in JavaScript, but still simplifies several things.

All adjacent statements are separated by semicolons¶

There's no need to have a set of special statements like if (...) stmt0 else stmt1 that do not need to be followed by a semicolon. Productions for a series of statements and declarations can simply assume that semicolons appear before them.

No limited set of statement continuers¶

We don't need a special set of statement continuers like else so that we know that the token sequence } else { is part of one statement. This lets us use common cues to allow new continuers like

foo(x) {
  // Ruby-style block
} bar(y) {
  // Ruby-style block
}
// ⏸️

which de-sugars to a single statement

foo(x, fn { ... }, bar = fn (f) { f(y, fn { ... }) });
// ⏸️

vs something without a continuer

foo(x) {
  // Ruby-style block
}                         // <-- Semicolon inserted here
bar(y) {
  // Ruby-style block
}
// ⏸️

which de-sugars to two statements

foo(x, fn { ... });
bar(y, fn { ... });
// ⏸️

Motivation¶

Developers of C-like languages are used to not following }s that end a statement with a semicolon.

The exception is class definitions in C++ which, unlike those in Java and more recent C-like languages, do need to be followed by semicolons.

The fact that this trips me up every time I go back to C++ suggests that requiring semicolons after statements that end with something block-like would be a burden to developers.

Syntax for LabeledStmt¶

Declares a label and associates it with the beginning and end of a statement, so that breaks and continues within that statement may refer to it explicitly.

Unlike TypeScript, we do not allow labeling arbitrary statements. This allows conveying property declarations like the p: T in

interface I {
  p: T;
}
// ⏸️

to the disAmbiguate stage with that property declaration treated as a (Call (Name ":") (Name "p") (Name "T"))

Otherwise, we would have to treat class bodies as a special syntactic category to avoid ambiguity with

do {
  p: T;
}
// ⏸️

or the disambiguation would need to convert T's from statement context to expression context.

Syntax for LeftLabel¶

A label that can be jumped to, as by break and continue. This is left in the left-hand-side sense: it is a declaration, not a use.

Syntax for Jump¶

A jump to a location within the same function body that does not skip over any necessary variable initializations.

Syntax for LabelOrHole¶

Syntax for Label¶

A label that can be jumped to as by break and continue.

Syntax for AwaitReturnThrowYield¶

await, return, throw, and yield are operators which affect control flow and operate on expressions.

return(42); for example, looks like a function call but the parentheses are not required:

return; is an application of an operator even though there are no parentheses.
return 42; is an application of the operator to the arguments (42) even though there are no explicit parentheses.
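
The three spellings side by side, as a sketch. The function names are placeholders:

```temper
let answer(): Int {
  return 42;      // operator applied to 42, no explicit parentheses
}

let answerParenthesized(): Int {
  return(42);     // same application, written like a function call
}

let nothing(): Void {
  return;         // operator applied with no operand
}
```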

Syntax for StmtBlock¶

A { ... } delimited block of statements.

Expressions¶

Syntax for Expr¶

An expression is evaluated to produce a result and/or a side effect.

Syntax for BooleanLiteral¶

Syntax for Float64Literal¶

Syntax for Call¶

The call expression includes simple function calls (f(args)) as well as calls with Ruby-style block functions (f { ... }) and flow-control like if (condition) { t } else { e } because if is a macro.

Syntax for New¶

Syntax for StringLiteral¶

Block Lambdas¶

Syntax for BlockLambda¶

A block lambda is a {...} block that specifies a function value and which can appear as part of a function call as below:

someFunction { ... }

A signature is optional. It is needed to specify argument names, and may specify the function type wholly or partially.

It may be followed by an extends clause that specifies marker interfaces that are super-types for the produced function value.

The signature is followed by the double-semicolon token (;;), which is distinct from two space-separated semicolons (; ;).

someFunction /* <- callee */ {
  (x): ReturnType               // <- Optional signature
  extends SomeInterfaceType     // <- super types
  ;;                            // <- double semicolon separator

  body
}
// ⏸️

Syntax for BlockLambdaSignatureAndSupers¶

The signature of a block lambda explains the names of arguments visible within the body, optionally their types and return type.

The signature also includes other interfaces that the lambda must implement. For example, a function that might pause execution could use a signature line as below:

(x: Int): Int extends GeneratorFn;;

That describes a function that takes an integer x, returns an Int, and is also a sub-type of GeneratorFn.

The extends clause may be left off entirely if no super-types are desired, or multiple super-types may be specified: extends First & Second.

Unlike in a function type, when a name is specified for a block lambda argument, it is the name of the argument, not its type.

let f: fn (Int): Void;
//         ⇧⇧⇧ word is a type

let g(myLambda: fn (Int): Void): Void { myLambda(1); }

g { (x): Void;;
  // ⇧ word is an argument name.
  // In this case, the type is inferred from g's signature.
  doSomethingWith(x + 1);
}
// ⏸️

Syntax for BlockLambdaSignature¶

A block lambda signature line looks like (x: Int): ReturnType, or just (x) to take advantage of the more aggressive type inference for block lambdas than for declared functions.

This is often followed by ;; as it is part of BlockLambdaSignatureAndSupers Syntax.

These syntactic constructs are interpreted as if preceded by fn but the meaning is subtly different.

(x: Int) is equivalent to fn (x: Int) where the return type must later be inferrable from the calling context and the body.
(x) is equivalent to fn (x) where the argument and return type must later be inferrable.
(x): ReturnType is equivalent to fn (x): ReturnType where only argument types must later be inferrable.

Syntax for BlockLambdaSupers¶

Syntax for BlockLambdaBody¶

Uncategorized¶

Syntax for Arg¶

Syntax for ArgNoInit¶

Syntax for Args¶

Syntax for Arrow¶

Syntax for ArrowHead¶

Syntax for CallArgs¶

Arguments to a function call.

A function call's arguments may be one of:

a parenthesized, comma separated list of arguments like (a, b, c). See Args
a parenthesized, semicolon separated list of two or three arguments with a specific purpose, as in (let x = 1; x < 2; ++x), which is what the for loop macro expects.
a string group as in a tagged string template like callee"foo ${ bar }".
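
The three forms side by side, as a sketch. callee, a, b, c, bar, and body are placeholder names, the for header is copied from the list above, and the last line assumes callee is usable as a string tag:

```temper
// Parenthesized, comma-separated arguments.
callee(a, b, c);

// Parenthesized, semicolon-separated arguments, as the for loop macro expects.
for (let x = 1; x < 2; ++x) { body }

// A string group: a tagged string template.
callee"foo ${ bar }";
```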

Syntax for CallHead¶

The function called, its arguments, and any block lambda.

Syntax for CallJoiningWords¶

When the call continues with something like } else if (c) {...} we need to include \else_if = fn {...} as a final named parameter to the call that receives the block just closed, so that the called function can delegate its result to later segments. This joins words like else if into the \else_if symbol which follows the call join symbol. A late parse stage pass finds those and groups everything following the joining words into a trailing block function so that the contents of the parentheses and brackets can match their own signature elements based on the joining words.

Syntax for CallTail¶

Syntax for Callee¶

The callee in a function application is a tad complicated.

Our operator precedence parser (OPP) grammar covers many constructs that are bespoke in other languages, so class C extends A, B { ... } is parsed as an application of a block lambda (later turned into a member block) like class(\word, C, \super, A, \super, B, fn { ... }).

This production desugars various parts into a combination of the callee class, and symbol/argument pairs.

The \word argument is also used in constructs like function declaration let f<T>(arg: Type) { ... } where the let macro is what is invoked to build a named function declaration.

This production allows a callee to have:

an expression specifying the called macro or function,
an accompanying word,
type parameters like <T, U> (whether the type parameters are actual parameters or formal parameters is determined by the Disambiguate stage).

Syntax for CalleeAndArgs¶

Captures low precedence operators that may follow a parenthesized argument list.

: ReturnType desugars to \outType, ReturnType.
extends SuperType and implements SuperType desugar to \super, SuperType.

Syntax for CalleeAndRequiredArgs¶

This is like CalleeAndArgs but is used in contexts where we're not sure yet whether this is a call. A call requires at least one of

Parenthesized arguments as in callee()
Semi-ed arguments as in loopMacro (initialization; condition; increment)
A template string as in callee"foo ${bar}"
A trailing block as in callee {}

This production is entered where we may not have a trailing block, so one of the others must be present.

Syntax for CommaEl¶

Syntax for CommaExpr¶

Syntax for CommaOp¶

Syntax for DeclDefault¶

Syntax for DeclInit¶

Syntax for DeclMulti¶

Syntax for DeclMultiNamed¶

Syntax for DeclMultiNested¶

Syntax for DeclName¶

Syntax for DeclType¶

Syntax for DeclTypeNested¶

Syntax for DecoratedLet¶

Syntax for DecoratedLetBody¶

Syntax for DecoratedTopLevel¶

Decorations transform declarations and function and type definitions at compile time.

A decoration is written as @SomeName followed by an optional argument list.

When a let declaration declares multiple names, any decoration before the let applies to all the names, but decorations immediately before a declared name affect only that name.

Syntax for EmbeddedComment¶

Comments are not semantically significant, but they are not filtered out entirely either.

Temper tries to preserve them when translating documentation, and they are available to backends; for example, the Python backend turns autodoc comments before declarations into Python doc strings.

Syntax for EscapeSequence¶

Syntax for ForArgs¶

Syntax for ForCond¶

Syntax for ForIncr¶

Syntax for ForInit¶

Syntax for Formal¶

Syntax for FormalNoInit¶

Syntax for Formals¶

Syntax for Id¶

Syntax for Infix¶

Syntax for InfixOp¶

Syntax for Json¶

Syntax for JsonArray¶

Syntax for JsonBoolean¶

Truth values are represented using the keywords false and true.

Syntax for JsonNull¶

Syntax for JsonNumber¶

Syntax for JsonObject¶

Syntax for JsonProperty¶

Syntax for JsonString¶

Syntax for JsonValue¶

Syntax for Let¶

Syntax for LetArg¶

Syntax for LetBody¶

Syntax for LetNested¶

Syntax for LetRest¶

Syntax for List¶

Syntax for ListContent¶

Syntax for ListElement¶

Syntax for ListElements¶

Syntax for ListHole¶

Syntax for Literal¶

Syntax for MatchBranch¶

Relates a match case, e.g. a pattern, to a consequence of matching that pattern.

Syntax for MatchCase¶

There are two kinds of match cases: run-time type checks that use the keyword is, and values to match.
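
A sketch of both kinds of match case. It assumes the cases appear as branches of a when construct, with -> separating each case from its consequence and else as the fallback; Shape, Circle, Square, nameOf, and describe are hypothetical names:

```temper
interface Shape {}
class Circle extends Shape {}
class Square extends Shape {}

// Run-time type checks: each case uses the keyword `is`.
let nameOf(s: Shape): String {
  when (s) {
    is Circle -> "circle";
    is Square -> "square";
    else -> "unknown shape";
  }
}

// Values to match: each case is a value compared against `x`.
let describe(x: Int): String {
  when (x) {
    0 -> "zero";
    1 -> "one";
    else -> "many";
  }
}
```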

Syntax for Member¶

Syntax for NoPropClass¶

Syntax for Obj¶

Syntax for Pattern¶

Syntax for Postfix¶

Syntax for PostfixOp¶

Syntax for Prefix¶

Syntax for PrefixOp¶

Syntax for Prop¶

Syntax for PropClass¶

Syntax for PropName¶

Syntax for Props¶

Syntax for QuasiAst¶

Syntax for QuasiHole¶

Syntax for QuasiInner¶

Syntax for QuasiLeaf¶

Syntax for QuasiTree¶

Syntax for Quasis¶

Syntax for RawBlock¶

Syntax for RawCommaOp¶

Syntax for RegExp¶

Syntax for RegularDot¶

Syntax for ReservedWord¶

Syntax for SpecialDot¶

Syntax for Specialize¶

Syntax for Spread¶

Syntax for StringGroup¶

Syntax for StringGroupTagged¶

Syntax for StringHole¶

Syntax for StringPart¶

String interpolation¶

Strings may contain embedded expressions. When a string contains a ${ followed by an expression, followed by a }, the resulting string value is the concatenation of the content before, content from the expression, and the content after.

"foo ${ "bar" } baz"
== "foo bar baz"
// ✅

An empty interpolation contributes no characters, which means it may be used to embed meta-characters.

"$${}{}" == "\$\{\}"
// ✅

(This mostly comes in handy with tagged strings to give fine-grained control over what the tag receives.)

Empty interpolations can also be used to wrap a long string across multiple lines.

"A very long string ${
  // Breaking this string across multiple lines.
}that runs on and on"
== "A very long string that runs on and on"
// ✅

Empty interpolations also let you include spaces at the end of a line in a multi-quoted string.

"""
Line 1
Line 2 ${}
"""
== "Line 1\nLine 2 "
// ✅

Syntax for StringPartRaw¶

Parallels [ProductionNames.StringPart] but emits a [ValuePart] instead of routing a string token to [lang.temper.lexer.unpackQuotedString] so that the tag expression gets string content without escape sequences decoded.