This EEP extends the begin ... end expression to make it a construct
usable for control flow and value-based error handling based on pattern
matching.
This introduces begin ... else ... end along with a new contextual use
of the <- operator to replace or simplify deeply-nested case ... end
expressions, and prevent using exceptions for control flow.
This document has been placed in the public domain.
The current syntax for a begin ... end
expression is:
begin
Exprs
end
The expression does not have a restricted scope, and is mostly used to
group multiple distinct expressions as a single block. We propose a new
type of expression (denoted MatchOrReturnExprs), only valid within a
begin ... end expression:
begin
Exprs | MatchOrReturnExprs
end
MatchOrReturnExprs are defined as having the following form:
Pattern <- Expr
This definition means that MatchOrReturnExprs are only allowed at the
top-level of begin ... end expressions.
The <- operator takes the value returned by Expr and pattern matches it
against Pattern.
If the pattern matches, all variables from Pattern are bound in the local
environment, and the expression is equivalent to a successful
Pattern = Expr call. If the value does not match, the begin ... end
expression returns the non-matching value directly.
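The matching rule above can be pictured as nested case expressions in today's Erlang. The following is a minimal sketch under stated assumptions: the expansion is hand-written for illustration and is not the compilation scheme specified by this EEP, and the module name begin_sketch and the helpers step1/1 and step2/1 are hypothetical.

```erlang
-module(begin_sketch).
-export([run/1]).

%% Hand-written expansion (an assumption, not the EEP's specified
%% compilation scheme) of:
%%
%%   begin
%%       {ok, X} <- step1(In),
%%       {ok, Y} <- step2(X),
%%       {ok, X + Y}
%%   end
run(In) ->
    case step1(In) of
        {ok, X} ->
            case step2(X) of
                {ok, Y} -> {ok, X + Y};
                Other2 -> Other2      % failed match: value returned as-is
            end;
        Other1 -> Other1              % failed match: value returned as-is
    end.

%% Hypothetical steps used only for this illustration.
step1(N) when is_integer(N) -> {ok, N};
step1(_) -> {error, not_an_integer}.

step2(N) when N >= 0 -> {ok, N * 2};
step2(_) -> {error, negative}.
```

Each failed match short-circuits the rest of the block, and the value that failed to match becomes the value of the whole expression.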
A special case exists in which we extend begin ... end into the following form:
begin
Exprs | MatchOrReturnExprs
else
Pattern -> Exprs;
...
Pattern -> Exprs
end
This form exists to capture non-matching expressions in a MatchOrReturnExprs
to handle failed matches rather than returning their value. In such a case, an
unhandled failed match will raise an else_clause error, otherwise identical to
a case_clause error.
This extended form is useful to properly identify and handle successful and unsuccessful matches within the same construct without the risk of confusing happy and unhappy paths.
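As a rough illustration of the else form, the sketch below routes every non-matching value through a single handler and raises an error for values no clause accepts. It is an approximation in current Erlang, not the specified implementation: the module name begin_else_sketch, the helper lookup/1, and the exact {else_clause, Value} shape of the error reason are assumptions made by analogy with case_clause.

```erlang
-module(begin_else_sketch).
-export([run/1]).

%% Hand-written approximation of:
%%
%%   begin
%%       {ok, X} <- lookup(Key),
%%       {ok, X * 2}
%%   else
%%       {error, not_found} -> {ok, 0}
%%   end
run(Key) ->
    case lookup(Key) of
        {ok, X} -> {ok, X * 2};
        Failed -> handle_else(Failed)   % every failed match goes to else
    end.

%% The else clauses; an unhandled value raises instead of being returned.
handle_else({error, not_found}) -> {ok, 0};
handle_else(Other) -> erlang:error({else_clause, Other}).

%% Hypothetical lookup used only for this illustration.
lookup(a) -> {ok, 21};
lookup(b) -> {error, not_found};
lookup(_) -> unexpected.
```

Unlike the else-less form, a value that neither matches the pattern nor any else clause can no longer leave the block silently.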
Given the structure described here, the final expression may look like:
begin
Foo = bar(), % normal exprs still allowed
{ok, X} <- f(Foo),
[H|T] <- g([1,2,3]),
...
else
{error, Y} ->
{ok, "default"};
{ok, _Term} ->
{error, "unexpected wrapper"}
end
Do note that to allow easier pattern matching and more intuitive usage,
the <- operator should have a lower precedence than =, such that:
begin
X = [H|T] <- exp()
end
is a valid MatchOrReturnExprs equivalent to the non-infix form
'<-'('='(X, [H|T]), exp()), since reversing the priorities would give
'='(X, '<-'([H|T], exp())), which would create a MatchOrReturnExprs
out of context and be invalid.
In a nutshell, the matching rules for the <- operator should align closely
with the usage known in list comprehensions.
Erlang has some of the most flexible error handling available across a large number of programming languages. The language supports:
throw, error, exit
catch Exp
try ... [of ...] catch ... [after ...] end
exit/2, and trap_exit
{ok, Val} | {error, Term}, {ok, Val} | false, or ok | {error, Val}
So why should we look to add more? There are various reasons for this, including trying to reduce deeply nested conditional expressions, cleaning up some messy patterns found in the wild, and providing a better separation of concerns when implementing functions.
One common pattern that can be seen in Erlang is deep nesting of
case ... end expressions, to check complex conditionals.
Take the following code taken from Mnesia, for example:
commit_write(OpaqueData) ->
B = OpaqueData,
case disk_log:sync(B#backup.file_desc) of
ok ->
case disk_log:close(B#backup.file_desc) of
ok ->
case file:rename(B#backup.tmp_file, B#backup.file) of
ok ->
{ok, B#backup.file};
{error, Reason} ->
{error, Reason}
end;
{error, Reason} ->
{error, Reason}
end;
{error, Reason} ->
{error, Reason}
end.
The code is nested to the extent that shorter aliases must be introduced
for variables (OpaqueData renamed to B), and half of the code just
transparently returns the exact values each function was given.
By comparison, the same code could be written as follows with the new construct:
commit_write(OpaqueData) ->
begin
ok <- disk_log:sync(OpaqueData#backup.file_desc),
ok <- disk_log:close(OpaqueData#backup.file_desc),
ok <- file:rename(OpaqueData#backup.tmp_file, OpaqueData#backup.file),
{ok, OpaqueData#backup.file}
end.
Or, to protect against disk_log calls returning something other than
ok | {error, Reason}, the following form could be used:
commit_write(OpaqueData) ->
begin
ok <- disk_log:sync(OpaqueData#backup.file_desc),
ok <- disk_log:close(OpaqueData#backup.file_desc),
ok <- file:rename(OpaqueData#backup.tmp_file, OpaqueData#backup.file),
{ok, OpaqueData#backup.file}
else
{error, Reason} -> {error, Reason}
end.
The semantics of these calls are identical, except that it is now much easier to focus on the flow of individual operations and either success or error paths.
Frequent ways in which people work with sequences of failable operations include folds over lists of functions, and abusing list comprehensions. Both patterns have heavy weaknesses that make them less than ideal.
Folds over lists of functions use patterns such as those defined in posts from the mailing list:
pre_check(Action, User, Context, ExternalThingy) ->
Checks =
[fun check_request/1,
fun check_permission/1,
fun check_dispatch_target/1,
fun check_condition/1],
Args = {Action, User, Context, ExternalThingy},
Harness =
fun
(Check, ok) -> Check(Args);
(_, Error) -> Error
end,
case lists:foldl(Harness, ok, Checks) of
ok -> dispatch(Action, User, Context);
Error -> Error
end.
This code requires declaring the functions one by one, ensuring the entire context is carried from function to function. Since there is no shared scope between functions, all functions must operate on all arguments.
By comparison, the same code could be implemented with the new construct as:
pre_check(Action, User, Context, ExternalThingy) ->
begin
ok <- check_request(Context, User),
ok <- check_permissions(Action, User),
ok <- check_dispatch_target(ExternalThingy),
ok <- check_condition(Action, Context),
dispatch(Action, User, Context)
end.
And if there was a need for derived state between any two steps, it would be easy to weave it in:
pre_check(Action, User, Context, ExternalThingy) ->
begin
ok <- check_request(Context, User),
ok <- check_permissions(Action, User),
ok <- check_dispatch_target(ExternalThingy),
DispatchData <- dispatch_target(ExternalThingy),
ok <- check_condition(Action, Context),
dispatch(Action, User, Context)
end.
The list comprehension hack, by comparison, is a bit more rare. In fact, it is mostly theoretical. Some things that hint at how it could work can be found in Diameter test cases or the PropEr plugin for Rebar3.
Its overall form uses generators in list comprehensions to tunnel a happy path:
[Res] =
[f(Z) || {ok, W} <- [b()],
{ok, X} <- [c(W)],
{ok, Y} <- [d(X)],
Z <- [e(Y)]],
Res.
This form doesn’t see too much usage since it is fairly obtuse and I suspect most people have either been reasonable enough not to use it, or did not think about it. Obviously the new form would be cleaner:
begin
{ok, W} <- b(),
{ok, X} <- c(W),
{ok, Y} <- d(X),
Z = e(Y),
f(Z)
end
which on top of it, has the benefit of returning an error value if one is found.
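To make the contrast concrete, here is a small runnable sketch of what happens to the list comprehension trick when a step fails. The module name lc_hack and the helper parse/1 are hypothetical and exist only for this illustration.

```erlang
-module(lc_hack).
-export([run/1]).

%% The list-comprehension "happy path" trick: a generator whose pattern
%% does not match silently drops the element, so a failure leaves an
%% empty list and the enclosing [Res] = ... match raises badmatch.
run(Input) ->
    [Res] = [X * 2 || {ok, X} <- [parse(Input)]],
    Res.

%% Hypothetical helper for this illustration.
parse(N) when is_integer(N) -> {ok, N};
parse(_) -> {error, not_an_integer}.
```

On failure the caller only sees a badmatch on an empty list; the original {error, not_an_integer} term is discarded, whereas the proposed construct would return it.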
This form is not necessarily obvious at a first glance. To better
expose it, let’s take a look at some functions defined in the
release_handler module in OTP:
write_releases_m(Dir, NewReleases, Masters) ->
RelFile = filename:join(Dir, "RELEASES"),
Backup = filename:join(Dir, "RELEASES.backup"),
Change = filename:join(Dir, "RELEASES.change"),
ensure_RELEASES_exists(Masters, RelFile),
case at_all_masters(Masters, ?MODULE, do_copy_files,
[RelFile, [Backup, Change]]) of
ok ->
case at_all_masters(Masters, ?MODULE, do_write_release,
[Dir, "RELEASES.change", NewReleases]) of
ok ->
case at_all_masters(Masters, file, rename,
[Change, RelFile]) of
ok ->
remove_files(all, [Backup, Change], Masters),
ok;
{error, {Master, R}} ->
takewhile(Master, Masters, file, rename,
[Backup, RelFile]),
remove_files(all, [Backup, Change], Masters),
throw({error, {Master, R, move_releases}})
end;
{error, {Master, R}} ->
remove_files(all, [Backup, Change], Masters),
throw({error, {Master, R, update_releases}})
end;
{error, {Master, R}} ->
remove_files(Master, [Backup, Change], Masters),
throw({error, {Master, R, backup_releases}})
end.
At a glance, it is very difficult to clean up this code: there are 3 multi-node operations (backing up, updating, and moving release data), each of which relies on the previous one to succeed.
You’ll also notice that each error requires special handling, reverting or removing specific operations on success or on failure. This is not a simple question of tunnelling values in and out of a narrow scope.
Another thing to note is that this module, as a whole (and not just the
snippet presented here) uses throw expressions to operate non-local
return. The actual point of return handling these is spread through
various locations in the file: create_RELEASES/4, and
write_releases_1/3 for example.
The case catch Exp of form is used throughout the file because
value-based error flow is painful in nested structures.
So let’s take a look at how we could refactor this with the new construct:
write_releases_m(Dir, NewReleases, Masters) ->
RelFile = filename:join(Dir, "RELEASES"),
Backup = filename:join(Dir, "RELEASES.backup"),
Change = filename:join(Dir, "RELEASES.change"),
begin
ok <- backup_releases(Dir, NewReleases, Masters, Backup, Change,
RelFile),
ok <- update_releases(Dir, NewReleases, Masters, Backup, Change),
ok <- move_releases(Dir, NewReleases, Masters, Backup, Change, RelFile)
end.
backup_releases(Dir, NewReleases, Masters, Backup, Change, RelFile) ->
case at_all_masters(Masters, ?MODULE, do_copy_files,
[RelFile, [Backup, Change]]) of
ok ->
ok;
{error, {Master, R}} ->
remove_files(Master, [Backup, Change], Masters),
{error, {Master, R, backup_releases}}
end.
update_releases(Dir, NewReleases, Masters, Backup, Change) ->
case at_all_masters(Masters, ?MODULE, do_write_release,
[Dir, "RELEASES.change", NewReleases]) of
ok ->
ok;
{error, {Master, R}} ->
remove_files(all, [Backup, Change], Masters),
{error, {Master, R, update_releases}}
end.
move_releases(Dir, NewReleases, Masters, Backup, Change, RelFile) ->
case at_all_masters(Masters, file, rename, [Change, RelFile]) of
ok ->
remove_files(all, [Backup, Change], Masters),
ok;
{error, {Master, R}} ->
takewhile(Master, Masters, file, rename, [Backup, RelFile]),
remove_files(all, [Backup, Change], Masters),
{error, {Master, R, move_releases}}
end.
The only reasonable way to rewrite the code was to extract all three major multi-node operations into distinct functions.
The improvements are:
throw() would be needed in write_releases_m,
therefore separating the flow control details from specific function
implementations.
As a control experiment, let’s try reusing our shorter functions with the previous flow:
%% Here is the same done through exceptions:
write_releases_m(Dir, NewReleases, Masters) ->
RelFile = filename:join(Dir, "RELEASES"),
Backup = filename:join(Dir, "RELEASES.backup"),
Change = filename:join(Dir, "RELEASES.change"),
try
ok = backup_releases(Dir, NewReleases, Masters, Backup, Change,
RelFile),
ok = update_releases(Dir, NewReleases, Masters, Backup, Change),
ok = move_releases(Dir, NewReleases, Masters, Backup, Change, RelFile)
catch
{error, Reason} -> {error, Reason}
end.
backup_releases(Dir, NewReleases, Masters, Backup, Change, RelFile) ->
case at_all_masters(Masters, ?MODULE, do_copy_files,
[RelFile, [Backup, Change]]) of
ok ->
ok;
{error, {Master, R}} ->
remove_files(Master, [Backup, Change], Masters),
throw({error, {Master, R, backup_releases}})
end.
update_releases(Dir, NewReleases, Masters, Backup, Change) ->
case at_all_masters(Masters, ?MODULE, do_write_release,
[Dir, "RELEASES.change", NewReleases]) of
ok ->
ok;
{error, {Master, R}} ->
remove_files(all, [Backup, Change], Masters),
throw({error, {Master, R, update_releases}})
end.
move_releases(Dir, NewReleases, Masters, Backup, Change, RelFile) ->
case at_all_masters(Masters, file, rename, [Change, RelFile]) of
ok ->
remove_files(all, [Backup, Change], Masters),
ok;
{error, {Master, R}} ->
takewhile(Master, Masters, file, rename, [Backup, RelFile]),
remove_files(all, [Backup, Change], Masters),
throw({error, {Master, R, move_releases}})
end.
Very little changes in the three distributed functions. However, the weakness
of this approach is that we have intimately tied implementation details of the
small functions to their parent’s context. This makes it hard to reason about
these functions in isolation or to reuse them in a different context.
Furthermore, the parent function may capture throws not intended for it.
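The capture problem can be reproduced in a few lines. This is a minimal sketch with hypothetical names (throw_capture, step/0, helper/0), not code from release_handler.

```erlang
-module(throw_capture).
-export([run/0]).

%% The parent intends to handle only errors signalled by its own steps...
run() ->
    try
        ok = step(),
        ok = helper()
    catch
        %% A bare pattern in a catch clause matches the throw class.
        {error, Reason} -> {error, Reason}
    end.

step() -> ok.

%% ...but an unrelated throw raised deeper in the call chain has the same
%% shape, so the parent's catch clause intercepts it as well.
helper() ->
    throw({error, internal_lookup_miss}).
```

Nothing in the parent distinguishes a throw meant as its own non-local return from one that merely passes through, which is the coupling the value-based approach avoids.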
It is my opinion that using value-based flow control, through similar refactorings, yields safer and cleaner code, which also happens to have far more reduced levels of nesting. It should therefore be possible to express more complex sequences of operations without making them any harder to read, nor reason about in isolation.
That is in part due to the nesting, but also because we take a more compositional approach, where there is no need to tie local functions’ implementation details to the complexity of their overall pipeline and execution context.
It is also the best way to structure code in order to handle all exceptions and to provide the context they need as close as possible to their source, and as far as possible from the integrated flow.
This section will detail the decision-making behind this EEP, including:
else Block
begin ... end as a construct and its scope
{badunwrap, Val} as a default exception
There’s a lot of content to cover here.
Multiple languages have value-based exception handling, many of which have a strong functional slant.
The most famous case is possibly Haskell with the Maybe monad, which
uses either Nothing (meaning the computation returned nothing) or
Just x (their type-based equivalent of {ok, X}). The union of both
types is denoted Maybe x. The following examples are taken from
Haskell/Understanding monads/Maybe.
Values for such errors are tagged in functions as follows:
safeLog :: (Floating a, Ord a) => a -> Maybe a
safeLog x
| x > 0 = Just (log x)
| otherwise = Nothing
Using the type annotations directly, it is possible to extract values (if any) through pattern matching:
zeroAsDefault :: Maybe Int -> Int
zeroAsDefault mx = case mx of
Nothing -> 0
Just x -> x
One thing to note here is that as long as you are not able to find a
value to substitute for Nothing or that you cannot take a different
branch, you are forced to carry that uncertainty with you through all
the types in the system.
This is usually where Erlang stops. You have the same possibilities (albeit dynamically checked), along with the possibility of transforming invalid values into exceptions.
Haskell, by comparison, offers monadic operations and its do notation to abstract over things:
getTaxOwed name = do
number <- lookup name phonebook
registration <- lookup number governmentDatabase
lookup registration taxDatabase
In this snippet, even though the lookup function returns a Maybe x
type, the do notation abstracts away the Nothing values, letting the
programmer focus on the x part of Just x. Even though the code is
written as if we can operate on discrete values, the function
automatically re-wraps its result into Just x and any Nothing value
just bypasses operations.
As such, the developer is forced to acknowledge that the whole function’s flow is conditional to values being in place, but they can nevertheless write it mostly as if everything were discrete.
OCaml supports exceptions, with constructs such as raise (Type "value")
to raise an exception, and try ... with ... to handle them.
However, since exceptions wouldn’t be tracked by the type system,
maintainers introduced a Result type.
The type is defined as
type ('a, 'b) result =
| Ok of 'a
| Error of 'b
which is reminiscent of Erlang’s {ok, A} and {error, B}. OCaml users
appear to mostly use pattern matching, combinator libraries, and monadic
binding to deal with value-based error handling, something similar to
Haskell’s usage.
Rust defines two types of errors: unrecoverable ones (using panic!)
and recoverable ones, using the Result<T, E> values. The latter is of
interest to us, and defined as:
enum Result<T, E> {
Ok(T),
Err(E),
}
Which would intuitively translate to Erlang terms {ok, T} and
{error, E}. The simple way to handle these in Rust is through pattern matching:
let f = File::open("eep.txt");
match f {
Ok(file) => do_something(file),
Err(error) => {
panic!("Error in file: {:?}", error)
},
};
Specific error values have to be well-typed, and it seems that the Rust community is still debating implementation details about how to best get composability and annotations within a generic type.
However, their workflow for handling these is well-defined already. This
pattern matching form has been judged too cumbersome. To automatically
panic on error values, the .unwrap() method is added:
let f = File::open("eep.txt").unwrap();
In Erlang, we could approximate this with:
unwrap({ok, X}) -> X;
unwrap({error, T}) -> exit(T).
F = unwrap(file:open("eep.txt", Opts)).
Another construct exists to return errors to caller code more directly,
without panics, with the ? operator:
fn read_eep() -> Result<String, io::Error> {
let mut h = File::open("eep.txt")?;
let mut s = String::new();
h.read_to_string(&mut s)?;
Ok(s)
}
Any value Ok(T) encountering ? is unwrapped. Any value Err(E)
encountering ? is returned to the caller as-is, as if a match with
return had been used. This operator however requires that the
function’s type signature use the Result<T, E> type as a return value.
Prior to version 1.13, Rust used the try!(Exp) macro to the same
effect, but found it too cumbersome. Compare:
try!(try!(try!(foo()).bar()).baz())
foo()?.bar()?.baz()?
Swift supports exceptions, along with type annotations declaring that a
function may raise exceptions, and do ... catch blocks.
There is a special operator try? which catches any thrown exception
and turns it into nil:
func someThrowingFunction() throws -> Int {
// ...
}
let x = try? someThrowingFunction()
Here x can either have a value of Int or nil. The data flow is
often simplified by using let assignments in a conditional expression:
func fetchEep() -> Eep? {
if let x = try? fetchEepFromDisk() { return x }
if let x = try? fetchEepFromServer() { return x }
return nil
}
Go has some fairly anemic error handling. It has panics, and error values. Error values must be assigned (or explicitly ignored) but they can be left unchecked and cause all kinds of issues.
Nevertheless, Go exposed plans for new error handling in future versions, which can be interesting.
Rather than changing semantics of their error handling, Go designers are mostly considering syntactic changes to reduce the cumbersome nature of their errors.
Go programs typically handled errors as follows:
func main() {
hex, err := ioutil.ReadAll(os.Stdin)
if err != nil {
log.Fatal(err)
}
data, err := parseHexdump(string(hex))
if err != nil {
log.Fatal(err)
}
os.Stdout.Write(data)
}
The new proposed mechanism looks as follows:
func main() {
handle err {
log.Fatal(err)
}
hex := check ioutil.ReadAll(os.Stdin)
data := check parseHexdump(string(hex))
os.Stdout.Write(data)
}
The check keyword asks to implicitly check whether the second return
value err is equal to nil or not. If it is not equal to nil, the
latest defined handle block is called. It can return the result out to
exit the function, repair some values, or simply panic, to name a few
options.
Elixir has a slightly different semantic approach to error handling compared
to Erlang. Exceptions are discouraged for control flow (while Erlang
specifically uses throw for it), and the with macro is introduced:
with {:ok, var} <- some_call(),
{:error, _} <- fail(),
{:ok, x, y} <- parse_name(var)
do
success(x, y, var)
else
{:error, err} -> handle(err)
nil -> {:error, nil}
end
The macro allows a sequence of pattern matches, after which the do ... block is called. If any of the pattern matches fails, the failing value gets re-matched in the optional else ... end section.
This is the most general control flow in this document, being fully flexible with regards to which values it can handle. This was done in part because there is not a strong norm regarding error or valid values in either the Erlang or Elixir APIs, at least compared to other languages here.
This high level of flexibility has been criticized in some instances as being a bit confusing: it is possible for users to make error-only flows, success-only flows, mixed flows, and consequently the else clause can become convoluted.
The OK library was released to
explicitly narrow the workflow to well-defined errors. It supports three forms,
the first of which is the for block:
OK.for do
user <- fetch_user(1)
cart <- fetch_cart(1)
order = checkout(cart, user)
saved_order <- save_order(order)
after
saved_order
end
It works by only matching on {:ok, val} to keep moving forwards when
using the <- operator: the fetch_user/1 function above must return
{:ok, user} in order for the code to proceed. The = operator is
allowed for pattern matches the same way it usually is within Elixir.
Any return value that matches {:error, t} ends up returning directly
out of the expression. The after ... end section takes the last value
returned, and if it isn’t already in a tuple of the form {:ok, val}, it
wraps it as such.
The second variant is the try block:
OK.try do
user <- fetch_user(1)
cart <- fetch_cart(1)
order = checkout(cart, user)
saved_order <- save_order(order)
after
saved_order
rescue
:user_not_found -> {:error, missing_user}
end
This variant will capture exceptions as well (in the rescue block),
and will not re-wrap the final return value in the after section.
The last variant for the library is the pipe:
def get_employee_data(file, name) do
{:ok, file}
~>> File.read
~> String.upcase
end
The goal of this variant is to simply thread together operations that
could result in either a success or error. The ~>> operator matches
and returns an {:ok, term} tuple, and the ~> operator wraps a value
into an {:ok, term} tuple.
In Erlang, true and false are regular atoms that only gained special
status through usage in boolean expressions. It would be easy to think
that more functions would return yes and no were it not for control
flow constructs.
Similarly, undefined has over years of use become a kind of default
“not found” value. Values such as nil, null, unknown, undef,
false and so on have seen some use, but a strong consistency in format
has ended up aligning the community on one value.
When it comes to return values for various functions, {ok, Term} is
the most common one for positive results that need to communicate a
value, ok for positive results with no other value than their own
success, and {error, Term} is most often used for errors. Pattern
matching and assertions have enforced that it is easy to know whether a
call worked or not by its own structure.
However, many success values are still larger tuples:
{ok, Val, Warnings}, {ok, Code, Status, Headers, Body}, and so on. Such
variations are not problematic on their own, but it would likely not
hurt too much either to use {ok, {Val, Warnings}} or
{ok, {Code, Status, Headers, Body}}.
Using more standard forms could lead to easier generalizations and abstractions that can be applied to community-wide code. By choosing specific formats for control flow on value-based error handling, we would explicitly encourage this form of standardization.
That being said, the variety of formats existing and the low amount of strict values being used would mean that forcing normalization calls for a potential loss of flexibility in future language decisions. For example, EEP-54—completed before final revisions of this RFC—tries to add new forms of context to error reports, and various libraries already rely on these richer patterns.
It is therefore the opinion of the OTP technical board that we should not
normalize error-return values. As such, an approach closer to Elixir’s with
has been proposed, although this EEP’s approach is more general in terms of
sequences of acceptable expressions and their composition.
Avoiding normalization on error and good values introduces the need for the
else ... end sub-block to prevent edge cases.
Let’s look at the following type of expression to see why:
begin
{ok, {X,Y}} <- id({ok, {X,Y}})
...
end
While this mechanism is fine for skipping patterns, it has some problematic weaknesses in the context of error handling.
One example of this could be taken from the OTP pull request that adds a new return value to packet reading based on inet options: #1950.
This PR adds a possible value for packet reception to the prior form:
{ok, {PeerIP, PeerPort, Data}}
To make it possible to alternatively get:
{ok, {PeerIP, PeerPort, AncData, Data}}
Based on socket options set earlier. So let’s put it in context for the current proposal:
begin
{ok, {X,Y}} <- id({ok, {X,Y}}),
{ok, {PeerIP, PeerPort, Data}} <- gen_udp:recv(...),
...
end
Since we force a return on any non-matching value, the whole expression, if
the socket is misconfigured to return AncData, would return
{ok, {PeerIP, PeerPort, AncData, Data}} on a failure to match.
Basically, an unexpected but good result could be returned from a
function using the begin ... end construct, which would look like a
success while it was actually a complete failure to match and handle the
information given. This is made even more ambiguous when data has the
right shape and type, but a set of bound variables ultimately define
whether the match succeeds or fails (in the case of a UDP socket,
returning values that come from the wrong peer, for example).
In the worst cases, it could let raw unformatted data exit a conditional
pipeline with no way to detect it after the fact, particularly if later
functions in begin ... end apply transformations to text, such as
anonymizing or sanitizing data. This could be pretty unsafe
and near impossible to debug well.
Think for example of:
-spec fetch() -> {ok, iodata()} | {error, _}.
fetch() ->
begin
{ok, B = <<_/binary>>} <- f(),
true <- validate(B),
{ok, sanitize(B)}
end.
If the value returned from f() turns out to be a list (say it’s a
misconfigured socket using list instead of binary as an option), the
expression will return early, the fetch() function will still return
{ok, iodata()} but you couldn’t know as a caller whether it is the
transformed data or non-matching content. It would not be obvious to
most developers either that this could represent a major security risk
by allowing unexpected data to be seen as clean data.
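The leak can be observed with a hand-expanded version of the else-less fetch/0 shown earlier. This is a sketch under assumptions: the module name fetch_leak is hypothetical, fetch/1 takes the value that f() would have returned so both paths can be exercised, and validate/1 and sanitize/1 are stand-ins invented for the illustration.

```erlang
-module(fetch_leak).
-export([fetch/1]).

%% Hand-expanded form of the else-less fetch/0, parameterised on the
%% value f() would return.
fetch(FromF) ->
    case FromF of
        {ok, B = <<_/binary>>} ->
            case validate(B) of
                true -> {ok, sanitize(B)};
                Other -> Other
            end;
        Unmatched -> Unmatched   % e.g. {ok, "raw list"} leaves untouched
    end.

%% Hypothetical stand-ins for validate/1 and sanitize/1.
validate(B) -> byte_size(B) > 0.
sanitize(B) -> binary:replace(B, <<"<">>, <<"&lt;">>, [global]).
```

Both calls below return an {ok, _} tuple, but only the binary input has been sanitized; the list input comes back exactly as it went in:

fetch_leak:fetch({ok, <<"<b>">>}) and fetch_leak:fetch({ok, "<b>"})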
This specific type of error is in fact possible in Elixir, but no such
warning appears to have been circulating within its community so far.
The issue is to be handled with an else block which this proposal
reuses to clamp down on unexpected values:
-spec fetch() -> {ok, iodata()} | {error, _}.
fetch() ->
begin
{ok, B = <<_/binary>>} <- f(),
true <- validate(B),
{ok, sanitize(B)}
else
false -> {error, invalid_data};
{error, R} -> {error, R}
end.
Here misconfigured sockets won’t result in unchecked data passing through
your app; any invalid use case is captured, and if the value for B turns
out to be a list, an else_clause error is raised with the bad value.
Unless the clause is mandatory (it is not in Elixir and we do not plan it here
either for compatibility reasons with existing begin ... end expressions),
this level of additional matching is purely optional; the developer has no
obvious incentive to go and handle these errors, and if they do, the exception
raised will be through a missing clause in the else section, which will
obscure its origin and line number.
We will therefore have to rely on education and documentation (along with type analysis) to prevent such issues from arising in the future.
These problems would not exist with normalized error and return values as those used in statically-typed languages, but since we do not intend to normalize values, the else block is a necessary workaround.
begin ... end Expressions
Abstractions over error flow require defining a scope limiting the
way flow is controlled. Before choosing the begin ... end expression,
the following items needed consideration:
begin ... end
else keyword
In the languages mentioned earlier, two big error handling categories seem to emerge.
The first group of languages seems to track their error handling at the
function level. For example, Go uses return to return early from the
current function. Swift and Rust also scope their error handling
abstractions to the current function, but they also make use of their
type signatures to keep information about the control flow
transformations taking place. Rust uses the Result<T, E> type
signature to define what operations are valid, and Swift asks of
developers that they either handle the error locally, or annotate the
function with throws to make things explicit.
On the other hand, Haskell’s do notation is restricted to specific expressions, and so are all of Elixir’s mechanisms.
Erlang, Haskell, and Elixir all primarily use recursion as an iteration
mechanism, and (outside of Haskell’s monadic constructs) do not support
return control flow; it is conceptually more difficult for a return
(or break) to be useful when iteration requires recursion:
“returning” by exiting the current flow may not bail you out of what the
programmer might consider a loop, for example.
Instead, Erlang would use throw() exceptions as a control flow
mechanism for non-local return, along with a catch or a
try ... catch. Picking a value-based error handling construct that acts at the
function level would not necessarily be very interesting since almost
any recursive procedure would still require using exceptions.
As such, it feels simpler to use a self-contained construct built to specifically focus on sequences of operations that contain value-based errors.
Prior attempts at abstracting value-based error handling in Erlang overloaded special constructs with parse transforms in order to provide specific workflows.
For example, the fancyflow library tried to abstract the following code:
sans_maybe() ->
case file:get_cwd() of
{ok, Dir} ->
case
file:read_file(
filename:join([Dir, "demo", "data.txt"]))
of
{ok, Bin} ->
{ok, {byte_size(Bin), Bin}};
{error, Reason} ->
{error, Reason}
end;
{error, Reason} ->
{error, Reason}
end.
as:
-spec maybe() -> {ok, non_neg_integer()} | {error, term()}.
maybe() ->
[maybe](undefined,
file:get_cwd(),
file:read_file(filename:join([_, "demo", "data.txt"])),
{ok, {byte_size(_), _}}).
And Erlando would replace:
write_file(Path, Data, Modes) ->
Modes1 = [binary, write | (Modes -- [binary, write])],
case make_binary(Data) of
Bin when is_binary(Bin) ->
case file:open(Path, Modes1) of
{ok, Hdl} ->
case file:write(Hdl, Bin) of
ok ->
case file:sync(Hdl) of
ok ->
file:close(Hdl);
{error, _} = E ->
file:close(Hdl),
E
end;
{error, _} = E ->
file:close(Hdl),
E
end;
{error, _} = E -> E
end;
{error, _} = E -> E
end.
With monadic constructs in list comprehensions:
write_file(Path, Data, Modes) ->
Modes1 = [binary, write | (Modes -- [binary, write])],
do([error_m ||
Bin <- make_binary(Data),
Hdl <- file:open(Path, Modes1),
Result <- return(do([error_m ||
file:write(Hdl, Bin),
file:sync(Hdl)])),
file:close(Hdl),
Result]).
Those cases specifically aimed for a way to write sequences of operations where pre-defined semantics are bound by a special context, but are limited to overloading constructs rather than introducing new ones.
By comparison, most of Erlang’s control flow expressions follow similar structures. See the following most common ones:
case ... of
    Pattern [when Guard] -> Expressions
end
if
    Guard -> Expressions
end
begin
    Expressions
end
receive
    Pattern [when Guard] -> Expressions
after % optional
    IntegerExp -> Expressions
end
try
    Expressions
of % optional
    Pattern [when Guard] -> Expressions
catch % optional
    ExceptionPattern [when Guard] -> Expressions
after % optional
    Expressions
end
It therefore logically follows that if we were to add a new construct, it should be of the form
<keyword>
...
end
The questions remaining are: which keyword to choose, and which clauses to support.
begin ... end
Initially, a format similar to Elixir’s with
expression was being
considered:
<keyword>
Expressions | UnwrapExpressions
of % optional
Pattern [when Guard] -> Expressions
end
With this construct, the basic <keyword> ... end
form would follow the
currently proposed semantics, but the of ...
section would allow
pattern matching on any return value from the expression, whether
{error, Reason}
or any non-exception value returned by the last
expression in the main section.
This form would be in line with what try ... of ... catch ... end
allows: once the main section is covered, more work can be done within
the same construct.
However, try ... of ... catch ... end
has a specific reason for
introducing the patterns and guards: protected code impacting tail
recursion.
In a loop such as:
map_nocrash(_, []) -> [];
map_nocrash(F, [H|T]) ->
    try
        F(H)
    of
        Val -> [Val | map_nocrash(F, T)]
    catch
        _:_ -> map_nocrash(F, T)
    end.
The of section allows work to continue when no exception has been
raised, without having to protect more than the current scope of
the function, and without preventing tail recursion by forcing each
iteration to remain on the stack.
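The point about the of section can be sketched with an accumulator-based loop. This is only an illustration: the module name try_of_sketch and the function count_ok are hypothetical and do not appear elsewhere in this document.

```erlang
-module(try_of_sketch).
-export([count_ok/2]).

%% Counts how many elements of List the fun F can process without raising.
count_ok(F, List) ->
    count_ok(F, List, 0).

count_ok(_, [], Acc) ->
    Acc;
count_ok(F, [H | T], Acc) ->
    try
        F(H)                          % only this call is protected
    of
        _ -> count_ok(F, T, Acc + 1)  % runs outside the protected section
    catch
        _:_ -> count_ok(F, T, Acc)    % failure: skip the element
    end.
```

Because the recursive call sits in the of section rather than in the protected body, the exception handler does not need to stay active across iterations.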
No such concerns exist for value-based error handling, and while the
of ... end section might be convenient at times, it is not strictly
necessary for the construct to be useful.
What was left was to choose a name. Initially, the <keyword> value
chosen was maybe, based on the Maybe monad. The problem is that
introducing any new keyword carries severe risks to backwards
compatibility.
For example, all of the following words were considered:
======= ================= =========================================
Keyword Times used in OTP Rationale
        as a function
======= ================= =========================================
maybe   0                 can clash with existing used words,
                          otherwise respects the spirit
option  88                definitely clashes with existing code
opt     68                definitely clashes with existing code
check   49                definitely clashes with existing code
let     0                 word is already reserved and free, but
                          makes no sense in context
cond    0                 word is already reserved and free, may
                          make sense, but would prevent the
                          addition of a conditional expression
given   0                 could work, kind of respects the context
when    0                 reserved for guards, could hijack in new
                          context but may be confusing
begin   0                 carries no conditional meaning, mostly
                          free for overrides
======= ================= =========================================
Initially, this proposal expected to use the maybe
keyword:
maybe
Pattern <op> Exp,
...
of
Pattern -> Exp % optional
end
but for the reasons mentioned in the previous section, the of ...
section became non-essential.
Then, with the strong requirements for backwards compatibility making it
difficult to introduce new keywords, along with the possibility of reusing
begin without changing any of its current behavior, this form became the
most interesting one.
The term begin is also reminiscent of transactions and abortive
contexts, which means that although not an ideal fit for value-based
error flow, it is also not entirely outlandish and could accept the
newly added optional semantics without being too out of place.
else keyword
The first step here was looking at all the existing alternative reserved
keywords: of, when, cond, catch, after.
None of these actually conveys the sense of requiring an alternative clause to
the construct, and so we require adding a new one. The else
keyword is
tempting if only because it opens the door to introducing it as a reserved word
in if
expressions at a later date.
A quick search of the OTP code base finds no else() function, so the
keyword should be relatively safe to use in general.
In order to form MatchOrReturnExprs, there is a need for a mechanism to
introduce pattern matching with distinct semantics from regular pattern
matching.
A naive parse transform approach with fake function calls would be the most basic way to go:
begin
match_or_return(Pattern, Exp),
% variables bound in Pattern are available in scope
...
end
However, this would introduce pattern matches in non-left-hand-side positions and make nesting really weird to deal with without exposing parse transform details and knowing how the code is translated.
A prefix keyword such as let <Pattern> = <Exp> could also be used.
Such keywords unfortunately suffer the same issues as maybe would
have, and let typically has different implications.
An infix operator seems like a good fit since pattern matching already uses them in multiple forms:
- = is used for pattern matches. Overloading it in error flow would
  prevent regular matching from being used
- := is used for maps; using it could work, but would certainly be
  confusing when handling nested maps in a pattern
- <- could make sense. It is already restricted in scope to list and
  binary comprehensions and would therefore not clash nor be confused.
  The existing semantics of the operator imply a literal
  pattern match working like a filter, which is what we are looking for.
- <= same as <- but for binary generators
The <- operator makes the most sense and shouldn’t be confusing for
anyone.
For completeness’s sake, I also checked for alternative operators
in a prior version of this EEP that introduced prescriptive values
for {ok, T} | {error, R}
, which had distinct semantics:
======== ===========================================================
Operator Description
======== ===========================================================
#=       no clash with other syntax (maps, records, integers), no
         clash with abstract patterns EEP either.
!=       No clash with message passing, but is sure to annoy anyone
         used to C-style inequality checks
<~       Works with no known conflict; shouldn't clash with ROK's
         frame proposals (uses infix ~ and < > as delimiters).
<|       reverse pipe operator. No obvious clash either
======== ===========================================================
The <-
operator from list comprehensions is the most adequate
option, both in terms of simplicity and cognitive costs.
Within the expected usage of MatchOrReturnExprs, the <- operator
needs to have a precedence rule such that:
X = {Y,X} <- <Exp>
is considered a valid pattern match operation with X = {Y,X} being the
whole left-hand-side pattern, such that operation priorities are:
lhs <- rhs
instead of
lhs = rhs <- <...>
In all other regards, the precedence rules should be the same as = in
order to provide the most unsurprising experience possible.
Other approaches were considered in making this proposal, and ultimately rejected.
An earlier version of this document simply used:
begin
Foo = bar(),
X <~ id({ok, 5}),
[H|T] <~ id({ok, [1,2,3]}),
...
end
This form implicitly unpacked {ok, T} = f() by calling T <~ f(), and forced
all acceptable non-matching values to be of the form {error, T}.
To make the form useful to most existing code, it also required some magic
that nobody (myself included) liked very much: _ <~ f() would implicitly
succeed if the return value of f() was ok.
This was judged to be too magical, and not much existing Erlang code would
necessarily have benefited from the form, since ok is often returned by
successful functions without an extra value. Avoiding the magic would have
required a stronger prescription of the form {ok, undefined} (to replicate
Rust’s Ok(())), which would have felt very unidiomatic.
with
The Elixir approach is fairly comprehensive, and rather powerful. Rather than handling success or errors, it generalizes over pattern matching as a whole, as we do here.
The one difference is that Elixir’s with
expression forces all
conditionals to happen first, with a do
block for the free-form
expressions that follow:
with dob <- parse_dob(params["dob"]),
     name <- parse_name(params["name"])
do
  %User{dob: dob, name: name}
else
  err -> err
end
The Erlang form introduced in this document is more general since it
allows mixing MatchOrReturnExprs
and regular expressions throughout,
without the need for a general do
block.
The Erlang form does imply a likely more complex set of rewriting rules when translating from the AST form to Core Erlang. It should be possible to rewrite it purely in existing Core Erlang terms, although the end result may not look like the original code at all.
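To give a feel for such a rewrite, here is a hand-written sketch of how a small begin ... end block with two MatchOrReturnExprs could be expressed with nested case expressions. It is not the actual translation, which has yet to be defined. The module name desugar_sketch, the function read_demo, and the GetCwd and ReadFile funs are hypothetical names used only for illustration.

```erlang
-module(desugar_sketch).
-export([read_demo/2]).

%% Proposed source form (not accepted by existing compilers):
%%
%%   begin
%%       {ok, Dir} <- GetCwd(),
%%       {ok, Bin} <- ReadFile(Dir),
%%       {ok, {byte_size(Bin), Bin}}
%%   end
%%
%% One possible equivalent using only existing constructs.
read_demo(GetCwd, ReadFile) ->
    case GetCwd() of
        {ok, Dir} ->
            case ReadFile(Dir) of
                {ok, Bin} ->
                    {ok, {byte_size(Bin), Bin}};
                Failed2 ->
                    Failed2   % non-matching value is returned as-is
            end;
        Failed1 ->
            Failed1           % non-matching value is returned as-is
    end.
```

Each MatchOrReturnExpr adds one level of nesting in this sketch, which is why the rewritten code may bear little resemblance to the flat source form.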
cond and cond let
Anthony Ramine recommended looking into reusing the already reserved
cond
and let
keywords. He mentioned Rust planning something based on
these and how it could be ported to Erlang based on his prior work on
supporting the cond
construct within the language.
The proposed mechanism would look like:
cond
    X > 5 -> % regular guard
        Exp;
    f() < 18 -> % function used in guard, as originally planned
        Exp;
    let {ok, Y} = exp(), Y < 5 ->
        Exp
end
The last clause would allow Y
to be used in its own branch only if it
matches and all guards succeed; if the binding fails, a switch is
automatically made to the next branch.
As such, more complex sequences of operations could be covered as:
cond
    let {ok, _} = call1(),
    let {ok, _} = call2(),
    let Res = call3() ->
        Res;
    true ->
        AlternativeBranch
end
This mechanism is, in my opinion, worth exploring and maybe adding to the language, but on its own does not adequately solve error handling flow issues since errors cannot be extracted easily from failing operations.
Auto-wrapping return values is something that Elixir’s OK library does,
as does Haskell’s do notation, but neither Rust nor Swift does so.
There seems to be no clear consensus on what should be done.
Thus, for the simplicity of the implementation and backwards
compatibility of the begin ... end
expression, just returning the
value as-is without auto-wrapping seems sensible, particularly
since we do not prescribe tuple formats for handled values.
It would therefore be up to the developer to return whatever value best matches their function’s type signature, making it easier to integrate return values with the system they have.
It also lets sequences of operations potentially return ok on success,
even if their individual functions returned values such as true, for
example, rather than {ok, true}.
The exception format proposed here is {else_clause, Value}. This format
is chosen following Erlang/OTP standards:
- if_clause
- {case_clause, Val}
- function_clause (the value is provided in the stacktrace)
- {badmatch, Val}
- catch block and receive expressions do not raise anything explicitly
Since case_clause is functionally the closest exception and it carries
a value, we choose to replicate the same form here.
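For reference, the existing case_clause behaviour that the proposed else_clause error mirrors can be observed as follows. This is a minimal sketch: the module name clause_error_sketch and both function names are hypothetical, and only the already existing {case_clause, Val} error is exercised.

```erlang
-module(clause_error_sketch).
-export([unwrap/1, safe_unwrap/1]).

%% Raises error:{case_clause, Result} when Result is not {ok, _}.
unwrap(Result) ->
    case Result of
        {ok, Value} -> Value
    end.

%% Catches the error and exposes the unmatched value it carries.
safe_unwrap(Result) ->
    try
        unwrap(Result)
    catch
        error:{case_clause, Unmatched} -> {unhandled, Unmatched}
    end.
```

Under this proposal, an unhandled failed match in an else block would be expected to surface the same way, with else_clause in place of case_clause.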
else_clause is chosen over begin_clause because the else block
could arguably be used in other constructs in the future, and constraining the
exception to the block’s name itself is likely more future-proof.
The possibility of an early exit from a begin ... end
expression
means that variables declared within its scope are now potentially
unsafe to use outside of it.
This is a change of behaviour that brings begin
in line with the
variables bound within a case ... end
branch, a try/catch
clause, or
a receive ... end
branch. The same is true of the else
block.
This lack of safety only needs to apply from the first MatchOrReturnExpr
encountered onward, since all variables bound before it respect the same
semantics as the existing begin ... end expression. If this analysis is done
rather than just declaring all variables as unsafe wholesale, then there is no
backwards compatibility concern to be had.
The need for the <- operator to be used in a new context means code built
with support for the new expressions won’t be portable to older Erlang
releases. However, if all the code is rewritten after the AST and reuses
existing Core Erlang components, built BEAM artifacts should work fine on
older versions. This is, however, not a supported use case by the OTP team.
No reference implementation is usually required at this step. One is to be developed at a later point in time.