At the time of writing, there is an ongoing discussion in ECMAScript Technical Committee 39 (TC39) regarding the implementation of a pipeline operator. At the same time, there has been an open RFC in the PHP community regarding a pipe operator since 2016. It seems that multiple communities would like a pipeline operator. But why?
In this post, we will explore which problems a pipeline operator would solve and how other languages deal with these problems.
A case of confusion
Imagine we have a function called “confuse”, which reverses a word, capitalizes it and then exclaims it:
// Prints: "Looc!"
console.log(confuse("cool"));
We can implement each transformation in JavaScript separately:
const reverse = word => word.split("").reverse().join("");
const capitalize = word => word[0].toUpperCase() + word.substring(1);
const exclaim = word => word + "!";
It is interesting to note that these definitions are all unary functions: they take only a single argument.
Then, we can write our “confuse” function:
const confuse = word => {
  const reversed = reverse(word);
  const capitalized = capitalize(reversed);
  const exclaimed = exclaim(capitalized);
  return exclaimed;
};
We first define every transformation function separately and then we use the output of one function as the input of the next. Because all of these functions are unary, the chain can be simplified: we do not have to introduce temporary variables for the intermediate results, we can compose our function directly from the transformations.
const confuse = word => exclaim(capitalize(reverse(word)));
This is actually surprisingly readable. In the context of our program, to confuse means to exclaim something, capitalize it and reverse it. Right?
Not quite. It means: first reverse something, then capitalize it and finally exclaim it. The order of evaluation is different from the reading order! The innermost (or rightmost) function is executed first. This does not fit everyone’s mental model. Some people like to read code from left to right, especially if their native tongue is a left-to-right language. For those who are used to a right-to-left language, or are more familiar with programming or mathematics, this could be less of an issue.
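To make the mismatch concrete, here is how the nested call evaluates for the input “cool” (a hand-evaluated trace):
// Reading order:    exclaim, capitalize, reverse
// Evaluation order: reverse, capitalize, exclaim
// reverse("cool")    -> "looc"
// capitalize("looc") -> "Looc"
// exclaim("Looc")    -> "Looc!"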
Another thing that could be improved is that we have to explicitly pass a parameter to the innermost function. If I say “to confuse means to reverse, capitalize and exclaim”, the type or meaning of the argument should not have to be specified: the confuse function is simply a composition of these transformation functions. This style of programming is called point-free programming, the points being the explicit references to the arguments of the function.
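To illustrate the difference, here is a pointed definition next to a point-free one. The compose helper used below does not exist in JavaScript; it is a hypothetical stand-in for the composition function we will build later in this post:
// Pointed: the argument "word" is named explicitly.
const confusePointed = word => exclaim(capitalize(reverse(word)));
// Point-free: no argument is mentioned; "compose" is a hypothetical helper.
const confusePointFree = compose(exclaim, capitalize, reverse);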
A pipeline operator could actually solve both issues:
- the arguments for the functions do not have to be named (point-free)
- the composition order is the same as the application order
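As a preview, with a pipeline operator (or a pipeline helper function, which we will build later in this post) both points are addressed: the steps read in the order they are applied, and no argument needs to be named:
// Hypothetical pipeline-operator syntax (not valid JavaScript today):
//   word |> reverse |> capitalize |> exclaim
// Point-free variant with a pipeline helper (defined later in this post):
//   const confuse = pipeline(reverse, capitalize, exclaim);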
Approaches in other languages
Before we dive in and look at how we can write our function in modern JavaScript using pipes or a pipeline, we will take a look at some other languages. Two of them are functional (Haskell, Elixir) and one of them is multi-paradigm (D). How can we do (ordered) composition of unary functions in these languages?
Haskell’s composition
There is no pipe or pipeline operator in Haskell. However, because Haskell lets you define custom operators, you could create your own. In fact, the Flow package offers a couple of operators with the purpose of writing “more understandable Haskell”.
The community has not unanimously accepted this, as the language and the culture prefer composition and application. The pipe operator is seen as non-idiomatic. This is somewhat understandable given the more mathy background of most Haskell programmers, but, as a software engineer, I would say one should go with the most maintainable, easiest-to-understand solution. If a package like Flow helps with that, go for it. Alternatively, one could take a look at the more idiomatic way of programming using arrows, more specifically the >>> operator.
In Haskell we would write our transformation functions (including an optional explicit type declaration) like this:
import Data.Char (toUpper)
capitalize :: String -> String
capitalize (head:tail) = toUpper head : tail
exclaim :: String -> String
exclaim word = word ++ "!"
Note that we did not define the reverse function, because Haskell has this built in. The capitalize function deconstructs a string (a list of characters) into its first element (the head) and its other elements (the tail), and we define capitalize as applying : (the cons operator; prepending an element to a list) to the uppercased head and the tail of its argument. The exclaim function works by concatenating (++) an exclamation mark to a word.
We could write this function in a more point-free manner, by rewriting the infix ++ operator as a prefix:
exclaim :: String -> String
exclaim = (++) "!"
This does not work as intended: it partially applies (++) with “!” as its first argument, so it prepends “!” to whatever the second argument will be (“!” ++ word). If only there were a way to flip the arguments. Luckily, there is:
exclaim :: String -> String
exclaim = flip (++) "!"
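For JavaScript readers, flip can be thought of as a tiny higher-order helper. The sketch below is a hypothetical JavaScript analogue, not part of the code we are building in this post:
// Swap the two arguments of a binary function.
const flip = f => (a, b) => f(b, a);
const concat = (a, b) => a + b;
// flip(concat)("!", "Looc") -> concat("Looc", "!") -> "Looc!"
console.log(flip(concat)("!", "Looc"));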
Now that we have defined our functions, we can combine them into our confuse function:
import Data.Char (toUpper)
capitalize :: String -> String
capitalize (head:tail) = toUpper head : tail
exclaim :: String -> String
exclaim = flip (++) "!"
confuse :: String -> String
confuse word = exclaim (capitalize (reverse word))
main = do
  -- Prints: "Looc!"
  putStrLn $ confuse "cool"
The definition of confuse is comparable to how we did it in JavaScript: nesting. It would be cool if we could lose all those parentheses and our argument. Haskell has the composition operator (.) for that.
confuse :: String -> String
confuse = exclaim . capitalize . reverse
Confusingly, being point-free in Haskell often means adding more points. “Point-free” means that the points (or: arguments) of the space on which the function acts are not explicitly mentioned. Keep in mind that composition is evaluated from right to left.
If we wanted to write our own pipe operator we could do so by flipping the arguments of our composition operator:
(|>) :: (a -> b) -> (b -> c) -> a -> c
(|>) = flip (.)
confuse :: String -> String
confuse = reverse |> capitalize |> exclaim
Recall that using a pipeline operator is, to some, a less idiomatic way of writing Haskell. I would advise sticking with whatever the project’s and team’s conventions are.
Elixir’s pipe operator
In Elixir, there is a pipe operator. The same operator can be found in other languages like F#, Elm and Julia.
Using the pipe operator, we can write our transform functions as follows:
capitalize = fn word ->
  word
  |> String.slice(0,1)
  |> String.upcase()
  |> (fn head -> head <> String.slice(word, 1..-1) end).()
end
exclaim = fn word -> word <> "!" end
Elixir already has a String.reverse function, so we do not have to define one. Our capitalize function takes a single argument and pipes it through String.slice to take its first character (the head), which is then piped to String.upcase. The uppercased head is passed into an inline anonymous function that appends the tail of the initial argument. The exclaim function concatenates our word with “!”.
We can use the capture operator (&) to write our anonymous functions in a less explicit style, although we are then referencing arguments by their position:
capitalize = fn word ->
  word
  |> String.slice(0,1)
  |> String.upcase()
  |> (&(&1 <> String.slice(word, 1..-1))).()
end
exclaim = &(&1 <> "!")
We can then write our code as:
capitalize = fn word ->
  word
  |> String.slice(0,1)
  |> String.upcase()
  |> (&(&1 <> String.slice(word, 1..-1))).()
end

exclaim = &(&1 <> "!")

confuse = fn word ->
  word
  |> String.reverse()
  |> capitalize.()
  |> exclaim.()
end
# Prints "Looc!"
IO.puts(confuse.("cool"))
To be fair, although I like Elixir’s feature set, I am not that experienced in it. There probably are more elegant ways of defining our transformation functions.
D’s uniform function call syntax (UFCS)
In D, there are several ways of creating our transform functions. Let’s create some pure functions:
import std.stdio;
import std.ascii;
import std.algorithm.mutation;
import std.functional;
pure string reversed(string word)
{
    return reverse(dup(word));
}

pure string capitalize(string word)
{
    return toUpper(word[0]) ~ word[1 .. $];
}

pure string exclaim(string word)
{
    return word ~ "!";
}
We cannot reverse an (immutable) string, so we have to duplicate (dup) it before applying the reverse function from the std.algorithm.mutation module. The capitalize function utilizes the toUpper function from the std.ascii module.
We can compose these functions using the compose! template from the std.functional module:
alias confuse = compose!(exclaim, capitalize, reversed);

void main()
{
    // Prints: "Looc!"
    writeln(confuse("cool"));
}
We can also pipe them using pipe! from the same module:
alias confuse = pipe!(reversed, capitalize, exclaim);

void main()
{
    // Prints: "Looc!"
    writeln(confuse("cool"));
}
It may not surprise us to see that pipe! is defined as compose! with its arguments reversed:
// Simplified excerpt from std.functional:
alias pipe(fun...) = compose!(Reverse!(fun));
Another interesting approach offered by D (and Nim) is the Uniform Function Call Syntax (UFCS):
pure string confuse(string word)
{
    return word
        .reversed
        .capitalize
        .exclaim;
}
UFCS allows chaining free functions on a matching first parameter. In this case, each function takes a string and returns a string, so it can be applied as if it were a member function! Free functions are functions that are not members of a type; for UFCS, D only looks them up at module scope, not in local scopes, in order to prevent naming conflicts.
A pipeline in JavaScript
The pipeline operator will probably look something like Elixir’s pipe, although TC39 still has to figure out how to deal with unparenthesized arrow functions, placeholder arguments and asynchronicity (async/await). One feature that could make this great is explicit partial application of specified function arguments. There is an interesting presentation on the two competing proposals on the TC39 GitHub.
Let’s write our own pipeline function that conforms to these two ideas:
- the arguments for the functions do not have to be named (point-free)
- the composition order is the same as the application order
Let’s start off by writing a pipe function that takes two arguments and applies the function on the right to the value on the left.
const pipe = (x, f) => f(x);
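Before generalizing, a quick sanity check of pipe in isolation (expected results shown as comments):
// pipe applies the function on its right to the value on its left.
pipe("cool", reverse);                // "looc"
pipe(pipe("cool", reverse), exclaim); // "looc!"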
We can then rewrite our confuse function by piping our transformations:
const confuse = word =>
  pipe(
    pipe(
      pipe(
        word,
        reverse
      ),
      capitalize
    ),
    exclaim
  );
That adheres to only one of our requirements (left-to-right order), so we need to find a way to generalize this function and make it a bit less explicit. We need to accept an arbitrary amount of functions. We want to apply each function to the result of the previous function.
In order to accept an arbitrary amount of functions, we could use a variadic argument. An imperative approach to this would be an iteration that applies pipe to every function.
const pipeline = (input, ...funcs) => {
  let result = input;
  for (const f of funcs) {
    result = pipe(result, f);
  }
  return result;
};
Our confuse function would then be:
const confuse = word => pipeline(word, reverse, capitalize, exclaim);
We can do better! First of all, the pipeline can be more declarative. After all, the pipeline is a reduction of applying pipe to each function with a given input:
const pipeline = (input, ...funcs) =>
  funcs.reduce(pipe, input);
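To see how this reduction unfolds, here is a hand-evaluated trace of confuse("cool") with funcs = [reverse, capitalize, exclaim]:
// pipe("cool", reverse)    -> "looc"
// pipe("looc", capitalize) -> "Looc"
// pipe("Looc", exclaim)    -> "Looc!"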
Secondly, we can generalize it by creating a curried function: we translate our function that takes two arguments to a function that takes only one, but returns another function that takes the other argument. In effect, this allows deferred execution through partial application.
const pipeline = (...funcs) => input =>
  funcs.reduce(pipe, input);
This decouples the creation of the pipeline from its execution: invoking pipeline and giving it a number of functions produces a “preconfigured” pipeline; a function is returned that takes the input for the pipeline as its argument. Our pipeline is therefore a higher-order function: it takes functions as its arguments and/or it returns a function as its output.
Not only do we have a set of easily testable, generic and composable functions, we are more declarative through a point-free style:
const pipe = (x, f) => f(x);
const pipeline = (...funcs) => input => funcs.reduce(pipe, input);
const confuse = pipeline(reverse, capitalize, exclaim);
// Prints: "Looc!"
console.log(confuse("cool"));
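Because confuse is now just a unary function, the preconfigured pipeline can be reused wherever such a function fits, for example with Array.prototype.map:
// The same preconfigured pipeline applied to several inputs.
// Prints something like: [ 'Looc!', 'Sepip!' ]
console.log(["cool", "pipes"].map(confuse));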
As a bonus, we can define composition as the inverse of our pipeline, by reducing with pipe from right to left rather than from left to right:
const pipe = (x, f) => f(x);
const composition = (...funcs) => input => funcs.reduceRight(pipe, input);
const confuse = composition(exclaim, capitalize, reverse);
// Prints: "Looc!"
console.log(confuse("cool"));
Interestingly, we have already seen this relationship between composition and pipelines in how D defines compose and pipe, and in how we can define our pipeline operator in Haskell by flipping the arguments of the composition operator.
A more complete definition of pipe and compose is given in the Ramda library, a functional JavaScript library that contains loads of interesting building blocks for functional programming.
The minimal proposal
If we were to write our pipeline using the minimal proposal, which pipes the value on its left into the unary function on its right, the corresponding code would be as follows:
const confuse = word => word |> reverse |> capitalize |> exclaim;
You might then even call confuse and console.log as a pipeline:
// Prints: Looc!
"cool" |> confuse |> console.log;
“Looc!”, indeed. If you wish to experiment with this, you can use a Babel plugin: @babel/plugin-proposal-pipeline-operator.
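A minimal Babel configuration might look like the sketch below; depending on the plugin version, the “proposal” option may be required, and “minimal” selects the variant discussed here:
{
  "plugins": [
    ["@babel/plugin-proposal-pipeline-operator", { "proposal": "minimal" }]
  ]
}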
Risk
The benefit of easily composable and readable pipelines can also be their greatest risk. It is so easy to write one-liners composed of many functions that it will probably happen, and that will not benefit the readability, and thus the maintainability, of our code. We can do a number of things to keep pipelines easy to read; these are things we should be doing already (see the sketch after this list):
- break pipelines up into smaller pieces (these can even be pipelines themselves)
- use descriptive, intent-revealing names
- favor simple, unary functions over complex functions that take multiple arguments
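As a sketch of the first two guidelines: a long pipeline can be broken up into smaller, intent-revealing pipelines. The trim and normalize helpers below are hypothetical and only serve as an illustration:
// One long pipeline hides its intent...
const shoutBackwards = pipeline(trim, normalize, reverse, capitalize, exclaim);
// ...whereas smaller, well-named pipelines reveal it.
const cleanUp = pipeline(trim, normalize);
const confuse = pipeline(reverse, capitalize, exclaim);
const shoutBackwardsClearly = pipeline(cleanUp, confuse);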
Sometimes, we want composition
Is composition always more difficult to grok than pipelines for people accustomed to left-to-right human languages? No. There are, in fact, many functions that are combined more declaratively using composition.
For example, let’s define a function called odd that determines whether a number is odd. Imagine that we already have the function even (i.e. it checks whether something is divisible by 2). We can define odd as something that is not even. In other words: the composition of not and even, wherein we want the function even to be evaluated before not.
In (future) JavaScript, this might look somewhat as follows:
// Pipeline
// i.e. const odd = pipeline(even, not);
const odd = even |> not;
// Composition
// i.e. const odd = composition(not, even);
const odd = not . even;
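With the pipe, pipeline and composition helpers defined earlier in this post, both variants already work today; even and not are defined here only for completeness:
const even = n => n % 2 === 0;
const not = x => !x;
const oddViaPipeline = pipeline(even, not);
const oddViaComposition = composition(not, even);
// Prints: true true
console.log(oddViaPipeline(3), oddViaComposition(3));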
To me, composition makes more sense in these kinds of cases as it is more aligned with real-world language. As I see it, whether to choose composition or a pipeline depends on which is the most readable and what the purpose of your operations is: a processing pipeline is easier to read from left to right, but the simple combination of functions might be more understandable through (right-to-left) composition.
In conclusion
JavaScript, like other languages, is becoming more and more functional. Although functional programming has a close relationship with mathematics, we do not need to accept that function composition is our only way of declaratively combining the behavior of functions.
Using a pipeline operator, we can not only define functions in a point-free manner, but also make our code more natural to reason about by composing our functions from left to right. This is where a pipeline operator (like Elixir’s or F#’s), together with partial application, pattern matching and a composition operator, would be a nice addition to the (functional) JavaScript programmer’s toolbox.
Thoughts?
Leave a comment below!