Use Bash strict mode

Note

This article is a derivative of the article by Aaron Maxwell. See the original article for additional details.

Your Bash scripts will be more robust, reliable and maintainable if you start them like this:

#!/bin/bash
set -euo pipefail
IFS=$'\n\t'

I call this the unofficial Bash strict mode. This causes Bash to behave in a way that makes many classes of subtle bugs impossible. You'll spend much less time debugging, and also avoid having unexpected complications in production.

There is a short-term downside: these settings make certain common Bash idioms harder to work with. The rest of this article will explain what these settings do, and then show you how to work around the problems they cause.

With these settings, certain common errors will cause the script to immediately fail, explicitly and loudly. Otherwise, you can get hidden bugs that are discovered only when they blow up in production.

set -euo pipefail is short for:

set -e
set -u
set -o pipefail

Let's look at each separately.

`set -e`

The set -e option instructs Bash to immediately exit if any command has a non-zero exit status. You wouldn't want to set this for your command-line shell, but in a script it's massively helpful. In all widely-used general-purpose programming languages, an unhandled runtime error - whether that's a thrown exception in Java, a segmentation fault in C, or an uncaught exception in Python - immediately halts execution of the program; subsequent lines are not executed.

Bash's default behavior is to continue execution, even after a command fails. This is fine for interactive use, but in scripts it's terrible.

If one line in a script fails, but the last line succeeds, the whole script has a successful exit code. That makes it very easy to miss the error. Again, what you want from Bash as your command-line shell is at odds with what you want from it in scripts.

Being intolerant of errors is a lot better in scripts, and that's what set -e gives you.
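
As a rough illustration of the difference (a minimal sketch, not taken from the original article), the snippet below runs the same two commands in a child Bash process, once without -e and once with it. The variable names are made up for this example.

```bash
#!/bin/bash
# Without -e: the failing command is ignored and the next line still runs.
lenient_output=$(bash -c 'false; echo "still running"')
lenient_status=$?

# With -e: the child shell stops at the failing command.
strict_status=0
strict_output=$(bash -c 'set -e; false; echo "still running"') || strict_status=$?

echo "lenient: output='$lenient_output' status=$lenient_status"
echo "strict:  output='$strict_output' status=$strict_status"
```

Running the commands in a child process keeps the simulated failure from terminating the script that does the comparison.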

`set -u`

set -u affects variables. When set, a reference to any variable you haven't previously defined - with the exceptions of $* and $@ - is an error, and causes the program to immediately exit.

Languages like Python, C, Java and more all behave the same way, for all sorts of good reasons. One is so typos don't create new variables without you realizing it. For example:

#!/bin/bash
firstName="Aaron"
fullName="$firstname Maxwell"
echo "$fullName"

Take a moment and look. Do you see the error? The right-hand side of the third line says "firstname", all lowercase, instead of the camel-cased "firstName". Without the -u option, this will be a silent error: Bash will substitute the empty string for the undefined variable "$firstname", and the program will print out just " Maxwell".

But with the -u option, the script exits on that line with an exit code of 1, printing the message "firstname: unbound variable" to stderr. This is what you want: have it fail explicitly and immediately, rather than create subtle bugs that may be discovered too late.
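
The typo example can be tried safely from another script. This is a sketch under the assumption that a bash executable is on the PATH; it runs the buggy lines in a child shell with set -u and captures what happens. The exact wording before "firstname: unbound variable" varies between Bash versions, so the sketch only prints the message rather than comparing it.

```bash
#!/bin/bash
# Run the buggy script in a child shell so the failure does not
# terminate this script. stderr is merged into the captured output.
status=0
message=$(bash -c 'set -u
firstName="Aaron"
fullName="$firstname Maxwell"
echo "$fullName"' 2>&1) || status=$?

echo "child exit status: $status"
echo "child output: $message"
```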

`set -o pipefail`

This setting prevents errors in a pipeline from being masked. If any command in a pipeline fails, that return code will be used as the return code of the whole pipeline; if several commands fail, the rightmost failing one wins. By default, the pipeline's return code is that of the last command, even if an earlier command failed.

Imagine finding a sorted list of matching lines in a file:

$ grep some-string /non/existent/file | sort
grep: /non/existent/file: No such file or directory
$ echo $?
0

Here, grep has an exit code of 2, writes an error message to stderr, and an empty string to stdout. This empty string is then passed through sort, which happily accepts it as valid input, and returns a status code of 0 (success). This is fine for a command line, but bad for a shell script: you almost certainly want the script to exit right then with a nonzero exit code, like this:

$ set -o pipefail
$ grep some-string /non/existent/file | sort
grep: /non/existent/file: No such file or directory
$ echo $?
2
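
To see which exit code wins when more than one stage fails, here is a small sketch. The pipeline of (exit N) subshells is purely illustrative; it runs in a child shell so the setting does not leak into the calling script.

```bash
#!/bin/bash
# Default behavior: only the last command's status counts.
default_status=0
bash -c '(exit 3) | (exit 5) | true' || default_status=$?

# With pipefail: the rightmost non-zero status is reported.
pipefail_status=0
bash -c 'set -o pipefail; (exit 3) | (exit 5) | true' || pipefail_status=$?

echo "default:  $default_status"    # prints "default:  0"
echo "pipefail: $pipefail_status"   # prints "pipefail: 5"
```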

Setting `IFS`

The IFS variable - which stands for Internal Field Separator - controls what Bash calls "word splitting". When set to a string, each character in the string is considered by Bash to separate words.

This governs how Bash will iterate through a sequence. For example, this script:

#!/bin/bash
IFS=$' '
items="a b c"
for x in $items; do
    echo "$x"
done

IFS=$'\n'
for y in $items; do
    echo "$y"
done

...will print out this:

a
b
c
a b c

For the first loop, IFS is a space, meaning that words are separated by a space character. For the second loop, "words" are separated by a newline, which means Bash considers the whole value of "items" as a single word. If IFS is more than one character, splitting will be done on any of those characters.

Setting IFS to $'\n\t' means that word splitting will happen only on newlines and tab characters. This very often produces useful splitting behavior. By default, Bash sets this to $' \t\n' - space, tab, newline - which is too eager.

Consider a script that takes filenames as command line arguments:

for arg in $@; do
    echo "doing something with file: $arg"
done

If you invoke this as myscript.sh notes todo-list 'My Resume.doc', then with the default IFS value, the third argument will be mis-parsed as two separate files, named "My" and "Resume.doc", when it is actually a single file with a space in its name: "My Resume.doc".
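
The effect can be checked with a short sketch. The list_files function below is a hypothetical stand-in for the script body above; it is called once with the default IFS and once with the strict-mode value, each time inside a command substitution so the IFS change stays local.

```bash
#!/bin/bash
# Hypothetical stand-in for the loop above. $@ is left unquoted on
# purpose, so the result depends on the current value of IFS.
list_files() {
    local arg
    for arg in $@; do
        echo "file: $arg"
    done
}

# Default IFS (space, tab, newline): "My Resume.doc" is split in two.
default_ifs_output=$(IFS=$' \t\n'; list_files notes todo-list 'My Resume.doc')

# Strict-mode IFS (newline, tab): the name stays in one piece.
strict_ifs_output=$(IFS=$'\n\t'; list_files notes todo-list 'My Resume.doc')

echo "--- default IFS ---"
echo "$default_ifs_output"
echo "--- strict IFS ---"
echo "$strict_ifs_output"
```

With the default IFS the loop reports four files; with the strict-mode value it reports three.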

Issues & Solutions

I've been using the unofficial Bash strict mode for years. Once you get used to it, you'll find it makes your scripts much more reliable. But there are a few common idioms that you need to adjust.

Sourcing a nonconforming document

This comes up most often when using Python virtual environments. You activate one by sourcing the file named "bin/activate" inside the environment's directory:

# This will update PATH and set VIRTUAL_ENV to
# use the preconfigured virtual environment.
source /path/to/venv/bin/activate

# Now the desired version of Python is in your path,
# with the specific set of libraries you need.
python my_program.py

The problem is, these activate scripts are not always strict-mode-compliant. In particular, they often reference unbound variables.

The solution is to (a) temporarily disable that aspect of strict mode; (b) source the document; then (c) re-enable it on the next line. The most common case is a document that references an undefined variable, so you turn off -u around the source command:

set +u
source /path/to/venv/bin/activate
set -u
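
The same pattern can be tried without a real virtual environment. In this sketch the sourced file is a temporary stand-in written on the spot, and HYPOTHETICAL_OLD_PROMPT and legacy_loaded are invented names used only for illustration.

```bash
#!/bin/bash
set -u

# Write a small file that, like many activate scripts, reads a
# variable that may not be set. The names here are hypothetical.
legacy=$(mktemp)
cat > "$legacy" <<'EOF'
saved_prompt="$HYPOTHETICAL_OLD_PROMPT"
legacy_loaded=yes
EOF

set +u              # (a) relax the unbound-variable check
source "$legacy"    # (b) source the nonconforming file
set -u              # (c) restore the check right away

rm -f "$legacy"
echo "legacy_loaded=$legacy_loaded"
```

Keeping the relaxed region down to the single source line means the rest of the script still gets the full benefit of set -u.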

Positional parameters

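Under set -u, referencing a positional parameter such as $1 is an error when the caller did not supply it. So what if it's not provided?

Under strict mode, you need to give every positional parameter reference a default value, like this: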

#!/bin/bash
set -u

name=${1:-}
if [[ -z "$name" ]]; then
  echo "usage: $0 NAME" >&2
  exit 1
fi
echo "Hello, $name"

The key is the ${1:-}. This is a Shell Parameter Expansion. If $1 is unset or null, it expands to an empty string; otherwise, it expands to the value of $1. This provides a default for an unset value, and so it satisfies set -u.
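
The same expansion also accepts a non-empty default. As a minimal sketch, the hypothetical greet function below falls back to "world" when no name is passed:

```bash
#!/bin/bash
set -u

# greet is an illustrative helper, not part of the original article.
greet() {
    # ${1:-world} expands to "world" when $1 is unset or empty.
    local name=${1:-world}
    echo "Hello, $name"
}

greet            # prints "Hello, world"
greet "Aaron"    # prints "Hello, Aaron"
greet ""         # prints "Hello, world"
```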

Commands you expect to have non-zero exit status

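Sometimes you need to run a command that you know may return a non-zero exit status, and you don't want that to stop your script. There are two main ways to deal with this.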

The simplest, which you will usually want to use, is to append "|| true" after the command:

# "grep -c" reports the number of matching lines. If the number is 0,
# then grep's exit status is 1, but we don't care - we just want to
# know the number of matches, even if that number is zero.

# Under strict mode, the next line aborts with an error:
count=$(grep -c some-string some-file)

# But this one behaves more nicely:
count=$(grep -c some-string some-file || true)

echo "count: $count"

This works because set -e does not apply to commands whose return codes are being tested - as in if statements, while loops, or as part of a || or && construct. The "|| true" is a command that will always succeed, and so its return code is 0 (success).

The problem is that this masks the return code of grep. What if grep exited with a status of 2? That indicates an error - e.g., the file doesn't exist. The || true will mask this too.

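The second way, for when you need to know the return value, is to temporarily disable the exit-immediately option:

# We started this script with set -e, so turn it off
# just long enough to capture grep's return code.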

set +e
count=$(grep -c some-string some-file)
retval=$?
set -e

# grep's return code is 0 when one or more lines match;
# 1 if no lines match; and 2 on an error. This pattern
# lets us distinguish between them.

echo "return value: $retval"
echo "count: $count"
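
A possible alternative, offered here as a sketch rather than as the original author's method, captures the status with "|| status=$?" so that set -e never has to be switched off. It relies on the same rule that makes "|| true" work. The sample file and variable names are made up for the example.

```bash
#!/bin/bash
set -euo pipefail

# Temporary sample file with two lines, neither containing "gamma".
sample=$(mktemp)
printf 'alpha\nbeta\n' > "$sample"

# No match: grep -c prints 0 and exits with status 1.
no_match_status=0
no_match_count=$(grep -c gamma "$sample") || no_match_status=$?

# Missing file: grep exits with status 2.
missing_status=0
missing_count=$(grep -c gamma "$sample.does-not-exist" 2>/dev/null) || missing_status=$?

rm -f "$sample"

echo "no match:     count=$no_match_count status=$no_match_status"
echo "missing file: status=$missing_status"
```

Initializing each status variable to 0 first matters: the assignment after || only runs when the command fails.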

Essential clean-up

Sometimes you create a temporary file or directory, and you need to make sure it gets cleaned up when the script exits - even if it exits unexpectedly due to an error. With the set -e option, it is possible an error will cause your script to exit before it can perform the cleanup, which is not acceptable.

The solution: use Bash exit traps. Here's how you would use it to robustly clean up a scratch directory:

scratch=$(mktemp -d -t tmp.XXXXXXXXXX)
function finish {
  rm -rf "$scratch"
}
trap finish EXIT

# Now your script can write files in the directory "$scratch".
# It will automatically be deleted on exit, whether that's due
# to an error, or normal completion.

The trap command arranges for the finish function to be called whenever the script exits, for any reason.
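
To check that the trap really fires when set -e aborts a script, the following sketch runs a small child script that fails on purpose and then looks for the scratch directory afterwards. The child script and the scratch_path variable are illustrative only.

```bash
#!/bin/bash
# The child script creates a scratch directory, prints its path,
# and then fails. Its EXIT trap should still remove the directory.
scratch_path=$(bash -c '
    set -euo pipefail
    scratch=$(mktemp -d)
    finish() { rm -rf "$scratch"; }
    trap finish EXIT
    echo "$scratch"
    false                  # simulated failure; set -e exits here
    echo "never reached"
' || true)

if [[ -d "$scratch_path" ]]; then
    echo "scratch directory still exists: $scratch_path"
else
    echo "scratch directory was cleaned up"
fi
```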