From: Alex Dehnert
Date: Sat, 23 Feb 2013 20:37:26 +0000 (-0500)
Subject: Add document on writing safer shell scripts
X-Git-Url: https://sipb.mit.edu/gitweb.cgi/wiki.git/commitdiff_plain/4f7205948f37fe6fec864c025e16a72b0afe2eab?ds=sidebyside

Add document on writing safer shell scripts
---

diff --git a/doc.mdwn b/doc.mdwn
index fcccc6c..8007928 100644
--- a/doc.mdwn
+++ b/doc.mdwn
@@ -61,6 +61,8 @@ How to set up printing on a Mac.
 * [[Using CPAN|doc/cpan]]
 CPAN is a source of many useful Perl libraries, but the tools often seem determined not to let you have them. Here's how to beat them into submission.
 
+* [[Writing safe(r) shell scripts|doc/safe-shell]]
+
 ## IAP Classes and Cluedumps
 
 SIPB teaches [an array of classes](http://sipb.mit.edu/iap) each IAP and shorter [cluedumps](http://cluedumps.mit.edu/) in the fall.
diff --git a/doc/safe-shell.mdwn b/doc/safe-shell.mdwn
new file mode 100644
index 0000000..f4bb32a
--- /dev/null
+++ b/doc/safe-shell.mdwn
@@ -0,0 +1,110 @@
+[[!meta title="Writing Safe Shell Scripts"]]
+
+Writing shell scripts leaves a lot of room to make mistakes, in ways that will
+cause your scripts to break on certain input, or (if some input is untrusted)
+open up security vulnerabilities. Here are some tips on how to make your shell
+scripts safer.
+
+## Don't
+
+The simplest step is to avoid using shell at all. Many higher-level languages
+are easier to write code in to begin with, and they avoid some of the issues
+that shell has. For example, Python will automatically error out if you
+try to read from an uninitialized variable (though not if you try to write to
+one), or if some function call you make produces an error.
+
+One of shell's chief advantages is that it's easy to call out to the huge
+variety of command-line utilities available. Much of that functionality is
+available through libraries in Python or other languages. For the handful of
+things that aren't, you can still call external programs. In Python, the
+[subprocess](http://docs.python.org/2/library/subprocess.html) module is very
+useful for this. It also has two big advantages over shell: it's a lot
+easier to avoid
+[word-splitting](http://www.gnu.org/software/bash/manual/html_node/Word-Splitting.html)
+or similar issues, and since calls to subprocess will tend to be relatively
+uncommon, it's easy to scrutinize them especially hard.
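The point about `subprocess` avoiding word-splitting can be illustrated with a minimal sketch. It is not part of the original commit and assumes Python 3.7 or later (the page links to the Python 2 documentation, where `subprocess.run` does not exist). The helper name `count_args` and the child one-liner are hypothetical, chosen only so the example runs anywhere Python does.

```python
import subprocess
import sys


def count_args(*args):
    """Run a child process and return how many arguments it received.

    Hypothetical helper for illustration only.
    """
    # The child is just another Python interpreter that prints its argument
    # count; any external program could stand in for it.
    child = [sys.executable, "-c", "import sys; print(len(sys.argv) - 1)"]
    # Passing a list (and no shell=True) hands each element to the program
    # as exactly one argument: no shell is involved, so there is no
    # word-splitting and no globbing. check=True raises CalledProcessError
    # if the child exits with a non-zero status.
    result = subprocess.run(
        child + list(args),
        check=True,
        capture_output=True,
        text=True,
    )
    return int(result.stdout)


# "foo bar" stays one argument and "*.txt" is passed through literally.
print(count_args("foo bar", "*.txt"))

# A failing child raises instead of being silently ignored.
try:
    subprocess.run([sys.executable, "-c", "import sys; sys.exit(3)"], check=True)
except subprocess.CalledProcessError as err:
    print("child failed with status", err.returncode)
```

The design choice worth noting is the list form of the command: because the arguments never pass through a shell, there is nothing to quote. The `check=True` flag plays roughly the role that `set -e` plays in a shell script.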
+
+## Shell settings
+
+POSIX sh and especially bash have a number of settings that can help you write safer shell scripts.
+
+I recommend the following in bash scripts:
+
+    set -euf -o pipefail
+
+In dash, the `pipefail` option has historically not been available, so use only `set -euf`.
+
+What do those do?
+
+### [`set -e`](http://www.gnu.org/software/bash/manual/html_node/The-Set-Builtin.html)
+
+If a command fails, `set -e` will make the whole script exit, instead of just
+continuing with the next line. If you have commands that can fail without it
+being an issue, you can append `|| true` or `|| :` to suppress this behavior;
+for example, `set -e` followed by `false || :` will not cause your script to
+terminate.
+
+### [`set -u`](http://www.gnu.org/software/bash/manual/html_node/The-Set-Builtin.html)
+
+Treat unset variables as an error, and immediately exit.
+
+### [`set -f`](http://www.gnu.org/software/bash/manual/html_node/The-Set-Builtin.html)
+
+Disable filename expansion (globbing) upon seeing `*`, `?`, etc.
+
+If your script depends on globbing, you obviously shouldn't set this. Instead,
+you may find
+[`shopt -s failglob`](http://www.gnu.org/software/bash/manual/html_node/The-Shopt-Builtin.html)
+useful: it causes globs that match nothing to produce errors, rather than
+being passed to the command with the `*` intact.
+
+### [`set -o pipefail`](http://www.gnu.org/software/bash/manual/html_node/The-Set-Builtin.html)
+
+`set -o pipefail` causes a pipeline (for example, `curl -s http://sipb.mit.edu/
+| grep foo`) to produce a failure return code if any command fails. Normally,
+pipelines only return a failure if the last command fails. In combination with
+`set -e`, this will make your script exit if any command in a pipeline fails.
+
+## Quote liberally
+
+Whenever you pass a variable to a command, you should probably quote it.
+Otherwise, the shell will perform
+[word-splitting](http://www.gnu.org/software/bash/manual/html_node/Word-Splitting.html)
+and
+[globbing](http://www.gnu.org/software/bash/manual/html_node/Filename-Expansion.html),
+which is likely not what you want.
+
+For example, consider the following:
+
+    alex@kronborg tmp [15:23] $ dir="foo bar"
+    alex@kronborg tmp [15:23] $ ls $dir
+    ls: cannot access foo: No such file or directory
+    ls: cannot access bar: No such file or directory
+    alex@kronborg tmp [15:23] $ cd "$dir"
+    alex@kronborg foo bar [15:25] $ file=*.txt
+    alex@kronborg foo bar [15:26] $ echo $file
+    bar.txt foo.txt
+    alex@kronborg foo bar [15:26] $ echo "$file"
+    *.txt
+
+Depending on what you are doing in your script, it is likely that the
+word-splitting and globbing shown above are not what you expected to have
+happen. If you use `"$foo"` to access the contents of the `foo` variable
+instead of just `$foo`, this problem does not arise.
+
+When writing a wrapper script, you may wish to pass along all the arguments
+your script received. Do that with:
+
+    wrapped-command "$@"
+
+See ["Special Parameters" in the bash
+manual](http://www.gnu.org/software/bash/manual/html_node/Special-Parameters.html)
+for details on the distinction between `$*`, `$@`, and `"$@"`; the first
+and second are rarely what you want in a safe shell script.
+
+## Conclusion
+
+When possible, instead of writing a "safe" shell script, *use a higher-level
+language like Python*. If you can't do that, the shell has several *options* that
+you can enable that will reduce your chances of having bugs, and you should be
+sure to *quote liberally*.