Some Bash Scripting Notes

The best existing guide to bash scripting is the Advanced Bash-Scripting Guide, which is pretty tragic, as it's totally hopeless. This document attempts to summarise some basic and useful things which are not necessarily explained well elsewhere.

Variables

Does anyone actually understand the rules for quoting? Basically, as far as i can tell, you want to always quote absolutely anything which includes a variable substitution which could potentially have any kind of mildly exciting characters (like spaces) in it. An unquoted substitution gets split into separate words on whitespace, and any glob characters in it get expanded, which is almost never what you meant. So don't cp $foo $bar, but rather cp "$foo" "$bar".
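
A minimal sketch of what goes wrong without the quotes. count_args is a made-up helper (not a standard command) that just reports how many arguments it was given, and the default word-splitting settings are assumed:

```shell
# count_args is a hypothetical helper: it prints the number of arguments it received.
count_args() { echo "$#"; }

foo="my file.txt"

unquoted=$(count_args $foo)     # the value is split on the space into two arguments
quoted=$(count_args "$foo")     # the quotes keep it together as one argument

echo "unquoted: $unquoted, quoted: $quoted"
```

The same splitting would hand cp two source filenames, neither of which exists.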

There are some useful special variables:

Symbol  Value
$#      Number of command-line args
$@      The command-line args, as a list of strings - you almost certainly want to quote this as "$@", to get a list of quoted strings
$*      The command-line args as one big string (i've never come across a situation where this would be at all useful)
$$      The current PID
$?      Exit value of the last invoked program; every command overwrites it, so if you need it later, capture it straight away by assignment (status=$?)

There are others, but i'd avoid them.
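
A rough sketch of the difference between "$@" and "$*", and of capturing $?. The count_args helper and the argument values are invented for the illustration; set -- is only there to fake some command-line args:

```shell
# count_args is a hypothetical helper: it prints the number of arguments it received.
count_args() { echo "$#"; }

set -- "North Utsire" "German Bight"    # pretend these were the command-line args

with_at=$(count_args "$@")      # each arg stays a separate string
with_star=$(count_args "$*")    # all the args are joined into one string

false                           # a command which always fails
status=$?                       # capture it before the next command overwrites it

echo "at=$with_at star=$with_star status=$status"
```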

Mangling Filenames

I have no idea how this works, but it pulls the file extension (not including the .) out of a filename:

extension=${filename##*.}

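Actually, i do kind of know - ## deletes the longest prefix matching the pattern that follows it, and *. matches everything up to and including the last dot, so only the extension is left. There's all sorts of really good string mangling you can do inside ${}; see the Parameter Expansion section of the bash manual.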
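
A few more of the ${} manglings, as a sketch. The path is made up; # and ## strip a matching prefix (shortest and longest match), % and %% strip a matching suffix:

```shell
path=/tmp/archive.tar.gz

ext=${path##*.}      # gz               - longest prefix matching *. removed
noext=${path%.*}     # /tmp/archive.tar - shortest suffix matching .* removed
stem=${path%%.*}     # /tmp/archive     - longest suffix matching .* removed
file=${path##*/}     # archive.tar.gz   - everything up to the last / removed
dir=${path%/*}       # /tmp             - the last / and everything after it removed

echo "$ext $noext $stem $file $dir"
```

Note that %%.* starts from the first dot anywhere in the string, so it misbehaves if a directory name contains a dot.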

To extract a filename from a path, use basename; to get a directory name from a path, use dirname:

path=/usr/bin/tar
filename=`basename "$path"` # tar
dirname=`dirname "$path"` # /usr/bin

To get the filename with an extension stripped off, pass the extension as a second parameter to basename:

path=/tmp/installer.tar
ext=${path##*.}
rootname=`basename "$path" ".$ext"` # installer

To get the absolute path to a file, use readlink -f:

relpath=.project
abspath=`readlink -f "$relpath"` # /home/twic/.project

This will resolve all symbolic links as well. Note that the file itself doesn't have to exist for this to work (although with GNU readlink, all the directories leading up to it do)! Some linuces have a realpath which does the same thing. The standard readlink on OS X doesn't support -f; if you install coreutils via MacPorts, you have a greadlink which works properly.

Directories

mkdir -p creates a whole sequence of directories in one go:

mkdir -p foo/bar/baz # works even if foo didn't exist to begin with

pushd and popd can be amazingly useful, but they print pointless commentary on their activity, which can't be switched off, but can be redirected to the bin:

tmpdir="$$.tmp"
mkdir "$tmpdir"
pushd "$tmpdir" >>/dev/null
# do stuff in the temporary directory
popd >>/dev/null
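
A variation on the same idea, assuming your system has mktemp (most linuces and OS X do, but the options vary a bit between them). It picks a name that's guaranteed not to exist already, rather than trusting the PID to be unique:

```shell
start_dir=$PWD

tmpdir=$(mktemp -d) || exit 1           # creates a fresh directory and prints its path
pushd "$tmpdir" >/dev/null || exit 1
# do stuff in the temporary directory
popd >/dev/null || exit 1

end_dir=$PWD                            # back where we started
rm -r "$tmpdir"                         # tidy up afterwards
```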

Conditionals

The standard way:

if [[ $foo ]]
then
	echo foo is true
elif [[ $bar ]]
then
	echo foo is false, but bar is true
else
	echo nothing is true
fi

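The double square brackets tell bash to use its built-in expression evaluation. Single brackets are the older test command: bash has that built in too, but it's parsed like any ordinary command, so unquoted variables get split into words and operators like < have to be escaped. About the only time you'd want that is when the script has to run under a plain POSIX sh, which doesn't have double brackets.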

However, note that if you want to test the return value of a program, there are no brackets:

if tar xf "$tarfile"
then
	echo "Extracted $tarfile"
else
	echo "Failed to extract $tarfile"
fi

Note that, as far as i can tell, you're not allowed to omit the then clause, nor to leave it empty (and a comment counts as empty). If you want an empty then, you need a noop. A standard bash noop is a colon:

if tar xf "$tarfile"
then
	: # do nothing
else
	echo "Failed to extract $tarfile"
	exit 1
fi

As for what's true inside double brackets, unset variables and the empty string are false, and everything else is true. Including "false" and "0", so careful. There's no such thing as a numeric zero here, because bash variables are all strings; if you want zero to count as false, use an arithmetic test with double parens, (( foo )), which fails when the value is 0.
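
A quick sketch of those truth rules, with made-up variable names. The double-paren form is bash's arithmetic test, which is the one place where zero counts as false:

```shell
zero=0
empty=""

if [[ $zero ]]; then string_test=true; else string_test=false; fi    # true: "0" is a non-empty string
if [[ $empty ]]; then empty_test=true; else empty_test=false; fi     # false: the empty string
if (( zero )); then arith_test=true; else arith_test=false; fi       # false: arithmetic zero

echo "string=$string_test empty=$empty_test arith=$arith_test"
```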

More complicated expressions are made in kind of the normal way, but be careful with quoting, and picking the right operators - there are different ones for strings and numbers. Those operators in full:

Operator     Operation
==           String equality (also =, but don't)
!=           String inequality
<            String less-than (compares in sort order, not numerically)
>            String greater-than (compares in sort order, not numerically)
-z (unary)   String is empty (an unset variable counts as empty)
-n (unary)   String is not empty
-eq          Numeric equality
-ne          Numeric inequality
-lt          Numeric less-than
-le          Numeric less-than-or-equal-to
-gt          Numeric greater-than
-ge          Numeric greater-than-or-equal-to
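
Why picking the right operator matters, as a small sketch with invented values: the string operators compare character by character, so "10" sorts before "9".

```shell
a=10
b=9

if [[ $a < $b ]]; then string_says="less"; else string_says="not less"; fi      # compared as strings
if [[ $a -lt $b ]]; then number_says="less"; else number_says="not less"; fi    # compared as numbers

echo "as strings: $string_says; as numbers: $number_says"
```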

File tests are also good. Do:

filename=foo.txt
if [[ -e "$filename" ]]
then
	echo "$filename exists"
fi

Those useful file tests in full:

Symbol  Test
-e      Exists
-f      Exists and is a regular file
-d      Exists and is a directory
-h      Exists and is a symlink
-s      Exists and is not empty (size greater than zero)
-r      Exists and we can read it
-w      Exists and we can write it
-x      Exists and we can execute it

Bash also supports perl-like (or vice-versa) chaining of commands with || and &&:

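mkdir "$tmpdir" || exit 1

mkdir "$tmpdir" || { echo "could not create directory $tmpdir"; exit 1; }

Note the braces rather than parens. Parens run their contents in a subshell, so an exit inside them only leaves the subshell and the script carries on regardless. Braces group commands in the current shell; they need a space after the opening brace and a semicolon (or newline) before the closing one.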
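
For the record, the rule is that parens run their contents in a subshell, while braces group commands in the current shell. A sketch of why that matters for exit; each test is wrapped in $( ) purely so the demonstration doesn't kill the script it lives in:

```shell
# With parens, the exit only leaves the inner subshell, so the echo still runs.
parens_out=$(false || (exit 1); echo "still running")

# With braces, the exit leaves the enclosing shell, so the echo is never reached.
braces_out=$(false || { exit 1; }; echo "still running")
braces_status=$?

echo "parens: '$parens_out', braces: '$braces_out' (status $braces_status)"
```

So for bailing out of a script, the brace form is the one to use.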

Loops

Loops are one of the bits of shell scripting that i understand least well. The structure of a loop is easy enough:

for loopvar in LIST OF THINGS TO LOOP OVER
do
	echo "$loopvar"
done

You can write out a literal list of things to loop over by separating strings with spaces:

for day in monday tuesday wednesday thursday friday sunday
do
	echo "$day"
done

for seaArea in "North Utsire" "South Utsire" "German Bight"
do
	echo "$seaArea"
done

Looping over your arguments is also easy:

for arg in "$@"
do
	echo "$arg"
done

But what if you have a variable with a list of things in that you want to loop over? Well, it seems you can just do this (note where the quoting is and isn't):

list="one two three"
for thing in $list
do
	echo "$thing"
done

What happens is that the unquoted string is split up on whitespace (strictly, on whatever characters are in $IFS). This is probably a Deep Bash Truth. A plain variable is only ever a string, so this trick falls apart if you want to manipulate a list of strings which might themselves have whitespace in; for that, you need a bash array.
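
Bash does have arrays, which cope with items that contain whitespace. A minimal sketch, with made-up variable names:

```shell
areas=("North Utsire" "South Utsire" "German Bight")

count=0
for area in "${areas[@]}"       # quoted, so each element stays in one piece
do
	echo "$area"
	count=$((count + 1))
done

areas+=("Dogger")               # append an element
total=${#areas[@]}              # number of elements

echo "looped over $count, now have $total"
```

The quoted "${areas[@]}" plays the same role for an array that "$@" does for the command-line args.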

If you want to loop over a list of things which aren't separated by whitespace, one way is to use tr to change the separator (though this still breaks if any item has whitespace in it). Like so:

for dir in `echo "$PATH" | tr : ' '`
do
	echo "$dir"
done
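
An alternative sketch which survives directories with spaces in: have read split the string on colons into an array. The search_path value is invented to stand in for $PATH, and the IFS setting only applies to that one read command:

```shell
search_path="/usr/local/bin:/usr/bin:/opt/my tools/bin"    # hypothetical stand-in for $PATH

IFS=: read -r -a dirs <<< "$search_path"    # split on colons into the array dirs

for dir in "${dirs[@]}"
do
	echo "$dir"
done
```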

As a bonus, you can loop over some files (note - no quotes around the glob!):

for sourcefile in *.java
do
	javac "$sourcefile"
done
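
One trap with globs: if nothing matches, bash leaves the pattern as it is, so the loop runs once with the literal string. A sketch of one way round that, using the nullglob shell option (the file extension here is made up, and assumed to match nothing in the current directory):

```shell
shopt -s nullglob               # globs which match nothing now expand to nothing

matches=0
for sourcefile in *.no-such-extension
do
	matches=$((matches + 1))    # never reached when there are no matching files
done

shopt -u nullglob               # put the default behaviour back

echo "matched $matches files"
```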

Functions

Definition and use are pretty simple:

function foo () { # the parens are optional
	# params are accessed like script args
	x=$1
	y=$2
	z=$3
}

# invocation is like running a program
foo a b c

As with script args, you can do varargs using $@ and/or shift:

function deleteAll () {
	rm "$@"
}

function copyAllTo () {
	dst=$1 # take the first param as the destination
	shift # then drop it, leaving the rest in $@
	cp "$@" "$dst"
}

I've never looked into how you do return values.
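
For what it's worth, a sketch of the two usual approaches, as far as i can tell: a function can signal success or failure through its exit status, like a program does, or it can print its result and let the caller capture that. Both function names here are invented for the example:

```shell
function isEven () {
	# the exit status of the last command becomes the function's status
	(( $1 % 2 == 0 ))
}

function double () {
	# "return" a value by printing it
	echo $(( $1 * 2 ))
}

if isEven 4
then
	parity=even
else
	parity=odd
fi

doubled=$(double 21)    # capture the printed output

echo "$parity $doubled"
```

There's also a return builtin, but it only sets the exit status, so it's no good for passing back strings.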

Invocation

To run a script in the current context, rather than spawning a subshell (eg so it can set environment variables):

. script.sh
source script.sh

To set some environment variables for just one invocation of a script, rather than generally in your own context:

VAR=value script.sh
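
A sketch showing that the variable really is only set for that one command. GREETING is a made-up variable, and bash -c stands in for running a script:

```shell
unset GREETING

inside=$(GREETING=hello bash -c 'echo "$GREETING"')    # the child process sees the variable
after=${GREETING:-unset}                               # our own shell still doesn't have it

echo "inside: $inside, afterwards: $after"
```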

Note that an unset variable evaluates to false in a conditional, whereas a set one is true unless it's the empty string (so VERBOSE=0 and VERBOSE=false both count as true). This is an easy way to pass flags - much easier than dicking around with getopt or similar:

if [[ "$VERBOSE" ]]
then
	echo "Reticulating splines"
fi

Invoke that script as one of:

script.sh # quiet
VERBOSE=true script.sh # verbose