Why?
Get to know your shell, the command interpreter for many variants of Unix: Linux, *BSD, and macOS all use Bash as the default login command interpreter. You can now even run Bash on Windows with the Windows Subsystem for Linux (Richard Stallman must be thrilled!).
History / sh vs bash
The Unix operating system was first written in 1969 by Ken Thompson and Dennis Ritchie. The two are the authors of both Unix and the C language, though Thompson was primarily the Unix developer and Ritchie primarily the C developer. Thompson wrote the first version of Unix in August 1969, while his wife was away on vacation with their newborn son:
I allocated a week each to the operating system, the shell, the editor, and the assembler [he told me]... and during the month she was gone, it was totally rewritten in a form that looked like an operating system, with tools that were sort of known, you know, assembler, editor, and shell -- if not maintaining itself, right on the verge of maintaining itself
-- Ken Thompson, describing the origin story of Unix
This original Unix shell is called the Thompson shell. Shell redirection was an early feature of the shell and was present by Unix Version 1, which was released on November 3, 1971. Additionally, some of the tools included in this first release were:
- `cal`
- `cat`
- `chdir` (later shortened to `cd`)
- `chmod` / `chown`
- `cp`
- `date`
- `df`
- `du`
- `ed`: the ancestor of `vi`/`vim`, with the ancestry being: `ed` (editor) -> `em` (editor for "mortals") -> `en` -> `ex` ("extended" en) -> `vi` (visual) -> `vim` (vi "improved")
- `ls`
- `mkdir`
- `su`
Pipes were added in Unix Version 3, which was released in February 1973.
The Thompson shell was intentionally minimalistic and inadequate for programming tasks: it lacked `if` and other flow-control statements. In Unix Version 7 (released 1979), the Thompson shell was replaced by the Bourne shell, authored by Stephen Bourne. The Bourne shell made the shell more like a programming language, adding features such as flow control (`if`/`else`, `case`), variables, and I/O redirection (`>&2`, `2>&1`, `2>my-errors.log`).
Due to the Unix wars and the uncertainty of being able to use Unix in the future because of copyright/licensing issues, Richard Stallman, the founder of the GNU project, began writing open-source replicas of the Unix kernel and Unix tools without using any of the Unix source code. These tools included the C compiler (`gcc`), the C standard library (`glibc`), the core binary executables such as `ls` and `cat`, bundled in a package called `coreutils`, and of course, my personal favorite, Bash.
What you need to know
I assume you already know your way around Bash, so I will skip much of the basic content. However, if you want to brush up, this and this tutorial are good places to start.
Friendly interactive behavior
Here are some settings I add to my ~/.profile
, which is where you place Bash settings:
bind 'set show-all-if-ambiguous on'
bind 'TAB:menu-complete'
bind 'set completion-ignore-case on'
bind 'set visible-stats on'
bind 'set page-completions off'
If you use TAB to autocomplete filenames, you will find this behavior much easier to use. If you are familiar with zsh, this will make Bash feel more like zsh: instead of having to press TAB twice, you can press it once to autocomplete. Additionally, there will be no more bell sound, and Bash will show you a list of files that match your autocomplete, letting you cycle through them with TAB until you get your match.
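Equivalently, since `bind` just configures the readline library, the same settings can live in `~/.inputrc`, where every readline-based program (`bash`, `psql`, `python`, ...) picks them up. A sketch:

```
set show-all-if-ambiguous on
set completion-ignore-case on
set visible-stats on
set page-completions off
TAB: menu-complete
```

The `bind` form applies only to the current Bash session; the `~/.inputrc` form is read by readline at startup.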
Note the complete list of files shown with just one TAB press after typing `cat [TAB]`:
@mbp2:src $ cat
athena* athena.user* email*
@mbp2:src $ cat athena
Exit codes
By default, a Bash script will not exit when a command fails (which is not the same as a syntax error); the script will continue on interpreting. Most programming languages do not behave this way, and it can lead to unexpected behavior in Bash scripts. A sane default I use at the top of my scripts, in addition to the shebang, is:
#!/usr/bin/env bash
set -euo pipefail
This will lead to the script failing on any error or unset variable.
Here are some more details on these settings:
- `set -e` (from `man bash`): Exit immediately if a pipeline, which can consist of a single shell command, exits with a non-zero status;
- `set -u` (from `man bash`): Treat unset variables as an error when performing parameter expansion;
- `set -o pipefail` (from `man bash`): The return value of a pipeline is the value of the last command to exit with a non-zero status, or zero if all commands exited successfully.
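The effect of these flags can be seen in a quick experiment; each `bash -c` child below demonstrates one flag in isolation:

```shell
#!/usr/bin/env bash
# Without pipefail, a failing first command in a pipeline is masked:
bash -c 'false | true'; echo "default:  $?"    # prints 0 -- failure hidden

# With pipefail, the pipeline reports the failure:
bash -c 'set -o pipefail; false | true'; echo "pipefail: $?"    # prints 1

# With -e, the script aborts at the first failing command:
bash -c 'set -e; false; echo unreachable'; echo "errexit:  $?"  # prints 1
```

The `echo unreachable` in the last line never runs: `set -e` terminates the child shell as soon as `false` fails.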
Bash, like many other programming languages, has the concept of exit codes. The exit code of the prior command is stored in the `$?` variable. Exit codes 1-2, 126-165, and 255 all have special meanings and should be reserved for commands that warrant them. Any exit code can be returned, however, as demonstrated:
@mbp2:~ $ /usr/local/bin/bash -c 'exit 0'; echo $?
0
@mbp2:~ $ /usr/local/bin/bash -c 'exit 1'; echo $?
1
@mbp2:~ $ /usr/local/bin/bash -c 'exit 12'; echo $?
12
Variables
I assume you have used variables before, so I will skip most of the content here. However, I occasionally see variable expansions misused. Here are some good rules:
- Always quote the variable expansion. Quoting preserves the whitespace (including any newline characters) embedded in the variable's value, while an unquoted expansion undergoes word splitting and globbing:
good:
do_some_stuff "$thing"
bad:
do_some_stuff $thing
- Only use curly braces when the expansion is adjacent to non-space characters; omit them everywhere else. Curly braces are intended to explicitly delimit the variable name. If it is not explicitly delimited, the Bash parser, as well as the reader, may assume an adjacent character is part of the variable name:
good:
url="${base_url}/${endpoint}?${query_params}"
good:
msg="the api request returned: $result"
bad:
do_some_stuff ${thing}
bad:
do_some_stuff "${thing}"
- Use single quotes if there are no variables present and reserve double quotes for when using variable expansions:
good:
base_url='https://www.gnu.org/'
bad:
base_url="https://www.gnu.org/"
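To see why the quoting rule matters, here is a small experiment (the variable name is illustrative):

```shell
#!/usr/bin/env bash
thing=$'line one\nline two'   # a value containing an embedded newline

# Unquoted: word splitting collapses the newline into a single space.
echo $thing     # -> line one line two

# Quoted: the newline survives intact.
echo "$thing"   # -> line one
                #    line two
```

Unquoted expansion also glob-expands any `*` or `?` in the value, which is a common source of subtle bugs.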
Named arguments
Bash uses positional arguments for function declarations and function calls. However, many command line programs offer flags to pass in content or optional behavior. Here is a way to get that same behavior with your custom Bash scripts:
tmp_dir="$(mkdir -p /tmp/email && echo '/tmp/email')"
report=''
distro_list=''
html=''
date_override=''
body_override=''
usage(){
echo "Usage: email: ${0} [--report <file_path>] [--distro-list <'distro@list.com'>]" 1>&2
echo " [--html] [--date-override <date>] [--body-override <body>]" 1>&2
echo " Do not use --html and body override in the same call. " 1>&2
exit 1
}
while [[ $# -gt 0 ]]; do
case "$1" in
-r|--report) report="$2"; shift ;;
-l|--distro-list) distro_list="$2"; shift ;;
-h|--html) html='y' ;;
-d|--date-override) date_override="$2"; shift;;
-b|--body-override) body_override="$2"; shift;;
*) break ;;
esac
shift
done
if [[ -z $report ]] || [[ -z $distro_list ]]; then usage; fi
if [[ ! -z $html ]] && [[ ! -z $body_override ]]; then usage; fi
email(){
local report="$1"
local distro_list="$2"
local html="$3"
local date_override="$4"
local body_override="$5"
if [[ $(whoami) == 'root' ]]; then # docker (k8s, odroid, pi)
curl_email "$report" "$distro_list" "$html" "$date_override" "$body_override"
elif [[ $(whoami) == 'sbx_'* ]]; then # AWS Lambda
curl_email "$report" "$distro_list" "$html" "$date_override" "$body_override"
elif [[ $(whoami) == 'skilbjo' ]]; then # mac OS
curl_email "$report" "$distro_list" "$html" "$date_override" "$body_override"
fi
}
email "$report" "$distro_list" "$html" "$date_override" "$body_override"
In the above script, I declared my global variables at the top of the script but initialized them with empty values. I then defined a function that prints usage and exits with an error, a common pattern in command-line programs. Next, a while loop parses the "$@" arguments, similar to the `argc`/`argv` pattern in C's `main`, setting the global variables based on the flags. After that, I check whether the required arguments have been set; if not, the usage function is called, exiting the script with an error code. Finally, the script logic is declared as functions, and I call the main function, in this case `email`, entering the logic section, which here sends a custom email.
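The parsing loop can be exercised in isolation. This stripped-down sketch keeps only two of the flags to show the mechanics:

```shell
#!/usr/bin/env bash
# Minimal version of the flag-parsing loop above, reduced to two flags.
parse(){
  local report='' html=''
  while [[ $# -gt 0 ]]; do
    case "$1" in
      -r|--report) report="$2"; shift ;;   # flag with a value: consume it
      -h|--html)   html='y' ;;             # boolean flag: no extra shift
      *) break ;;
    esac
    shift
  done
  echo "report=$report html=$html"
}

parse --report weekly.csv --html   # -> report=weekly.csv html=y
```

Note the asymmetry: flags that take a value need one extra `shift` to consume the value, while boolean flags rely only on the `shift` at the bottom of the loop.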
Sourcing files
A way to segment a potentially large Bash script/application would be to split your application into multiple smaller files, and load those files into memory at run-time. This can be done with:
file `foo`:
bar(){
echo 'I ran from a sourced file!'
}
file `run-it`:
#!/usr/bin/env bash
source foo
bar
would return:
$ ./run-it
I ran from a sourced file!
$
This is similar to how libraries work in other programming languages, such as C, Python, and Clojure. However, the Unix process model is geared toward executing independent programs: if the functions are helper functions or more of a library, they should be sourced in from the application's entrypoint. If they are independent programs, they should be invoked like any other Unix binary. If the script takes arguments, you should use the named-arguments approach I referenced above.
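One related pattern (my addition, not from the text above): a file can serve as both a sourceable library and a standalone program by checking `BASH_SOURCE`, analogous to Python's `if __name__ == "__main__"`:

```shell
#!/usr/bin/env bash
bar(){
  echo 'I ran from a sourced file!'
}

# Run bar only when this file is executed directly, not when sourced:
# when executed, $0 is this file's path; when sourced, $0 is the caller's.
if [[ "${BASH_SOURCE[0]}" == "$0" ]]; then
  bar
fi
```

Executing the file runs `bar`; sourcing it merely loads the function definition.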
Aliases (you can inline a function!)
When becoming a wizard with the CLI, you may wish to use your own shortcuts. Place aliases in an ~/.aliases file, with the following syntax:
alias h='cd ~'
alias mkdir='mkdir -p'
alias vim='vim -p'
alias man='function _(){ /usr/bin/man "$1" | col -xb | vim -;};_'
alias ytdl='function _(){ cd ~/Desktop/; youtube-dl -x --audio-format mp3 "$1" & cd -;};_' # download youtube songs
alias "psql.dw"='function _psql(){ psql "$db_uri" -c "$1"; };_psql'
alias x='exit'
alias a='cd ~/dev/aeon/'
alias m='cd ~/dev/markets-etl/'
alias b='cd ~/dev/bash-etl/'
alias d='cd ~/dev'
alias 'netstat.osx'='echo "Proto Recv-Q Send-Q Local Address Foreign Address (state)" && netstat -an | grep LISTEN'
A caveat is that you cannot use aliases with sudo or watch. For example:
@mbp2:cdmtr $ echo "alias 'docker.ssh'='docker-machine ssh default'" >>~/.aliases
@mbp2:cdmtr $ source ~/.aliases
@mbp2:cdmtr $ docker.ssh
docker@default:~$ exit
@mbp2:cdmtr $ watch -n5 docker.ssh
sh: docker.ssh: command not found
^C
@mbp2:cdmtr $
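There is a documented workaround for the `sudo` case (from `man bash`, not shown above): if an alias value ends in a space, Bash also checks the word following the alias for alias expansion:

```shell
# man bash, ALIASES: "If the last character of the alias value is a
# blank, then the next command word following the alias is also
# checked for alias expansion."
alias sudo='sudo '
alias ll='ls -la'
# `sudo ll` now expands ll, so sudo actually runs `ls -la`
```

Note that non-interactive shells only honor aliases after `shopt -s expand_aliases`, which is why the trick works in your login shell but not inside most scripts.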
Style guide / Application development philosophy
- Variable expansion: see above
- Use functions as much as possible. The only things at the global scope of your script should be the shebang, (limited) global variables, function declarations, and the entrypoint call into the main function. Sourcing of any additional files can happen inside a function, for example a setup function.
- Variables: Use local variables as much as possible. Passing data to and from functions as arguments is better than using global variables.
- Indentation: 2 spaces
- No tabs
- Variables, functions, and file names all have descriptive names, unless using standard convention, for example, using i in for loop with i being the incrementor.
- Use `"$(command substitution)"` instead of `` `backticks` ``.
- Use single quotes for strings with no variable expansion, and double quotes when using variable expansions. For example, base_url='https://www.codementor.io' and full_url="${url}/${endpoint}?${query_parameters}".
- Generic functions that do one small thing well are preferable.
Dotfiles + Github / Dropbox
I like my comfy Bash setup and don't want to have to context switch when moving from my work computer to my home computer, single-board-computer cluster, AWS EC2 instances, and others.
I place my Bash settings in a Dropbox folder and create a symbolic link at ~/.bashrc that points to it. This works nicely: I can test a new setting across my main computers before committing it to a git repository and syncing it across all the other devices, many of which (like the Linux and FreeBSD machines) do not have Dropbox installed but are able to use git.
Here is a tutorial to get started.
I even embed my favorite bashrc settings in docker containers I develop: see this example
Conclusion
The investment you make in your core toolset will pay dividends for the rest of your career. While the programming languages you use may change, it is nearly guaranteed that you will need a shell. Bash is important in the day-to-day; your investment here will make you a better programmer.
Some great / handy programs
- bash-completion: not a program per se, but a handy tool that lets you use TAB completion for program arguments, like git's
- man: format and display the on-line manual pages
- htop: interactive process viewer / a more modern `top` command
- tree: visual version of the `ls` command
- ack: quickly search the contents of many files
- vim: Vi IMproved, a programmer's text editor
- tmux: terminal multiplexer
- bc: evaluate simple mathematical expressions. For example, sleep "$(echo '60 * 5' | bc)" # sleep for 5 min
- gzip: compression/decompression tool. compress: gzip -9 [file]. decompress: gzip -d [file].gz
- watch: run the same command repeatedly in an ncurses window. watch -n1 'docker ps | grep my-container'
Other shells to explore
- ash / dash: almquist shell / debian almquist shell
- zsh: z shell
- ksh: korn shell
- tcsh: c shell
- fish: a modern shell
Linux: Linux is customizable, so distribution maintainers may craft special versions of Linux that do not use Bash. An example is Alpine Linux, a popular distribution for Docker containers, which uses the Almquist shell, a lightweight clone of the Bourne shell. Linux is also referred to as GNU/Linux.
Unix itself was spun from a project, Multics [Multiplexed Information and Computing Service], which was developed from the CTSS [Compatible Time-Sharing System] project at MIT. There is an incredible video from 1963 describing the reason why operating systems are important and how they work.
References
- Appendix E. Exit Codes With Special Meanings
- Ancient Unix
- Research Unix
- The GNU and the Penguin, a History of Free and Open Source
- Bourne Shell
- Bash (Unix Shell)
- Almquist Shell
- GNU + Linux
- hilarious article about deleting /bin/bash