Finding Files on the Command Line
When I first started programming, I used a GUI program to find files. I thought
it was a great tool. It even supported regular expressions, but one day it
started crashing and I could no longer depend on it. Then I started using a
combination of Unix command line tools, and they became my first choice for
file search. Searching directly in IDEs and Emacs is still useful.
We will look at how to use find and grep to find files.
Find files by name
Open your command line program and go to a directory with not too many files.
Run the following command.
find .
This will recursively print every entry (regular files, directories, hidden
files, symbolic links, etc.) in the current directory and its subdirectories.
They will be shown as relative paths. With GNU find, running find with no
arguments does the same thing; other implementations, such as BSD find, may
require the starting path.
We can change it to print absolute paths by starting the search from the
absolute path of the current directory.
find "$(pwd)"
"$()" is command substitution: the shell runs the command inside and replaces
it with that command's output. pwd prints the present working directory, so
find starts from an absolute path and prints every entry it finds with an
absolute path.
Now let's find only files that have the .js extension.
find . -iname "*.js"
-iname matches file names against the given pattern, ignoring case. The
case-sensitive version is -name. * is a wildcard that matches zero or more
characters, so "*.js" matches any name that ends with .js. This is a very
common pattern in find.
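To see the difference between -name and -iname without touching your own files, here is a small sketch that runs both in a throwaway directory. The directory and the file names (main.js, LEGACY.JS, notes.txt) are made up for the illustration.

```shell
# Throwaway sandbox so the example does not depend on your own files.
demo_dir=$(mktemp -d)
touch "$demo_dir/main.js" "$demo_dir/LEGACY.JS" "$demo_dir/notes.txt"

# -name is case sensitive: only the lowercase extension matches.
name_matches=$(cd "$demo_dir" && find . -name "*.js" | LC_ALL=C sort)

# -iname ignores case: both .js and .JS match.
iname_matches=$(cd "$demo_dir" && find . -iname "*.js" | LC_ALL=C sort)

echo "-name:"
echo "$name_matches"
echo "-iname:"
echo "$iname_matches"

# Clean up the sandbox.
rm -r "$demo_dir"
```

The pattern is quoted in both commands so that find receives it as written, instead of the shell expanding it first.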
What if we want to find a file that starts with i and ends with .js, and may
have other characters in between?
find . -iname "i*.js"
Finally, we can also look for words in the middle of the file name.
find . -iname "*index*"
This does not restrict the file to any particular extension. What would the
query look like if we wanted to find something that has index somewhere in the
name and ends with the extension .js? Also try finding files that end in
another extension in your directory.
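One possible answer to the exercise is to put both parts into a single pattern: a wildcard on each side of index, followed by the extension. The sketch below tries it in a throwaway directory; the file names are invented for the example.

```shell
# Throwaway sandbox with a few hypothetical files.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/src"
touch "$demo_dir/index.js" "$demo_dir/src/reindex-db.js" \
      "$demo_dir/index.html" "$demo_dir/src/app.js"

# "index" anywhere in the name, and the name must end in .js.
index_js=$(cd "$demo_dir" && find . -iname "*index*.js" | LC_ALL=C sort)
echo "$index_js"

# Clean up the sandbox.
rm -r "$demo_dir"
```

index.html is left out because its name does not end in .js, and app.js is left out because it does not contain index.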
Find files by content
grep searches plain text for a regular expression and prints every line that
matches. It is a great tool for searching the contents of one or more files. A
simple search on a single file follows the pattern grep "regexp" /file/path.
Here is an example.
grep "Hello" main.js
This returns every line of main.js that contains the string Hello with exactly
that capitalization. It also matches inside longer words such as HelloWorld.
We can make the search case insensitive by adding -i.
grep -i "Hello" main.js
If we want it to print the line number of each match, add the -n flag.
grep -n "Hello" main.js
We can search a directory recursively by adding the -r flag. Each match is
prefixed with the name of the file where it was found. Add the -n flag as well
if you want the line number.
grep -r "div" src/
We can even restrict the files we search by extension.
grep -r --include="*.js" "div" src
And we can restrict multiple file extensions.
grep -r --include=*.{css,html,js} "div" src
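The brace form relies on bash expanding {css,html,js} into three separate --include options before grep runs. In a shell without brace expansion, the same search can be spelled out by hand. Here is a sketch of that longer form in a throwaway directory, with invented file contents, and with -n added for line numbers.

```shell
# Throwaway sandbox with three hypothetical files that all contain "div".
demo_dir=$(mktemp -d)
mkdir "$demo_dir/src"
echo '<div>page</div>' > "$demo_dir/src/page.html"
echo 'div { margin: 0; }' > "$demo_dir/src/site.css"
echo 'a div in a plain text note' > "$demo_dir/src/notes.txt"

# One --include per pattern, each quoted so the shell leaves the * alone.
grep_hits=$(cd "$demo_dir" &&
  grep -rn --include="*.html" --include="*.css" "div" src | LC_ALL=C sort)
echo "$grep_hits"

# Clean up the sandbox.
rm -r "$demo_dir"
```

notes.txt also contains div, but it is skipped because its name matches neither pattern.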
Combining find and grep
There is some overlap between find and grep when it comes to filtering by file
name, but I like to use find to filter the file names, then grep to filter the
file contents, connecting the two with the pipe | operator. It is a nice way to
separate concerns even though it might not be necessary.
First example: find all the JavaScript files that have "hello" in them.
find . -iname "*.js" | xargs grep -i "hello"
What is xargs? It reads the file paths that find prints and passes them to grep
as arguments, so grep opens and searches each of those files. Without xargs,
grep would search the piped text itself, which is just the list of file names.
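One caveat with the plain pipeline: by default xargs splits its input on whitespace, so a path such as "my script.js" is handed to grep as two separate arguments. A common workaround, sketched below in a throwaway directory with invented file names, is to separate paths with NUL bytes using find -print0 and xargs -0, or to let find call grep directly with -exec. The -l flag, which is not used elsewhere in this article, makes grep print only the names of matching files.

```shell
# Throwaway sandbox; one hypothetical file name contains a space.
demo_dir=$(mktemp -d)
echo 'console.log("Hello");' > "$demo_dir/my script.js"
echo 'console.log("bye");'   > "$demo_dir/other.js"

# -print0 and -0 separate paths with NUL bytes, so the space survives.
safe_hits=$(cd "$demo_dir" &&
  find . -iname "*.js" -print0 | xargs -0 grep -il "hello")
echo "$safe_hits"

# Alternative without xargs: find passes the paths to grep itself.
exec_hits=$(cd "$demo_dir" &&
  find . -iname "*.js" -exec grep -il "hello" {} +)
echo "$exec_hits"

# Clean up the sandbox.
rm -r "$demo_dir"
```

Both forms should list only the file that contains "hello", with its name intact.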
More find commands
find can exclude hidden files.
find . -not -path "*/\.*"
find can also combine multiple search queries. However, this is a bit more
verbose than --include=*.{html,js} from grep. -o (or) is used to connect the
queries.
find . -name "*.html" -o -name "*.js"
find can exclude directories from your search. Unfortunately this option is
very verbose: for each directory you have to add -path "./exclude/path" -prune,
then connect it with -o, and make sure the query ends in -print.
Search the current directory excluding ./node_modules.
find . -path "./node_modules" -prune -o -print
Search the current directory excluding ./node_modules and ./lib.
find . -path "./node_modules" -prune -o -path "./lib" -prune -o -print
There is a form that is simpler to remember, combining find with grep, but it
performs worse because find still walks all the files and grep then removes
the unwanted paths. Note that grep -v drops every path that contains
node_modules anywhere in it.
find . -name "*.js" | grep -v "node_modules"
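Finally, we can combine a couple of the things we learned here: exclude
node_modules and search for files that have the .html or .js extension. The
name tests are grouped in escaped parentheses because -o binds more loosely
than the implicit "and" between adjacent tests; without the grouping, the query
would not filter by extension as intended.
find . -path "./node_modules" -prune -o \( -name "*.html" -o -name "*.js" \) -print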
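A query that mixes -prune, -o and several name tests is easy to get subtly wrong, so it can help to check it in a throwaway directory first. The sketch below assumes an invented layout with one file inside node_modules and three outside it, and groups the two name tests in escaped parentheses so that the prune branch and the extension filter are evaluated as separate alternatives.

```shell
# Throwaway sandbox with a hypothetical project layout.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/node_modules/pkg" "$demo_dir/src"
touch "$demo_dir/index.html" "$demo_dir/src/app.js" \
      "$demo_dir/src/style.css" "$demo_dir/node_modules/pkg/index.js"

# Either prune ./node_modules, or print entries ending in .html or .js.
filtered=$(cd "$demo_dir" &&
  find . -path "./node_modules" -prune -o \
         \( -name "*.html" -o -name "*.js" \) -print | LC_ALL=C sort)
echo "$filtered"

# Clean up the sandbox.
rm -r "$demo_dir"
```

The file under node_modules and the .css file should both be missing from the output.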
Learn more
I suggest you try some queries on your own and write them down. The more you
practice, the easier it will be to recall the syntax, but if you forget, you
can take a look at what you have written down previously.
The reverse-i-search (CTRL-r) is a
great way to build searches incrementally by making a small search, running it,
then getting it again from the command line and refining it.
If there are some search queries you use repeatedly you can create a bash alias
to save time.
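An alias only substitutes text at the start of a command, so for a query where the pattern changes each time, a small shell function is often more convenient. Here is a minimal sketch; the name findsrc and the choice to skip ./node_modules are only assumptions for the example, so adjust them to your own projects.

```shell
# Hypothetical helper: list entries whose name matches a pattern,
# ignoring case and skipping ./node_modules.
# Usage: findsrc "*.js"
findsrc() {
  find . -path "./node_modules" -prune -o -iname "$1" -print
}
```

Put the function in your ~/.bashrc to have it available in every new shell, and remember to quote the pattern when you call it.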
find and grep are easy to get started with, but have a lot of options. To
learn more, check out their man pages, Stack Overflow, or the DigitalOcean
tutorials.
Other tools
Here are some other search tools that are worth looking at.
Cool usage examples. If you can install rg and fd, give those a try. I'd
suggest installing those via cargo. I've been really enjoying using rg instead
of grep, and no, ripgrep does not mean RIP grep, even though you could make a
solid case for it.