Chapter 8: Essential File and Text Utilities


Linux ships with a rich set of command-line utilities for inspecting, finding, and manipulating files. This chapter covers the tools you will use most often — from checking file types and permissions to searching content and managing disk space.


Inspecting Files

file document.pdf                      # determine file type from content, not extension
wc file.txt                            # count lines, words, bytes
wc -l file.txt                         # count lines only
stat file.txt                          # detailed metadata: size, permissions, timestamps
du -sh directory/                      # disk usage of a directory (human-readable)
du -sh *                               # disk usage of each item in the current directory (hidden files excluded)
df -h                                  # filesystem disk space usage (human-readable)
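
A common follow-up to du is ranking entries by size. The sketch below assumes GNU coreutils, where sort -h understands the human-readable suffixes that du -h prints; the scratch directory and the names small and big are made up for illustration.

```shell
# Build a throwaway directory with two entries of different sizes (hypothetical names).
demo=$(mktemp -d)
mkdir "$demo/small" "$demo/big"
head -c 4096 /dev/urandom > "$demo/small/a.bin"       # about 4 KiB
head -c 1048576 /dev/urandom > "$demo/big/b.bin"      # about 1 MiB

# Summarize each entry, sort largest first, and keep the name of the top one.
# sort -h compares sizes such as 8.0K and 1.1M correctly (GNU sort).
largest=$(cd "$demo" && du -sh -- * | sort -rh | head -n 1 | cut -f2)
echo "largest entry: $largest"
```

The same pipeline without the final cut is handy interactively: du -sh * | sort -rh | head shows the ten largest items in the current directory.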

Viewing File Contents

cat file.txt                           # print entire file
less file.txt                          # page through a file (q to quit, / to search)
head file.txt                          # first 10 lines
head -n 20 file.txt                    # first 20 lines
tail file.txt                          # last 10 lines
tail -n 20 file.txt                    # last 20 lines
tail -f logfile.txt                    # follow a file in real time (Ctrl+C to stop)

less is the best choice for large files: it does not load the entire file into memory, supports searching with /, and scrolls in both directions.
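
head and tail can also be chained to pull out a range of lines. A minimal sketch, using a generated 30-line sample file rather than any real log:

```shell
# Create a sample file with 30 numbered lines (demo data only).
sample=$(mktemp)
seq 1 30 | sed 's/^/line /' > "$sample"

# Lines 16-20: take the first 20 lines, then keep the last 5 of those.
middle=$(head -n 20 "$sample" | tail -n 5)
echo "$middle"

# Everything from line 28 onward: tail -n +N starts output at line N.
rest=$(tail -n +28 "$sample")
echo "$rest"

rm -f "$sample"
```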


Finding Files

find: search by attributes

find . -name "*.txt"                   # find by name pattern (case-sensitive)
find . -iname "*.txt"                  # case-insensitive
find /home -type f                     # files only (not directories)
find /home -type d                     # directories only
find . -mtime -7                       # modified in the last 7 days
find . -size +10M                      # files larger than 10 MB
find . -name "*.log" -delete           # find and delete (irreversible; preview first by omitting -delete)
find . -name "*.py" -exec wc -l {} \;  # find and run a command on each result

locate: search a pre-built index

locate filename                        # instant search using a pre-built index
sudo updatedb                          # update the index (run after adding new files)

locate is much faster than find but only knows about files that existed when the index was last updated.
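
Because find -delete cannot be undone, it is worth previewing the match list first. The sketch below works in a scratch directory with made-up file names and also shows the {} + form of -exec, which hands many paths to a single command invocation instead of running the command once per file.

```shell
# Scratch directory with two log files and one file that must survive.
demo=$(mktemp -d)
touch "$demo/app.log" "$demo/old.log" "$demo/notes.txt"

# Step 1: preview. Same expression, with -print in place of -delete.
find "$demo" -type f -name "*.log" -print

# Step 2: delete once the preview looks right.
find "$demo" -type f -name "*.log" -delete

# {} + batches all remaining paths into one wc invocation.
find "$demo" -type f -exec wc -c {} +
```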

Locating commands

which python3                          # full path of an executable in your PATH
whereis python3                        # binary, source, and man page locations
type python3                           # how the shell resolves a command (alias, function, binary)
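
In scripts, the POSIX built-in command -v is a portable way to check that a command exists before relying on it. A small sketch; need_cmd is a hypothetical helper name, not a standard utility:

```shell
# need_cmd (hypothetical helper): succeeds only if the shell can resolve the command.
need_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if need_cmd ls; then
    echo "ls resolves to: $(command -v ls)"
fi

if ! need_cmd no-such-command-xyz; then
    echo "no-such-command-xyz is not available"
fi
```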

Searching File Contents

grep: search for patterns

grep "pattern" file.txt                # search in a file
grep -r "pattern" directory/           # search recursively
grep -i "pattern" file.txt             # case-insensitive
grep -n "pattern" file.txt             # show line numbers
grep -rl "pattern" directory/          # list files that match (not lines); -r is needed to search a directory
grep -v "pattern" file.txt             # invert: lines that do NOT match
grep -c "pattern" file.txt             # count matching lines

For large codebases, ripgrep (rg) is usually much faster than grep and skips files listed in .gitignore by default:

sudo apt install ripgrep               # Debian/Ubuntu; use your distribution's package manager elsewhere
rg "pattern" .                         # fast recursive search
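
The grep flags above combine freely. A minimal sketch against a scratch directory; the file names and contents are invented for the demo:

```shell
# Scratch tree with a few files to search.
demo=$(mktemp -d)
mkdir "$demo/src"
printf 'TODO: refactor\nprint("ok")\n' > "$demo/src/a.py"
printf 'print("done")\n' > "$demo/src/b.py"
printf 'todo: write docs\nTODO: add tests\n' > "$demo/notes.txt"

# -r recurse, -i ignore case, -n show line numbers
grep -rin "todo" "$demo"

# -r with -l: names of files containing a match, one per line
matching=$(grep -rl "TODO" "$demo" | sort)
echo "$matching"

# -i with -c: number of matching lines in a single file
count=$(grep -ic "todo" "$demo/notes.txt")
echo "$count"
```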

File Permissions

Every file has three permission sets: owner, group, others. Each set has three bits: read (r=4), write (w=2), execute (x=1).

ls -l file.txt
# -rw-r--r-- 1 user group 1234 Jan 1 12:00 file.txt
# ||  |  |
# ||  |  +-- others: r-- (read only)
# ||  +----- group:  r-- (read only)
# |+-------- owner:  rw- (read + write)
# +--------- type:   - (file), d (directory), l (symlink)

chmod: change permissions

# Symbolic notation
chmod u+x script.sh                    # add execute for owner
chmod go-w file.txt                    # remove write for group and others
chmod a+r file.txt                     # add read for all (a = all)

# Octal notation (more common in scripts)
chmod 755 script.sh                    # rwxr-xr-x
chmod 644 file.txt                     # rw-r--r--
chmod 600 ~/.ssh/id_ed25519            # rw------- (private key)
chmod 700 ~/.ssh                       # rwx------ (directory)

Common octal values: 755 for executables and directories, 644 for regular files, 600 for sensitive files.
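
Each octal digit is simply the sum of the bits set for one permission set. The sketch below builds 755 from r=4, w=2, x=1 and reads the mode back with stat -c '%a', which assumes GNU stat as shipped on Linux:

```shell
# One digit per permission set: add up r=4, w=2, x=1.
owner=$((4 + 2 + 1))    # rwx
group=$((4 + 1))        # r-x
others=$((4 + 1))       # r-x
mode="$owner$group$others"
echo "$mode"            # 755

# Apply the mode to a scratch file and read it back.
f=$(mktemp)
chmod "$mode" "$f"
actual=$(stat -c '%a' "$f")   # GNU stat: %a prints the octal mode
echo "$actual"
rm -f "$f"
```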

chown: change ownership

sudo chown user file.txt               # change owner
sudo chown user:group file.txt         # change owner and group
sudo chown -R user:group directory/    # recursive

Hard and Symbolic Links

# Symbolic link (like a shortcut; can cross filesystems)
ln -s /path/to/target linkname

# Hard link (another name for the same file; same filesystem only)
ln /path/to/file hardlinkname

# Check where a symlink points
readlink -f linkname                   # -f follows every link in the chain and prints the final absolute path
ls -la                                 # symlinks shown as: name -> target
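
The difference between the two link types shows up when the original name is removed. A minimal sketch in a scratch directory (file names are illustrative; stat -c '%i' assumes GNU stat):

```shell
demo=$(mktemp -d)
echo "hello" > "$demo/original.txt"

ln "$demo/original.txt" "$demo/hard.txt"        # hard link
ln -s "$demo/original.txt" "$demo/soft.txt"     # symbolic link

# A hard link shares the inode of the original; %i prints the inode number.
inode_orig=$(stat -c '%i' "$demo/original.txt")
inode_hard=$(stat -c '%i' "$demo/hard.txt")

# Remove the original name: the hard link still reaches the data,
# while the symlink now points at nothing.
rm "$demo/original.txt"
hard_content=$(cat "$demo/hard.txt")
if [ -e "$demo/soft.txt" ]; then soft_state="ok"; else soft_state="dangling"; fi
echo "$hard_content / $soft_state"
```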

Checking Open Files

lsof /path/to/file                     # which process has this file open
lsof -p 1234                           # all files opened by PID 1234
lsof -i :8080                          # process using port 8080

lsof is invaluable when you cannot delete or unmount something because a process has it open.
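
To see this in action, the sketch below keeps a scratch file open with a background tail -f and asks lsof who holds it. It assumes lsof is installed and skips the demo otherwise; the -t flag prints only process IDs, which is convenient in scripts.

```shell
pids=""
holder=""
# Skip the demo gracefully when lsof is not installed.
if command -v lsof >/dev/null 2>&1; then
    f=$(mktemp)
    tail -f "$f" >/dev/null 2>&1 &      # background process that keeps the file open
    holder=$!
    sleep 1                             # give tail a moment to open the file
    pids=$(lsof -t "$f" 2>/dev/null || true)   # PIDs of processes holding the file
    echo "held open by: $pids"
    kill "$holder"
    rm -f "$f"
fi
```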


Key Takeaways

