How to Use grep with Regex
A regular expression (regex) describes a search pattern. Here are the most useful pieces for day-to-day work.
Tip on quoting: wrap your pattern in single quotes
'...'so the shell doesn’t change special characters.
Literals: Match exact text
grep 'rack A3' notes.txt
Anchors: start ^ and end $
grep '^2025-08-14' server.log # lines that start with the date
grep 'daily\.$' notes.txt # lines that end with the word "daily."
Any single character: .
grep 'A.ice' data.csv # matches "Alice" and "A-ice" (any one char between A and ice)
Character sets: [ ... ] and ranges
grep 'user=[abc][a-z]*' server.log # user=a..., user=b..., or user=c..., followed by letters
Negation with [^...] (anything except):
grep 'code=[^0-9]' server.log # code= followed by a non-digit
POSIX classes
POSIX character classes are named sets of characters you use inside a bracket expression.
Syntax: they must be wrapped like this: [[:digit:]], [[:alpha:]], [[:space:]].
- The outer square brackets
[...]mean “match one character from this set.” - The inner
[:digit:],[:alpha:], etc., name the set. -
They’re locale-aware:
[:alpha:]can include accented letters in your language settings. - Common classes:
[:digit:]→ digits 0–9[:alpha:]→ letters (A–Z and a–z; plus locale letters)[:alnum:]→ letters or digits[:space:]→ whitespace (space, tab, newline, etc.)[:lower:],[:upper:],[:punct:],[:blank:],[:xdigit:](hex),[:print:],[:graph:]
Important: [:digit:] alone won’t work. It must be inside [...]: [[:digit:]].
cat data.csv
id,name,score
1,Alice,92
2,Bob,81
3,Caroline,92
4,Ali,70
5,O'Neil,77
6,Jean-Luc Picard!,99
6,Jean-Luc Picard,99
Pattern 1:
grep -E '^[[:digit:]]+,[[:alpha:]]+,[[:digit:]]+$' data.csv
1,Alice,92
2,Bob,81
3,Caroline,92
4,Ali,70
Goal: match lines like 123,Alice,92
^— start of line (anchor).[[:digit:]]+— one or more digits → the ID field.,— a literal comma (delimiter).[[:alpha:]]+— one or more letters only (A–Z, a–z; plus locale letters) → the Name field with no spaces or hyphens (e.g., Alice, Bob).,— comma.[[:digit:]]+— one or more digits → the Score field.$— end of line (anchor).
Doesn’t match: 1,Ali ce,92 (space), 1,Mary-Jane,92 (hyphen), header id,name,score (no leading digits).
Pattern 2:
grep -E '^[[:digit:]]+,[[:alpha:][:space:]-]+,[[:digit:]]+$' data.csv
1,Alice,92
2,Bob,81
3,Caroline,92
4,Ali,70
6,Jean-Luc Picard,99
Goal: allow multi-word or hyphenated names: 123,Mary Jane-Smith,92
What changed
- Middle field:
[[:alpha:][:space:]-]+one or more of:[:alpha:]— letters,[:space:]— whitespace (space, tab, etc.),-— a literal hyphen
So names like “Alice Smith” or “Mary-Jane” now match.
Doesn’t match: 5,O’Neil,77 (apostrophe not allowed), 6,Jean-Luc Picard!,99 (! not allowed)
- Key differences (at a glance)
- Pattern 1: Name = letters only
- Pattern 2: Name = letters + spaces + hyphens
- Both require:
- Field 1:
digitsField 2:alphaField 3:digits - Fields separated by commas, line fully matched due to
^and$.
- Field 1:
Allow apostrophes in names:
grep -E '^[[:digit:]]+,[[:alpha:][:space:]'\''-]+,[[:digit:]]+$' data.csv
1,Alice,92
2,Bob,81
3,Caroline,92
4,Ali,70
5,O'Neil,77
6,Jean-Luc Picard,99
(Note the escaped ' inside single quotes.)
Allow only space (not tabs) between words:
grep -E '^[[:digit:]]+,[[:alpha:] -]+,[[:digit:]]+$' data.csv
(That middle class includes ` ` and -, but not [:space:].)
if you need non-ASCII letters (e.g., accents), ensure your locale is set (e.g., LC_ALL=en_US.UTF-8), because [:alpha:] is locale-aware.
Repeaters: *, +, ?, {m,n}
Use extended regex with -E to write these cleanly.
*→ zero or more+→ one or more?→ zero or one (optional){m,n}→ betweenmandntimes (use{m,}for “at least m”;{m}for “exactly m”)
grep -E 'user=[a-z]+' server.log # one or more letters after user=
grep -E 'size=[0-9]{1,3}' server.log # 1 to 3 digits (000–999)
grep -E 'Ali(ce)?' data.csv # "Ali" or "Alice" (group is optional)
grep -E 'code=[0-9]{3}' server.log # exactly 3 digits (like 200, 403, 500)
grep -E 'v[0-9]+\.[0-9]+' notes.txt # versions like v2.1, v10.12
Grouping (…) and alternation |
- Grouping
(...)lets you treat many characters as one unit. - Alternation
|means “this or that”.
grep -E '(ERROR|WARN)' server.log # lines containing ERROR or WARN
grep -E 'Ali(ce|son)?' data.csv # Ali, Alice, or Alison
grep -E '^(INFO|WARN|ERROR)\b' server.log # line starts with a log level
Escaping special characters
- These are special in regex:
.()[]{}+?*^$|. - To search for them literally, put a backslash before them:
\.\[\]…
grep '\.$' notes.txt # lines that end with a literal period .
grep '\[INFO\]' server.log # literal [INFO]
grep 'price\$' prices.txt # the dollar sign itself
Tip: put your whole pattern in single quotes '...' so the shell doesn’t change *, $, or \.
Extended vs Basic regex
- Basic regex (default):
*works, but+,?,{m,n},|, and()are awkward and often need backslashes. - Extended regex (
-Eor egrep): lets you write+,?,{m,n},|,()naturally. Recommendation: use-Efor clarity and fewer escapes.
Practical mini-tasks (cleaned up)
Count how many ERROR lines happened per user
grep 'ERROR' server.log | grep -o 'user=[a-z]*' | sort | uniq -c
Show only the matching piece (not the whole line)
grep -o 'user=[a-z]*' server.log
Find lines that don’t mention a rack code
grep -v 'rack [A-Z][0-9]\+' notes.txt
Case-insensitive search for “ali” as a whole word
grep -i -w 'ali' data.csv
Find repeated scores (same number appears on multiple lines)
cut -d, -f3 data.csv | tail -n +2 | sort | uniq -d
# Show full rows for a score (example: 92)
grep -E ',92$' data.csv
Performance & safety tips
- Quote your pattern: use single quotes
'pattern'to prevent the shell from expanding*,?, or$. - Use
-Ffor fixed-string searches (no regex): faster and safer for plain text.grep -F 'a.b' filematchesa.bliterally. - Skip binary files with
-I, or force treat as text with-aif needed. - Show filenames with
-H(on by default when searching multiple files). - Context helps debugging: add
-nfor line numbers,--color=autofor highlights, and-C NUMfor context lines. - Locale affects classes like
[[:alpha:]]. For pure byte speed, some admins useLC_ALL=C(advanced). - Portability:
grep -P(Perl regex) is powerful but not guaranteed on all systems; prefer-Eunless you need PCRE features.