
Linux Text Processing for Sysadmins

The Power Trio of Text Processing

Logs, configs, and data outputs are all text. The ability to search, filter, and transform text efficiently separates those who manually copy-paste from those who process gigabytes of logs in seconds.

grep finds. sed replaces. awk structures. Together, they’re an analytical powerhouse that’s been solving problems since before most of us were born. Master these three and you’ll handle any text-based troubleshooting thrown at you.
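A quick taste of the trio working together (a sketch, assuming the classic syslog format where field 5 is the process name):

# Which services are logging errors, and how often?
grep -i "error" /var/log/syslog | awk '{print $5}' | \
  sed 's/\[[0-9]*\]//; s/:$//' | sort | uniq -c | sort -rn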

grep: Finding Things

Basic Usage

# Find lines containing pattern
grep "error" /var/log/syslog

# Case-insensitive
grep -i "error" /var/log/syslog

# Show line numbers
grep -n "error" /var/log/syslog

# Count matching lines
grep -c "error" /var/log/syslog

# Invert (lines NOT matching)
grep -v "debug" /var/log/syslog

Context Lines

# 3 lines before match
grep -B 3 "error" logfile

# 3 lines after match
grep -A 3 "error" logfile

# 3 lines before and after
grep -C 3 "error" logfile

Recursive Search

# Search all files in directory
grep -r "pattern" /path/

# Only certain file types
grep -r --include="*.log" "error" /var/log/

# Exclude directories
grep -r --exclude-dir=".git" "TODO" ./
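When you only need to know which files match, -l prints filenames instead of matching lines:

# List only the names of matching files
grep -rl "pattern" /path/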

Regular Expressions

# Extended regex
grep -E "error|warning|critical" logfile

# Match IPv4-style addresses (loose: also matches 999.999.999.999)
grep -E "[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}" logfile

# Match email pattern
grep -E "[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}" file

# Word boundary (whole word only)
grep -w "error" logfile  # Won't match "errors"
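These print whole matching lines. To pull out just the matched text, add -o — useful for extracting IPs or emails from mixed log lines:

# Print only the matched part, one match per line
grep -oE "[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}" logfile | sort -u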

Practical grep Examples

# Failed SSH logins
grep "Failed password" /var/log/auth.log

# HTTP 500 errors in nginx (" 500 " can also match byte counts;
# awk '$9 == 500' is stricter for the combined log format)
grep " 500 " /var/log/nginx/access.log

# Find config files mentioning a service
grep -r "database_host" /etc/

# Find python processes (excluding the grep itself)
ps aux | grep -v grep | grep python

sed: Stream Editing

sed transforms text on the fly. Most commonly used for find-and-replace.

Basic Substitution

# Replace first occurrence on each line
sed 's/old/new/' file

# Replace all occurrences
sed 's/old/new/g' file

# Case-insensitive replace (the i flag is a GNU sed extension)
sed 's/old/new/gi' file

# Edit file in place
sed -i 's/old/new/g' file

# Backup before in-place edit
sed -i.bak 's/old/new/g' file
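One more substitution trick: s accepts any delimiter, which saves escaping every slash when replacing paths:

# Use | as the delimiter to avoid escaping slashes
sed 's|/var/www/old|/var/www/new|g' file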

Deletion

# Delete lines matching pattern
sed '/pattern/d' file

# Delete empty lines
sed '/^$/d' file

# Delete lines 5-10
sed '5,10d' file

# Delete first line
sed '1d' file

# Delete last line
sed '$d' file

Insertion and Appending

# Insert line before match
sed '/pattern/i\new line' file

# Append line after match
sed '/pattern/a\new line' file

# Replace entire line
sed '/pattern/c\replacement line' file

Practical sed Examples

# Change config value
sed -i 's/listen 80/listen 8080/g' nginx.conf

# Remove comments
sed '/^#/d' config.file

# Add prefix to lines
sed 's/^/PREFIX: /' file

# Extract between markers
sed -n '/START/,/END/p' file

# Multiple operations
sed -e 's/foo/bar/g' -e 's/baz/qux/g' file
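Capture groups let sed rearrange text rather than just replace it. With -E (extended regex), \1, \2, \3 refer back to the parenthesised groups:

# Rearrange DD/MM/YYYY dates into YYYY-MM-DD
sed -E 's/([0-9]{2})\/([0-9]{2})\/([0-9]{4})/\3-\2-\1/' file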

awk: Data Extraction

awk treats text as structured data with fields. Perfect for logs and CSVs.

Field Extraction

# Print first field (space-separated)
awk '{print $1}' file

# Print multiple fields
awk '{print $1, $3}' file

# Custom delimiter
awk -F: '{print $1}' /etc/passwd

# Print last field
awk '{print $NF}' file

# Print second-to-last
awk '{print $(NF-1)}' file

Filtering

# Lines where field equals value
awk '$3 == "error" {print}' file

# Numeric comparison
awk '$5 > 100 {print}' file

# Pattern match
awk '/error/ {print}' file

# Combine conditions
awk '$3 == "error" && $5 > 100 {print}' file

Calculations

# Sum a column
awk '{sum += $1} END {print sum}' file

# Average
awk '{sum += $1; count++} END {print sum/count}' file

# Count occurrences
awk '{count[$1]++} END {for (i in count) print i, count[i]}' file
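Tracking a maximum follows the same pattern — seed the variable on the first record, then update it:

# Maximum value in column 1
awk 'NR == 1 || $1 > max {max = $1} END {print max}' file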

Formatting Output

# Printf formatting
awk '{printf "%-20s %10d\n", $1, $2}' file

# Add header
awk 'BEGIN {print "Name\tCount"} {print $1 "\t" $2}' file

# Custom separator in output
awk -F: 'BEGIN {OFS=","} {print $1, $3, $7}' /etc/passwd

Practical awk Examples

# nginx: Top 10 IPs by requests
awk '{print $1}' access.log | sort | uniq -c | sort -rn | head

# Sum of response sizes
awk '{sum += $10} END {print sum}' access.log

# Average response time
awk '{sum += $NF; count++} END {print sum/count}' access.log

# Requests per hour ($4 is [dd/Mon/yyyy:hh:mm:ss; keep the date and hour)
awk '{print substr($4, 2, 14)}' access.log | sort | uniq -c

# Users with bash shell
awk -F: '$7 ~ /bash/ {print $1}' /etc/passwd

# Disk usage over 80% ($5+0 turns "82%" into 82 and the header into 0)
df -h | awk '$5+0 > 80 {print $1, $5}'

Combining the Tools

Pipeline Power

# Find errors, extract IPs, count, sort
grep "error" access.log | awk '{print $1}' | sort | uniq -c | sort -rn | head

# Replace and filter
sed 's/ERROR/ALERT/g' logfile | grep "ALERT"

# Complex log analysis
grep " 500 " access.log | \
  awk '{print $1, $4, $7}' | \
  sed 's/\[//g' | \
  sort | uniq -c | sort -rn

Real Scenarios

Analyze 500 Errors

# Which URLs are causing 500 errors?
grep " 500 " access.log | \
  awk '{print $7}' | \
  sort | uniq -c | sort -rn | head -10

Find Slow Requests

# Requests taking over 1 second (if response time is last field)
awk '$NF > 1.0 {print}' access.log
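Rather than a fixed threshold, you can sort the times to surface the worst offenders (same assumption: response time is the last field, which for nginx means $request_time is in your log_format):

# Ten slowest response times
awk '{print $NF}' access.log | sort -rn | head -10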

Config File Manipulation

# Update multiple config values
sed -i \
  -e 's/database_host=.*/database_host=newdb.local/' \
  -e 's/cache_size=.*/cache_size=1024/' \
  config.ini

User Audit

# Users who logged in yesterday (%e space-pads the day to match syslog;
# $11 assumes the username position in pam_unix "session opened" lines)
grep "$(date -d yesterday '+%b %e')" /var/log/auth.log | \
  grep "session opened" | \
  awk '{print $11}' | sort -u
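If the wtmp records are intact, last gives a cross-check — a sketch assuming a util-linux last with --since/--until support:

# Same question via login records
last --since yesterday --until today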

Quick Reference

# grep
grep "pattern" file              # Basic search
grep -i "pattern" file           # Case-insensitive
grep -r "pattern" /path/         # Recursive
grep -E "pat1|pat2" file         # Extended regex
grep -v "pattern" file           # Invert match
grep -C 3 "pattern" file         # Context lines

# sed
sed 's/old/new/' file            # Replace first
sed 's/old/new/g' file           # Replace all
sed -i 's/old/new/g' file        # In-place edit
sed '/pattern/d' file            # Delete matching lines
sed -n '10,20p' file             # Print lines 10-20

# awk
awk '{print $1}' file            # Print first field
awk -F: '{print $1}' file        # Custom delimiter
awk '$3 > 100' file              # Numeric filter
awk '/pattern/ {print}' file     # Pattern match
awk '{sum+=$1} END {print sum}'  # Sum column

Interview Questions

  • “How would you find all unique IPs in an nginx access log?”
    awk '{print $1}' access.log | sort -u
    # Or with counts:
    awk '{print $1}' access.log | sort | uniq -c | sort -rn
    
  • “Replace a value in all config files in a directory?”
    find /etc/myapp -name "*.conf" -exec sed -i 's/oldvalue/newvalue/g' {} \;
    
  • “Find the top 10 requested URLs from a log file?”
    awk '{print $7}' access.log | sort | uniq -c | sort -rn | head -10
    

The Career Translation

Skill               Demonstrates               Role Level
-----               ------------               ----------
Basic grep          Can search logs            Helpdesk (£25-30k)
grep + sed          Text manipulation          Junior Sysadmin (£32-40k)
Full grep/sed/awk   Data analysis capability   Mid-level (£40-50k)
Complex pipelines   Automation mindset         Senior/DevOps (£50k+)

Next Steps

  • jq – JSON processing (modern log formats)
  • Perl/Python one-liners – When awk isn’t enough
  • xargs – Combine with find for batch operations
  • Regular expressions – Deep dive for complex patterns

These tools have been solving text problems for 50 years. Learn them well and you’ll solve problems faster than anyone reaching for Python.


Part 11 of the Linux Fundamentals series. Final part coming: disk management with df, du, lsblk, and mount.


Linux Fundamentals Series – Part 11 of 12

Previous: SSH Essentials

Next: Disk Management: df, du, lsblk

View the full series
