Week 7
Advanced Text Editing Techniques
Why Text Editing Mastery Matters in Linux
In Week 6, we learned the essentials of working with text:

- Creating and viewing files with echo and cat
- Editing with nano and vim
- Navigating and inspecting with less, wc, diff, and grep
- Organizing with directories and mv
- Searching with grep and in-editor tools
Those skills gave us the foundation to work with text files safely and effectively. Now, we are ready to move beyond basic viewing and editing. This week, we focus on advanced text editing techniques that let you transform, filter, and analyze text at scale.
P.A.G.E.S. Framework
Each letter of P.A.G.E.S. stands for one core skill covered this week: Print (cat), Analyze (wc), Grep for searching, Extract (cut), and Sort for organizing.
P — Print with cat
The cat command is one of the most widely used text tools in Linux because it gives an immediate view of file contents. System administrators and developers rely on it constantly: to quickly check configuration files, confirm the results of scripts, preview logs, or even combine files together. It is also a common starting point in pipelines, where the output from cat is passed into other tools for filtering, searching, or counting. Its simplicity and speed make it the go-to choice when you want to see raw text without opening an editor. While cat is best for smaller files, it’s also used to glue files together, to append new data into logs, and to send file contents directly into processing commands.
| Use Case | Command | Example | Notes |
|---|---|---|---|
| Print file | cat file.txt | cat notes.txt | Quick way to view small files. |
| Print multiple | cat file1 file2 | cat intro.txt body.txt | Displays files in sequence. |
| Merge into new file | cat file1 file2 > combined.txt | cat jan.csv feb.csv > q1.csv | Creates a single output file. |
| Append to file | cat file1 >> output.txt | cat update.log >> all.log | Adds without erasing existing content. |
| Pipe into search | `cat file \| grep error` | `cat syslog \| grep "error"` | Filters output to matching lines. |
| Pipe into count | `cat file \| wc -l` | `cat data.csv \| wc -l` | Counts lines in the output. |
| Page through file | `cat file \| less` | `cat large.txt \| less` | Scrolls long output one screen at a time. |
| Better alternative | grep "error" file | Instead of `cat file \| grep "error"` | Avoids an unnecessary use of cat. |
| Alternative tool | tail -f logfile.txt | tail -f syslog | Live log monitoring. |
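To see these patterns end to end, here is a minimal sketch you can run in any shell. The file names (jan.csv, feb.csv, q1.csv) are made up for illustration.

```bash
# Create two small sample files so every command below is reproducible.
printf 'alice,90\nbob,85\n' > jan.csv
printf 'carol,88\n' > feb.csv

cat jan.csv                    # print one file
cat jan.csv feb.csv > q1.csv   # merge both months into a new file
printf 'dave,91\n' >> q1.csv   # append a record without erasing q1.csv
cat q1.csv                     # four lines: alice, bob, carol, dave
```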
A — Analyze with wc
The wc command (“word count”) is widely used because it gives a quick numerical summary of a file: how many lines, words, and characters it contains. Administrators use it to check the size of configuration files, verify that a dataset has the expected number of rows, or confirm that a script output is complete. It is also a common step in pipelines, where wc is used to count filtered results from tools like grep. This makes it valuable not just for measuring file size, but also for validating that processing steps are working as intended.
| Use Case | Command | Example | Notes |
|---|---|---|---|
| Count all (lines, words, chars) | wc file.txt | wc notes.txt | Shows full summary. |
| Count lines | wc -l file.txt | wc -l access.log | Common for counting records or log entries. |
| Count words | wc -w file.txt | wc -w report.txt | Useful for documentation or essays. |
| Count bytes | wc -c file.txt | wc -c config.cfg | Measures raw size in bytes (use wc -m for characters). |
| Pipe into wc | grep "error" log.txt \| wc -l | — | Counts how many lines matched a search. |
| Check dataset rows | cat data.csv \| wc -l | — | Confirms the number of records in a file. |
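The sketch below shows wc used both directly and at the end of a pipeline; access.log is a hypothetical file created just for the demo.

```bash
# Build a three-line log where exactly one line contains "error".
printf 'ok\nerror: disk full\nok\n' > access.log

wc -l access.log                  # 3 access.log  (line count)
wc -w access.log                  # word count across the whole file
grep "error" access.log | wc -l   # 1 -- validates how many lines matched
```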
G — Grep for Searching (with Regex)
The grep command is used constantly in Linux because it can scan through files and print only the lines that match what you’re looking for. This is especially valuable for system logs, error messages, and configuration files where scrolling line by line would take too long. grep works with simple keywords, but it becomes far more powerful when combined with regular expressions (regex), which let you search for patterns like numbers, dates, or strings at the start or end of a line. Administrators and developers use grep not only to troubleshoot (finding “error” in a log) but also to analyze datasets (locating all rows with certain fields), or even to confirm whether a configuration contains the right parameters.
| Use Case | Command | Example | Notes |
|---|---|---|---|
| Simple search | grep text file | grep "error" syslog | Finds all lines containing “error.” |
| Show line numbers | grep -n text file | grep -n "server" config.txt | Helps navigate to exact spots in configs. |
| Case-insensitive | grep -i text file | grep -i "warning" log.txt | Matches regardless of letter case. |
| Regex search | grep -E "[0-9]{3}" file | grep -E "[0-9]{3}" data.txt | Matches three-digit numbers. |
| Search start of line | grep "^pattern" file | grep "^server" config.txt | Finds lines beginning with “server.” |
| Search with context | grep -C 2 text file | grep -C 2 "error" syslog | Shows matching lines with 2 lines above and below. |
| Pipe search | cat file \| grep text | cat syslog \| grep "timeout" | Works in pipelines, though direct file search is faster. |
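Here is a short, self-contained sketch of these grep patterns; config.txt is a hypothetical file seeded with lines that exercise each flag.

```bash
# Sample config with mixed case, a comment line, and a number.
printf 'server alpha\nport 8080\n# server notes\nServer beta\n' > config.txt

grep -n "server" config.txt     # literal match, with line numbers
grep -i "server" config.txt     # case-insensitive: also catches "Server beta"
grep "^server" config.txt       # anchor: only lines that BEGIN with "server"
grep -E "[0-9]{3}" config.txt   # extended regex: any run of three digits
```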
E — Extract with cut
The cut command is used to pull out specific columns or character ranges from text files, which is especially useful for structured data like CSVs or log files. Instead of reading every field, you can extract just the pieces you need, such as names, scores, or dates. The tr command (translate) complements this by transforming or cleaning up characters — for example, converting lowercase to uppercase, deleting digits, or replacing spaces with newlines. Together, cut and tr allow quick parsing and cleanup of text without opening a full editor.
| Use Case | Command | Example | Notes |
|---|---|---|---|
| Extract a column | cut -d',' -f1 file.csv | cut -d',' -f1 scores.csv | Gets the first column (e.g., names). |
| Extract multiple fields | cut -d',' -f1,3 file.csv | cut -d',' -f1,3 records.csv | Grabs name and department. |
| Extract characters | cut -c1-5 file.txt | cut -c1-5 ids.txt | Shows first 5 characters per line. |
| Combine with sort | cut -d',' -f2 scores.csv \| sort -n | — | Sorts scores in numeric order. |
| Delete characters | tr -d '0-9' < file.txt | — | Removes all digits from text. |
| Convert case | tr 'a-z' 'A-Z' < file.txt | — | Converts text to uppercase. |
| Split into lines | tr ' ' '\n' < words.txt | — | Breaks words into separate lines for counting or sorting. |
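A quick sketch of cut and tr working together; scores.csv and words.txt are hypothetical files created for the demo.

```bash
# scores.csv: name,score,department
printf 'alice,90,eng\nbob,85,sales\n' > scores.csv
printf 'one two three\n' > words.txt

cut -d',' -f1 scores.csv        # first column only: the names
cut -d',' -f1,3 scores.csv      # names plus departments
cut -c1-3 scores.csv            # first three characters of each line
tr 'a-z' 'A-Z' < scores.csv     # uppercase every letter
tr -d '0-9' < scores.csv        # strip all digits
tr ' ' '\n' < words.txt         # one word per line, ready for sort or wc
```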
S — Sort for Organizing
Sorting and processing text is essential when working with logs, reports, or datasets. The sort command organizes lines alphabetically or numerically and can remove duplicates when paired with uniq. For deeper analysis, awk treats each line as a record and works with fields (columns), letting you print, calculate, or filter specific data. The sed command is a stream editor that automates text substitutions, deletions, or pattern-based edits across entire files. Together, these three tools let you go from raw, unstructured text to clean, organized, and even summarized output.
| Tool & Use Case | Command | Example | Notes |
|---|---|---|---|
| Sort alphabetically | sort file.txt | sort names.txt | Organizes text into order. |
| Sort numerically | sort -n file.txt | sort -n ages.txt | Useful for numbers like scores or IDs. |
| Unique with counts | sort file.txt \| uniq -c | sort words.txt \| uniq -c | Shows frequency of each line. |
| Select column (awk) | awk '{print $1}' file.txt | awk '{print $1}' records.txt | Prints the first field (e.g., names). |
| Multiple fields (awk) | awk -F',' '{print $1, $3}' file.csv | awk -F',' '{print $1, $3}' employees.csv | Extracts specific CSV columns. |
| Summation (awk) | awk '{sum += $2} END {print sum}' sales.txt | — | Adds up a numeric column. |
| Replace text (sed) | sed 's/error/warning/' file.txt | sed 's/error/warning/' log.txt | Replaces first match on each line. |
| Replace all (sed) | sed 's/error/warning/g' file.txt | — | Replaces every occurrence on each line. |
| Delete line (sed) | sed '2d' file.txt | — | Removes a specific line by number. |
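The final sketch strings these three tools together on a hypothetical sales.txt; note that sort must run before uniq -c so that duplicate lines sit next to each other.

```bash
# sales.txt: item and amount, with one deliberate duplicate line.
printf 'pens 12\nink 7\npens 12\npaper 30\n' > sales.txt

sort sales.txt                               # alphabetical order
sort sales.txt | uniq -c                     # "2 pens 12" reveals the duplicate
awk '{print $1}' sales.txt                   # first field: item names
awk '{sum += $2} END {print sum}' sales.txt  # total of column 2: 61
sed 's/pens/pencils/g' sales.txt             # replace every "pens" on each line
sed '2d' sales.txt                           # output with line 2 removed
```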
This concludes Lecture 7. Please return to Blackboard to access the Week 7 materials.