
cut Command: Extracting Sections from Text

Summary

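The cut command extracts selected sections of each line of text, chosen either by delimiter-separated field or by character position. It is particularly useful for parsing simply structured data such as delimited files or fixed-width log files. Note that cut splits on every occurrence of the delimiter and does not understand CSV quoting, so it suits CSV data only when no field contains a quoted comma.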

Introduction

The cut command reads each line of a file or of standard input and prints only the portions you select. You select either fields, which are separated by a single-character delimiter (for example a comma, a colon, or the default tab), or fixed character positions. This makes it a convenient building block for data extraction in shell scripts and command-line pipelines.

Use Case and Examples

Extracting the First Field from a Comma-Separated File

cut -d ',' -f 1 data.csv
This command extracts the first field (column) from each line of the data.csv file, using a comma as the delimiter. For example, if data.csv contains the line name,age,city, this command outputs name for that line.
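To see the per-line behavior, here is a minimal sketch that builds a small sample file and extracts its first column. The file contents (Alice, Bob, and so on) are made-up illustration data, and the temporary file stands in for data.csv.

```shell
# Hypothetical sample data standing in for data.csv
data=$(mktemp)
printf 'name,age,city\nAlice,30,Paris\nBob,25,Oslo\n' > "$data"

# Extract the first comma-separated field of every line
first_col=$(cut -d ',' -f 1 "$data")
printf '%s\n' "$first_col"
# name
# Alice
# Bob

rm -f "$data"
```

The header row is treated like any other line, because cut has no notion of headers.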

Extracting the First and Third Fields from a Pipe-Delimited File

cut -d '|' -f 1,3 data.txt
This command extracts the first and third fields from the data.txt file, using a pipe character (|) as the delimiter. The pipe must be quoted so that the shell passes it to cut as an argument instead of treating it as a pipeline operator.
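The following sketch runs the same selection on a single made-up record (the record value is an assumption for illustration). It also shows a detail that is easy to miss: cut prints the selected fields in the order they appear in the input, so listing them as 3,1 gives the same result as 1,3.

```shell
# Hypothetical pipe-delimited record
record='1001|widget|4.50'

# Fields 1 and 3; the quotes keep the shell from interpreting |
picked=$(printf '%s\n' "$record" | cut -d '|' -f 1,3)
printf '%s\n' "$picked"
# 1001|4.50

# Listing the fields in reverse order does not reorder the output
reversed=$(printf '%s\n' "$record" | cut -d '|' -f 3,1)
printf '%s\n' "$reversed"
# 1001|4.50
```

If the columns need to be reordered, a tool such as awk is the usual choice.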

Extracting Characters from Position 10 to 20

cut -c 10-20 logfile.txt
This command extracts characters from the 10th to the 20th position (inclusive) of each line in the logfile.txt file. This is useful when dealing with fixed-width data.
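As a small illustration of fixed-width extraction, the sketch below pulls the time out of a log line. The log format is an assumption made for this example, and the range 12-19 is chosen to match it, in place of the 10-20 used above.

```shell
# Hypothetical log line: date in columns 1-10, time in columns 12-19
line='2024-05-01 12:34:56 INFO service started'

# Positions are 1-based and the range is inclusive at both ends
time_part=$(printf '%s\n' "$line" | cut -c 12-19)
printf '%s\n' "$time_part"
# 12:34:56
```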

Extracting All Characters Starting from Position 5

cut -c 5- input.txt
This command extracts all characters starting from the 5th position of each line in the input.txt file.
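Open-ended ranges work in both directions. This short sketch, using an arbitrary ten-letter string, shows N- (from position N to the end of the line) alongside -N (from the start of the line through position N).

```shell
line='abcdefghij'

# From the 5th character to the end of the line
from_fifth=$(printf '%s\n' "$line" | cut -c 5-)
printf '%s\n' "$from_fifth"
# efghij

# From the start of the line through the 4th character
first_four=$(printf '%s\n' "$line" | cut -c -4)
printf '%s\n' "$first_four"
# abcd
```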

Cutting Using Standard Input

echo "field1,field2,field3" | cut -d ',' -f 2
This command passes a string to cut via standard input and extracts the second field, delimited by a comma. The output would be field2.
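When input arrives through a pipeline, some lines may not contain the delimiter at all. In field mode, cut prints such lines unchanged, which can be surprising. The sketch below, using made-up input, shows this and the standard -s option that suppresses those lines.

```shell
# Two input lines: one with commas, one without any delimiter
with_passthrough=$(printf 'a,b,c\nno delimiter here\n' | cut -d ',' -f 2)
printf '%s\n' "$with_passthrough"
# b
# no delimiter here

# -s drops lines that do not contain the delimiter
only_delimited=$(printf 'a,b,c\nno delimiter here\n' | cut -d ',' -s -f 2)
printf '%s\n' "$only_delimited"
# b
```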

Commonly Used Flags

-d DELIMITER
    Description: Specifies the single-character field delimiter. If not specified, the default delimiter is a tab character.
    Example: cut -d ':' -f 1 /etc/passwd (extracts usernames from /etc/passwd using ':' as the delimiter)

-f FIELDS
    Description: Specifies the fields to extract. Can be a single field number, a comma-separated list of fields, or a range of fields.
    Example: cut -d ',' -f 2,4 data.csv (extracts the second and fourth fields from data.csv)

-c CHARACTERS
    Description: Specifies the character positions to extract. Can be a single character position, a comma-separated list of positions, or a range of positions.
    Example: cut -c 1-10 file.txt (extracts the first 10 characters of each line in file.txt)

--complement
    Description: Selects the inverse of the specified fields or character positions. This is a GNU cut option and is not available in every cut implementation.
    Example: cut -d ',' --complement -f 2 data.csv (extracts all fields except the second field from data.csv)

--output-delimiter=STRING
    Description: Uses STRING as the output delimiter. The default is to use the input delimiter. This is a GNU cut option and is not available in every cut implementation.
    Example: cut -d ',' -f 1,3 --output-delimiter='|' data.csv (extracts the first and third fields, using a comma as the input delimiter and a pipe as the output delimiter)
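The flags above can be combined and tried on a single line of input. The sketch below uses an arbitrary five-field row and assumes GNU cut, since --complement and --output-delimiter are GNU extensions and may be missing elsewhere (for example in the cut shipped with BSD or macOS).

```shell
# Hypothetical five-field row
row='a,b,c,d,e'

# A range of fields: 2 through 4
range=$(printf '%s\n' "$row" | cut -d ',' -f 2-4)
printf '%s\n' "$range"
# b,c,d

# Everything except field 2 (GNU cut only)
without_second=$(printf '%s\n' "$row" | cut -d ',' --complement -f 2)
printf '%s\n' "$without_second"
# a,c,d,e

# Fields 1 and 3, joined with a pipe instead of a comma (GNU cut only)
piped=$(printf '%s\n' "$row" | cut -d ',' -f 1,3 --output-delimiter='|')
printf '%s\n' "$piped"
# a|c
```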

