`split` Command in Linux
Summary
The `split` command is a Linux utility that divides a file into multiple smaller files. It is particularly useful for very large files that are difficult to manage or process as a single unit.
Introduction
The `split` command breaks a file into segments of a specified size or number of lines. By default, `split` writes 1000 lines per output file and names the files `xaa`, `xab`, `xac`, and so on. You can customize the prefix of these output files and the number of characters used in the suffix. This is especially helpful for large log files, datasets, or any situation where smaller, more manageable chunks are needed.
Use Cases and Examples
Splitting a file into smaller files of 1000 lines each
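A minimal sketch of the command for this case, assuming GNU `split` and a scratch directory; the 2500-line sample input is made up for the demonstration.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Illustrative input: 2500 numbered lines.
seq 1 2500 > large_file.txt

# Split into pieces of at most 1000 lines each.
split -l 1000 large_file.txt small_file_prefix

# Show the line count of every piece.
wc -l small_file_prefix*
line_pieces=$(( $(ls small_file_prefix* | wc -l) ))
last_lines=$(( $(wc -l < small_file_prefixac) ))
```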
Running `split -l 1000 large_file.txt small_file_prefix` splits `large_file.txt` into smaller files, each containing 1000 lines; the last file holds whatever lines remain. The output files will be named `small_file_prefixaa`, `small_file_prefixab`, `small_file_prefixac`, and so on.
Splitting a file into smaller files of 1MB each
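A minimal sketch of the size-based command, assuming GNU `split` and a scratch directory; the 2.5 MiB input made of zero bytes is illustrative only.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Illustrative input: 2.5 MiB of zero bytes (2560 blocks of 1024 bytes).
dd if=/dev/zero of=large_file.txt bs=1024 count=2560 2>/dev/null

# Split into pieces of 1 MiB (1048576 bytes); the last piece holds the rest.
split -b 1m large_file.txt small_file_prefix

# Report the byte size of each piece.
wc -c small_file_prefix*
first_size=$(( $(wc -c < small_file_prefixaa) ))
last_size=$(( $(wc -c < small_file_prefixac) ))
```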
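Running `split -b 1m large_file.txt small_file_prefix` splits `large_file.txt` into smaller files of exactly 1 MiB (1,048,576 bytes) each, except possibly the last. The output files will be named `small_file_prefixaa`, `small_file_prefixab`, `small_file_prefixac`, and so on. The `m` suffix means units of 1024 × 1024 bytes; `k` can be used for units of 1024 bytes, and GNU `split` documents `G` for units of 1024 × 1024 × 1024 bytes.
Splitting a file into a specified number of files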
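A minimal sketch of splitting into a fixed number of files. It assumes GNU `split`, since `-n` is a GNU extension; the 100-line input and the `by_line_` prefix are hypothetical names chosen for the demonstration.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Illustrative input: 100 numbered lines.
seq 1 100 > large_file.txt

# Plain -n 3 divides by bytes, so a piece may end in the middle of a line.
split -n 3 large_file.txt small_file_prefix

# l/3 also makes 3 pieces but only breaks at line boundaries.
# The prefix "by_line_" is a hypothetical name chosen for this sketch.
split -n l/3 large_file.txt by_line_

byte_pieces=$(( $(ls small_file_prefix* | wc -l) ))
line_pieces=$(( $(ls by_line_* | wc -l) ))

# Concatenating the pieces in name order reproduces the input exactly.
cat small_file_prefix* | cmp - large_file.txt && echo "byte pieces match"
cat by_line_* | cmp - large_file.txt && echo "line pieces match"
```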
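Running `split -n 3 large_file.txt small_file_prefix` splits `large_file.txt` into 3 files. The content is divided by bytes as evenly as possible across the output files, so a line may be cut across two files; with GNU `split`, `-n l/3` produces 3 files without breaking lines.
Using a numeric suffix instead of a letter suffix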
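A minimal sketch of numeric suffixes, assuming GNU `split` and a scratch directory; the 1200-line sample input is illustrative only.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Illustrative input: 1200 numbered lines.
seq 1 1200 > large_file.txt

# -d switches the suffix from letters (aa, ab, ...) to digits (00, 01, ...).
split -d -l 500 large_file.txt small_file_prefix

# Show the line count of every piece.
wc -l small_file_prefix*
numeric_last=$(( $(wc -l < small_file_prefix02) ))
```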
Running `split -d -l 500 large_file.txt small_file_prefix` splits `large_file.txt` into smaller files of 500 lines each. The output files will be named `small_file_prefix00`, `small_file_prefix01`, `small_file_prefix02`, and so on. The `-d` flag selects numeric suffixes.
Specifying the suffix length
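A minimal sketch of a three-character numeric suffix, assuming GNU `split` and a scratch directory; the 1200-line sample input is illustrative only.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Illustrative input: 1200 numbered lines.
seq 1 1200 > large_file.txt

# -a 3 widens the suffix to three characters; -d makes those characters digits.
split -d -a 3 -l 500 large_file.txt small_file_prefix

# Collect the names of the pieces that were created.
suffix_names=$(echo small_file_prefix*)
echo "$suffix_names"
```

A longer suffix raises the number of output files `split` can name before it runs out of suffixes.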
Running `split -d -a 3 -l 500 large_file.txt small_file_prefix` splits `large_file.txt` into smaller files of 500 lines each, using a numeric suffix of length 3. The output files will be named `small_file_prefix000`, `small_file_prefix001`, `small_file_prefix002`, and so on. The `-a` flag followed by a number sets the suffix length.
Commonly used flags
Flag | Description | Example |
---|---|---|
`-l <lines>` or `--lines=<lines>` | Specifies the number of lines per output file. The default is 1000. | `split -l 200 myfile.txt` (splits into files with 200 lines each) |
`-b <bytes>` or `--bytes=<bytes>` | Specifies the number of bytes per output file. Suffixes such as `k` (1024 bytes) and `m` (1024 × 1024 bytes) are accepted; GNU `split` also documents `K`, `M`, `G` (powers of 1024) and `KB`, `MB`, `GB` (powers of 1000). | `split -b 5m myfile.txt` (splits into files of 5 MiB each) |
`-d` or `--numeric-suffixes` | Uses numeric suffixes instead of alphabetic suffixes. | `split -d -l 100 myfile.txt` |
`-a <length>` or `--suffix-length=<length>` | Specifies the length of the suffix. The default is 2. | `split -a 4 -l 100 myfile.txt` (uses a suffix length of 4, like `aaaa`, `aaab`, etc.; add `-d` for `0000`, `0001`) |
`-n <chunks>` or `--number=<chunks>` | Splits into a fixed number of output files. A plain number N divides the input by bytes into N parts of roughly equal size and may cut lines; `l/N` keeps lines whole. | `split -n 3 myfile.txt` (splits into 3 files of approximately equal byte size) |
`--verbose` | Prints a diagnostic just before each output file is opened. | `split --verbose -l 100 myfile.txt` |
`--help` | Displays help information and exits. | `split --help` |
`--version` | Outputs version information and exits. | `split --version` |
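
Several of the flags in the table can be combined. The sketch below assumes GNU `split` and a scratch directory, and uses a made-up 250-line `myfile.txt`. The exact wording of the `--verbose` messages varies between versions, so the sketch only counts them.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Illustrative input: 250 numbered lines in a file named as in the table.
seq 1 250 > myfile.txt

# -a 4 without -d gives four-letter suffixes; --verbose reports each file.
report=$(split --verbose -a 4 -l 100 myfile.txt 2>&1)
echo "$report"
reported=$(( $(echo "$report" | wc -l) ))

# Default prefix "x" plus four letters.
verbose_names=$(echo x*)
echo "$verbose_names"
```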