split Command in Linux

Summary

The split command is a utility in Linux used to divide a file into multiple smaller files. It's particularly useful when dealing with very large files that are difficult to manage or process as a single unit.

Introduction

The split command breaks a file into multiple segments, each with a specified size or number of lines, and leaves the original file untouched. With no options, split writes pieces of 1000 lines each and names them xaa, xab, xac, and so on. You can customize the prefix of these output files and the number of characters used in the suffix. Typical inputs are large log files and datasets that need to be processed or transferred in smaller chunks.
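To see the defaults in action, here is a minimal sketch. The scratch directory and the name large_file.txt are assumptions made for illustration only, and the sample data is generated with seq.

```shell
# Hypothetical scratch directory; the file name is illustrative only.
workdir=$(mktemp -d)

# Create a 2500-line sample file, one number per line.
seq 1 2500 > "$workdir/large_file.txt"

# Run split with no options inside the scratch directory, so the
# default-named pieces (xaa, xab, ...) land there and not in the
# directory you started from.
(cd "$workdir" && split large_file.txt)

# List the pieces with their line counts.
wc -l "$workdir"/x*
```

With 2500 input lines and the default of 1000 lines per piece, this should produce xaa and xab with 1000 lines each and xac with the remaining 500.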

Use Cases and Examples

Splitting a file into smaller files of 1000 lines each

split -l 1000 large_file.txt small_file_prefix
This command splits large_file.txt into smaller files of 1000 lines each; the last file holds whatever lines remain and may be shorter. The output files will be named small_file_prefixaa, small_file_prefixab, small_file_prefixac, and so on.
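A quick way to check the result is to count the lines in each piece. The sketch below assumes a generated 2500-line sample file in a temporary directory; both are hypothetical stand-ins for your own data.

```shell
# Hypothetical scratch directory and sample data for the demo.
workdir=$(mktemp -d)
seq 1 2500 > "$workdir/large_file.txt"

# Split into pieces of at most 1000 lines each.
split -l 1000 "$workdir/large_file.txt" "$workdir/small_file_prefix"

# Show how many lines each piece received.
wc -l "$workdir"/small_file_prefix*
```

The first two pieces should report 1000 lines and the third 500, since 2500 is not a multiple of 1000.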

Splitting a file into smaller files of 1MB each

split -b 1m large_file.txt small_file_prefix
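This command splits large_file.txt into smaller files of exactly 1,048,576 bytes (1 MiB) each, except the last, which holds the remainder. The output files will be named small_file_prefixaa, small_file_prefixab, small_file_prefixac, and so on. Because -b counts bytes, a line of text may be cut in the middle at a piece boundary. The m suffix multiplies by 1024 × 1024 and k multiplies by 1024. GNU split also accepts uppercase K, M, and G for powers of 1024, and KB, MB, and GB for powers of 1000. Lowercase g is accepted by some implementations (BSD split, for example) but may be rejected by GNU split, so uppercase G is the safer spelling for gigabytes.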
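The sketch below checks the piece sizes and then reassembles the pieces with cat to confirm that nothing was lost. The 2.5 MiB file of zero bytes, the scratch directory, and the name rejoined.bin are assumptions chosen for the demo. Rejoining relies on the shell expanding the glob in sorted order, which matches the order in which split names its pieces.

```shell
# Hypothetical scratch directory and sample data for the demo.
workdir=$(mktemp -d)

# 2.5 MiB of zero bytes (2621440 bytes).
head -c 2621440 /dev/zero > "$workdir/large_file.bin"

# Split into 1 MiB pieces.
split -b 1m "$workdir/large_file.bin" "$workdir/small_file_prefix"

# Show the size of each piece in bytes.
wc -c "$workdir"/small_file_prefix*

# Reassemble the pieces and compare with the original.
cat "$workdir"/small_file_prefix* > "$workdir/rejoined.bin"
cmp "$workdir/large_file.bin" "$workdir/rejoined.bin" && echo "pieces rejoin cleanly"
```

The first two pieces should be 1048576 bytes and the third 524288 bytes.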

Splitting a file into a specified number of files

split -n 3 large_file.txt small_file_prefix
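This command splits large_file.txt into 3 files of roughly equal size in bytes. Because the division is by bytes, a chunk boundary can fall in the middle of a line. To keep lines intact, GNU split accepts -n l/3, which still produces 3 files but only breaks at line ends. The -n option is a GNU extension and may be missing from other implementations.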
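The difference between byte-based and line-aware chunking is easiest to see on a tiny file. The sketch below assumes GNU split and uses hypothetical prefixes bytes_ and lines_ on a generated 10-line, 21-byte file.

```shell
# Hypothetical scratch directory and sample data for the demo.
workdir=$(mktemp -d)

# Ten short lines, 21 bytes in total.
seq 1 10 > "$workdir/large_file.txt"

# Byte-based chunks: three pieces of 7 bytes, cut wherever the byte count lands.
split -n 3 "$workdir/large_file.txt" "$workdir/bytes_"

# Line-aware chunks (GNU extension): three pieces, lines kept whole.
split -n l/3 "$workdir/large_file.txt" "$workdir/lines_"

# Report size and number of newline-terminated lines for every piece.
for f in "$workdir"/bytes_* "$workdir"/lines_*; do
    printf '%s: %s bytes, %s newline-terminated lines\n' \
        "${f##*/}" "$(wc -c < "$f")" "$(wc -l < "$f")"
done
```

In the byte-based run, the first piece ends right after the character 4, so that line's newline lands in the next piece. In the line-aware run every piece ends with a newline, at the cost of slightly uneven sizes.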

Using a numeric suffix instead of a letter suffix

split -d -l 500 large_file.txt small_file_prefix
This command splits large_file.txt into smaller files each with 500 lines. The output files will be named small_file_prefix00, small_file_prefix01, small_file_prefix02, and so on. The -d flag specifies the numeric suffix.
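As a small illustration of the resulting names, the sketch below runs the same command on a generated 1200-line file. The scratch directory and the sample data are assumptions for the demo.

```shell
# Hypothetical scratch directory and sample data for the demo.
workdir=$(mktemp -d)
seq 1 1200 > "$workdir/large_file.txt"

# Split into 500-line pieces with numeric suffixes.
split -d -l 500 "$workdir/large_file.txt" "$workdir/small_file_prefix"

# List only the pieces that split created.
ls "$workdir" | grep '^small_file_prefix'
```

This should list small_file_prefix00, small_file_prefix01, and small_file_prefix02, with the last piece holding the remaining 200 lines.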

Specifying the suffix length

split -d -a 3 -l 500 large_file.txt small_file_prefix
This command splits large_file.txt into smaller files of up to 500 lines each, using a numeric suffix of length 3. The output files will be named small_file_prefix000, small_file_prefix001, small_file_prefix002, and so on. The -a flag followed by a number defines the suffix length, which also caps the number of output files: three digits allow at most 1000 pieces, and split stops with an error if it needs more.
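The sketch below shows why the suffix length matters, assuming GNU split. The prefixes wide_ and narrow_ and the 12-line sample file are hypothetical. One line per piece means 12 pieces are needed, which fits in three digits but not in one.

```shell
# Hypothetical scratch directory and sample data for the demo.
workdir=$(mktemp -d)
seq 1 12 > "$workdir/large_file.txt"

# One line per piece with a 3-digit numeric suffix: 000 through 011.
split -d -a 3 -l 1 "$workdir/large_file.txt" "$workdir/wide_"
ls "$workdir" | grep '^wide_'

# A 1-digit suffix offers only 10 names (0 to 9), so 12 pieces cannot fit.
if split -d -a 1 -l 1 "$workdir/large_file.txt" "$workdir/narrow_" 2>/dev/null; then
    narrow_ok=yes
else
    narrow_ok=no
    echo "split ran out of suffixes"
fi
```

When the expected number of pieces is unknown, choosing a generous suffix length up front avoids a partial split.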

Commonly used flags

| Flag | Description | Example |
| --- | --- | --- |
| -l <lines> or --lines=<lines> | Specifies the maximum number of lines per output file. | split -l 200 myfile.txt (splits into files of 200 lines each) |
| -b <bytes> or --bytes=<bytes> | Specifies the number of bytes per output file. The size accepts suffixes such as k (× 1024) and m (× 1024 × 1024); GNU split also accepts uppercase K, M, and G. | split -b 5m myfile.txt (splits into files of 5 MiB each) |
| -d or --numeric-suffixes | Uses numeric suffixes (00, 01, ...) instead of alphabetic suffixes. | split -d -l 100 myfile.txt (creates x00, x01, ...) |
| -a <length> or --suffix-length=<length> | Specifies the length of the suffix. The default is 2. | split -a 4 -l 100 myfile.txt (uses a suffix length of 4, like xaaaa, xaaab, etc.; add -d for 0000, 0001, etc.) |
| -n <number> or --number=<number> | Splits into N chunks of roughly equal byte size. Use l/N to split only at line boundaries. | split -n 3 myfile.txt (splits into 3 files of approximately equal size in bytes) |
| --verbose | Prints a diagnostic just before each output file is opened. | split --verbose -l 100 myfile.txt |
| --help | Displays help information and exits. | split --help |
| --version | Outputs version information and exits. | split --version |
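Several of these flags can be combined in one call. The sketch below, assuming GNU split, uses --verbose together with -d, -a, and -l on a generated 250-line file. The prefix part_ and the scratch directory are hypothetical, and the exact wording of the diagnostic lines differs between versions, so the example captures them without relying on their text.

```shell
# Hypothetical scratch directory and sample data for the demo.
workdir=$(mktemp -d)
seq 1 250 > "$workdir/myfile.txt"

# Capture the --verbose diagnostics (stderr is merged in case an
# implementation reports there instead of on stdout).
report=$(split --verbose -d -a 3 -l 100 "$workdir/myfile.txt" "$workdir/part_" 2>&1)

# One diagnostic line is printed per output file.
printf '%s\n' "$report"
```

Three diagnostic lines are expected here, one each for part_000, part_001, and part_002.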

