uniq

Deduplicates TSV rows from one or more files without sorting.
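
Deduplicating without sorting generally means remembering which rows have already been seen and writing each row the first time it appears, so input order is preserved. The following is a minimal Python sketch of that idea under that assumption, not the tva implementation; the function name `dedup_rows` is hypothetical.

```python
from typing import Iterable, Iterator


def dedup_rows(lines: Iterable[str]) -> Iterator[str]:
    """Yield each distinct row once, in first-seen order (illustrative only)."""
    seen = set()  # rows that have already been written
    for line in lines:
        row = line.rstrip("\n")
        if row not in seen:
            seen.add(row)
            yield row


rows = ["a\t1", "b\t2", "a\t1", "c\t3", "b\t2"]
result = list(dedup_rows(rows))
print(result)
```

Because membership is checked against a set, memory grows with the number of distinct rows rather than with the total input size.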

Behavior:

Input:

Output:

Header behavior:

Supports --header / -H and --header-hash1 modes.
When using header mode with multiple files, only the header from the first file is written; headers from subsequent files are skipped.
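
The multi-file header rule above can be sketched as follows. This is an illustrative Python model, not tva's code: each file is represented as a list of lines, and the name `merge_with_header` is hypothetical. The sketch assumes every non-empty file starts with a header line.

```python
from typing import Iterator, List


def merge_with_header(files: List[List[str]]) -> Iterator[str]:
    """Write the first header once, then deduplicated data rows from all files."""
    header_written = False
    seen = set()  # data rows already written
    for lines in files:
        if not lines:
            continue  # an empty input contributes nothing
        if not header_written:
            yield lines[0]  # header from the first non-empty file
            header_written = True
        for row in lines[1:]:  # skip this file's own header line
            if row not in seen:
                seen.add(row)
                yield row


file1 = ["name\tage", "ann\t30", "bob\t25"]
file2 = ["name\tage", "bob\t25", "cy\t41"]
merged = list(merge_with_header([file1, file2]))
print(merged)
```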

Field syntax:

Use --fields / -f to specify columns to use as the deduplication key.
Use 0 to indicate the entire line should be used as the key (default behavior).
Field lists support 1-based indices, ranges (1-3,5-7), header names, name ranges (run-user_time), and wildcards (*_time).
Run tva --help-fields for a full description shared across tva commands.
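
To make the numeric part of the field syntax concrete, here is a small Python sketch that expands a list such as 1-3,5-7 into 1-based indices and builds a key from those columns. It is an assumption-laden illustration rather than tva's parser: header names, name ranges, and wildcards are deliberately left out, reversed ranges are not handled, and the helper names `parse_fields` and `make_key` are hypothetical.

```python
from typing import List, Tuple


def parse_fields(spec: str) -> List[int]:
    """Expand a numeric field list such as "1-3,5" into 1-based indices."""
    indices: List[int] = []
    for part in spec.split(","):
        if "-" in part:
            start, end = (int(x) for x in part.split("-"))
            indices.extend(range(start, end + 1))  # ranges are inclusive
        else:
            indices.append(int(part))
    return indices


def make_key(line: str, fields: List[int]) -> Tuple[str, ...]:
    """Build the deduplication key; a field list of [0] means the whole line."""
    if fields == [0]:
        return (line,)
    cols = line.split("\t")
    return tuple(cols[i - 1] for i in fields)  # convert 1-based to 0-based


print(parse_fields("1-3,5-7"))
print(make_key("ann\t30\tNYC", parse_fields("2")))
```

Two rows are then treated as duplicates when `make_key` returns the same tuple for both.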

Examples:

Deduplicate whole rows:
    tva uniq data.tsv

Deduplicate by column 2:
    tva uniq data.tsv -f 2

Deduplicate with header using named fields:
    tva uniq --header -f name,age data.tsv

Output only repeated lines:
    tva uniq --repeated data.tsv

Output lines repeated at least 3 times:
    tva uniq --at-least 3 data.tsv

Output with equivalence class IDs:
    tva uniq --header -f 1 --equiv --number data.tsv

Deduplicate multiple files with header:
    tva uniq --header file1.tsv file2.tsv file3.tsv

Ignore case when comparing:
    tva uniq --ignore-case data.tsv
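
The repeat-count and case-insensitive options can be modeled with a per-key counter. The Python sketch below is hypothetical and makes assumptions the text above does not confirm: a line is written once, at the moment its key has been seen N times; --repeated is treated as N = 2; and ignoring case is modeled by lowercasing the key. The function name `at_least` is invented for illustration.

```python
from collections import Counter
from typing import Iterable, Iterator


def at_least(lines: Iterable[str], n: int, ignore_case: bool = False) -> Iterator[str]:
    """Yield a line once, when its key has been seen exactly n times (assumed)."""
    counts: Counter = Counter()
    for line in lines:
        key = line.lower() if ignore_case else line  # assumed case folding
        counts[key] += 1
        if counts[key] == n:
            yield line


rows = ["x", "y", "X", "x", "y", "x"]
print(list(at_least(rows, 2)))
print(list(at_least(rows, 3)))
print(list(at_least(rows, 2, ignore_case=True)))
```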
