Command line interface

The tsinfer command line interface provides convenient access to the high-level API functionality. There are two equivalent ways to invoke this program:

$ tsinfer

or

$ python3 -m tsinfer

The first form is more intuitive and works well most of the time. The second form is useful when multiple versions of Python are installed or if the tsinfer executable is not installed on your path.
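
A script that has to work in both situations can look for the executable first and fall back to the module form. The following is only a minimal sketch: the helper name tsinfer_command and its which parameter are hypothetical and are not part of tsinfer.

```python
import shutil
import sys


def tsinfer_command(which=shutil.which):
    """Return the argv prefix used to invoke tsinfer.

    Hypothetical helper, not part of tsinfer. It prefers the ``tsinfer``
    executable when one is found on the path, and otherwise falls back to
    ``python -m tsinfer`` using the interpreter running this script.
    """
    executable = which("tsinfer")
    if executable is not None:
        return [executable]
    return [sys.executable, "-m", "tsinfer"]
```

Using sys.executable rather than a hard-coded python3 keeps the fallback tied to the interpreter that is already running, which matters when several versions of Python are installed.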

The tsinfer program has seven subcommands: list prints a summary of the data held in one of tsinfer’s file formats; infer runs the complete inference process for a given input samples file; generate-ancestors, match-ancestors and match-samples run the three parts of this inference process as separate steps; augment-ancestors adds a subset of the samples to the ancestors tree sequence; and verify checks that a tree sequence and a samples file represent the same data. Running the inference as separate steps is recommended for large inferences because it allows greater control over the inference process.
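
The step-by-step form of the inference can be driven from a small script. The sketch below builds the command lines for generate-ancestors, match-ancestors and match-samples in order, using only options documented on this page. The function names pipeline_commands and run_pipeline are hypothetical, and stopping at the first failing step assumes that tsinfer exits with a non-zero status on error.

```python
import subprocess
import sys


def pipeline_commands(samples, num_threads=None, progress=False):
    """Build the argv lists for the three inference steps, in order."""
    options = []
    if num_threads is not None:
        options += ["--num-threads", str(num_threads)]
    if progress:
        options.append("--progress")
    steps = ["generate-ancestors", "match-ancestors", "match-samples"]
    # Each step takes the same samples file as its positional argument and
    # relies on the default names for the intermediate and output files.
    return [
        [sys.executable, "-m", "tsinfer", step, *options, samples]
        for step in steps
    ]


def run_pipeline(samples, **kwargs):
    """Run each step in turn (assumes a non-zero exit status on failure)."""
    for argv in pipeline_commands(samples, **kwargs):
        subprocess.run(argv, check=True)
```

Calling run_pipeline("1kg-chr1.samples", num_threads=4, progress=True) would then run the three steps one after another, leaving the intermediate files on disk for inspection between steps.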

Argument details

Command line interface for tsinfer.

usage: tsinfer [-h] [-V]
               {generate-ancestors,ga,match-ancestors,ma,augment-ancestors,aa,match-samples,ms,infer,list,ls,verify}
               ...

Positional Arguments

subcommand Possible choices: generate-ancestors, ga, match-ancestors, ma, augment-ancestors, aa, match-samples, ms, infer, list, ls, verify

Named Arguments

-V, --version show program’s version number and exit

Sub-commands:

generate-ancestors (ga)

Generates a set of ancestors from the input sample data and stores the results in a tsinfer ancestors file.

tsinfer generate-ancestors [-h] [-a ANCESTORS] [--num-threads NUM_THREADS]
                           [--num-flush-threads NUM_FLUSH_THREADS]
                           [--progress] [-v]
                           [--log-section {tsinfer.inference,tsinfer.formats,tsinfer.threads}]
                           samples
Positional Arguments
samples The input sample data in tsinfer ‘samples’ format. Please see the documentation at http://tsinfer.readthedocs.io/ for information on how to import data into this format.
Named Arguments
-a, --ancestors
 The path to the ancestor data file in tsinfer ‘ancestors’ format. If not specified, this defaults to the input samples file stem with the extension ‘.ancestors’. For example, if ‘1kg-chr1.samples’ is the input file then the default ancestors file would be ‘1kg-chr1.ancestors’
--num-threads, -t
 The number of worker threads to use. If < 1, use a simpler unthreaded algorithm (default).
--num-flush-threads, -F
 The number of data flush threads to use. If < 1, all data is flushed synchronously in the main thread (default=2)
--progress, -p Show a progress monitor.
-v, --verbosity
 Increase the verbosity
--log-section, -L
 Log messages only for the specified module. Possible choices: tsinfer.inference, tsinfer.formats, tsinfer.threads

match-ancestors (ma)

Matches the ancestors built by the ‘generate-ancestors’ command against each other using the model information specified in the input file and writes the output to a tskit .trees file.

tsinfer match-ancestors [-h] [-v]
                        [--log-section {tsinfer.inference,tsinfer.formats,tsinfer.threads}]
                        [-a ANCESTORS] [-A ANCESTORS_TREES]
                        [--num-threads NUM_THREADS] [--progress]
                        [--no-path-compression]
                        samples
Positional Arguments
samples The input sample data in tsinfer ‘samples’ format. Please see the documentation at http://tsinfer.readthedocs.io/ for information on how to import data into this format.
Named Arguments
-v, --verbosity
 Increase the verbosity
--log-section, -L
 Log messages only for the specified module. Possible choices: tsinfer.inference, tsinfer.formats, tsinfer.threads
-a, --ancestors
 The path to the ancestor data file in tsinfer ‘ancestors’ format. If not specified, this defaults to the input samples file stem with the extension ‘.ancestors’. For example, if ‘1kg-chr1.samples’ is the input file then the default ancestors file would be ‘1kg-chr1.ancestors’
-A, --ancestors-trees
 The path to the ancestor trees file in tskit ‘.trees’ format. If not specified, this defaults to the input samples file stem with the extension ‘.ancestors.trees’. For example, if ‘1kg-chr1.samples’ is the input file then the default ancestors trees file would be ‘1kg-chr1.ancestors.trees’.
--num-threads, -t
 The number of worker threads to use. If < 1, use a simpler unthreaded algorithm (default).
--progress, -p Show a progress monitor.
--no-path-compression
 Disable path compression

augment-ancestors (aa)

Augments the ancestors tree sequence by adding a subset of the samples.

tsinfer augment-ancestors [-h] [-n NUM_SAMPLES] [-A ANCESTORS_TREES] [-v]
                          [--log-section {tsinfer.inference,tsinfer.formats,tsinfer.threads}]
                          [--no-path-compression] [--num-threads NUM_THREADS]
                          [--progress]
                          samples augmented_ancestors
Positional Arguments
samples The input sample data in tsinfer ‘samples’ format. Please see the documentation at http://tsinfer.readthedocs.io/ for information on how to import data into this format.
augmented_ancestors
 The path to write the augmented ancestors tree sequence to
Named Arguments
-n, --num-samples
 The number of samples to use. Defaults to 10% of the total.
-A, --ancestors-trees
 The path to the ancestor trees file in tskit ‘.trees’ format. If not specified, this defaults to the input samples file stem with the extension ‘.ancestors.trees’. For example, if ‘1kg-chr1.samples’ is the input file then the default ancestors trees file would be ‘1kg-chr1.ancestors.trees’.
-v, --verbosity
 Increase the verbosity
--log-section, -L
 Log messages only for the specified module. Possible choices: tsinfer.inference, tsinfer.formats, tsinfer.threads
--no-path-compression
 Disable path compression
--num-threads, -t
 The number of worker threads to use. If < 1, use a simpler unthreaded algorithm (default).
--progress, -p Show a progress monitor.

match-samples (ms)

Matches the samples against the tree sequence structure built by the match-ancestors command.

tsinfer match-samples [-h] [-v]
                      [--log-section {tsinfer.inference,tsinfer.formats,tsinfer.threads}]
                      [-A ANCESTORS_TREES] [--no-path-compression]
                      [--no-simplify] [-O OUTPUT_TREES]
                      [--num-threads NUM_THREADS] [--progress]
                      samples
Positional Arguments
samples The input sample data in tsinfer ‘samples’ format. Please see the documentation at http://tsinfer.readthedocs.io/ for information on how to import data into this format.
Named Arguments
-v, --verbosity
 Increase the verbosity
--log-section, -L
 Log messages only for the specified module. Possible choices: tsinfer.inference, tsinfer.formats, tsinfer.threads
-A, --ancestors-trees
 The path to the ancestor trees file in tskit ‘.trees’ format. If not specified, this defaults to the input samples file stem with the extension ‘.ancestors.trees’. For example, if ‘1kg-chr1.samples’ is the input file then the default ancestors trees file would be ‘1kg-chr1.ancestors.trees’.
--no-path-compression
 Disable path compression
--no-simplify Do not simplify the output tree sequence
-O, --output-trees
 The path to the output trees file in tskit ‘.trees’ format. If not specified, this defaults to the input samples file stem with the extension ‘.trees’. For example, if ‘1kg-chr1.samples’ is the input file then the default output file would be ‘1kg-chr1.trees’
--num-threads, -t
 The number of worker threads to use. If < 1, use a simpler unthreaded algorithm (default).
--progress, -p Show a progress monitor.
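
The default file names described for the -a, -A and -O options all follow the same pattern: the extension of the samples file is replaced with a fixed suffix. The sketch below reproduces that naming rule so that a script can predict where each step will write its output. It is an illustration of the documented behaviour, not tsinfer's own code; the function name default_paths is hypothetical, and "file stem" is assumed to mean everything before the final extension.

```python
import os.path


def default_paths(samples_path):
    """Derive the documented default file names from a samples path.

    Hypothetical helper: assumes the stem is the path with its final
    extension (for example '.samples') removed.
    """
    stem, _ = os.path.splitext(samples_path)
    return {
        "ancestors": stem + ".ancestors",              # default for -a
        "ancestors_trees": stem + ".ancestors.trees",  # default for -A
        "output_trees": stem + ".trees",               # default for -O
    }
```

For the input file ‘1kg-chr1.samples’ this gives ‘1kg-chr1.ancestors’, ‘1kg-chr1.ancestors.trees’ and ‘1kg-chr1.trees’, matching the examples given in the option descriptions.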

infer

Runs the generate-ancestors, match-ancestors and match-samples commands without writing the intermediate files to disk. Not recommended for large inferences.

tsinfer infer [-h] [-v]
              [--log-section {tsinfer.inference,tsinfer.formats,tsinfer.threads}]
              [-O OUTPUT_TREES] [--num-threads NUM_THREADS] [--progress]
              samples
Positional Arguments
samples The input sample data in tsinfer ‘samples’ format. Please see the documentation at http://tsinfer.readthedocs.io/ for information on how to import data into this format.
Named Arguments
-v, --verbosity
 Increase the verbosity
--log-section, -L
 Log messages only for the specified module. Possible choices: tsinfer.inference, tsinfer.formats, tsinfer.threads
-O, --output-trees
 The path to the output trees file in tskit ‘.trees’ format. If not specified, this defaults to the input samples file stem with the extension ‘.trees’. For example, if ‘1kg-chr1.samples’ is the input file then the default output file would be ‘1kg-chr1.trees’
--num-threads, -t
 The number of worker threads to use. If < 1, use a simpler unthreaded algorithm (default).
--progress, -p Show a progress monitor.

list (ls)

Show a summary of the specified tsinfer-related file.

tsinfer list [-h] [-v]
             [--log-section {tsinfer.inference,tsinfer.formats,tsinfer.threads}]
             [--storage]
             path
Positional Arguments
path The tsinfer file to show information about.
Named Arguments
-v, --verbosity
 Increase the verbosity
--log-section, -L
 Log messages only for the specified module. Possible choices: tsinfer.inference, tsinfer.formats, tsinfer.threads
--storage, -s Show detailed information about data storage.

verify

Verify that the specified tree sequence and samples files represent the same data.

tsinfer verify [-h] [-v]
               [--log-section {tsinfer.inference,tsinfer.formats,tsinfer.threads}]
               [--progress]
               samples tree_sequence
Positional Arguments
samples The input sample data in tsinfer ‘samples’ format. Please see the documentation at http://tsinfer.readthedocs.io/ for information on how to import data into this format.
tree_sequence The tree sequence to compare with, in tskit ‘.trees’ format.
Named Arguments
-v, --verbosity
 Increase the verbosity
--log-section, -L
 Log messages only for the specified module. Possible choices: tsinfer.inference, tsinfer.formats, tsinfer.threads
--progress, -p Show a progress monitor.