Command line interface

The command line interface in tsinfer is intended to provide a convenient interface to the high-level API functionality. There are two equivalent ways to invoke this program:

$ tsinfer

or

$ python3 -m tsinfer

The first form is more intuitive and works well most of the time. The second form is useful when multiple versions of Python are installed or if the tsinfer executable is not installed on your path.
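
When tsinfer is driven from a script, the choice between the two forms can be made at run time. The following is a minimal sketch and not part of tsinfer: the helper name tsinfer_prefix is hypothetical, and the fallback assumes that the interpreter running the script is the one that has tsinfer installed.

```python
import shutil
import sys


def tsinfer_prefix(which=shutil.which):
    """Return the argv prefix used to invoke tsinfer (hypothetical helper).

    Prefers the ``tsinfer`` executable when it is found on the path and
    otherwise falls back to the module form with the current interpreter.
    """
    executable = which("tsinfer")
    if executable is not None:
        return [executable]
    # Same idea as "python3 -m tsinfer", pinned to this interpreter.
    return [sys.executable, "-m", "tsinfer"]


# Only builds and prints the argument list; nothing is executed.
print(tsinfer_prefix() + ["--version"])
```

Passing the lookup function as a parameter keeps the helper easy to exercise on machines where tsinfer is not installed.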

The tsinfer program has seven subcommands: list prints a summary of the data held in one of tsinfer’s file formats; infer runs the complete inference process for a given input samples file; generate-ancestors, match-ancestors and match-samples run the three parts of this inference process as separate steps; augment-ancestors adds a subset of the samples to the ancestors tree sequence; and verify checks that a tree sequence and a samples file represent the same data. Running the inference as separate steps is recommended for large inferences because it allows greater control over the inference process.
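
The stepwise workflow amounts to three invocations that share the same samples file. A minimal sketch of assembling them follows. The helper stepwise_commands is hypothetical and not part of tsinfer, and the sketch relies on each step locating its input through the default file names described under each subcommand.

```python
import shlex


def stepwise_commands(samples, num_threads=0, progress=False):
    """Build the three commands that together replace ``tsinfer infer``.

    Hypothetical helper: it only assembles argument lists and runs nothing.
    """
    options = []
    if num_threads > 0:
        # Values below 1 select the unthreaded default, so omit the flag.
        options += ["--num-threads", str(num_threads)]
    if progress:
        options.append("--progress")
    steps = ["generate-ancestors", "match-ancestors", "match-samples"]
    return [["tsinfer", step, *options, samples] for step in steps]


# Print the commands in a form that can be pasted into a shell.
for command in stepwise_commands("1kg-chr1.samples", num_threads=4):
    print(shlex.join(command))
```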

Argument details

Command line interface for tsinfer.

usage: tsinfer [-h] [-V]
               {generate-ancestors,ga,match-ancestors,ma,augment-ancestors,aa,match-samples,ms,infer,list,ls,verify}
               ...

Positional Arguments

subcommand

Possible choices: generate-ancestors, ga, match-ancestors, ma, augment-ancestors, aa, match-samples, ms, infer, list, ls, verify

Named Arguments

-V, --version

show program’s version number and exit

Sub-commands:

generate-ancestors (ga)

Generates a set of ancestors from the input sample data and stores the results in a tsinfer ancestors file.

tsinfer generate-ancestors [-h] [-a ANCESTORS] [--num-threads NUM_THREADS]
                           [--num-flush-threads NUM_FLUSH_THREADS]
                           [--progress] [-v]
                           [--log-section {tsinfer.inference,tsinfer.formats,tsinfer.threads}]
                           samples
Positional Arguments
samples

The input sample data in tsinfer ‘samples’ format. Please see the documentation at http://tsinfer.readthedocs.io/ for information on how to import data into this format.

Named Arguments
-a, --ancestors

The path to the ancestor data file in tsinfer ‘ancestors’ format. If not specified, this defaults to the input samples file stem with the extension ‘.ancestors’. For example, if ‘1kg-chr1.samples’ is the input file then the default ancestors file would be ‘1kg-chr1.ancestors’.

--num-threads, -t

The number of worker threads to use. If < 1, use a simpler unthreaded algorithm (default).

--num-flush-threads, -F

The number of data flush threads to use. If < 1, all data is flushed synchronously in the main thread (default=2).

--progress, -p

Show a progress monitor.

-v, --verbosity

Increase the verbosity

--log-section, -L

Possible choices: tsinfer.inference, tsinfer.formats, tsinfer.threads

Log messages only for the specified module

match-ancestors (ma)

Matches the ancestors built by the ‘generate-ancestors’ command against each other using the model information specified in the input file and writes the output to a tskit .trees file.

tsinfer match-ancestors [-h] [-v]
                        [--log-section {tsinfer.inference,tsinfer.formats,tsinfer.threads}]
                        [-a ANCESTORS] [-A ANCESTORS_TREES]
                        [--num-threads NUM_THREADS] [--progress]
                        [--no-path-compression]
                        samples
Positional Arguments
samples

The input sample data in tsinfer ‘samples’ format. Please see the documentation at http://tsinfer.readthedocs.io/ for information on how to import data into this format.

Named Arguments
-v, --verbosity

Increase the verbosity

--log-section, -L

Possible choices: tsinfer.inference, tsinfer.formats, tsinfer.threads

Log messages only for the specified module

-a, --ancestors

The path to the ancestor data file in tsinfer ‘ancestors’ format. If not specified, this defaults to the input samples file stem with the extension ‘.ancestors’. For example, if ‘1kg-chr1.samples’ is the input file then the default ancestors file would be ‘1kg-chr1.ancestors’.

-A, --ancestors-trees

The path to the ancestor trees file in tskit ‘.trees’ format. If not specified, this defaults to the input samples file stem with the extension ‘.ancestors.trees’. For example, if ‘1kg-chr1.samples’ is the input file then the default ancestors trees file would be ‘1kg-chr1.ancestors.trees’.

--num-threads, -t

The number of worker threads to use. If < 1, use a simpler unthreaded algorithm (default).

--progress, -p

Show a progress monitor.

--no-path-compression

Disable path compression
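
The default file names above follow one rule: take the stem of the samples path and append an extension. A small sketch of that rule is given here, assuming that "stem" means the path with its final extension removed. The helper default_path is hypothetical, and this is not necessarily how tsinfer computes the names internally.

```python
from pathlib import Path


def default_path(samples, extension):
    """Replace the final extension of ``samples`` with ``extension``.

    Assumption: the "stem" in the option descriptions is the samples path
    with its last extension removed. This mirrors the documented examples
    and is not taken from the tsinfer source.
    """
    path = Path(samples)
    return str(path.with_name(path.stem + extension))


print(default_path("1kg-chr1.samples", ".ancestors"))
print(default_path("1kg-chr1.samples", ".ancestors.trees"))
```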

augment-ancestors (aa)

Augments the ancestors tree sequence by adding a subset of the samples

tsinfer augment-ancestors [-h] [-n NUM_SAMPLES] [-A ANCESTORS_TREES] [-v]
                          [--log-section {tsinfer.inference,tsinfer.formats,tsinfer.threads}]
                          [--no-path-compression] [--num-threads NUM_THREADS]
                          [--progress]
                          samples augmented_ancestors
Positional Arguments
samples

The input sample data in tsinfer ‘samples’ format. Please see the documentation at http://tsinfer.readthedocs.io/ for information on how to import data into this format.

augmented_ancestors

The path to write the augmented ancestors tree sequence to

Named Arguments
-n, --num-samples

The number of samples to use. Defaults to 10% of the total.

-A, --ancestors-trees

The path to the ancestor trees file in tskit ‘.trees’ format. If not specified, this defaults to the input samples file stem with the extension ‘.ancestors.trees’. For example, if ‘1kg-chr1.samples’ is the input file then the default ancestors trees file would be ‘1kg-chr1.ancestors.trees’.

-v, --verbosity

Increase the verbosity

--log-section, -L

Possible choices: tsinfer.inference, tsinfer.formats, tsinfer.threads

Log messages only for the specified module

--no-path-compression

Disable path compression

--num-threads, -t

The number of worker threads to use. If < 1, use a simpler unthreaded algorithm (default).

--progress, -p

Show a progress monitor.

match-samples (ms)

Matches the samples against the tree sequence structure built by the match-ancestors command

tsinfer match-samples [-h] [-v]
                      [--log-section {tsinfer.inference,tsinfer.formats,tsinfer.threads}]
                      [-A ANCESTORS_TREES] [--no-path-compression]
                      [--no-simplify] [-O OUTPUT_TREES]
                      [--num-threads NUM_THREADS] [--progress]
                      samples
Positional Arguments
samples

The input sample data in tsinfer ‘samples’ format. Please see the documentation at http://tsinfer.readthedocs.io/ for information on how to import data into this format.

Named Arguments
-v, --verbosity

Increase the verbosity

--log-section, -L

Possible choices: tsinfer.inference, tsinfer.formats, tsinfer.threads

Log messages only for the specified module

-A, --ancestors-trees

The path to the ancestor trees file in tskit ‘.trees’ format. If not specified, this defaults to the input samples file stem with the extension ‘.ancestors.trees’. For example, if ‘1kg-chr1.samples’ is the input file then the default ancestors trees file would be ‘1kg-chr1.ancestors.trees’.

--no-path-compression

Disable path compression

--no-simplify

Do not simplify the output tree sequence

-O, --output-trees

The path to the output trees file in tskit ‘.trees’ format. If not specified, this defaults to the input samples file stem with the extension ‘.trees’. For example, if ‘1kg-chr1.samples’ is the input file then the default output file would be ‘1kg-chr1.trees’.

--num-threads, -t

The number of worker threads to use. If < 1, use a simpler unthreaded algorithm (default).

--progress, -p

Show a progress monitor.

infer

Runs the generate-ancestors, match-ancestors and match-samples commands without writing the intermediate files to disk. Not recommended for large inferences.

tsinfer infer [-h] [-v]
              [--log-section {tsinfer.inference,tsinfer.formats,tsinfer.threads}]
              [-O OUTPUT_TREES] [--num-threads NUM_THREADS] [--progress]
              samples
Positional Arguments
samples

The input sample data in tsinfer ‘samples’ format. Please see the documentation at http://tsinfer.readthedocs.io/ for information on how to import data into this format.

Named Arguments
-v, --verbosity

Increase the verbosity

--log-section, -L

Possible choices: tsinfer.inference, tsinfer.formats, tsinfer.threads

Log messages only for the specified module

-O, --output-trees

The path to the output trees file in tskit ‘.trees’ format. If not specified, this defaults to the input samples file stem with the extension ‘.trees’. For example, if ‘1kg-chr1.samples’ is the input file then the default output file would be ‘1kg-chr1.trees’.

--num-threads, -t

The number of worker threads to use. If < 1, use a simpler unthreaded algorithm (default).

--progress, -p

Show a progress monitor.

list (ls)

Show a summary of the specified tsinfer related file.

tsinfer list [-h] [-v]
             [--log-section {tsinfer.inference,tsinfer.formats,tsinfer.threads}]
             [--storage]
             path
Positional Arguments
path

The tsinfer file to show information about.

Named Arguments
-v, --verbosity

Increase the verbosity

--log-section, -L

Possible choices: tsinfer.inference, tsinfer.formats, tsinfer.threads

Log messages only for the specified module

--storage, -s

Show detailed information about data storage.

verify

Verify that the specified tree sequence and samples files represent the same data

tsinfer verify [-h] [-v]
               [--log-section {tsinfer.inference,tsinfer.formats,tsinfer.threads}]
               [--progress]
               samples tree_sequence
Positional Arguments
samples

The input sample data in tsinfer ‘samples’ format. Please see the documentation at http://tsinfer.readthedocs.io/ for information on how to import data into this format.

tree_sequence

The tree sequence to compare with in .trees format.

Named Arguments
-v, --verbosity

Increase the verbosity

--log-section, -L

Possible choices: tsinfer.inference, tsinfer.formats, tsinfer.threads

Log messages only for the specified module

--progress, -p

Show a progress monitor.
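
A common use of verify is to check the output of an inference immediately after it finishes. The sketch below chains the two commands from Python. The helper infer_and_verify is hypothetical, and the sketch assumes that a failed verification is reported through a non-zero exit status, which the text above does not state.

```python
import subprocess


def infer_and_verify(samples, output_trees, run=subprocess.run):
    """Run ``tsinfer infer`` and then ``tsinfer verify`` on its output.

    Hypothetical helper. ``run`` is injectable so that the command
    sequence can be inspected without tsinfer being installed.
    """
    commands = [
        ["tsinfer", "infer", "-O", output_trees, samples],
        ["tsinfer", "verify", samples, output_trees],
    ]
    for command in commands:
        # check=True raises CalledProcessError on a non-zero exit status.
        run(command, check=True)
    return commands
```

With tsinfer installed, a call such as infer_and_verify("1kg-chr1.samples", "1kg-chr1.trees") would run both steps and stop at the first one that fails.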