known_mutations

outbreak_data.known_mutations(pango_lin=None, descendants=False, mutations=None, freq=0.8, **req_args)

Get information about each mutation present in a lineage or set of lineages in clinical sequences.

Parameters
  • pango_lin -- A string or list of lineage names. Return mutations occurring in any of these lineages.

  • descendants -- If True, also return mutations found in any descendants of pango_lin (works only with a single pango_lin).

  • mutations -- A string or list of mutation names. Return only mutations co-occurring with all of these mutations.

  • freq -- A frequency threshold above which to return mutations (default 0.8).

Returns

A pandas DataFrame of mutation information, indexed by mutation name.

Parameter example 1

{ 'pango_lin': 'BA.2.86.1', 'descendants': True }

Parameter example 2

{ 'pango_lin': ['BA.1', 'BA.2'] }
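
The parameter examples above are plain keyword dictionaries, so each one can be unpacked directly into the call. The sketch below illustrates that mapping. known_mutations_stub is a hypothetical stand-in that only mirrors the documented signature and echoes its arguments, so the snippet runs without the package or network access; with the real library the same dictionaries would presumably be passed as outbreak_data.known_mutations(**params).

```python
def known_mutations_stub(pango_lin=None, descendants=False, mutations=None,
                         freq=0.8, **req_args):
    """Hypothetical stand-in: returns the resolved arguments, not real data."""
    return {
        "pango_lin": pango_lin,
        "descendants": descendants,
        "mutations": mutations,
        "freq": freq,
    }

# Parameter example 1: a single lineage plus its descendants.
example_1 = {"pango_lin": "BA.2.86.1", "descendants": True}
# Parameter example 2: mutations occurring in any of several lineages.
example_2 = {"pango_lin": ["BA.1", "BA.2"]}

call_1 = known_mutations_stub(**example_1)
call_2 = known_mutations_stub(**example_2)

print(call_1)
print(call_2)
```

Arguments left out of a dictionary fall back to the defaults in the signature, so both calls use freq=0.8.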

Example Usage

Get info on all mutations in "BA.2.86" above the default frequency threshold (freq=0.8):

>>> df = outbreak_data.known_mutations('BA.2.86')
>>> df

              mutation_count  lineage_count  lineage   gene      ref_aa  \
mutation
e:t9i                    995            995  BA.2.86      E           T
orf1b:r1315c             995            995  BA.2.86  ORF1b           R
orf1b:p314l              994            995  BA.2.86  ORF1b           P
orf3a:t223i              994            995  BA.2.86  ORF3a           T
n:r203k                  992            995  BA.2.86      N           R
...                      ...            ...      ...    ...         ...
s:r403k                  858            995  BA.2.86      S           R
s:del25/27               832            995  BA.2.86      S  S:DEL25/27
s:l24s                   832            995  BA.2.86      S           L
s:n460k                  808            995  BA.2.86      S           N
s:s477n                  799            995  BA.2.86      S           S

                alt_aa  codon_num  codon_end          type  prevalence  \
mutation
e:t9i                I          9        NaN  substitution    1.000000
orf1b:r1315c         C       1315        NaN  substitution    1.000000
orf1b:p314l          L        314        NaN  substitution    0.998995
orf3a:t223i          I        223        NaN  substitution    0.998995
n:r203k              K        203        NaN  substitution    0.996985
...                ...        ...        ...           ...         ...
s:r403k              K        403        NaN  substitution    0.862312
s:del25/27    DEL25/27         25       27.0      deletion    0.836181
s:l24s               S         24        NaN  substitution    0.836181
s:n460k              K        460        NaN  substitution    0.812060
s:s477n              N        477        NaN  substitution    0.803015

              change_length_nt
mutation
e:t9i                      NaN
orf1b:r1315c               NaN
orf1b:p314l                NaN
orf3a:t223i                NaN
n:r203k                    NaN
...                        ...
s:r403k                    NaN
s:del25/27                 9.0
s:l24s                     NaN
s:n460k                    NaN
s:s477n                    NaN

[81 rows x 11 columns]
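
In the sample output above, the prevalence column matches mutation_count divided by lineage_count, and every listed row sits above the default freq of 0.8. The sketch below re-checks that arithmetic on a few rows copied from the output. The relationship is inferred from these numbers rather than stated by the library, so treat it as an assumption.

```python
# Rows copied from the sample output:
# (mutation, mutation_count, lineage_count, prevalence)
rows = [
    ("e:t9i", 995, 995, 1.000000),
    ("orf1b:p314l", 994, 995, 0.998995),
    ("n:r203k", 992, 995, 0.996985),
    ("s:r403k", 858, 995, 0.862312),
    ("s:del25/27", 832, 995, 0.836181),
    ("s:s477n", 799, 995, 0.803015),
]

FREQ = 0.8  # default threshold from the documented signature

ratios = {}
for name, mutation_count, lineage_count, prevalence in rows:
    # Assumption: prevalence is the fraction of the lineage's sequences
    # that carry the mutation.
    ratio = mutation_count / lineage_count
    assert abs(ratio - prevalence) < 1e-6, name  # agrees to the 6 printed decimals
    assert ratio > FREQ, name                    # row passes the default threshold
    ratios[name] = ratio
    print(f"{name:<12} {ratio:.6f}")
```

On the returned DataFrame itself, the equivalent check would be df['mutation_count'] / df['lineage_count'] compared against df['prevalence'].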