mutation_prevalences

outbreak_data.mutation_prevalences(mutations=None, location=None, pango_lin=None, datemin=None, datemax=None, **req_args)

Get the prevalence of a set of mutations within a subset of clinical sequences.

Parameters
  • mutations -- List of mutations to query for.

  • location -- The ID string of a location to query within.

  • pango_lin -- The name of a pangolin lineage to query within.

  • datemin -- (Optional). String containing the start of the date range to query within, in YYYY-MM-DD format.

  • datemax -- (Optional). String containing the end of the date range to query within, in YYYY-MM-DD format.

Returns

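A pandas DataFrame indexed by the queried mutation, with one row per mutation giving the lineage, the sequence counts, the prevalence proportion, and its confidence interval bounds.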

Parameter example

{ 'mutations': ['orf1b:r1315c', 's:l24s'], 'pango_lin': 'BA.2' }
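
A parameter dictionary like the one above can be assembled and checked before any request is made. The following is a minimal sketch, not part of the library: parse_ymd and build_prevalence_args are hypothetical helpers, and the dates are illustrative values only. It enforces the YYYY-MM-DD format described for datemin and datemax.

```python
from datetime import date


def parse_ymd(value):
    """Hypothetical helper: return a date for a strict YYYY-MM-DD string."""
    parsed = date.fromisoformat(value)  # raises ValueError on malformed input
    # Reject other ISO spellings (e.g. "20230101") that newer Pythons accept.
    if parsed.isoformat() != value:
        raise ValueError(f"expected YYYY-MM-DD, got {value!r}")
    return parsed


def build_prevalence_args(mutations, pango_lin=None, datemin=None, datemax=None):
    """Hypothetical helper: build keyword arguments for mutation_prevalences."""
    args = {"mutations": list(mutations)}
    if pango_lin is not None:
        args["pango_lin"] = pango_lin
    start = parse_ymd(datemin) if datemin is not None else None
    end = parse_ymd(datemax) if datemax is not None else None
    if start is not None and end is not None and start > end:
        raise ValueError("datemin must not be later than datemax")
    if datemin is not None:
        args["datemin"] = datemin
    if datemax is not None:
        args["datemax"] = datemax
    return args


# Illustrative values; the date range is an assumption, not from the docs.
params = build_prevalence_args(
    ["orf1b:r1315c", "s:l24s"],
    pango_lin="BA.2",
    datemin="2023-01-01",
    datemax="2023-06-30",
)
print(params)
```

Because the keys match the keyword names in the signature, such a dictionary could presumably be unpacked into the call as outbreak_data.mutation_prevalences(**params).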

Example Usage

Get prevalence data on mutations orf1b:r1315c and s:l24s under lineage BA.2.86 in the US:

>>> df = outbreak_data.mutation_prevalences(['orf1b:r1315c', 's:l24s'], location='USA', pango_lin='BA.2.86')
>>> df

             pangolin_lineage  lineage_count  mutation_count  proportion  \
query
s:l24s                ba.2.86             84              77    0.916667
orf1b:r1315c          ba.2.86             84              84    1.000000

              proportion_ci_lower  proportion_ci_upper
query
s:l24s                   0.843358             0.961949
orf1b:r1315c             0.970625             0.999994
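
The proportion column in this output appears to be mutation_count divided by lineage_count. That relationship is inferred from the numbers shown here, not confirmed from the library's source. A minimal sketch that checks it using the counts copied from the example (the rows dictionary and the proportion function are illustrative names):

```python
# Counts copied from the example output; no request is made here.
rows = {
    "s:l24s": {"lineage_count": 84, "mutation_count": 77},
    "orf1b:r1315c": {"lineage_count": 84, "mutation_count": 84},
}


def proportion(row):
    """Share of the lineage's sequences that carry the mutation."""
    return row["mutation_count"] / row["lineage_count"]


for query, row in rows.items():
    # Six decimal places, matching the precision pandas displays above.
    print(f"{query}: {proportion(row):.6f}")
```

The printed values can be compared directly against the proportion column above.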