mutation_prevalences
- outbreak_data.mutation_prevalences(mutations=None, location=None, pango_lin=None, datemin=None, datemax=None, **req_args)
Get the prevalence of a set of mutations within a subset of clinical sequences.
- Parameters
mutations -- List of mutations to query for.
location -- The ID string of a location to query within.
pango_lin -- The name of a pangolin lineage to query within.
datemin -- (Optional) Start of the date range to query within, as a string in YYYY-MM-DD format.
datemax -- (Optional) End of the date range to query within, as a string in YYYY-MM-DD format.
- Returns
A pandas DataFrame of mutation information, with one row per queried mutation.
- Parameter example
{ 'mutations': ['orf1b:r1315c', 's:l24s'], 'pango_lin': 'BA.2' }
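The parameter example above is a plain dictionary, so it can be unpacked into keyword arguments. The sketch below builds such a dictionary and adds the optional date bounds with `date.isoformat()`, which produces YYYY-MM-DD strings. The helper name `build_query` and the dates are illustrative assumptions, not part of the package.

```python
from datetime import date


def build_query(mutations, pango_lin, start=None, end=None):
    """Hypothetical helper: assemble keyword arguments for mutation_prevalences."""
    params = {'mutations': list(mutations), 'pango_lin': pango_lin}
    # date.isoformat() yields the YYYY-MM-DD form the date parameters expect.
    if start is not None:
        params['datemin'] = start.isoformat()
    if end is not None:
        params['datemax'] = end.isoformat()
    return params


params = build_query(['orf1b:r1315c', 's:l24s'], 'BA.2',
                     start=date(2023, 1, 1), end=date(2023, 6, 30))
print(params)

# With the package imported, the dictionary can be unpacked into the call:
# df = outbreak_data.mutation_prevalences(**params)
```

Keeping the dates as `date` objects until the last step avoids hand-typed strings in the wrong format.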
Example Usage
Get prevalence data on mutations orf1b:r1315c and s:l24s under lineage BA.2.86 in the US:
>>> df = outbreak_data.mutation_prevalences('orf1b:r1315c, s:l24s', 'USA', 'BA.2.86')
>>> df
             pangolin_lineage  lineage_count  mutation_count  proportion  \
query
s:l24s                ba.2.86             84              77    0.916667
orf1b:r1315c          ba.2.86             84              84    1.000000

              proportion_ci_lower  proportion_ci_upper
query
s:l24s                   0.843358             0.961949
orf1b:r1315c             0.970625             0.999994
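In this output, the proportion column appears to be mutation_count divided by lineage_count. The sketch below checks that reading against the two rows shown; the relationship is inferred from the example values rather than stated by the documentation, and `recompute_proportion` is a hypothetical helper.

```python
# Rows copied from the example output:
# (query, lineage_count, mutation_count, reported proportion)
rows = [
    ('s:l24s', 84, 77, 0.916667),
    ('orf1b:r1315c', 84, 84, 1.000000),
]


def recompute_proportion(lineage_count, mutation_count):
    """Assumed relationship: share of lineage sequences carrying the mutation."""
    return round(mutation_count / lineage_count, 6)


for query, lineage_count, mutation_count, reported in rows:
    recomputed = recompute_proportion(lineage_count, mutation_count)
    # Compare with a small tolerance, since the reported values are rounded.
    matches = abs(recomputed - reported) < 1e-6
    print(query, recomputed, matches)
```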