lineage_cl_prevalence

outbreak_data.lineage_cl_prevalence(pango_lin, descendants=False, location=None, mutations=None, datemin=None, datemax=None, cumulative=False, **req_args)

Get the daily prevalence of a set of lineages in clinical sequencing data.

Parameters
  • pango_lin -- Lineage name, or list of lineage names, to query for.

  • descendants -- If True, include all descendant lineages of pango_lin in the query along with pango_lin itself (works only with a single lineage in pango_lin).

  • location -- A string containing the location ID to query within.

  • mutations -- A list of mutation names; query within the subset of sequences containing all of these.

  • datemin -- (Optional) Start of the date range to query within, as a string in YYYY-MM-DD format.

  • datemax -- (Optional) End of the date range to query within, as a string in YYYY-MM-DD format.

  • cumulative -- If True, return the cumulative global prevalence since the first day of detection.
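
Because datemin and datemax are passed as plain strings, a malformed or reversed range is easy to send by accident. The following is a minimal sketch of a client-side check, assuming nothing about how the package itself validates dates; date_range_kwargs is a hypothetical helper and is not part of outbreak_data.

```python
from datetime import datetime


def _parse_ymd(value):
    """Parse a strict YYYY-MM-DD string, raising ValueError otherwise."""
    parsed = datetime.strptime(value, "%Y-%m-%d")
    # strptime tolerates unpadded fields such as '2022-3-1'; reject those too.
    if parsed.strftime("%Y-%m-%d") != value:
        raise ValueError(f"expected YYYY-MM-DD, got {value!r}")
    return parsed.date()


def date_range_kwargs(datemin=None, datemax=None):
    """Hypothetical helper: validate a date range and return it as kwargs."""
    start = _parse_ymd(datemin) if datemin is not None else None
    end = _parse_ymd(datemax) if datemax is not None else None
    if start is not None and end is not None and start > end:
        raise ValueError(f"datemin {datemin} is after datemax {datemax}")
    return {"datemin": datemin, "datemax": datemax}


kwargs = date_range_kwargs("2022-03-01", "2022-03-08")
# The result can then be unpacked into the documented call, for example:
# outbreak_data.lineage_cl_prevalence('BA.2', location='CAN', **kwargs)
```

Failing early on the client side keeps a typo in a date from turning into an empty or misleading result.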

Returns

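A pandas DataFrame indexed by date and query, with one row per day and lineage. The columns are total_count, lineage_count, total_count_rolling, lineage_count_rolling, proportion, proportion_ci_lower and proportion_ci_upper.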

Parameter example

{ 'pango_lin': 'BA.2.86.1', 'descendants': True }

Example Usage

Get the prevalence data for BA.2.86.1 in Canada:

>>> df = outbreak_data.lineage_cl_prevalence('BA.2.86.1', location = 'CAN')
>>> df

                     total_count  lineage_count  total_count_rolling  \
date       query
2023-09-08 BA.2.86.1          270              1           251.000000
2023-09-09 BA.2.86.1          204              0           260.571429
2023-09-10 BA.2.86.1          227              0           266.285714
2023-09-11 BA.2.86.1          390              0           290.285714
2023-09-12 BA.2.86.1          409              1           300.142857
...                           ...            ...                  ...
2024-05-11 BA.2.86.1           20              0            37.285714
2024-05-12 BA.2.86.1           24              0            35.000000
2024-05-13 BA.2.86.1           69              0            36.285714
2024-05-14 BA.2.86.1           36              0            35.142857
2024-05-15 BA.2.86.1            3              0            31.428571

                      lineage_count_rolling  proportion  proportion_ci_lower  \
date       query
2023-09-08 BA.2.86.1               0.142857    0.000569             0.000002
2023-09-09 BA.2.86.1               0.142857    0.000548             0.000002
2023-09-10 BA.2.86.1               0.142857    0.000536             0.000002
2023-09-11 BA.2.86.1               0.142857    0.000492             0.000002
2023-09-12 BA.2.86.1               0.285714    0.000952             0.000002
...                                     ...         ...                  ...
2024-05-11 BA.2.86.1               0.000000    0.000000             0.000013
2024-05-12 BA.2.86.1               0.000000    0.000000             0.000014
2024-05-13 BA.2.86.1               0.000000    0.000000             0.000014
2024-05-14 BA.2.86.1               0.000000    0.000000             0.000014
2024-05-15 BA.2.86.1               0.000000    0.000000             0.000016

                      proportion_ci_upper
date       query
2023-09-08 BA.2.86.1             0.009948
2023-09-09 BA.2.86.1             0.009569
2023-09-10 BA.2.86.1             0.009390
2023-09-11 BA.2.86.1             0.008617
2023-09-12 BA.2.86.1             0.008331
...                                   ...
2024-05-11 BA.2.86.1             0.065207
2024-05-12 BA.2.86.1             0.068777
2024-05-13 BA.2.86.1             0.066944
2024-05-14 BA.2.86.1             0.068777
2024-05-15 BA.2.86.1             0.077230

[251 rows x 7 columns]
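
The result is indexed by both date and query, so a common first step is to select a single lineage and work with a plain date index. The sketch below rebuilds the first five rows shown above by hand (two columns only) rather than calling the API. It assumes the date level holds timestamps, which the printed output does not confirm.

```python
import pandas as pd

# First five rows of the output above, typed in by hand for illustration.
rows = [
    ("2023-09-08", "BA.2.86.1", 270, 1),
    ("2023-09-09", "BA.2.86.1", 204, 0),
    ("2023-09-10", "BA.2.86.1", 227, 0),
    ("2023-09-11", "BA.2.86.1", 390, 0),
    ("2023-09-12", "BA.2.86.1", 409, 1),
]
df = pd.DataFrame(rows, columns=["date", "query", "total_count", "lineage_count"])
df["date"] = pd.to_datetime(df["date"])  # assumption: dates are timestamps
df = df.set_index(["date", "query"])

# Select one lineage; xs drops the 'query' level and leaves a date index.
ba = df.xs("BA.2.86.1", level="query")

# Days on which the lineage was sequenced at least once.
detected = ba[ba["lineage_count"] > 0]
detected_days = list(detected.index.strftime("%Y-%m-%d"))
```

The same selection should work on the real return value as long as its index levels are named date and query, as the printed output suggests.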

Get the prevalence data for BA.2 in Canada for the first week of March 2022:

>>> df = outbreak_data.lineage_cl_prevalence('BA.2', location = 'CAN',
...                                          datemin = '2022-03-01',
...                                          datemax = '2022-03-08')
>>> df

                  total_count  lineage_count  total_count_rolling  \
date       query
2022-03-01 BA.2           569             77           569.000000
2022-03-02 BA.2           626             71           597.500000
2022-03-03 BA.2           572             78           589.000000
2022-03-04 BA.2           540             72           576.750000
2022-03-05 BA.2           413             70           544.000000
2022-03-06 BA.2           457             59           529.500000
2022-03-07 BA.2           549             75           532.285714
2022-03-08 BA.2           653            114           544.285714

                  lineage_count_rolling  proportion  proportion_ci_lower  \
date       query
2022-03-01 BA.2               77.000000    0.135325             0.109091
2022-03-02 BA.2               74.000000    0.123849             0.099185
2022-03-03 BA.2               75.333333    0.127900             0.102257
2022-03-04 BA.2               74.500000    0.129172             0.102845
2022-03-05 BA.2               73.600000    0.135294             0.109177
2022-03-06 BA.2               71.166667    0.134404             0.106981
2022-03-07 BA.2               71.714286    0.134729             0.108272
2022-03-08 BA.2               77.000000    0.141470             0.114181

                  proportion_ci_upper
date       query
2022-03-01 BA.2              0.165257
2022-03-02 BA.2              0.151938
2022-03-03 BA.2              0.156063
2022-03-04 BA.2              0.157372
2022-03-05 BA.2              0.166742
2022-03-06 BA.2              0.164927
2022-03-07 BA.2              0.166358
2022-03-08 BA.2              0.172709
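
The rolling and proportion columns in the BA.2 output above can be reproduced from the raw counts. The sketch below is inferred from the sample numbers only, not from the package source: the values are consistent with a trailing 7-day mean that starts at the first returned day, and with proportion being the ratio of the two rolling means. The confidence-interval columns are not reproduced, since the method behind them is not stated here.

```python
import pandas as pd

# Raw daily counts copied from the BA.2 example output above.
dates = pd.date_range("2022-03-01", "2022-03-08", freq="D")
counts = pd.DataFrame(
    {
        "total_count": [569, 626, 572, 540, 413, 457, 549, 653],
        "lineage_count": [77, 71, 78, 72, 70, 59, 75, 114],
    },
    index=dates,
)

# Assumed smoothing: trailing 7-day mean, using fewer days at the start.
rolling = counts.rolling(window=7, min_periods=1).mean().add_suffix("_rolling")

# Assumed definition: ratio of the smoothed counts, not of the raw counts.
proportion = rolling["lineage_count_rolling"] / rolling["total_count_rolling"]

result = counts.join(rolling).assign(proportion=proportion)
```

On 2022-03-08 the raw ratio is 114 / 653, about 0.175, while the documented proportion is 0.141470, which is why the smoothed ratio is assumed here.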