all_lineage_prevalences¶
- outbreak_data.all_lineage_prevalences(location=None, ndays=180, nday_threshold=10, other_threshold=0.05, other_exclude=None, cumulative=False, **req_args)¶
Get prevalences of lineages circulating in a location according to clinical sequencing data.
- Parameters
location -- A string containing a location ID. If not specified, global data is returned.
other_threshold -- Minimum prevalence threshold below which lineages will be aggregated under "other".
nday_threshold -- Minimum number of days in which a lineage's prevalence must be above other_threshold in order to not be aggregated.
ndays -- The number of days before the current date to be used as a window to accumulate lineages under "other".
other_exclude -- List of lineages that are not to be included under "other".
cumulative -- If true return the cumulative prevalence; otherwise return daily data.
- Returns
A pandas dataframe containing lineage prevalences.
- Parameter example
{ 'location': 'USA_US-HI' }
- Parameter example
{ 'cumulative': True }
Example usage:
#Find the prevalence all lineages in Argentina that begin with 'xbb.1'
df = od.prevalence_by_location("ARG", startswith = 'xbb.1')
print(df)
date total_count lineage_count lineage prevalence \
1454 2022-10-12 3 1 xbb.1 0.333333
1455 2022-10-13 0 0 xbb.1 0.000000
1456 2022-10-14 0 0 xbb.1 0.000000
1457 2022-10-15 0 0 xbb.1 0.000000
1458 2022-10-16 0 0 xbb.1 0.000000
... ... ... ... ... ...
1673 2023-03-17 0 0 xbb.1.5 0.000000
1674 2023-03-18 0 0 xbb.1.5 0.000000
1675 2023-03-19 0 0 xbb.1.5 0.000000
1676 2023-03-20 0 0 xbb.1.5 0.000000
1677 2023-03-21 1 1 xbb.1.5 1.000000
prevalence_rolling
1454 0.350000
1455 0.179487
1456 0.109375
1457 0.065421
1458 0.058577
... ...
1673 1.000000
1674 1.000000
1675 1.000000
1676 1.000000
1677 1.000000
[224 rows x 6 columns]