These functions produce MSF dictionaries based on DHIS2 (for OCA outbreaks) and ODK (for intersectional outbreaks and surveys) data sets. Each dictionary defines the data element names, codes, short names, and types, along with the key/value pairs used to translate the codes into a human-readable format.
Arguments
- dictionary
Specify which dictionary you would like to use.
MSF OCA outbreaks include: "AJS", "Cholera", "Measles", "Meningitis"
MSF intersectional outbreaks include: "AJS_intersectional", "Cholera_intersectional", "Diphtheria_intersectional", "Measles_intersectional", "Meningitis_intersectional"
MSF OCA surveys include: "Mortality", "Nutrition", "Vaccination_long", "Vaccination_short", "ebs"
- tibble
If TRUE (default), return the data dictionary as a tidyverse tibble; otherwise, return a list.
- long
If TRUE (default), the returned data dictionary is in long format, with each option getting one row. If FALSE, two data frames are returned: one with variables and the other with content options.
- compact
If TRUE (default), a nested data frame is returned in which each row represents a single variable, with a nested data frame column called "options" that can be expanded with tidyr::unnest(). This only works if long = TRUE.
Value
A data frame (tibble) containing the specified MSF data dictionary.
If long = TRUE, each variable-option pair is represented as a row.
If compact = TRUE, the options are nested as a data frame column named
"options". If long = FALSE, a list is returned with two data frames:
dictionary and options.
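The three return shapes described above can be pictured with a small base R sketch. The toy rows below are hypothetical and hand-written for illustration; they are not taken from any real MSF dictionary, and the code only mimics the shapes rather than calling msf_dict() itself.

```r
# Hypothetical toy dictionary in long format: one row per variable-option pair.
long_dict <- data.frame(
  data_element_shortname = c("sex", "sex", "sex", "pregnant", "pregnant"),
  option_code = c("M", "F", "U", "Y", "N"),
  option_name = c("Male", "Female", "Unknown", "Yes", "No"),
  stringsAsFactors = FALSE
)

# Compact-style shape: one row per variable, with the options for that
# variable stored as a data frame inside a list column named "options".
opts <- split(
  long_dict[c("option_code", "option_name")],
  long_dict$data_element_shortname
)
compact_dict <- data.frame(
  data_element_shortname = names(opts),
  stringsAsFactors = FALSE
)
compact_dict$options <- unname(opts)

# Expanding the nested column again recovers one row per variable-option pair
# (this is the role tidyr::unnest() plays for the real compact dictionary).
expanded <- do.call(
  rbind,
  Map(
    function(v, o) cbind(data_element_shortname = v, o),
    compact_dict$data_element_shortname,
    compact_dict$options
  )
)

# Shape resembling long = FALSE: a list of two data frames, one describing
# the variables and one holding the options.
two_tables <- list(
  dictionary = unique(long_dict["data_element_shortname"]),
  options = long_dict
)

str(compact_dict, max.level = 1)
```

The compact shape is convenient for browsing variables, while the long shape is the one that translation tools typically expect, since every code and label pair sits on its own row.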
Examples
if (require("dplyr") && require("matchmaker")) {
  withAutoprint({
    # You will often want to use MSF dictionaries to translate codes to human-
    # readable variables. Here, we generate a data set of 20 cases:
    dat <- gen_data(
      dictionary = "Cholera",
      varnames = "data_element_shortname",
      numcases = 20,
      org = "MSF"
    )
    print(dat)
    # We want the expanded dictionary, so we will select `compact = FALSE`
    dict <- msf_dict(dictionary = "Cholera", long = TRUE, compact = FALSE, tibble = TRUE)
    print(dict)
    # Now we can use matchmaker to translate the codes in the data:
    dat_clean <- matchmaker::match_df(dat, dict,
      from = "option_code",
      to = "option_name",
      by = "data_element_shortname",
      order = "option_order_in_set"
    )
    print(dat_clean)
  })
}
#> > dat <- gen_data(dictionary = "Cholera", varnames = "data_element_shortname",
#> + numcases = 20, org = "MSF")
#> > print(dat)
#> # A tibble: 20 × 45
#> case_number date_of_consultation_admiss…¹ patient_origin age_years age_months
#> <chr> <date> <chr> <int> <int>
#> 1 A1 2018-01-07 Village A 46 NA
#> 2 A2 2018-01-29 Village A 9 NA
#> 3 A3 2018-01-20 Village D 10 NA
#> 4 A4 2018-04-18 Village C 9 NA
#> 5 A5 2018-01-18 Village C 28 NA
#> 6 A6 2018-01-10 Village C 51 NA
#> 7 A7 2018-01-01 Village B 16 NA
#> 8 A8 2018-04-09 Village C 21 NA
#> 9 A9 2018-04-26 Village A 9 NA
#> 10 A10 2018-02-28 Village C 40 NA
#> 11 A11 2018-01-03 Village B 34 NA
#> 12 A12 2018-04-30 Village A 62 NA
#> 13 A13 2018-03-08 Village C 43 NA
#> 14 A14 2018-02-28 Village B 38 NA
#> 15 A15 2018-03-13 Village A 70 NA
#> 16 A16 2018-04-22 Village D 17 NA
#> 17 A17 2018-03-14 Village B 37 NA
#> 18 A18 2018-01-22 Village A 45 NA
#> 19 A19 2018-01-13 Village C 59 NA
#> 20 A20 2018-04-24 Village D 83 NA
#> # ℹ abbreviated name: ¹date_of_consultation_admission
#> # ℹ 40 more variables: age_days <int>, sex <fct>, pregnant <fct>,
#> # trimester <fct>, foetus_alive_at_admission <fct>, exit_status <fct>,
#> # date_of_exit <date>, time_to_death <fct>, pregnancy_outcome_at_exit <fct>,
#> # previously_vaccinated <fct>, previous_vaccine_doses_received <fct>,
#> # readmission <fct>, msf_involvement <fct>,
#> # cholera_treatment_facility_type <fct>, residential_status_brief <fct>, …
#> > dict <- msf_dict(dictionary = "Cholera", long = TRUE, compact = FALSE,
#> + tibble = TRUE)
#> > print(dict)
#> # A tibble: 182 × 11
#> data_element_uid data_element_name data_element_shortname
#> <chr> <chr> <chr>
#> 1 AafTlSwliVQ egen_001_patient_case_number case_number
#> 2 OTGOtWBz39J egen_004_date_of_consultation_admiss… date_of_consultation_…
#> 3 wnmMr2V3T3u egen_006_patient_origin patient_origin
#> 4 sbgqjeVwtb8 egen_008_age_years age_years
#> 5 eXYhovYyl61 egen_009_age_months age_months
#> 6 UrYJSk2Wp46 egen_010_age_days age_days
#> 7 D1Ky5K7pFN6 egen_011_sex sex
#> 8 D1Ky5K7pFN6 egen_011_sex sex
#> 9 D1Ky5K7pFN6 egen_011_sex sex
#> 10 dTm5R53YYXC egen_012_pregnancy_status pregnant
#> # ℹ 172 more rows
#> # ℹ 8 more variables: data_element_description <chr>,
#> # data_element_valuetype <chr>, data_element_formname <chr>,
#> # used_optionset_uid <chr>, option_code <chr>, option_name <chr>,
#> # option_uid <chr>, option_order_in_set <dbl>
#> > dat_clean <- matchmaker::match_df(dat, dict, from = "option_code", to = "option_name",
#> + by = "data_element_shortname", order = "option_order_in_set")
#> > print(dat_clean)
#> # A tibble: 20 × 45
#> case_number date_of_consultation_admiss…¹ patient_origin age_years age_months
#> <chr> <date> <chr> <int> <int>
#> 1 A1 2018-01-07 Village A 46 NA
#> 2 A2 2018-01-29 Village A 9 NA
#> 3 A3 2018-01-20 Village D 10 NA
#> 4 A4 2018-04-18 Village C 9 NA
#> 5 A5 2018-01-18 Village C 28 NA
#> 6 A6 2018-01-10 Village C 51 NA
#> 7 A7 2018-01-01 Village B 16 NA
#> 8 A8 2018-04-09 Village C 21 NA
#> 9 A9 2018-04-26 Village A 9 NA
#> 10 A10 2018-02-28 Village C 40 NA
#> 11 A11 2018-01-03 Village B 34 NA
#> 12 A12 2018-04-30 Village A 62 NA
#> 13 A13 2018-03-08 Village C 43 NA
#> 14 A14 2018-02-28 Village B 38 NA
#> 15 A15 2018-03-13 Village A 70 NA
#> 16 A16 2018-04-22 Village D 17 NA
#> 17 A17 2018-03-14 Village B 37 NA
#> 18 A18 2018-01-22 Village A 45 NA
#> 19 A19 2018-01-13 Village C 59 NA
#> 20 A20 2018-04-24 Village D 83 NA
#> # ℹ abbreviated name: ¹date_of_consultation_admission
#> # ℹ 40 more variables: age_days <int>, sex <fct>, pregnant <fct>,
#> # trimester <fct>, foetus_alive_at_admission <fct>, exit_status <fct>,
#> # date_of_exit <date>, time_to_death <fct>, pregnancy_outcome_at_exit <fct>,
#> # previously_vaccinated <fct>, previous_vaccine_doses_received <fct>,
#> # readmission <fct>, msf_involvement <fct>,
#> # cholera_treatment_facility_type <fct>, residential_status_brief <fct>, …
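To make the translation step in the example above more concrete, here is a minimal base R sketch of the idea behind mapping option_code to option_name for one variable. It is only a conceptual illustration under stated assumptions: the toy data and dictionary rows are hypothetical, and this is not how matchmaker::match_df() is implemented internally.

```r
# Hypothetical toy data: coded values as they might arrive from a data export.
dat_toy <- data.frame(
  case_number = c("A1", "A2", "A3"),
  sex = c("M", "F", "U"),
  stringsAsFactors = FALSE
)

# Hypothetical long-format dictionary rows for the single variable "sex".
dict_toy <- data.frame(
  data_element_shortname = "sex",
  option_code = c("M", "F", "U"),
  option_name = c("Male", "Female", "Unknown"),
  option_order_in_set = 1:3,
  stringsAsFactors = FALSE
)

# Keep the dictionary rows for this variable, sorted by their stated order.
rows <- dict_toy[dict_toy$data_element_shortname == "sex", ]
rows <- rows[order(rows$option_order_in_set), ]

# Look up each code and store the labels as a factor whose levels follow
# option_order_in_set.
dat_toy$sex <- factor(
  rows$option_name[match(dat_toy$sex, rows$option_code)],
  levels = rows$option_name
)

print(dat_toy)
```

Codes that do not appear in the dictionary would become NA in this sketch, so in practice it is worth checking for unmatched values before overwriting a column.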