Create an age group variable

age_categories(
  x,
  breakers = NULL,
  lower = 0,
  upper = NULL,
  by = 10,
  separator = "-",
  ceiling = FALSE,
  above.char = "+"
)

group_age_categories(
  dat,
  years = NULL,
  months = NULL,
  weeks = NULL,
  days = NULL,
  one_column = TRUE,
  drop_empty_overlaps = TRUE
)

Arguments

x: Your age variable
breakers: A numeric vector of age category breaks, defined with c(), e.g. c(0, 5, 10, 15, 20). Alternatively, use "lower", "upper" and "by" to generate the breaks from a sequence.
lower: A number. The lowest age value you want to consider (default is 0)
upper: A number. The highest age value you want to consider
by: A number. The number of years you want between groups
separator: A character that you want to have between ages in group names. The default is "-" producing e.g. 0-10.
ceiling: A logical (TRUE/FALSE). If TRUE, the highest value in breakers (or the upper value, when breaks are generated from a sequence) is treated as the endpoint, so the highest group is e.g. "70-80" rather than "80+". The default is FALSE, which produces an open-ended highest group such as "80+".
above.char: Only considered when ceiling is FALSE. A character to append to the highest age group. The default is "+", producing e.g. 80+.
dat: a data frame with at least one column defining an age category
years, months, weeks, days: the bare name of the column defining years, months, weeks, or days (or NULL if the column doesn't exist)
one_column: if TRUE (default), the categories will be joined into a single column called "age_category" that appends the type of age category used. If FALSE, there will be one column with the grouped age categories called "age_category" and a second column indicating age unit called "age_unit".
drop_empty_overlaps: if TRUE (default), unused levels are dropped when they are empty and have been replaced by a more fine-grained definition. In practice, the first level of each of years, months, and weeks is a candidate for removal via forcats::fct_drop().
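
To illustrate how breakers, separator, ceiling, and above.char interact, here is a minimal base-R sketch built on cut(). It is not the package's implementation: sketch_age_labels() is a hypothetical helper written only for this illustration, and the exact group boundaries and labels returned by age_categories() may differ. The use of seq() to mimic the "lower", "upper" and "by" arguments is likewise an assumption.

```r
# Hypothetical helper: a base-R approximation, NOT the package's own code.
sketch_age_labels <- function(x, breakers, separator = "-",
                              ceiling = FALSE, above.char = "+") {
  n <- length(breakers)
  # Labels for the bounded groups, e.g. "0-5", "5-10"
  labels <- paste(breakers[-n], breakers[-1], sep = separator)
  if (ceiling) {
    # The highest break is the endpoint, so no open-ended group is added
    cut(x, breaks = breakers, labels = labels,
        right = FALSE, include.lowest = TRUE)
  } else {
    # Add an open-ended top group, e.g. "20+"
    labels <- c(labels, paste0(breakers[n], above.char))
    cut(x, breaks = c(breakers, Inf), labels = labels, right = FALSE)
  }
}

ages <- c(0, 3, 7, 12, 19, 20, 45)

# Assumption: breaks from lower/upper/by behave like a plain sequence
breaks_from_seq <- seq(from = 0, to = 20, by = 5)

open_top   <- sketch_age_labels(ages, breakers = breaks_from_seq)
closed_top <- sketch_age_labels(ages, breakers = breaks_from_seq, ceiling = TRUE)

levels(open_top)    # includes an open-ended "20+" group
levels(closed_top)  # highest group is "15-20"; ages above 20 become NA
```

In this sketch, ceiling = TRUE leaves ages above the highest break without a group (NA), which is one reason to keep the open-ended default unless the data are known to stop at the upper bound.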

Value

age_categories(): a factor representing age ranges, open at the upper end of the range.

group_age_categories(): a data frame

Examples



if (interactive() && require("dplyr") && require("epidict")) {
  withAutoprint({
    set.seed(50)
    # Generate 100 rows of example cholera data
    dat <- epidict::gen_data("Cholera", n = 100, org = "MSF")

    # Convert each age column into a factor of age groups
    ages <- dat %>%
      select(starts_with("age")) %>%
      mutate(age_years = age_categories(age_years, breakers = c(0, 5, 10, 15, 20))) %>%
      mutate(age_months = age_categories(age_months, breakers = c(0, 5, 10, 15, 20))) %>%
      mutate(age_days = age_categories(age_days, breakers = c(0, 5, 15)))

    # Combine the grouped columns into one "age_category" column and tabulate it
    ages %>%
      group_age_categories(years = age_years, months = age_months, days = age_days) %>%
      pull(age_category) %>%
      table()
  })
}