Calculates the Theil T Index of a given accessibility distribution. Values range from 0 (when all individuals have exactly the same accessibility level) to the natural log of n, where n is the number of individuals in the accessibility dataset. If the individuals can be classified into mutually exclusive and completely exhaustive groups, the index can be decomposed into a between-groups inequality component and a within-groups component.
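The bounds described above can be checked with a small base R sketch. It assumes the common population-weighted formulation of the Theil T index, and `theil_t_sketch()` is a hypothetical helper written for illustration only; it is not part of the package and may differ from the package's internal implementation.

```r
# Hypothetical helper, NOT the package's implementation: a population-weighted
# Theil T index, assuming the common formulation
#   T = sum_i (w_i / W) * (x_i / mu) * log(x_i / mu)
# where mu is the population-weighted mean accessibility.
theil_t_sketch <- function(access, pop = rep(1, length(access))) {
  mu <- sum(access * pop) / sum(pop)
  ratio <- access / mu
  # by convention, 0 * log(0) is treated as 0
  terms <- ifelse(ratio > 0, ratio * log(ratio), 0)
  sum(pop / sum(pop) * terms)
}

# everyone has the same accessibility level: no inequality
equal_case <- theil_t_sketch(c(10, 10, 10, 10))

# one of n = 4 individuals holds all the accessibility: the index reaches log(n)
concentrated_case <- theil_t_sketch(c(0, 0, 0, 40))

equal_case          # 0
concentrated_case   # equals log(4), about 1.386
```

The helper takes plain vectors to keep the arithmetic visible; `theil_t()` itself works on data frames joined by `id`.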
Usage
theil_t(
  accessibility_data,
  sociodemographic_data,
  opportunity,
  population,
  socioeconomic_groups = NULL,
  group_by = character(0)
)
Arguments
- accessibility_data
A data frame. The accessibility levels whose inequality should be calculated. Must contain the columns id and any others specified in opportunity.
- sociodemographic_data
A data frame. The distribution of sociodemographic characteristics of the population in the study area cells. Must contain the columns id and any others specified in population and socioeconomic_groups.
- opportunity
A string. The name of the column in accessibility_data with the accessibility levels to be considered when calculating inequality levels.
- population
A string. The name of the column in sociodemographic_data with the number of people in each cell. Used to weigh accessibility levels when calculating inequality.
- socioeconomic_groups
A string. The name of the column in sociodemographic_data whose values identify the socioeconomic groups that should be used to calculate the between- and within-groups inequality levels. If NULL (the default), between- and within-groups components are not calculated and only the total aggregate inequality is returned.
- group_by
A character vector. When not character(0) (the default), indicates the accessibility_data columns that should be used to group the inequality estimates by. For example, if accessibility_data includes a scenario column that identifies the distinct scenarios that each accessibility estimate refers to (e.g. before and after a transport policy intervention), passing "scenario" to this parameter results in inequality estimates grouped by scenario.
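The group_by behaviour can be sketched with the sample data shipped with the package. The two "scenarios" here are an assumption made purely for illustration: they are accessibility estimates computed with two different travel time cutoffs, labelled by hand in a scenario column. The call is wrapped in a `requireNamespace()` check so the snippet is skipped when the package is not installed.

```r
# A sketch only: the scenario labels below are made up for illustration.
if (requireNamespace("accessibility", quietly = TRUE)) {
  library(accessibility)

  data_dir <- system.file("extdata", package = "accessibility")
  travel_matrix <- readRDS(file.path(data_dir, "travel_matrix.rds"))
  land_use_data <- readRDS(file.path(data_dir, "land_use_data.rds"))

  # two accessibility estimates that stand in for distinct scenarios
  access_30 <- cumulative_cutoff(
    travel_matrix,
    land_use_data,
    cutoff = 30,
    opportunity = "jobs",
    travel_cost = "travel_time"
  )
  access_60 <- cumulative_cutoff(
    travel_matrix,
    land_use_data,
    cutoff = 60,
    opportunity = "jobs",
    travel_cost = "travel_time"
  )
  access_30$scenario <- "cutoff_30"
  access_60$scenario <- "cutoff_60"
  access <- rbind(access_30, access_60)

  # one inequality estimate per value of the scenario column
  ti <- theil_t(
    access,
    sociodemographic_data = land_use_data,
    opportunity = "jobs",
    population = "population",
    group_by = "scenario"
  )
  ti
}
```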
Value
If socioeconomic_groups is NULL, a data frame containing the total Theil T estimates for the study area. If not, a list containing three data frames: one summarizing the total inequality and the between- and within-groups components, one listing the contribution of each group to the between-groups component, and another listing the contribution of each group to the within-groups component.
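The relationship between the total index and its two components can be illustrated with a short base R sketch. It assumes the standard Theil T decomposition, in which each group is weighted by its share of total accessibility. The data, the group labels and the helper `theil_total()` are hypothetical, and the package's own output format and internals may differ.

```r
# Hypothetical helper, NOT the package's implementation: population-weighted
# Theil T index of accessibility levels x with population weights w.
theil_total <- function(x, w) {
  mu <- sum(x * w) / sum(w)
  r <- x / mu
  sum(w / sum(w) * ifelse(r > 0, r * log(r), 0))
}

# made-up accessibility levels, populations and group labels
x <- c(10, 20, 30, 40, 50, 60)
w <- c(100, 200, 100, 300, 200, 100)
g <- c("low", "low", "mid", "mid", "high", "high")

mu <- sum(x * w) / sum(w)
groups <- split(seq_along(x), g)

# share of total (population-weighted) accessibility held by each group
share <- sapply(groups, function(i) sum(x[i] * w[i]) / sum(x * w))
# population-weighted mean accessibility of each group
mu_g <- sapply(groups, function(i) sum(x[i] * w[i]) / sum(w[i]))
# Theil T index computed inside each group
t_g <- sapply(groups, function(i) theil_total(x[i], w[i]))

between <- sum(share * log(mu_g / mu))  # inequality between group averages
within <- sum(share * t_g)              # weighted sum of within-group indices
total <- theil_total(x, w)

# the two components add up to the total index
all.equal(total, between + within)
```

The per-group terms `share * log(mu_g / mu)` and `share * t_g` correspond, under this formulation, to each group's contribution to the between- and within-groups components.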
See also
Other inequality: concentration_index(), gini_index(), palma_ratio()
Examples
if (FALSE) { # identical(tolower(Sys.getenv("NOT_CRAN")), "true")
  data_dir <- system.file("extdata", package = "accessibility")
  travel_matrix <- readRDS(file.path(data_dir, "travel_matrix.rds"))
  land_use_data <- readRDS(file.path(data_dir, "land_use_data.rds"))

  access <- cumulative_cutoff(
    travel_matrix,
    land_use_data,
    cutoff = 30,
    opportunity = "jobs",
    travel_cost = "travel_time"
  )

  ti <- theil_t(
    access,
    sociodemographic_data = land_use_data,
    opportunity = "jobs",
    population = "population"
  )
  ti

  # to calculate inequality between and within income deciles, we pass
  # "income_decile" to socioeconomic_groups.
  # some cells, however, are classified as in the decile NA because their
  # income per capita is NaN, as they don't have any population. we filter
  # these cells from our accessibility data, otherwise the output would include
  # NA values (note that subsetting the data like this doesn't affect the
  # assumption that groups are completely exhaustive, because cells with NA
  # income decile don't have any population)
  na_decile_ids <- land_use_data[is.na(land_use_data$income_decile), ]$id
  access <- access[! access$id %in% na_decile_ids, ]
  sociodem_data <- land_use_data[! land_use_data$id %in% na_decile_ids, ]

  ti <- theil_t(
    access,
    sociodemographic_data = sociodem_data,
    opportunity = "jobs",
    population = "population",
    socioeconomic_groups = "income_decile"
  )
  ti
}