
ubair

ubair is an R package for the Statistical Investigation of the Impact of External Conditions on Air Quality. It analyzes and visualizes the impact of external factors, such as traffic restrictions, hazards, and political measures, on air quality. It aims to provide experts with a transparent comparison of modeling approaches and to support data-driven evaluations for policy advisory purposes.

Installation

Install via CRAN or, if you have access to https://gitlab.opencode.de/uba-ki-lab/ubair, use one of the following options:

Using an archive file

Recommended if you do not have git installed.

  • Download the zip/tar.gz archive from GitLab
  • Start a new R project or open an existing one
  • In RStudio:
    • Go to the ‘Packages’ tab (next to Help/Plots/Files)
    • Click ‘Install’ (upper left corner)
    • Install from: choose “Package Archive File”
    • Browse to the zip file
    • Click ‘Install’
  • Alternatively, type in the console:
install.packages("<path-to-zip>/ubair-main.zip", repos = NULL, type = "source")

Using remote package

Git needs to be installed.

install.packages("remotes")
# requires a configured SSH key
remotes::install_git("git@gitlab.opencode.de:uba-ki-lab/ubair.git")
# alternatively, via HTTPS with a password
remotes::install_git("https://gitlab.opencode.de/uba-ki-lab/ubair.git")
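
After installing, it can help to confirm that R can find the package before running the examples. The following is a minimal sketch using base R's requireNamespace(); the helper name pkg_available is hypothetical and not part of ubair.

```r
# Hypothetical helper (not part of ubair): TRUE if a package can be loaded.
pkg_available <- function(pkg) {
  requireNamespace(pkg, quietly = TRUE)
}

if (pkg_available("ubair")) {
  message("ubair is installed.")
} else {
  message("ubair was not found; repeat one of the installation options.")
}
```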

Sample Usage of package

For a more detailed explanation of the package, you can access the vignettes:

  • View the user_sample source code directly in the vignettes/ folder.
  • Open the vignette with vignette("user_sample_1", package = "ubair") if the package was installed with vignettes.
library(ubair)
params <- load_params()
env_data <- sample_data_DESN025
# Plot meteo data
plot_station_measurements(env_data, params$meteo_variables)
  • Split the data into training, reference, and effect time intervals:
application_start <- lubridate::ymd("20191201") # This coincides with the start of the reference window
date_effect_start <- lubridate::ymd_hm("20200323 00:00") # This splits the forecast into reference and effect
application_end <- lubridate::ymd("20200504") # This coincides with the end of the effect window

buffer <- 24 * 14 # 14 days buffer

dt_prepared <- prepare_data_for_modelling(env_data, params)
dt_prepared <- dt_prepared[complete.cases(dt_prepared)]
split_data <- split_data_counterfactual(
  dt_prepared, application_start,
  application_end
)
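
To illustrate the idea behind this split, here is a minimal base R sketch. It assumes that observations inside the application window are held out for prediction and that everything outside it is used for training. The helper sketch_split, the list names train and apply, and the column names date and value are illustrative assumptions, not ubair's actual implementation.

```r
# Hypothetical helper (not ubair's implementation): hold out the
# application window and keep all remaining rows for training.
sketch_split <- function(df, application_start, application_end) {
  in_window <- df$date >= application_start & df$date <= application_end
  list(train = df[!in_window, ], apply = df[in_window, ])
}

# Toy data with assumed column names `date` and `value`.
toy <- data.frame(
  date = seq(as.Date("2019-11-29"), as.Date("2019-12-03"), by = "day"),
  value = c(21, 24, 19, 18, 22)
)

parts <- sketch_split(toy, as.Date("2019-12-01"), as.Date("2019-12-02"))
nrow(parts$train) # rows outside the window
nrow(parts$apply) # rows inside the window
```

In the example above, the held-out window is later divided at date_effect_start into a reference part (before the external effect) and an effect part (after it).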
res <- run_counterfactual(split_data,
  params,
  detrending_function = "linear",
  model_type = "lightgbm",
  alpha = 0.9,
  log_transform = TRUE,
  calc_shaps = TRUE
)
#> [LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000509 seconds.
#> You can set `force_row_wise=true` to remove the overhead.
#> And if memory is not enough, you can set `force_col_wise=true`.
#> [LightGBM] [Info] Total Bins 1557
#> [LightGBM] [Info] Number of data points in the train set: 104486, number of used features: 9
#> [LightGBM] [Info] Start training from score -0.000000
predictions <- res$prediction

plot_counterfactual(predictions, params,
  window_size = 14,
  date_effect_start,
  buffer = buffer,
  plot_pred_interval = TRUE
)
round(calc_performance_metrics(predictions, date_effect_start, buffer = buffer), 2)
#>           RMSE            MSE            MAE           MAPE           Bias 
#>           7.38          54.48           5.38           0.18          -2.73 
#>             R2 Coverage lower Coverage upper       Coverage    Correlation 
#>           0.74           0.97           0.95           0.92           0.89 
#>            MFB            FGE 
#>          -0.05           0.19
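
For orientation, a few of these metrics can be sketched in base R as follows. The definitions are common textbook ones and are assumptions here: ubair may compute them differently, in particular R2. The sign convention for Bias (prediction minus observation) is at least consistent with the means in the summary statistics of this example. The helper name sketch_metrics is hypothetical.

```r
# Hypothetical helper (not ubair's implementation): common error metrics
# between observed values `true` and model predictions `pred`.
sketch_metrics <- function(true, pred) {
  err <- pred - true
  c(
    RMSE = sqrt(mean(err^2)),
    MSE = mean(err^2),
    MAE = mean(abs(err)),
    Bias = mean(err), # negative when the model underpredicts on average
    R2 = 1 - sum(err^2) / sum((true - mean(true))^2)
  )
}

true <- c(10, 20, 30, 40)
pred <- c(12, 18, 33, 37)
m <- sketch_metrics(true, pred)
round(m, 2)
```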
round(calc_summary_statistics(predictions, date_effect_start, buffer = buffer), 2)
                       true  prediction
min                    3.36        5.58
max                  111.90       59.71
var                  212.96      128.16
mean                  30.80       28.07
5-percentile           9.29       10.73
25-percentile         19.85       19.40
median/50-percentile  29.60       27.09
75-percentile         40.54       36.27
95-percentile         56.80       47.69
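
The same kind of summary can be reproduced for any numeric vector with base R. The sketch below is an illustration only: the helper sketch_summary is hypothetical, and it uses R's default quantile method, which may differ from the one ubair applies.

```r
# Hypothetical helper (not ubair's implementation): summary statistics
# for one numeric vector, using quantile()'s default method (type 7).
sketch_summary <- function(x) {
  q <- quantile(x, probs = c(0.05, 0.25, 0.5, 0.75, 0.95), names = FALSE)
  c(
    min = min(x), max = max(x), var = var(x), mean = mean(x),
    "5-percentile" = q[1], "25-percentile" = q[2],
    "median/50-percentile" = q[3],
    "75-percentile" = q[4], "95-percentile" = q[5]
  )
}

s <- sketch_summary(1:101)
round(s, 2)
```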
estimate_effect_size(predictions, date_effect_start, buffer = buffer, verbose = TRUE)
#> The external effect changed the target value on average by -6.294 compared to the reference time window. This is a -26.37% relative change.

#> $absolute_effect
#> [1] -6.294028
#> 
#> $relative_effect
#> [1] -0.2637
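
One plausible reading of these two numbers is sketched below: the absolute effect is the mean gap between observed and predicted values after the effect start, and the relative effect is that gap divided by the mean prediction. This is an assumption for illustration, not ubair's documented method, and the sketch ignores the buffer. The helper sketch_effect_size and the column names date, value, and prediction are hypothetical.

```r
# Hypothetical helper (not ubair's implementation): mean gap between
# observations and counterfactual predictions after the effect start.
sketch_effect_size <- function(df, effect_start) {
  effect <- df[df$date >= effect_start, ]
  absolute <- mean(effect$value - effect$prediction)
  relative <- absolute / mean(effect$prediction)
  list(absolute_effect = absolute, relative_effect = relative)
}

# Toy hourly data: two hours before and four hours after the effect start.
toy <- data.frame(
  date = as.POSIXct("2020-03-22 22:00", tz = "UTC") + 3600 * 0:5,
  value = c(30, 31, 20, 22, 21, 19),
  prediction = c(29, 30, 28, 27, 26, 25)
)

effect <- sketch_effect_size(toy, as.POSIXct("2020-03-23 00:00", tz = "UTC"))
effect$absolute_effect # mean of (-8, -5, -5, -6)
effect$relative_effect # absolute effect divided by the mean prediction 26.5
```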

SHAP feature importances

shapviz::sv_importance(res$importance, kind = "bee")
xvars <- c("TMP", "WIG", "GLO", "WIR")
shapviz::sv_dependence(res$importance, v = xvars)

Development

Prerequisites

  1. R: Make sure you have R installed (recommended version 4.4.1). You can download it from CRAN.
  2. RStudio (optional but recommended): Download from RStudio.

Setting Up the Environment

Install the development version of ubair:

install.packages("renv")
renv::restore()
devtools::build()
devtools::load_all()

Development

Install pre-commit hook (required to ensure tidyverse code formatting)

pip install pre-commit

Add new requirements

If you add new dependencies to ubair package, make sure to update the renv.lock file:

renv::snapshot()

Style and documentation

Before you commit your changes, update the documentation, ensure that the code style complies with the tidyverse style guide, and check that all tests run without error.

# update documentation and check package integrity
devtools::check()
# apply tidyverse style (also applied by the pre-commit hook)
usethis::use_tidy_style()
# check for existing lintr warnings
devtools::lint()
# run tests
devtools::test()
# build README.md if any changes have been made to README.Rmd
devtools::build_readme()

Pre-commit hook

The pre-commit rules are defined in .pre-commit-hook.yaml and applied before each commit. They:

  • run styler to format code in tidyverse style
  • run roxygen to update the documentation
  • check that the README is up to date
  • run lintr as a final check of the code style

If pre-commit fails, review the automatically applied changes, stage them, and retry the commit.

Test Coverage

Install covr to run this.

cov <- covr::package_coverage(type = "all")
cov_list <- covr::coverage_to_list(cov)
data.table::data.table(
  part = c("Total", names(cov_list$filecoverage)),
  coverage = c(cov_list$totalcoverage, as.vector(cov_list$filecoverage))
)
covr::report(cov)

Contacts

Jore Noa Averbeck JoreNoa.Averbeck@uba.de

Raphael Franke Raphael.Franke@uba.de

Imke Voß imke.voss@uba.de
