Write behavior specifications in plain English, implement them as R functions, and run them as tests. Requirements stay in sync with the code because they are the code.
✗comments as specs
Comments encode intent but not procedure. They don't separate precondition from action from outcome. They can't be run, so they drift — and a stale comment is worse than no comment.
test_that("sales trend works", {
  # sales data is available
  # check the plot for Electronics
  # make sure it looks right
  result <- get_sales_trend("Electronics")
  expect_s3_class(result, "ggplot")
})
✓the specification
Gherkin forces you to think in procedure: what state is required, what action is taken, what outcome is observable. Vague intent doesn't survive the structure.
Feature: Sales Trends
  Scenario: User views trend for a category
    Given the sales data is loaded
    When the user views the trend for "Electronics"
    Then the sales trend plot for "Electronics" is shown
✓the implementation
Each line of the spec maps to one R step function. The English phrase becomes the step's pattern, and each {string} placeholder is passed to the function as an argument: the same words, now executable.
given("the sales data is loaded", function(context) {
  context$data <- load_sales_data()
})
when("the user views the trend for {string}", function(category, context) {
  context$plot <- get_sales_trend(context$data, category)
})
then("the sales trend plot for {string} is shown", function(category, context) {
  expect_s3_class(context$plot, "ggplot")
  expect_equal(context$plot$labels$title, category)
})
✓verification
> cucumber::test()
#> ✔ | F W S OK | Context
#> ✔ | 1 | Feature: Sales Trends
#>
#> ══ Results ═══════════════════════════════════════════════════
#> [ FAIL 0 | WARN 0 | SKIP 0 | PASS 1 ]
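To make the mapping from spec line to step function concrete, here is a minimal base R sketch of how a line such as `When the user views the trend for "Electronics"` could be matched against a pattern containing `{string}`. It is a hand-rolled illustration, not the cucumber package's internals: `registry`, `register_step`, and `run_step` are hypothetical names, and the step bodies only store values instead of building a plot.

```r
# Hypothetical registry of step definitions (not part of the cucumber package).
registry <- new.env()
registry$steps <- list()

register_step <- function(pattern, fn) {
  # Assumes patterns contain only plain words and {string} placeholders.
  # Each {string} becomes a capture group for a double-quoted value.
  regex <- paste0("^", gsub("{string}", "\"([^\"]*)\"", pattern, fixed = TRUE), "$")
  registry$steps[[length(registry$steps) + 1]] <- list(regex = regex, fn = fn)
  invisible(NULL)
}

run_step <- function(line, context) {
  # Drop the leading Gherkin keyword, then try each registered pattern in turn.
  text <- sub("^(Given|When|Then|And|But) ", "", trimws(line))
  for (step in registry$steps) {
    match <- regmatches(text, regexec(step$regex, text))[[1]]
    if (length(match) > 0) {
      args <- as.list(match[-1])  # captured {string} values, in order
      do.call(step$fn, c(args, list(context = context)))
      return(invisible(TRUE))
    }
  }
  stop("No step matches: ", text)
}

register_step("the sales data is loaded", function(context) {
  # Stand-in data; the real step calls load_sales_data().
  context$data <- data.frame(category = "Electronics", sales = 100)
})
register_step("the user views the trend for {string}", function(category, context) {
  context$viewed <- category
})

context <- new.env()
run_step("Given the sales data is loaded", context)
run_step("When the user views the trend for \"Electronics\"", context)
context$viewed  # "Electronics"
```

The shared `context` environment is what lets a `Given` step hand state to a later `When` or `Then` step, as in the step definitions above.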
Mutation testing for R. Introduces small changes to your source code and checks whether your tests catch them. Reveals gaps that code coverage misses.
the code
A simple boundary check. Two tests cover adults and minors — but never the edge.
# R/is_adult.R
is_adult <- function(age) {
  age >= 18
}
mutation score
The > 18 mutant survived.
Boundary value 18 is never tested — your suite can't tell >= from >.
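Why the suite cannot tell `>=` from `>` can be checked by hand in base R. The sketch below writes the mutant manually and treats each test suite as a plain function; it is a stand-in for the idea, not a description of how muttest mutates files or runs testthat. The names `is_adult_mutant`, `original_suite`, `fixed_suite`, and `killed` are hypothetical.

```r
# The function from the example and its ">" mutant, written by hand.
is_adult        <- function(age) age >= 18
is_adult_mutant <- function(age) age > 18

# A "suite" here is a function that returns TRUE when all of its checks pass.
original_suite <- function(f) isTRUE(f(25)) && isFALSE(f(10))
fixed_suite    <- function(f) original_suite(f) && isTRUE(f(18))

# A mutant is killed when the suite passes on the original code
# but fails on the mutated code.
killed <- function(suite) suite(is_adult) && !suite(is_adult_mutant)

killed(original_suite)  # FALSE: both versions agree on 25 and 10
killed(fixed_suite)     # TRUE: the two versions disagree at 18
```

The two versions of the function differ only at the input 18, so a suite has to check that value before it can fail on the mutant.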
✗the tests
test_that("is_adult returns TRUE for adults", {
  expect_true(is_adult(25)) # passes even with age > 18
})
test_that("is_adult returns FALSE for minors", {
  expect_false(is_adult(10)) # passes even with age > 18
})
run mutation testing
> muttest::muttest(plan)
#> ℹ Mutation Testing
#> | K | S | E | T | % | Mutator | File
#> ✔ | 1 | 0 | 0 | 1 | 100 | >= → <= | is_adult.R
#> x | 1 | 1 | 0 | 2 | 50 | >= → > | is_adult.R
#>
#> ── Survived Mutants ─────────────────────────────────────────────────────────
#> is_adult.R >= → >
#> 2- age >= 18
#> 2+ age > 18
#>
#> ── Results ──────────────────────────────────────────────────────────────────
#> [ KILLED 1 | SURVIVED 1 | ERRORS 0 | TOTAL 2 | SCORE 50.0% ]
after the fix
Every mutation triggers a failure. Adding the boundary test kills the survivor.
✓the fix
test_that("is_adult returns TRUE for adults", {
  expect_true(is_adult(25))
})
test_that("is_adult returns FALSE for minors", {
  expect_false(is_adult(10))
})
test_that("is_adult is TRUE at the boundary", {
  expect_true(is_adult(18)) # kills >= → >
})
Build Shiny apps from the outside in. Write acceptance tests first, then let them drive every design decision down to the module level.
✗vague requirements
Stories written in prose stay prose. They can't be run, so nobody knows when the app actually satisfies them. Requirements drift the moment code ships.
Budget tracking
As a user I want to see my net balance
so that I can understand my financial situation.
Acceptance: shows income, expenses, and net.
// ← lives in a doc, never executed
✓executable specification
The same scenario becomes a test. Given-When-Then forces you to name preconditions, actions, and outcomes. When it passes, the feature is done.
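The acceptance test for this scenario talks to the app only through a `dsl` object, which is not defined on this page. One possible shape for it, as a rough sketch: a list of closures over shared state. Everything here is an assumption made for illustration. `new_budget_dsl` and `check` are hypothetical names, the state is an in-memory ledger rather than a running Shiny app, and the verifications use base R `stopifnot()` where a real suite would likely use testthat expectations.

```r
# Hypothetical DSL backed by an in-memory ledger instead of a running Shiny app.
new_budget_dsl <- function() {
  state <- new.env()
  state$income <- numeric()
  state$expenses <- numeric()
  state$summary <- NULL

  # Fails loudly on a mismatch, including when nothing has been inspected yet.
  check <- function(actual, expected) {
    stopifnot(isTRUE(all.equal(actual, expected)))
  }

  list(
    record_income  = function(amount) state$income <- c(state$income, amount),
    record_expense = function(amount) state$expenses <- c(state$expenses, amount),
    inspect_finances = function() {
      state$summary <- list(
        income   = sum(state$income),
        expenses = sum(state$expenses),
        net      = sum(state$income) - sum(state$expenses)
      )
    },
    verify_total_income   = function(expected) check(state$summary$income, expected),
    verify_total_expenses = function(expected) check(state$summary$expenses, expected),
    verify_net_balance    = function(expected) check(state$summary$net, expected),
    teardown = function() rm(list = ls(state), envir = state)
  )
}

dsl <- new_budget_dsl()
dsl$record_income(2000)
dsl$record_expense(500)
dsl$inspect_finances()
dsl$verify_net_balance(1500)
```

Because the test names only domain actions, the object behind `dsl` could later be swapped for one that drives the real app without changing the scenario.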
# tests/acceptance/test-budget.R
test_that("Scenario: I can inspect my net balance", {
  # Given
  dsl$record_income(2000)
  dsl$record_expense(500)
  # When
  dsl$inspect_finances()
  # Then
  dsl$verify_total_income(2000)
  dsl$verify_total_expenses(500)
  dsl$verify_net_balance(1500)
  dsl$teardown()
})
what you'll learn
ShinyConf 2024. A structured approach to testing Shiny apps: inside-out unit tests, outside-in acceptance tests, and the loop that connects them.
what you'll learn
useR! 2025. From vague wish to working code: how to cooperate with stakeholders, write Gherkin scenarios, and execute them with Cucumber for R.
I'm a software engineer specializing in R with 5+ years of experience.
I believe automated testing is the key to building quality software.
My journey into R testing began with a project where, to write any code, you had to be connected to the production environment. Terrible developer experience.
I'm particularly passionate about knowledge sharing, which is why I maintain an active blog and the R Tests Gallery. When we share our testing experiences, both successes and failures, we all become better developers.
I approach testing with a practical mindset: tests should make development faster and more confident, not slower and more burdensome. My goal is to help teams find testing strategies that actually enhance their workflow.