Object-Oriented Programming in R: What S4 Can Do That S3 and Python Can’t
Learn How to Build Real DSLs and Frameworks Using R’s Most Misunderstood OOP System
“At first, I avoided S4 like it was a cursed tome from an arcane library… Now I see it for what it is: a grumpy, structured, but immensely powerful wizard.”
— A reformed R programmer
Welcome Back to the R Wizardry Workshop
You’ve now played with the simplicity of S3, and you’ve seen how ggplot2 and others speak fluent DSL by composing nouns and verbs like a symphony of tidy Lego blocks.
But today, we’re stepping into S4: the industrial-strength engine that powers Bioconductor, complex genomic analyses, and other spells where S3 just can’t keep up.
Why S4 Exists (and Why It’s Worth Learning)
S3 is charmingly informal.
But what if…
- You need strict type checks?
- You want to dispatch not just on one argument, but on multiple inputs?
- You want your codebase to be more self-documenting, with class signatures, slots, and real inheritance?
Welcome to S4: a more formal, structured, and powerful version of OOP in R.
You don’t have to love it. But if you understand it, you’ll be one of the few who can tame its magic.
What Makes S4 Different from S3?
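One way to see the difference is to build the same kind of object in both systems. The sketch below is illustrative only: the class names pipeline_step, TypedStep, and LoggedStep are made up for this comparison and are not used later in the article.

```r
library(methods)  # attached by default in interactive R; explicit for Rscript

# S3: a class is just an attribute, so nothing stops a malformed object.
s3_step <- structure(list(name = 42, func = "not a function"),
                     class = "pipeline_step")
inherits(s3_step, "pipeline_step")  # TRUE, even though the fields make no sense

# S4: the class definition declares slot types, and new() enforces them.
setClass("TypedStep", slots = c(name = "character", func = "function"))
ok <- new("TypedStep", name = "Normalize", func = function(x) x / max(x))

# Passing a number where a character is declared fails at construction time.
bad <- tryCatch(
  new("TypedStep", name = 42, func = function(x) x),
  error = function(e) conditionMessage(e)
)
is.character(bad)  # TRUE: `bad` holds the error message, not an object

# S4 inheritance is declared with `contains`; the parent's slots come along.
setClass("LoggedStep", contains = "TypedStep", slots = c(log = "character"))
child <- new("LoggedStep", name = "Scale", func = function(x) x, log = "created")
is(child, "TypedStep")  # TRUE
```

In short, S3 trusts you, while S4 checks the object against its declared definition when it is created.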

Let’s Build Something Real: A Pipeline Step DSL in S4
Goal:
We want to model a simple bioinformatics pipeline where each step may take a data.frame or a GRanges object. Depending on the type of data, we want different logic to run.
This is classic multiple dispatch.
Step 1: Define an S4 Class
setClass("PipelineStep",
         slots = list(name = "character", func = "function"))
Now we have a formal object with two slots: name and func.
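The slot declarations only check types. If you also want rules about slot values, one option is a validity function. This is a minimal sketch: the rule that a step needs exactly one non-empty name is an assumption made for illustration, not something the pipeline above requires.

```r
library(methods)

setClass("PipelineStep",
         slots = c(name = "character", func = "function"))

# Hypothetical rule (not from the article): exactly one non-empty name.
setValidity("PipelineStep", function(object) {
  if (length(object@name) != 1L || is.na(object@name) || !nzchar(object@name)) {
    "slot 'name' must be a single non-empty string"
  } else {
    TRUE
  }
})

# A well-formed step is created without complaint.
good <- new("PipelineStep", name = "Normalize", func = function(x) x / max(x))

# Two names violate the rule, so new() raises an error; we capture its message.
err <- tryCatch(
  new("PipelineStep", name = c("a", "b"), func = identity),
  error = function(e) conditionMessage(e)
)
```

The validity function runs whenever new() is called with slot values, so malformed steps are rejected before they enter the pipeline.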
Step 2: Create a Constructor
PipelineStep <- function(name, func) {
  new("PipelineStep", name = name, func = func)
}
Now we can create real steps:
step1 <- PipelineStep("Normalize", function(x) x / max(x))
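To get a feel for what the constructor returns, here is a small self-contained sketch that repeats the definitions above and then pokes at the object with standard tools from the methods package. The input vector c(1, 2, 4) is just sample data.

```r
library(methods)

setClass("PipelineStep",
         slots = c(name = "character", func = "function"))

PipelineStep <- function(name, func) {
  new("PipelineStep", name = name, func = func)
}

step1 <- PipelineStep("Normalize", function(x) x / max(x))

isVirtualClass("PipelineStep")  # FALSE: the class can be instantiated
is(step1, "PipelineStep")       # TRUE
slotNames(step1)                # "name" "func"
step1@name                      # "Normalize"

# The stored function is an ordinary closure; call it through the slot.
scaled <- step1@func(c(1, 2, 4))
scaled                          # 0.25 0.50 1.00
```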
Step 3: Create a Generic + Multiple Dispatch Method
Here comes the magic:
setGeneric("applyStep", function(step, data) standardGeneric("applyStep"))
Now we add methods depending on the type of data:
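setMethod("applyStep", signature("PipelineStep", "data.frame"),
          function(step, data) {
            message("Applying to data.frame")
            step@func(data)
          })

# The GRanges class comes from Bioconductor's GenomicRanges package,
# so load it before registering this method.
library(GenomicRanges)

setMethod("applyStep", signature("PipelineStep", "GRanges"),
          function(step, data) {
            message("Applying to GRanges object")
            # e.g., normalize coverage scores (step@func must accept a GRanges)
            step@func(data)
          })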
Now you can do:
applyStep(step1, my_dataframe)
applyStep(step1, my_granges_object)
There is no if/else clutter on the data type, so the calling code stays much cleaner.
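If you want to try the dispatch without installing Bioconductor, here is a self-contained sketch that swaps the GRanges method for a base R matrix method. The matrix stand-in and the sample inputs are assumptions for illustration only; everything else follows the steps above.

```r
library(methods)

setClass("PipelineStep",
         slots = c(name = "character", func = "function"))

setGeneric("applyStep", function(step, data) standardGeneric("applyStep"))

setMethod("applyStep", signature("PipelineStep", "data.frame"),
          function(step, data) {
            message("Applying to data.frame")
            step@func(data)
          })

# Stand-in for the GRanges method so the sketch runs without Bioconductor.
setMethod("applyStep", signature("PipelineStep", "matrix"),
          function(step, data) {
            message("Applying to matrix")
            step@func(data)
          })

step1 <- new("PipelineStep", name = "Normalize",
             func = function(x) x / max(x))

# Same call, different method, chosen from the class of `data`.
df_out  <- applyStep(step1, data.frame(a = c(1, 2), b = c(3, 4)))
mat_out <- applyStep(step1, matrix(1:4, nrow = 2))

# No method is registered for character input, so dispatch fails loudly.
no_method <- tryCatch(applyStep(step1, "text"),
                      error = function(e) conditionMessage(e))
```

Note that an unsupported type produces an error at dispatch time instead of silently falling through a forgotten else branch.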
Multiple Dispatch Feels Like This…
applyStep(step1, <type A>) → Runs Method A
applyStep(step1, <type B>) → Runs Method B
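Python’s standard library offers only single dispatch (functools.singledispatch picks an implementation from the type of the first argument), whereas S4 can choose a method from the classes of several arguments at once. That is more general than what most mainstream OOP systems offer, and it makes S4 a spiritual cousin of Julia’s dispatch system.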
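In the pipeline example only the data argument varies, so here is a small sketch where both arguments take part in method selection. The generic describePair is hypothetical and exists only to make the dispatch table visible.

```r
library(methods)

# Hypothetical generic used only for illustration.
setGeneric("describePair", function(x, y) standardGeneric("describePair"))

setMethod("describePair", signature("numeric", "numeric"),
          function(x, y) "two numbers")
setMethod("describePair", signature("numeric", "character"),
          function(x, y) "number then text")
setMethod("describePair", signature("character", "numeric"),
          function(x, y) "text then number")
# Fallback for any combination not listed above.
setMethod("describePair", signature("ANY", "ANY"),
          function(x, y) "something else")

describePair(1, 2)       # "two numbers"
describePair(1, "a")     # "number then text"
describePair("a", 1)     # "text then number"
describePair(TRUE, "a")  # "something else"
```

Swapping the argument order changes which method runs, which is exactly what a first-argument-only system cannot express without manual type checks.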
Where You See S4 in the Wild
In bioinformatics, you will meet it throughout Bioconductor, in packages such as:
- SummarizedExperiment
- Biostrings
- DESeq2
- SingleCellExperiment
- GenomicRanges and IRanges
Bioconductor wouldn’t exist without it.
Why S4 Scares People (and Why You’ll Be Fine)
Let’s be real:
- It’s verbose.
- It has a slightly 2000s Java vibe.
- Debugging S4 errors can feel like deciphering ancient scrolls.
But if you treat S4 like a serious toolkit for real abstraction (not everyday scripting), it becomes:
A powerful way to build extensible software that is safer, clearer, and more scalable.
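The ancient scrolls get easier to read once you know a few introspection helpers from the methods package. The sketch below rebuilds a cut-down version of the applyStep generic and then asks R what it knows about it; the simplified method body is an assumption made to keep the example short.

```r
library(methods)

setClass("PipelineStep",
         slots = c(name = "character", func = "function"))
setGeneric("applyStep", function(step, data) standardGeneric("applyStep"))
setMethod("applyStep", signature("PipelineStep", "data.frame"),
          function(step, data) step@func(data))

isGeneric("applyStep")    # TRUE
showMethods("applyStep")  # prints the signatures that have methods

# existsMethod() only reports a method defined for exactly this signature.
exact_df  <- existsMethod("applyStep", signature("PipelineStep", "data.frame"))
exact_num <- existsMethod("applyStep", signature("PipelineStep", "numeric"))

# selectMethod() returns the method that dispatch would pick for these classes.
picked <- selectMethod("applyStep", signature("PipelineStep", "data.frame"))
```

When a call fails with a missing-method error, comparing the classes of your arguments against the output of showMethods is usually the quickest way to find the mismatch.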
Bonus: Mixing S3 and S4
Yes, you can mix them. Many packages do:
- Use S4 where you need strong typing or Bioconductor integration.
- Use S3 where you need lightweight custom types or DSL-like flexibility.
Just don’t forget which spellbook you’re holding.
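One common bridge between the two systems is setOldClass, which registers an S3 class so that S4 signatures can refer to it. The sketch below is a minimal illustration under stated assumptions: the raw_counts S3 class and its constructor are hypothetical, and the method body is just one way such a step might behave.

```r
library(methods)

# A hypothetical S3 class: just a list with a class attribute.
new_raw_counts <- function(values) {
  structure(list(values = values), class = "raw_counts")
}

# Register the S3 class so S4 signatures can refer to it.
setOldClass("raw_counts")

setClass("PipelineStep",
         slots = c(name = "character", func = "function"))
setGeneric("applyStep", function(step, data) standardGeneric("applyStep"))

# An S4 method that dispatches on the S3 class as its second argument.
setMethod("applyStep", signature("PipelineStep", "raw_counts"),
          function(step, data) {
            data$values <- step@func(data$values)
            data
          })

step1  <- new("PipelineStep", name = "Normalize",
              func = function(x) x / max(x))
result <- applyStep(step1, new_raw_counts(c(2, 4, 8)))
result$values  # 0.25 0.50 1.00
```

The S3 object stays a lightweight list, while the S4 generic still picks the right method for it.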
Summary: Why S4 Is a DSL Architect’s Dream
- You can define generic verbs (plot, summarize, fit, applyStep)
- You can dispatch based on multiple nouns
- You can create slot-based records with clear interfaces
- You can create frameworks, not just scripts
S4 isn’t for everything. But when you’re building something serious, it’s the OOP system that delivers.