Forget objects: start thinking in data transformations

Stop modeling conversations between objects, start composing transformations.
Introduction
Anyone who has worked on a non-trivial codebase knows this feeling: you open a file to change something simple, and suddenly you're navigating through objects, services, state mutations, and indirect dependencies just to understand what's going on.
Data is hidden, behavior is fragmented, and side effects leak everywhere.
Most of the time, the problem isn't a lack of patterns or abstractions — it's too many of them. We often model systems as conversations between objects, when in reality what we are doing is much simpler: we receive data, transform it step by step, and produce a result.
If you strip away the syntax, frameworks, and buzzwords, most programs are nothing more than a sequence of transformations. That idea has a name — pipeline architecture — but more importantly, it represents a different way of thinking about code.
A pipeline mindset
There's a famous story from the blog "More Shell, Less Egg" that captures this mindset perfectly.
Donald Knuth was asked to solve a seemingly simple problem:
read a text file, find the n most frequent words, and print them sorted by frequency.
Knuth's solution spanned more than ten pages of Pascal.
Doug McIlroy, one of the pioneers of Unix, responded with this:
tr -cs A-Za-z '\n' | tr A-Z a-z | sort | uniq -c | sort -rn | sed ${1}q
Instead of one big, custom-built algorithm, McIlroy chained small, independent tools. Each one does exactly one thing, and the output of one becomes the input of the next.
No hidden state, no complex control flow — just data flowing forward. That's the essence of pipeline thinking.
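The same shape translates directly into plain PHP. Here is a rough sketch of McIlroy's pipeline as three small functions (the function names are mine, purely for illustration):

```php
<?php
// A rough PHP analog of McIlroy's pipeline. Each stage is a small
// function; the output of one becomes the input of the next.
// (Function names here are illustrative, not part of any library.)

function words(string $text): array {
    // tr -cs A-Za-z '\n' | tr A-Z a-z : split on non-letters, lowercase
    return preg_split('/[^a-z]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
}

function frequencies(array $words): array {
    // sort | uniq -c : count occurrences of each word
    return array_count_values($words);
}

function topN(array $freq, int $n): array {
    // sort -rn | sed ${1}q : order by count, keep the first n
    arsort($freq);
    return array_slice($freq, 0, $n, true);
}

print_r(topN(frequencies(words("the quick fox and the lazy fox and the dog")), 2));
```

No stage knows about the others; each one just accepts a value and returns a new one.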
From object conversations to data transformations
Pipeline architecture is often explained as a pattern, but it works better when understood as a mental model.
In object-oriented systems, data is usually locked inside objects. Methods exist primarily to protect and mutate internal state. As systems grow, this leads to tight coupling and implicit behavior — you need to understand who owns what before you can safely change anything.
Pipeline thinking flips this around: instead of hoarding state, you pass data forward. Data becomes the main character, and functions are just transformations applied along the way: each step receives something, transforms it, and hands it off to the next.
This idea is old, but it's also very current. Languages like Elixir, F#, and Swift have embraced this model for years with a pipeline operator (|>). PHP 8.5 joining this group is not just a syntactic upgrade — it's an invitation to rethink how we structure everyday PHP code.
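As a taste of the syntax, here is a minimal sketch using the PHP 8.5 pipe operator (requires PHP >= 8.5; the slug example is mine, not from the RFC):

```php
<?php
// Minimal sketch of PHP 8.5's pipe operator (requires PHP >= 8.5).
// The left-hand value is passed as the single argument to the callable
// on the right, so the steps read top to bottom.

$slug = '  Hello, Pipes!  '
    |> trim(...)
    |> strtolower(...)
    |> fn(string $s) => preg_replace('/[^a-z0-9]+/', '-', $s)
    |> fn(string $s) => trim($s, '-');

echo $slug; // hello-pipes
```

Compare that to the inside-out nested version, `trim(preg_replace(..., strtolower(trim($input))), '-')`, which forces you to read the steps in reverse.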
Pipes and Filters: strengths, limits, and reality
At an architectural level, this idea shows up as the pipes and filters style.
A typical pipeline is composed of:
- Producers – which initiate the flow
- Transformers – which modify or enrich the data
- Testers – which validate, filter, or route it
- Consumers – which terminate the process
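As a toy illustration, the four roles can be mapped onto plain callables (the data and closures here are made up):

```php
<?php
// Producer -> Tester -> Transformer -> Consumer, as plain callables.
// (Toy data; the mapping of roles onto closures is illustrative.)

$producer    = fn(): array => ['ada@example.com', 'not-an-email', 'bob@example.com'];
$tester      = fn(array $xs): array => array_values(array_filter(
    $xs,
    fn(string $x) => filter_var($x, FILTER_VALIDATE_EMAIL) !== false
));
$transformer = fn(array $xs): array => array_map('strtolower', $xs);
$consumer    = fn(array $xs): string => implode(', ', $xs);

echo $consumer($transformer($tester($producer())));
```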
This structure has clear and well-understood strengths:
- Extremely easy to reason about, largely because the execution model is linear and explicit
- Each stage is small, focused, and replaceable, which results in strong modularity and a low cognitive load
- From a cost perspective, this usually translates into simpler maintenance and a low cost of change
But it also has trade-offs:
- Pipelines are often deployed as a single unit, especially in monolithic systems
- If one step fails badly (for example, an out-of-memory error), the entire application goes down. This limits fault tolerance and elasticity, and it can significantly increase recovery times in larger applications
It’s also worth making an important distinction: pipes and filters as an architectural style often describe system-level structure and deployment concerns, while pipeline thinking can be applied locally. You can use it inside a single request, use case, or domain workflow without committing to a full pipes-and-filters architecture.
Used in the right context — request processing, domain workflows, data transformation — pipelines provide one of the clearest ways to structure execution flow.
Error Handling in a linear world
Real systems quickly raise an important question: how do you handle errors in a linear pipeline without destroying readability?
Pipelines remain clean when values are never passed in their raw form. Each step should operate on a wrapper that explicitly encodes success or failure so failures can travel through the pipeline as ordinary data.
Common representations include Result<T, E> or Either<E, T> in typed languages. Elixir expresses the same concept through a convention rather than a strict type system, using tuples such as {:ok, value} and {:error, reason}. Regardless of syntax, the goal is the same: make failure a first-class value that can safely flow through the system.
There are two primary strategies for working with these wrapped values in a pipeline:
Handle errors inside each transformation
- The most direct approach is to handle errors inside each transformation: functions are written with multiple branches. One branch pattern-matches on a successful value (for example {:ok, ...} or Right) and performs the transformation; another matches on a failure (for example {:error, ...} or Left) and simply returns it unchanged.
- Once a failure occurs, it propagates through the pipeline and prevents further computation. This approach is straightforward but tends to scatter error-handling logic across many functions, increasing boilerplate and obscuring the core business logic.
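In PHP terms, this style might look like the following sketch (it uses a simple ['ok' => ...] array convention; the step names are made up):

```php
<?php
// Every step carries its own propagation branch: if the incoming value
// is already a failure, return it untouched; otherwise do the work.

function trimStep(array $result): array {
    if (!$result['ok']) return $result;               // propagate failure
    return ['ok' => true, 'value' => trim($result['value'])];
}

function notEmptyStep(array $result): array {
    if (!$result['ok']) return $result;               // propagate failure (again)
    return $result['value'] !== ''
        ? $result
        : ['ok' => false, 'error' => 'Empty value'];
}

$out = notEmptyStep(trimStep(['ok' => true, 'value' => '  hello  ']));
// Note how the !$result['ok'] branch is repeated in every function.
```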
Centralize error handling in the pipeline (preferred)
- Lift propagation out of individual transformations and let the pipeline orchestrate control flow using a bind-style operation (also called and_then, flatMap, or chain). bind takes a wrapped value and a transformation function: if the value is a success, it unwraps it and applies the function; if it's a failure, it short-circuits — the function is skipped entirely and the error is propagated automatically.
- In terms of Result or Either, this means composing computations instead of conditionals. With this approach, transformation functions remain pure and narrowly focused: they operate on raw input, return a Result or Either, and have no awareness of how errors are propagated or short-circuited.
With centralized handling, the pipeline becomes a clear description of the happy path. Failures are simply another kind of data that short-circuit the flow, producing code that reads linearly, scales with complexity, and treats errors in a controlled, predictable way.
Where pipeline thinking fits naturally
Pipeline thinking is not tied to a specific language or paradigm.
It naturally emerges whenever a problem can be described as a sequence of transformations, where each step receives an input, produces an output, and may fail in a controlled way.
This model appears across very different domains. ETL workflows are a canonical example: data is extracted, validated, transformed, enriched, and finally loaded. Each stage depends on the previous one, and failures must short-circuit the process while preserving context about what went wrong. The same structure applies to data processing pipelines, stream processing, and batch jobs, where values flow through well-defined stages.
The pattern is just as common in application-level workflows. Request and command handling, domain use cases, and business processes often follow a linear narrative: parse input, validate rules, apply transformations, persist state, trigger side effects.
Integrations with external systems such as calling third-party APIs, normalizing responses, and mapping them into internal models, also fit naturally into this shape.
Anywhere the logic can be summarized as:
first do this, then validate, then transform, then persist
…you already have a pipeline.
Pipeline thinking doesn't introduce a new structure so much as it reveals one that was already there, making the flow explicit, composable, and easier to reason about.
A practical example using PHP 8.5 pipe operator
Let's model a common workflow: creating a user from an incoming request. The goal here is not to build a framework, but to show how pipeline thinking can emerge with very little structure.
Step 1: a simple Result wrapper
Instead of introducing a full class hierarchy, we'll stick to a simple, explicit convention:
function ok($value): array
{
return ['ok' => true, 'value' => $value];
}
function error(string $reason): array
{
return ['ok' => false, 'error' => $reason];
}
Step 2: a single composition primitive
The entire pipeline relies on one operation: apply the next step only if the previous one succeeded.
This is the bind, flatMap, and_then concept in its simplest possible form.
function then(array $result, callable $fn): array
{
return $result['ok']
? $fn($result['value'])
: $result;
}
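A quick usage sketch shows both behaviors (the helpers are repeated here so the snippet stands alone):

```php
<?php
// ok/error/then repeated from the steps above so this runs standalone.
function ok($value): array            { return ['ok' => true,  'value' => $value]; }
function error(string $reason): array { return ['ok' => false, 'error' => $reason]; }
function then(array $result, callable $fn): array {
    return $result['ok'] ? $fn($result['value']) : $result;
}

$double = fn(int $x) => ok($x * 2);

var_dump(then(ok(21), $double));        // success: the value becomes 42
var_dump(then(error('boom'), $double)); // failure short-circuits; $double never runs
```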
Step 3: pure transformation functions
Each function does exactly one thing, takes raw input, and returns a Result. Each is easy to test in isolation and has no knowledge of the pipeline it runs in.
function sanitize(array $input): array
{
return ok([
'email' => trim($input['email'] ?? ''),
'name' => trim($input['name'] ?? ''),
]);
}
function validate(array $data): array
{
return filter_var($data['email'], FILTER_VALIDATE_EMAIL)
? ok($data)
: error('Invalid email');
}
function createUser(array $data): array
{
return ok(User::create($data));
}
function sendWelcomeEmail(User $user): array
{
Mail::send($user->email);
return ok($user);
}
function present(User $user): array
{
return ok([
'id' => $user->id,
'email' => $user->email,
]);
}
In real systems, not every step can be pure. Side effects like persistence, messaging, or sending emails still fit naturally into pipelines.
The key is that they remain explicit, localized, and ordered, rather than implicit side effects hidden behind object state.
Step 4: the pipeline itself
You get a visible, linear flow with a clear separation between transformation and orchestration, built-in error handling without nested conditionals, and code that reads like a process rather than a puzzle.
$result =
sanitize($input)
|> fn($r) => then($r, 'validate')
|> fn($r) => then($r, 'createUser')
|> fn($r) => then($r, 'sendWelcomeEmail')
|> fn($r) => then($r, 'present');
Even without native pipe operators, this style can be applied today using function composition or simple helpers.
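For example, a small fold-based helper can stand in for the operator on older PHP versions (pipeline is an illustrative name, not a built-in, and the helpers are repeated so the sketch runs standalone):

```php
<?php
// A pre-8.5 substitute for the pipe: fold the steps with array_reduce,
// short-circuiting on the first failure.
function ok($value): array     { return ['ok' => true,  'value' => $value]; }
function err(string $r): array { return ['ok' => false, 'error' => $r]; }

function pipeline(array $start, callable ...$steps): array {
    return array_reduce(
        $steps,
        fn(array $acc, callable $step) => $acc['ok'] ? $step($acc['value']) : $acc,
        $start
    );
}

$result = pipeline(
    ok('  ada@example.com  '),
    fn(string $s) => ok(trim($s)),
    fn(string $s) => filter_var($s, FILTER_VALIDATE_EMAIL) ? ok($s) : err('Invalid email'),
);
// $result is ['ok' => true, 'value' => 'ada@example.com']
```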
Conclusion: rediscovering the flow
From Unix shell pipelines in the 1970s to PHP's pipe operator in 8.5, this idea has survived because it works.
Thinking in pipelines encourages you to:
- Stop hiding data
- Reduce implicit state
- Favor composition over control flow
- Write code that explains itself
This is not about abandoning objects entirely.
Objects still make sense for modeling long-lived state and invariants. Pipelines shine when the problem is primarily about flow, orchestration, and transformation.
Sometimes, the cleanest design isn't a clever hierarchy of classes, but a simple, explicit flow of data.
So the next time a piece of PHP code feels harder than it should be, try asking a different question:
What would this look like if it were just a pipeline?
Chances are, the answer will be simpler than you expect.
References
- Dr. Drang - "More Shell, Less Egg"
- David Thomas & Andrew Hunt - "The Pragmatic Programmer"
- Mark Richards & Neal Ford - "Fundamentals of Software Architecture"
- PHP RFC - "Pipe Operator"
- Scott Wlaschin - "Railway Oriented Programming"