Welcome to The Engineer Banker, a weekly newsletter dedicated to organizing and delivering insightful technical content on the payments domain, making it easy for you to follow and learn at your own pace.
Welcome to our second article about event-driven architectures. Software and systems are in a constant state of evolution, driven by factors such as emerging requirements, regulatory shifts, service deprecation, and data model updates. Consequently, our system design must be robust enough to accommodate both internal and external changes seamlessly. While short-term goals focus on timely releases and immediate value delivery to our customer base, it is imperative to also plan for long-term system evolution. This dual focus ensures that our architecture can transition smoothly from its current state to future versions, sustaining its operational efficiency and adaptability. The insights offered in this article aim to guide us through the journey of system evolution with a focus on maintaining compatibility.
In an ideal scenario, every component within a system would operate on the same version, ensuring consistent functionality and seamless interaction. However, given the inherently distributed nature of payment systems, microservices architectures, and the multitude of actors involved - from banks and payment gateways to technology vendors and beyond - achieving simultaneous updates across all parties is a logistical challenge. Each actor has its own operational constraints, technological resources, and update schedules, which means that version upgrades across the system will inevitably occur in a staggered, non-synchronized manner. It therefore becomes crucial to maintain robust backward compatibility and to have mechanisms in place for handling version mismatches, ensuring smooth operation despite the diversity in component versions.
Compatibility is evaluated from the perspective of the consumer, as it is at the point of message consumption where parsing and interpretation occur. This stage is particularly sensitive to failure, making it crucial that received messages are compatible with the consumer's capabilities for accurate interpretation and processing. The diagram above illustrates the difference between backward and forward compatibility.
Backward compatibility means that consumers with a newer schema version can correctly parse and interpret data from producers with an older schema version.
Forward compatibility means that consumers with an older schema version can correctly parse and interpret data from producers with a newer schema version.
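To ground these definitions, here is a minimal Python sketch; the payload shape and the field names (payment_id, amount, currency) are invented for illustration, not taken from any real schema. The newer consumer tolerates old messages by defaulting the missing optional field, while the older consumer tolerates new messages by ignoring fields it does not recognize:

```python
import json

# v1 schema: {"payment_id", "amount"}
# v2 schema adds an optional "currency" field that defaults to "EUR".

def consume_v2(raw: str) -> dict:
    """Newer consumer reading older data (backward compatibility):
    falls back to a default when the optional field is absent."""
    event = json.loads(raw)
    return {
        "payment_id": event["payment_id"],
        "amount": event["amount"],
        "currency": event.get("currency", "EUR"),  # default for v1 producers
    }

def consume_v1(raw: str) -> dict:
    """Older consumer reading newer data (forward compatibility):
    simply ignores fields it does not know about."""
    event = json.loads(raw)
    return {"payment_id": event["payment_id"], "amount": event["amount"]}

print(consume_v2('{"payment_id": "p-1", "amount": 1000}'))                     # v1 message
print(consume_v1('{"payment_id": "p-2", "amount": 2500, "currency": "GBP"}'))  # v2 message
```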
When a modification is designed to be both backward and forward compatible, it is termed fully compatible. This scenario commonly arises when an optional field is introduced to an existing schema. If a change in the schema or specification is neither forward nor backward compatible, then it is an incompatible change, or breaking change. To cope with these scenarios we can make use of the parallel change pattern, discussed below. In an event-driven system, examples of changes that introduce incompatibilities between producer and consumer include:
Removing Fields: Eliminating a field that the consumer expects will lead to errors during deserialization or may result in incorrect behavior.
Changing Data Types: Modifying the data type of an existing field can cause type mismatch errors at the consumer end.
Reordering Fields: Changing the order of fields in a fixed-order format like CSV could lead to incorrect data mapping during consumption.
Adding Mandatory Fields: Introducing a new mandatory field breaks producers that do not yet supply it; their messages will be rejected or mishandled by consumers validating against the new schema.
Enum Restriction: Removing an enum value that is still in use can cause deserialization failures or unhandled branches on the consumer side.
Changing Field Semantics: Modifying the meaning or usage of a field without changing its name or type can cause logical errors.
Modifying Default Values: Changing the default value of a field may lead to unexpected behavior for consumers relying on the previous default.
Renaming Fields: Changing the name of an existing field will lead to issues if the consumer expects the old name.
Switching from Single to Repeated Fields: Changing a field from being a single value to a list of values can cause type errors on the consumer side.
Constraints and Validations: Adding or tightening validation rules or constraints on the data can result in previously valid messages being rejected.
Some of these incompatible changes pertain directly to the message's data structure, affecting how the consumer parses and deserializes the incoming data. Others are more subtle and relate to alterations in the business logic that governs how the consumer interprets and processes these messages. Both types of changes can have significant implications for the reliability and accuracy of data consumption in an event-driven system.
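To make these failure modes tangible, here is a small hedged sketch (the amount and debtor fields are invented for the example) in which a consumer written against the old schema encounters a type change and a removed field:

```python
import json

def consume(raw: str) -> None:
    """Consumer coded against the old schema: expects an integer
    'amount' in minor units and a mandatory 'debtor' field."""
    event = json.loads(raw)
    units = event["amount"] // 100                       # breaks if the type changes
    print(f"amount: {units}, debtor: {event['debtor']}")  # breaks if the field is removed

# Incompatible producer changes surface as runtime failures on consumption:
for raw in (
    '{"amount": "10.00", "debtor": "ACME"}',  # type change  -> TypeError
    '{"amount": 1000}',                       # removed field -> KeyError
):
    try:
        consume(raw)
    except (TypeError, KeyError) as exc:
        print(f"incompatible change detected: {exc!r}")
```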
Introducing incompatible changes in our architecture
The parallel change pattern is a software engineering technique used for evolving a system in a manner that maintains its integrity and continuity. This pattern involves creating a new path or functionality while keeping the old one operational. In essence, it is running the old and new systems in parallel. The new functionality is added and used alongside the old one, with the system being capable of supporting both simultaneously. This approach allows for testing and gradual migration to the new system while maintaining operational stability.
The Parallel Change pattern unfolds in two stages: 'Expansion' and 'Contraction'.
Initially, during the 'Expansion' phase, we seamlessly incorporate the new interface into the existing system. This allows the legacy functionality and the new behavior to coexist harmoniously. Once all system users have transitioned smoothly onto the new schema, we move to the 'Contraction' phase. During this phase, we systematically eliminate the old behavior from the system, ensuring a clean and efficient upgrade to the new interface. Let's understand the pattern with some diagrams.
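Alongside the diagrams, a minimal code sketch may help; the rename of an amt field to amount_minor_units is an assumption made purely for the example:

```python
# Expansion: the producer emits the old and new representations side by side,
# so legacy consumers and migrated consumers can both parse the event.
def produce_expansion(payment_id: str, amount: int) -> dict:
    return {
        "payment_id": payment_id,
        "amt": amount,                 # legacy field, kept alive during migration
        "amount_minor_units": amount,  # new field, adopted by migrated consumers
    }

# Consumers migrate at their own pace: prefer the new field, fall back to the old.
def consume_during_migration(event: dict) -> int:
    return event.get("amount_minor_units", event.get("amt"))

# Contraction: once every consumer reads the new field, the legacy one is dropped.
def produce_contraction(payment_id: str, amount: int) -> dict:
    return {"payment_id": payment_id, "amount_minor_units": amount}
```

The dual-write in the expansion step is what buys the staggered migration described earlier: no consumer is forced to upgrade in lockstep with the producer.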
Beyond event schemas, this technique can be applied in a number of other situations:
Database Refactoring: A crucial element of evolutionary database design, most database refactorings follow the parallel change pattern. This approach allows a transition period between the old and new schemas until all code is updated; a sketch of this sequence follows the list.
Deployments: Techniques like canary releases and Blue-Green Deployment use parallel change to incrementally shift users between old and new code versions, reducing risk and simplifying service orchestration in microservices architectures.
API Evolution: For non-backwards-compatible changes to remote APIs like REST services, it can be applied to modify payload requirements or introduce new endpoints to distinguish between versions.
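To illustrate the database case, here is a hedged sketch using SQLite and an invented payments table, applying expansion and contraction to a column rename (the final step assumes SQLite 3.35 or newer for DROP COLUMN support):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (id TEXT PRIMARY KEY, amt INTEGER)")
conn.execute("INSERT INTO payments VALUES ('p-1', 1000)")

# Expansion: add the new column and backfill it from the legacy one;
# during the transition, application code dual-writes both columns.
conn.execute("ALTER TABLE payments ADD COLUMN amount_minor_units INTEGER")
conn.execute("UPDATE payments SET amount_minor_units = amt")

# Contraction: once every reader and writer uses the new column,
# the legacy column is retired (requires SQLite 3.35+ for DROP COLUMN).
conn.execute("ALTER TABLE payments DROP COLUMN amt")

print(conn.execute("SELECT id, amount_minor_units FROM payments").fetchall())
```

Each step is independently deployable, which is exactly what lets the old and new code paths coexist against the same database during the migration window.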
You may want to continue reading the rest of the event-driven architecture series: