A microservice journey part 5 - dependencies

Moving from a monolith to a microservice architecture is hard. It is important to understand what the issues with the monolith are before jumping right in. If you understand what your problem is, you can make sure you don't introduce the same problem into a microservice ecosystem.

Distributed Monolith
When we start realising the benefit of building new things and linking them via an API, it can be quite tempting to make everything a service. The problem starts to arise when one service calls another, which calls another. This leads to a bunch of services which are all dependent on each other. You end up back where you started, in a monolith-like structure: a distributed monolith.
The problem with monoliths is that everything is accessible. Sharing code seems like a no-brainer at the time. Usually there is one big catch-all project which contains all the utility-type code. Once that code is shared with other parts of the system, you can no longer change it without massive impact. The more dependencies there are on a bit of code, the less likely it is that you can change it. With microservices this is much worse, because the dependencies are not clearly visible. You need to keep a list of dependent APIs so you know who is using a service and what you need to test when you release it.
Keep the key goals of microservices in mind and ask these questions: Can I deploy this independently? Can I scale this independently? Is the entire subsystem fault tolerant, or will this one service bring everything down? If you can answer these questions and you are happy with the answers, then you are good to go. The answers must not come with a caveat, though ("If we automate testing of all the APIs then we will be sweet, right?").

Networks
The CAP theorem is a long-standing computer science theorem which says that a distributed system can only ever guarantee two of these three properties:

C - consistency
A - availability
P - partition tolerance

If you want consistency and availability, then keep your monolith. If you choose to add a network hop between services, then network partitions become possible, and when one happens you must choose between availability and consistency. You cannot have both.
If you build a lot of microservices and they are all dependent on each other, you end up adding a heap of network hops, and you will need to deal with availability problems. Networks are inherently flaky, so the more network hops there are, the more likely it is that some requests will be dropped. With a single network hop, a failure is less likely. Consider the following: if we use the SLA example and assume each service has an availability of 99.9%, that is 8h 45m 56s of downtime per year. If I string together 3 services, all with 99.9% uptime, I get 0.999 * 0.999 * 0.999 = 0.997, or roughly 99.7%, which is about 1d 2h 17m 50s of outage per year. That's a lot of extra downtime added by splitting this into multiple services.
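The arithmetic above can be reproduced with a short sketch. It assumes that the services fail independently and that a year is 365.2425 days long; the function names are made up for this illustration. Because it uses the unrounded product rather than 99.7%, the chained downtime it reports comes out slightly lower than the figure quoted above.

```python
# Illustrative only: assumes independent failures and a 365.2425 day year.
SECONDS_PER_YEAR = 365.2425 * 24 * 60 * 60


def chained_availability(availabilities):
    """Availability of a request that must pass through every service in turn."""
    total = 1.0
    for availability in availabilities:
        total *= availability
    return total


def downtime_seconds_per_year(availability):
    """Expected seconds per year during which the chain is unavailable."""
    return (1.0 - availability) * SECONDS_PER_YEAR


def format_duration(seconds):
    """Render a number of seconds as 'Xd Xh Xm Xs'."""
    seconds = int(round(seconds))
    days, rem = divmod(seconds, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{days}d {hours}h {minutes}m {secs}s"


single = 0.999
chain = chained_availability([0.999, 0.999, 0.999])

print("one service:   ", format_duration(downtime_seconds_per_year(single)))
print("three in a row:", format_duration(downtime_seconds_per_year(chain)))
```

Every extra synchronous hop multiplies in another factor below 1, so the expected downtime grows with each service added to the chain.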
Microservices should map to your business domains and keep everything that belongs to a domain as close together as possible.
Think of an eCommerce platform. Search is a key component of any online platform. The options are:
Option 1. Call an orchestrating service which then calls a search service, followed by Article service, inventory service, pricing service, promo service.... The list goes on for a while...
Option 2. Call the Search service.

Option 1 has many microservices and seems very flexible, since the consumer decides how to use them, but it is actually extremely brittle. Think about all the ways option 1 could fail. All of those failures need to be handled gracefully. Think beyond just search: the Article service is a core service that can be used by many other services, so it becomes a central point of failure. How can we deploy the Article service easily if there are 10 other services depending on it? How do we upgrade this service 10 times a day without a really good blue-green deployment strategy, versioning of APIs and really good regression testing? To make a simple change in search, we actually need to deploy 2 or 3 services.
Option 2 has all the data processed and replicated into a data store, or data stores, that the service has direct access to. Deploying this service is much easier. Scaling this service is much easier. There is still a dependency on the article, pricing and inventory services, but the dependencies are asynchronous (more on this later).
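As a rough sketch of option 2, the search service below keeps its own copy of the data it needs and updates that copy from events published by the upstream services. The event types (`ArticleCreated`, `PriceChanged`, `StockLevelChanged`), the field names and the in-memory dictionary are all assumptions made for this illustration; a real service would use a proper data store and search index.

```python
from dataclasses import dataclass, field


@dataclass
class SearchService:
    """Hypothetical search service that answers queries from its own data."""

    # Local read model, keyed by article id (stands in for a real data store).
    articles: dict = field(default_factory=dict)

    def apply_event(self, event: dict) -> None:
        """Update the local copy from an event emitted by an upstream service."""
        record = self.articles.setdefault(event["article_id"], {})
        if event["type"] == "ArticleCreated":
            record["name"] = event["name"]
        elif event["type"] == "PriceChanged":
            record["price"] = event["price"]
        elif event["type"] == "StockLevelChanged":
            record["in_stock"] = event["quantity"] > 0

    def search(self, term: str) -> list:
        """Answer a query without calling any other service."""
        term = term.lower()
        return [
            {"article_id": article_id, **record}
            for article_id, record in self.articles.items()
            if term in record.get("name", "").lower()
        ]


service = SearchService()
service.apply_event({"type": "ArticleCreated", "article_id": "a1", "name": "Red Shoe"})
service.apply_event({"type": "PriceChanged", "article_id": "a1", "price": 49.0})
service.apply_event({"type": "StockLevelChanged", "article_id": "a1", "quantity": 3})
print(service.search("shoe"))
```

At query time there is no network hop to the article, pricing or inventory services, so an outage in any of them only makes the search data a little stale instead of making search fail.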


Coupling
Tight coupling is a major problem in IT systems, and it can be very hard to solve. It makes upgrading components hard: lots of people need to be involved to make a simple change, the regression impact is large, and there is a lot of risk associated with the change. The problem is that IT systems are always changing, and tight coupling means we cannot easily change system components without affecting the whole. A good IT system, and a good microservice architecture, should allow you to easily change one part of the architecture without fear of breaking something else. When a service calls an API, we immediately get a tight coupling to that service. Now we are dependent on another service which might not have been built by us, which we don't have control over, and which has to remain backward compatible for us. If we have many services calling many services, chances are there will be a few critical services which every other service uses (a utility service), and which can bring the whole thing down.
To reduce coupling we can use an event stream and event driven architecture.

Event Driven Architecture
The significance of this approach is massive, and what is really important to get right is the way we think about events. An event is a business event emitted from a system after the event has occurred, with no prior knowledge of how it is going to be consumed. This last point is critical if we want to reduce coupling. If we use events as a replacement for APIs, we will end up in the same place and may as well just use an API. But if the emitted event has no knowledge of the downstream consumers and simply includes all the information relevant to the event, then as new consumers pop up they are not dependent on a version of an API. Instead they have the entire content of the message available and can choose what they need. Furthermore, if the event is modelled around a business event, it will be much more extensible in the future. A good example is a new order. Many downstream systems might be interested in a new order. An example of a non-business event could be "saved order to database". It might be interesting to the internals of a service, but it is not really the business event of a new order being created, and thus it will have limited extensibility.
An event driven architecture reverses the dependency somewhat. It also decouples the consumer from the publisher. If we want to change any of these systems, they just need to stay backward compatible with the event, and we can easily change the publishing system or the consuming one. Since the event stream is a queue of messages, we can deploy a new version without too much planning. As long as the same messages are being emitted we can deploy whenever we like, because the queue buffers the requests. What was a synchronous dependency with APIs has now become an asynchronous dependency, and the service can tolerate faults much more easily.
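To make the idea of a business event concrete, here is a minimal sketch of a new order event that carries everything relevant to the order and knows nothing about its consumers. The `OrderCreated` class, its fields and the two consumer functions are hypothetical and only serve as an illustration.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class OrderCreated:
    """Hypothetical business event: emitted after an order has been created."""

    order_id: str
    customer_id: str
    lines: tuple      # (sku, quantity, unit_price) tuples
    total: float
    occurred_at: str  # ISO 8601 timestamp


def to_message(event):
    """Wrap the event in a plain dictionary, ready to be serialised."""
    return {"type": type(event).__name__, "payload": asdict(event)}


# Two independent consumers read the same message and take only what they need.
def warehouse_view(message):
    return [(sku, quantity) for sku, quantity, _price in message["payload"]["lines"]]


def finance_view(message):
    payload = message["payload"]
    return payload["order_id"], payload["total"]


event = OrderCreated(
    order_id="o-1",
    customer_id="c-9",
    lines=(("sku-1", 2, 10.0),),
    total=20.0,
    occurred_at="2020-01-01T00:00:00+00:00",
)
message = to_message(event)
print(warehouse_view(message))
print(finance_view(message))
```

Neither consumer had to ask the publisher for a tailored API. A third consumer could be added later without any change to `OrderCreated` or to the system that emits it.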
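The buffering effect can be sketched with an in-memory queue standing in for the event stream. The `EventStream` class is made up for this illustration. A real event stream would be durable, and many of them keep messages in a log instead of removing them once they are read.

```python
from collections import deque


class EventStream:
    """In-memory stand-in for an event stream (illustrative only)."""

    def __init__(self):
        self._queue = deque()

    def publish(self, event):
        """The publisher only appends; it never waits for a consumer."""
        self._queue.append(event)

    def pending(self):
        """Number of events waiting to be consumed."""
        return len(self._queue)

    def drain(self, handler):
        """Pass every buffered event to the handler, oldest first."""
        handled = 0
        while self._queue:
            handler(self._queue.popleft())
            handled += 1
        return handled


stream = EventStream()

# The consumer is being redeployed, yet the publisher keeps emitting.
stream.publish({"type": "OrderCreated", "order_id": "o-1"})
stream.publish({"type": "OrderCreated", "order_id": "o-2"})
print("buffered while consumer was down:", stream.pending())

# The new consumer version starts up and catches up on what it missed.
received = []
stream.drain(received.append)
print("processed after restart:", [e["order_id"] for e in received])
```

With a synchronous API call, the two orders published during the redeployment would have failed or needed retries by the caller. Here they simply wait in the queue.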

Conclusion
Just because you have chosen to build microservices doesn't mean your system will be great. There are many challenges in building a distributed system, and the most important thing is to understand why. Why are we going to all this trouble to split our system? If you understand the why and keep asking yourself these questions during the design phase, you can make sure you don't just end up in the same place you started, or possibly somewhere worse.
