HSG-MCS-HS21_tapas/doc/architecture/decisions/0008-switch-to-an-event-driven-microservices-architecture.md
2021-12-02 23:39:21 +01:00

8. Switch to an event-driven microservices architecture

Date: 2021-11-21

Status

Proposed

Supersedes 5. Event driven communication TODO Fix this. It should only supersede it once this decision has been accepted. If accepted, this should also supersede 0013 - Microservice Architecture.

Context

Our Tapas App is currently implemented based on a microservice architecture, where the services communicate synchronously via request-response. Each service encapsulates a different bounded context with different functional and non-functional requirements. Internal communication could also be done using asynchronous or event-driven communication.

Decision

Pros:

Scalability: Different services within the Tapas app cannot always scale at the same rate. For example, thousands of users could add printing tasks at the same time while only one printer is available. In this scenario we might want to scale up the task-list service to handle the creation load, but scaling up the printing executor operates on a different time scale (i.e. adding a printer takes time). Moreover, many new tasks could come in, most of which can be executed internally. In this case we want to be able to scale up the task list but might not need to scale up the auction house. Event-driven communication would decrease the coupling between services. Consequently, individual services could scale more easily, as they would no longer depend on the scalability of other services. This improves the app's overall scalability. Since scalability is one of the system's top three -ilities, this seems quite important.

Fault tolerance: Another of the system's top three -ilities is fault tolerance. We could have highly unstable IoT executors that fail often. This should not disrupt the system's overall operation. The decoupling facilitated by asynchronous, event-driven communication limits the impact on other services when an individual service goes down; once the service is back up, it can recover the system's state from persisted messages.

Cons:

Error handling, workflow control, and event timing: These three topics are the main drawbacks of an event-driven architecture. They can be mitigated by using an orchestrator (as we currently do with the roster) to orchestrate the assignment of tasks, the auctioning of tasks, and error handling when executors fail. More research needed.
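The fault-tolerance argument can be illustrated with a minimal sketch. It assumes an append-only event log standing in for a persistent message broker; the names `EventLog`, `PrintingExecutor`, and the topic `task.created` are hypothetical and not taken from the Tapas code base. The producer keeps publishing while the consumer is down, and the consumer catches up from its last offset once it is polled again.

```python
from dataclasses import dataclass, field


@dataclass
class EventLog:
    """Append-only in-memory log; a stand-in for a persistent broker (assumption)."""
    events: list = field(default_factory=list)

    def publish(self, topic, payload):
        # Producers only append; they never wait for a consumer.
        self.events.append((topic, payload))

    def read_from(self, offset):
        # Return a copy of all events at or after the given position.
        return self.events[offset:]


@dataclass
class PrintingExecutor:
    """Hypothetical consumer that tracks how far it has read in the log."""
    log: EventLog
    offset: int = 0
    printed: list = field(default_factory=list)

    def poll(self):
        # Process everything published since the last successful poll.
        for topic, payload in self.log.read_from(self.offset):
            if topic == "task.created":
                self.printed.append(payload)
            self.offset += 1


log = EventLog()
executor = PrintingExecutor(log)

log.publish("task.created", "print-1")
executor.poll()  # executor is up and handles the first task

# The executor is "down" here: it is simply not polled,
# while the task list keeps accepting and publishing tasks.
log.publish("task.created", "print-2")
log.publish("task.created", "print-3")

executor.poll()  # executor is back up and replays what it missed
print(executor.printed)
```

In a synchronous request-response setup, the two tasks created during the outage would have failed or blocked at the caller. Here the task list is unaffected, at the cost of the error-handling and timing questions listed under the cons.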

Consequences

Consequences to be determined but would relate to the three concepts mentioned as cons.