A lot of docs
Commit 70cf269dde (parent e519baf0f9)
BIN: TAPAS-Final/Editorial Notes.pdf (new file, binary not shown)
BIN: TAPAS-Final/Exercises/Exercise 8: Testing/Testing Report.pdf (new file, binary not shown)
BIN: TAPAS-Final/doc/System Overview.jpeg (new file, binary not shown; 408 KiB)
@@ -0,0 +1,25 @@

# 1. Microservice architecture

Date: 2021-12-02

## Status

Accepted

## Context

The system is made up of five distinct bounded contexts: the Task Domain, the Roster Domain, the Executor Pool Domain, the Executor Domain, and the Auction Domain. Many different architectures could be used to make these bounded contexts work together to fulfil the system requirements.

## Decision

The system will follow the Microservices architecture.

The Microservices architecture suits our system's driving architectural characteristics particularly well. Scalability and fault tolerance are two of the system's top three driving characteristics, and elasticity and evolvability are among the others. These are all characteristics at which the Microservices architecture excels. Furthermore, none of our system's driving characteristics is hindered by the Microservices architecture (none of them rates below 3 stars out of 5).

We do not expect to have a single monolithic database, so this is not a concern.

## Consequences

There is a considerable amount of communication between bounded contexts, which could cause responsiveness and performance issues due to the added latency. Since much of this communication does not have to be synchronous, we may need to use asynchronous REST calls or publish-subscribe communication to mitigate these issues.

Using a distributed architecture makes managing transactions more complex, so we may need to use sagas to manage distributed transactions. So far, the only workflow that would require transactions across domains is the deletion of tasks.
@@ -1,19 +0,0 @@

# 1. Record architecture decisions

Date: 2021-10-18

## Status

Accepted

## Context

We need to record the architectural decisions made on this project.

## Decision

We will use Architecture Decision Records, as [described by Michael Nygard](http://thinkrelevance.com/blog/2011/11/15/documenting-architecture-decisions).

## Consequences

See Michael Nygard's article, linked above. For a lightweight ADR toolset, see Nat Pryce's [adr-tools](https://github.com/npryce/adr-tools).
@@ -0,0 +1,24 @@

# 5. Event-driven communication

Date: 2021-10-18

## Status

Accepted

## Context

The TAPAS system will be implemented following the Microservices architecture. Each service encapsulates a different bounded context with different functional and non-functional requirements. The TAPAS system could be implemented either as synchronous Microservices or as event-driven Microservices.

## Decision

The TAPAS system will be implemented as event-driven Microservices.

Event-driven asynchronous communication suits the TAPAS system better than synchronous communication and will facilitate looser coupling between services.

Scalability, one of the system's top three driving architectural characteristics, will be positively impacted, as individual services can be scaled up and down as needed.

Responsiveness (another of the system's driving architectural characteristics) will also improve, as services do not need to wait for other services to respond before responding themselves.

## Consequences

Asynchronous communication makes error handling more complex. At this moment we do not see this becoming a problem. However, if error handling becomes overly complex, we might need to implement the workflow event pattern to combat the increased complexity.
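To make the intended style concrete, here is a minimal, self-contained Java sketch (not the actual TAPAS code; the event name and the in-memory bus are illustrative stand-ins for a real message broker): the Task List publishes an event and immediately carries on, while the Roster reacts to it asynchronously.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

public class EventDrivenSketch {

    /** A task-related event as it might flow between the Task List and the Roster (assumed name). */
    record TaskAddedEvent(String taskId, String taskType) {}

    /** Tiny in-memory publish-subscribe bus; handlers run asynchronously. */
    static class EventBus {
        private final Map<Class<?>, List<Consumer<Object>>> handlers = new ConcurrentHashMap<>();
        private final ExecutorService pool = Executors.newFixedThreadPool(4);

        <E> void subscribe(Class<E> type, Consumer<E> handler) {
            handlers.computeIfAbsent(type, t -> new CopyOnWriteArrayList<>())
                    .add(e -> handler.accept(type.cast(e)));
        }

        void publish(Object event) {
            // The publisher does not wait for subscribers: loose coupling in time.
            handlers.getOrDefault(event.getClass(), List.of())
                    .forEach(h -> pool.submit(() -> h.accept(event)));
        }

        void shutdown() { pool.shutdown(); }
    }

    public static void main(String[] args) throws InterruptedException {
        EventBus bus = new EventBus();

        // The "Roster" reacts to new tasks without the Task List knowing about it.
        bus.subscribe(TaskAddedEvent.class,
                e -> System.out.println("Roster: queueing task " + e.taskId()));

        // The "Task List" publishes and immediately continues serving users.
        bus.publish(new TaskAddedEvent("task-42", "PRINT"));
        System.out.println("Task List: task accepted, not waiting for the Roster");

        Thread.sleep(200);   // give the async handler time to run in this demo
        bus.shutdown();
    }
}
```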
@@ -1,29 +0,0 @@

# 2. Separate service for Executor Pool

Date: 2021-11-21

## Status

Accepted

## Context

The Executor Pool has to keep track of which Executors are online and what task types they can execute. The Roster keeps track of tasks that need to be executed and assigns Executors to tasks. The Executor Pool functionality could be implemented in a separate service or as a part of the Roster.

## Decision

The Executor Pool will be implemented as a separate service.

Most importantly, the Executor Pool and the Roster have quite different responsibilities and reasons to change. On one hand, the Executor Pool manages the Executors, and therefore it will change if we want to add functionality that impacts how Executors log onto the system and how we keep track of them, for example if we want to check whether an executor fulfills some requirements before logging on. On the other hand, the Roster manages the execution of tasks. TODO

- Single responsibility
- Potential code volatility
- Scalability: the Roster needs much more scale

## Consequences

The one functionality that is duplicated is error handling when an executor disconnects non-gracefully...

TODO
@@ -0,0 +1,26 @@

# 3. Separate service for the Roster

Date: 2021-11-21

## Status

Accepted

## Context

The Task domain includes the creation and deletion of tasks from the system. It also stores all tasks and keeps track of each task's status. The Roster domain keeps track of internal task execution. It receives tasks from the Task domain. If a task can be executed internally, it queues the task and keeps track of it until it has been executed by an internal executor. If it cannot be executed internally, it sends the task on to the Auction domain. These two domains could be implemented as a consolidated service or as two separate services.

## Decision

We will create two separate services: a Task List service and a Roster service.

Firstly, having two separate services will improve the system's fault tolerance. If the Task List goes down, the Roster can keep managing the internal execution of tasks. Similarly, if the Roster goes down, the Task List can keep receiving input from users. Since we use asynchronous messaging for internal communication, the services can process new events when they come back up. The creation and execution of tasks is the heartbeat of the TAPAS system, so making it more fault tolerant is critical.

Secondly, we don't expect the Task List to change as frequently as the Roster. The Task List usually only needs to change for large new features that impact more than just the Task domain, and such large changes will be infrequent. Conversely, we expect the Roster to change more frequently due to changes in the assignment algorithm, communication with executors, or error handling.

## Consequences

Before deleting a task from the Task List we first need to delete it from the Roster (and make sure it is not being executed). Having these as two separate services makes this transaction more complicated to manage.

Moreover, the services have to communicate a lot with each other, and separating them adds latency between them.
@@ -1,21 +0,0 @@

# 3. Separate service for the Roster

Date: 2021-11-21

## Status

Accepted

## Context

The Roster acts as an orchestrator for the system. It communicates directly with the Task List, the executors, the Executor Pool, and the Auction House. It handles the assignment of a task to a corresponding and available executor, keeps track of all the connections between tasks and executors, and communicates the status of tasks and executors to other services.

## Decision

The Roster domain will be its own service.

The Roster service will be a central point in our application. It will contain most of the workflow logic and will communicate with all the different services. Therefore, other services can focus on their business logic and remain largely ignorant of the overall workflow.

The assignment code will change more often than the code of the other services, so splitting the assignment service from the others makes it more deployable.

## Consequences

Having this as its own service will reduce fault tolerance, because the assignment service can become a single point of failure. We can mitigate this risk by implementing (server) replication and/or using event-driven communication with persisted messages. That way, all other services can run independently and the assignment service can recover from a crash. Additionally, we need to ensure a high level of interoperability, since the Roster has to communicate with all other parts of the system.
@@ -1,4 +1,4 @@
-# 12. seperate service for each executor
+# 4. Separate service for each Executor

Date: 2021-11-21

@@ -12,13 +12,13 @@ Executors must receive tasks of different types and execute them. The executors

## Decision

-We will have a seperate service for each type of executor.
+We will have a separate service for each type of executor.

-Firstly, execution time differs significantly between task types. Therefore, having seperate services will allow the executos to scale differently based on their tasks' specific needs.
+Firstly, execution time differs significantly between task types. Therefore, having separate services will allow the executors to scale differently based on their tasks' specific needs.

-Secondly, the systems functioning should not be disrupted in case an Executor fails. Having each type of executor in a seperate service will increase fault tolerance in this regard.
+Secondly, the system's functioning should not be disrupted in case an Executor fails. Having each type of executor in a separate service will increase fault tolerance in this regard.

-Lastly, extensibilty is one of the systems most important non-functional requirement and providers of executors need to be able to easily add executors to the executor pool. These factors are also positively impacted by having seperate services.
+Lastly, evolvability is one of the system's most important non-functional requirements, and providers of executors need to be able to easily add executors to the executor pool. These factors are also positively impacted by having separate services.

There should not be any shared data between the executors. Additionally, there should be no flow of information between them. Thus, there should be no issues due to workflow and data concerns arising from this decision.
@@ -1,20 +0,0 @@

# 4. Separate service for the Task List

Date: 2021-11-21

## Status

Accepted

## Context

Tasks are created in the Task List, and the status of each task (created, assigned, executing, executed) is tracked in the Task List as well. The Task List mainly communicates with the Roster so that tasks can get assigned, and the Roster gives the Task List feedback about the tasks' status.

## Decision

The Task List will be its own service.

The Task List needs to scale based on the number of active users and the intensity of their activity at any time, while the scaling of other parts of the system can be constrained by other factors.

## Consequences

Although having the Task List as its own service might slightly increase the complexity of the system and decrease testability, it also makes the system easier to deploy and protective of its data. However, to ensure that this data is always available and does not get lost, the Task List needs to be able to recover all its data (the entire history of all tasks) in case it goes down.
@@ -1,20 +0,0 @@

# 5. Event driven communication

Date: 2021-10-18

## Status

Superseded by [8. Switch to an event-driven microservices architecture](0008-switch-to-an-event-driven-microservices-architecture.md)

## Context

Services need to be able to communicate with each other. Services need to be scalable, and therefore multiple services will need to receive the same messages. Most of the processes are about responding to events that happen throughout the system.

## Decision

We will mainly use event-driven communication.

## Consequences

Event-driven communication will help us create a system with high scalability and elasticity. Through persisted messages, we will also reach much higher fault tolerance and recoverability.

With event-driven communication, we can only guarantee eventual consistency.
@@ -0,0 +1,26 @@

# 5. Separate service for the Executor Pool

Date: 2021-11-21

## Status

Accepted

## Context

The Executor Pool has to keep track of which Executors are online and what task types they can execute. The Roster keeps track of tasks that need to be executed and assigns Executors to tasks. The Executor Pool functionality could be implemented in a separate service or as a part of the Roster.

## Decision

The Executor Pool will be implemented as a separate service.

Most importantly, the Executor Pool and the Roster need to scale differently. The Roster needs to scale with the number of tasks in the system, while the Executor Pool needs to scale with the number of Executors. As the number of tasks is likely to far exceed the number of Executors, the two services should scale differently.

Another reason why the two should be separate services is that they have quite different responsibilities and reasons to change. On one hand, the Executor Pool manages the Executors, and therefore it will change if we want to add functionality that impacts how Executors log onto the system and how we keep track of them, for example if we want to check whether an executor fulfills some requirements before logging on. On the other hand, the Roster manages the internal execution of tasks.

Lastly, separating the two will improve the fault tolerance of the system. If the Executor Pool goes down, the Roster can keep delegating tasks to internal executors, although it will not be notified about added or removed executors while the Executor Pool is down. This should not be a problem if the Executor Pool is only down for a short amount of time. Similarly, the Executor Pool can continue to keep track of added and removed executors while the Roster is down.

## Consequences

Having the services separate adds latency between an executor being added to the Executor Pool and the Roster being notified about it. However, it is only really critical that the Roster is notified about new Executors that execute a new task type, which should happen relatively rarely. Therefore, a small increase in latency in this workflow should not have a large effect on the overall operation of the system.
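The fault-tolerance argument above relies on the Roster keeping its own view of the executors it has learned about from events. The following Java sketch illustrates that idea with assumed event and class names (not the real TAPAS types): the cached registry is updated purely from ExecutorAdded/ExecutorRemoved events, so the Roster can keep assigning tasks even while the Executor Pool is down.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class ExecutorRegistrySketch {

    record ExecutorAdded(String executorId, String taskType) {}
    record ExecutorRemoved(String executorId) {}

    /** The Roster's cached registry, kept up to date purely from events. */
    static class CachedExecutorRegistry {
        private final Map<String, String> taskTypeByExecutor = new ConcurrentHashMap<>();

        void on(ExecutorAdded event) {
            taskTypeByExecutor.put(event.executorId(), event.taskType());
        }

        void on(ExecutorRemoved event) {
            taskTypeByExecutor.remove(event.executorId());
        }

        /** True if at least one known executor can handle the task type. */
        boolean canExecuteInternally(String taskType) {
            return taskTypeByExecutor.containsValue(taskType);
        }

        Set<String> knownExecutors() {
            return Set.copyOf(taskTypeByExecutor.keySet());
        }
    }

    public static void main(String[] args) {
        CachedExecutorRegistry roster = new CachedExecutorRegistry();
        roster.on(new ExecutorAdded("exec-1", "PRINT"));
        roster.on(new ExecutorAdded("exec-2", "SCAN"));
        roster.on(new ExecutorRemoved("exec-2"));

        // Even if the Executor Pool is temporarily down, the Roster can keep assigning.
        System.out.println("Can print internally: " + roster.canExecuteInternally("PRINT"));
        System.out.println("Known executors: " + roster.knownExecutors());
    }
}
```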
@@ -1,20 +0,0 @@

# 6. One global database or one database per service

Date: 2021-10-18

## Status

Accepted

## Context

We can have one database for all services, or each Microservice can have its own database.

## Decision

Each Microservice will have its own database.

The different services don't need to store a lot of similar data. Therefore, we can have every Microservice handle its own data. This also has the advantage that every Microservice owns its own data and is responsible for it (data ownership, data responsibility).

## Consequences

Having one database per Microservice will lead to eventual consistency. With event-driven communication we can use event-based synchronisation to keep the data in sync between the services, so the individual services don't need to know about each other. To guarantee data consistency we can also use a pattern like sagas.
@@ -0,0 +1,29 @@

# 7. Separate service for Auction House

Date: 2021-11-21

## Status

Accepted

## Context

The Auction House has to launch and manage auctions for tasks that we cannot execute internally. Moreover, it has to subscribe to other Auction Houses and bid on their auctions. The Roster keeps track of tasks that need to be executed and assigns Executors to tasks. Additionally, the Roster passes tasks that cannot be executed internally on to the Auction House. The Auction House functionality can either be implemented as a separate service or as a part of the Roster.

## Decision

The Auction House will be implemented as a separate service.

Most importantly, the Auction House and the Roster need to scale differently. The Roster needs to scale with the number of tasks in the system, as all tasks initially go through the Roster, and we predict that the majority of tasks will be executed internally. The Auction House, by contrast, needs to scale with the number of external auctions. Moreover, the Auction House has a different throughput profile, as it has to communicate synchronously with a number of external clients.

Furthermore, separating the Auction House from the Roster allows us to isolate the Roster from external communication. This improves the security of the TAPAS system, as the Roster is more critical to the functioning of the system than the Auction House. Opening the Roster up to external communication could, for example, expose it to denial-of-service attacks.

Similarly, keeping the two services separate will improve fault tolerance, as either can fail without taking down the other. This is more important for the Roster, as it can keep the main workflow of executing tasks going if the Auction House goes down.

## Consequences

Latency is added to the communication between the two domains. However, this should not have a major impact on the overall functioning of the system.

Data needs to be duplicated, since both services need to keep track of the executors available in our system.

Deleting tasks becomes an even more complex transaction, as we now also need to check the Auction House to see whether the task is being auctioned before deleting it (we don't want to delete a task that another group will then go on to execute). This transaction now involves three services (the Task List, the Roster, and the Auction House). We will most likely need to implement a saga pattern to manage this transaction properly.
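As a rough illustration of the saga mentioned in the last consequence, here is a structural Java sketch. The client interfaces and method names are assumptions rather than the real TAPAS APIs: the saga checks the Roster and the Auction House before the Task List deletes the task, and runs compensating actions in reverse order if a later step fails.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class DeleteTaskSagaSketch {

    // Stand-in clients for the three services involved (assumed method names).
    interface RosterClient   { boolean removeFromQueue(String taskId); void requeue(String taskId); }
    interface AuctionClient  { boolean cancelAuctionIfAny(String taskId); void reopenAuction(String taskId); }
    interface TaskListClient { void deleteTask(String taskId); }

    static class DeleteTaskSaga {
        private final RosterClient roster;
        private final AuctionClient auctionHouse;
        private final TaskListClient taskList;

        DeleteTaskSaga(RosterClient roster, AuctionClient auctionHouse, TaskListClient taskList) {
            this.roster = roster;
            this.auctionHouse = auctionHouse;
            this.taskList = taskList;
        }

        boolean delete(String taskId) {
            Deque<Runnable> compensations = new ArrayDeque<>();
            try {
                if (!roster.removeFromQueue(taskId)) return false;        // task already executing
                compensations.push(() -> roster.requeue(taskId));

                if (!auctionHouse.cancelAuctionIfAny(taskId)) return failAndCompensate(compensations);
                compensations.push(() -> auctionHouse.reopenAuction(taskId));

                taskList.deleteTask(taskId);                              // final, non-compensated step
                return true;
            } catch (RuntimeException e) {
                return failAndCompensate(compensations);
            }
        }

        private boolean failAndCompensate(Deque<Runnable> compensations) {
            compensations.forEach(Runnable::run);                         // undo in reverse order (last pushed first)
            return false;
        }
    }
}
```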
@@ -0,0 +1,23 @@

# 7. Common library for shared code

Date: 2021-11-21

## Status

Accepted

## Context

The numerous services that make up the TAPAS app all have common, non-domain-specific functionality that can either be shared via a common library or be replicated in each service.

## Decision

We will use a common code library for shared code that does not change frequently but, if it were changed, would need to be changed everywhere.

This improves maintainability, as the code only has to be changed in one place. This not only saves time but also reduces the likelihood of forgetting to change it in one service, which could introduce bugs.

## Consequences

Changes in the common code will most likely require multiple services to be redeployed. However, those services would most likely have had to be changed individually and redeployed anyway.

Another consequence is that versioning becomes more complicated.
@@ -1,22 +0,0 @@

# 7. Separate service for Auction House

Date: 2021-11-21

## Status

Accepted

## Context

The Auction House is the service that can connect to other groups' auction houses. If there is a task whose task type does not match that of our executors, the Auction House can start an auction where other groups can bid on doing the task for us. Moreover, it can also bid on other groups' auctions.

## Decision

The Auction House will be its own service.

The Auction House is the only part of our system that has external communication; therefore, it makes sense to have it as its own service, also to guarantee better deployability.

The Auction House does not scale directly with the number of tasks, but only with the proportion that needs external executors. Moreover, there could be limits on the number of auctions that can be started. Therefore, the Auction House scales differently from other services.

Moreover, having the Auction House as its own service also improves the fault tolerance of our system.

## Consequences

Since the Auction House will be a standalone service, we have to make sure that if it goes down, it can recover its data in some way (which auctions it has launched, which auctions it has placed bids on or even won, etc.). Even though the testability and latency of our system might worsen by having a separate service for the Auction House, it makes it much easier to implement different kinds of communication for internal and external traffic.
@@ -0,0 +1,29 @@

# 8. Executor base library

Date: 2021-11-21

## Status

Accepted

## Context

According to the project requirements, Executors can be developed by other teams within the organisation. All executors use the same logic to communicate with other services, which means that their code bases are near identical except for the code that implements their specific task execution. The executors' shared logic can either be implemented in a shared library or replicated across all executors.

## Decision

We will implement a shared library for common Executor functionality.

Since all executors use the same logic to communicate with other services, any change to this logic would have to be made for every executor. Having this shared logic in a separate library makes it easy to change the common logic in one place. The code sharing happens at compile time, which reduces the chance of runtime errors compared to other code sharing approaches.

Additionally, if other teams need to create executors, they can just reference the executor-base library and implement the actual execution part. Therefore, they don't need to worry about the connection implementations to the TAPAS system.

## Consequences

Using a shared library will increase the complexity of the executors.

Changes in the common code will most likely require multiple services to be redeployed. However, those services would most likely have had to be changed individually and redeployed anyway.

Another consequence is that versioning becomes more complicated.

Lastly, we have to make sure that we don't become over-reliant on everyone using this library to communicate with the TAPAS system. Future IoT Executors might want to use a more lightweight way to communicate, so we will have to keep this in mind.
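A minimal Java sketch of the idea (class and method names are assumptions, not the actual executor-base API): the base class owns the generic register/execute/report flow, and a concrete executor only supplies the task execution itself.

```java
public class ExecutorBaseSketch {

    /** Shared logic that would live in the executor-base library. */
    static abstract class BaseExecutor {
        private final String taskType;

        protected BaseExecutor(String taskType) {
            this.taskType = taskType;
        }

        /** Generic flow shared by every executor: register, execute, report back. */
        public final String handle(String taskId, String input) {
            register();                          // e.g. announce itself to the Executor Pool
            String result = executeTask(input);  // the only part a team has to write
            reportResult(taskId, result);        // e.g. notify the Roster
            return result;
        }

        /** The single extension point for concrete executors. */
        protected abstract String executeTask(String input);

        private void register() {
            System.out.println("Registering executor for task type " + taskType);
        }

        private void reportResult(String taskId, String result) {
            System.out.println("Task " + taskId + " finished: " + result);
        }
    }

    /** A team-specific executor: only the domain logic is implemented here. */
    static class UppercaseExecutor extends BaseExecutor {
        UppercaseExecutor() { super("UPPERCASE"); }

        @Override
        protected String executeTask(String input) {
            return input.toUpperCase();
        }
    }

    public static void main(String[] args) {
        new UppercaseExecutor().handle("task-7", "hello tapas");
    }
}
```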
@@ -1,26 +0,0 @@

# 8. Switch to an event-driven microservices architecture

Date: 2021-11-21

## Status

Proposed

Supersedes [5. Event driven communication](0005-event-driven-communication.md) TODO Fix this. Should only supersede it if this has been accepted. This should also supersede 0013 - Microservice Architecture if accepted.

## Context

Our TAPAS app is currently implemented based on a microservice architecture, where the services communicate synchronously via request-response. Each service encapsulates a different bounded context with different functional and non-functional requirements. Internal communication could also be done using asynchronous or event-driven communication.

## Decision

Pros:

Scalability: Different services within the TAPAS app are not always able to scale at the same rate. For example, we could have thousands of users adding printing tasks at the same time, but maybe we only have one printer. In this scenario we might want to scale up the task-list service to handle the creation load, but scaling up the printing executor operates on a different time scale (i.e. adding a printer takes time). Moreover, we could have a lot of new tasks coming in, most of which can be executed internally. In this case we want to be able to scale up the task list but might not need to scale up the auction house. Event-driven communication would decrease the coupling of services. Consequently, the scalability of individual services would be enhanced as they no longer depend on the scalability of other services. This improves the app's overall scalability. Since scalability is one of the system's top 3 -ilities, this seems quite important.

Fault tolerance: Another of the system's top 3 -ilities is fault tolerance. We could have highly unstable IoT executors that fail often. This should not disrupt the system's overall operation. The decoupling facilitated by event-driven, asynchronous communication ensures that when individual services go down, the impact on other services is limited, and once they come back up they can recover the system's state from persisted messages.

Cons:

Error handling, workflow control, and event timing:

The aforementioned topics outline the drawbacks of event-driven architecture. These drawbacks can be mitigated by using an orchestrator (like we currently do with the roster) to orchestrate the assignment of tasks, the auctioning off of tasks, and error handling when executors fail. More research needed.

## Consequences

Consequences to be determined, but they would relate to the three concepts mentioned as cons.
@@ -1,19 +0,0 @@

# 9. Common library for shared code

Date: 2021-11-21

## Status

Accepted

## Context

The numerous services that make up the TAPAS app all have common, non-domain-specific functionality that can either be shared via a common library or be replicated in each service.

## Decision

Use a common code library for shared code that does not change frequently but, if it were changed, would need to be changed everywhere.

## Consequences

Changes in the common code will most likely require multiple services to be redeployed. However, those services would most likely have had to be changed individually and redeployed anyway. Another consequence is that versioning becomes more complicated.
@@ -0,0 +1,21 @@

# 9. Separation of common and executor-base library

Date: 2021-11-21

## Status

Accepted

## Context

We have two code-sharing libraries: the executor-base and the common library. The executor-base implements shared logic that all executors need but other services don't. The common library has much more wide-reaching implementations, such as the SelfValidating class. These could form a single common library or two separate libraries.

## Decision

There will be a separate library each for common and executor-base.

The libraries share different types of code and have different reasons to change. It would not make sense to pull the executors' shared code into every other service that only needs access to the other shared code. Services that use the code in the common library should not need to depend in any way on the executor-base.

## Consequences

Changes impact fewer services. However, this decision increases the number of service dependencies and therefore the complexity of managing those dependencies.
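As an example of the kind of code the common library holds, here is a sketch of a SelfValidating base class in the widely used Bean Validation style. This is an assumption about the implementation rather than the exact TAPAS class, and it requires a Jakarta Bean Validation provider (e.g. Hibernate Validator) on the classpath.

```java
import jakarta.validation.ConstraintViolation;
import jakarta.validation.ConstraintViolationException;
import jakarta.validation.Validation;
import jakarta.validation.Validator;

import java.util.Set;

/** Base class for value objects/commands that validate themselves on construction. */
public abstract class SelfValidating<T> {

    private static final Validator VALIDATOR =
            Validation.buildDefaultValidatorFactory().getValidator();

    /** Subclasses call this at the end of their constructor. */
    @SuppressWarnings("unchecked")
    protected void validateSelf() {
        Set<ConstraintViolation<T>> violations = VALIDATOR.validate((T) this);
        if (!violations.isEmpty()) {
            throw new ConstraintViolationException(violations);
        }
    }
}
```

A command or value object in any service would then extend SelfValidating of its own type, annotate its fields (e.g. with @NotNull), and call validateSelf() at the end of its constructor.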
@@ -1,20 +0,0 @@

# 10. Executor base library

Date: 2021-11-21

## Status

Accepted

## Context

Executors all use the same logic to communicate with other services. This means that their code bases are near identical except for the code that implements their specific task execution. The executors' shared logic can either be implemented in a shared library or replicated across all executors.

## Decision

Since all executors use the same logic to communicate with other services, any change to this logic would have to be made for every executor. We will therefore use a shared library for executors, called executor-base. The library includes all the shared logic that every executor needs.

Having this shared logic in a separate library makes it easy to change the common logic in one place. The code sharing happens at compile time, which reduces the chance of runtime errors compared to other code sharing approaches. If other people need to create executors, they can just reference the executor-base library and implement the actual execution part. Therefore, they don't need to worry about the connection implementations to the TAPAS system.

## Consequences

It becomes easier to change the way that the executors communicate with the rest of the system. Moreover, changes are less risky, as they only need to be implemented once. It also becomes easier for other teams within the organisation to create executors, as they can use the executor-base to implement the shared logic. However, using a shared library will increase the complexity of the executors. There also needs to be a clear approach to versioning. Lastly, by using this library we might be making assumptions about future executors that might not hold. For example, if we want to create more lightweight executors for IoT devices, we might need to create a separate base package (if the current one becomes too fat) so that those executors can stay lightweight and don't carry unused code.
@@ -0,0 +1,23 @@

# 10. Single ownership for services

Date: 2021-10-18

## Status

Accepted

## Context

There are generally three options for persisting data: single ownership, joint ownership, or common ownership.

## Decision

We will use single ownership for all our databases. That is, each domain/service that persists data will have its own database of which that service is the sole owner. Any service that wants to write data to or read data from a database other than its own will have to go through the database's owner.

Single ownership preserves the bounded contexts and allows each service to be its own architectural quantum. This keeps the services decoupled and therefore also lets us decouple the scope of the architectural characteristics of each bounded context. For example, we can more easily scale each service up and down, because we can also scale the databases (size, performance) for each service independently.

## Consequences

Having a distributed architecture and asynchronous communication along with single ownership will force us to rely on eventual consistency.

Moreover, this decision could negatively impact performance, as it adds latency to data access across bounded contexts.
TAPAS-Final/doc/architecture/decisions/0011-data-access.md (new file, 21 lines)
@@ -0,0 +1,21 @@

# 11. Data access

Date: 2021-10-18

## Status

Accepted

## Context

Services can generally access data that they do not own via the Interservice Communication Pattern, the Column Schema Replication Pattern, the Replicated Caching Pattern, or the Data Domain Pattern.

## Decision

Data access will follow the Interservice Communication Pattern.

The information needed to process any given event in our system is almost always included in the event itself or already cached in the service from a previous event (e.g. the Executor registry being cached in the Roster and Auction House via new/removed executor events). Therefore, there are very few occasions where access to data from other services is needed. Since the Interservice Communication Pattern is the simplest, we will go with it.

## Consequences

There will be performance and fault tolerance issues when we need to access data from other services, but since this does not occur often, it will not have a significant effect on the system overall.
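For the rare case where a service does need data it does not own, the Interservice Communication Pattern boils down to calling the owning service's API instead of reading its database. Below is a minimal Java sketch; the endpoint path and port are hypothetical, not the real TAPAS URLs.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class InterserviceCommunicationSketch {

    private static final HttpClient CLIENT = HttpClient.newHttpClient();

    /** E.g. the Auction House asking the Task List (the data owner) for a task's status. */
    static String fetchTaskStatus(String taskListBaseUrl, String taskId)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(taskListBaseUrl + "/tasks/" + taskId))
                .header("Accept", "application/json")
                .GET()
                .build();

        HttpResponse<String> response = CLIENT.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Task List returned HTTP " + response.statusCode());
        }
        return response.body();   // e.g. {"taskId":"task-42","status":"EXECUTED"}
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical local deployment of the Task List service; fails if nothing is listening.
        System.out.println(fetchTaskStatus("http://localhost:8081", "task-42"));
    }
}
```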
@@ -1,19 +0,0 @@

# 11. Separation of common and executor-base library

Date: 2021-11-21

## Status

Accepted

## Context

We have two code-sharing libraries: the executor-base and the common library. The executor-base implements shared logic that all executors need but other services don't. The common library has much more wide-reaching implementations, such as the SelfValidating class. These could form a single common library or two separate libraries.

## Decision

There will be a separate library each for common and executor-base. The libraries share different types of code and have different reasons to change. It would not make sense to pull the executors' shared code into every other service that only needs access to the other shared code. Services that use the code in the common library should not need to depend in any way on the executor-base.

## Consequences

Changes impact fewer services. However, this decision increases the number of service dependencies and therefore the complexity of managing those dependencies.
@@ -0,0 +1,27 @@

# 12. Separate service for Crawler

Date: 2021-10-18

## Status

Accepted

## Context

The Auction House Discovery Crawler (for simplicity referred to as the Crawler) continuously crawls the external auction houses that we already know about to try to discover new auction houses. It maintains the information we have about other auction houses and notifies our Auction House when it has discovered new information. The Crawler can either be a part of the Auction House service or be a standalone service.

## Decision

The Crawler will be implemented as a standalone service.

The most important reason for this is that it has different responsibilities from the Auction House. The Crawler has to continuously crawl for new information, while the Auction House only needs to launch auctions for internal tasks and bid on external auctions.

Additionally, the services might have to scale differently depending on how rapidly we want to crawl new auction house information.

## Consequences

We will have to duplicate some code and data, as the auction house information will be duplicated across both services (although the business logic for updating the information based on its timestamp will reside solely in the Crawler).

There will be added latency in the communication between the two domains. However, since this information is not overly time sensitive (a few milliseconds should not matter), this should not impact overall system performance.
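The timestamp-based update rule mentioned in the consequences could look roughly like the following Java sketch (names and structure are assumptions): the Crawler only overwrites what it knows about an auction house when the crawled information is newer, and reports whether the Auction House should be notified.

```java
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class AuctionHouseCrawlerSketch {

    record AuctionHouseInfo(String endpoint, Instant lastSeen) {}

    static class CrawledRegistry {
        private final Map<String, AuctionHouseInfo> byEndpoint = new ConcurrentHashMap<>();

        /** Returns true if the entry was new or newer, i.e. the Auction House should be notified. */
        boolean update(AuctionHouseInfo crawled) {
            AuctionHouseInfo previous = byEndpoint.get(crawled.endpoint());
            if (previous != null && !crawled.lastSeen().isAfter(previous.lastSeen())) {
                return false;   // stale or already known: keep the existing entry
            }
            byEndpoint.put(crawled.endpoint(), crawled);
            return true;
        }
    }

    public static void main(String[] args) {
        CrawledRegistry registry = new CrawledRegistry();
        Instant now = Instant.now();

        // First sighting is news; an older timestamp for the same endpoint is not.
        System.out.println(registry.update(new AuctionHouseInfo("https://group2.example/auctions", now)));
        System.out.println(registry.update(new AuctionHouseInfo("https://group2.example/auctions", now.minusSeconds(60))));
    }
}
```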
@@ -1,27 +0,0 @@

# 12. Separate service for each executor

Date: 2021-11-21

## Status

Accepted

## Context

Executors must receive tasks of different types and execute them. The executors could either all be implemented within one service or as multiple services, one for each type of executor.

## Decision

We will have a separate service for each type of executor.

Firstly, execution time differs significantly between task types. Therefore, having separate services will allow the executors to scale differently based on their tasks' specific needs.

Secondly, the system's functioning should not be disrupted in case an Executor fails. Having each type of executor in a separate service will increase fault tolerance in this regard.

Lastly, extensibility is one of the system's most important non-functional requirements, and providers of executors need to be able to easily add executors to the executor pool. These factors are also positively impacted by having separate services.

There should not be any shared data between the executors. Additionally, there should be no flow of information between them. Thus, there should be no issues due to workflow and data concerns arising from this decision.

## Consequences

Executors share a lot of functionality when it comes to connecting to the rest of the system. Therefore, this decision means that we will either have to duplicate the code that implements the common functionality, or we have to have a way to share the code (e.g. through a common library).
@@ -1,23 +0,0 @@

# 13. Microservice architecture

Date: 2021-12-02

## Status

Accepted

## Context

The system is made up of five distinct bounded contexts, namely the Task Domain, the Roster Domain, the Executor Pool Domain, the Executor Domain, and the Auction Domain. The way that these bounded contexts function together to fulfil the system requirements can be based on many different architectures. (Feedback needed: should we name specific 'next-best' alternative architectures to compare, or just leave it as is, since technically all architectures were considered?)

## Decision

The system will follow the Microservice architecture.

Scalability and fault tolerance are two of the system's top 3 -ilities. Moreover, elasticity and evolvability are two of the system's other main -ilities. These are all non-functional requirements at which the Microservice architecture excels.

We do not expect to have a single monolithic database, so this is not a concern.

## Consequences

There is a considerable amount of communication between bounded contexts. This could cause responsiveness and performance issues due to added latency. This could therefore mean we would need to use asynchronous REST calls or publish-subscribe communication to mitigate these issues, as much of the communication does not have to be synchronous.
@@ -0,0 +1,21 @@

# 13. Use Hypermedia APIs when possible

Date: 2021-10-18

## Status

Accepted

## Context

When Executors need to communicate with external devices, they can either hard-code the API requests or discover them at run time via semantic hypermedia.

## Decision

We will discover the requests at run time via semantic Hypermedia APIs.

This decouples the Executors that offer physical services from the Web-enabled devices they use. It makes these Executors more resilient to change in the devices they use and therefore less likely to experience problems over time as the device APIs change.

## Consequences

The implementation of run-time discovery is more complex, at least until the team has gained more experience in this domain.
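To illustrate what run-time discovery means in practice, here is a hedged Java sketch using the Jackson library. The JSON shape (a links array with rel and href entries) is purely illustrative; a real executor would follow whatever hypermedia format the device actually exposes (e.g. a W3C Thing Description).

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.util.Optional;

public class HypermediaDiscoverySketch {

    private static final ObjectMapper MAPPER = new ObjectMapper();

    /** Find the URI for a given link relation in a hypermedia representation. */
    static Optional<String> discover(String hypermediaJson, String relation) throws Exception {
        JsonNode root = MAPPER.readTree(hypermediaJson);
        for (JsonNode link : root.path("links")) {
            if (relation.equals(link.path("rel").asText())) {
                return Optional.of(link.path("href").asText());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical device description fetched from the device's entry point.
        String body = """
                { "links": [
                    { "rel": "toggle", "href": "https://device.example/lamp/toggle" },
                    { "rel": "status", "href": "https://device.example/lamp/status" }
                ] }
                """;

        // The executor is not coupled to the exact URI; it looks it up by relation at run time.
        discover(body, "toggle").ifPresent(uri -> System.out.println("POST to " + uri));
    }
}
```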
@@ -1,19 +0,0 @@

# 14. data ownership

Date: 2021-12-02

## Status

Accepted

## Context

The issue motivating this decision, and any context that influences or constrains the decision.

## Decision

The change that we're proposing or have agreed to implement.

## Consequences

What becomes easier or more difficult to do and any risks introduced by the change that will need to be mitigated.
@@ -0,0 +1,27 @@

# 13. Use Choreography for the Task Execution Workflow

Date: 2021-12-10

## Status

Accepted

## Context

The Task Execution Workflow (the workflow that takes a task from created to executed) is the main workflow in the whole TAPAS system. It can either be orchestrated or choreographed. If it is choreographed, it should follow one of the following patterns: the Front Controller Pattern, Stateless choreography, or Stamp Coupling.

## Decision

The Task Execution Workflow will be choreographed and will follow the Front Controller Pattern, storing the workflow state in the Task List.

The reason it should be choreographed is that this optimises for responsiveness, scalability, and fault tolerance, which are all driving architectural characteristics of our system.

Moreover, the workflow should use the Front Controller Pattern because this pattern fits the architecture naturally. The tasks in the Task List already have a Status property, which can be used to keep track of the state of a task within the execution workflow. Since updating this state is already part of the architecture, the system will not suffer greatly from the usual drawbacks of this pattern (an additional state property and increased communication overhead).

Using this pattern makes querying the state trivial, since we just need to look up the status of the task in the Task List. Moreover, it makes the Task List a pseudo-orchestrator within our choreography, which reduces the complexity of error handling, as much of it can be added to the Task List. Lastly, since the Task List is accessible to the users who created the tasks, they can make the ultimate decisions for error handling. For example, when no executor has been found for a task (neither internal nor external), the end user could either delete the task or ask for it to be re-run through the Roster and Auction House at a later date.

## Consequences

Making the Task Execution Workflow choreographed will make error handling more complex. For now it seems manageable, but if the complexity becomes too great we could implement the Workflow Event Pattern to mitigate it.

Recoverability and state management might also suffer, but a good implementation of the Front Controller Pattern should mitigate most of the negatives in those areas.
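A small Java sketch of the Front Controller idea (the FAILED status and the exact transition table are assumptions; the created/assigned/executing/executed statuses come from the earlier Task List ADR): the Task List owns each task's status and only advances it along allowed transitions as status events arrive from the choreographed services.

```java
import java.util.Map;
import java.util.Set;

public class TaskWorkflowStateSketch {

    enum Status { CREATED, ASSIGNED, EXECUTING, EXECUTED, FAILED }

    /** Allowed transitions; anything else indicates a mis-ordered or lost event. */
    private static final Map<Status, Set<Status>> ALLOWED = Map.of(
            Status.CREATED,   Set.of(Status.ASSIGNED, Status.FAILED),
            Status.ASSIGNED,  Set.of(Status.EXECUTING, Status.FAILED),
            Status.EXECUTING, Set.of(Status.EXECUTED, Status.FAILED),
            Status.EXECUTED,  Set.of(),
            Status.FAILED,    Set.of(Status.CREATED)   // e.g. the user asks for a re-run
    );

    static class Task {
        private final String id;
        private Status status = Status.CREATED;

        Task(String id) { this.id = id; }

        /** Called by the Task List when it receives a status event from another service. */
        void advanceTo(Status next) {
            if (!ALLOWED.get(status).contains(next)) {
                throw new IllegalStateException(id + ": cannot go from " + status + " to " + next);
            }
            status = next;
        }

        Status status() { return status; }
    }

    public static void main(String[] args) {
        Task task = new Task("task-42");
        task.advanceTo(Status.ASSIGNED);
        task.advanceTo(Status.EXECUTING);
        task.advanceTo(Status.EXECUTED);
        System.out.println("Final state queried from the Task List: " + task.status());
    }
}
```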