This thesis documents the implementation of a 24/7 high-availability solution for a database-backed Java server application. High availability here does not mean that the service can never fail.
The document first explains what dependable computing, availability, and reliability are, which kinds and levels of system errors influence them, what exactly a system failure is, and when a system can be said to have failed. It also looks at how failures can be prevented at different levels, and illustrates with examples how to apply these definitions during the design phase of a project.
The first design step was to define what counts as a failure of the system according to the specifications and other requirements. An equally early step was to analyze the current system architecture and identify the fault modes and errors that could prevent the system from reaching the required SLA. After the fault mode analysis, the document lists the available solutions as high-level plans and models; at this stage it does not yet name any concrete technologies for the implementation. The purpose of these high-level plans is to gather the pros and cons of each solution, compare them, and make it possible to choose, early in the project, a solution that satisfies the different expectations placed on the system. Once the solution has been chosen, the new system is described with system plans and models that include concrete high-availability technologies (such as VCS and F5), showing how the new system will be implemented, how it will work, and what its configuration requirements are. The solution plan also covers the physical placement of the new components, which is meant to keep the system reliable even in the event of a natural disaster.
After the implementation is complete, the system is evaluated and some options for its further development are given.