RECAP targets the operation of large-scale applications in large-scale, geographically distributed infrastructures. As such, RECAP components need to operate in distributed regions of an infrastructure, which makes the RECAP tooling itself a distributed application. This document introduces the RECAP architecture, the RECAP integration strategy, and the RECAP models that connect the different parts of the architecture. User models comprise locality, behaviour, and user-specific meta-data. Application models consist of the deployment description and the compositional relationships. Workload models capture information on application utilisation. Infrastructure models capture current information at the physical, virtual, and application levels. Finally, load translation models express how resource demands at the application level map to resource consumption at the physical layer.
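To make the relationship between the five model types more concrete, the following minimal sketch expresses them as plain Python data classes. It is an illustration only and not the RECAP schema: every class name, field name, and value is hypothetical, and the linear mapping from request rate to CPU cores in the load translation model is an assumption chosen for simplicity.

```python
from dataclasses import dataclass, field


@dataclass
class UserModel:
    locality: str                      # hypothetical region identifier
    behaviour: str                     # hypothetical usage-pattern label
    metadata: dict = field(default_factory=dict)


@dataclass
class ApplicationModel:
    deployment: dict                   # component name -> hosting region
    composition: list                  # (caller, callee) pairs between components


@dataclass
class WorkloadModel:
    requests_per_second: dict          # component name -> observed request rate


@dataclass
class InfrastructureModel:
    cpu_cores: dict                    # region -> available physical cores


@dataclass
class LoadTranslationModel:
    # Assumption: demand grows linearly with the request rate.
    cores_per_request: dict            # component name -> cores per request/s

    def physical_demand(self, app, workload):
        """Sum the CPU demand per region implied by the application workload."""
        demand = {}
        for component, region in app.deployment.items():
            rate = workload.requests_per_second.get(component, 0.0)
            cores = rate * self.cores_per_request.get(component, 0.0)
            demand[region] = demand.get(region, 0.0) + cores
        return demand


# Hypothetical example values.
user = UserModel(locality="edge-1", behaviour="bursty")
app = ApplicationModel(
    deployment={"frontend": "edge-1", "database": "core-1"},
    composition=[("frontend", "database")],
)
workload = WorkloadModel(requests_per_second={"frontend": 200.0, "database": 50.0})
infrastructure = InfrastructureModel(cpu_cores={"edge-1": 4, "core-1": 16})
translation = LoadTranslationModel(cores_per_request={"frontend": 0.01, "database": 0.04})

demand = translation.physical_demand(app, workload)
```

In this sketch the load translation model is the only type with behaviour: it combines the application and workload models into a per-region demand that could then be compared against the capacities held in the infrastructure model.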
RECAP supports three execution modes: (i) run-time operational mode, (ii) simulation and planning mode, and (iii) data analytics and machine learning mode. The run-time mode provides the functionality needed to manage a physical distributed infrastructure and its applications. This includes the collection of monitoring data from the live system, but also the adaptation of applications to changed environmental conditions. The simulation and planning mode helps operators run what-if scenarios or plan the enhancement of their infrastructures. When setting up simulations, the simulator can make use of the data collected from the live system. Finally, the data analytics and machine learning mode can be used to perform intensive analysis of the collected monitoring data in order to improve insight into the infrastructure, its applications, and its configuration.
From an architectural point of view, three major building blocks, so-called sub-systems, can be found in the RECAP system. The Infrastructure Modelling and Monitoring sub-system is responsible for capturing monitoring data from the infrastructure up to the application. Further, it is tasked with gathering information about the structure of the current infrastructure and application landscape; and finally, it provides access to the collected data, be it for other sub-systems, for operators through visualisation, or for data scientists for bulk access to the collected data.
The Infrastructure Modelling and Monitoring sub-system is the main data source for the Data Analytics and Machine Learning sub-system. The components of the latter acquire the data and then process and analyse it, which may lead to general human-interpretable insights or, when machine learning techniques come into play, to trained models that can be used by the optimisation sub-system.
RECAP's optimisation module supports optimisation steps at both the infrastructure and the application level. Both optimisers base their decisions on live information provided by the monitoring and infrastructure modelling sub-system. In order to avoid conflicting decisions, the Optimisation Orchestrator mediates between the two before passing optimisation steps on to enactment.
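The mediation step can be pictured with a small sketch. The source does not state how the Optimisation Orchestrator resolves conflicts, so the rule used here (two proposals conflict when they address the same target, and the higher priority wins) is purely an assumption, as are the names `Proposal` and `mediate` and all example values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    source: str      # "application" or "infrastructure" optimiser
    target: str      # resource the step would change (hypothetical identifier)
    action: str      # hypothetical action name, e.g. "scale_out"
    priority: int    # assumed tie-breaking scheme: higher value wins


def mediate(app_proposals, infra_proposals):
    """Keep at most one proposal per target, preferring the higher priority.

    On equal priority the proposal seen first (application level) is kept.
    """
    chosen = {}
    for proposal in list(app_proposals) + list(infra_proposals):
        current = chosen.get(proposal.target)
        if current is None or proposal.priority > current.priority:
            chosen[proposal.target] = proposal
    return list(chosen.values())


# Hypothetical proposals: both optimisers want to change "node-7".
app_steps = [Proposal("application", "node-7", "scale_out", priority=2)]
infra_steps = [
    Proposal("infrastructure", "node-7", "power_down", priority=1),
    Proposal("infrastructure", "node-3", "migrate_vm", priority=1),
]

to_enact = mediate(app_steps, infra_steps)
```

Here the conflicting `power_down` step for `node-7` is dropped, while the unrelated `node-3` step passes through unchanged. A real orchestrator would likely need a richer notion of conflict than a shared target.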
Finally, the simulation and planning sub-system implements the simulation and planning mode. Its main task is to support the validation of RECAP, but also to exercise what-if scenarios and to support planning the evolution of large-scale infrastructure.
For integration, RECAP targets loose coupling between the components and borrows methods from the DevOps paradigm, particularly Continuous Delivery and containerisation.