
PUBLICATIONS

SAT-IoT: An Architectural Model for a High-Performance Fog/Edge/Cloud IoT Platform

Authors:

Miguel Angel López Peña, Isabel Muñoz Fernández

Abstract:

Current IoT standards do not sufficiently detail some important and emerging aspects such as Fog/Edge computing support, IoT computation topology management, or IoT visualization systems. This work defines three new concepts: a) the paradigm of edge/cloud computing transparency, which lets the computation nodes change dynamically without administrator intervention; b) IoT computing topology management, which gives a global view of an IoT system, from the hardware and communication infrastructures to the software deployed on them; and c) the automation and integration of IoT visualization systems for real-time visualization of data, the current IoT topology, and the current paths of data flows. A new architectural model is also defined that includes these concepts and covers other IoT demands, such as security safeguard services based on Blockchain. This architectural model definition is taken as the basis for developing a new advanced IoT platform referred to as SAT-IoT.

Self-service Cybersecurity Monitoring as Enabler for DevSecOps

Authors:

Jessica Diaz, Jorge E. Pérez, Miguel A. Lopez-Peña, Gabriel A. Mena, Agustín Yagüe

Abstract:

Current IoT systems are highly distributed systems that integrate cloud, edge and fog computing approaches depending on where intelligence and processing capabilities are allocated. This distribution and heterogeneity make development and deployment pipelines very complex and fragmented, with multiple delivery endpoints above the hardware. This prevents rapid development and makes the operation and monitoring of production systems, including cybersecurity event monitoring, a difficult and tedious task. DevSecOps can be defined as a cultural approach to improving and accelerating the delivery of business value by making dev/sec/ops teams’ collaboration effective. This paper focuses on self-service cybersecurity monitoring as an enabler for introducing security practices into a DevOps environment. To that end, we have defined and formalized an activity that supports ‘Fast and Continuous Feedback from Ops to Dev’ by providing a flexible monitoring infrastructure so that teams can configure their monitoring and alerting services according to their own criteria (you build, you run, and now you monitor), obtaining fast and continuous feedback from operations and thus better anticipating problems when a production deployment is performed. This activity has been formalized using the OMG Software & Systems Process Engineering Metamodel, and its instantiation is described through a case study that shows the versioned and repeatable configuration of a cybersecurity monitoring infrastructure (Monitoring as Code) through virtualization and containerization technology. This self-service monitoring/alerting breaks down silos between dev, ops, and sec teams by opening access to key security metrics, which enables a sharing culture and continuous improvement.
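
The "Monitoring as Code" idea mentioned above can be illustrated with a minimal sketch: alerting rules are kept as a versioned artifact that teams edit and a pipeline renders into the configuration consumed by the monitoring stack. The rule names, thresholds, and file path below are illustrative assumptions, not the configuration format used in the paper.

```python
# Hedged sketch of "Monitoring as Code": alert rules defined as data, written to a file
# that lives in version control and is applied by an automated pipeline. All names,
# metrics, and thresholds here are hypothetical examples.
import json
from pathlib import Path

alert_rules = [
    {"metric": "failed_logins_per_min", "condition": ">", "threshold": 20, "notify": "sec-team"},
    {"metric": "p95_response_time_ms",  "condition": ">", "threshold": 500, "notify": "dev-team"},
]

# Keeping this file under version control makes monitoring configuration repeatable
# and reviewable, like any other code change.
Path("monitoring").mkdir(exist_ok=True)
Path("monitoring/alert-rules.json").write_text(json.dumps(alert_rules, indent=2))
print(Path("monitoring/alert-rules.json").read_text())
```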

Machine Learning Methods for Reliable Resource Provisioning in Edge-Cloud Computing: A Survey

Authors:

Thang Le Duc, Rafael García Leiva, Paolo Casari, and Per-Olov Östberg

Abstract:

Large-scale software systems are currently designed as distributed entities and deployed in cloud data centers. To overcome the limitations inherent to this type of deployment, applications are increasingly being supplemented with components instantiated closer to the edges of networks – a paradigm known as edge computing. The problem of how to efficiently orchestrate combined edge-cloud applications is, however, incompletely understood, and a wide range of techniques for resource and application management are currently in use. This paper investigates the problem of reliable resource provisioning in joint edge-cloud environments and surveys technologies, mechanisms, and methods that can be used to improve the reliability of distributed applications in diverse and heterogeneous network environments. Due to the complexity of the problem, special emphasis is placed on solutions to the characterization, management, and control of complex distributed applications using machine learning approaches. The survey is structured around a decomposition of the reliable resource provisioning problem into three categories of techniques: workload characterization and prediction, component placement and system consolidation, and application elasticity and remediation. Survey results are presented along with a problem-oriented discussion of the state of the art. Finally, a summary of identified challenges and an outline of future research directions are presented to conclude the paper.

Mowgli: Finding Your Way in the DBMS Jungle

Authors:

Daniel Seybold, Moritz Keppler, Daniel Gründler, Jörg Domaschka

Abstract:

Big Data and IoT applications require highly scalable database management systems (DBMSs), preferably operated in the cloud to ensure scalability at the resource level as well. As the number of existing distributed DBMSs is extensive, selecting and operating a distributed DBMS in the cloud is a challenging task. While DBMS benchmarking is a supportive approach, existing frameworks do not cope with the runtime constraints of distributed DBMSs and the volatility of cloud environments. Hence, DBMS evaluation frameworks need to consider DBMS runtime and cloud resource constraints to enable portable and reproducible results. In this paper we present Mowgli, a novel evaluation framework that enables the evaluation of non-functional DBMS features in correlation with DBMS runtime and cloud resource constraints. Mowgli fully automates the execution of cloud- and DBMS-agnostic evaluation scenarios, including DBMS cluster adaptations. The evaluation of Mowgli is based on two IoT-driven scenarios, comprising the DBMSs Apache Cassandra and Couchbase, nine DBMS runtime configurations, and two cloud providers with two different storage backends. Mowgli automates the execution of the resulting 102 evaluation scenarios, verifying its support for portable and reproducible DBMS evaluations. The results provide extensive insights into DBMS scalability and the impact of different cloud resources. The significance of the results is validated by correlation with existing DBMS evaluation results.
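
The scenario matrix described above (DBMS engines × runtime configurations × cloud providers × storage backends) can be sketched as a declarative cross product. The dimension names and values below are illustrative assumptions, not Mowgli's actual scenario definition format or its full set of 102 scenarios.

```python
# Hedged sketch: enumerating DBMS evaluation scenarios as the cross product of
# configuration dimensions, in the spirit of Mowgli's automated scenario execution.
# Dimension names and values are illustrative assumptions, not Mowgli's API.
from itertools import product

dbms_engines = ["cassandra", "couchbase"]
cluster_sizes = [1, 3, 5]                       # hypothetical runtime configurations
cloud_providers = ["provider-a", "provider-b"]  # e.g. two IaaS offerings
storage_backends = ["local-ssd", "network-volume"]

scenarios = [
    {"dbms": d, "nodes": n, "cloud": c, "storage": s}
    for d, n, c, s in product(dbms_engines, cluster_sizes, cloud_providers, storage_backends)
]

for scenario in scenarios:
    # A real framework would provision the cluster, run the workload,
    # and collect metrics for this combination here.
    print(scenario)
```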

Unified Container Environments for Scientific Cluster Scenarios

Authors:

Benjamin Schanzel, Mark Leznik, Simon Volpert, Jörg Domaschka, Stefan Wesner

Abstract:

Providing runtime dependencies for computational workflows in shared environments, like HPC clusters, requires appropriate management efforts from users and administrators. Users of a cluster define the software stack required for a workflow to execute successfully, while administrators maintain the mechanisms to offer libraries and applications in different versions and combinations for the users to have maximum flexibility. The Environment Modules system is the tool of choice on bwForCluster BinAC for this purpose. In this paper, we present a solution to execute a workflow which relies on a software stack not available via Environment Modules on BinAC. The paper describes the usage of a containerized, user-defined software stack for this particular problem using the Singularity and Docker container platforms. Additionally, we present a solution for the reproducible provisioning of identical software stacks across HPC and non-HPC environments. The approach uses a Docker image as the basis for a Singularity container. This allows users to define arbitrary software stacks giving them the ability to execute their workflows across different environments, from local workstations to HPC clusters. This approach provides identical versions of software and libraries across all environments.
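The Docker-to-Singularity workflow described above can be sketched as follows; the image name and the workflow command are illustrative assumptions, and the sketch assumes Singularity (or Apptainer) is installed on the target system.

```python
# Hedged sketch: building a Singularity image from a Docker image so that the same
# user-defined software stack runs on a workstation (Docker) and on an HPC cluster
# (Singularity). The base image and the workflow command are illustrative assumptions.
import subprocess

docker_image = "docker://python:3.11-slim"   # hypothetical image containing the required stack
sif_path = "workflow-stack.sif"

# Convert the Docker image into a Singularity image file.
subprocess.run(["singularity", "build", sif_path, docker_image], check=True)

# Execute the workflow inside the container, so library versions match across environments.
subprocess.run(["singularity", "exec", sif_path, "python", "--version"], check=True)
```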

Simulating Fog and Edge Computing Scenarios: An Overview and Research Challenges

Authors:

Sergej Svorobej, Patricia Takako Endo, Malika Bendechache, Christos Filelis-Papadopoulos, Konstantinos M. Giannoutakis, George A. Gravvanis, Dimitrios Tzovaras, James Byrne and Theo Lynn

Abstract:

The fourth industrial revolution heralds a paradigm shift in how people, processes, things, data and networks communicate and connect with each other. Conventional computing infrastructures are struggling to satisfy dramatic growth in demand from a deluge of connected heterogeneous end points located at the edge of networks while, at the same time, meeting quality of service levels. The complexity of computing at the edge makes it increasingly difficult for infrastructure providers to plan for and provision resources to meet this demand. While simulation frameworks are used extensively in the modelling of cloud computing environments in order to test and validate technical solutions, they are at a nascent stage of development and adoption for fog and edge computing. This paper provides an overview of challenges posed by fog and edge computing in relation to simulation.

A Novel Hyperparameter-free Approach to Decision Tree Construction that Avoids Overfitting by Design

Authors:

Rafael Garcia Leiva, Antonio Fernandez Anta, Vincenzo Mancuso, Paolo Casari

Abstract:

Decision trees are an extremely popular machine learning technique. Unfortunately, overfitting in decision trees remains an open issue that sometimes prevents them from achieving good performance. In this work, we present a novel approach to the construction of decision trees that avoids overfitting by design, without losing accuracy. A distinctive feature of our algorithm is that it requires neither the optimization of any hyperparameters nor the use of regularization techniques, thus significantly reducing the decision tree training time. Moreover, our algorithm produces much smaller and shallower trees than traditional algorithms, facilitating the interpretability of the resulting models.
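
The overfitting problem that motivates this work can be illustrated with a standard library implementation. The sketch below is not the authors' algorithm; it is only a scikit-learn baseline showing why hyperparameter control (here, depth limiting) is usually needed to close the gap between training and test accuracy.

```python
# Hedged sketch: a fully grown scikit-learn decision tree typically fits the training
# set almost perfectly while generalizing worse, which is the overfitting issue the
# proposed hyperparameter-free construction is designed to avoid.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

unconstrained = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
shallow = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)

for name, model in [("unconstrained", unconstrained), ("max_depth=3", shallow)]:
    print(name, "train:", model.score(X_train, y_train), "test:", model.score(X_test, y_test))
```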

Adaptive Resource Provisioning based on Application State

Authors:

Constantine Ayimba, Paolo Casari, Vincenzo Mancuso

Abstract:

Infrastructure providers employing Virtual Network Functions (VNFs) in a cloud computing context need to find a balance between optimal resource utilization and adherence to agreed Service Level Agreements (SLAs). Tenants should be allocated as much computing, storage and network capacity as they need in order not to violate SLAs, but no more, so that the infrastructure provider can accommodate more tenants and increase revenue. This paper presents an optimizer VNF that ensures a given virtual machine (VM) is sufficiently utilized before directing traffic to another VM, and an orchestrator VNF that scales the number of VMs up or down as workloads change, thereby limiting the number of active VMs to the minimum that can deliver the service. We set up a testbed to transcode and stream Video on Demand (VoD) as a service. We present experimental results showing that, when the optimizer and orchestrator are used together, they outperform static provisioning in terms of both resource utilization and service response times.
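
A minimal sketch of the kind of threshold-based scale-out/scale-in decision an orchestrator such as the one described above applies is shown below. The thresholds and the utilization model are illustrative assumptions, not the paper's implementation.

```python
# Hedged sketch: decide the target number of VMs for the next control interval from
# current per-VM CPU utilization, so that existing VMs are well utilized before new
# ones are added. Thresholds and the utilization source are illustrative assumptions.
from typing import List

SCALE_OUT_UTIL = 0.80   # add a VM when average utilization exceeds this
SCALE_IN_UTIL = 0.30    # remove a VM when average utilization drops below this
MIN_VMS = 1

def desired_vm_count(utilizations: List[float]) -> int:
    """Return the target number of VMs for the next control interval."""
    if not utilizations:
        return MIN_VMS
    avg = sum(utilizations) / len(utilizations)
    if avg > SCALE_OUT_UTIL:
        return len(utilizations) + 1
    if avg < SCALE_IN_UTIL and len(utilizations) > MIN_VMS:
        return len(utilizations) - 1
    return len(utilizations)

print(desired_vm_count([0.95, 0.90]))  # -> 3, scale out under heavy load
print(desired_vm_count([0.10, 0.15]))  # -> 1, scale in when mostly idle
```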

A Hybrid Fitness-Utility Algorithm for improved service chain placement

Authors:

Radhika Loomba, Thijs Metsch, Leonard Feehan, Joe Butler

Abstract:

Optimal placement of service chains (composed of multiple connected service components) on heterogeneous cloud/fog infrastructure is a challenging problem. Even when deploying a small set of service components, the search space of all possible solutions is quite large. Furthermore, current approaches sacrifice either precision or scalability when presented with heterogeneous platform configurations. In this scenario, the goal is to support orchestration that optimally selects solutions that meet performance constraints, leverages differentiating platform features, and delivers these solutions within a reasonable run-time. This paper presents the novel Hybrid Fitness-Utility Algorithm that addresses these issues by utilizing concepts derived from evolutionary algorithms to reduce dimensional complexity, and by incorporating the utility of placing each individual service component on an infrastructure resource for optimized selection between potential solutions. Our results show that the algorithm succeeds in determining optimal service chain placement onto distributed infrastructure for a simulated cloud/fog scenario and for a live multi point-of-presence OpenStack-based testbed with over 90 percent confidence.
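
A heavily simplified sketch of scoring candidate placements by per-component utility follows. The utility values, capacities, and exhaustive enumeration are illustrative assumptions; the paper's actual hybrid algorithm additionally uses evolutionary operators to prune the search space rather than enumerating it.

```python
# Hedged sketch: rank candidate placements of a service chain onto heterogeneous nodes
# by summing the utility of placing each component on its assigned node, keeping only
# placements that respect capacity constraints. All numbers are illustrative.
from itertools import product

components = {"ingest": 2, "transcode": 4, "cache": 1}          # CPU demand per component
nodes = {"edge-1": 4, "edge-2": 4, "cloud-1": 16}               # CPU capacity per node
utility = {"edge-1": 1.0, "edge-2": 0.9, "cloud-1": 0.5}        # e.g. prefer low-latency edge nodes

def feasible(placement):
    used = {n: 0 for n in nodes}
    for comp, node in placement.items():
        used[node] += components[comp]
    return all(used[n] <= nodes[n] for n in nodes)

candidates = [dict(zip(components, assignment))
              for assignment in product(nodes, repeat=len(components))]
scored = [(sum(utility[n] for n in p.values()), p) for p in candidates if feasible(p)]
best_score, best_placement = max(scored, key=lambda pair: pair[0])
print(best_score, best_placement)
```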

Simulating Large vCDN Networks: A Parallel Approach

Authors:

Christos K. Filelis-Papadopoulos, Konstantinos M. Giannoutakis, George A. Gravvanis, Patricia Takako Endo, Dimitrios Tzovaras, Sergej Svorobej, Theo Lynn

Abstract:

Virtualization and cloud computing are being used by Communication Service Providers to deploy and utilize virtual Content Distribution Networks (vCDNs) to reduce costs and increase elasticity thereby avoiding performance, quality, reliability and availability limitations that characterize traditional CDNs. As cache placement is based on both the content type and geographic location of a user request, it has a significant impact on service delivery and network congestion. To study the effectiveness of cache placements and hierarchical network architectures composed of sites, a novel parallel simulation framework is proposed utilizing a discrete-time approach. Unlike other simulation approaches, the proposed simulation framework can update, in parallel, the state of sites and their resource utilization with respect to incoming requests in a significantly faster manner at hyperscale. It allows for simulations with multiple types of content, different virtual machine distributions, probabilistic caching, and forwarding of requests. In addition, power consumption models allow the estimation of energy consumption of the physical resources that host virtual machines. The results of simulations conducted to assess the performance and applicability of the proposed simulation framework are presented. Results are promising for the potential of this simulation framework in the study of vCDNs and optimization of network infrastructure.
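
The discrete-time update pattern described above can be sketched as follows: every site's next state is computed from a snapshot of the current global state, so the per-site updates are independent and can be dispatched to parallel workers. The site model, arrival counts, and overflow handling are illustrative assumptions, not the framework's actual models.

```python
# Hedged sketch: one discrete-time step updates every site from a snapshot of the
# current state, so per-site updates are independent and can be computed in parallel.
# Capacities, arrivals, and the overflow rule are illustrative assumptions.
from concurrent.futures import ThreadPoolExecutor

sites = [{"id": i, "capacity": 100, "load": 0} for i in range(4)]
arrivals = {0: 40, 1: 120, 2: 10, 3: 60}   # requests arriving at each site in this step

def step_site(site):
    """Compute the next state of one site; overflow would be forwarded to a parent site."""
    incoming = arrivals[site["id"]]
    served = min(incoming, site["capacity"])
    return {**site, "load": served, "forwarded": incoming - served}

with ThreadPoolExecutor() as pool:
    next_state = list(pool.map(step_site, sites))

print(next_state)
```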

Modeling the Availability of an E-Health System Integrated with Edge, Fog and Cloud Infrastructures

Authors:

Matheus Felipe Ferreira da Silva Lisboa Tigre, Guto Leoni Santos, Theo Lynn, Djamel Sadok, Judith Kelner, Patricia Takako Endo

Abstract:

The Internet of Things has the potential to transform health systems through the collection and analysis of patient physiological data via wearable devices and sensor networks. Such systems can offer assisted living services in real time and a range of multimedia-based health services. However, lack of service availability, particularly in the case of emergencies, can lead to adverse outcomes and, in the worst case, death. In this paper, we propose an e-health monitoring architecture based on sensors and cloud and fog infrastructure scenarios. Further, we propose stochastic models to analyze how failures impact the availability of the e-health system. We analyze four different scenarios and, from the results, identify that the sensors and fog devices are the components with the most significant impact on the availability of the entire e-health system in the scenarios analyzed.
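
A minimal worked example of the kind of availability reasoning such stochastic models support is shown below: if the system only works when sensor, fog device and cloud are all up, the steady-state availability is the product of the component availabilities, which is why the weakest components (here the sensor and fog device) dominate. The numeric values are illustrative assumptions, not results from the paper.

```python
# Hedged sketch: steady-state availability of a serial chain of components
# (sensor -> fog device -> cloud). Per-component availabilities are illustrative.
availability = {
    "sensor": 0.990,
    "fog_device": 0.995,
    "cloud": 0.99999,
}

system_availability = 1.0
for component, a in availability.items():
    system_availability *= a

downtime_hours_per_year = (1.0 - system_availability) * 8760
print(f"system availability: {system_availability:.5f}")
print(f"expected downtime: {downtime_hours_per_year:.1f} hours/year")
```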

Integrating IoT + Fog + Cloud Infrastructures: System Modeling and Research Challenges

Authors:

Guto Leoni Santos, Matheus Ferreira, Leylane Ferreira, Judith Kelner, Djamel Sadok, Edison Albuquerque, Theo Lynn, Patricia Takako Endo

Abstract:

Recent years have seen the explosive growth of the Internet of Things (IoT): the internet-connected network of devices that includes everything from personal electronics and home appliances to automobiles and industrial machinery. Responding to the ever-increasing bandwidth demands of the IoT, Fog and Edge computing concepts have developed to collect, analyze, and process data more efficiently than traditional cloud architecture. Fog and Edge Computing: Principles and Paradigms provides a comprehensive overview of the state-of-the-art applications and architectures driving this dynamic field of computing while highlighting potential research directions and emerging technologies.

A Modelling Language for Defining Cloud Simulation Scenarios in RECAP Project Context

Authors:

Cleber Matos de Morais, Patricia Endo, Sergej Svorobej, Theo Lynn

Abstract:

RECAP is a European Union funded project that seeks to develop a next-generation resource management solution, from both technical and business perspectives, for technological solutions spanning the cloud, fog, and edge layers. The RECAP project comprises a set of use cases that present highly complex and scenario-specific requirements, which should be modelled and simulated in order to find optimal solutions for resource management. Due to the characteristics of these use cases, configuring simulation scenarios is a highly time-consuming task and requires staff with specialist expertise.

ALPACA: Application Performance Aware Server Power Capping

Authors:

Jakub Krzywda, Ahmed Ali-Eldin, Eddie Wadbro, Per-Olov Östberg, Erik Elmroth

Abstract:

Server power capping limits the power consumption of a server so that it does not exceed a specific power budget. This allows data center operators to reduce peak power consumption at the cost of performance degradation of hosted applications. Previous work on server power capping rarely considers the Quality-of-Service (QoS) requirements of consolidated services when enforcing the power budget. In this paper, we introduce ALPACA, a framework to reduce QoS violations and overall application performance degradation for consolidated services. ALPACA reduces unnecessarily high power consumption when there is no performance gain, and divides the power among the running services in a way that reduces the overall QoS degradation when power is scarce. We evaluate ALPACA using four applications: MediaWiki, SysBench, Sock Shop, and CloudSuite’s Web Search benchmark. Our experiments show that ALPACA reduces the operational costs of QoS penalties and electricity by up to 40% compared to a non-optimized system.
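
A heavily simplified sketch of dividing a scarce power budget across consolidated services so that overall QoS degradation stays low is shown below. The service list echoes the applications named in the abstract, but the minimum power values, QoS weights, and greedy policy are illustrative assumptions, not ALPACA's actual models.

```python
# Hedged sketch: greedily hand out a limited power budget in small increments to the
# service whose QoS is currently most sensitive per allocated watt. All numbers and
# the degradation model are illustrative assumptions, not ALPACA's models.
budget_watts = 150
increment = 5

services = {
    "mediawiki":  {"min_w": 30, "alloc": 0, "qos_weight": 3.0},
    "sysbench":   {"min_w": 20, "alloc": 0, "qos_weight": 1.0},
    "web-search": {"min_w": 40, "alloc": 0, "qos_weight": 2.0},
}

# First satisfy each service's minimum power, then distribute the remainder.
remaining = budget_watts - sum(s["min_w"] for s in services.values())
for s in services.values():
    s["alloc"] = s["min_w"]
while remaining >= increment:
    neediest = max(services, key=lambda n: services[n]["qos_weight"] / services[n]["alloc"])
    services[neediest]["alloc"] += increment
    remaining -= increment

print({name: s["alloc"] for name, s in services.items()})
```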

Application, Workload, and Infrastructure Models for Virtualized Content Delivery Networks Deployed in Edge Computing Environments

Authors:

Thang Le Duc, Per-Olov Ostberg

Abstract:

Content Delivery Networks (CDNs) handle a large part of the traffic over the Internet and are of growing importance for the management and operation of coming generations of data-intensive applications. This paper addresses the modeling and scaling of content-oriented applications, and presents workload, application, and infrastructure models developed in collaboration with an infrastructure provider operating a large-scale CDN, aimed at improving the performance of content delivery subsystems deployed in wide area networks. It has been shown that leveraging edge resources for the deployment of content caches greatly benefits CDNs. Therefore, the models are described from an edge computing perspective and are intended to be integrated into network-topology-aware application orchestration and resource management systems.

Analyzing the Availability and Performance of an E-Health System Integrated with Edge, Fog and Cloud Infrastructures

Authors:

Guto Leoni Santos, Patricia Takako Endo, Matheus Felipe Ferreira da Silva Lisboa Tigre, Leylane Graziele Ferreira da Silva, Djamel Sadok, Judith Kelner and Theo Lynn

Abstract:

The Internet of Things has the potential to transform health systems through the collection and analysis of patient physiological data via wearable devices and sensor networks. Such systems can offer assisted living services in real time and a range of multimedia-based health services. However, service downtime, particularly in the case of emergencies, can lead to adverse outcomes and, in the worst case, death. In this paper, we propose an e-health monitoring architecture based on sensors that relies on cloud and fog infrastructures to handle and store patient data. Furthermore, we propose stochastic models to analyze the availability and performance of such systems, including models to understand how failures across the Cloud-to-Thing continuum impact e-health system availability and to identify potential bottlenecks. To feed our models with real data, we design and build a prototype and execute performance experiments. Our results identify that the sensors and fog devices are the components with the most significant impact on the availability of the e-health monitoring system as a whole in the scenarios analyzed. Our findings suggest that, in order to identify the best architecture to host the e-health monitoring system, a trade-off between performance and delays must be resolved.

ATMoN: Adapting the “Temporality” in Large-Scale Dynamic Networks

Authors:

Demetris Trihinas, Luis F. Chiroque, George Pallis, Antonio Fernandez Anta, Marios D. Dikaiakos

Abstract:

With the widespread adoption of temporal graphs to study fast-evolving interactions in dynamic networks, attention is needed to provide graph metrics in time and at scale. In this paper, we introduce ATMoN, an open-source library developed to computationally offload graph processing engines and ease the communication overhead in dynamic networks over an unprecedented wealth of data. This is achieved by efficiently adapting, in place and inexpensively, the temporal granularity at which graph metrics are computed, based on runtime knowledge captured by a low-cost probabilistic learning model capable of approximating both the metric stream evolution and the volatility of the graph topology. After a thorough evaluation with real-world data from mobile, face-to-face and vehicular networks, results show that ATMoN is able to reduce the compute overhead by at least 76%, data volume by 60% and overall cloud costs by at least 54%, while always maintaining accuracy above 88%.
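
The underlying idea of adapting the temporal granularity to the observed volatility of a metric stream can be sketched as below: compute graph metrics less often while the stream is stable and more often when it becomes volatile. The volatility estimator, threshold, and period bounds are illustrative assumptions, not ATMoN's probabilistic learning model.

```python
# Hedged sketch: widen the interval between graph-metric computations when the metric
# stream is stable and narrow it when it becomes volatile. The coefficient-of-variation
# estimate and the doubling/halving rule are illustrative assumptions.
from statistics import pstdev

MIN_PERIOD, MAX_PERIOD = 1, 60   # seconds between metric computations

def next_period(recent_values, current_period, threshold=0.05):
    """Increase the period while the stream is stable, shrink it when it is volatile."""
    if len(recent_values) < 2:
        return current_period
    mean = sum(recent_values) / len(recent_values)
    volatility = pstdev(recent_values) / mean if mean else float("inf")
    if volatility < threshold:
        return min(current_period * 2, MAX_PERIOD)
    return max(current_period // 2, MIN_PERIOD)

print(next_period([10.0, 10.1, 9.9, 10.0], current_period=4))   # stable   -> 8
print(next_period([10.0, 18.0, 6.0, 14.0], current_period=8))   # volatile -> 4
```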

Done Yet? A Critical Introspective of the Cloud Management Toolbox

Authors:

Mark Leznik, Simon Volpert, Frank Griesinger, Daniel Seybold, Jörg Domaschka

Abstract:

With the rapid rise of the cloud computing paradigm, the manual maintenance and provisioning of the technological layers behind it, both in their hardware and virtualized forms, have become cumbersome and error-prone. This has opened up the need for automated capacity planning strategies in heterogeneous cloud computing environments. However, even with mechanisms to fully accommodate customers and fulfill service-level agreements, providers often tend to over-provision their hardware and virtual resources. A proliferation of unused capacity leads to higher energy costs and, correspondingly, a higher price for cloud technology services. Capacity planning algorithms rely on data collected from the utilized resources. Yet, the amount of data aggregated through the monitoring of hardware and virtual instances does not allow for manual supervision, much less data analysis or correlation and anomaly detection. Current data science advancements enable efficient automation, scheduling and provisioning of cloud computing resources based on supervised and unsupervised machine learning techniques. In this work, we present the current state of the art in monitoring, storage, analysis and adaptation approaches for the data produced by cloud computing environments, in order to enable proactive, dynamic resource provisioning.

Power-Performance Tradeoffs in Data Center Servers: DVFS, CPU pinning, Horizontal, and Vertical Scaling

Authors:

Jakub Krzywda, Ahmed Ali-Eldin, Trevor E.Carlson, Per-Olov Östberg, Erik Elmroth

Abstract:

Dynamic Voltage and Frequency Scaling (DVFS), CPU pinning, horizontal scaling, and vertical scaling are four techniques that have been proposed as actuators to control the performance and energy consumption of data center servers. This work investigates the utility of these four actuators and quantifies the power-performance tradeoffs associated with them. Using replicas of the German Wikipedia running on our local testbed, we perform a set of experiments to quantify the influence of DVFS, vertical and horizontal scaling, and CPU pinning on end-to-end response time (average and tail), throughput, and power consumption under different workloads. Results of the experiments show that DVFS rarely reduces the power consumption of underloaded servers by more than 5%, but it can be used to limit the maximal power consumption of a saturated server by up to 20% (at the cost of performance degradation). CPU pinning reduces the power consumption of an underloaded server (by up to 7%) at the cost of performance degradation, which can be limited by choosing an appropriate CPU pinning scheme. Horizontal and vertical scaling improve both the average and tail response time, but the improvement is not proportional to the amount of resources added. The load balancing strategy has a big impact on the tail response time of horizontally scaled applications.

Towards understanding HPC users and systems: A NERSC case study

Authors:

Gonzalo P. Rodrigo, P-O Östberg, Erik Elmroth, Katie Antypas, Richard Gerber, Lavanya Ramakrishnan

Abstract:

The high-performance computing (HPC) scheduling landscape currently faces new challenges due to changes in the workload. Previously, HPC centers were dominated by tightly coupled MPI jobs; HPC workloads now increasingly include high-throughput, data-intensive, and stream-processing applications. As a consequence, workloads are becoming more diverse at both the application and job levels, posing new challenges to classical HPC schedulers. There is a need to understand current HPC workloads and their evolution to facilitate informed future scheduling research and enable efficient scheduling in future HPC systems.

Reliable Capacity Provisioning for Distributed Cloud/Edge/Fog Computing Applications

Authors:

P-O Ostberg, James Byrne, Paolo Casari, Philip Eardley, Antonio Fernández Anta, Johan Forsman, John Kennedy, Thang Le Duc, Manuel Noya Mariño, Radhika Loomba, Miguel Angel López Peña, Jose Lopez Veiga, Theo Lynn, Vincenzo Mancuso, Sergej Svorobej, Anders Torneus, Stefan Wesner, Peter Willis, Jörg Domaschka

Abstract:

The REliable CApacity Provisioning and enhanced remediation for distributed cloud applications (RECAP) project aims to advance cloud and edge computing technology, to develop mechanisms for reliable capacity provisioning, and to make application placement, infrastructure management, and capacity provisioning autonomous, predictable and optimized. This paper presents the RECAP vision for an integrated edge-cloud architecture, discusses the scientific foundation of the project, and outlines plans for toolsets for continuous data collection, application performance modeling, application and component auto-scaling and remediation, and deployment optimization. The paper also presents four use cases from complementary fields that will be used to showcase the advancements of RECAP.

A Preliminary Systematic Review of Computer Science Literature on Cloud Computing Research using Open Source Simulation Platforms

Authors:

Theo Lynn, Anna Gourinovitch, James Byrne, PJ Byrne, Sergej Svorobej, Konstantinos Giannoutakis, David Kenny and John Morrison

Abstract:

Research and experimentation on live hyperscale clouds is limited by their scale, complexity, value, and issues of commercial sensitivity. As a result, there has been an increase in the development, adaptation and extension of cloud simulation platforms to enable enterprises, application developers and researchers to undertake both testing and experimentation. While there have been numerous surveys of cloud simulation platforms and their features, few examine how these platforms are being used for research purposes. This paper provides a preliminary systematic review of the literature on this topic, covering 256 papers from 2009 to 2016. The paper aims to provide insights into the current status of cloud computing research using open source cloud simulation platforms. Our two-level analysis scheme includes a descriptive and a synthetic analysis against a highly cited taxonomy of cloud computing. The analysis uncovers some imbalances in research and the need for a more granular and refined taxonomy against which to classify cloud computing research using simulators. The paper can be used to guide literature reviews in the area and identifies potential research opportunities for cloud computing and simulation researchers, complementing extant surveys on cloud simulation platforms.

A Review of Cloud Computing Simulation Platforms and Related Environments

Authors:

James Byrne, Sergej Svorobej, Konstantinos Giannoutakis, Dimitrios Tzovaras, PJ Byrne, P-O Östberg, Anna Gourinovitch, Theo Lynn

Abstract:

Recent years have seen an increasing trend towards the development of Discrete Event Simulation (DES) platforms to support cloud computing related decision making and research. The complexity of cloud environments is increasing with scale and heterogeneity, posing a challenge for the efficient management of cloud applications and data centre resources. The increasing ubiquity of social media, mobile and cloud computing, combined with the Internet of Things and emerging paradigms such as Edge and Fog Computing, is exacerbating this complexity. Given the scale, complexity and commercial sensitivity of hyperscale computing environments, the opportunity for experimentation is limited and requires substantial investment of resources in terms of both time and effort. DES provides a low-risk technique for providing decision support for complex hyperscale computing scenarios. In recent years, there has been a significant increase in the development and extension of tools to support DES for cloud computing, resulting in a wide range of tools which vary in terms of their utility and features. Through a review and analysis of available literature, this paper provides an overview and multi-level feature analysis of 33 DES tools for cloud computing environments. This review updates and extends existing reviews to include not only autonomous simulation platforms, but also plugins and extensions for specific cloud computing use cases. It identifies the emergence of CloudSim as a de facto base platform for simulation research and shows a lack of tool support for distributed execution (parallel execution on distributed memory systems).