«INTRODUCTION The emerging Cloud computing paradigm (Carr, 2008)(Wallis, 2008)(M. Armbrust et al., 2009) for hosting Internet-based services in ...»
Monitoring Services in a Federated Cloud
- The RESERVOIR Experience
Stuart Clayman, Giovanni Toffetti, Alex Galis, Clovis Chapman†
Dept of Electronic Engineering, Dept of Computer Science†, University College London
firstname.lastname@example.org, email@example.com, firstname.lastname@example.org, email@example.com
This chapter presents the need, the requirements, and a design for a monitoring system that is suitable for supporting the operations and management of a Federated Cloud environment. We discuss these issues within the context of the RESERVOIR Service Cloud computing project. We first present the RESERVOIR architecture itself, then we introduce the issues of service monitoring in a federated environment, together with the specific solutions that have been devised for RESERVOIR. We end with a review of our experience in this area by showing a use-case application executing on RESERVOIR, which is responsible for the computational prediction of organic crystal structures.
Service Clouds are just the latest incarnation of a concept that has been around since the 1960s, namely the manifestation of a general-purpose public computing utility. Throughout the history of computing we have seen such utilities appear in one form or another. Even though some success stories exist, such as in the area of high-performance scientific computing, where Grid computing made significant progress over the past decade, none of these attempts materialized into a true general-purpose compute utility that is accessible by anyone, at any time, from anywhere. Now, however, the advent of new approaches utilizing an always-on Internet and virtualization has brought about system designs which will enable the desired progress. An example of such a system design is the RESERVOIR Service Cloud, which is described in the next section.
RESERVOIR
The RESERVOIR FP7 project (Rochwerger et al., 2009)(Rochwerger et al., 2009b)(Rochwerger et al., 2011) aims to support the emergence of Service-Oriented Computing as a new computing paradigm and to investigate the fundamental aspects of Service Clouds as a fundamental element of the Future Internet.
RESERVOIR is a Service Cloud which has a new and unique approach to Service-Oriented Cloud computing. In the RESERVOIR model there is a clear separation between service providers and infrastructure providers. Service providers are the entities that understand the needs of a particular business and create and offer service applications to address those needs. Service providers do not need to own the computational resources needed by these service applications; instead, they lease resources from an infrastructure provider.
The infrastructure provider owns and leases out sections of a computing cluster, which supplies the service provider with a finite pool of computational resources. The cluster is presented as a Service Cloud site which is capable of allocating resources to many service providers at the same time. Through federation agreements, multiple infrastructure providers can pool together all of their compute resources, thus offering a seemingly infinite resource pool to their customers, the service providers.
The high-level objective of RESERVOIR is to significantly increase the effectiveness of the compute and service utility model thus enabling the deployment of complex services on a Service Cloud that spans infrastructure providers and even geographies, while ensuring QoS and security guarantees. In doing so, RESERVOIR provides a foundation where resources and services are transparently and flexibly provisioned and managed like utilities.
RESERVOIR ARCHITECTURE
The essence of the RESERVOIR Service Cloud is to effectively manage a service specified as a collection of virtual execution environments (VEEs). A VEE is an abstraction representing both virtual machines running on a generic hypervisor infrastructure, as well as any application component that can be run (and/or migrated) on a leased infrastructure (e.g., Web applications on Google’s App Engine, a Java based OSGi bundle). A Service Cloud, such as RESERVOIR, operates by acting as a platform for running virtualized applications in VEEs, which have been deployed on behalf of a service provider.
The service provider defines the details and requirements of the application in a Service Definition Manifest. This is done by specifying which virtual machine images are required to run, as well as specifications for (i) Elasticity Rules or performance objectives, which determine how the application will scale across a Cloud, and (ii) Service Level Agreement (SLA) Rules, which determine whether the Cloud site is providing the right level of service to the application. Within each Service Cloud site there is a Service Manager (SM) and a VEE Manager (VEEM) which together provide all the necessary management functionality for both the services and the infrastructure. These management components of a Cloud system are shown in Figure 1 and are presented in more detail below.
Figure 1: RESERVOIR Service Cloud Architecture
The Service Manager (SM) is the component responsible for accepting the Service Definition Manifest and the raw VEE images from the service provider. It is then responsible for the instantiation of the service application by requesting the creation and configuration of executable VEEs for each service component in the manifest. In addition, it is the Service Manager that is responsible for (i) evaluating and executing the elasticity rules and (ii) ensuring SLA compliance, by monitoring the execution of the service applications in real time. Elasticity of a service is achieved by adjusting the application capacity, either by adding or removing service components and/or by changing the resource requirements of a particular component according to the load and measurable application behaviour.
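The elasticity behaviour described above can be illustrated with a minimal sketch. It is not the RESERVOIR rule language or the Service Manager's actual logic: the function name `scale_decision`, the per-component load metric, the thresholds, and the component limits are all assumptions made for illustration.

```python
def scale_decision(load_per_component: float, current: int,
                   upper: float, lower: float,
                   min_components: int = 1, max_components: int = 10) -> int:
    """Return the new number of service components (hypothetical rule).

    Adds one component when the measured load exceeds `upper`, removes one
    when it drops below `lower`, and otherwise leaves capacity unchanged.
    The result always stays within [min_components, max_components].
    """
    if load_per_component > upper and current < max_components:
        return current + 1
    if load_per_component < lower and current > min_components:
        return current - 1
    return current


# Example: two components running, thresholds of 8.0 (scale up) and 2.0 (scale down).
after_high_load = scale_decision(9.0, current=2, upper=8.0, lower=2.0)
after_low_load = scale_decision(1.0, current=2, upper=8.0, lower=2.0)
```

Keeping a gap between the upper and lower thresholds is a common way to avoid oscillating between adding and removing components when the load hovers near a single threshold.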
The Virtual Execution Environment Manager (VEEM) is the component responsible for the placement of VEEs into VEE hosts (VEEHs). The VEEM receives requests from the Service Manager to create VEEs and to adjust the resources allocated to VEEs, and it also finds the best placement for these VEEs in order to satisfy a given set of constraints. The role of the VEEM is to optimize a site, and its main task is to place and move the VEEs anywhere, even on remote sites, as long as the placement is done within the constraints set in the Manifest, including specifications of VEE affinity, VEE anti-affinity, security, and cost. In addition to serving local requests, the VEEM is the component in the system that is responsible for the migration of VEEs to and from remote sites. This is achieved by interacting with the VEEMs that manage other Clouds.
The Virtual Execution Environment Host (VEEH) is a resource that can host a certain type of VEE. For example, one type of VEEH can be a physical machine with the Xen hypervisor (Barham et al., 2003) controlling it, whereas another type can be a machine with the KVM hypervisor (Kivity, 2007). In a Service Cloud hosting site there is likely to be a considerable number of VEEHs organised as a cluster.
These three main components of the Service Cloud architecture interact with each other using specific interfaces, namely SMI (service management interface), VMI (VEE management interface), and VHI (VEE host interface), within a site and also use the VMI interface for site-to-site federation via the VEEM. In Figure 1 the relationship between these components and interfaces is shown.
In the RESERVOIR platform, as can be seen in Figure 1, a Service Provider specifies the details and requirements of their application in the Service Definition Manifest. The Manifest also has the specifications of Elasticity Rules, which determine how the application will scale across the Cloud, and Service Level Agreement (SLA) objectives for the application as well as the infrastructure. The former specify which performance objectives should be attained by the service; the latter are used to determine whether the platform is providing the right level of service to the application.
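The contents of a Service Definition Manifest, as described above, can be pictured as a small data structure. The sketch below is only a hypothetical in-memory model: the class names, field names, metric names, and image names are assumptions, and it does not reflect the actual RESERVOIR manifest format.

```python
from dataclasses import dataclass, field


@dataclass
class ElasticityRule:
    metric: str       # hypothetical performance indicator name
    threshold: float  # value above which the action is triggered
    action: str       # e.g. "add_component" (hypothetical action name)


@dataclass
class SLAObjective:
    metric: str
    max_value: float  # the objective is violated above this value


@dataclass
class ServiceManifest:
    service_name: str
    vee_images: list[str]
    elasticity_rules: list[ElasticityRule] = field(default_factory=list)
    sla_objectives: list[SLAObjective] = field(default_factory=list)

    def violated_slas(self, measurements: dict[str, float]) -> list[str]:
        """Return the metrics whose measured value exceeds its SLA bound."""
        return [
            o.metric for o in self.sla_objectives
            if o.metric in measurements and measurements[o.metric] > o.max_value
        ]


manifest = ServiceManifest(
    service_name="example-service",
    vee_images=["frontend.img", "worker.img"],
    elasticity_rules=[ElasticityRule("queue_length", 4.0, "add_component")],
    sla_objectives=[SLAObjective("vee_startup_seconds", 120.0)],
)
breaches = manifest.violated_slas({"vee_startup_seconds": 150.0})
```

Keeping elasticity rules and SLA objectives as separate lists mirrors the distinction drawn in the text between how a service scales and how the platform's level of service is judged.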
Federation and Networking
Apart from virtual execution environment requirements, the Service Definition Manifest also specifies the networking requirements of a service. In particular, the manifest can specify one or more private virtual networks called Virtual Area Networks (VANs), implemented as virtual Ethernet overlay services.
Furthermore, the manifest specifies the public access points for the deployed service. Virtual execution environment interfaces can then be mapped to public IP addresses.
Public interfaces for a service are mapped to the public IP addresses available at each compute Cloud, which allows users of the service to access it via the public IP address. Virtual area networks (VANs), however, are implemented at the infrastructure level, ensuring that the VAN has separation, isolation, elasticity, and federation. The advantages of these attributes of the VAN are explained further:
• separation — the service network and the infrastructure network are kept separate. RESERVOIR seeks to reduce mutual dependency between the infrastructure and the services. A VAN of a service, offered as part of RESERVOIR, needs to be separated from the infrastructure used by an infrastructure provider, similarly to the manner in which a virtual execution environment is separated from the physical host.
• isolation — a VAN of one service is isolated from all VANs of other services. RESERVOIR seeks to isolate services such that possibly competing applications may securely share the infrastructure provider resources whilst being unaware of the other services. Isolated VAN services need to be offered side by side while sharing network resources of the infrastructure provider.
• elasticity — a VAN can grow or shrink as necessary. RESERVOIR seeks to offer an elastic and extendable environment so that application providers will be able to adjust the size of their application on demand. A VAN service needs to enable application elasticity.
• federation — a VAN can span over more than one Cloud provider. RESERVOIR seeks to form a federation of possibly competing infrastructure providers so that each provider offers an interchangeable pool of resources allowing service and resource migration without barriers. An interchangeable VAN service needs to be offered across administrative domains such that service providers would not be concerned by the identity of the infrastructure provider, the physical network used, or its configuration.
The VAN implementation of RESERVOIR provides the required virtual Ethernet overlay for each service.
As a consequence of service elasticity and site management policies, some virtual machines belonging to the same service might be placed across different sites.
Figure 2 illustrates a simple federated scenario across two Service Clouds (Cloud A on the left and Cloud B on the right). Each site leases out compute resources to different service providers; in our example, Cloud A is serving services 1 and 2 while Cloud B is managing service 3.
When virtual execution environments are placed on different sites (as for Service 2 in Figure 2), the VEEMs at each participating site keep the VEEs connected by spawning the appropriate VAN proxies, so that VANs stay connected and VEEs remain unaware of their actual placement. Some network issues arise, such as latency, jitter, and round-trip time, but the important aspect is that connectivity is maintained across the federated domains. In a more complex scenario (such as when more services and Service Clouds are participating in the federation), a single service can be scattered across several sites, requiring multiple VAN proxies.
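The federated scenario above can be sketched as a small computation that finds which services span more than one site and therefore need VAN proxies. This is an illustration only: the function `van_proxies`, the placement map, and the assumption of one proxy per participating site are hypothetical and not taken from the RESERVOIR implementation.

```python
from collections import defaultdict


def van_proxies(placement: dict[str, tuple[str, str]]) -> dict[str, list[str]]:
    """Map each multi-site service to the sites that need a VAN proxy.

    `placement` maps a VEE identifier to a (service, site) pair.
    Services whose VEEs all sit on one site need no proxy and are omitted.
    """
    sites_by_service: dict[str, set[str]] = defaultdict(set)
    for service, site in placement.values():
        sites_by_service[service].add(site)
    return {
        service: sorted(sites)
        for service, sites in sites_by_service.items()
        if len(sites) > 1
    }


# A placement loosely modelled on the two-Cloud scenario described above.
placement = {
    "vee1": ("service1", "CloudA"),
    "vee2": ("service2", "CloudA"),
    "vee3": ("service2", "CloudB"),
    "vee4": ("service3", "CloudB"),
}
proxies = van_proxies(placement)
```

In this example only service 2 spans both Clouds, so it is the only service for which proxies would be spawned.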
Once the VEEs are in place and the service is running, the Service Manager (SM) is responsible for managing all of the services on the Cloud. To undertake such management, a Cloud needs monitoring facilities.
MONITORING IN RESERVOIR
Monitoring is a fundamental aspect of a Service Cloud such as RESERVOIR because it is used both by the infrastructure itself and for service management. The monitoring system needs to be pervasive as:
• it is required by several components of the Service Cloud;
• it cuts across the layers of the Cloud system creating vertical paths; and
• it spans out across all the Service Clouds in a federation in order to link all the elements of a service.
For full operation of a RESERVOIR Service Cloud we observe that monitoring is a vital part of the full control loop that goes from the service management, through a control path, to the Probes which collect and send data, back to the service management which makes various decisions based on the data. The monitoring is a fundamental part of RESERVOIR as it allows the integration of components in all of the architectural layers.
The RESERVOIR monitoring system has a model of information consumers and information producers, connected by a monitoring data plane. The producers send measurements across the data plane, and these are read and processed by the consumers. In Figure 3 we show some of these producers and consumers, but this is not an exhaustive list.
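The producer and consumer model can be illustrated with a minimal in-process sketch. It is not the RESERVOIR monitoring implementation: the `DataPlane` and `Probe` classes, the topic strings, and the measurement layout are assumptions, and a real data plane would carry measurements over a network rather than through direct function calls.

```python
from collections import defaultdict
from typing import Any, Callable

Measurement = dict[str, Any]
Consumer = Callable[[Measurement], None]


class DataPlane:
    """Hypothetical data plane that delivers measurements by topic."""

    def __init__(self) -> None:
        self._consumers: dict[str, list[Consumer]] = defaultdict(list)

    def subscribe(self, topic: str, consumer: Consumer) -> None:
        self._consumers[topic].append(consumer)

    def publish(self, topic: str, measurement: Measurement) -> None:
        # Deliver to every consumer registered for this topic, if any.
        for consumer in self._consumers.get(topic, []):
            consumer(measurement)


class Probe:
    """Hypothetical producer that reads a value and sends it on the plane."""

    def __init__(self, name: str, topic: str,
                 read: Callable[[], float], plane: DataPlane) -> None:
        self.name = name
        self.topic = topic
        self.read = read
        self.plane = plane

    def collect(self) -> None:
        self.plane.publish(self.topic, {"probe": self.name, "value": self.read()})


plane = DataPlane()
received: list[Measurement] = []
plane.subscribe("cpu", received.append)  # a consumer, e.g. the Service Manager

probe = Probe("cpu-probe", "cpu", read=lambda: 0.75, plane=plane)
probe.collect()
```

Because producers and consumers only share the data plane, either side can be added or removed without the other needing to change.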