Timestamp:
06/03/13 13:55:25 (12 years ago)
Author:
wojtekp
Message:
 
File:
1 edited

  • papers/SMPaT-2012_DCWoRMS/elsarticle-DCWoRMS.tex

    r1068 r1069  
    174174%Furthermore, available power supply is usually limited so it also may reduce data center development capabilities, especially looking at challenges related to exascale computing breakthrough foreseen within this decade. 
    175175 
    176 For these reasons many efforts were undertaken to measure and study energy efficiency of data centers. There are projects focused on data center monitoring and management \cite{games}\cite{fit4green} whereas others on  energy efficiency of networks \cite{networks}. Additionally, vendors offer a wide spectrum of energy efficient solutions for computing and cooling \cite{sgi}\cite{colt}\cite{ecocooling}. However, a variety of solutions and configuration options can be applied planning new or upgrading existing data centers. 
For these reasons many efforts have been undertaken to measure and study the energy efficiency of data centers. Some projects focus on data center monitoring and management \cite{games}\cite{fit4green}, whereas others address the energy efficiency of networks \cite{networks} or distributed computing infrastructures, such as grids \cite{fit4green_carbon_scheduler}. Additionally, vendors offer a wide spectrum of energy-efficient solutions for computing and cooling \cite{sgi}\cite{colt}\cite{ecocooling}. However, a variety of solutions and configuration options can be applied when planning new or upgrading existing data centers. 
In order to optimize the design or configuration of a data center we need a thorough study using appropriate metrics and tools that evaluate how much computation or data processing can be done within a given power and energy budget and how it affects temperatures, heat transfers, and airflows within the data center. 
Therefore, there is a need for simulation tools and models that approach the problem from the perspective of end users and take into account all the factors that are critical to understanding and improving the energy efficiency of data centers, in particular hardware characteristics, applications, management policies, and cooling. 
These tools should support data center designers and operators by answering questions about how specific application types, levels of load, hardware specifications, physical arrangements, cooling technology, etc. impact overall data center energy efficiency. 
    180 There are various tools that allow simulation of computing infrastructures. On one hand they include advanced packages for modeling heat transfer and energy consumption in data centers \cite{ff} or tools concentrating on their financial analysis \cite{DCD_Romonet}. On the other hand, there are simulators focusing on computations such as CloudSim \cite{CloudSim}. The CoolEmAll project aims to integrate these approaches and enable advanced analysis of data center efficiency taking into account all these aspects \cite{e2dc12}\cite{coolemall}. 
There are various tools that allow simulation of computing infrastructures, such as SimGrid \cite{SimGrid}. On the one hand, they include advanced packages for modeling heat transfer and energy consumption in data centers \cite{ff} and tools concentrating on financial analysis \cite{DCD_Romonet}. On the other hand, there are simulators focusing on computations, such as CloudSim \cite{CloudSim}. The CoolEmAll project aims to integrate these approaches and enable advanced analysis of data center efficiency taking all these aspects into account \cite{e2dc12}\cite{coolemall}. 
    181181 
One of the results of the CoolEmAll project is the Data Center Workload and Resource Management Simulator (DCworms), which enables modeling and simulation of computing infrastructures to estimate their performance, energy consumption, and energy-efficiency metrics for diverse workloads and management policies. 
     
In terms of application modeling, all tools except the DCSG Simulator describe an application with a number of computational and communication requirements. In addition, GreenCloud and DCworms allow introducing QoS requirements by taking time constraints into account during the simulation. The DCSG Simulator, instead of modeling a single application, enables the definition of a workload that leads to a given utilization level. However, only DCworms supports application performance modeling, by not only incorporating simple requirements that are taken into account during scheduling, but also by allowing specification of task execution time. 
    201201 
    202 GreenCloud, CloudSim and DCworms are released as Open Source under the GPL. DCSG Simulator is available under an OSL V3.0 open-source license, however, it can be only accessed by the DCSG members. 
GreenCloud, CloudSim and DCworms are released as open source under the GPL. The DCSG Simulator is available under the OSL v3.0 open-source license; however, it can only be accessed by DCSG members. 
    203203 
Summarizing, DCworms stands out from other tools due to its flexibility in terms of data center equipment and structure definition. 
     
    210210 
The Data Center Workload and Resource Management Simulator (DCworms) is a simulation tool based on the GSSIM framework \cite{GSSIM} developed by Poznan Supercomputing and Networking Center (PSNC). 
    212 GSSIM has been proposed to provide an automated tool for experimental studies of various resource management and scheduling strategies in distributed computing systems. DCworms extends its basic functionality and adds some additional features related to the energy efficiency issues in data centers. In this section we will introduce the functionality of the simulator, in terms of modeling and simulation of large scale distributed systems like Grids and Clouds. 
GSSIM has been proposed to provide an automated tool for experimental studies of various resource management and scheduling strategies in distributed computing systems. DCworms extends its basic functionality and adds features related to energy efficiency issues in data centers. In this section we introduce the functionality of the simulator in terms of modeling and simulation of data centers. 
    213213 
    214214 
     
    222222\end{figure} 
    223223 
    224 DCworms is an event-driven simulation tool written in Java. In general, input data for the DCworms consist of workload and resources descriptions. They can be provided by the user, read from real traces or generated using the generator module. In this terms DCworms benefits from the GSSIM workload generator tool that allows creating synthetic workloads (\cite{GSSIM}). However, the key elements of the presented architecture are plugins. They allow the researchers to configure and adapt the simulation environment to the peculiarities of their studies, starting from modeling job performance, through energy estimations up to implementation of resource management and scheduling policies. Each plugin can be implemented independently and plugged into a specific experiment. Results of experiments are collected, aggregated, and visualized using the statistics module. Due to a modular and plug-able architecture DCworms can be applied to specific resource management problems and address different users’ requirements. 
DCworms is an event-driven simulation tool written in Java. In general, input data for DCworms consist of workload and resource descriptions. They can be provided by the user, read from real traces, or generated using the generator module. In this respect, DCworms benefits from the GSSIM workload generator tool that allows creating synthetic workloads \cite{GSSIM}. However, the key elements of the presented architecture are plugins. They allow researchers to configure and adapt the simulation environment to the peculiarities of their studies, from modeling job performance, through energy estimations, up to the implementation of resource management and scheduling policies. Each plugin can be implemented independently and plugged into a specific experiment. Afterwards, policies introduced by particular plugins are applied to a given set of tasks and resources and are triggered by each change of the simulated environment state. Results of experiments are collected, aggregated, and visualized using the statistics module. The output of each simulation consists of a number of statistics created to help in comparative evaluations of resource and workload management policies. Due to its modular and pluggable architecture, DCworms can be applied to specific resource management problems and address different users' requirements. 
    225226 
    226227 
     
    333334  \item network parameters 
    334335\end{itemize} 
    335 Using these parameters developers can for instance take into account the architectures of the underlying systems, such as multi-core processors, and their impact on the final performance of applications. 
Using these parameters, developers can, for instance, take into account the architectures of the underlying systems, such as multi-core processors, and their impact on the final performance of applications. Examples of application performance modeling in DCworms are shown in \cite{GSSIM}. 
    336337 
    337338 
     
    339340\section{Modeling of energy consumption in DCworms} 
    340341 
    341 DCworms is an open framework in which various models and algorithms can be investigated as presented in Section \ref{sec:apps}. In this section, we discuss possible approaches to modeling that can be applied to simulation of energy-efficiency of distributed computing systems. In general, to facilitate the simulation process, DCworms provides some basic implementation of power consumption, air throughput and thermal models. We introduce power consumption models as examples and validate part of them by experiments in real computing system (in Section \ref{sec:experiments}).  Description of thermal models and corresponding experiments was presented in \cite{e2dc13}. 
DCworms is an open framework in which various models and algorithms can be investigated, as presented in Section \ref{sec:apps}. In this section, we discuss possible approaches to modeling that can be applied to the simulation of the energy efficiency of distributed computing systems. In general, to facilitate the simulation process, DCworms provides basic implementations of power consumption, air throughput, and thermal models. We introduce power consumption models as examples and validate some of them by experiments on a real computing system (Section \ref{sec:experiments}). A description of thermal models and corresponding experiments was presented in \cite{e2dc13}. 
    342343 
The most common question explored by researchers who study the energy efficiency of distributed computing systems is how much energy $E$ these systems require to execute workloads. In order to obtain this value the simulator must calculate the values of power $P_i(t)$ and load $L_i(t)$ over time for all $m$ computing nodes, $i=1..m$. The load function may depend on the specific load model applied. In more complex cases it can even be defined as a vector of different resource usages over time. In a simple case the load can be either idle or busy, but even then an estimation of job processing times $p_j$ is needed to calculate the total energy consumption. The total energy consumption of the computing nodes is given by (\ref{eq:E}): 
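The referenced equation itself lies outside this excerpt; a plausible form, consistent with the symbols just defined (an assumption about its shape, not the paper's verbatim formula), integrates each node's power over the simulated time span $T$ and sums over the $m$ nodes:

```latex
E = \sum_{i=1}^{m} \int_{0}^{T} P_i(t) \, dt
```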
     
    513514 
    514515 
    583516 
    584517\subsection{Methodology} 
     
In all cases we assumed that tasks are scheduled and served in order of their arrival (FIFO strategy) using a relaxed backfilling approach, with indefinite delay allowed for the highest-priority task. Moreover, tasks were assigned only to nodes of the type on which the application was able to run (in other words, for which we had the corresponding values of power consumption and execution time). 
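A minimal sketch of this queueing discipline, under assumed semantics (the function name and data layout are illustrative, not the DCworms API): tasks are considered in arrival order, and any task that fits on the currently free nodes is started, so later tasks may delay the head of the queue indefinitely, as the relaxed backfilling described above permits.

```python
def start_eligible_tasks(queue, free_nodes):
    """FIFO with relaxed backfilling: scan tasks in arrival order and
    start every task that fits on the free nodes right now.  No
    reservation is held for the head task, so it may be delayed
    indefinitely by smaller tasks that fit (the 'relaxed' part)."""
    started = []
    for task_id, nodes_needed in list(queue):
        if nodes_needed <= free_nodes:
            started.append(task_id)
            free_nodes -= nodes_needed
            queue.remove((task_id, nodes_needed))
    return started

# Head task A needs more nodes than are free, so B and C backfill past it.
queue = [("A", 4), ("B", 2), ("C", 1)]
print(start_eligible_tasks(queue, free_nodes=3))  # ['B', 'C']
```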
    623556 
    624 \subsection{Computational analysis} 
     557 
     558\subsection{Models}\label{sec:models} 
     559 
Based on the measured values we evaluated three types of models that can be applied, among others, in the simulation environment. 
     561 
\textbf{Static} 
This model refers to the static approach presented in Section~\ref{sec:power}. Based on the measured values we created a resource power consumption model that relies on a static definition of resource power usage. 
     564%With each node power state, understood as a possible operating state (p-state), we associated a power consumption value that derives from the averaged values of measurements obtained for different types of application. Therefore, the current power usage of the node, can be expressed as: $P = P_{idle} + P_{f}$ where $P$ denotes power consumed by the node, $P_{idle}$ is a power usage of node in idle state and $P_{f}$ stands for power usage of CPU operating at the given frequency level. 
     565 
\textbf{Dynamic} 
This model refers to the resource load approach presented in Section~\ref{sec:power}. Based on the measured values of the total node power usage for various levels of load and CPU frequencies, the node power usage was defined as in (\ref{eq:model}). 
     568 
     569%and referring to the existing models presented in literature, we assumed the following equation: $P = P_{idle} + load*P_{cpubase}*c^{(f-f_{base})/100} + P_{app}$, where $P$ denotes power consumed by the node executing the given application, $P_{idle}$ is a power usage of node in idle state, load is the current utilization level of the node, $P_{cpubase}$ stands for power usage of fully loaded CPU working in the lowest frequency, $c$ is the constant factor indicating the increase of power consumption with respect to the frequency increase $f$- is a current frequency, $f_{base}$- is the lowest available frequency within the given CPU and $P_{app}$ denotes the additional power usage derived from executing a particular application). 
     570 
     571 
Table~\ref{nodeBasePowerUsage} and Table~\ref{appPowerUsage} contain the values of $P_{cpubase}$ and $P_{app}$, respectively, obtained for particular applications and resource architectures. The lack of a corresponding value means that the application did not run on the given type of node. 
\begin{table}[h!] 
\centering 
\begin{tabular}{ccc} 
\hline 
Intel i7 & AMD Fusion & Atom D510 \\ 
\hline 
8 & 2 & 1 \\ 
\hline 
\end{tabular} 
\caption{\label{nodeBasePowerUsage} $P_{cpubase}$ values in Watts} 
\end{table} 
     584 
     585 
\begin{table}[h!] 
\centering 
\begin{tabular}{l|ccc} 
\hline 
 & \multicolumn{3}{c}{Node type}\\ 
Application & Intel i7 & AMD Fusion & Atom D510 \\ 
\hline 
Abinit & 3.3 & - & - \\ 
Linpactiny & 2.5 & - & 0.2 \\ 
Linpack3gb & 6 & - & - \\ 
C-Ray & 4 & 1 & 0.05 \\ 
FFT & 3.5 & 2 & 0.1 \\ 
Tar & 3 & 2.5 & 0.5 \\ 
\hline 
\end{tabular} 
\caption{\label{appPowerUsage} $P_{app}$ values in Watts} 
\end{table} 
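As a sketch, the Dynamic formula quoted in the commented-out passage above ($P = P_{idle} + load \cdot P_{cpubase} \cdot c^{(f-f_{base})/100} + P_{app}$) can be evaluated directly. The values $P_{cpubase}=8$ W (Intel i7) and $P_{app}=4$ W (C-Ray on Intel i7) come from the tables above, and 11.5 W idle power from the node idle-power table later in the paper; $c$ and the frequencies are illustrative placeholders, not measured constants.

```python
def dynamic_power(p_idle, load, p_cpubase, c, f, f_base, p_app):
    """Dynamic model from the text:
    P = P_idle + load * P_cpubase * c**((f - f_base) / 100) + P_app
    """
    return p_idle + load * p_cpubase * c ** ((f - f_base) / 100) + p_app

# Fully loaded Intel i7 node running C-Ray at its base frequency
# (c and the frequency values are illustrative placeholders):
p = dynamic_power(p_idle=11.5, load=1.0, p_cpubase=8.0,
                  c=1.05, f=1600, f_base=1600, p_app=4.0)
print(p)  # 23.5 W at f == f_base, since c**0 == 1
```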
     604 
     605 
     606\textbf{Mapping} 
This model refers to the application-specific approach presented in Section~\ref{sec:power}. In this model, however, we applied the measured values for each application directly to the power model. Neither dependencies on load nor application profiles are modeled. Obviously, this model is affected only by the inaccuracy of the measurements and the variability of power usage (caused by other, unmeasured factors). 
     608         
The following table (Table~\ref{expPowerModels}) contains the relative errors of the models with respect to the measured values. 
\begin{table}[h!] 
\centering 
\begin{tabular}{ccc} 
\hline 
Static & Dynamic & Mapping \\ 
\hline 
13.74 & 10.85 & 0 \\ 
\hline 
\end{tabular} 
\caption{\label{expPowerModels} Power model errors in \%} 
\end{table} 
     621 
Obviously, the 0\% error in the case of the Mapping model is caused by the use of tabular data, which stores a specific power usage for each application. Nevertheless, in all models we face possible deviations from the average caused by power usage fluctuations not explained by the variables used in the models. These deviations reached around 7\% in each case. 
     623 
For the experimental purposes we decided to use the latter model. Thus, we introduced into the simulation environment the exact values obtained within our testbed to build both the power profiles of applications and the application performance models denoting their execution times. 
     625 
     626 
     627 
     628\subsection{Resource management policies evaluation} 
    625629 
In the following section we present the results obtained for the workload with a load density equal to 70\% in the light of five resource management and scheduling strategies. The scheduling strategies were evaluated according to two criteria: total energy consumption and maximum completion time of all tasks (makespan). These evaluation criteria represent the interests of various groups of stakeholders present in data centers. 
     
    665669Type of processor within the node & Power usage in idle state [W]  \\ 
    666670\hline 
    667  Intel i7 & 11.5 \\ 
     671Intel i7 & 11.5 \\ 
    668672AMD Fusion & 10 \\ 
    669673Atom D510  & 19 \\ 
     
    706710 
    707711 
    708 As we were looking for the trade-off between total completion time and energy usage, we were searching for the workload load level that can benefit from the lower system performance in terms of energy-efficiency. For the frequency downgrading policy, we noticed the improvement on the energy usage criterion only for the workload resulting in 10\% system load. For this threshold we observed that slowdown in task execution does not affect the subsequent tasks in the system and thus the total completion time of the whole workload. 
As we were looking for a trade-off between total completion time and energy usage, we searched for the workload load level that can benefit from lower system performance in terms of energy efficiency. For the frequency downgrading policy, we noticed an improvement on the energy usage criterion only for the workload resulting in 10\% system load. For this threshold we observed that the slowdown in task execution does not affect the subsequent tasks in the system and thus the total completion time of the whole workload. In general, for lightly loaded systems, where longer execution times do not affect the computations of subsequent tasks, downgrading the frequency can reduce the overall energy consumption. However, for more heavily loaded systems, which are typical of HPC centers, and taking into account that idle nodes can be turned off, both the simulation results and the measurements obtained on real hardware show that the best way to save energy is to perform computations at the maximum frequency level. Due to the shorter execution times, idle nodes can be turned off sooner, which results in lower energy usage. 
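A toy calculation with made-up numbers (not taken from the paper's measurements) illustrates the argument: running at maximum frequency draws more power but finishes sooner, and if the node is switched off as soon as it goes idle, the faster run can win on total energy.

```python
def run_energy(p_busy, t_busy, p_idle, t_idle_before_off):
    """Energy of one run: busy phase plus any idle tail before power-off."""
    return p_busy * t_busy + p_idle * t_idle_before_off

# Made-up numbers: max frequency draws 25 W for 100 s, downgraded frequency
# draws 20 W for 140 s; the node is switched off as soon as it goes idle.
fast = run_energy(25, 100, 10, 0)   # 2500 J
slow = run_energy(20, 140, 10, 0)   # 2800 J
print(fast < slow)  # True: shorter runtime outweighs the higher power draw
```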
    709713         
    710714Figure~\ref{fig:dfsComp} shows schedules obtained for Random and Random + lowest frequency strategy.  
     
    717721\end{figure} 
    718722 
    719 \subsection{Discussion} 
     723\subsubsection{Summary} 
Table~\ref{loadEnergy} and Table~\ref{loadMakespan} contain the values of the evaluation criteria (total energy usage and makespan, respectively) gathered for all investigated workloads. 
    721725 
     
We also demonstrated differences between power usage models. They span from a rough static approach to accurate application-specific models. However, the latter can be difficult or even infeasible to use, as it requires real measurements for specific applications beforehand. This issue can be partially resolved by introducing application profiles and classification, which can, however, deteriorate the accuracy. This issue is being studied more deeply within the CoolEmAll project. 
    757761 
     762\subsection{Verification of models}  
     763 
This section contains a more detailed comparison of two types of power consumption models that can be applied, among others, within DCworms. The first one, called the Mapping approach, was applied in the experiments in the previous section. As mentioned, within this model the values measured on the CoolEmAll testbed for each application were applied directly to the power consumption model used in DCworms. 
     765 
The model evaluated in this section is a modification of the Mapping and Dynamic models, with additional modeling of dependencies on the processor load. 
Within this model, we benefited from the power profiles based on the measurements made on the CoolEmAll testbed (also adopted by the previous model). However, the data applied to the simulation environment consisted only of measurements gathered for applications run in the modes resulting in the lowest and highest processor load. For all load levels between these two values we assumed a linear dependency between load and power consumption. Thus, the power consumption for a given processor load can be expressed using the following equation (\ref{eq:modelLoad}): 
     768 
\begin{equation} 
P_L = P_{LL} + (L - LL)\,\frac{P_{HL} - P_{LL}}{HL - LL}, \label{eq:modelLoad} 
\end{equation} 
     772 
where $L$ is a given processor load, $LL$ is the lowest measured processor load, $HL$ is the highest measured processor load, $P_L$ denotes the power consumption for a given processor load, $P_{LL}$ is the power consumption measured for the lowest processor load, and $P_{HL}$ stands for the power consumption measured for the highest processor load. 
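A direct sketch of the interpolation above (the numeric values are illustrative, not testbed measurements):

```python
def power_at_load(load, ll, hl, p_ll, p_hl):
    """Linear interpolation of node power between the lowest (LL) and
    highest (HL) measured processor loads, as in eq. (modelLoad)."""
    return p_ll + (load - ll) * (p_hl - p_ll) / (hl - ll)

# Illustrative: 12 W at 10% load, 24 W at 90% load -> 18 W at 50% load.
print(power_at_load(50, ll=10, hl=90, p_ll=12.0, p_hl=24.0))  # 18.0
```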
     774 
\begin{table}[h!] 
\centering 
\begin{tabular}{l|c|c|c} 
\hline 
Policy / Model & Mapping & Mapping + Dynamic & Accuracy [\%]\\ 
\hline 
R & 46.883 & 44.476 & 94.87 \\ 
R+NPM & 30.568 & 28.250 & 92.42 \\ 
EO & 77.109 & 75.277 & 97.62 \\ 
EO+NPM & 46.305 & 44.050 & 95.13 \\ 
R+LF & 36.705 & 34.298 & 93.44 \\ 
\hline 
\end{tabular} 
\caption{\label{modelAccuracy} Comparison of energy usage estimations [kWh] obtained for two power consumption models. R - Random, R+NPM - Random + node power management, EO - Energy optimization, EO+NPM - Energy optimization + node power management, R+LF - Random + lowest frequency} 
\end{table} 
     790 
As can be observed, the accuracy of the Mapping + Dynamic model is high and visibly exceeds 90\%. This satisfactory accuracy suggests that applying various power consumption models, when verifying different approaches or in the absence of detailed measurements, does not lead to a deterioration of the overall results. This also confirms the important role of simulations in experiments related to distributed computing systems. 
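The accuracy column appears to be the ratio of the two estimates expressed as a percentage; this is an inference from the table values, not a formula stated in the text:

```python
def accuracy_pct(estimate, reference):
    """Accuracy as the ratio of one estimate to the reference, in percent."""
    return round(100.0 * estimate / reference, 2)

# Reproduces the Random (R) row of the comparison table:
print(accuracy_pct(44.476, 46.883))  # 94.87
```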
    758792 
    759793 
     
    830864\bibitem{DCSG} http://dcsg.bcs.org/welcome-dcsg-simulator 
    831865 
     866\bibitem{SimGrid} http://simgrid.gforge.inria.fr/index.html 
     867 
    832868\bibitem{DCD_Romonet} http://www.datacenterdynamics.com/blogs/ian-bitterlin/it-does-more-it-says-tin\%E2\%80\%A6 
    833869 
     
    852888\bibitem{fit4green_scheduler} O. M{\"a}mmel{\"a}, M. Majanen, R. Basmadjian, H. De Meer, A. Giesler, W. Homberg, Energy-aware job scheduler for high-performance computing, Computer Science - Research and Development, November 2012, Volume 27, Issue 4, pp 265-275. 
    853889 
     890\bibitem {fit4green_carbon_scheduler} M. Majanen, O. M{\"a}mmel{\"a}, A. Giesler, Energy and carbon aware scheduling in supercomputing, International Journal on Advances in Intelligent Systems. Vol. 5 (2012) Nr: 3 \& 4, pp. 451 - 469. 
     891 
    854892\bibitem {d2.2} U. Woessner, E. Volk, G. Gallizo, M. vor dem Berge, G. Da Costa, P. Domagalski, W. Piatek, J-M. Pierson. (2012) D2.2 Design of the CoolEmAll simulation and visualisation environment - CoolEmAll Deliverable, http://coolemall.eu 
    855893 