Timestamp: 12/28/12 09:18:29
Author: wojtekp
Message:
 
File: papers/SMPaT-2012_DCWoRMS/elsarticle-DCWoRMS.tex (1 edited)

    r714 r715  
    118118 
    119119In recent years, the energy efficiency of computing infrastructures has gained great attention. In this paper we present a Data Center Workload and Resource Management Simulator (DCWoRMS), which enables modeling and simulation of computing infrastructures to estimate their performance, energy consumption, and energy-efficiency metrics for diverse workloads and management policies. 
    120 We discuss methods of power usage modeling available in the simulator. To this end, we compare results of simulations to measurements from the real servers.  
    121 To demonstrate DCWoRMS capabilities we evaluate impact of several resource management policies on overall energy-efficiency of specific workloads on heterogeneous resources. 
     120We discuss methods of power usage modeling available in the simulator and validate them by comparing the results of simulations with measurements from real servers.  
     121To demonstrate the capabilities of DCWoRMS, we evaluate the impact of several resource management policies on the overall energy efficiency of specific workloads executed on heterogeneous resources. 
    122122 
    123123\end{abstract} 
     
    419419TODO - correct, improve, refactor... 
    420420 
    421 In this section, we present computational analysis that were conducted to emphasize the role of modelling and simulation in studying computing systems performance. The experiments were first performed on the CoolEmAll testbed to collect all necessary data and then repeated using DCWoRMS tool. Based on the obtained results we studied the impact of popular energy-aware resource management policies on the energy consumption. The following sections contains description of the used system, tested application and the results of simulation experiments conducted for the evaluated strategies. 
     421In this section, we present the computational analysis that was conducted to emphasize the role of modelling and simulation in studying the performance of computing systems. To this end, we evaluate the impact of energy-aware resource management policies on the overall energy efficiency of specific workloads executed on heterogeneous resources. The following sections contain a description of the system used, the tested applications and the results of the simulation experiments conducted for the evaluated strategies. 
    422422 
    423423\subsection{Testbed description} 
    424  
    425424 
    426425To obtain power consumption values that could later be used in the DCWoRMS environment to build the model and to evaluate resource management policies, we ran a set of applications / benchmarks on the physical testbed. For experimental purposes we chose the high-density Resource Efficient Cluster Server (RECS) system. A single RECS unit consists of 18 single-CPU modules, each of which can be treated as an individual node of PC class. The configuration of our RECS unit is presented in Table~\ref{testBed}.  
     
    443442\hline 
    444443\end{tabular} 
    445 \caption {\label{testBed} CoolEmAll testbed} 
     444\caption {\label{testBed} RECS system configuration} 
    446445\end {table} 
    447446 
     
    456455\textbf{Abinit} is a widely used computational physics application that calculates properties of systems made of electrons and nuclei within density functional theory. 
    457456 
    458 \textbf{C-Ray} is a ray-tracing benchmark that stresses floating point performance of a CPU. The test is configured with the 'scene' file at 16000x9000 resolution. 
     457\textbf{C-Ray} is a ray-tracing benchmark that stresses the floating-point performance of the CPU. Our test is configured with the 'scene' file at a resolution of 16000x9000. 
    459458 
    460459The \textbf{Linpack} benchmark is used to evaluate the floating-point performance of a system. It is based on Gaussian elimination and solves a dense N by N system of linear equations. 
    461460 
    462 \textbf{Tar} it is a widely used data archiving software [tar]. In the tests the task was to create one compressed file of Linux kernel, which is about 2,3GB size, using bzip2. 
    463  
    464 \textbf{FFTE} benchmark measures the floating-point arithmetic rate of double precision complex one-dimensional Discrete Fourier Transforms of 1-, 2-, and 3-dimensional sequences of length $2^{p} * 3^{q} * 5^{r}$. In our tests only one core was used to run the application 
    465  
     461\textbf{Tar} is a widely used data archiving tool. In our tests the task was to create one compressed archive of the Linux kernel sources (version 3.4), about 2.3 GB in size, using bzip2. 
     462 
     463The \textbf{FFTE} benchmark measures the floating-point arithmetic rate of double-precision complex Discrete Fourier Transforms of one-, two- and three-dimensional sequences of length $2^{p} \cdot 3^{q} \cdot 5^{r}$. In our tests only one core was used to run the application. 
    466464 
    467465 
    468466\subsection{Methodology} 
    469467 
    470 Every chosen application / benchmark was executed on each type of node, for all frequencies supported by the CPU and for different levels of parallelization (number of cores).  To eliminate the problem with assessing which part of the power consumption comes from which application, in case when more then one application is ran on the node, the queuing system (SLURM) was configured to run jobs in exclusive mode (one job per node). Such configuration is often used for at least dedicated part of HPC resources. The advantage of the exclusive mode scheduling policy consist in that the job gets all the resources of the assigned nodes for optimal parallel performance and applications running on the same node do not influence each other. For every configuration of application, type of node and CPU frequency we measure the average power consumption of the node and the execution time. The aforementioned values  were used to configure the DCWoRMS environment providing energy and time execution models. 
     468Every chosen application / benchmark was executed on each type of node, for all frequencies supported by the CPU and for different levels of parallelization (numbers of cores). To avoid the problem of assessing which part of the power consumption comes from which application when more than one application runs on a node, the queuing system (SLURM) was configured to run jobs in exclusive mode (one job per node). Such a configuration is often used for at least a dedicated part of HPC resources. The advantage of the exclusive-mode scheduling policy is that a job gets all the resources of the assigned nodes for optimal parallel performance and that applications running on the same node do not influence each other. For every combination of application, node type and CPU frequency we measured the average power consumption of the node and the execution time. These values were used to configure the DCWoRMS environment, providing the energy and execution time models. 
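As a minimal illustration of how these measurements feed the simulation models, the sketch below (hypothetical Python names, not the DCWoRMS plugin API) organizes the measured values into a lookup keyed by application, node type and CPU frequency and derives a per-run energy estimate from them.
\begin{verbatim}
# Illustrative sketch only: a measurement-driven power/time model keyed by
# (application, node type, CPU frequency).  Names are placeholders.
from collections import namedtuple

Measurement = namedtuple("Measurement", "avg_power_w exec_time_s")

# Filled from the testbed measurements described above, e.g.:
# measurements[("c-ray", "node-type-A", 1600)] = Measurement(...)
measurements = {}

def predicted_energy_joules(app, node_type, freq_mhz):
    """Energy of one run, estimated as average power times execution time."""
    m = measurements[(app, node_type, freq_mhz)]
    return m.avg_power_w * m.exec_time_s
\end{verbatim}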
    471469Based on the models obtained for the considered set of resources and applications, we evaluated a set of resource management strategies in terms of the energy consumption needed to execute workloads of varying load intensity (10\%, 30\%, 50\%, 70\%). 
    472 To generate a workload we benefited from the DCWoRMS workload generator tool using the following characteristics. 
     470To generate the workloads we used the DCWoRMS workload generator tool with the following characteristics; a simplified generation sketch is shown after the table. 
    473471 
    473471\begin{table}[tp] 
     
    495493 & \multicolumn{4}{c}{C-Ray} & uniform - 20\%\\ 
    496494 & \multicolumn{4}{c}{Tar} & uniform - 20\%\\ 
    497  & \multicolumn{4}{c}{Linpack} & uniform - 20\%\\ 
     495 & \multicolumn{4}{c}{Linpack - 3GB} & uniform - 10\%\\ 
     496 & \multicolumn{4}{c}{Linpack - tiny} & uniform - 10\%\\ 
    498497 & \multicolumn{4}{c}{FFTE} & uniform - 20\%\\ 
    499498 
     
    503502\end {table} 
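As noted above, the following sketch illustrates one way such a workload could be generated. The task-type shares follow the table (the Abinit share is assumed here to complete the distribution), while the Poisson arrival process and the mean service time parameter are assumptions of this illustration, not properties of the DCWoRMS generator itself.
\begin{verbatim}
# Illustrative workload generation sketch (not the DCWoRMS generator).
import random

TASK_SHARES = {"Abinit": 0.20,        # assumed share, for illustration
               "C-Ray": 0.20, "Tar": 0.20, "FFTE": 0.20,
               "Linpack-3GB": 0.10, "Linpack-tiny": 0.10}

def generate(n_tasks, load_intensity, mean_service_s, n_nodes, seed=0):
    """Return (arrival_time_s, task_type) pairs approximating the target load.

    load_intensity -- target average fraction of busy nodes,
                      e.g. 0.10, 0.30, 0.50 or 0.70 as in the experiments.
    """
    rng = random.Random(seed)
    arrival_rate = load_intensity * n_nodes / mean_service_s  # tasks per second
    t, workload = 0.0, []
    for _ in range(n_tasks):
        t += rng.expovariate(arrival_rate)                    # Poisson arrivals
        task = rng.choices(list(TASK_SHARES), list(TASK_SHARES.values()))[0]
        workload.append((t, task))
    return workload
\end{verbatim}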
    504503 
    505 Execution time of each application is based on the measurements collected within our testbed. 
    506504In all cases we assumed that tasks are scheduled and served in the order of their arrival (a FIFO strategy with easy backfilling).  
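A minimal sketch of this discipline (illustrative Python, not the DCWoRMS scheduling plugin): the queue head starts as soon as enough nodes are free, and a later job may be backfilled only if it fits on the currently free nodes and its estimated completion does not delay the reservation made for the head.
\begin{verbatim}
# Simplified FIFO + EASY backfilling rule.  In a full scheduler this
# selection is re-evaluated every time a job starts or finishes.
def pick_jobs_to_start(queue, free_nodes, now, head_reservation):
    """queue: waiting jobs in arrival order, each with .nodes and .walltime;
    head_reservation: earliest start time reserved for the queue head,
    derived from the expected end times of the running jobs."""
    started = []
    head = queue[0] if queue else None
    for job in list(queue):
        if job.nodes > free_nodes:
            continue                              # does not fit right now
        if job is head or now + job.walltime <= head_reservation:
            queue.remove(job)                     # start it: the FIFO head, or a
            free_nodes -= job.nodes               # backfilled job that ends
            started.append(job)                   # before the head's reservation
    return started
\end{verbatim}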
    507505 
    508506\subsection{Computational analysis} 
    509507 
    510 In the following section presents the results obtained for the workload with load density equal to 70\% in the light of five resource management and scheduling strategies. Then we discusses the corresponding results received for workloads with other density level. 
    511 The first considered by us policy was the strategy in which tasks were assigned to nodes in random manner with the reservation that they can be assigned only to nodes of the type which the application was possible to execute on and we have the corresponding value of power consumption and execution time. The random strategy is only the reference one and will be later used to compare benefits in terms of energy efficiency resulting from more sophisticated algorithms.  
     508Each scheduling strategy was evaluated according to two criteria: the total energy consumption and the total workload completion time. 
     509In the following section we present the results obtained for the workload with a load intensity of 70\% under five resource management and scheduling strategies. We then discuss the corresponding results obtained for the workloads with other intensity levels. 
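The two criteria, together with the mean power reported below, are related in the obvious way (with $T_C$ the workload completion time and $P_n(t)$ the instantaneous power of node $n$):
\[
E_{\mathrm{total}} = \sum_{n \in \mathrm{nodes}} \int_{0}^{T_C} P_n(t)\,dt,
\qquad
\bar{P} = \frac{E_{\mathrm{total}}}{T_C}.
\]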
     510 
     511The first policy we considered was the Random strategy, in which tasks were assigned to nodes at random, with the restriction that a task could be assigned only to a node type on which the application was able to execute and for which the corresponding power consumption and execution time values were available. The Random strategy serves only as a reference and is later used to quantify the energy-efficiency gains of more sophisticated algorithms.  
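The reference policy can be sketched as follows (illustrative Python, not DCWoRMS code; the compatibility filter reflects the restriction described above, reusing the measurement lookup from the methodology sketch).
\begin{verbatim}
# Random reference policy: pick an idle node at random among those on which
# the application can run and for which a measured profile exists.
import random

def random_assignment(task, idle_nodes, measurements, rng=random):
    """Return an idle node for `task`, or None if no compatible node is idle."""
    compatible = [n for n in idle_nodes
                  if (task.app, n.node_type, n.freq_mhz) in measurements]
    return rng.choice(compatible) if compatible else None
\end{verbatim}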
    512512 
    513513 
     
    518518\end{figure} 
    519519 
    520 \textbf{total energy usage [kWh]} : 46,883 
    521 \textbf{mean power consumption [W]} : 316,17 
    522 \textbf{workload completion [s]} : 533 820 
     520\textbf{total energy usage}: 46.883 kWh 
     521\textbf{mean power consumption}: 316.17 W 
     522\textbf{workload completion}: 533 820 s 
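As a consistency check, the three values satisfy $E_{\mathrm{total}} \approx \bar{P} \cdot T_C$:
\[
316.17\,\mathrm{W} \times 533\,820\,\mathrm{s} \approx 1.688\times 10^{8}\,\mathrm{J} \approx 46.9\,\mathrm{kWh}.
\]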
    523523 
    524524We also investigated a second version of this strategy, which is becoming more popular due to energy costs, in which unused nodes are switched off to reduce the total energy consumption. In the previous version unused nodes are not switched off, which is still the primary mode of operation in many HPC centers.  
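The difference between the two variants reduces to how idle periods are charged; a simple accounting sketch (illustrative Python, not the DCWoRMS power model, with a single busy power assumed per node for simplicity):
\begin{verbatim}
# Energy of one node over the schedule makespan.  With the switch-off variant
# idle periods contribute nothing; otherwise the node keeps drawing idle
# power.  Power-state transition costs are neglected in this sketch.
def node_energy_joules(busy_intervals, makespan_s, busy_power_w, idle_power_w,
                       switch_off_unused=False):
    busy_time = sum(end - start for start, end in busy_intervals)
    idle_time = makespan_s - busy_time
    idle_draw = 0.0 if switch_off_unused else idle_power_w
    return busy_power_w * busy_time + idle_draw * idle_time
\end{verbatim}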
     
    530530\end{figure} 
    531531 
    532 \textbf{total energy usage [kWh]} : 36,705 
    533 \textbf{mean power consumption [W]} : 247,53 
    534 \textbf{workload completion [s]} : 533 820 
     532\textbf{total energy usage}: 36.705 kWh 
     533\textbf{mean power consumption}: 247.53 W 
     534\textbf{workload completion}: 533 820 s 
    535535 
    536536In this version of the experiment we neglected the additional cost and time necessary to change the power state of resources. As expected, switching off unused nodes led to a significant decrease in the total energy consumption; the overall savings reached 22\%. 
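This figure follows directly from the two totals reported above:
\[
\frac{46.883 - 36.705}{46.883} \approx 0.217 \approx 22\%.
\]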
     
    545545\end{figure} 
    546546 
    547 \textbf{total energy usage [kWh]} : 46,305 
    548 \textbf{mean power consumption [W]} : 311,94 
    549 \textbf{workload completion [s]} : 534 400 
     547\textbf{total energy usage}: 46.305 kWh 
     548\textbf{mean power consumption}: 311.94 W 
     549\textbf{workload completion}: 534 400 s 
    550550 
    551551 
     
    560560\end{figure} 
    561561 
    562 \textbf{total energy usage [kWh]} : 30,568 
    563 \textbf{mean power consumption [W]} : 206,15 
    564 \textbf{workload completion [s]} : 533 820 
     562\textbf{total energy usage}: 30.568 kWh 
     563\textbf{mean power consumption}: 206.15 W 
     564\textbf{workload completion}: 533 820 s 
    565565 
    566566The last case we considered is a modification of one of the previous strategies that takes into account the energy efficiency of nodes. We assume that tasks do not have deadlines and that the only criterion taken into consideration is the total energy consumption. All the considered workloads were executed on the testbed configured for three different CPU frequencies: the lowest, a medium and the highest one. The experiment was intended to check whether the benefit of running the workload at a less power-consuming CPU frequency is not offset by the prolonged execution time of the workload. 
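The trade-off examined here amounts to comparing, per frequency, the product of average power and execution time; a minimal sketch (illustrative Python, reusing the measurement lookup introduced in the methodology section):
\begin{verbatim}
# A lower frequency wins only if the reduction in average power outweighs the
# longer execution time, since the energy of a run is power times time.
def best_frequency(app, node_type, frequencies, measurements):
    """Return (frequency, energy_joules) minimizing the energy of one run."""
    def energy(freq):
        m = measurements[(app, node_type, freq)]
        return m.avg_power_w * m.exec_time_s
    best = min(frequencies, key=energy)
    return best, energy(best)
\end{verbatim}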
     
    573573\end{figure} 
    574574 
    575 \textbf{total energy usage [kWh]} : 77,108 
    576 \textbf{mean power consumption [W]} : 260,57 
    577 \textbf{workload completion [s]} : 1 065 356 
     575\textbf{total energy usage}: 77.108 kWh 
     576\textbf{mean power consumption}: 260.57 W 
     577\textbf{workload completion}: 1 065 356 s 
    578578 
    579579 