Changes between Version 111 and Version 112 of DCWoRMS

Timestamp: 07/06/20 16:46:53 (5 years ago)
Author: wojtekp
Comment: --

  • DCWoRMS

    v111 v112  
    1515= Introduction = 
    1616 
    17 Data Center workload and resource management simulator (DCworms) is an extended simulation tool developed by PSNC [1]. It has been based on based on the GSSIM simulation framework which enabled modeling and simulation of computing infrastructures to estimate their performance, energy consumption, and energy-efficiency metrics for diverse workloads and management policies as described in the [2]. 
     17Data Center workload and resource management simulator (DCworms) is an extended simulation tool developed by PSNC [1]. It is based on the GSSIM simulation framework, which enabled modeling and simulation of computing infrastructures to estimate their performance, energy consumption, and energy-efficiency metrics for diverse workloads and management policies, as described in [2]. 
    1818 
    1919 
     
    2323= DCworms architecture = 
    2424 
    25 In general, input data for the DCworms consist of a description of workload and resources. Input data can be read from real traces (for details see sections below) or generated using the generator module. However, the key elements of the presented architecture are plugins. They allow a researcher to configure and adapt the simulation framework to his/her experiment scenario starting from modeling job performance, through energy estimations up to implementation of resource management and scheduling policies. Each plugin can be implemented independently and plugged into a specific experiment. Results of experiments are collected, aggregated, and visualized using the statistics tool. Due to a modular and plug-able architecture DCworms enables adapting it to specific resource management problems and users’ requirements. 
     25In general, input data for DCworms consist of a description of the workload and resources. Input data can be read from real traces (for details see sections below) or generated using the generator module. However, the key elements of the presented architecture are plugins. They allow a researcher to configure and adapt the simulation framework to his/her experiment scenario, from modeling job performance, through energy estimation, up to the implementation of resource management and scheduling policies. Each plugin can be implemented independently and plugged into a specific experiment. Results of experiments are collected, aggregated, and visualized using the statistics module. Thanks to its modular, pluggable architecture, DCworms can be adapted to specific resource management problems and users’ requirements. 
    2626 
    2727The following figure presents the overall architecture of the simulation tool. 
     
    3636= Input data = 
    3737 
    38 In general, input data in DCworms consist of a single configuration file, description of workload and resources. Users may both generate new or read the existing synthetic data. Third party real workloads can also be imported by DCworms. If any parameters are missing after importing a workload, they can be generated by DCworms and added. 
     38In general, input data in DCworms consist of a single configuration file and descriptions of the workload and resources. Users may either generate new synthetic data or read existing synthetic data. Third-party real workloads can also be imported by DCworms. If any parameters are missing after importing a workload, they can be generated by DCworms and added. 
    3939 
    4040 
     
    4343== Configuration file == #ConfigurationFile 
    4444 
    45 The Experiment configuration file has typical, java resource bundle format. List of all available parameters and their interpretation is available below. 
     45The experiment configuration file has the typical Java resource bundle format. A list of all available parameters and their interpretation is given below. 
    4646 
    4747* '''resdesc''' - path to file containing description of resources (required) 
    4848 
    49 * '''readscenario.workloadfilename''' - path to workload file (in SWF or GWF format) (required) 
    50 * '''readscenario.inputfolder''' - path to directory with xml job descriptions 
    51  
    52 * '''createscenario.tasksdesc''' - path to xml file which describes detail workload generator configuration 
    53 * '''createscenario.outputfolder''' - path to directory where all generated jobs will be placed 
    54 * '''createscenario.workloadfilename''' - name of workload file in swf format which will be generated 
    55 * '''createscenario.overwrite_files''' - determines if previously generated files should be overwritten. Two possible values of this field are: "true" and "false". 
    56  
    57 * '''creatediagrams.gantt''' - determines if gantt chart should be generated. Two possible values of this field are: "true" and "false", deafualt value: "false". 
    58 * '''creatediagrams.tasks''' - determines if tasks execution times diagram should be generated. Two possible values of this field are: "true" and "false", deafualt value: "false". 
    59 * '''creatediagrams.taskswaitingtime''' - determines if tasks waiting time gantt diagram should be generated. Two possible values of this field are: "true" and "false", deafualt value: "false". 
    60  
    61 * '''creatediagrams.resutlization''' - determines if resource utlization chart should be generated. This field should contain the types of resources for which the diagram should be created 
    62 * '''creatediagrams.respowerusage''' - determines if resource power usage chart should be generated. This field should contain the types of resources for which the diagram should be created 
    63 * '''creatediagrams.resairflow''' - determines if resource airflow chart should be generated. This field should contain the types of resources for which the diagram should be created 
    64 * '''creatediagrams.restemperature''' - determines if resource power temperature chart should be generated. This field should contain the types of resources for which the diagram should be created 
    65  
    66 Parameters from readscenario and createscenario groups should be used mutual exclusively. 
     49* '''resourcedescriptionfile''' - path to file(s) containing description of resources (required) 
     50* '''workloadfile''' - path to workload file(s) (in SWF or GWF format) (required) 
     51* '''xmljobfolder''' - input folder with generated tasks descriptions 
     52* '''applicationprofilesfolder''' - input folder with application profiles 
     53* '''creatediagrams.generators''' - determines which library should be used for diagram generation. (JFreeChart / Gnuplot) 
     54* '''creatediagrams.gantt''' - determines if gantt chart should be generated. (true / false) 
     55* '''creatediagrams.gantt.applicationprofiles''' - todo 
     56* '''creatediagrams.tasks''' - determines if tasks execution times diagram should be generated. (true / false) 
     57* '''creatediagrams.taskswaitingtime''' - determines if tasks waiting time gantt diagram should be generated. (true / false) 
     58* '''creatediagrams.rooflinemodel''' - determines if roofline model diagram should be generated. It is specific for ESCAPE simulation mode. (true / false) 
     59* '''creatediagrams.resutlization''' - determines if resource utilization chart should be generated. This field should contain the types of resources for which the diagram should be created. 
     60* '''creatediagrams.cost''' - determines if resource utilization cost chart should be generated. This field should contain the types of resources for which the diagram should be created. 
     61* '''creatediagrams.respowerusage''' - determines if resource power usage chart should be generated. This field should contain the types of resources for which the diagram should be created. 
     62* '''creatediagrams.resairflow''' - determines if resource airflow chart should be generated. This field should contain the types of resources for which the diagram should be created. 
     63* '''creatediagrams.restemperature''' - determines if resource temperature chart should be generated. This field should contain the types of resources for which the diagram should be created. 
     64* '''excludetextstats''' - disables creation of text files containing specific statistics. This field should contain the types of statistics for which the files should not be created. 
     65* '''creatextstatistics.jobs''' - determines if additional job statistics file should be created. (true / false) 
     66* '''creatextstatistics.simulation''' - determines if additional simulation statistics file should be created. (true / false) 
     67* '''outputfolder''' - path to directory where all generated statistics will be placed. 
     68* '''simulationidentifier''' - name of the simulation, will be used in result files names. 
     69* '''simplifiedschema''' - tells the simulator to use the simplified version of the resource schema and translate it to the regular one before loading resources. (true / false) (experimental) 
     70* '''usedatabase''' - determines if ….. should be stored in database instead of memory. (true / false) (experimental) 
     71* '''simulationmode''' - allows usage of certain features and behaviours specific for individual projects. (Default, CoolEmAll, ESCAPE) 
     72* '''numberofrepetitions''' - automatically repeats an experiment a given number of times. It is useful when the input data change between runs or are random. (Default: 1) 
     73* '''comparestatistics''' - allows automatic comparison of several performed simulations. It is also possible to select individual statistics to compare. (true / false) 
     74* '''comparestatistics.resourceload''' 
     75* '''comparestatistics.energyusage''' 
     76* '''comparestatistics.makespan''' 
     77* '''comparestatistics.taskexecutiontime''' 
     78* '''comparestatistics.taskqueuelength''' 
     79* '''comparestatistics.taskcompletiontime''' 
     80* '''comparestatistics.taskwaitingtime''' 
     81* '''comparestatistics.taskflowtime''' 
     82* '''comparestatistics.tasklateness''' 
     83* '''comparestatistics.delayedtasks''' 
     84* '''comparestatistics.tasktardiness''' 
     85* '''comparestatistics.failedrequests''' 
     86 
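Because the configuration file uses the Java resource bundle format, it can be read with the standard `java.util.PropertyResourceBundle` class. The sketch below is only an illustration of that format, not the loader DCworms itself uses: the class name, helper methods, and all paths and values in the sample are hypothetical, while the parameter names come from the list above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

// Hypothetical helper, not part of DCworms.
final class ExperimentConfigSketch {

    // Sample configuration; every path and value here is made up for illustration.
    static final String SAMPLE = String.join("\n",
            "resourcedescriptionfile=example/resources.xml",
            "workloadfile=example/workload.swf",
            "creatediagrams.gantt=true",
            "outputfolder=example/output",
            "simulationidentifier=demo",
            "numberofrepetitions=3");

    // Parses key=value text in resource bundle (properties) syntax.
    static ResourceBundle parse(String text) {
        try {
            return new PropertyResourceBundle(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns the value stored under key, or fallback when the key is absent.
    static String get(ResourceBundle cfg, String key, String fallback) {
        return cfg.containsKey(key) ? cfg.getString(key) : fallback;
    }

    // Fails early when a parameter marked "(required)" is missing.
    static String require(ResourceBundle cfg, String key) {
        if (!cfg.containsKey(key)) {
            throw new IllegalArgumentException("Missing required parameter: " + key);
        }
        return cfg.getString(key);
    }

    public static void main(String[] args) {
        ResourceBundle cfg = parse(SAMPLE);
        System.out.println("resources: " + require(cfg, "resourcedescriptionfile"));
        System.out.println("workload:  " + require(cfg, "workloadfile"));
        // The parameter list documents 1 as the default number of repetitions.
        int repetitions = Integer.parseInt(get(cfg, "numberofrepetitions", "1"));
        System.out.println("repetitions: " + repetitions);
    }
}
```

Separating `require` from `get` mirrors the distinction in the list between parameters marked as required and those that are optional or have a documented default.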
    6787 
    6888Path to this configuration file is the main argument of the DCworms program. It should be used as follows: 
     
    112132 
    113133Example: 
    114 Assume, that swf file contains two tasks with id 1 and 2. You can create new job, with two tasks by defining following mapping: 
     134Assume that the SWF file contains two tasks with IDs 1 and 2. You can create a new job with two tasks by defining the following mapping: 
    115135{{{ 
    116136 ;IDMapping: swfID:jobID:taskID 
     
    121141New job with id = 4 consisting of two tasks with id 10 and 20 will be created. 
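The grouping described by this example can be sketched in a few lines of Java. This is an illustration only: the class and method names are hypothetical, and the entry strings `1:4:10` and `2:4:20` are assumed from the `swfID:jobID:taskID` pattern and the IDs named in the text, not copied from an actual SWF header.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of DCworms.
final class IdMappingSketch {

    // Parses entries of the form swfID:jobID:taskID and groups task IDs by job ID,
    // preserving the order in which the entries appear.
    static Map<String, List<String>> tasksByJob(List<String> entries) {
        Map<String, List<String>> jobs = new LinkedHashMap<>();
        for (String entry : entries) {
            String[] parts = entry.trim().split(":");
            if (parts.length != 3) {
                throw new IllegalArgumentException("Expected swfID:jobID:taskID, got: " + entry);
            }
            jobs.computeIfAbsent(parts[1], k -> new ArrayList<>()).add(parts[2]);
        }
        return jobs;
    }

    public static void main(String[] args) {
        // Assumed entries: SWF tasks 1 and 2 become tasks 10 and 20 of new job 4.
        System.out.println(tasksByJob(List.of("1:4:10", "2:4:20"))); // prints {4=[10, 20]}
    }
}
```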
    122142 
    123 The only constraint of IDMapping section is that swf jobs, which will become tasks in new job, must occur in swf one by one. No other jobs are allowed between these swf jobs which are mapped to tasks of one new job. 
    124  
    125 The experiment can be executed with usage of single swf file or swf file with xml extension.  
    126 If single swf is used, then task requirements like cpu count and requested memory are read directly from swf file. Notice, that information included in swf file is insufficient for using advance reservation in scheduling algorithm. To do so, you must provide xml extension of each job description and fill up its executionTime section. In xml files you can use any ids for job and tasks but you must provide correct IDMapping section (in swf file header) between xml job/task ids and swf job id. Otherwise, task start up parameters like submit time or task length in instructions will not be calculated correctly. 
    127 If xml job description is used, then task requirements are read from xml description instead of swf file. 
     143The only constraint of the IDMapping section is that the SWF jobs that will become tasks in a new job must occur consecutively in the SWF file. No other jobs are allowed between the SWF jobs that are mapped to tasks of one new job. 
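A minimal sketch of how this constraint could be checked is shown below. It assumes the mapping has already been reduced to the list of new job IDs in the order their SWF jobs appear in the file; the class and method names are hypothetical and do not describe how DCworms validates the section.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of DCworms.
final class IdMappingOrderSketch {

    // jobIdsInSwfOrder holds, for each mapped SWF job in file order, the new job ID
    // it is assigned to. Returns true when the SWF jobs of every new job are adjacent.
    static boolean isContiguous(List<String> jobIdsInSwfOrder) {
        Set<String> finished = new HashSet<>();
        String current = null;
        for (String jobId : jobIdsInSwfOrder) {
            if (!jobId.equals(current)) {
                if (current != null) {
                    finished.add(current); // the previous job's run has ended
                }
                if (finished.contains(jobId)) {
                    return false; // this job was interrupted by another one
                }
                current = jobId;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isContiguous(List.of("4", "4", "5"))); // prints true
        System.out.println(isContiguous(List.of("4", "5", "4"))); // prints false
    }
}
```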
     144 
     145The experiment can be executed using a single SWF file or an SWF file with an XML extension.  
     146If a single SWF file is used, task requirements such as CPU count and requested memory are read directly from the SWF file. Note that the information included in the SWF file is insufficient for using advance reservation in a scheduling algorithm. To use it, you must provide an XML extension of each job description and fill in its executionTime section. In XML files you can use any IDs for jobs and tasks, but you must provide a correct IDMapping section (in the SWF file header) between the XML job/task IDs and the SWF job ID. Otherwise, task start-up parameters such as submit time or task length in instructions will not be calculated correctly. 
     147If an XML job description is used, task requirements are read from the XML description instead of the SWF file. 
    128148 
    129149 
     
    131151=== Workload generation === 
    132152 
    133 The main goal of workload design was to ensure, that all job descriptions which were used in real resource management system like GRMS or obtained from SWF/GWF log can be used to perform experiment in DCworms simulator. However, it my be difficult for all users to reach such workloads, therefore workload generator was created. 
    134  
    135 Workload generator allows you to create any number of jobs and tasks, with sophisticated resource and time requirements. The result of generation process are: desired number of job descriptions in xml format and swf file with job descriptions and all necessary header parameters (see [https://git.man.poznan.pl/stash/projects/WORMS/repos/dcworms/browse/src/main/resources/simulator/schemas/WorkloadSchema3g.xsd WorkloadSchema.xsd] for details). 
     153The main goal of the workload design was to ensure that all job descriptions used in a real resource management system such as GRMS, or obtained from an SWF/GWF log, can be used to perform an experiment in the DCworms simulator. However, such workloads may be difficult for some users to obtain; therefore, a workload generator was created. 
     154 
     155The workload generator allows you to create any number of jobs and tasks with sophisticated resource and time requirements. The generation process produces the desired number of job descriptions in XML format and an SWF file with job descriptions and all necessary header parameters (see [https://git.man.poznan.pl/stash/projects/WORMS/repos/dcworms/browse/src/main/resources/simulator/schemas/WorkloadSchema3g.xsd WorkloadSchema.xsd] for details). 
    136156 
    137157Configuration options are provided by two files: