diff --git a/README.md b/README.md
index 7c471be2..0f060731 100644
--- a/README.md
+++ b/README.md
@@ -1,111 +1,66 @@
 Peel Experiments Execution Framework
 ====================================
-Peel is a framework that assists you in the execution and results analysis of experiments evaluating massively-parallel algorithms and systems.
+Peel is a framework that helps you define, execute, analyze, and share experiments for distributed systems and algorithms.
+
+For more information and technical documentation about the project, please visit [peel-framework.org](http://peel-framework.org).
+
+Check the [Motivation section on our website](http://peel-framework.org/manual/motivation.html) to understand the problems Peel solves for you.
+
+Check the [Getting Started guide](http://peel-framework.org/getting-started.html) and the [Bundle Basics](http://peel-framework.org/manual/motivation.html) section.
+
+<!-- Peel Mango logo -->
+
+## Main Features
+
+Peel offers the following features for your experiments.
+
+- **Unified Design** Specify and maintain collections of experiments using a simple, DI-based configuration.
+- **Automated Execution** Automate the experiment execution lifecycle.
+- **Automated Analysis** Extract, transform, and load results into an RDBMS.
+- **Result Sharing** Share your bundles and migrate to other evaluation environments without additional effort.
+
+## Supported Systems
+
+| System           | Version        | System bean ID    |
+| ---------------- | -------------- | ----------------- |
+| HDFS             | 1.2.1          | `hdfs-1.2.1`      |
+| HDFS             | 2.4.1          | `hdfs-2.4.1`      |
+| HDFS             | 2.7.1          | `hdfs-2.7.1`      |
+| Flink            | 0.8.0          | `flink-0.8.0`     |
+| Flink            | 0.8.1          | `flink-0.8.1`     |
+| Flink            | 0.9.0          | `flink-0.9.0`     |
+| Flink            | 0.10.0         | `flink-0.10.0`    |
+| Flink            | 0.10.1         | `flink-0.10.1`    |
+| Flink            | 1.0.0          | `flink-1.0.0`     |
+| MapReduce        | 1.2.1          | `mapred-1.2.1`    |
+| MapReduce        | 2.4.1          | `mapred-2.4.1`    |
+| Spark            | 1.3.1          | `spark-1.3.1`     |
+| Spark            | 1.4.0          | `spark-1.4.0`     |
+| Spark            | 1.5.1          | `spark-1.5.1`     |
+| Spark            | 1.5.2          | `spark-1.5.2`     |
+| Spark            | 1.6.0          | `spark-1.6.0`     |
+| Zookeeper        | 3.4.5          | `zookeeper-3.4.5` |
+| Dstat            | 0.7.2          | `dstat-0.7.2`     |
+
+## Supported Commands
+
+| Command              | Description                                        |
+| -------------------- | -------------------------------------------------- |
+| `db:import`          | import suite results into an initialized database  |
+| `db:initialize`      | initialize results database                        |
+| `exp:run`            | execute a specific experiment                      |
+| `exp:setup`          | set up systems for a specific experiment           |
+| `exp:teardown`       | tear down systems for a specific experiment        |
+| `hosts:generate`     | generate a hosts.conf file                         |
+| `res:archive`        | archive suite results to a tar.gz                  |
+| `res:extract`        | extract suite results from a tar.gz                |
+| `rsync:pull`         | pull bundle from a remote location                 |
+| `rsync:push`         | push bundle to a remote location                   |
+| `suite:run`          | execute all experiments in a suite                 |
+| `sys:setup`          | set up a system                                    |
+| `sys:teardown`       | tear down a system                                 |
+| `val:hosts`          | validate correct hosts setup                       |
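+
+As a rough sketch of how these commands fit together (the `db:*` argument forms below are assumptions for illustration; `./peel -h` prints the exact usage), a typical evaluation could look like this:
+
+```bash
+# run every experiment in the suite defined in the experiments file
+./peel suite:run --experiments ${EXPFILE} ${SUITE}
+
+# initialize the results database and import the suite results
+# (argument forms assumed for illustration; check `./peel -h`)
+./peel db:initialize
+./peel db:import ${SUITE}
+```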
-### The main features are
-
-- Specify and maintain a collection of reusable experiments
-- Use default parameters or configure your systems with your favorite settings
-- Run different experiments on a variety of systems
-- Evaluate and compare the performance of your systems
-
-#### Specify dependencies in your systems and let peel do the rest!
-Peel automatically sets up the systems that you specify while taking care of dependency relations. If you want to change parameters between experiments in your suite, peel updates the corresponding systems and their dependants automatically. All you need to do is to specify your experiments and let peel do the rest!
-
-*Supported Systems*
-- [Flink](http://flink.incubator.apache.org/) 0.5.1 & 0.6
-- [Hadoop](http://hadoop.apache.org/) 1.2.1 (HDFS and MapReduce)
-- [Spark](http://spark.apache.org/) 1.1.0
-
-
-## Executing Experiments With Peel
-
-The peel command line interface offers comprehensive help for the available commands:
-
-```bash
-./peel -h
-```
-
-Please refer to the help screens for more detailed information.
-
-### Executing a Single Experiment
-
-If you are not sure if the configuration and the experiment jars work fine, you can run a single experiment with the following sequence of commands:
-
-```bash
-# 1. set up all systems
-./peel exp:setup ${SUITE} ${EXPERIMENT} --experiments ${EXPFILE}
-# 2. just run the experiment, skip system set-up and tear-down
-./peel exp:run ${SUITE} ${EXPERIMENT} --experiments ${EXPFILE} --run 1 --just
-# 3. tear down all systems
-./peel exp:teardown ${SUITE} ${EXPERIMENT} --experiments ${EXPFILE}
-```
-
-Halt and repeat the second step if your algorithm does not terminate or is too slow. When you are done, make sure you execute the third step to shut everything down.
-
-### Executing All Experiments in a Suite
-
-To run a suite, use the **suite:run** command and specify the experiments file and the id of the suite in your fixture:
-
-```bash
-./peel suite:run --experiments ${EXPFILE} ${SUITE}
-```
-
-Results from the suite experiments are written into the **${app.path.results}/${suite}/${experiment.name}** folders.
-The **state.json** file in each folder contains information about the exit state of the experiment.
-By default, running the same suite again will skip experiments that returned a zero exit code.
-If you want to rerun a particular experiment, delete the corresponding experiment folder before you rerun the suite.
-
-## Creating Fixtures
-
-
-In **experiments.wordcount.xml** you can see a minimal example that runs the example Stratosphere wordcount job with one, two, three, and four slaves.
-The following table provides a short description of the beans configured in this file.
-
-Bean | Description
-:--------------------------|------------------------
-stratosphere | Defines stratosphere as the system to use, with a dependency on hdfs
-dataset.shakespeare | The data set used for the experiment. The src arg specifies the location of the compressed data set. The dst arg specifies the location the data set is extracted to.
-experiment.stratosphere | Abstract specification for an experiment. Defines the runs (6) and the system used (stratosphere)
-experiment.stratosphere.wc | Specialization of experiment.stratosphere. Specifies the executed command ( ) and the data set used.
-wc.local / wc.cloud-7 | Specifies a suite (collection) of experiments to be executed as a whole.
-
-To set up your own experiment for your algorithm, you just have to create a fixture describing the experiment.
-The **experiments.template.xml** contains preconfigured beans for all data sets as well as inline pointers for adapting the file to your algorithm.
-You can use this file as a starting point for your configuration. Overall, the required steps are:
-
-1. Introduce a new suffix for your experiment.
-1. Replace the command string with the right path to your jar, input, and output file.
-1. Replace the referenced data set with the one suitable for your experiment.
-
-Placeholders in the fixture files (e.g. ${app.path.datasets}) are references to values specified in configuration files.
-For each experiment, Peel will load and resolve the following hierarchy of configuration files:
-
-1. Reference application configuration: [**resource:reference.conf**](https://github.com/citlab/peel/blob/master/peel-core/src/main/resources/reference.conf).
-1. For each **system** with bean id **systemID**:
-    1. Reference system configuration [**reference.${system}.conf**](https://github.com/citlab/peel/blob/master/peel-extensions/src/main/resources).
-    1. Custom bean configuration located at **${app.path.config}/${systemID}.conf** (optional).
-    1. Host-specific custom bean configuration located at **${app.path.config}/${hostname}/${systemID}.conf** (optional).
-1. Custom application configuration located at **${app.path.config}/application.conf** (optional).
-1. Custom application configuration located at **${app.path.config}/${hostname}/application.conf** (optional).
-1. Experiment configuration defined in the **config** argument of the current experiment.
-1. The Java system properties.
-
-### Example: WordCount
-
-This section contains some examples based on the example **wordcount** experiments file.
-
-To execute **run #1** from the **wc.single-run** experiment in the **wc.default** suite, type:
-
-```bash
-./peel exp:setup wc.default wc.single-run --experiments ./config/experiments.wordcount.xml
-./peel exp:run wc.default wc.single-run --experiments ./config/experiments.wordcount.xml --run 1 --just
-./peel exp:teardown wc.default wc.single-run --experiments ./config/experiments.wordcount.xml
-```
-
-To execute all runs from all experiments in the suite, type:
-
-```bash
-./peel suite:run --experiments ./config/experiments.wordcount.xml wc.default
-```