-rw-r--r-- | scripts/deploy/README.rst | 79 |
1 files changed, 79 insertions, 0 deletions
diff --git a/scripts/deploy/README.rst b/scripts/deploy/README.rst
new file mode 100644
index 0000000..69eb1cb
--- /dev/null
+++ b/scripts/deploy/README.rst
@@ -0,0 +1,79 @@
+How Do I Reproduce Your Key Result in the Paper?
+================================================
+
+Step 1 - Environment and Dependencies
+=====================================
+
+Local Environment
+-----------------
+
+- We assume you have the latest ansible_ installed on your work computer (which
+  could be your laptop or home computer).
+- On your work computer, you have cloned the latest ``libhotstuff`` repo and
+  updated all submodules (if unsure, run ``git submodule update --init
+  --recursive``). Your shell should now be in the ``/scripts/deploy`` directory
+  of the repo (``cd <path-to-your-libhotstuff-repo>/scripts/deploy``).
+
+Remote Environment
+------------------
+
+- In this example, we use a typical Linux image, Ubuntu 18.04, on Amazon EC2,
+  but in general any machine with Ubuntu 18.04 installed should work.
+- We assume you have already properly configured the internal network for the
+  machines that participate in our experiment. These include some replica
+  machines (machines dedicated to running replica processes) and several client
+  machines.
+
+  - Replica machines should be able to talk to each other via TCP ports
+    starting from 10000 (the default value generated by ``gen_conf.py``, which
+    can be changed).
+  - Each client machine should be able to talk to all replica machines via TCP
+    ports starting from 20000.
+
+  - NOTE: In our paper, we used ``c5.4xlarge`` instances to match the configuration of our baselines.
+
+Step 2 - Generate the Deployment Setup
+======================================
+
+- Edit both ``replicas.txt`` and ``clients.txt``:
+
+  - ``replicas.txt``: each line is the external IP and local IP separated by
+    one or more spaces. The external IP is used for control actions between
+    your work computer and the replica machines, whereas the local IP is the
+    address used in your inter-replica network infrastructure, over which
+    replicas establish TCP connections with one another.
+  - ``clients.txt``: each line is a single external IP.
+  - The same IP can appear multiple times in both files. In this case, you will
+    share the same machine among different processes (not recommended for
+    replicas, for performance reasons).
+
+- Generate ``node.ini`` and ``hotstuff.gen.*.conf`` by running ``./gen_all.sh``.
+- Change the SSH key configuration in ``group_vars/all.yml``.
+- Build ``libhotstuff`` on all remote machines by running ``./run.sh setup``.
+
+Step 3 - Run the Experiment
+===========================
+
+- (optional) Change the parameters in ``hotstuff.gen.conf`` to your liking.
+- (optional) Change the parameters in ``group_vars/clients.yml`` to your liking.
+- (for replicas) Create a new experiment run and start all replica processes by running ``./run.sh new myrun1``.
+- (wait until all replica processes settle down; on a good network like EC2, 10 seconds should be more than enough)
+- (for clients) Create a new experiment run and start all client processes by running ``./run_cli.sh new myrun1_cli``.
+- (wait until all commands are submitted, or until you simply want to end the experiment)
+- To collect the results, run ``./run_cli.sh stop myrun1_cli`` and then ``./run_cli.sh fetch myrun1_cli``.
+- To analyze the results, run ``cat myrun1_cli/remote/*/log/stderr | python ../thr_hist.py``.
+- Finally, stop the replicas: ``./run.sh stop myrun1``.
+
+Other Notes
+===========
+
+- Each ``./run.sh new`` (same for ``./run_cli.sh``) will create a folder that
+  contains everything (chosen parameters, raw results) for the run. A good
+  practice is to always use a new name for each run, so that all of your
+  previous experiments are preserved.
+- The ``run.sh`` script does NOT detect whether there is some other unfinished
+  run (it does, however, prevent you from messing up the state of the same run,
+  given an id like "myrun1"), so you need to make sure you always ``stop``
+  (exit gracefully, with all results available) or ``reset`` (simply kill all
+  processes) any previous runs before starting fresh.
+- To check whether the processes are still alive: ``./run.sh check myrun1``.
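
The Step 3 sequence in the README can be sketched as one shell function, which may help when repeating runs. This is a minimal sketch, not part of the commit: the names ``run``, ``run_experiment``, and ``DRY_RUN`` are hypothetical, and the client duration (default 60 seconds) is an assumption, since the README only says to wait until all commands are submitted. The commands themselves are exactly those listed in Step 3, and the script assumes it is run from the deploy directory after Step 2 has completed.

```shell
#!/bin/sh
# Sketch of the README's Step 3 workflow (hypothetical wrapper, not in the repo).
set -eu

# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Usage: run_experiment <run-id> [settle-seconds] [duration-seconds]
run_experiment() {
    id="$1"
    settle="${2:-10}"    # README: 10 s should be more than enough on EC2
    duration="${3:-60}"  # assumption: how long clients are left to submit commands

    run ./run.sh new "$id"               # start all replica processes
    run sleep "$settle"                  # let the replicas settle down
    run ./run_cli.sh new "${id}_cli"     # start all client processes
    run sleep "$duration"
    run ./run_cli.sh stop "${id}_cli"    # stop clients so results are available
    run ./run_cli.sh fetch "${id}_cli"   # copy results to the work computer

    # Analyze the fetched client logs with the repo's thr_hist.py.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '+ %s\n' "cat ${id}_cli/remote/*/log/stderr | python ../thr_hist.py"
    else
        cat "${id}_cli"/remote/*/log/stderr | python ../thr_hist.py
    fi

    run ./run.sh stop "$id"              # finally, stop the replicas
}
```

Calling ``run_experiment myrun1`` after setting ``DRY_RUN=1`` prints the commands without touching any machine, which is a cheap way to check the run names before a real run. Because ``run.sh`` does not detect other unfinished runs, a real invocation should still be preceded by ``stop`` or ``reset`` of any earlier run, as the notes in the README describe.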