author Determinant <tederminant@gmail.com> 2020-08-30 01:57:21 -0400
committer Determinant <tederminant@gmail.com> 2020-08-30 01:57:21 -0400
commit 04ba288dac334697bcac42788c9fd603cf93a7d6
tree d61a61f5fa75a63158c6f971df94ba99754848b2
parent 4bb06b93ef16ec182ec24f38f214044615fc13df
add readme for deployment scripts
-rw-r--r-- scripts/deploy/README.rst 79
1 files changed, 79 insertions, 0 deletions
diff --git a/scripts/deploy/README.rst b/scripts/deploy/README.rst
new file mode 100644
index 0000000..69eb1cb
--- /dev/null
+++ b/scripts/deploy/README.rst
@@ -0,0 +1,79 @@
+How do I Reproduce Your Key Result in the Paper?
+================================================
+
+Step 1 - Environment and Dependencies
+=====================================
+
+Local Environment
+-----------------
+
+- We assume you have the latest `Ansible <https://www.ansible.com/>`_ installed
+  on your work computer (this could be your laptop or home computer).
+- On your work computer, you have cloned the latest ``libhotstuff`` repo and
+  updated all submodules (if unsure, run
+  ``git submodule update --init --recursive``). Your shell should now be in the
+  ``scripts/deploy`` directory
+  (``cd <path-to-your-libhotstuff-repo>/scripts/deploy``).
+
+Remote Environment
+------------------
+
+- In this example, we use a typical Linux image, Ubuntu 18.04, on Amazon EC2.
+  In general, though, any machine with Ubuntu 18.04 installed should work.
+- We assume you have already properly configured the internal network for the
+  machines that participate in the experiment. These include some replica
+  machines (machines dedicated to running replica processes) and several
+  client machines.
+
+  - Replica machines should be able to talk to each other on TCP ports
+    starting from 10000 (the default value generated by ``gen_conf.py``, which
+    can be changed).
+  - Each client machine should be able to talk to all replica machines on TCP
+    ports starting from 20000.
+
+  - NOTE: In our paper, we used ``c5.4xlarge`` instances to match the
+    configuration of our baselines.
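The connectivity requirements above can be spot-checked before starting an experiment. The following is a minimal sketch, not part of the deployment scripts: the helper names ``can_connect`` and ``unreachable`` are hypothetical, and a check only succeeds when a process is already listening on the target port, so a refused connection does not by itself prove that the network is misconfigured.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def unreachable(hosts, ports, timeout=2.0):
    """List every (host, port) pair that could not be reached."""
    return [(h, p) for h in hosts for p in ports if not can_connect(h, p, timeout)]
```

Run on a replica machine, a call such as ``unreachable(["10.0.0.2"], range(10000, 10004))`` would list the ports that cannot be reached on a peer; the address and the port count here are placeholders, since the exact ports in use depend on the generated configuration.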
+
+Step 2 - Generate the Deployment Setup
+======================================
+
+- Edit both ``replicas.txt`` and ``clients.txt``:
+
+  - ``replicas.txt``: each line is an external IP and a local IP separated by
+    one or more spaces. The external IP is used for control actions between
+    your work computer and the replica machines, whereas the local IP is the
+    address used in your inter-replica network infrastructure, over which
+    replicas establish TCP connections with each other.
+  - ``clients.txt``: each line is a single external IP.
+  - The same IP can appear multiple times in both files. In this case, the
+    same machine is shared among different processes (not recommended for
+    replicas, for performance reasons).
+
+- Generate ``node.ini`` and ``hotstuff.gen.*.conf`` by running ``./gen_all.sh``.
+- Change the ssh key configuration in ``group_vars/all.yml``.
+- Build ``libhotstuff`` on all remote machines by running ``./run.sh setup``.
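The file formats described in Step 2 are easy to get wrong by hand, so it may help to validate them before generating the configuration. This is a minimal sketch under stated assumptions: ``parse_replicas`` and ``parse_clients`` are hypothetical helpers (they are not part of ``gen_all.sh``), blank lines are assumed to be ignorable, and entries are assumed to be literal IP addresses rather than hostnames.

```python
import ipaddress


def parse_replicas(text):
    """Parse replicas.txt content into (external_ip, local_ip) pairs."""
    pairs = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        fields = line.split()  # splits on one or more whitespace characters
        if not fields:
            continue  # assumption: blank lines are ignored
        if len(fields) != 2:
            raise ValueError(
                f"replicas.txt line {lineno}: expected 2 fields, got {len(fields)}"
            )
        for addr in fields:
            ipaddress.ip_address(addr)  # raises ValueError if malformed
        pairs.append((fields[0], fields[1]))
    return pairs


def parse_clients(text):
    """Parse clients.txt content into a list of external IPs."""
    clients = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # assumption: blank lines are ignored
        if len(fields) != 1:
            raise ValueError(
                f"clients.txt line {lineno}: expected 1 field, got {len(fields)}"
            )
        ipaddress.ip_address(fields[0])
        clients.append(fields[0])
    return clients
```

Duplicate addresses are kept as they appear, because the same IP may legitimately be listed more than once to share a machine among several processes.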
+
+Step 3 - Run the Experiment
+===========================
+
+- (optional) Change the parameters in ``hotstuff.gen.conf`` to your liking.
+- (optional) Change the parameters in ``group_vars/clients.yml`` to your liking.
+- (for replicas) Create a new experiment run and start all replica processes by running ``./run.sh new myrun1``.
+- Wait until all replica processes settle down. On a good network such as EC2, 10 seconds should be more than enough.
+- (for clients) Create a new experiment run and start all client processes by running ``./run_cli.sh new myrun1_cli``.
+- Wait until all commands are submitted, or until you simply want to end the experiment.
+- To collect the results, run ``./run_cli.sh stop myrun1_cli`` and then ``./run_cli.sh fetch myrun1_cli``.
+- To analyze the results, run ``cat myrun1_cli/remote/*/log/stderr | python ../thr_hist.py``.
+- Finally, stop replicas: ``./run.sh stop myrun1``.
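The ordering of the Step 3 commands matters, so it can be convenient to script it. The sketch below is only an illustration: ``start_experiment`` and ``finish_experiment`` are hypothetical helpers, the ``<run>_cli`` naming simply generalizes the ``myrun1`` / ``myrun1_cli`` example, and the analysis step is left out. It assumes the current directory is ``scripts/deploy``.

```python
import subprocess
import time


def start_experiment(run_id, settle_seconds=10.0, runner=subprocess.run):
    """Start replicas, wait for them to settle, then start clients."""
    runner(["./run.sh", "new", run_id], check=True)
    time.sleep(settle_seconds)  # give replica processes time to settle down
    runner(["./run_cli.sh", "new", f"{run_id}_cli"], check=True)


def finish_experiment(run_id, runner=subprocess.run):
    """Stop clients, fetch their results, then stop replicas."""
    cli_id = f"{run_id}_cli"
    for cmd in (
        ["./run_cli.sh", "stop", cli_id],
        ["./run_cli.sh", "fetch", cli_id],
        ["./run.sh", "stop", run_id],
    ):
        runner(cmd, check=True)  # check=True aborts on the first failing step
```

A typical session would call ``start_experiment("myrun1")``, wait until enough commands have been submitted, and then call ``finish_experiment("myrun1")``. The ``runner`` parameter exists only so the sequence can be dry-run or tested without touching any remote machine.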
+
+Other Notes
+===========
+
+- Each ``./run.sh new`` (and likewise ``./run_cli.sh new``) creates a folder that
+  contains everything (chosen parameters, raw results) for the run. A good
+  practice is to always use a new name for each run, so that all your
+  previous experiments are preserved.
+- The ``run.sh`` script does NOT detect whether some other run is still
+  unfinished (it does, however, prevent you from messing up the state of the
+  same run, given an id like "myrun1"), so make sure you always ``stop``
+  (exit gracefully, with all results available) or ``reset`` (simply kill all
+  processes) any previous runs before starting fresh.
+- To check whether processes are still alive, run ``./run.sh check myrun1``.