add readme for deployment scripts

author: Determinant <tederminant@gmail.com> 2020-08-30 01:57:21 -0400
committer: Determinant <tederminant@gmail.com> 2020-08-30 01:57:21 -0400
commit: 04ba288dac334697bcac42788c9fd603cf93a7d6 (patch)
tree: d61a61f5fa75a63158c6f971df94ba99754848b2
parent: 4bb06b93ef16ec182ec24f38f214044615fc13df (diff)
1 files changed, 79 insertions, 0 deletions
diff --git a/scripts/deploy/README.rst b/scripts/deploy/README.rst
new file mode 100644
index 0000000..69eb1cb
--- /dev/null
+++ b/scripts/deploy/README.rst
@@ -0,0 +1,79 @@
+How do I Reproduce Your Key Result in the Paper?
+================================================
+
+Step 1 - Environment and Dependencies
+=====================================
+
+Local Environment
+-----------------
+
+- We assume you have the latest ansible_ installed on your work computer (could
+  be your laptop/home computer).
+- On your work computer, you have cloned the latest ``libhotstuff`` repo and
+  updated all submodules (if not sure, run ``git submodule update --init
+  --recursive``). Right now, you should be in the ``scripts/deploy`` directory
+  in your shell (``cd <path-to-your-libhotstuff-repo>/scripts/deploy``).
+
+Remote Environment
+------------------
+
+- In this example, we use a typical Linux image, Ubuntu 18.04, on Amazon EC2.
+  In general, though, any machine running Ubuntu 18.04 should work.
+- We assume you have already properly configured the internal network for the
+  machines that participate in our experiment. This includes some replica machines
+  (machines dedicated to running replica processes) and several client
+  machines.
+
+  - Replica machines should be able to talk to each other via TCP ports starting
+    from 10000 (the default value generated by ``gen_conf.py``, which can
+    be changed).
+  - Each client machine should be able to talk to all replica machines via TCP
+    ports starting from 20000.
+
+  - NOTE: In our paper, we used ``c5.4xlarge`` instances to match the config of our baselines.
+
+Step 2 - Generate the Deployment Setup
+======================================
+
+- Edit both ``replicas.txt`` and ``clients.txt``:
+
+  - ``replicas.txt``: each line is the external IP and local IP separated by
+    one or more spaces. The external IP will be used for control actions
+    between your work computer and replica machines, whereas the local IP is
+    the address used in your inter-replica network infrastructure, with which
+    replicas establish TCP connections with others.
+  - ``clients.txt``: each line is a single external IP.
+  - The same IP can appear multiple times in both files. In this case, you will
+    share the same machine among different processes (not recommended for
+    replicas due to performance reasons).
+
+- Generate ``node.ini`` and ``hotstuff.gen.*.conf`` by running ``./gen_all.sh``.
+- Change the ssh key configuration in ``group_vars/all.yml``.
+- Build ``libhotstuff`` on all remote machines by running ``./run.sh setup``.
+
+Step 3 - Run the Experiment
+===========================
+
+- (optional) Change the parameters in ``hotstuff.gen.conf`` to your liking.
+- (optional) Change the parameters in ``group_vars/clients.yml`` to your liking.
+- (for replicas) Create a new experiment run and start all replica processes by running ``./run.sh new myrun1``.
+- (wait a while until all replica processes settle down; on a good network such as EC2, 10 seconds should be more than enough)
+- (for clients) Create a new experiment run and start all client processes by running ``./run_cli.sh new myrun1_cli``.
+- (wait until all commands are submitted, or until you simply would like to end the experiment)
+- To collect the results, run ``./run_cli.sh stop myrun1_cli`` and then ``./run_cli.sh fetch myrun1_cli``.
+- To analyze the results, run ``cat myrun1_cli/remote/*/log/stderr | python ../thr_hist.py``.
+- Finally, stop replicas: ``./run.sh stop myrun1``.
+
+Other Notes
+===========
+
+- Each ``./run.sh new`` (same for ``./run_cli.sh``) will create a folder that
+  contains everything (chosen parameters, raw results) for the run. A good
+  practice is to always use a new name for each run, so that all of your
+  previous experiments are preserved.
+- The ``run.sh`` script does NOT detect whether there is some other unfinished
+  run (it does, however, prevent you from messing up the state of the same run,
+  given an id like "myrun1"), so you need to make sure you always ``stop``
+  (exit gracefully, so that all results are available) or ``reset`` (simply kill
+  all processes) any previous runs to start fresh.
+- To check whether processes are still alive: ``./run.sh check myrun1``.
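
The file formats described in Step 2 of the README above (two whitespace-separated IPs per line in ``replicas.txt``, a single IP per line in ``clients.txt``) can be checked before running ``./gen_all.sh`` with a small sketch like the one below. It is not part of the repository and does not reflect how ``gen_conf.py`` actually parses these files; the function names ``parse_replicas`` and ``parse_clients`` are hypothetical, and skipping blank lines is an assumption.

```python
import ipaddress


def parse_replicas(text):
    """Return (external_ip, local_ip) pairs, one per non-blank line.

    Hypothetical helper: fields are separated by one or more whitespace
    characters, as the README describes for replicas.txt.
    """
    pairs = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # assumption: blank lines are ignored
        if len(fields) != 2:
            raise ValueError(
                f"line {lineno}: expected 2 fields, got {len(fields)}")
        # ip_address() raises ValueError for anything that is not an IP.
        external, local = (str(ipaddress.ip_address(f)) for f in fields)
        pairs.append((external, local))
    return pairs


def parse_clients(text):
    """Return the external IPs listed in clients.txt, duplicates kept."""
    ips = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # assumption: blank lines are ignored
        if len(fields) != 1:
            raise ValueError(
                f"line {lineno}: expected 1 field, got {len(fields)}")
        ips.append(str(ipaddress.ip_address(fields[0])))
    return ips


if __name__ == "__main__":
    # Example addresses are placeholders from documentation/private ranges.
    print(parse_replicas("203.0.113.10   10.0.0.1\n203.0.113.11 10.0.0.2\n"))
    print(parse_clients("203.0.113.20\n203.0.113.20\n"))
```

Duplicates are kept on purpose, since the README allows the same IP to appear multiple times so that several processes share one machine.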