add more doc

author: Determinant <ted.sybil@gmail.com> 2016-06-08 14:35:57 +0800
committer: Determinant <ted.sybil@gmail.com> 2016-06-08 14:35:57 +0800
commit: b7cdd5da65a3e4ae58ffcfdf74710cfb1ee6327f (patch)
tree: 26f9d18391450052d7d1c99262761b1bee510672
parent: d88a57f4852c50a2678de950ee650ed9b6a895f0 (diff)
6 files changed, 236 insertions, 29 deletions
diff --git a/README.rst b/README.rst
index 5e04a07..7f101af 100644
--- a/README.rst
+++ b/README.rst
@@ -8,14 +8,20 @@ Installation
 First, make sure you have at least one implementation of BLAS and CUDA installed
 on your computer.
 
-- Checkout NERV:
+- Clone NERV:
   
   ::
 
     bash
     git clone https://speechlab.sjtu.edu.cn/gitlab/nerv-dev/nerv.git
 
-- Checkout submodules (luajit, luarocks, Penlight, etc.):
+- Check out the latest tagged version (replace the tag name with the latest):
+
+  ::
+
+    git checkout beta-1.21
+
+- Download submodules (luajit, LuaRocks, Penlight, etc.):
 
   ::
 
@@ -43,9 +49,10 @@ on your computer.
 
   ::
 
-    # checkout speech repository to local directory nerv/speech (suppose you're
-    # still at the root directory of NERV repo)
+    # clone and checkout speech repository to local directory nerv/speech
+    # (suppose you're still at the root directory of NERV repo)
     git clone https://speechlab.sjtu.edu.cn/gitlab/nerv-dev/nerv-speech.git speech
+    git -C speech checkout beta-1.21 # please change the tag name to the latest
     # build and install HTK I/O support, Kaldi I/O support, Kaldi decoding support, etc.
     make speech BLAS_TYPE=mkl BLAS_BASE=/home/intel/mkl/lib/intel64/ KALDI_BASE=/speechlab/tools/KALDI/kaldi-master/
 
@@ -60,5 +67,6 @@ request (merge request) to the administrator of the project. If you want to fix
 any bugs in existing code, don't hesitate to create a pull (merge) request to
 the repository with clear and detailed analysis of the problem. If you want to
 add additional task-specific functionalities (modules) for speech to NERV,
-please create a luarocks-compliant package and also a pull (merge) request to
-the ``nerv-speech`` repository instead of ``nerv``.
+please create a LuaRocks-compliant package and also a pull (merge) request to
+the ``nerv-speech`` repository instead of ``nerv``. Please refer to the
+collaboration rules in NERV's documentation.
diff --git a/TODO.rst b/TODO.rst
index 7ce606d..56747a8 100644
--- a/TODO.rst
+++ b/TODO.rst
@@ -1,7 +1,7 @@
 TODO List
 ---------
 
-- NERV user manual
-- NERV overview and introduction
+- NERV user manual (on-going)
+- NERV overview and introduction (done)
 - C header file dependency detection in Makefiles
 - remove layer ``batch_resize`` API?
diff --git a/nerv/doc/source/coding-convention.rst b/nerv/doc/source/coding-convention.rst
new file mode 100644
index 0000000..8e30dea
--- /dev/null
+++ b/nerv/doc/source/coding-convention.rst
@@ -0,0 +1,2 @@
+Coding Convention
+=================
diff --git a/nerv/doc/source/collaboration-rules.rst b/nerv/doc/source/collaboration-rules.rst
new file mode 100644
index 0000000..b7126c9
--- /dev/null
+++ b/nerv/doc/source/collaboration-rules.rst
@@ -0,0 +1,203 @@
+Collaboration Rules
+===================
+
+Introduction
+------------
+
+This document stipulates the rules and typical workflows that drive NERV
+development. It may be updated or supplemented with more details in the
+future. Anyone who intends to contribute to the official repository must read
+this document before making any pull request to the development group or
+merging changes into the repository with permission.
+
+
+Repository
+----------
+
+The latest stable and ongoing code is hosted at SpeechLab and maintained with
+Git, a distributed version control system. Despite the distributed nature of
+the tool, our project management is centralized, just like Linux kernel
+development, which was the original use case of Git. The NERV project, in a
+general sense, includes two major sub-projects, ``nerv`` and ``nerv-speech``.
+The former contains the core part of NERV, a general deep learning
+implementation. The latter, ``nerv-speech``, provides modules (classes) that
+comply with the API of core NERV and offer support (such as I/O) relevant to
+speech and language processing (such as reading HTK/Kaldi features and
+labels).
+
+Like Torch, NERV uses LuaRocks_ to manage optional components as *packages*.
+When ``make`` is run in the ``nerv`` repository root, LuaRocks and LuaJIT
+(compiler) are set up first, and then a LuaRocks package named ``nerv`` is
+installed via LuaRocks. That is to say, the core part of NERV is contained in
+a single LuaRocks package, ``nerv``. Next, invoking ``make speech`` compiles
+and installs several speech processing packages (such as ``htk_io``,
+``kaldi_io``, etc.) from ``nerv/speech``, which ought to be cloned from the
+``nerv-speech`` repository. Therefore, thanks to the flexibility of Lua and
+the modularity brought by LuaRocks, new functionality can be added to NERV
+and managed cleanly by building a self-contained LuaRocks package with
+possible dependencies on ``nerv`` or other packages. The package system
+provides good isolation, so contributions can be better managed and decoupled
+from core NERV.
+
+.. _LuaRocks: https://luarocks.org/
+
+Isolation vs. Completeness
+--------------------------
+
+The loosely organized nature of Lua and the package manager LuaRocks give us
+many possibilities for abstraction and collaboration. However, since the Lua
+language enforces no particular patterns, we cannot simply rely on the
+compiler or interpreter to regulate the implementations of all contributors.
+As mentioned in NERV's overview document, one problem of Torch is that it
+strives to isolate components and wrap each of them into a separate LuaRocks
+package. This seems a good choice for collaboration, but it is not very wise
+in the long run. Such "collaboration" amounts to no collaboration at all:
+each user tends to build her/his own package and is reluctant to merge
+others' code. This leads to a shrinking shared code base and erodes the
+completeness of the toolkit.
+
+When new functionality is being added to NERV, there are several possible
+approaches, each with its own merits and demerits. We therefore describe
+each approach here and stipulate the conditions under which a contributor
+should resort to it.
+
+- A gentle *modification* (mod or "hacking"): just like mods in video games, a
+  mod is a temporary patch applied to the original toolkit that slightly
+  *overrides* some default features or behaviors. Thanks to the looseness of
+  Lua, any NERV component can be altered or overridden by simply redefining,
+  in the user script, the functions or classes to be modified after the
+  default ones have been loaded. These modifications are only legal in user
+  scripts, where they reflect the difference between a task-specific user
+  script and the standard one. The advantage of this approach is that it
+  confines the modifications to one place, so all users share the same toolkit
+  code base while the modifications stay visible to others. Hacking the
+  official source directly and individually instead ends up in diverging code
+  bases that cannot be shared and whose differences are hard to detect and sync.
+
+  We encourage end users to try this approach first if the default behavior
+  of NERV cannot be changed to suit their needs due to limited options or
+  generality. No matter how general your alternative approach is, try this
+  first to make sure your implementation works as expected without touching
+  the shared code base. After that, if your modifications are meaningful for
+  many other tasks, that is, general enough, please abstract out the
+  non-task-specific part and consider contributing it directly to the shared
+  code base (using one of the other approaches listed below).
+
+- Making a *LuaRocks package*: a LuaRocks package is meant to be shared among
+  users who need some extra common functionality:
+
+  - which is not generally needed by the majority (e.g., an unusual network
+    structure or training method), or
+  - which is experimental, so it temporarily cannot be merged into NERV (due
+    to some implementation or stability issues), or
+  - which is naturally a self-contained or decoupled extension for NERV (e.g.,
+    I/O readers), or
+  - which contains modifications or enhancements written not only in Lua but
+    also in C/C++ (e.g., efficient data processing or new layer computations).
+
+  Please note that making a hybrid LuaRocks package containing C/C++
+  implementations might be a little difficult for contributors who are not
+  very familiar with writing a ``Makefile`` or similar C/C++ build scripts.
+  However, it is extremely easy to write a pure-Lua LuaRocks package or to
+  convert an above-mentioned modification into a valid package.
+
+- Making a Git *branch* from "master": this measure is usually taken by
+  developers or contributors who know the NERV internals well. This
+  branching technique can be used under the following circumstances:
+
+  - Core developers make major changes to NERV that may break existing
+    functionality.
+  - Core developers merge major changes from pull requests.
+  - Contributors make contributions in C/C++ code.
+  - Contributors submit their LuaRocks packages.
+  - End users need to locally modify the C/C++ code to change the default behavior
+    (these branches will only exist in their local repositories and are less
+    likely to be merged into the official master branch unless they generalize
+    them and send pull requests to core developers).
+
+  Contributors should keep the changes in their branches clear and should not
+  make changes that only run correctly on their own tasks or with particular
+  settings, nor should they break the existing functionality of NERV. The
+  developers need to carefully review and qualify the changes by
+  understanding the meaning of each line of code as well as any possible side
+  effects and, if there are any, leave comments to explain them.
+
+- Duplicating the code: this is only for testing or personal use. It is *NOT*
+  a way to collaborate or contribute.
+
+When making a Lua modification or LuaRocks package as mentioned, end users or
+contributors should always keep in mind the following principles:
+
+- Try to disentangle the original issue by abstraction.
+- Try to consider whether the solution could be generalized to solve others' problems.
+- Try to override the default components (implemented by functions or
+  classes) at as high a level as possible. For example, when there is an
+  opportunity to achieve your goal by hacking a trainer (scheduler), *DO NOT*
+  change the implementations of layers or buffers, or even the CUDA
+  implementation. When changing one function of a trainer would suffice,
+  *DO NOT* re-implement the whole trainer.
+- Try to follow the coding convention in the official code.
+
+Workflows
+---------
+
+- End users usually slightly adjust the behavior of NERV via *modifications* if
+  options do not help much. These mods are only for local use.
+
+- For a contributor, when there is a common need of an additional
+  functionality:
+  
+  1. Fork ``nerv-speech``: make a local branch with a concise name consisting
+     of only lowercase letters, digits or hyphens (regex: ``[a-z][a-z0-9-]*``).
+
+  2. Generalize your modifications into a LuaRocks package (naming convention:
+     ``[a-z][a-z0-9_]*``).
+     
+  3. Put the LuaRocks package in a new directory under the root directory of
+     ``nerv-speech``. Include tutorials, if any, in ``/tutorial``. Package
+     documents should be located in the ``doc`` directory of your package.
+     All documents should be in plain-text format; human-readable
+     lightweight markup formats, such as Markdown or reStructuredText, are
+     preferred. *DO NOT* change other directories in
+     ``nerv-speech``.
+
+  4. Commit your changes with a brief but meaningful message. Try to squash your
+     commits into a single commit if there are too many. Avoid meaningless
+     messages such as "...".
+
+  5. Send a pull request of your branch to the developers.
+
+- For those contributors interested in contributing to core NERV:
+
+  1. Fork ``nerv``: make a local branch with a concise name consisting
+     of only lowercase letters, digits or hyphens (regex: ``[a-z][a-z0-9-]*``).
+
+  2. Make changes.
+  3. Commit your changes with a brief but meaningful message. Try to squash your
+     commits into a single one if there are too many. Avoid meaningless
+     messages such as "...".
+
+  4. Send a pull request of your branch to the developers.
+
+- Developers may only merge tested code that follows the appropriate coding
+  convention.
+
+- A stable release is denoted by a Git tag with the version number as its name.
+- The version number has the format ``<prefix>-<major number>.<minor
+  number>``, where ``<prefix>-`` and ``.<minor number>`` are optional. Here
+  are some examples:
+
+  - ``alpha-1``
+  - ``alpha-1.1``
+  - ``alpha-4``
+  - ``beta-1.2``
+  - ``beta-1.21``
+  - ``1.0``
+
+- For a given version, the complete release consists of the commits tagged
+  with the largest version number not exceeding the given number in each of
+  the two repositories, ``nerv`` and ``nerv-speech``. For general use, end
+  users should check out the tags with the largest version number in both
+  repositories; for how to check out, please refer to ``README.rst`` in ``nerv``.
+
+- Developers must test major tasks on the version that is going to be tagged.
diff --git a/nerv/doc/source/dev.rst b/nerv/doc/source/dev.rst
index 30311a2..0b4661c 100644
--- a/nerv/doc/source/dev.rst
+++ b/nerv/doc/source/dev.rst
@@ -1,3 +1,7 @@
 Development Manual
 ==================
-To be filled.
+
+.. toctree::
+
+   collaboration-rules
+   coding-convention
diff --git a/nerv/doc/source/index.rst b/nerv/doc/source/index.rst
index 24d1fe2..097344a 100644
--- a/nerv/doc/source/index.rst
+++ b/nerv/doc/source/index.rst
@@ -17,26 +17,16 @@ Contents:
 TODO List
 ---------
 
-+----------+--------------------------------------------------------------------------+-------------+
-| Status/  | Task                                                                     | Assignee    |
-| Priority |                                                                          |             |
-+==========+==========================================================================+=============+
-| High     | Generalize nerv.Matrix to nerv.Tensor (use the same API as Torch Tensor) | Mengxiao Bi |
-+----------+--------------------------------------------------------------------------+-------------+
-| High     | Development manual: coding style & contribution rules                    | Ted Yin     |
-+----------+--------------------------------------------------------------------------+-------------+
-| High     | Development manual: Error reporting & Logging standard                   | Ted Yin     |
-+----------+--------------------------------------------------------------------------+-------------+
-| High     | support for basic RNN                                                    | Tianxing He |
-+----------+--------------------------------------------------------------------------+-------------+
-| High     | support for RNN/LSTM                                                     | Tianxing He |
-+----------+--------------------------------------------------------------------------+-------------+
-| High     | support for CNN                                                          | Mengxiao Bi |
-+----------+--------------------------------------------------------------------------+-------------+
-| Mid      | User manual                                                              | ALL         |
-+----------+--------------------------------------------------------------------------+-------------+
-| Low      | Development manual: general reference                                    | N/A         |
-+----------+--------------------------------------------------------------------------+-------------+
++----------+--------------------------------------------------------------------------+----------+
+| Status/  | Task                                                                     | Assignee |
+| Priority |                                                                          |          |
++==========+==========================================================================+==========+
+| On-going | Development manual: coding style & contribution rules                    | Ted Yin  |
++----------+--------------------------------------------------------------------------+----------+
+| High     | Generalize nerv.Matrix to nerv.Tensor (use the same API as Torch Tensor) | TBD.     |
++----------+--------------------------------------------------------------------------+----------+
+| High     | Merge the CNN branch                                                     | TBD.     |
++----------+--------------------------------------------------------------------------+----------+
 
 Indices and tables
 ==================