author: Determinant <ted.sybil@gmail.com>, 2016-06-08 14:35:57 +0800
committer: Determinant <ted.sybil@gmail.com>, 2016-06-08 14:35:57 +0800
commit: b7cdd5da65a3e4ae58ffcfdf74710cfb1ee6327f
tree: 26f9d18391450052d7d1c99262761b1bee510672
parent: d88a57f4852c50a2678de950ee650ed9b6a895f0
add more doc
-rw-r--r-- README.rst 20
-rw-r--r-- TODO.rst 4
-rw-r--r-- nerv/doc/source/coding-convention.rst 2
-rw-r--r-- nerv/doc/source/collaboration-rules.rst 203
-rw-r--r-- nerv/doc/source/dev.rst 6
-rw-r--r-- nerv/doc/source/index.rst 30
6 files changed, 236 insertions, 29 deletions
diff --git a/README.rst b/README.rst
index 5e04a07..7f101af 100644
--- a/README.rst
+++ b/README.rst
@@ -8,14 +8,20 @@ Installation
First, make sure you have at least one implementation of BLAS and CUDA installed
on your computer.
-- Checkout NERV:
+- Clone NERV:
::
bash
git clone https://speechlab.sjtu.edu.cn/gitlab/nerv-dev/nerv.git
-- Checkout submodules (luajit, luarocks, Penlight, etc.):
+- Checkout the latest tagged version (please change the tag name to the
+ latest):
+
+ ::
+ git checkout beta-1.21
+
+- Download submodules (luajit, LuaRocks, Penlight, etc.):
::
@@ -43,9 +49,10 @@ on your computer.
::
- # checkout speech repository to local directory nerv/speech (suppose you're
- # still at the root directory of NERV repo)
+ # clone and checkout speech repository to local directory nerv/speech
+ # (suppose you're still at the root directory of NERV repo)
git clone https://speechlab.sjtu.edu.cn/gitlab/nerv-dev/nerv-speech.git speech
+ git -C speech checkout beta-1.21 # please change the tag name to the latest
# build and install HTK I/O support, Kaldi I/O support, Kaldi decoding support, etc.
make speech BLAS_TYPE=mkl BLAS_BASE=/home/intel/mkl/lib/intel64/ KALDI_BASE=/speechlab/tools/KALDI/kaldi-master/
@@ -60,5 +67,6 @@ request (merge request) to the administrator of the project. If you want to fix
any bugs in existing code, don't hesitate to create a pull (merge) request to
the repository with clear and detailed analysis of the problem. If you want to
add additional task-specific functionalities (modules) for speech to NERV,
-please create a luarocks-compliant package and also a pull (merge) request to
-the ``nerv-speech`` repository instead of ``nerv``.
+please create a LuaRocks-compliant package and also a pull (merge) request to
+the ``nerv-speech`` repository instead of ``nerv``. Please refer to the
+collaboration rules in NERV's doc.
diff --git a/TODO.rst b/TODO.rst
index 7ce606d..56747a8 100644
--- a/TODO.rst
+++ b/TODO.rst
@@ -1,7 +1,7 @@
TODO List
---------
-- NERV user manual
-- NERV overview and introduction
+- NERV user manual (on-going)
+- NERV overview and introduction (done)
- C header file dependency detection in Makefiles
- remove layer ``batch_resize`` API?
diff --git a/nerv/doc/source/coding-convention.rst b/nerv/doc/source/coding-convention.rst
new file mode 100644
index 0000000..8e30dea
--- /dev/null
+++ b/nerv/doc/source/coding-convention.rst
@@ -0,0 +1,2 @@
+Coding Convention
+=================
diff --git a/nerv/doc/source/collaboration-rules.rst b/nerv/doc/source/collaboration-rules.rst
new file mode 100644
index 0000000..b7126c9
--- /dev/null
+++ b/nerv/doc/source/collaboration-rules.rst
@@ -0,0 +1,203 @@
+Collaboration Rules
+===================
+
+Introduction
+------------
+
+This document stipulates the rules and typical workflows that push forward
+NERV development. It may be updated or complemented with more details in the
+future. Anyone who intends to contribute to the official repository must read
+this document before making any pull request to the development group or
+merging changes into the repository with permission.
+
+
+Repository
+----------
+
+The latest stable and on-going code is hosted at SpeechLab and maintained with
+the help of Git, a distributed version control system. Despite the distributed
+nature of the tool, our project management is centralized, just like Linux
+kernel development, which was the original use case of Git. The NERV project,
+in a general sense, includes two major sub-projects, named ``nerv`` and
+``nerv-speech``. The former contains the core part of NERV, which is a general
+deep learning implementation. The latter, ``nerv-speech``, provides modules
+(classes) that comply with the API of core NERV and offer support (such as
+I/O) relevant to speech and language processing (such as reading HTK/Kaldi
+features and labels).
+
+Like Torch, NERV uses LuaRocks_ to manage optional components as *packages*.
+When running ``make`` in the ``nerv`` repository root, LuaRocks and LuaJIT
+(compiler) are set up first, then a LuaRocks package named ``nerv`` is
+installed via LuaRocks. That is to say, the core part of NERV is contained in
+a single LuaRocks package, ``nerv``. Next, by invoking ``make speech``, several
+speech processing packages (such as ``htk_io``, ``kaldi_io``, etc.) are
+compiled and installed from ``nerv/speech``, which ought to be checked out from
+the ``nerv-speech`` repository. Therefore, thanks to the flexibility of Lua and
+the modularity brought by LuaRocks, new functionalities can be added to NERV
+and managed in a clear way by building a self-contained LuaRocks package with
+possible dependencies on ``nerv`` or other packages. The package system
+provides good isolation so that the contributions can be better managed and
+decoupled from core NERV.
+
+.. _LuaRocks: https://luarocks.org/
+
+Isolation vs. Completeness
+--------------------------
+
+The loosely organized nature of Lua and the package manager LuaRocks give us
+many possibilities in abstraction and collaboration. However, since no typical
+patterns are really enforced by the Lua language, we cannot merely hope that
+the compiler or interpreter will regulate the implementation by all
+contributors. As mentioned in NERV's overview document, one problem of Torch is
+that it strives to isolate components and wrap each of them up into a separate
+LuaRocks package, which is seemingly a good choice for collaboration, but not
+very wise in the long run. Such a methodology of "collaboration" means no
+collaboration at all. Under it, each user has to build her/his own package and
+is reluctant to merge others' code. This leads to a shrinking shared code base
+and erodes the completeness of a toolkit.
+
+When a new functionality is being added to NERV, there are several approaches,
+each with its merits and demerits. Therefore, here, we describe each
+possibility and stipulate under which conditions the contributor should resort
+to it.
+
+- A gentle *modification* (mod or "hacking"): just as those in video games, a
+ mod is like a temporary patch applied to the original toolkit that slightly
+ *overrides* some default features or behaviors. Thanks to the looseness of
+ Lua, any NERV component can be altered or overridden by simply redefining the
+ set of functions or classes that should be modified in the user script after
+ loading the default ones. These modifications are only legal in user scripts,
+ reflecting the difference between a task-specific user script and the
+ standard one. The advantage of such an approach is that it confines the
+ modifications to one place, so all users can use the same toolkit code base
+ while leaving modifications visible to others, rather than hacking the
+ official source directly and individually, which ends up in diverging code
+ bases that cannot be shared and whose differences are hard to detect and sync.
+
+ We encourage end users to try this way first if the default behavior of
+ NERV cannot be changed to suit their needs due to limited options or
+ generality. No matter how general your alternative approaches are, try this
+ at first to make sure your implementation works as expected without touching
+ the shared code base. After that, if your modifications are meaningful for
+ many other tasks, that is, general enough, please abstract out the
+ non-task-specific part and consider contributing it directly to the shared
+ code base (taking one of the other approaches listed below).
+
+- Making a *LuaRocks package*: a LuaRocks package is meant to be shared among the
+ users who demand an extra common functionality:
+
+ - which is not generally needed by the majority (e.g., an unusual network
+ structure or training method, etc.), or
+ - which is experimental, so temporarily cannot be merged into NERV (due to
+ some implementation or stability issues), or
+ - which is naturally a self-contained or de-coupled extension for NERV (e.g.,
+ I/O readers), or
+ - which contains modifications or feature enhancements written in not only Lua
+ but also C/C++ (e.g., efficient data processing or new layer computations).
+
+ Please note that making a hybrid LuaRocks package containing C/C++
+ implementations might be a little difficult for contributors who are not
+ very familiar with writing a ``Makefile`` or similar C/C++ build
+ scripts. However, it is extremely easy to write a pure-Lua package or to
+ convert an above-mentioned modification into a valid package.
+
+- Making a Git *branch* from "master": this measure is usually taken by
+ developers or contributors who know the NERV internals well. This
+ branching technique can be used under the following circumstances.
+
+ - Core developers make major changes to NERV that can possibly break the
+ existing functionalities.
+ - Core developers merge major changes from pull requests.
+ - Contributors make contributions in C/C++ code.
+ - Contributors submit their LuaRocks packages.
+ - End users need to locally modify the C/C++ code to change the default behavior
+ (these branches will only exist in their local repositories and are less
+ likely to be merged into the official master branch unless they generalize
+ them and send pull requests to core developers).
+
+ Contributors should keep the changes in their branches clear and should not
+ make changes that can only run correctly on their own tasks or with
+ particular settings, nor should they break the existing functionalities of
+ NERV. The developers need to carefully review and qualify the changes by
+ understanding the meaning of each line of code as well as its possible
+ side effects and, if there are any, leave comments to explain them.
+
+- Duplicate the code: this is only for testing or personal use. It is *NOT* a
+ way to collaborate or contribute.
+
+When making a Lua modification or LuaRocks package as mentioned, end users or
+contributors should always keep in mind the following principles:
+
+- Try to disentangle the original issue by abstraction.
+- Try to consider whether the solution could be generalized to solve others' problems.
+- Try to override the default components (implemented by functions, classes) as
+ "high-level" as possible. For example, when there is an opportunity to
+ achieve your goal by hacking a trainer (scheduler), *DO NOT* change
+ implementations for layers or buffers or even CUDA implementation. When there
+ is a chance of changing one function of a trainer, *DO NOT* re-implement the
+ whole trainer.
+- Try to follow the coding convention in the official code.
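The "override as high-level as possible" principle can be illustrated with a small language-neutral sketch. It is written in Python rather than Lua, and the `Trainer` class and its methods are hypothetical stand-ins, not NERV's actual API. The point is only that the user script redefines a single function after the defaults are loaded, instead of re-implementing the whole trainer.

```python
class Trainer:
    """Hypothetical stand-in for a toolkit-provided trainer (NOT NERV's API)."""

    def learning_rate(self, epoch):
        # Default behavior shipped with the toolkit: a constant rate.
        return 1.0

    def train(self, epochs):
        # The rest of the trainer stays untouched by the user.
        return [self.learning_rate(epoch) for epoch in range(epochs)]


# --- user script: loaded after the defaults ---
def halving_learning_rate(self, epoch):
    # Task-specific change: halve the rate at every epoch.
    return 1.0 * (0.5 ** epoch)


# Override exactly one function; train() and everything else are reused as-is.
Trainer.learning_rate = halving_learning_rate

rates = Trainer().train(3)
```

Because the change lives in the user script, the shared code base stays identical for everyone and the modification remains visible in one place.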
+
+Workflows
+---------
+
+- End users usually slightly adjust the behavior of NERV via *modifications* if
+ options do not help much. These mods are only for local use.
+
+- For a contributor, when there is a common need of an additional
+ functionality:
+
+ 1. Fork the ``nerv-speech``: make a local branch with a concise name consisting
+ of only lowercase letters, digits or hyphens (regex: ``[a-z][a-z0-9-]*``).
+
+ 2. Generalize your modifications into a LuaRocks package (naming convention:
+ ``[a-z][a-z0-9_]*``).
+
+ 3. Put the LuaRocks package as a new directory under the root directory of
+ ``nerv-speech``. Include possible tutorials in ``/tutorial`` if any.
+ Package documents should be located in the ``doc`` directory of your
+ package. All documents should be in plain-text format; human-readable
+ lightweight markup formats, such as Markdown or reStructuredText, are
+ preferred. *DO NOT* change other directories in
+ ``nerv-speech``.
+
+ 4. Commit your changes with a brief but meaningful message. Try to squash your
+ commits into a single commit if there are too many. Avoid meaningless
+ messages such as "...".
+
+ 5. Send a pull request of your branch to the developers.
+
+- For those contributors interested in contributing to core NERV:
+
+ 1. Fork the ``nerv``: make a local branch with a concise name consisting
+ of only lowercase letters, digits or hyphens (regex: ``[a-z][a-z0-9-]*``).
+
+ 2. Make changes.
+ 3. Commit your changes with a brief but meaningful message. Try to squash your
+ commits into a single one if there are too many. Avoid meaningless
+ messages such as "...".
+
+ 4. Send a pull request of your branch to the developers.
+
+- Developers may only merge tested code that follows the appropriate coding
+ convention.
+
+- A stable release is denoted by a Git tag with version number as its name.
+- The version number is in the format of: ``<prefix>-<major number>.<minor
+ number>``, where the ``<prefix>-`` and ``.<minor number>`` are optional. Here
+ are some examples:
+
+ - ``alpha-1``
+ - ``alpha-1.1``
+ - ``alpha-4``
+ - ``beta-1.2``
+ - ``beta-1.21``
+ - ``1.0``
+
+- For a given version, the complete release is the commit tagged by the largest
+ version number which does not exceed the given number in both repositories,
+ i.e., ``nerv`` and ``nerv-speech``. For general use, end users should check
+ out the tag with the largest version number in both repositories. For how to
+ check out, please refer to ``README.rst`` in ``nerv``.
+
+- Developers must test major tasks on the version that is going to be tagged.
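The naming and versioning rules in the workflow list above can be checked mechanically. The following is a minimal sketch, not part of NERV. The two name patterns and the tag format are taken from the rules; the function names are hypothetical. Two points are assumptions that the rules do not state: prefixes are ordered `alpha` < `beta` < no prefix, and the minor number is compared as an integer (so `beta-1.21` is later than `beta-1.2`), with a missing minor number treated as 0.

```python
import re

# Naming patterns quoted from the workflow rules.
BRANCH_RE = re.compile(r"[a-z][a-z0-9-]*")
PACKAGE_RE = re.compile(r"[a-z][a-z0-9_]*")

# Tag format: <prefix>-<major>.<minor>; "<prefix>-" and ".<minor>" are optional.
TAG_RE = re.compile(r"(?:([a-z]+)-)?(\d+)(?:\.(\d+))?")

# ASSUMPTION (not stated in the rules): alpha < beta < unprefixed release.
PREFIX_RANK = {"alpha": 0, "beta": 1, None: 2}


def is_valid_branch_name(name):
    return BRANCH_RE.fullmatch(name) is not None


def is_valid_package_name(name):
    return PACKAGE_RE.fullmatch(name) is not None


def parse_tag(tag):
    """Return a sortable key for a version tag, or None if it is malformed."""
    m = TAG_RE.fullmatch(tag)
    if m is None or m.group(1) not in PREFIX_RANK:
        return None
    prefix, major, minor = m.groups()
    # ASSUMPTION: minor numbers compare as integers; a missing minor is 0.
    return (PREFIX_RANK[prefix], int(major), int(minor or 0))


def latest_not_exceeding(tags, wanted):
    """Pick the largest tag in `tags` that does not exceed `wanted`."""
    limit = parse_tag(wanted)
    if limit is None:
        raise ValueError("not a valid version tag: %r" % wanted)
    keyed = [(parse_tag(t), t) for t in tags]
    eligible = [(key, t) for key, t in keyed if key is not None and key <= limit]
    return max(eligible)[1] if eligible else None


example_tags = ["alpha-1", "alpha-1.1", "alpha-4", "beta-1.2", "beta-1.21", "1.0"]
chosen = latest_not_exceeding(example_tags, "beta-1.5")
```

Since a complete release spans both repositories, `latest_not_exceeding` would be called once per repository, each time with that repository's own tag list (for instance, the output of `git tag`).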
diff --git a/nerv/doc/source/dev.rst b/nerv/doc/source/dev.rst
index 30311a2..0b4661c 100644
--- a/nerv/doc/source/dev.rst
+++ b/nerv/doc/source/dev.rst
@@ -1,3 +1,7 @@
Development Manual
==================
-To be filled.
+
+.. toctree::
+
+ collaboration-rules
+ coding-convention
diff --git a/nerv/doc/source/index.rst b/nerv/doc/source/index.rst
index 24d1fe2..097344a 100644
--- a/nerv/doc/source/index.rst
+++ b/nerv/doc/source/index.rst
@@ -17,26 +17,16 @@ Contents:
TODO List
---------
-+----------+--------------------------------------------------------------------------+-------------+
-| Status/ | Task | Assignee |
-| Priority | | |
-+==========+==========================================================================+=============+
-| High | Generalize nerv.Matrix to nerv.Tensor (use the same API as Torch Tensor) | Mengxiao Bi |
-+----------+--------------------------------------------------------------------------+-------------+
-| High | Development manual: coding style & contribution rules | Ted Yin |
-+----------+--------------------------------------------------------------------------+-------------+
-| High | Development manual: Error reporting & Logging standard | Ted Yin |
-+----------+--------------------------------------------------------------------------+-------------+
-| High | support for basic RNN | Tianxing He |
-+----------+--------------------------------------------------------------------------+-------------+
-| High | support for RNN/LSTM | Tianxing He |
-+----------+--------------------------------------------------------------------------+-------------+
-| High | support for CNN | Mengxiao Bi |
-+----------+--------------------------------------------------------------------------+-------------+
-| Mid | User manual | ALL |
-+----------+--------------------------------------------------------------------------+-------------+
-| Low | Development manual: general reference | N/A |
-+----------+--------------------------------------------------------------------------+-------------+
++----------+--------------------------------------------------------------------------+----------+
+| Status/ | Task | Assignee |
+| Priority | | |
++==========+==========================================================================+==========+
+| On-going | Development manual: coding style & contribution rules | Ted Yin |
++----------+--------------------------------------------------------------------------+----------+
+| High | Generalize nerv.Matrix to nerv.Tensor (use the same API as Torch Tensor) | TBD. |
++----------+--------------------------------------------------------------------------+----------+
+| High | Merge the CNN branch | TBD. |
++----------+--------------------------------------------------------------------------+----------+
Indices and tables
==================