author | Determinant <[email protected]> | 2016-06-08 14:35:57 +0800 |
---|---|---|
committer | Determinant <[email protected]> | 2016-06-08 14:35:57 +0800 |
commit | b7cdd5da65a3e4ae58ffcfdf74710cfb1ee6327f (patch) | |
tree | 26f9d18391450052d7d1c99262761b1bee510672 | |
parent | d88a57f4852c50a2678de950ee650ed9b6a895f0 (diff) |
add more doc
-rw-r--r-- | README.rst | 20 | ||||
-rw-r--r-- | TODO.rst | 4 | ||||
-rw-r--r-- | nerv/doc/source/coding-convention.rst | 2 | ||||
-rw-r--r-- | nerv/doc/source/collaboration-rules.rst | 203 | ||||
-rw-r--r-- | nerv/doc/source/dev.rst | 6 | ||||
-rw-r--r-- | nerv/doc/source/index.rst | 30 |
6 files changed, 236 insertions, 29 deletions
@@ -8,14 +8,20 @@ Installation First, make sure you have at least one implementation of BLAS and CUDA installed on your computer. -- Checkout NERV: +- Clone NERV: :: bash git clone https://speechlab.sjtu.edu.cn/gitlab/nerv-dev/nerv.git -- Checkout submodules (luajit, luarocks, Penlight, etc.): +- Checkout the latest tagged version (please change the tag name to the + latest): + + :: + git checkout beta-1.21 + +- Download submodules (luajit, LuaRocks, Penlight, etc.): :: @@ -43,9 +49,10 @@ on your computer. :: - # checkout speech repository to local directory nerv/speech (suppose you're - # still at the root directory of NERV repo) + # clone and checkout speech repository to local directory nerv/speech + # (suppose you're still at the root directory of NERV repo) git clone https://speechlab.sjtu.edu.cn/gitlab/nerv-dev/nerv-speech.git speech + git checkout beta-1.21 # please change the tag name to the latest # build and install HTK I/O support, Kaldi I/O support, Kaldi decoding support, etc. make speech BLAS_TYPE=mkl BLAS_BASE=/home/intel/mkl/lib/intel64/ KALDI_BASE=/speechlab/tools/KALDI/kaldi-master/ @@ -60,5 +67,6 @@ request (merge request) to the administrator of the project. If you want to fix any bugs in existing code, don't hesitate to create a pull (merge) request to the repository with clear and detailed analysis of the problem. If you want to add additional task-specific functionalities (modules) for speech to NERV, -please create a luarocks-compliant package and also a pull (merge) request to -the ``nerv-speech`` repository instead of ``nerv``. +please create a LuaRocks-compliant package and also a pull (merge) request to +the ``nerv-speech`` repository instead of ``nerv``. Please refer to the +collaboration rules in NERV's doc. 
@@ -1,7 +1,7 @@ TODO List --------- -- NERV user manual -- NERV overview and introduction +- NERV user manual (on-going) +- NERV overview and introduction (done) - C header file dependency detection in Makefiles - remove layer ``batch_resize`` API? diff --git a/nerv/doc/source/coding-convention.rst b/nerv/doc/source/coding-convention.rst new file mode 100644 index 0000000..8e30dea --- /dev/null +++ b/nerv/doc/source/coding-convention.rst @@ -0,0 +1,2 @@ +Coding Convention +================= diff --git a/nerv/doc/source/collaboration-rules.rst b/nerv/doc/source/collaboration-rules.rst new file mode 100644 index 0000000..b7126c9 --- /dev/null +++ b/nerv/doc/source/collaboration-rules.rst @@ -0,0 +1,203 @@ +Collaboration Rules +=================== + +Introduction +------------ + +This document attempts to stipulate the rules and typical workflows that push +forward NERV development. It may be updated or complemented with more details +in the future. Anyone who intends to contribute to the official repository must +read this document before (s)he makes any pull requests to the development +group or merges the changes into the repository with permission. + + +Repository +---------- + +The latest stable and on-going code is hosted at SpeechLab and maintained with +the help of Git, a distributed version control system. Despite the "distributed" +nature of the tool, our project management is centralized, just like Linux +kernel development which was the original use case of Git. The NERV project, in +a general sense, includes two major sub-projects named ``nerv`` and +``nerv-speech``, respectively. The former contains the core part of NERV, which +contains a general deep learning implementation. The latter, ``nerv-speech``, +provides modules (classes) that comply with the API of core NERV and offer +support (such as I/O) that is relevant to speech and language processing +(such as reading HTK/Kaldi features and labels).
+ +Like Torch, NERV uses LuaRocks_ to manage optional components as *packages*. +When running ``make`` in the ``nerv`` repository root, LuaRocks and LuaJIT +(compiler) will first be set up, then a LuaRocks package named ``nerv`` will be +installed via LuaRocks; that is to say, the core part of NERV is +contained in a single LuaRocks package, ``nerv``. Next, by invoking ``make +speech``, several speech processing packages (such as ``htk_io``, ``kaldi_io``, +etc.) will be compiled and installed from ``nerv/speech``, which ought to be +checked out from the ``nerv-speech`` repository. Therefore, thanks to the +flexibility of Lua and the modularity brought by LuaRocks, new functionalities +can be added to NERV and managed in a clear way by building a self-contained +LuaRocks package with possible dependencies on ``nerv`` or other packages. The +package system provides good isolation so that the contributions can be +better managed and decoupled from core NERV. + +.. _LuaRocks: https://luarocks.org/ + +Isolation vs. Completeness +--------------------------- + +The loosely organized nature of Lua and the package manager LuaRocks give us +many possibilities in abstraction and collaboration. However, since no typical +patterns are really enforced by the Lua language, we cannot simply hope that +the compiler or interpreter will regulate the implementation by all +contributors. As mentioned in NERV's overview document, one problem of Torch is +that it strives to isolate components and wrap each of them up into different +LuaRocks packages, which is seemingly a good choice for collaboration but +not very wise in the long run. The methodology of such "collaboration" means no +collaboration at all. Under such a methodology, each user tends to build +her/his own package and is reluctant to merge others' code. This leads to +a smaller and smaller shared code base and erodes the completeness of a toolkit.
+ +When a new functionality is being added to NERV, there are several approaches, +each with its merits and demerits. Therefore, here, we describe each +possibility and stipulate under which conditions the contributor should resort +to it. + +- A gentle *modification* (mod or "hacking"): just like those in video games, a + mod is like a temporary patch applied to the original toolkit that slightly + *overrides* some default features or behaviors. Thanks to the looseness of + Lua, any NERV component can be altered or overridden by simply redefining the + set of functions or classes that should be modified in the user script after + loading the default ones. These modifications are only legal in user scripts, + reflecting the difference between a task-specific user script and the + standard one. The advantage of such an approach is that it confines the modifications + to one place so all users can use the same toolkit code base while leaving + modifications visible to others, rather than hacking the official source + directly and individually, which ends up in different code bases that cannot + be shared and in which modifications are difficult to detect when synchronizing the implementations. + + We encourage end users to try this way first if the default behavior of + NERV cannot be changed to suit your needs due to limited options or + generality. No matter how general your alternative approaches are, try this + first to make sure your implementation works as expected without touching + the shared code base. After that, if your modifications are meaningful for + many other tasks, that is, general enough, please abstract out the + non-task-specific part and consider contributing directly to the shared code base + (taking one of the other approaches listed below).
+ +- Making a *LuaRocks package*: a LuaRocks package is meant to be shared among the + users who demand an extra common functionality: + + - which is not generally needed by the majority (e.g., an unusual network + structure or training method), or + - which is experimental, so temporarily cannot be merged into NERV (due to + some implementation or stability issues), or + - which is naturally a self-contained or decoupled extension for NERV (e.g., + I/O readers), or + - which contains modifications or feature enhancements written in not only Lua but + also C/C++ (e.g., efficient data processing or new layer computations). + + Please note that making a hybrid LuaRocks package containing C/C++ + implementations might be a little difficult for contributors who are not + very familiar with writing a ``Makefile`` or similar C/C++ build + scripts. However, it is extremely easy to write a pure-Lua LuaRocks package or to + convert an above-mentioned modification into a valid package. + +- Making a Git *branch* from "master": this measure is usually taken by + developers or a contributor who knows the NERV internals well. This + branching technique can be used under the following circumstances. + + - Core developers make major changes to NERV that can possibly break the + existing functionalities. + - Core developers merge major changes from pull requests. + - Contributors make contributions in C/C++ code. + - Contributors submit their LuaRocks packages. + - End users need to locally modify the C/C++ code to change the default behavior + (these branches will only exist in their local repositories and are less + likely to be merged into the official master branch unless they generalize + them and send pull requests to core developers). + + Contributors should keep the changes in their branches clear and should not + make changes that can only run correctly on their own tasks or with + particular settings, nor should they break the existing functionalities of + NERV.
The developers need to carefully review and qualify the changes by + understanding the meaning of each line of code as well as the possible + side effects; if there are any, they should leave comments to explain them. + +- Duplicate the code: this is only for testing or personal use. It is *NOT* a + way to collaborate or contribute. + +When making a Lua modification or LuaRocks package as mentioned, end users or +contributors should always keep in mind the following principles: + +- Try to disentangle the original issue by abstraction. +- Try to consider whether the solution could be generalized to solve others' problems. +- Try to override the default components (implemented by functions, classes) as + "high-level" as possible. For example, when there is an opportunity to + achieve your goal by hacking a trainer (scheduler), *DO NOT* change + implementations for layers or buffers or even the CUDA implementation. When there + is a chance of changing only one function of a trainer, *DO NOT* re-implement the + whole trainer. +- Try to follow the coding convention in the official code. + +Workflows +--------- + +- End users usually slightly adjust the behavior of NERV via *modifications* if + options do not help much. These mods are only for local use. + +- For a contributor, when there is a common need for an additional + functionality: + + 1. Fork ``nerv-speech``: make a local branch with a concise name consisting + of only lowercase letters, digits or hyphens (regex: ``[a-z][a-z0-9-]*``). + + 2. Generalize your modifications into a LuaRocks package (naming convention: + ``[a-z][a-z0-9_]*``). + + 3. Put the LuaRocks package as a new directory under the root directory of + ``nerv-speech``. Include possible tutorials in ``/tutorial`` if any. + Package documents should be located in the ``doc`` directory of your + package. All documents should be in plain-text format; however, + human-readable lightweight markup formats are preferred, such as + Markdown or reStructuredText.
*DO NOT* change other directories in + ``nerv-speech``. + + 4. Commit your changes with a brief but meaningful message. Try to squash your + commits into a single commit if there are too many. Avoid meaningless + messages such as "...". + + 5. Send a pull request of your branch to the developers. + +- For those contributors interested in contributing to core NERV: + + 1. Fork ``nerv``: make a local branch with a concise name consisting + of only lowercase letters, digits or hyphens (regex: ``[a-z][a-z0-9-]*``). + + 2. Make changes. + 3. Commit your changes with a brief but meaningful message. Try to squash your + commits into a single one if there are too many. Avoid meaningless + messages such as "...". + + 4. Send a pull request of your branch to the developers. + +- Developers may only merge tested code that follows the appropriate coding + convention. + +- A stable release is denoted by a Git tag with the version number as its name. +- The version number is in the format of: ``<prefix>-<major number>.<minor + number>``, where the ``<prefix>-`` and ``.<minor number>`` are optional. Here + are some examples: + + - ``alpha-1`` + - ``alpha-1.1`` + - ``alpha-4`` + - ``beta-1.2`` + - ``beta-1.21`` + - ``1.0`` + +- For a given version, the complete release is the commit tagged by the largest + version number which does not exceed the given number in both repositories, + i.e., ``nerv`` and ``nerv-speech``. End users should check out the latest + version for general use, i.e., the tags with the largest version number in both + repositories; for how to check out, please refer to ``README.rst`` in ``nerv``. + +- Developers must test major tasks on the version that is going to be tagged. diff --git a/nerv/doc/source/dev.rst b/nerv/doc/source/dev.rst index 30311a2..0b4661c 100644 --- a/nerv/doc/source/dev.rst +++ b/nerv/doc/source/dev.rst @@ -1,3 +1,7 @@ Development Manual ================== -To be filled. + +..
toctree:: + + collaboration-rules + coding-convention diff --git a/nerv/doc/source/index.rst b/nerv/doc/source/index.rst index 24d1fe2..097344a 100644 --- a/nerv/doc/source/index.rst +++ b/nerv/doc/source/index.rst @@ -17,26 +17,16 @@ Contents: TODO List --------- -+----------+--------------------------------------------------------------------------+-------------+ -| Status/ | Task | Assignee | -| Priority | | | -+==========+==========================================================================+=============+ -| High | Generalize nerv.Matrix to nerv.Tensor (use the same API as Torch Tensor) | Mengxiao Bi | -+----------+--------------------------------------------------------------------------+-------------+ -| High | Development manual: coding style & contribution rules | Ted Yin | -+----------+--------------------------------------------------------------------------+-------------+ -| High | Development manual: Error reporting & Logging standard | Ted Yin | -+----------+--------------------------------------------------------------------------+-------------+ -| High | support for basic RNN | Tianxing He | -+----------+--------------------------------------------------------------------------+-------------+ -| High | support for RNN/LSTM | Tianxing He | -+----------+--------------------------------------------------------------------------+-------------+ -| High | support for CNN | Mengxiao Bi | -+----------+--------------------------------------------------------------------------+-------------+ -| Mid | User manual | ALL | -+----------+--------------------------------------------------------------------------+-------------+ -| Low | Development manual: general reference | N/A | -+----------+--------------------------------------------------------------------------+-------------+ ++----------+--------------------------------------------------------------------------+----------+ +| Status/ | Task | Assignee | +| Priority | | | 
++==========+==========================================================================+==========+ +| On-going | Development manual: coding style & contribution rules | Ted Yin | ++----------+--------------------------------------------------------------------------+----------+ +| High | Generalize nerv.Matrix to nerv.Tensor (use the same API as Torch Tensor) | TBD. | ++----------+--------------------------------------------------------------------------+----------+ +| High | Merge the CNN branch | TBD. | ++----------+--------------------------------------------------------------------------+----------+ Indices and tables ================== |