path: root/nerv/doc/source/collaboration-rules.rst
Diffstat (limited to 'nerv/doc/source/collaboration-rules.rst')
-rw-r--r-- nerv/doc/source/collaboration-rules.rst 203
1 files changed, 203 insertions, 0 deletions
diff --git a/nerv/doc/source/collaboration-rules.rst b/nerv/doc/source/collaboration-rules.rst
new file mode 100644
index 0000000..b7126c9
--- /dev/null
+++ b/nerv/doc/source/collaboration-rules.rst
@@ -0,0 +1,203 @@
+Collaboration Rules
+===================
+
+Introduction
+------------
+
+This document stipulates the rules and typical workflows that drive NERV
+development. It may be updated or supplemented with more details in the
+future. Anyone who intends to contribute to the official repository must read
+this document before sending a pull request to the development group or, if
+permitted, merging changes into the repository.
+
+
+Repository
+----------
+
+The latest stable and in-progress code is hosted at SpeechLab and maintained
+with Git, a distributed version control system. Despite the distributed nature
+of the tool, our project management is centralized, just like Linux kernel
+development, which was the original use case of Git. The NERV project, in a
+general sense, includes two major sub-projects, ``nerv`` and ``nerv-speech``.
+The former contains the core of NERV, a general deep learning implementation.
+The latter, ``nerv-speech``, provides modules (classes) that comply with the
+core NERV API and offer support relevant to speech and language processing,
+such as I/O for reading HTK/Kaldi features and labels.
+
+Like Torch, NERV uses LuaRocks_ to manage optional components as *packages*.
+When ``make`` is run in the ``nerv`` repository root, LuaRocks and LuaJIT (the
+compiler) are set up first, and then a LuaRocks package named ``nerv`` is
+installed via LuaRocks. In other words, the core part of NERV is contained in a
+single LuaRocks package, ``nerv``. Next, invoking ``make speech`` compiles and
+installs several speech processing packages (such as ``htk_io``, ``kaldi_io``,
+etc.) from ``nerv/speech``, which should be checked out from the
+``nerv-speech`` repository. Thanks to the flexibility of Lua and the
+modularity brought by LuaRocks, new functionality can be added to NERV and
+managed cleanly by building a self-contained LuaRocks package, possibly with
+dependencies on ``nerv`` or other packages. The package system provides good
+isolation, so contributions can be better managed and decoupled from core
+NERV.
+
+.. _LuaRocks: https://luarocks.org/
+
+Isolation vs. Completeness
+--------------------------
+
+The loosely organized nature of Lua and the package manager LuaRocks gives us
+many possibilities for abstraction and collaboration. However, since the Lua
+language enforces no particular patterns, we cannot simply rely on the compiler
+or interpreter to regulate the implementations of all contributors. As
+mentioned in NERV's overview document, one problem with Torch is that it
+strives to isolate components and wrap each of them in a separate LuaRocks
+package. This seems to be a good choice for collaboration but is not very wise
+in the long run: such "collaboration" amounts to no collaboration at all. Under
+this methodology, each user tends to build her/his own package and is reluctant
+to merge others' code. This leads to a shrinking shared code base and erodes
+the completeness of the toolkit.
+
+When new functionality is added to NERV, there are several possible approaches,
+each with its merits and demerits. Below, we describe each approach and
+stipulate the conditions under which a contributor should adopt it.
+
+- A gentle *modification* (mod or "hacking"): as in video games, a mod is
+ like a temporary patch applied to the original toolkit that slightly
+ *overrides* some default features or behaviors. Thanks to the looseness of
+ Lua, any NERV component can be altered or overridden by simply redefining,
+ in the user script, the functions or classes in question after the default
+ ones are loaded. Such modifications are only legitimate in user scripts,
+ where they reflect the difference between a task-specific user script and the
+ standard one. The advantage of this approach is that it confines the
+ modifications to one place, so all users share the same toolkit code base
+ while the modifications stay visible to others. The alternative, hacking the
+ official source directly and individually, ends up with divergent code bases
+ that cannot be shared and whose differences are hard to detect and
+ synchronize.
+
+ We encourage end users to try this approach first if the default behavior of
+ NERV cannot be changed to suit their needs because of limited options or
+ generality. No matter how general your alternative approach is, try this
+ first to make sure your implementation works as expected without touching
+ the shared code base. After that, if your modifications are meaningful for
+ many other tasks, that is, general enough, please abstract out the
+ non-task-specific part and consider contributing it directly to the shared
+ code base (using one of the other approaches listed below).
+
+- Making a *LuaRocks package*: a LuaRocks package is meant to be shared among
+ users who need extra common functionality:
+
+ - which is not generally needed by the majority (e.g., an unusual network
+ structure or training method), or
+ - which is experimental, so it cannot be merged into NERV for now (due to
+ some implementation or stability issues), or
+ - which is naturally a self-contained or decoupled extension for NERV (e.g.,
+ I/O readers), or
+ - which contains modifications or feature enhancements written not only in
+ Lua but also in C/C++ (e.g., efficient data processing or new layer
+ computations).
+
+ Please note that making a hybrid LuaRocks package containing C/C++
+ implementations might be a little difficult for contributors who are not
+ very familiar with writing a ``Makefile`` or similar C/C++ build scripts.
+ However, it is extremely easy to write a pure-Lua LuaRocks package or to
+ convert an above-mentioned modification into a valid package.
+
+- Making a Git *branch* from "master": this measure is usually taken by
+ developers or contributors who know the NERV internals well. Branching can
+ be used under the following circumstances:
+
+ - Core developers make major changes to NERV that may break existing
+ functionality.
+ - Core developers merge major changes from pull requests.
+ - Contributors make contributions in C/C++ code.
+ - Contributors submit their LuaRocks packages.
+ - End users need to locally modify the C/C++ code to change the default behavior
+ (these branches will only exist in their local repositories and are less
+ likely to be merged into the official master branch unless they generalize
+ them and send pull requests to core developers).
+
+ Contributors should keep the changes in their branches clear and should not
+ make changes that only run correctly on their own tasks or with particular
+ settings, nor should they break the existing functionality of NERV. The
+ developers need to carefully review and vet the changes by understanding the
+ meaning of each line of code as well as its possible side effects. Where side
+ effects exist, comments should be left to explain them.
+
+- Duplicating the code: this is only for testing or personal use. It is *NOT* a
+ way to collaborate or contribute.
+
+When making a Lua modification or a LuaRocks package as described above, end
+users and contributors should always keep the following principles in mind:
+
+- Try to disentangle the original issue by abstraction.
+- Try to consider whether the solution could be generalized to solve others' problems.
+- Try to override the default components (implemented as functions or classes)
+ at as high a level as possible. For example, when you can achieve your goal
+ by hacking a trainer (scheduler), *DO NOT* change the implementations of
+ layers or buffers, let alone the CUDA implementation. When changing one
+ function of a trainer is enough, *DO NOT* re-implement the whole trainer.
+- Try to follow the coding convention in the official code.
+
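The "as high-level as possible" principle can be illustrated with a small sketch. It is written in Python purely as a language-neutral analogy (NERV user scripts are Lua), and ``Trainer``, ``learning_rate`` and ``halved_after_three`` are hypothetical names that do not come from NERV. The point is only that the user script replaces a single function after the default component is loaded, leaving the rest of the trainer untouched.

```python
class Trainer:
    """Stand-in for a toolkit's default trainer (hypothetical, not NERV's)."""

    def __init__(self, base_lr=0.1):
        self.base_lr = base_lr

    def learning_rate(self, epoch):
        # Default policy: constant learning rate.
        return self.base_lr

    def run(self, epochs):
        # The training loop itself is left untouched by the mod below.
        return [self.learning_rate(e) for e in range(epochs)]


# --- user script: the "mod" lives here, not in the toolkit source ---
def halved_after_three(self, epoch):
    # Task-specific policy: halve the rate from the fourth epoch on.
    return self.base_lr if epoch < 3 else self.base_lr / 2


# Override a single function instead of re-implementing the whole trainer.
Trainer.learning_rate = halved_after_three

schedule = Trainer(base_lr=0.1).run(5)
```

Because the override sits in the user script, other users still run the same toolkit code, and the difference from the default behavior is visible in one place.
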
+Workflows
+---------
+
+- End users usually slightly adjust the behavior of NERV via *modifications* if
+ options do not help much. These mods are only for local use.
+
+- For a contributor, when there is a common need of an additional
+ functionality:
+
+ 1. Fork ``nerv-speech`` and make a local branch with a concise name consisting
+ only of lowercase letters, digits, or hyphens (regex: ``[a-z][a-z0-9-]*``).
+
+ 2. Generalize your modifications into a LuaRocks package (naming convention:
+ ``[a-z][a-z0-9_]*``).
+
+ 3. Put the LuaRocks package in a new directory under the root directory of
+ ``nerv-speech``. Include tutorials, if any, in ``/tutorial``. Package
+ documents should be located in the ``doc`` directory of your package. All
+ documents should be in plain-text format. Human-readable lightweight
+ markup formats, such as Markdown or reStructuredText, are preferred.
+ *DO NOT* change other directories in ``nerv-speech``.
+
+ 4. Commit your changes with a brief but meaningful message. Try to squash your
+ commits into a single commit if there are too many. Avoid meaningless
+ messages such as "...".
+
+ 5. Send a pull request of your branch to the developers.
+
+- For those contributors interested in contributing to core NERV:
+
+ 1. Fork ``nerv`` and make a local branch with a concise name consisting only
+ of lowercase letters, digits, or hyphens (regex: ``[a-z][a-z0-9-]*``).
+
+ 2. Make changes.
+ 3. Commit your changes with a brief but meaningful message. Try to squash your
+ commits into a single one if there are too many. Avoid meaningless
+ messages such as "...".
+
+ 4. Send a pull request of your branch to the developers.
+
+- Developers may only merge tested code that follows the appropriate coding
+ conventions.
+
+- A stable release is denoted by a Git tag with version number as its name.
+- The version number is in the format of: ``<prefix>-<major number>.<minor
+ number>``, where the ``<prefix>-`` and ``.<minor number>`` are optional. Here
+ are some examples:
+
+ - ``alpha-1``
+ - ``alpha-1.1``
+ - ``alpha-4``
+ - ``beta-1.2``
+ - ``beta-1.21``
+ - ``1.0``
+
+- For a given version, the complete release consists of the commits tagged with
+ the largest version number that does not exceed the given number in both
+ repositories, i.e., ``nerv`` and ``nerv-speech``. For general use, end users
+ should check out the latest version, i.e., the tag with the largest version
+ number, in both repositories. For instructions on checking out, please refer
+ to ``README.rst`` in ``nerv``.
+
+- Developers must test major tasks on the version that is going to be tagged.
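
The naming and versioning rules above can be checked mechanically. The following Python sketch is not part of NERV's tooling; it only illustrates the rules. The two name patterns are quoted from the workflow steps. Everything about ordering is an assumption, since the rules do not spell it out: prefixes are taken to be runs of lowercase letters ranked ``alpha`` < ``beta`` < no prefix, numbers are compared as integers (so ``1.21`` is newer than ``1.2``), and a missing minor number counts as 0. All function names are hypothetical.

```python
import re

# Patterns quoted from the workflow rules above.
BRANCH_NAME = re.compile(r"[a-z][a-z0-9-]*")
PACKAGE_NAME = re.compile(r"[a-z][a-z0-9_]*")

# <prefix>-<major>.<minor>, where "<prefix>-" and ".<minor>" are optional.
# Assumption: a prefix is a run of lowercase letters.
VERSION_TAG = re.compile(r"(?:([a-z]+)-)?(\d+)(?:\.(\d+))?")

# Assumption: alpha < beta < unprefixed; the rules do not rank prefixes.
PREFIX_RANK = {"alpha": 0, "beta": 1, None: 2}


def is_valid_branch(name):
    """True if the whole name matches the branch naming rule."""
    return BRANCH_NAME.fullmatch(name) is not None


def is_valid_package(name):
    """True if the whole name matches the package naming rule."""
    return PACKAGE_NAME.fullmatch(name) is not None


def version_key(tag):
    """Return a sortable key for a version tag, or raise ValueError."""
    m = VERSION_TAG.fullmatch(tag)
    if m is None or m.group(1) not in PREFIX_RANK:
        raise ValueError("not a version tag: %r" % tag)
    prefix, major, minor = m.groups()
    # Assumption: integer comparison, and a missing minor counts as 0.
    return (PREFIX_RANK[prefix], int(major), int(minor or 0))


def resolve_release(tags, wanted):
    """Pick the largest tag in `tags` that does not exceed `wanted`."""
    limit = version_key(wanted)
    candidates = [t for t in tags if version_key(t) <= limit]
    return max(candidates, key=version_key) if candidates else None
```

To assemble a complete release, ``resolve_release`` would be called once per repository with that repository's tag list. For example, if one repository carries ``beta-1.2`` and the other only has ``alpha-4`` and ``beta-1.21``, asking for ``beta-1.2`` selects ``beta-1.2`` in the first and falls back to ``alpha-4`` in the second under the ordering assumed here.
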