Merge branch 'master' of https://github.com/Determinant/nerv

author: Determinant <ted.sybil@gmail.com> 2015-06-16 12:46:41 +0800
committer: Determinant <ted.sybil@gmail.com> 2015-06-16 12:46:41 +0800
commit: 2ab9610a4fff798c1668cdc041515256fa813865 (patch)
tree: 3450e26ef7ea5eaeec870bbddb3c33c512320a6e
parent: 341b8b8c57cc4ee6f3fb940f00d9c8265e0b42a5 (diff)
parent: c3db7ffba45b7e4d0a1d76281e187b3f88129db9 (diff)
4 files changed, 167 insertions, 6 deletions
diff --git a/README.md b/README.md
index f825e57..6a0f1e4 100644
--- a/README.md
+++ b/README.md
@@ -36,7 +36,8 @@ The IO package is used to read and write parameters to file.
 The parameter package is used to store, read model parameters from file.
 * __[The Nerv Layer Package](doc/nerv_layer.md)__  
 The layer package is used to define propagation and backpropagation of different type of layers.
-
+* __[The Nerv NN Package](doc/nerv_nn.md)__  
+The nn package organizes a neural network; it contains __nerv.LayerRepo__, __nerv.ParamRepo__, and __nerv.DAGLayer__.
 [luaT]:https://github.com/torch/torch7/tree/master/lib/luaT
 [Torch]:https://github.com/torch
 [sync-help]:https://help.github.com/articles/syncing-a-fork/
diff --git a/doc/nerv_layer.md b/doc/nerv_layer.md
index dd991df..0425d5f 100644
--- a/doc/nerv_layer.md
+++ b/doc/nerv_layer.md
@@ -9,15 +9,17 @@ __nerv.Layer__ is the base class and most of its methods are abstract.
 	* `table dim_out` It specifies the dimensions of the outputs.  
 	* `string id` ID of this layer.
 	* `table gconf` Stores the `global_conf`.
-* __nerv.AffineLayer__ inherits __nerv.Layer__, both `#dim_in` and `#dim_out` are 1.
+* __nerv.AffineLayer__ inherits __nerv.Layer__, both `#dim_in` and `#dim_out` are 1. 
 	* `MatrixParam ltp` The liner transform parameter.
 	* `BiasParam bp` The bias parameter.
 * __nerv.BiasLayer__ inherits __nerv.Layer__, both `#dim_in` nad `#dim_out` are 1.
 	* `BiasParam bias` The bias parameter.
 * __nerv.SigmoidLayer__ inherits __nerv.Layer__, both `#dim_in` and `#dim_out` are 1.
-* __nerv.SoftmaxCELayer__ inherits __nerv.Layer__, `#dim_in` is 2 and `#dim_out` is 1.
-	* `float total_ce` 
-	* `int total_frams` Records how many frames have passed.
+* __nerv.SoftmaxCELayer__ inherits __nerv.Layer__, `#dim_in` is 2 and `#dim_out` is 0. `input[1]` is the input to the softmax layer, `input[2]` is the reference distribution.
+	* `float total_ce` Records the accumulated cross entropy value.
+	* `int total_frames` Records how many frames have passed.  
+	* `bool compressed` The reference distribution can be a one-hot format. This feature is enabled by `layer_conf.compressed`.
+
 ##Methods##
 * __void Layer.\_\_init(Layer self, string id, table global_conf, table layer_conf)__  
 Abstract method.  
@@ -41,3 +43,129 @@ Check whether `#self.dim_in == len_in` and `#self.dim_out == len_out`, if violat
 Abstract method.  
 The layer should return a list containing its parameters.
 
+##Examples##
+* A basic example that uses __Nerv__ layers to do linear classification.
+
+```
+require 'math'
+
+require 'layer.affine'
+require 'layer.softmax_ce'
+
+--[[Example using layers: a simple two-class classification problem]]--
+
+function calculate_accurate(networkO, labelM)
+    local sum = 0 -- local, so the counter does not leak into the global scope
+    for i = 0, networkO:nrow() - 1, 1 do
+        if (labelM[i][0] == 1 and networkO[i][0] >= 0.5) then
+            sum = sum + 1
+        end
+        if (labelM[i][1] == 1 and networkO[i][1] >= 0.5) then
+            sum = sum + 1
+        end 
+    end
+    return sum
+end
+
+--[[begin global setting and data generation]]--
+global_conf =  {lrate = 10, 
+                wcost = 1e-6,
+                momentum = 0.9,
+                cumat_type = nerv.CuMatrixFloat}
+
+input_dim = 5
+data_num = 100
+ansV = nerv.CuMatrixFloat(input_dim, 1)
+for i = 0, input_dim - 1, 1 do
+    ansV[i][0] = math.random() - 0.5
+end
+ansB = math.random() - 0.5
+print('displaying ansV')
+print(ansV)
+print('displaying ansB(bias)')
+print(ansB)
+
+dataM = nerv.CuMatrixFloat(data_num, input_dim)
+for i = 0, data_num - 1, 1 do
+    for j = 0, input_dim - 1, 1 do
+        dataM[i][j] = math.random() * 2 - 1
+    end
+end
+refM = nerv.CuMatrixFloat(data_num, 1)
+refM:fill(ansB)
+refM:mul(dataM, ansV, 1, 1) --refM = dataM * ansV + ansB
+
+labelM = nerv.CuMatrixFloat(data_num, 2)
+for i = 0, data_num - 1, 1 do
+    if (refM[i][0] > 0) then
+        labelM[i][0] = 1 
+        labelM[i][1] = 0
+    else
+        labelM[i][0] = 0
+        labelM[i][1] = 1
+    end
+end
+--[[global setting and data generation end]]--
+
+
+--[[begin network building]]--
+--parameters
+affineL_ltp = nerv.LinearTransParam('AffineL_ltp', global_conf)
+affineL_ltp.trans = nerv.CuMatrixFloat(input_dim, 2)
+for i = 0, input_dim - 1, 1 do
+    for j = 0, 1, 1 do
+        affineL_ltp.trans[i][j] = math.random() - 0.5 
+    end
+end
+affineL_bp = nerv.BiasParam('AffineL_bp', global_conf)
+affineL_bp.trans = nerv.CuMatrixFloat(1, 2)
+for j = 0, 1, 1 do
+    affineL_bp.trans[j] = math.random() - 0.5
+end
+
+--layers
+affineL = nerv.AffineLayer('AffineL', global_conf, {['ltp'] = affineL_ltp,
+                                                      ['bp'] = affineL_bp,
+                                                      dim_in = {input_dim},
+                                                      dim_out = {2}})
+softmaxL = nerv.SoftmaxCELayer('softmaxL', global_conf, {dim_in = {2, 2},
+                                                         dim_out = {}})
+print('layers initializing...')
+affineL:init()
+softmaxL:init()
+--[[network building end]]--
+
+
+--[[begin space allocation]]--
+print('network input&output&error space allocation...')
+affineI = {dataM} --input to the network is data
+affineO = {nerv.CuMatrixFloat(data_num, 2)}
+softmaxI = {affineO[1], labelM}
+softmaxO = {nerv.CuMatrixFloat(data_num, 2)} 
+
+affineE = {nerv.CuMatrixFloat(data_num, 2)}
+--[[space allocation end]]--
+
+
+--[[begin training]]--
+ce_last = 0
+for l = 0, 10, 1 do
+    affineL:propagate(affineI, affineO)
+    softmaxL:propagate(softmaxI, softmaxO)
+    softmaxO[1]:softmax(softmaxI[1])
+
+    softmaxL:back_propagate(affineE, nil, softmaxI, softmaxO)
+    
+    affineL:update(affineE, affineI, affineO) 
+
+    if (l % 5 == 0) then
+        nerv.utils.printf("training iteration %d finished\n", l)
+        nerv.utils.printf("cross entropy: %.8f\n", softmaxL.total_ce - ce_last)
+        ce_last = softmaxL.total_ce 
+        nerv.utils.printf("accurate labels: %d\n", calculate_accurate(softmaxO[1], labelM))
+        nerv.utils.printf("total frames processed: %d\n", softmaxL.total_frames)
+    end
+end
+--[[end training]]--
+
+```
+\ No newline at end of file
diff --git a/doc/nerv_nn.md b/doc/nerv_nn.md
new file mode 100644
index 0000000..54c7165
--- /dev/null
+++ b/doc/nerv_nn.md
@@ -0,0 +1,32 @@
+#The Nerv NN Package#
+Part of the [Nerv](../README.md) toolkit.
+
+##Description##
+###Class hierarchy###
+It contains __nerv.LayerRepo__, __nerv.ParamRepo__, and __nerv.DAGLayer__ (inherits __nerv.Layer__).
+
+###Class hierarchy and their members###
+* __nerv.ParamRepo__ Get parameter object by ID.  
+	* `table param_table` Contains the mapping of parameter ID to parameter file (__nerv.ChunkFile__).
+*  __nerv.LayerRepo__ Get layer object by ID.  
+	* `table layers` Contains the mapping of layer ID to layer
+objects.
+* __nerv.DAGLayer__ inherits __nerv.Layer__.  
+
+##Methods##
+###__nerv.ParamRepo__###
+* __void ParamRepo:\_\_init(table param_files)__  
+`param_files` is a list of names of files that store parameters; the newly created __ParamRepo__ reads them and stores the mapping for later lookup.  
+* __nerv.Param ParamRepo.get_param(ParamRepo self, string pid, table global_conf)__  
+__ParamRepo__ finds the __nerv.ChunkFile__ `pf` that contains the parameter with ID `pid` and returns `pf:read_chunk(pid, global_conf)`.
+
+###__nerv.LayerRepo__###
+* __void LayerRepo:\_\_init(table layer_spec, ParamRepo param_repo, table global_conf)__  
+__LayerRepo__ will construct the layers specified in `layer_spec`. Every entry in the `layer_spec` table should follow the format below:  
+```
+layer_spec : {[layer_type1] = llist1, [layer_type2] = llist2, ...}
+llist : {layer1, layer2, ...}
+layer : layerid = {param_config, layer_config}
+param_config : {param1 = paramID1, param2 = paramID2}
+```
+__LayerRepo__ will merge `param_config` into `layer_config` and construct a layer by calling `layer_type(layerid, global_conf, layer_config)`.
+\ No newline at end of file
diff --git a/matrix/init.lua b/matrix/init.lua
index 9637391..7bbc6a4 100644
--- a/matrix/init.lua
+++ b/matrix/init.lua
@@ -42,7 +42,7 @@ function nerv.CuMatrix:__sub__(b)
 end
 
 function nerv.CuMatrix:__mul__(b)
-    c = self:create()
+    c = nerv.get_type(self.__typename)(self:nrow(), b:ncol())
     c:mul(self, b, 1.0, 0.0, 'N', 'N')
     return c
 end
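
Note on the `__mul__` hunk above: the result is now allocated with `self:nrow()` rows and `b:ncol()` columns rather than through `self:create()`. The sketch below shows why a matrix product needs that shape. It uses plain nested Lua tables instead of nerv matrices, and the helper `matmul` is hypothetical and not part of nerv.

```lua
-- Hypothetical helper, not part of nerv: multiply two matrices stored as
-- nested Lua tables (1-based here, unlike the 0-based nerv matrices).
local function matmul(a, b)
    local nrow, inner, ncol = #a, #b, #b[1]
    assert(#a[1] == inner, "inner dimensions must agree")
    local c = {}
    for i = 1, nrow do
        c[i] = {}
        for j = 1, ncol do
            local acc = 0
            for k = 1, inner do
                acc = acc + a[i][k] * b[k][j]
            end
            c[i][j] = acc
        end
    end
    return c
end

local a = {{1, 2, 3}, {4, 5, 6}}  -- 2 x 3
local b = {{1}, {0}, {-1}}        -- 3 x 1
local c = matmul(a, b)            -- nrow(a) x ncol(b) = 2 x 1
print(#c, #c[1])
```

The product takes its row count from the left operand and its column count from the right one, so a result buffer shaped like the left operand is only correct when `b` is square.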
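
Note on the `layer_spec` grammar added in `doc/nerv_nn.md`: a concrete table may make the nesting easier to read. The sketch below is plain Lua and does not call nerv. The layer and parameter IDs are made up, the type names are written as string keys as an assumption, and `merge_config` and `fake_lookup` are hypothetical stand-ins for whatever __LayerRepo__ and __ParamRepo__ actually do when merging `param_config` into `layer_config`.

```lua
-- Illustrative spec following the grammar in doc/nerv_nn.md; all IDs are
-- invented for this example.
local layer_spec = {
    ["nerv.AffineLayer"] = {
        affine0 = {{ltp = "affine0_ltp", bp = "affine0_bp"},
                   {dim_in = {5}, dim_out = {2}}},
    },
    ["nerv.SoftmaxCELayer"] = {
        softmax_ce0 = {{}, {dim_in = {2, 2}, dim_out = {}}},
    },
}

-- Hypothetical helper: copy every entry of param_config into layer_config,
-- resolving each parameter ID through the `lookup` callback.
local function merge_config(param_config, layer_config, lookup)
    for pname, pid in pairs(param_config) do
        layer_config[pname] = lookup(pid)
    end
    return layer_config
end

-- Stand-in for a parameter repository: returns a tagged table per ID.
local function fake_lookup(pid)
    return {id = pid}
end

local entry = layer_spec["nerv.AffineLayer"].affine0
local merged = merge_config(entry[1], entry[2], fake_lookup)
print(merged.ltp.id, merged.bp.id, merged.dim_in[1])
```

After the merge, a single table carries both the resolved parameters and the dimension settings, which matches the shape of the `layer_config` argument passed to `nerv.AffineLayer` in the example in `doc/nerv_layer.md`.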