#The Nerv NN Package#
Part of the [Nerv](../README.md) toolkit.

##Description##
###Class hierarchy###
It contains __nerv.LayerRepo__, __nerv.ParamRepo__, and __nerv.DAGLayer__ (which inherits __nerv.Layer__).

###Classes and their members###
####nerv.ParamRepo####
Stores parameter objects and fetches them by ID.
*	`param_table`: __table__, maps a parameter ID to the parameter file (__nerv.ChunkFile__) in which it is stored.

####nerv.LayerRepo####
Stores layer objects and fetches them by ID.
*	`layers`: __table__, maps a layer ID to its layer object.

####nerv.DAGLayer####
Inherits __nerv.Layer__.  
* 	`layers`: __table__, a mapping from a layer ID to its "ref". A ref is a structure holding references to the layer's space allocations and other information about the layer.
* 	`inputs`: __table__, a mapping from the input ports of the DAG layer to the input ports of its sub-layers; the key is the port number and the value is `{ref, port}`.
* 	`outputs`: __table__, the counterpart of `inputs` for output ports.
* 	`parsed_conn`: __table__, a list of parsed connections; each entry has the format `{{ref_from, port_from}, {ref_to, port_to}}`.
* 	`queue`: __table__, a list of "ref"s; propagation of the DAGLayer follows this order, and back-propagation follows the reverse order.
	
##Methods##

###__nerv.ParamRepo__###

####nerv.ParamRepo:\_\_init(param\_files)####
* 	Parameters:  
	`param_files`: __table__
*	Description:  
	`param_files` is a list of names of files in which parameters are stored; the newly created __ParamRepo__ will read them and record, for each parameter ID, which file it comes from, so the parameter can be fetched later (see the sketch below).  
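	A minimal usage sketch, assuming `param_fn` names a parameter file previously written with __nerv.ChunkFile__ (as in the example at the end of this document):

    ```
    paramRepo = nerv.ParamRepo({param_fn})
    ```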
    
####nerv.Param ParamRepo.get_param(ParamRepo self, string pid, table global_conf)####
*	Returns:  
	__nerv.Param__  
*	Parameters:  
	`self`: __nerv.ParamRepo__.  
    `pid`: __string__.  
    `global_conf`: __table__.  
*	Description:  
	__ParamRepo__ will find the __nerv.ChunkFile__ `pf` that contains parameter of ID `pid` and return `pf:read_chunk(pid, global_conf)`.
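	A usage sketch, assuming the parameter file contains a chunk with ID `"AffineL_ltp"` (as in the example at the end of this document):

    ```
    local affineL_ltp = paramRepo:get_param("AffineL_ltp", global_conf)
    ```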

###__nerv.LayerRepo__###
####nerv.LayerRepo:\_\_init(layer\_spec, param\_repo, global\_conf)####
* 	Returns:  
  	__nerv.LayerRepo__.  
* 	Parameters:  
  	`self`: __nerv.LayerRepo__.  
  	`layer_spec`: __table__.  
  	`param_repo`: __nerv.ParamRepo__.  
  	`global_conf`: __table__.  
* 	Description:  
  	__LayerRepo__ will construct the layers specified in `layer_spec`. Every entry in the `layer_spec` table should follow the format below:  
	
    > layer_spec : {[layer_type1] = llist1, [layer_type2] = llist2, ...}   
  	> llist : {layer1, layer2, ...}   
  	> layer : layerid = {param_config, layer_config}   
  	> param_config : {param1 = paramID1, param2 = paramID2}  	 
    
  	__LayerRepo__ will merge `param_config` into `layer_config` and construct a layer by calling `layer_type(layerid, global_conf, layer_config)`.    
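	For example, the following spec (taken from the example at the end of this document) constructs an affine layer and a softmax cross-entropy layer:

    ```
    layerRepo = nerv.LayerRepo({
            ["nerv.AffineLayer"] =
            {
                ["AffineL"] = {{["ltp"] = "AffineL_ltp", ["bp"] = "AffineL_bp"}, {["dim_in"] = {input_dim}, ["dim_out"] = {2}}},
            },
            ["nerv.SoftmaxCELayer"] =
            {
                ["SoftmaxL"] = {{}, {["dim_in"] = {2, 2}, ["dim_out"] = {}}}
            },
            }, paramRepo, global_conf)
    ```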

####nerv.LayerRepo.get\_layer(self, lid)####
* 	Returns:  
	__nerv.Layer__, the layer with ID `lid`.
* 	Parameters:  
	`self`: __nerv.LayerRepo__.  
	`lid`: __string__.  
*	Description:   
	Returns the layer with ID `lid`.
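	A usage sketch, assuming `layerRepo` was built as in the spec shown above:

    ```
    affineL = layerRepo:get_layer("AffineL")
    ```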
    
###__nerv.DAGLayer__###
####nerv.DAGLayer:\_\_init(id, global\_conf, layer\_conf)####
*	Returns:  
	__nerv.DAGLayer__  
*	Parameters:  
	`id`: __string__  
    `global_conf`: __table__  
    `layer_conf`: __table__  
*	Description:  
	The `layer_conf` should contain `layer_conf.sub_layers`, a __nerv.LayerRepo__ storing the sub-layers of the DAGLayer. It should also contain `layer_conf.connections`, a string-to-string mapping table describing the DAG connections. See the example below:
    
    ```
    dagL = nerv.DAGLayer("DAGL", global_conf, {["dim_in"] = {input_dim, 2}, ["dim_out"] = {}, ["sub_layers"] = layerRepo,
            ["connections"] = {
               ["<input>[1]"] = "AffineL[1]",
               ["AffineL[1]"] = "SoftmaxL[1]",
               ["<input>[2]"] = "SoftmaxL[2]",
            }})
    ```
    
####nerv.DAGLayer.init(self, batch\_size)####
*	Parameters:  
	`self`: __nerv.DAGLayer__  
    `batch_size`: __int__
* 	Description:  
	This initialization method allocates space for the input and output matrices and calls `init()` on each of its sub-layers, as shown in the sketch below.
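	A usage sketch, taken from the example at the end of this document, where all `data_num` rows of data are processed as one batch:

    ```
    dagL:init(data_num)
    ```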
    

####nerv.DAGLayer.propagate(self, input, output)####
*	Parameters:  
	`self`: __nerv.DAGLayer__  
    `input`: __table__  
    `output`: __table__  
*	Description:  
	Same as __nerv.Layer.propagate__: performs propagation for each sub-layer in the order given by `self.queue`.

####nerv.DAGLayer.back\_propagate(self, next\_bp\_err, bp\_err, input, output)####
*	Parameters:  
	`self`: __nerv.DAGLayer__  
    `next_bp_err`: __table__  
    `bp_err`: __table__  
    `input`: __table__  
    `output`: __table__  
*	Description:  
	Same as __nerv.Layer.back_propagate__: performs back-propagation for each sub-layer in the reverse order of `self.queue`.

####nerv.DAGLayer.update(self, bp\_err, input, output)####
*	Parameters:  
	`self`: __nerv.DAGLayer__  
    `bp_err`: __table__  
    `input`: __table__  
    `output`: __table__  
*	Description:  
	Same as __nerv.Layer.update__: performs the parameter update for each sub-layer in the order given by `self.queue`. The three methods above are typically combined into one training step, as sketched below.
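	One training step, taken from the example at the end of this document (`dagL_input`, `dagL_output`, `dagL_err` and `dagL_ierr` are the tables of input, output and error matrices allocated there):

    ```
    dagL:propagate(dagL_input, dagL_output)
    dagL:back_propagate(dagL_ierr, dagL_err, dagL_input, dagL_output)
    dagL:update(dagL_err, dagL_input, dagL_output)
    ```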
    
##Examples##
*	A complete example: training a __nerv.DAGLayer__ on a simple two-class classification problem.
	
```
require 'math'

require 'layer.affine'
require 'layer.softmax_ce'

--[[Example using DAGLayer, a simple two-classification problem]]--

--[[begin global setting and data generation]]--
global_conf =  {lrate = 10, 
                wcost = 1e-6,
                momentum = 0.9,
                cumat_type = nerv.CuMatrixFloat,
               }

input_dim = 5
data_num = 100
param_fn = "../tmp"
ansV = nerv.CuMatrixFloat(input_dim, 1)
for i = 0, input_dim - 1, 1 do
    ansV[i][0] = math.random() - 0.5
end
ansB = math.random() - 0.5
print('displaying ansV')
print(ansV)
print('displaying ansB(bias)')
print(ansB)

dataM = nerv.CuMatrixFloat(data_num, input_dim)
for i = 0, data_num - 1, 1 do
    for j = 0, input_dim - 1, 1 do
        dataM[i][j] = math.random() * 2 - 1
    end
end
refM = nerv.CuMatrixFloat(data_num, 1)
refM:fill(ansB)
refM:mul(dataM, ansV, 1, 1) --refM = dataM * ansV + ansB

labelM = nerv.CuMatrixFloat(data_num, 2)
for i = 0, data_num - 1, 1 do
    if (refM[i][0] > 0) then
        labelM[i][0] = 1 
        labelM[i][1] = 0
    else
        labelM[i][0] = 0
        labelM[i][1] = 1
    end
end
--[[global setting and data generation end]]--


--[[begin network building]]--
--parameters
do
    local affineL_ltp = nerv.LinearTransParam('AffineL_ltp', global_conf)
    affineL_ltp.trans = nerv.CuMatrixFloat(input_dim, 2)
    for i = 0, input_dim - 1, 1 do
        for j = 0, 1, 1 do
            affineL_ltp.trans[i][j] = math.random() - 0.5 
        end
    end
    local affineL_bp = nerv.BiasParam('AffineL_bp', global_conf)
    affineL_bp.trans = nerv.CuMatrixFloat(1, 2)
    for j = 0, 1, 1 do
        affineL_bp.trans[j] = math.random() - 0.5
    end

    local chunk = nerv.ChunkFile(param_fn, 'w')
    chunk:write_chunk(affineL_ltp)
    chunk:write_chunk(affineL_bp)
    chunk:close()

    paramRepo = nerv.ParamRepo({param_fn})
end

--layers
layerRepo = nerv.LayerRepo({
        ["nerv.AffineLayer"] = 
        {
            ["AffineL"] = {{["ltp"] = "AffineL_ltp", ["bp"] = "AffineL_bp"}, {["dim_in"] = {input_dim}, ["dim_out"] = {2}}},
        },
        ["nerv.SoftmaxCELayer"] = 
        {
            ["SoftmaxL"] = {{}, {["dim_in"] = {2, 2}, ["dim_out"] = {}}}
        },
        }, paramRepo, global_conf)
affineL = layerRepo:get_layer("AffineL")
softmaxL = layerRepo:get_layer("SoftmaxL")
print('layers initializing...')
dagL = nerv.DAGLayer("DAGL", global_conf, {["dim_in"] = {input_dim, 2}, ["dim_out"] = {}, ["sub_layers"] = layerRepo,
        ["connections"] = {
           ["<input>[1]"] = "AffineL[1]",
           ["AffineL[1]"] = "SoftmaxL[1]",
           ["<input>[2]"] = "SoftmaxL[2]",
        }})
dagL:init(data_num)
--affineL:init()
--softmaxL:init()
--[[network building end]]--


--[[begin space allocation]]--
print('network input&output&error space allocation...')
dagL_input = {dataM, labelM}
dagL_output = {}
dagL_err = {}
dagL_ierr = {nerv.CuMatrixFloat(data_num, input_dim), nerv.CuMatrixFloat(data_num, 2)}
--[[space allocation end]]--


--[[begin training]]--
ce_last = 0
for l = 0, 10, 1 do
    dagL:propagate(dagL_input, dagL_output)
    dagL:back_propagate(dagL_ierr, dagL_err, dagL_input, dagL_output)
    dagL:update(dagL_err, dagL_input, dagL_output)
    
    if (l % 2 == 0) then
        nerv.utils.printf("training iteration %d finished\n", l)
        nerv.utils.printf("cross entropy: %.8f\n", softmaxL.total_ce - ce_last)
        --nerv.utils.printf("accurate labels: %d\n", calculate_accurate(output, labelM))
        nerv.utils.printf("total frames processed: %.8f\n", softmaxL.total_frames)
    end
    ce_last = softmaxL.total_ce 
end
--[[end training]]--
```