Why are weights initialized this way in darknet?
Hi there!
I am studying Mr. Redmon's darknet code from https://github.com/pjreddie/darknet
I found that the weights of a connected layer are initialized like this:
// file: src/connected_layer.c
// function: make_connected_layer
float scale = sqrt(2./inputs);
for(i = 0; i < outputs*inputs; ++i){
    l.weights[i] = scale*rand_uniform(-1, 1);
}
and the weights of a convolutional layer are initialized like this:
// file: src/convolutional_layer.c
// function: make_convolutional_layer
float scale = sqrt(2./(size*size*c/l.groups));
for(i = 0; i < l.nweights; ++i) {
    l.weights[i] = scale*rand_normal();
}
Could you tell me what the principle behind this code is, please? Links to resources such as related papers are also welcome.
Thanks a lot!
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow