Why are weights initialized this way in darknet?

Hi there!

I am studying Mr. Redmon's darknet code from https://github.com/pjreddie/darknet

I found that the weights of a connected layer are initialized as follows:

// file: src/connected_layer.c
// function: make_connected_layer
    float scale = sqrt(2./inputs);
    for(i = 0; i < outputs*inputs; ++i){
        l.weights[i] = scale*rand_uniform(-1, 1);
    }

and the weights of a convolutional layer are initialized as follows:

// file: src/convolutional_layer.c
// function: make_convolutional_layer
    float scale = sqrt(2./(size*size*c/l.groups));
    for(i = 0; i < l.nweights; ++i) {
        l.weights[i] = scale*rand_normal();
    }

Could you tell me what principle is behind this code, please? Links to resources such as related papers are also welcome.

Thank you a lot!



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
