General Feed Forward for an ANN

The mathematical calculations for a deep artificial neural network (ANN) with \(n\) hidden layers and one output layer can be generalized as follows. Let’s denote the inputs as \(x_1, x_2, \ldots, x_m\) (where \(m\) is the number of input features), and the weights and biases for each layer as \(w^{(k)}\) and \(b^{(k)}\) respectively, where \(k\) represents the layer index.

The calculations for the nodes in each hidden layer and the output layer are as follows (a short NumPy sketch of these steps appears just after the notation list below):

  1. For the first hidden layer (\(k = 1\)): \[ h^{(1)}_j = \text{ReLU}\left(\sum_{i=1}^{m} w^{(1)}_{ij}x_i + b^{(1)}_j\right) \quad \text{for } j = 1, 2, \ldots, n_1 \]

  2. For subsequent hidden layers (\(k = 2, 3, \ldots, n\)): \[ h^{(k)}_j = \text{ReLU}\left(\sum_{i=1}^{n_{k-1}} w^{(k)}_{ij}h^{(k-1)}_i + b^{(k)}_j\right) \quad \text{for } j = 1, 2, \ldots, n_k \]

  3. For the output layer (\(k = n+1\)): \[ y = \text{Sigmoid}\left(\sum_{i=1}^{n_n} w^{(n+1)}_i h^{(n)}_i + b^{(n+1)}\right) \]

In the above equations:

  • \(n\) is the total number of hidden layers.

  • \(n_k\) represents the number of nodes in the \(k\)-th hidden layer.

  • \(\text{ReLU}(x) = \max(0, x)\) is the Rectified Linear Unit activation function for hidden layers.

  • \(\text{Sigmoid}(x) = \frac{1}{1 + e^{-x}}\) is the Sigmoid activation function for the output layer.
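
Read together, steps 1-3 above amount to repeatedly applying an affine transformation followed by an activation. The code below is a minimal NumPy sketch of exactly that convention, with ReLU on every hidden layer and the sigmoid on the output; the feed_forward helper, the layer sizes, and the example input are illustrative assumptions rather than part of the worked examples further down.

import numpy as np

def relu(x):
    return np.maximum(0, x)

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def feed_forward(x, weights, biases):
    # x: input vector of length m; weights[k]: shape (n_{k-1}, n_k); biases[k]: shape (n_k,)
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):   # hidden layers use ReLU
        h = relu(h @ W + b)
    return sigmoid(h @ weights[-1] + biases[-1])  # output layer uses the sigmoid

# Illustrative architecture: 3 inputs, hidden layers of sizes 4 and 2, one output node
rng = np.random.default_rng(0)
sizes = [3, 4, 2, 1]
weights = [rng.standard_normal((a, b)) for a, b in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(b) for b in sizes[1:]]
print(feed_forward(np.array([0.5, -1.0, 2.0]), weights, biases))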

During training, the weights (\(w^{(k)}\)) and biases (\(b^{(k)}\)) are adjusted using optimization algorithms like gradient descent to minimize the difference between the predicted output \(y\) and the actual target values.
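
As a concrete, deliberately simplified illustration of such an update, the snippet below sketches a single plain gradient-descent step for the output-layer weights, assuming a sigmoid output trained with binary cross-entropy; the learning rate, the sample values, and the variable names are placeholders chosen for this example.

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Placeholder values: last hidden-layer activations, output-layer parameters, and targets
h_last = np.array([[0.2, 0.7, 0.1], [0.9, 0.4, 0.3]])  # shape (samples, n_n)
W_out = np.zeros((3, 1))                               # output weights w^(n+1)
b_out = np.zeros((1, 1))                               # output bias b^(n+1)
targets = np.array([[1.0], [0.0]])
learning_rate = 0.1

y_pred = sigmoid(h_last @ W_out + b_out)                    # forward pass through the output layer
grad_z = (y_pred - targets) / len(targets)                  # dL/dz for sigmoid + binary cross-entropy
W_out -= learning_rate * (h_last.T @ grad_z)                # gradient-descent update of the weights
b_out -= learning_rate * grad_z.sum(axis=0, keepdims=True)  # and of the bias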

This generalizes the feed-forward calculations for a deep neural network with \(n\) hidden layers; the specific values of \(n_k\) and of the weights and biases depend on the architecture of your network and the problem you’re solving. The first worked example below implements a single hidden layer with the ReLU/sigmoid split described above, while the second generalizes to a list of hidden layers but applies the sigmoid at every layer.

import numpy as np

# Define the neural network architecture
input_size = 4  # Number of input features
hidden_size = 5  # Number of nodes in the hidden layer
output_size = 1  # Number of output nodes

# Initialize weights and biases (seeded so the run is reproducible)
np.random.seed(0)
weights_input_hidden = np.random.randn(input_size, hidden_size)    # shape (4, 5)
bias_hidden = np.zeros((1, hidden_size))                           # shape (1, 5)
weights_hidden_output = np.random.randn(hidden_size, output_size)  # shape (5, 1)
bias_output = np.zeros((1, output_size))                           # shape (1, 1)

# Define the sigmoid activation function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Define the ReLU activation function
def ReLU(x):
    return np.maximum(0,x)


# Define the forward propagation function
def forward_propagation(X):
    # X arrives as (features, samples); transpose so rows are samples
    hidden_input = np.dot(X.T, weights_input_hidden) + bias_hidden  # shape (samples, hidden_size)
    hidden_output = ReLU(hidden_input)                              # ReLU activation for the hidden layer
    
    # Calculate the values for the output layer
    output_input = np.dot(hidden_output, weights_hidden_output) + bias_output  # shape (samples, 1)
    output = sigmoid(output_input)                                  # sigmoid activation for the output
    
    return hidden_input, hidden_output, output_input, output

# Example input data
X = np.array([[0, 0, 1, 1], [0, 1, 0, 1]]).T  # Two input samples, stored as (4 features, 2 samples)

# Perform forward propagation
hidden_input, hidden_output, output_input, output = forward_propagation(X)

# Print the output
print("Output:")
print(output)
Output:
[[0.70377943]
 [0.80873491]]
import numpy as np

# Define the neural network architecture
input_size = 4  # Number of input features
hidden_layer_sizes = [5, 4, 3]  # List of hidden layer sizes
output_size = 1  # Number of output nodes

# Initialize random weights and biases
np.random.seed(0)
layer_sizes = [input_size] + hidden_layer_sizes + [output_size]
num_layers = len(layer_sizes)

# Initialize weights and biases with random values
weights = [np.random.randn(layer_sizes[i], layer_sizes[i+1]) for i in range(num_layers - 1)]
biases = [np.zeros((1, layer_sizes[i+1])) for i in range(num_layers - 1)]

# Define the sigmoid activation function
# (this generalized example applies the sigmoid at every layer, rather than
# the ReLU-for-hidden / sigmoid-for-output convention described above)
def sigmoid(x):
    return 1 / (1 + np.exp(-x))


# Define the forward propagation function
def forward_propagation(X):
    layer_outputs = []
    for i in range(num_layers - 1):
        if i == 0:
            layer_input = X.T  # the first layer reads the (samples, features) input
        else:
            layer_input = layer_outputs[-1]  # later layers read the previous layer's activations
        
        # Affine transformation followed by the sigmoid activation
        pre_activation = np.dot(layer_input, weights[i]) + biases[i]
        layer_output = sigmoid(pre_activation)
        layer_outputs.append(layer_output)
    
    return layer_outputs

# Example input data
X = np.array([[0, 0, 1, 1], [0, 1, 0, 1]]).T  # Two input samples, stored as (4 features, 2 samples)

# Perform forward propagation
output_layers = forward_propagation(X)

# Print the output of each layer
for i, output in enumerate(output_layers):
    print(f"Output of Layer {i + 1}:\n{output}")

# The final output is in output_layers[-1]
print("Final Output:")
print(output_layers[-1])
Output of Layer 1:
[[0.61720885 0.95018557 0.63549859 0.60700562 0.39885637]
 [0.34443241 0.92013388 0.41180333 0.55227054 0.39090796]]
Output of Layer 2:
[[0.81859503 0.31702007 0.57683967 0.3962107 ]
 [0.86187865 0.24380832 0.5150207  0.42459047]]
Output of Layer 3:
[[0.25982904 0.22612967 0.09002841]
 [0.23800412 0.20950305 0.09584056]]
Output of Layer 4:
[[0.40077544]
 [0.40814945]]
Final Output:
[[0.40077544]
 [0.40814945]]