General Feed Forward for an ANN
The mathematical calculations for a deep artificial neural network (ANN) with \(n\) hidden layers and one output layer can be generalized as follows. Let’s denote the inputs as \(x_1, x_2, \ldots, x_m\) (where \(m\) is the number of input features), and the weights and biases for each layer as \(w^{(k)}\) and \(b^{(k)}\) respectively, where \(k\) represents the layer index.
The calculations for the nodes in each hidden layer and the output layer are as follows:
For the first hidden layer (\(k = 1\)):
\[
h^{(1)}_j = \text{ReLU}\left(\sum_{i=1}^{m} w^{(1)}_{ij} x_i + b^{(1)}_j\right) \quad \text{for } j = 1, 2, \ldots, n_1
\]
For subsequent hidden layers (\(k = 2, 3, \ldots, n\)):
\[
h^{(k)}_j = \text{ReLU}\left(\sum_{i=1}^{n_{k-1}} w^{(k)}_{ij} h^{(k-1)}_i + b^{(k)}_j\right) \quad \text{for } j = 1, 2, \ldots, n_k
\]
For the output layer (\(k = n+1\)):
\[
y = \text{Sigmoid}\left(\sum_{i=1}^{n_n} w^{(n+1)}_i h^{(n)}_i + b^{(n+1)}\right)
\]
In the above equations:
\(n\) is the total number of hidden layers.
\(n_k\) represents the number of nodes in the \(k\)-th hidden layer.
\(\text{ReLU}(x) = \max(0, x)\) is the Rectified Linear Unit activation function for hidden layers.
\(\text{Sigmoid}(x) = \frac{1}{1 + e^{-x}}\) is the Sigmoid activation function for the output layer.
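To see how these per-node sums map onto the NumPy code that follows, each layer can also be written in vectorized form. Here \(W^{(k)}\) is notation introduced only for this note: the matrix whose entry in row \(i\), column \(j\) is \(w^{(k)}_{ij}\). Treating the activations as row vectors with \(h^{(0)} = x\) and applying the activation functions element-wise:
\[
h^{(k)} = \text{ReLU}\left(h^{(k-1)} W^{(k)} + b^{(k)}\right), \qquad y = \text{Sigmoid}\left(h^{(n)} W^{(n+1)} + b^{(n+1)}\right)
\]
This is the same shape convention used by np.dot(layer_input, weights[i]) + biases[i] in the implementations below.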
During training, the weights (\(w^{(k)}\)) and biases (\(b^{(k)}\)) are adjusted using optimization algorithms like gradient descent to minimize the difference between the predicted output \(y\) and the actual target values.
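As a brief illustration of that adjustment, one standard gradient-descent step moves every weight and bias in the direction that reduces a chosen loss \(L\) (commonly binary cross-entropy when the output is a sigmoid), scaled by a learning rate \(\eta\):
\[
w^{(k)} \leftarrow w^{(k)} - \eta \frac{\partial L}{\partial w^{(k)}}, \qquad b^{(k)} \leftarrow b^{(k)} - \eta \frac{\partial L}{\partial b^{(k)}}
\]
The partial derivatives are computed with backpropagation; the code below covers only the forward pass.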
This generalizes the calculations for a deep neural network with \(n\) hidden layers. The specific values of \(n_k\) and the weights and biases would depend on the architecture of your neural network and the specific problem you’re solving.
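The following NumPy example implements this forward pass for the simplest case: a single hidden layer with four inputs, five hidden nodes, and one output node, using ReLU for the hidden layer and the sigmoid for the output, exactly as in the equations above.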
import numpy as np
# Define the neural network architecture
input_size = 4 # Number of input features
hidden_size = 5 # Number of nodes in the hidden layer
output_size = 1 # Number of output nodes
# Initialize random weights and biases
np.random.seed(0)
# Initialize weights and biases with random values
weights_input_hidden = np.random.randn(input_size, hidden_size)
bias_hidden = np.zeros((1, hidden_size))
weights_hidden_output = np.random.randn(hidden_size, output_size)
bias_output = np.zeros((1, output_size))
# Define the sigmoid activation function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Define the ReLU activation function
def ReLU(x):
    return np.maximum(0, x)
# Define the forward propagation function
def forward_propagation(X):
    # Calculate the values for the hidden layer
    hidden_input = np.dot(X.T, weights_input_hidden) + bias_hidden
    hidden_output = ReLU(hidden_input)
    # Calculate the values for the output layer
    output_input = np.dot(hidden_output, weights_hidden_output) + bias_output
    output = sigmoid(output_input)
    return hidden_input, hidden_output, output_input, output
# Example input data
X = np.array([[0, 0, 1, 1], [0, 1, 0, 1]]).T # Two input samples
# Perform forward propagation
hidden_input, hidden_output, output_input, output = forward_propagation(X)
# Print the output
print("Output:")
print(output)
Output:
[[0.70377943]
[0.80873491]]
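The second example generalizes the forward pass to an arbitrary number of hidden layers by storing the per-layer weight matrices and bias vectors in lists and looping over them. Note that, unlike the equations above, this version applies the sigmoid activation at every layer rather than ReLU on the hidden layers; a sketch that restores the ReLU/sigmoid split follows the output.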
import numpy as np
# Define the neural network architecture
input_size = 4 # Number of input features
hidden_layer_sizes = [5, 4, 3] # List of hidden layer sizes
output_size = 1 # Number of output nodes
# Initialize random weights and biases
np.random.seed(0)
layer_sizes = [input_size] + hidden_layer_sizes + [output_size]
num_layers = len(layer_sizes)
# Initialize weights and biases with random values
weights = [np.random.randn(layer_sizes[i], layer_sizes[i+1]) for i in range(num_layers - 1)]
biases = [np.zeros((1, layer_sizes[i+1])) for i in range(num_layers - 1)]
# Define the sigmoid activation function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))
# Define the forward propagation function
def forward_propagation(X):
    layer_outputs = []
    for i in range(num_layers - 1):
        if i == 0:
            layer_input = X.T
        else:
            layer_input = layer_outputs[-1]
        # Calculate the values for the current layer
        # (note: the sigmoid is applied at every layer here, including the hidden ones)
        layer_input = np.dot(layer_input, weights[i]) + biases[i]
        layer_output = sigmoid(layer_input)
        layer_outputs.append(layer_output)
    return layer_outputs
# Example input data
X = np.array([[0, 0, 1, 1], [0, 1, 0, 1]]).T # Two input samples
# Perform forward propagation
output_layers = forward_propagation(X)
# Print the output of each layer
for i, output in enumerate(output_layers):
    print(f"Output of Layer {i + 1}:\n{output}")
# The final output is in output_layers[-1]
print("Final Output:")
print(output_layers[-1])
Output of Layer 1:
[[0.61720885 0.95018557 0.63549859 0.60700562 0.39885637]
[0.34443241 0.92013388 0.41180333 0.55227054 0.39090796]]
Output of Layer 2:
[[0.81859503 0.31702007 0.57683967 0.3962107 ]
[0.86187865 0.24380832 0.5150207 0.42459047]]
Output of Layer 3:
[[0.25982904 0.22612967 0.09002841]
[0.23800412 0.20950305 0.09584056]]
Output of Layer 4:
[[0.40077544]
[0.40814945]]
Final Output:
[[0.40077544]
[0.40814945]]
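The multi-layer example above applies the sigmoid at every layer for simplicity. A minimal sketch of how the same loop could instead use ReLU for the hidden layers and the sigmoid only for the output layer, matching the earlier equations, might look like the following; the function name forward_propagation_relu is purely illustrative, and it reuses the weights, biases, num_layers, sigmoid, and X already defined above.
# ReLU activation (not used in the multi-layer example above)
def ReLU(x):
    return np.maximum(0, x)

# Sketch: same loop as forward_propagation, but with ReLU on the hidden layers
# and the sigmoid only on the final (output) layer
def forward_propagation_relu(X):
    layer_outputs = []
    layer_input = X.T  # samples as rows, features as columns
    for i in range(num_layers - 1):
        pre_activation = np.dot(layer_input, weights[i]) + biases[i]
        if i < num_layers - 2:
            layer_output = ReLU(pre_activation)      # hidden layer
        else:
            layer_output = sigmoid(pre_activation)   # output layer
        layer_outputs.append(layer_output)
        layer_input = layer_output
    return layer_outputs

# The final output will differ from the sigmoid-only values printed above
print(forward_propagation_relu(X)[-1])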