Problem Sheet 8
Question 1
Suppose we have a cost function for producing a product that depends on two variables, \(x\) and \(y\), given by:
Find the values of \(x\) and \(y\) that minimize the cost function using the Newton-Raphson method. Choose an initial guess of \((x_0, y_0) = (1, 1)\).
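A minimal sketch of how such a computation might look is given below. Since the cost function itself is not reproduced above, a hypothetical quadratic stand-in is used; each Newton step solves the linear system \(H \, \delta = \nabla C\) at the current iterate.

```python
import numpy as np

# Minimal sketch of the two-variable Newton-Raphson step for minimization.
# The cost function below is a hypothetical stand-in, not the one from the
# question: C(x, y) = x**2 + x*y + y**2 - 4*x - 5*y.

def grad(v):
    x, y = v
    return np.array([2*x + y - 4, x + 2*y - 5])

def hessian(v):
    return np.array([[2.0, 1.0],
                     [1.0, 2.0]])  # constant for a quadratic cost

v = np.array([1.0, 1.0])  # initial guess (x0, y0) = (1, 1)
for _ in range(10):
    step = np.linalg.solve(hessian(v), grad(v))  # solve H @ step = grad
    v = v - step
    if np.linalg.norm(step) < 1e-10:  # converged
        break

print(v)  # stationary point of the hypothetical cost
```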
Question 2
a. Suppose we have a loss function of two variables, \(\theta_1\) and \(\theta_2\), given by:
Find the values of \(\theta_1\) and \(\theta_2\) that minimize the loss function using the Newton-Raphson method, with the initial conditions \(\theta_1 = 2.0\) and \(\theta_2 = 2.0\).
b. Suppose we have a cost function for producing a product that depends on two variables, \(x\) and \(y\), given by:
Find the values of \(x\) and \(y\) that minimize the cost function using the Newton-Raphson method. Choose an initial guess of \((x_0, y_0) = (1, 1)\).
c. Describe in your own words the steps of the multi-variable Newton-Raphson method.
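As a companion to part c, here is a hedged sketch of the general loop, assuming the caller supplies callables for the gradient and Hessian of the cost:

```python
import numpy as np

def newton_raphson(grad, hess, theta0, tol=1e-8, max_iter=50):
    """Generic multi-variable Newton-Raphson minimizer.

    1. Evaluate the gradient and Hessian at the current point.
    2. Solve hess(theta) @ delta = grad(theta) for the Newton step.
    3. Update theta <- theta - delta.
    4. Stop when the step is smaller than tol.
    """
    theta = np.asarray(theta0, dtype=float)
    for _ in range(max_iter):
        delta = np.linalg.solve(hess(theta), grad(theta))
        theta = theta - delta
        if np.linalg.norm(delta) < tol:
            break
    return theta
```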
Question 4
Describe in your own words the perceptron and the role of different activation functions.
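For illustration, a minimal perceptron sketch with interchangeable activation functions is shown below; the inputs, weights, and bias are illustrative values only.

```python
import numpy as np

def step(z):       # classic perceptron: hard threshold
    return np.where(z >= 0, 1.0, 0.0)

def sigmoid(z):    # smooth, outputs in (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):       # rectified linear unit
    return np.maximum(0.0, z)

def perceptron(x, w, b, activation):
    return activation(np.dot(w, x) + b)  # weighted sum, then activation

x = np.array([0.5, -1.0, 2.0])
w = np.array([0.4, 0.3, -0.2])
b = 0.1
for act in (step, sigmoid, relu):
    print(act.__name__, perceptron(x, w, b, act))
```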
Question 5
a. Describe in your own words the McCulloch-Pitts Neuron.
b. State the mathematical formula for the feed-forward calculation of an artificial neural network (ANN) with three inputs, two hidden layers, each having three ReLU (Rectified Linear Unit) nodes, and an output layer with a Sigmoid activation function. (A code sketch of this forward pass is given after part c.)
c. State the mathematical formula for the feed-forward calculation of a general artificial neural network (ANN) with \(n_{inputs}\) inputs, \(k\) hidden layers, each having \(n_k\) ReLU (Rectified Linear Unit) nodes, and an output layer with a Sigmoid activation function.
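For parts b and c, each layer computes \(a^{(l)} = g\big(W^{(l)} a^{(l-1)} + b^{(l)}\big)\), with \(g = \mathrm{ReLU}\) in the hidden layers and \(g = \sigma\) (the Sigmoid) at the output. A minimal sketch of the part b architecture follows; all weights and biases are random placeholders, not trained values.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Layer l computes a_l = activation(W_l @ a_{l-1} + b_l).
W1, b1 = rng.normal(size=(3, 3)), rng.normal(size=3)  # hidden layer 1
W2, b2 = rng.normal(size=(3, 3)), rng.normal(size=3)  # hidden layer 2
W3, b3 = rng.normal(size=(1, 3)), rng.normal(size=1)  # output layer

x = np.array([1.0, 0.5, -0.5])       # three inputs
a1 = relu(W1 @ x + b1)               # first hidden layer
a2 = relu(W2 @ a1 + b2)              # second hidden layer
y = sigmoid(W3 @ a2 + b3)            # sigmoid output in (0, 1)
print(y)
```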
Question 6
a. i. Describe in your own words the gradient descent algorithm and state its strengths and weaknesses.
ii. Find the minimum of the simple quadratic cost function
using gradient descent, with the initial guess \(\theta_0 = 10\) and a learning rate of \(\alpha = 0.1\), for two iterations.
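Since the quadratic cost is not reproduced above, the sketch below assumes \(J(\theta) = \theta^2\) purely as a stand-in, so its gradient is \(2\theta\); with \(\theta_0 = 10\) and \(\alpha = 0.1\) it performs the two requested updates.

```python
# Two gradient-descent iterations for part (ii), on the assumed cost
# J(theta) = theta**2 with gradient dJ/dtheta = 2*theta.

theta = 10.0   # initial guess theta_0
alpha = 0.1    # learning rate

for k in range(2):
    grad = 2.0 * theta          # gradient of the assumed cost
    theta = theta - alpha * grad
    print(k + 1, theta)         # prints 8.0, then 6.4, for this cost
```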
b. In your own words, outline the proof of the theorem:
Let \( f: \mathbb{R}^n \rightarrow \mathbb{R} \) be a convex and continuously differentiable function. Assume the gradient of \( f \), denoted by \( \nabla f(x) \), is Lipschitz continuous with Lipschitz constant \( L > 0 \). If the learning rate \( \alpha \) satisfies \( 0 < \alpha \leq \frac{1}{L} \), then the sequence \( \{x_k\} \) generated by the gradient descent update rule

$$
x_{k+1} = x_k - \alpha \nabla f(x_k)
$$

converges to the global minimum of \( f \).
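For reference, one standard proof route combines the Lipschitz descent lemma with convexity. Substituting the update rule into the descent lemma gives

$$
f(x_{k+1}) \le f(x_k) - \alpha\left(1 - \frac{\alpha L}{2}\right)\|\nabla f(x_k)\|^2 \le f(x_k) - \frac{\alpha}{2}\|\nabla f(x_k)\|^2
$$

for \( 0 < \alpha \le \frac{1}{L} \), so the iterates decrease \( f \) monotonically. Convexity, \( f(x_k) - f(x^*) \le \nabla f(x_k)^\top (x_k - x^*) \), together with a telescoping sum over the iterations then yields \( f(x_k) - f(x^*) \le \frac{\|x_0 - x^*\|^2}{2 \alpha k} \), which tends to zero as \( k \to \infty \).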
c. Find the minimum of the simple cost function with two variables:
using gradient descent (see the vectorized sketch after part d), with the initial guesses
\(\theta_0 = 1.0\)
\(\theta_1 = 2.0\)
and a learning rate of \(\alpha = 0.1\), for two iterations.
d. Given the cost function with three variables:
Minimize using gradient descent, given the initial conditions
\(\theta_1 = 1.0\), \(\theta_2 = 2.0\), \(\theta_3 = 3.0\), and a learning rate of \(\alpha = 0.1\), for two iterations.
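Parts c and d differ only in dimension, so a single vectorized sketch covers both. The cost functions are not reproduced above, so \(J(\theta) = \sum_i \theta_i^2\) is assumed here as a stand-in, with gradient \(2\theta\).

```python
import numpy as np

def grad_J(theta):
    return 2.0 * theta  # gradient of the assumed cost J = sum(theta**2)

def gradient_descent(theta0, alpha=0.1, iterations=2):
    theta = np.asarray(theta0, dtype=float)
    for k in range(iterations):
        theta = theta - alpha * grad_J(theta)  # simultaneous update
        print(k + 1, theta)
    return theta

gradient_descent([1.0, 2.0])        # part (c): two variables
gradient_descent([1.0, 2.0, 3.0])   # part (d): three variables
```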
Question 7
Describe the relevance and the different types of cost functions for artificial neural networks.
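By way of illustration, two of the most common choices are sketched below: mean squared error (typical for regression) and binary cross-entropy (typical for binary classification). The example targets and predictions are arbitrary.

```python
import numpy as np

def mse(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    return -np.mean(y_true * np.log(y_pred)
                    + (1.0 - y_true) * np.log(1.0 - y_pred))

y_true = np.array([1.0, 0.0, 1.0])
y_pred = np.array([0.9, 0.2, 0.7])
print(mse(y_true, y_pred), binary_cross_entropy(y_true, y_pred))
```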
Question 8
a. Outline the back-propagation algorithm for an artificial neural network.
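A hedged sketch of the algorithm on a tiny network (one sigmoid hidden layer, a sigmoid output, and a squared-error cost; the data and shapes are illustrative only) is given below: the forward pass stores the activations, the backward pass applies the chain rule layer by layer, and gradient descent updates every weight and bias.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy data: 4 samples, 2 features, binary targets (here, XOR).
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])

W1, b1 = rng.normal(size=(2, 3)), np.zeros(3)   # input -> hidden
W2, b2 = rng.normal(size=(3, 1)), np.zeros(1)   # hidden -> output
alpha = 0.5

for epoch in range(5000):
    # Forward pass: store activations for reuse in the backward pass.
    a1 = sigmoid(X @ W1 + b1)
    a2 = sigmoid(a1 @ W2 + b2)

    # Backward pass: chain rule, output layer first.
    delta2 = (a2 - y) * a2 * (1.0 - a2)         # dC/dz2 for squared error
    delta1 = (delta2 @ W2.T) * a1 * (1.0 - a1)  # propagate to hidden layer

    # Gradient-descent update of every weight and bias.
    W2 -= alpha * a1.T @ delta2
    b2 -= alpha * delta2.sum(axis=0)
    W1 -= alpha * X.T @ delta1
    b1 -= alpha * delta1.sum(axis=0)

print(np.round(a2, 2))  # predictions should move toward the XOR targets
```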