﻿ solving a sparse non linear system of equations using scipy.optimize.root

# solving a sparse non linear system of equations using scipy.optimize.root

I want to solve the following non-linear system of equations.

### Notes

• the `dot` between `a_k` and `x` represents `dot product`.
• the `0` in the first equation represents `0 vector` and `0` in the second equation is `scaler 0`
• all the matrices are sparse if that matters.

### Known

• `K` is an `n x n` (positive definite) matrix
• each `A_k` is a known (symmetric) matrix
• each `a_k` is a known n x 1 vector
• `N` is known (let's say N = 50). But I need a method where I can easily change N.

### Unknown (trying to solve for)

• `x` is an `n x 1` a vector.
• each `alpha_k` for `1 <= k <= N` a scaler

### My thinking.

I am thinking of using scipy root to find x and each alpha_k. We essentially have `n` equations from each row of the first equation and another `N` equations from the constraint equations to solve for our `n + N` variables. Therefore we have the required number of equations to have a solution.

I also have a reliable initial guess for `x` and the `alpha_k's`.

### Toy example.

``````n = 4
N = 2
K = np.matrix([[0.5, 0, 0, 0], [0, 1, 0, 0],[0,0,1,0], [0,0,0,0.5]])
A_1 = np.matrix([[0.98,0,0.46,0.80],[0,0,0.56,0],[0.93,0.82,0,0.27],[0,0,0,0.23]])
A_2 = np.matrix([[0.23, 0,0,0],[0.03,0.01,0,0],[0,0.32,0,0],[0.62,0,0,0.45]])
a_1 = np.matrix(scipy.rand(4,1))
a_2 = np.matrix(scipy.rand(4,1))
``````

We are trying to solve for

`````` x = [x1, x2, x3, x4] and alpha_1, alpha_2
``````

### Questions:

1. I can actually brute force this toy problem and feed it to the solver. But how do I do I solve this toy problem in such a way that I can extend it easily to the case when I have let's say `n=50` and `N=50`
2. I will probably have to explicitly compute the Jacobian for larger matrices??.

Can anyone give me any pointers?

I think the `scipy.optimize.root` approach holds water, but steering clear of the trivial solution might be the real challenge for this system of equations.

In any event, this function uses `root` to solve the system of equations.

``````def solver(x0, alpha0, K, A, a):
'''
x0     - nx1 numpy array. Initial guess on x.
alpha0 - nx1 numpy array. Initial guess on alpha.
K      - nxn numpy.array.
A      - Length N List of nxn numpy.arrays.
a      - Length N list of nx1 numpy.arrays.
'''

# Establish the function that produces the rhs of the system of equations.
n = K.shape[0]
N = len(A)
def lhs(x_alpha):
'''
x_alpha is a concatenation of x and alpha.
'''

x = np.ravel(x_alpha[:n])
alpha = np.ravel(x_alpha[n:])
lhs_top = np.ravel(K.dot(x))
for k in xrange(N):
lhs_top += alpha[k]*(np.ravel(np.dot(A[k], x)) + np.ravel(a[k]))

lhs_bottom = [0.5*x.dot(np.ravel(A[k].dot(x))) + np.ravel(a[k]).dot(x)
for k in xrange(N)]

lhs = np.array(lhs_top.tolist() + lhs_bottom)

return lhs

# Solve the system of equations.
x0.shape = (n, 1)
alpha0.shape = (N, 1)
x_alpha_0 = np.vstack((x0, alpha0))
sol = root(lhs, x_alpha_0)
x_alpha_root = sol['x']

# Compute norm of residual.
res = sol['fun']
res_norm = np.linalg.norm(res)

# Break out the x and alpha components.
x_root = x_alpha_root[:n]
alpha_root = x_alpha_root[n:]

return x_root, alpha_root, res_norm
``````

Running on the toy example, however, only produces the trivial solution.

``````# Toy example.
n = 4
N = 2
K = np.matrix([[0.5, 0, 0, 0], [0, 1, 0, 0],[0,0,1,0], [0,0,0,0.5]])
A_1 = np.matrix([[0.98,0,0.46,0.80],[0,0,0.56,0],[0.93,0.82,0,0.27],
[0,0,0,0.23]])
A_2 = np.matrix([[0.23, 0,0,0],[0.03,0.01,0,0],[0,0.32,0,0],
[0.62,0,0,0.45]])
a_1 = np.matrix(scipy.rand(4,1))
a_2 = np.matrix(scipy.rand(4,1))
A = [A_1, A_2]
a = [a_1, a_2]
x0 = scipy.rand(n, 1)
alpha0 = scipy.rand(N, 1)

print 'x0 =', x0
print 'alpha0 =', alpha0

x_root, alpha_root, res_norm = solver(x0, alpha0, K, A, a)

print 'x_root =', x_root
print 'alpha_root =', alpha_root
print 'res_norm =', res_norm
``````

Output is

``````x0 = [[ 0.00764503]
[ 0.08058471]
[ 0.88300129]
[ 0.85299622]]
alpha0 = [[ 0.67872815]
[ 0.69693346]]
x_root = [  9.88131292e-324  -4.94065646e-324   0.00000000e+000
0.00000000e+000]
alpha_root = [ -4.94065646e-324   0.00000000e+000]
res_norm = 0.0
``````