superluComplex

Computes the LU factorization of a general complex sparse matrix by a column method and solves the complex sparse linear system of equations \(Ax=b\) .

Synopsis

superluComplex (a, b)

Required Arguments

a[] (Input)
Array of length nz containing the location and value of each nonzero entry in the matrix. See the explanation of the sparse element array in the section Matrix Storage Modes in the “Introduction” chapter of this manual.
complex b[] (Input)
Array of length n containing the right-hand side.

Return Value

The solution \(x\) of the sparse linear system \(Ax=b\) . If no solution was computed, then None is returned.

Optional Arguments

equilibrate, int (Inputs)

Specifies if the input matrix A should be equilibrated before factorization.

equilibrate Description
0 Do not equilibrate A before factorization
1 Equilibrate A before factorization.

Default: equilibrate = 0

columnOrderingMethod (Input)

The column ordering method used to preserve sparsity prior to the factorization process. Select the ordering method by setting method to one of the following:

method Description
NATURAL Natural ordering, i.e., the column ordering of the input matrix.
MMD_ATA Minimum degree ordering on the structure of \(A^TA\) .
MMD_AT_PLUS_A Minimum degree ordering on the structure of \(A^T+A\) .
COLAMD Column approximate minimum degree ordering.
PERMC Use ordering given in permutation vector colpermVector, which is input by the user through optional argument colpermVector. Vector colpermVector is a permutation of the numbers 0,1,…,n-1.

Default: method = COLAMD

colpermVector (Input)
Array of length n which defines the permutation matrix \(P_c\) before postordering. This argument is required if columnOrderingMethod with method = PERMC is used. Otherwise, it is ignored.
transpose (Input)

Indicates if the problem \(Ax=b\) or one of the transposed problems \(A^Tx=b\) or \(A^H x = b\) is to be solved.

transpose Description
0 Solve \(Ax=b\) .
1 Solve \(A^Tx=b\) .This option can be used in conjunction with either of the options that supply the factorization.
2 Solve \(A^H x = b\) .This option can be used in conjunction with either of the options that supply the factorization.

Default: transpose = 0

iterativeRefinement (Input)

Indicates if iterative refinement is desired.

iterativeRefinement Description
0 No iterative refinement.
1 Do iterative refinement.

Default: iterativeRefinement = 1

factorSolve, int (Input)

Indicates if the LU factorization, the solution of a linear system or both are to be computed.

factorSolve Description
0 Compute the LU factorization of the input matrix A and solve the system \(Ax=b\) .
1

Only compute the LU factorization of the input matrix and return.

The LU factorization is returned via optional argument returnSparseLuFactor .

Input argument b is ignored.

2

Only solve \(Ax=b\) given the LU factorization of \(A\) .

The LU factorization of \(A\) must be supplied via optional argument supplySparseLuFactor.

Input argument a is ignored unless iterative refinement, computation of the condition number or computation of the reciprocal pivot growth factor is required.

Default: factorSolve = 0

diagPivotThresh, float (Input)

Specifies the threshold used for a diagonal entry to be an acceptable pivot, 0.0 \(\leq\) diagPivotThresh \(\leq\) 1.0.

Default: diagPivotThresh = 1.0.

symmetricMode, int (Input)

Indicates if the symmetric mode option is to be used. This mode should be applied if the input matrix \(A\) is diagonally dominant or nearly so. The user should then define a small diagonal pivot threshold (e.g., 0.0 or 0.01) via optional argument diagPivotThresh and choose an \(\left( A^T + A \right)\) -based column permutation algorithm (e.g., column permutation method MMD_AT_PLUS_A).

symmetricMode Description
0 Do not use symmetric mode option.
1 Use symmetric mode option.

Default: symmetricMode = 0

performanceTuning, int (Input)
Vector of length 6 containing positive parameters that allow the user to tune the performance of the matrix factorization algorithm.
i Description of performanceTuning[i]
0

The panel size.

Default: performanceTuning[0] = 10

1

The relaxation parameter to control supernode amalgamation.

Default: performanceTuning[1] = 5

2

The maximum allowable size for a supernode.

Default: performanceTuning[2] = 100

3

The minimum row dimension to be used for 2D blocking.

Default: performanceTuning[3] = 200

4

The minimum column dimension to be used for 2D blocking.

Default: performanceTuning[4] = 40

5

The estimated fill factor for L and U, compared to A.

Default: performanceTuning[5] = 20

cscFormat, int hbColPtr[], int hbRowInd[], complex hbValues[] (Input)
Accept the coefficient matrix in Compressed Sparse Column (CSC) Format in the main Introduction chapter of this manual for a discussion of this storage scheme.
supplySparseLuFactor, structure (Input)
A structure containing the LU factorization of the input matrix computed with the returnSparseLuFactor option.
returnSparseLuFactor (Output)
A structure containing the LU factorization of the input matrix.
condition (Output)
The estimate of the reciprocal condition number of matrix A after equilibration (if done).
pivotGrowthFactor (Output)

The reciprocal pivot growth factor

\[\min_j \left\{ \| \left(P_rD_rAD_cP_c\right)_j \|_{\infty} / \| U_j \|_{\infty} \right\}\]

If pivotGrowthFactor is much less than 1, the stability of the LU factorization could be poor.

forwardErrorBound (Output)
The estimated forward error bound for the solution vector x. This option requires argument iterativeRefinement set to 1.
backwardError (Output)
The componentwise relative backward error of the solution vector x. This option requires argument iterativeRefinement set to 1.

Description

Consider the sparse linear system of equations

\[Ax = b\]

Here, \(A\) is a general square, nonsingular \(n\) by \(n\) sparse matrix, and \(x\) and \(b\) are vectors of length \(n\) . All entries in \(A\), \(x\) and \(b\) are of complex type.

Gaussian elimination, applied to the system above, can be shortly described as follows:

  1. Compute a triangular factorization \(P_r D_r AD_c P_c = LU\) . Here, \(D_r\) and \(D_c\) are positive definite diagonal matrices to equilibrate the system and \(P_r\) and \(P_c\) are permutation matrices to ensure numerical stability and preserve sparsity. \(L\) is a unit lower triangular matrix and \(U\) is an upper triangular matrix.

  2. Solve \(Ax=b\) by evaluating

    \[x = A^{-1}b = D_c\left(P_c\left(U^{-1}\left(L^{-1}\left(P_r\left(D_rb\right)\right)\right)\right)\right)\]

    This is done efficiently by multiplying from right to left in the last expression: Scale the rows of \(b\) by \(D_r\). Multiplying \(P_r \left( D_r b \right)\) means permuting the rows of \(D_rb\) .

    Multiplying \(L^{-1} \left( P_r D_r b \right)\) means solving the triangular system of equations with matrix \(L\) by substitution. Similarly, multiplying \(U^{-1} \left( L^{-1} \left( P_r D_r b \right) \right)\) means solving the triangular system with \(U\).

Function superluComplex handles step 1 above by default or if optional argument factorSolve is used and set to 1. More precisely, before \(Ax=b\) is solved, the following steps are performed:

  1. Equilibrate matrix \(A\), i.e. compute diagonal matrices \(D_r\) and \(D_c\) so that \(\hat{A} = D_r A D_c\) is “better conditioned” than \(A\) , i.e. \(\hat{A}^{-1}\) is less sensitive to perturbations in \(\hat{A}\) than \(A^{-1}\) is to perturbations in \(A\) .

  2. Order the columns of \(\hat{A}\) to increase the sparsity of the computed \(L\) and \(U\) factors, i.e. replace \(\hat{A}\) by \(\hat{A}P_c\) where \(P_c\) is a column permutation matrix.

  3. Compute the LU factorization of \(\hat{A}P_c\). For numerical stability, the rows of \(\hat{A}P_c\) are eventually permuted through the factorization process by scaled partial pivoting, leading to the decomposition \(\tilde{A} := P_r \hat{A} P_c = LU\). The LU factorization is done by a left looking supernode-panel algorithm with 2-D blocking. See Demmel, Eisenstat, Gilbert et al. (1999) for further information on this technique.

  4. Compute the reciprocal pivot growth factor

    \[\min_{1 \leq j \leq n} \frac{\|\tilde{A}_j\|_{\infty}}{\|U_j\|_{\infty}}\]

    where \(\tilde{A}_j\) and \(U_j\) denote the \(j\) -th column of matrices \(\tilde{A}\) and \(U\), respectively.

  5. Estimate the reciprocal of the condition number of matrix \(\tilde{A}\) .

During the solution process, this information is used to perform the following steps:

  1. Solve the system \(Ax=b\) using the computed triangular L and U factors.
  2. Iteratively refine the solution, again using the computed triangular factors. This is equivalent to Newton’s method.
  3. Compute forward and backward error bounds for the solution vector \(x\).

Some of the steps mentioned above are optional. Their settings can be controlled by the appropriate optional arguments of function superluComplex.

Function superluComplex uses a supernodal storage scheme for the LU factorization of matrix A. The factorization is contained in a returned structure and two sub-structures.

For a definition of supernodes and its use in sparse unsymmetric LU factorization, see the SuperLU Users’ guide (1999) and J.W. Demmel, S. C. Eisenstat et al. (1999).

As an example, consider the matrix

\[\begin{split}A = \begin{bmatrix} 1-i & 0 & 1-i & 1-i & 0 \\ 2 & 1-i & 0 & 0 & 0 \\ 0 & 1+i & 1-i & 0 & 0 \\ 0 & 0 & 0 & 1+i & 1-i \\ 2 & 1+i & 0 & 0 & 2-i \\ \end{bmatrix}\end{split}\]

Factorization of this matrix via superlu using natural column ordering, no equilibration, setting performanceTuning[1] from its default value 5 to 1 and reducing the diagonal pivot thresh factor to 0.5 results in the following LU decomposition:

\[\begin{split}A = LU = \begin{bmatrix} 1 & & & & \\ 1+i & 1 & & & \\ & i & 1 & & \\ & & & 1 & \\ 1+i & i & 2i & 2 & 1 \\ \end{bmatrix} \begin{bmatrix} 1-i & & 1-i & 1-i & \\ & 1-i & -2 & -2 & \\ & & 1+i & 2i & \\ & & & 1+i & 1-i \\ & & & & i \\ \end{bmatrix}\end{split}\]

Considering the filled matrix \(F\) ( \(I\) denoting the identity matrix),

\[\begin{split}F = L+U-I = \begin{bmatrix} 1-i & & 1-i & 1-i & \\ 1+i & 1-i & -2 & -2 & \\ & i & 1+i & 2i & \\ & & & 1+i & 1-i \\ 1+i & i & 2i & 2 & i \\ \end{bmatrix}\end{split}\]

the supernodal structure of the factors of matrix A can be described by

\[\begin{split}\begin{bmatrix} s_1 & & u_3 & u_4 & \\ s_1 & s_2 & s_2 & u_4 & \\ & s_2 & s_2 & u_4 & \\ & & & s_3 & s_3 \\ s_1 & s_2 & s_2 & s_3 & s_3 \\ \end{bmatrix}\end{split}\]

where \(s_i\) denotes a nonzero entry in the \(i\) th supernode and \(u_i\) denotes a nonzero entry in the i‑th column of \(U\) outside the supernodal block.

Therefore, in a supernodal storage scheme the supernodal part of matrix F is stored as the lower block-diagonal matrix

\[\begin{split}L_{\mathit{snode}} \begin{bmatrix} 1-i & & & & \\ 1+i & 1-i & -2 & & \\ & i & 1+i & & \\ & & & 1+i & 1-i \\ 1+i & i & 2i & 2 & i \\ \end{bmatrix}\end{split}\]

and the part outside the supernodes as the upper triangular matrix

\[\begin{split}U_{\mathit{snode}} = \begin{bmatrix} * & & 1-i & 1-i & \\ & * & & -2 & \\ & & * & 2i & \\ & & & * & \\ & & & & * \\ \end{bmatrix}\end{split}\]

This is in accordance with the output for structure super_lu_factor:

Equilibration method: 0

Scale vectors:
rowscale:   1.000000 1.000000 1.000000 1.000000 1.000000
columnscale: 1.000000 1.000000 1.000000 1.000000 1.000000
Permutation vectors:
colperm: 0 1 2 3 4
rowperm: 0 1 2 3 4
Harwell-Boeing matrix U:
nrow 5, ncol 5, nnz 11
nzval: (1.000000,-1.000000) (-2.000000,0.000000) (0.000000,2.000000)
      (1.000000,-1.000000)
rowind: 0 1 2 0
colptr: 0 0 0 1 4 4

Supernodal matrix L:
nrow 5, ncol 5, nnz 11, nsuper 2
nzval:
0      0      (1.000000,-1.000000)
1      0      (1.000000,1.000000)
4      0      (1.000000,1.000000)
1      1      (1.000000,-1.000000)
2      1      (0.000000,1.000000)
4      1      (0.000000,1.000000)
1      2      (-2.000000,0.000000)
2      2      (1.000000,1.000000)
4      2      (0.000000,2.000000)
3      3      (1.000000,1.000000)
4      3      (2.000000,0.000000)
3      4      (1.000000,-1.000000)
4      4      (0.000000,1.000000)

nzval_colptr: 0 3 6 9 11 13
rowind: 0 1 4 1 2 4 3 4
rowind_colptr: 0 3 6 6 8 8
col_to_sup: 0 1 1 2 2
sup_to_col: 0 1 3 5

Function superlu is based on the SuperLU code written by Demmel, Gilbert, Li et al. For more detailed explanations of the factorization and solve steps, see the SuperLU User’s Guide (1999).

Copyright (c) 2003, The Regents of the University of California, through Lawrence Berkeley National Laboratory (subject to receipt of any required approvals from U.S. Dept. of Energy)

All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

(1) Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

(2) Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

(3) Neither the name of Lawrence Berkeley National Laboratory, U.S. Dept. of Energy nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Examples

Example 1

The LU factorization of the sparse complex 6×6 matrix

\[\begin{split}A = \begin{bmatrix} 10+7i & 0 & 0 & 0 & 0 & 0 \\ 0 & 3+2i & -3 & -1+2i & 0 & 0 \\ 0 & 0 & 4+2i & 0 & 0 & 0 \\ -2-4i & 0 & 0 & 1+6i & -1+3i & 0 \\ -5+4i & 0 & 0 & -5 & 12+2i & -7+7i \\ -1+12i & -2+8i & 0 & 0 & 0 & 3+7i \\ \end{bmatrix}\end{split}\]

is computed. Let

\[y: = (1+i, 2 + 2i, 3 + 3i, 4 + 4i, 5 + 5i, 6 + 6i)^T\]

so that

\[b: = Ay = (3 + 17i, -19 + 5i, 6 + 18i, -38 + 32i, -63 + 49i, -57 + 83i)^T\]
\[b_1: = A^Ty = (-112 + 54i, -58 + 46i, 12i, -51 + 5i, 34 + 78i, -94 + 60i)^T\]

and

\[b_2: = A^Hy = (54 - 112i,  46 - 58i, 12,  5 - 51i, 78 + 34i, 60 - 94i^T)\]

The LU factorization of A is used to solve the sparse complex linear systems \(Ax = b\), \(A^Tx = b_1\) and \(A^Hx = b_2\).

from numpy import *
from pyimsl.math.superluSmpComplex import superluSmpComplex
from pyimsl.math.writeMatrixComplex import writeMatrixComplex

a = [[0, 0, (10.0 + 7.0j)],
     [1, 1, (3.0 + 2.0j)],
     [1, 2, (-3.0 + 0.0j)],
     [1, 3, (-1.0 + 2.0j)],
     [2, 2, (4.0 + 2.0j)],
     [3, 0, (-2.0 - 4.0j)],
     [3, 3, (1.0 + 6.0j)],
     [3, 4, (-1.0 + 3.0j)],
     [4, 0, (-5.0 + 4.0j)],
     [4, 3, (-5.0 + 0.0j)],
     [4, 4, (12.0 + 2.0j)],
     [4, 5, (-7.0 + 7.0j)],
     [5, 0, (-1.0 + 12.0j)],
     [5, 1, (-2.0 + 8.0j)],
     [5, 5, (3.0 + 7.0j)]]
b = array([(3.0 + 17.0j), (-19.0 + 5.0j), (6.0 + 18.0j),
           (-38.0 + 32.0j), (-63.0 + 49.0j), (-57.0 + 83.0j)])
b1 = array([(-112.0 + 54.0j), (-58.0 + 46.0j), (0.0 + 12.0j),
            (-51.0 + 5.0j), (34.0 + 78.0j), (-94.0 + 60.0j)])
b2 = array([(54.0 - 112.0j), (46.0 - 58.0j), (12.0 + 0.0j),
            (5.0 - 51.0j), (78.0 + 34.0j), (60.0 - 94.0j)])
x = superluSmpComplex(a, b)
writeMatrixComplex("solution to A*x = b", x, column=True)

x = superluSmpComplex(a, b1, transpose=1)
writeMatrixComplex("solution to A^T*x = b1", x, column=True)

x = superluSmpComplex(a, b2, transpose=2)
writeMatrixComplex("solution to A^H*x = b2", x, column=True)

Output

 
     solution to A*x = b
1  (          1,          1)
2  (          2,          2)
3  (          3,          3)
4  (          4,          4)
5  (          5,          5)
6  (          6,          6)
 
   solution to A^T*x = b1
1  (          1,          1)
2  (          2,          2)
3  (          3,          3)
4  (          4,          4)
5  (          5,          5)
6  (          6,          6)
 
   solution to A^H*x = b2
1  (          1,          1)
2  (          2,          2)
3  (          3,          3)
4  (          4,          4)
5  (          5,          5)
6  (          6,          6)

Example 2

This example uses the matrix \(A = E(1000,10)\) to show how the LU factorization of A can be used to solve a linear system with the same coefficient matrix A but different right-hand side. Maximum absolute errors are printed.

from __future__ import print_function
from numpy import *
from pyimsl.math.ctime import ctime
from pyimsl.math.generateTestCoordinateComplex import generateTestCoordinateComplex
from pyimsl.math.superluComplex import superluComplex
from pyimsl.math.superluFactorFreeComplex import superluFactorFreeComplex
from pyimsl.math.matMulRectCoordinateComplex import matMulRectCoordinateComplex
from pyimsl.math.vectorNormComplex import vectorNormComplex

n = 1000
c = 10

# Get the coefficient matrix
a = generateTestCoordinateComplex(n, c)

# Set two different predetermined solutions
mod_five = empty(n, dtype=complex)
mod_ten = empty(n, dtype=complex)
for i in range(0, n, 1):
    mod_five[i] = ((i % 5) + 0.0j)
    mod_ten[i] = ((i % 10) + 0.0j)

# Choose b so that x will approximate mod_five
b = matMulRectCoordinateComplex("A*x",
                                aMatrix={'nrowa': n, 'ncola': n, 'a': a},
                                xVector=mod_five)


# Time the factor/solve
time_factor_solve = ctime()
lu_factor = []
x = superluComplex(a, b,
                   returnSparseLuFactor=lu_factor)

time_factor_solve = ctime() - time_factor_solve

# Compute max abolute error
index = []
error_factor_solve = vectorNormComplex(x,
                                       secondVector=mod_five,
                                       infNorm=index)

# Get new right hand side -- b = A * mod_ten
b = matMulRectCoordinateComplex("A*x",
                                aMatrix={'nrowa': n, 'ncola': n, 'a': a},
                                xVector=mod_ten)

# Use the previously computed factorization
# to solve Ax = b
time_solve = ctime()
x = superluComplex(a, b,
                   supplySparseLuFactor=lu_factor[0],
                   factorSolve=2)

time_solve = ctime() - time_solve
error_solve = vectorNormComplex(x,
                                secondVector=mod_ten,
                                infNorm=index)

# Print errors and ratio of execution times
print("absolute error (factor/solve) = %e" % error_factor_solve)
print("absolute error (solve)        = %e" % error_solve)

superluFactorFreeComplex(lu_factor[0])

Output

absolute error (factor/solve) = 2.721484e-15
absolute error (solve)        = 5.397709e-15

Warning Errors

IMSL_ILL_CONDITIONED

The input matrix is too ill-conditioned. An estimate of the reciprocal of its \(L_1\) condition number is “rcond” = #.

The solution might not be accurate.

Fatal Errors

IMSL_SINGULAR_MATRIX The input matrix is singular.