linSolGenCoordinate¶

Solves a sparse system of linear equations $Ax = b$ . Using optional arguments, any of several related computations can be performed. These extra tasks include returning the LU factorization of A, computing the solution of $Ax = b$ given an LU factorization, setting drop tolerances, and controlling iterative refinement.

Synopsis¶

linSolGenCoordinate (a, b)

Required Arguments¶

a (Input): Vector of length nz containing the location and value of each nonzero entry in the matrix.
float b (Input): Vector of length n containing the right-hand side.

Return Value¶

The solution x of the sparse linear system $Ax = b$ . If no solution was computed, then None is returned.

Optional Arguments¶

returnSparseLuFactor (Output)

A structure. The contents are initialized to point to the LU factorization.

supplySparseLuFactor, structure (Input)

A structure. This structure contains the LU factorization of the input matrix computed by linSolGenCoordinate with the returnSparseLuFactor option.

freeSparseLuFactor

Before returning, free the linked list data structure containing the LU factorization of A. Use this option only if the factors are no longer required.

returnSparseLuInCoord, luCoordinate[[]], rowPivots, colPivots (Output)

The LU factorization is returned in coordinate form in an array of length nz in luCoordinate. This is more compact than the internal representation encapsulated in sparseLu. The disadvantage is that during a solveOnly call, the internal representation of the factor must be reconstructed. If however, the factor is to be stored after the program exits, and loaded again at some subsequent run, the combination of returnLuInCoord and supplyLuInCoord is probably the best choice, since the factors are in a format that is simple to store and read.

supplySparseLuInCoord, luCoordinate[[]], rowPivots, colPivots (Input)

Supply the LU factorization in coordinate form. See returnSparseLuInCoord for a description.

factorOnly,

Compute the LU factorization of the input matrix and return. The argument b is ignored.

solveOnly,

Solve

$Ax = b$ given the LU factorization of A. This option requires the use of option supplySparseLuFactor or supplySparseLuInCoord.

transpose,

Solve the problem

$A^Tx = b$ . This option can be used in conjunction with either of the options that supply the factorization.

condition,

Estimate the

$L_1$ condition number of A and return in the variable condition.

pivotingStrategy, int (Input)

Select the pivoting strategy by setting method to one of the following: IMSL_ROW_MARKOWITZ, IMSL_COLUMN_MARKOWITZ, or IMSL_SYMMETRIC_MARKOWITZ.

Default: IMSL_SYMMETRIC_MARKOWITZ.

numberOfSearchRows, int (Input)

The number of rows which have the least number of nonzero elements that will be searched for a pivot element.

Default: numberOfSearchRows = 3.

iterativeRefinement,

Select this option if iterative refinement is desired.

dropTolerance, float tolerance (Input)

Possible fill-in is checked against tolerance. If the absolute value of the new element is less than dropTolerance, it will be discarded.

Default: dropTolerance = 0.0.

hybridFactorization

A list with two elements, hybridFactorization[0]=density, and hybridFactorization[1]=orderBound. Enable the function to switch to a dense factorization method when the density of the active submatrix reaches 0.0 ≤ density ≤ 1.0 and the order of the active submatrix is less than or equal to orderBound.

stabilityFactor, float (Input)

The absolute value of the pivot element must be bigger than the largest element in absolute value in its row divided by stabilityFactor.

Default: stabilityFactor = 10.0.

growthFactorLimit, float (Input)

The computation stops if the growth factor exceeds growthFactorLimit.

Default: growthFactorLimit = 1.0e16.

growthFactor (Output)

Argument growthFactor is calculated as the largest element in absolute value at any stage of the Gaussian elimination divided by the largest element in absolute value in A.

smallestPivot (Output)

The value of the pivot element of smallest magnitude that occurred during the factorization.

numNonzerosInFactor (Output)

A scalar containing the total number of nonzeros in the factor.

cscFormat, int colPtr, int rowInd, float values (Input)

Accept the coefficient matrix in Compressed Sparse Column (CSC) Format. See the main “Introduction” chapter of this manual for a discussion of this storage scheme.

memoryBlockSize, int (Input)

If space must be allocated for fill-in, allocate enough space for memoryBlockSize new nonzero elements.

Default: memoryBlockSize = nz.

Description¶

The function linSolGenCoordinate solves a system of linear equations $Ax = b$ , where A is sparse. In its default use, it solves the so-called one off problem, by first performing an LU factorization of A using the improved generalized symmetric Markowitz pivoting scheme. The factor L is not stored explicitly because the saxpy operations performed during the elimination are extended to the right-hand side, along with any row interchanges. Thus, the system $Ly = b$ is solved implicitly. The factor U is then passed to a triangular solver which computes the solution x from $Ux = y$ .

If a sequence of systems $Ax = b$ are to be solved where A is unchanged, it is usually more efficient to compute the factorization once, and perform multiple forward and back solves with the various right-hand sides. In this case, the factor L is explicitly stored and a record of all row as well as column interchanges is made. The solve step then solves the two triangular systems $Ly = b$ and $Ux = y$ . The user specifies either the returnSparseLuFactor or the returnLuInCoord option to retrieve the factorization, then calls the function subsequently with different right-hand sides, passing the factorization back in using either supplySparseLuFactor or supplySparseLuInCoord in conjunction with solveOnly. If returnSparseLuFactor is used, the final call to linSolGenCoordinate should include freeSparseLuFactor to release the heap used to store L and U.

If the solution to $A^Tx = b$ is required, specify the option transpose. This keyword only alters the forward elimination and back substitution so that the operations $U^Ty = b$ and $L^Tx = y$ are performed to obtain the solution. So, with one call to produce the factorization, solutions to both $Ax = b$ and $A^Tx = b$ can be obtained.

The option condition is used to calculate and return an estimation of the $L_1$ condition number of A. The algorithm used is due to Higham. Specification of condition causes a complete L to be computed and stored, even if a one off problem is being solved. This is due to the fact that Higham’s method requires solution to problems of the form $Az = r$ and $A^Tz = r$ .

The default pivoting strategy is symmetric Markowitz. If a row or column oriented problem is encountered, there may be some reduction in fill-in by selecting either IMSL_ROW_MARKOWITZ or IMSL_COLUMN_MARKOWITZ. The Markowitz strategy will search a pre-elected number of row or columns for pivot candidates. The default number is three, but this can be changed by using numberOfSearchRows.

The option dropTolerance can be used to set a tolerance which can reduce fill-in. This works by preventing any new fill element which has magnitude less than the specified drop tolerance from being added to the factorization. Since this can introduce substantial error into the factorization, it is recommended that iterativeRefinement be used to recover more accuracy in the final solution. The trade-off is between space savings from the drop tolerance and the extra time needed in repeated solve steps needed for refinement.

The function linSolGenCoordinate provides the option of switching to a dense factorization method at some point during the decomposition. This option is enabled by choosing hybridFactorization. One of the two values required by this option, density, specifies a minimum density for the active submatrix before a format switch will occur. A density of 1.0 indicates complete fill-in. The other value, orderBound, places an upper bound on the order of the active submatrix which will be converted to dense format. This is used to prevent a switch from occurring too early, possibly when the $\mathcal{O}(n^3)$ nature of the dense factorization will cause performance degradation. Note that this option can significantly increase heap storage requirements.

Examples¶

Example 1¶

As an example, consider the following matrix:

$\begin{split}A = \begin{bmatrix} 10 & 0 & 0 & 0 & 0 & 0 \\ 0 & 10 & -3 & -1 & 0 & 0 \\ 0 & 0 & 15 & 0 & 0 & 0 \\ -2 & 0 & 0 & 10 & -1 & 0 \\ -1 & 0 & 0 & -5 & 1 & -3 \\ -1 & -2 & 0 & 0 & 0 & 6 \end{bmatrix}\end{split}$

Let $x^T = (1, 2, 3, 4, 5, 6)$ so that $Ax = (10, 7, 45, 33, -34, 31)^T$ . The number of nonzeros in A is nz = 15.

from numpy import *
from pyimsl.math.linSolGenCoordinate import linSolGenCoordinate
from pyimsl.math.writeMatrix import writeMatrix

a = [[0, 0, 10.0],
     [1, 1, 10.0],
     [1, 2, -3.0],
     [1, 3, -1.0],
     [2, 2, 15.0],
     [3, 0, -2.0],
     [3, 3, 10.0],
     [3, 4, -1.0],
     [4, 0, -1.0],
     [4, 3, -5.0],
     [4, 4, 1.0],
     [4, 5, -3.0],
     [5, 0, -1.0],
     [5, 1, -2.0],
     [5, 5, 6.0]]
b = (10.0, 7.0, 45.0, 33.0, -34.0, 31.0)
x = linSolGenCoordinate(a, b)
writeMatrix("solution", x)

Output¶

 
                                  solution
          1            2            3            4            5            6
          1            2            3            4            5            6

Example 2¶

This examples sets $A = E(1000, 10)$ . A linear system is solved and the LU factorization returned. Then a second linear system is solved, using the same coefficient matrix A just factored. Maximum absolute errors and execution time ratios are printed, showing that forward and back solves take approximately 10 percent of the computation time of a factor and solve. This ratio can vary greatly, depending on the order of the coefficient matrix, the initial number of nonzeros, and especially on the amount of fill-in produced during the elimination. Be aware that timing results are highly machine dependent.

from __future__ import print_function
from numpy import *
from pyimsl.math.ctime import ctime
from pyimsl.math.generateTestCoordinate import generateTestCoordinate
from pyimsl.math.linSolGenCoordinate import linSolGenCoordinate
from pyimsl.math.matMulRectCoordinate import matMulRectCoordinate
from pyimsl.math.vectorNorm import vectorNorm

n = 1000
c = 10

# Get the coefficient matrix
a = generateTestCoordinate(n, c)

# Set two different predetermined solutions
mod_five = empty((n), dtype=double)
mod_ten = empty((n), dtype=double)
for i in range(0, n, 1):
    mod_five[i] = (float)(i % 5)
    mod_ten[i] = (float)(i % 10)

# Choose b so that x will approximate mod_five

b = matMulRectCoordinate("A*x",
                         aMatrix={'nrowa': n, 'ncola': n, 'a': a},
                         xVector=mod_five)

# Time the factor/solve
time_factor_solve = ctime()
lu_factor = []
x = linSolGenCoordinate(a, b, returnSparseLuFactor=lu_factor)
time_factor_solve = ctime() - time_factor_solve

# Compute max abolute error
index = []
error_factor_solve = vectorNorm(x, secondVector=mod_five, infNorm=index)

# Get new right hand side -- b = A * mod_ten
b = matMulRectCoordinate("A*x",
                         aMatrix={'nrowa': n, 'ncola': n, 'a': a},
                         xVector=mod_ten)

# Use the previously computed factorization to solve Ax = b
time_solve = ctime()
x = linSolGenCoordinate(a, b,
                        supplySparseLuFactor=lu_factor[0],
                        freeSparseLuFactor=True,
                        solveOnly=True)
time_solve = ctime() - time_solve
index = []
error_solve = vectorNorm(x,
                         secondVector=mod_ten,
                         infNorm=index)

# Print errors and ratio of execution times
print("absolute error (factor/solve) = %e" % error_factor_solve)
print("absolute error (solve)        = %e" % error_solve)
try:
    ratio = time_solve / time_factor_solve
except ZeroDivisionError:
    ratio = 1.0
print("time_solve/time_factor_solve  = %f" % ratio)

Output¶

absolute error (factor/solve) = 3.286260e-14
absolute error (solve)        = 1.119105e-13
time_solve/time_factor_solve  = 0.295454

Example 3¶

This example solves a system $Ax = b$ , where $A = E(500, 50)$ . Then, the same system is solved using a large drop tolerance. Finally, using the factorization just computed, the same linear system is solved with iterative refinement. Be aware that timing results are highly machine dependent.

from __future__ import print_function
from numpy import *
from pyimsl.math.ctime import ctime
from pyimsl.math.generateTestCoordinate import generateTestCoordinate
from pyimsl.math.linSolGenCoordinate import linSolGenCoordinate
from pyimsl.math.matMulRectCoordinate import matMulRectCoordinate
from pyimsl.math.vectorNorm import vectorNorm

n = 500
c = 50

# Get the coefficient matrix
a = generateTestCoordinate(n, c)
for i in range(0, len(a)):
    a[i][2] *= 0.05

# Set a predetermined solution
mod_five = empty((n), dtype=double)
for i in range(0, n, 1):
    mod_five[i] = (float)(i % 5)

# Choose b so that x will approximate mod_five
b = matMulRectCoordinate("A*x",
                         aMatrix={'nrowa': n, 'ncola': n, 'a': a},
                         xVector=mod_five)

# Time the factor/solve
time_zero_drop_tol = ctime()
nz_zero_drop_tol = []
x = linSolGenCoordinate(a, b,
                        numNonzerosInFactor=nz_zero_drop_tol)
time_zero_drop_tol = max(
    ctime() - time_zero_drop_tol, .001)  # make sure non-zero

# Compute max abolute error
index = []
error_zero_drop_tol = vectorNorm(x,
                                 secondVector=mod_five,
                                 infNorm=index)

# Solve the same problem, with drop
# tolerance = 0.005
time_nonzero_drop_tol = ctime()
lu_factor = []
nz_nonzero_drop_tol = []
x = linSolGenCoordinate(a, b,
                        returnSparseLuFactor=lu_factor,
                        dropTolerance=0.005,
                        numNonzerosInFactor=nz_nonzero_drop_tol)
time_nonzero_drop_tol = ctime() - time_nonzero_drop_tol

# Compute max abolute error
error_nonzero_drop_tol = vectorNorm(x, secondVector=mod_five,
                                    infNorm=index)

# Solve the same problem with IR, use last factorization
time_nonzero_drop_tol_IR = ctime()
x = linSolGenCoordinate(a, b, supplySparseLuFactor=lu_factor[0],
                        freeSparseLuFactor=True,
                        solveOnly=True,
                        iterativeRefinement=True)
time_nonzero_drop_tol_IR = ctime() - time_nonzero_drop_tol_IR

# Compute max abolute error
error_nonzero_drop_tol_IR = vectorNorm(x,
                                       secondVector=mod_five,
                                       infNorm=index)

# Print errors and ratio of execution times
print("drop tolerance = 0.0")
print("\tabsolute error = %e" % error_zero_drop_tol)
print("\tfillin         = %d\n" % nz_zero_drop_tol[0])
print("drop tolerance = 0.005")
print("\tabsolute error = %e" % error_nonzero_drop_tol)
print("\tfillin         = %d\n" % nz_nonzero_drop_tol[0])
print("drop tolerance = 0.005 (with IR)")
print("\tabsolute error = %e" % error_nonzero_drop_tol_IR)
print("\tfillin         = %d\n" % nz_nonzero_drop_tol[0])
try:
    ratio = time_nonzero_drop_tol / time_zero_drop_tol
except ZeroDivisionError:
    ratio = 1.0
print("time_nonzero_drop_tol / time_zero_drop_tol  = %f" % ratio)
try:
    ratio = time_nonzero_drop_tol_IR / time_zero_drop_tol
except ZeroDivisionError:
    ratio = 1.0
print("time_nonzero_drop_tol_IR / time_zero_drop_tol  = %f" % ratio)

Output¶

drop tolerance = 0.0
	absolute error = 1.820766e-14
	fillin         = 9530

drop tolerance = 0.005
	absolute error = 2.699567e+00
	fillin         = 8656

drop tolerance = 0.005 (with IR)
	absolute error = 5.329071e-15
	fillin         = 8656

time_nonzero_drop_tol / time_zero_drop_tol  = 1.035835
time_nonzero_drop_tol_IR / time_zero_drop_tol  = 1.207803

Notice the absolute error when iterative refinement is not used. Also note that iterative refinement itself can be quite expensive. In this case, for example, the IR solve took approximately as much time as the factorization. For this problem the use of a drop high drop tolerance and iterative refinement was able to reduce fill-in by 10 percent at a time cost double that of the default usage. In tight memory situations, such a trade-off may be acceptable. Users should be aware that a drop tolerance can be chosen large enough, introducing large errors into LU, to prevent convergence of iterative refinement.