linSolGenCoordinate¶
Solves a sparse system of linear equations \(Ax = b\). Using optional arguments, any of several related computations can be performed. These extra tasks include returning the LU factorization of A, computing the solution of \(Ax = b\) given an LU factorization, setting drop tolerances, and controlling iterative refinement.
Synopsis¶
linSolGenCoordinate (a, b)
Required Arguments¶
a
(Input)- Vector of length
nz
containing the location and value of each nonzero entry in the matrix. - float
b
(Input) - Vector of length
n
containing the right-hand side.
Return Value¶
The solution x of the sparse linear system \(Ax = b\). If no solution was
computed, then None
is returned.
Optional Arguments¶
returnSparseLuFactor
(Output)- A structure. The contents are initialized to point to the LU
factorization
. supplySparseLuFactor
, structure (Input)- A structure. This structure contains the LU factorization of the input
matrix computed by
linSolGenCoordinate
with thereturnSparseLuFactor
option. freeSparseLuFactor
- Before returning, free the linked list data structure containing the LU factorization of A. Use this option only if the factors are no longer required.
returnSparseLuInCoord
,luCoordinate[[]]
,rowPivots
,colPivots
(Output)- The LU factorization is returned in coordinate form in an array of
length
nz
inluCoordinate
. This is more compact than the internal representation encapsulated insparseLu
. The disadvantage is that during asolveOnly
call, the internal representation of the factor must be reconstructed. If however, the factor is to be stored after the program exits, and loaded again at some subsequent run, the combination ofreturnLuInCoord
andsupplyLuInCoord
is probably the best choice, since the factors are in a format that is simple to store and read. supplySparseLuInCoord
,luCoordinate[[]]
,rowPivots
,colPivots
(Input)- Supply the LU factorization in coordinate form. See
returnSparseLuInCoord
for a description. factorOnly
,- Compute the LU factorization of the input matrix and return. The
argument
b
is ignored. solveOnly
,- Solve \(Ax = b\) given the LU factorization of A. This option
requires the use of option
supplySparseLuFactor
orsupplySparseLuInCoord
. transpose
,- Solve the problem \(A^Tx = b\). This option can be used in conjunction with either of the options that supply the factorization.
condition
,- Estimate the \(L_1\) condition number of A and return in the
variable
condition
. pivotingStrategy
, int (Input)Select the pivoting strategy by setting method to one of the following:
IMSL_ROW_MARKOWITZ
,IMSL_COLUMN_MARKOWITZ
, orIMSL_SYMMETRIC_MARKOWITZ
.Default:
IMSL_SYMMETRIC_MARKOWITZ
.numberOfSearchRows
, int (Input)The number of rows which have the least number of nonzero elements that will be searched for a pivot element.
Default:
numberOfSearchRows
= 3.iterativeRefinement
,- Select this option if iterative refinement is desired.
dropTolerance
, floattolerance
(Input)Possible fill-in is checked against tolerance. If the absolute value of the new element is less than
dropTolerance
, it will be discarded.Default:
dropTolerance
= 0.0.hybridFactorization
- A list with two elements,
hybridFactorization[0]
=density
, andhybridFactorization[1]
=orderBound
. Enable the function to switch to a dense factorization method when the density of the active submatrix reaches 0.0 ≤density
≤ 1.0 and the order of the active submatrix is less than or equal toorderBound
. stabilityFactor
, float (Input)The absolute value of the pivot element must be bigger than the largest element in absolute value in its row divided by
stabilityFactor
.Default:
stabilityFactor
= 10.0.growthFactorLimit
, float (Input)The computation stops if the growth factor exceeds
growthFactorLimit
.Default:
growthFactorLimit
= 1.0e16.growthFactor
(Output)- Argument
growthFactor
is calculated as the largest element in absolute value at any stage of the Gaussian elimination divided by the largest element in absolute value in A. smallestPivot
(Output)- The value of the pivot element of smallest magnitude that occurred during the factorization.
numNonzerosInFactor
(Output)- A scalar containing the total number of nonzeros in the factor.
cscFormat
, intcolPtr
, introwInd
, floatvalues
(Input)- Accept the coefficient matrix in Compressed Sparse Column (CSC) Format. See the main “Introduction” chapter of this manual for a discussion of this storage scheme.
memoryBlockSize
, int (Input)If space must be allocated for fill-in, allocate enough space for
memoryBlockSize
new nonzero elements.Default:
memoryBlockSize
=nz
.
Description¶
The function linSolGenCoordinate
solves a system of linear equations
\(Ax = b\), where A is sparse. In its default use, it solves the so-called
one off problem, by first performing an LU factorization of A using
the improved generalized symmetric Markowitz pivoting scheme. The factor L
is not stored explicitly because the saxpy
operations performed during
the elimination are extended to the right-hand side, along with any row
interchanges. Thus, the system \(Ly = b\) is solved implicitly. The factor
U is then passed to a triangular solver which computes the solution x
from \(Ux = y\).
If a sequence of systems \(Ax = b\) are to be solved where A is unchanged,
it is usually more efficient to compute the factorization once, and perform
multiple forward and back solves with the various right-hand sides. In this
case, the factor L is explicitly stored and a record of all row as well as
column interchanges is made. The solve step then solves the two triangular
systems \(Ly = b\) and \(Ux = y\). The user specifies either the
returnSparseLuFactor
or the returnLuInCoord
option to retrieve the
factorization, then calls the function subsequently with different
right-hand sides, passing the factorization back in using either
supplySparseLuFactor
or supplySparseLuInCoord
in conjunction with
solveOnly
. If returnSparseLuFactor
is used, the final call to
linSolGenCoordinate
should include freeSparseLuFactor
to release the
heap used to store L and U.
If the solution to \(A^Tx = b\) is required, specify the option
transpose
. This keyword only alters the forward elimination and back
substitution so that the operations \(U^Ty = b\) and \(L^Tx = y\) are
performed to obtain the solution. So, with one call to produce the
factorization, solutions to both \(Ax = b\) and \(A^Tx = b\) can be
obtained.
The option condition
is used to calculate and return an estimation of the
\(L_1\) condition number of A. The algorithm used is due to Higham.
Specification of condition
causes a complete L to be computed and
stored, even if a one off problem is being solved. This is due to the fact
that Higham’s method requires solution to problems of the form \(Az = r\)
and \(A^Tz = r\).
The default pivoting strategy is symmetric Markowitz. If a row or column
oriented problem is encountered, there may be some reduction in fill-in by
selecting either IMSL_ROW_MARKOWITZ
or IMSL_COLUMN_MARKOWITZ
. The
Markowitz strategy will search a pre-elected number of row or columns for
pivot candidates. The default number is three, but this can be changed by
using numberOfSearchRows
.
The option dropTolerance
can be used to set a tolerance which can reduce
fill-in. This works by preventing any new fill element which has magnitude
less than the specified drop tolerance from being added to the
factorization. Since this can introduce substantial error into the
factorization, it is recommended that iterativeRefinement
be used to
recover more accuracy in the final solution. The trade-off is between space
savings from the drop tolerance and the extra time needed in repeated solve
steps needed for refinement.
The function linSolGenCoordinate
provides the option of switching to a
dense factorization method at some point during the decomposition. This
option is enabled by choosing hybridFactorization
. One of the two values
required by this option, density
, specifies a minimum density for the
active submatrix before a format switch will occur. A density of 1.0
indicates complete fill-in. The other value, orderBound
, places an upper
bound on the order of the active submatrix which will be converted to dense
format. This is used to prevent a switch from occurring too early, possibly
when the \(\mathcal{O}(n^3)\) nature of the dense factorization will cause
performance degradation. Note that this option can significantly increase
heap storage requirements.
Examples¶
Example 1¶
As an example, consider the following matrix:
Let \(x^T = (1, 2, 3, 4, 5, 6)\) so that \(Ax = (10, 7, 45, 33, -34,
31)^T\). The number of nonzeros in A is nz
= 15.
from numpy import *
from pyimsl.math.linSolGenCoordinate import linSolGenCoordinate
from pyimsl.math.writeMatrix import writeMatrix
a = [[0, 0, 10.0],
[1, 1, 10.0],
[1, 2, -3.0],
[1, 3, -1.0],
[2, 2, 15.0],
[3, 0, -2.0],
[3, 3, 10.0],
[3, 4, -1.0],
[4, 0, -1.0],
[4, 3, -5.0],
[4, 4, 1.0],
[4, 5, -3.0],
[5, 0, -1.0],
[5, 1, -2.0],
[5, 5, 6.0]]
b = (10.0, 7.0, 45.0, 33.0, -34.0, 31.0)
x = linSolGenCoordinate(a, b)
writeMatrix("solution", x)
Output¶
solution
1 2 3 4 5 6
1 2 3 4 5 6
Example 2¶
This examples sets \(A = E(1000, 10)\). A linear system is solved and the LU factorization returned. Then a second linear system is solved, using the same coefficient matrix A just factored. Maximum absolute errors and execution time ratios are printed, showing that forward and back solves take approximately 10 percent of the computation time of a factor and solve. This ratio can vary greatly, depending on the order of the coefficient matrix, the initial number of nonzeros, and especially on the amount of fill-in produced during the elimination. Be aware that timing results are highly machine dependent.
from __future__ import print_function
from numpy import *
from pyimsl.math.ctime import ctime
from pyimsl.math.generateTestCoordinate import generateTestCoordinate
from pyimsl.math.linSolGenCoordinate import linSolGenCoordinate
from pyimsl.math.matMulRectCoordinate import matMulRectCoordinate
from pyimsl.math.vectorNorm import vectorNorm
n = 1000
c = 10
# Get the coefficient matrix
a = generateTestCoordinate(n, c)
# Set two different predetermined solutions
mod_five = empty((n), dtype=double)
mod_ten = empty((n), dtype=double)
for i in range(0, n, 1):
mod_five[i] = (float)(i % 5)
mod_ten[i] = (float)(i % 10)
# Choose b so that x will approximate mod_five
b = matMulRectCoordinate("A*x",
aMatrix={'nrowa': n, 'ncola': n, 'a': a},
xVector=mod_five)
# Time the factor/solve
time_factor_solve = ctime()
lu_factor = []
x = linSolGenCoordinate(a, b, returnSparseLuFactor=lu_factor)
time_factor_solve = ctime() - time_factor_solve
# Compute max abolute error
index = []
error_factor_solve = vectorNorm(x, secondVector=mod_five, infNorm=index)
# Get new right hand side -- b = A * mod_ten
b = matMulRectCoordinate("A*x",
aMatrix={'nrowa': n, 'ncola': n, 'a': a},
xVector=mod_ten)
# Use the previously computed factorization to solve Ax = b
time_solve = ctime()
x = linSolGenCoordinate(a, b,
supplySparseLuFactor=lu_factor[0],
freeSparseLuFactor=True,
solveOnly=True)
time_solve = ctime() - time_solve
index = []
error_solve = vectorNorm(x,
secondVector=mod_ten,
infNorm=index)
# Print errors and ratio of execution times
print("absolute error (factor/solve) = %e" % error_factor_solve)
print("absolute error (solve) = %e" % error_solve)
try:
ratio = time_solve / time_factor_solve
except ZeroDivisionError:
ratio = 1.0
print("time_solve/time_factor_solve = %f" % ratio)
Output¶
absolute error (factor/solve) = 3.286260e-14
absolute error (solve) = 1.119105e-13
time_solve/time_factor_solve = 0.295454
Example 3¶
This example solves a system \(Ax = b\), where \(A = E(500, 50)\). Then, the same system is solved using a large drop tolerance. Finally, using the factorization just computed, the same linear system is solved with iterative refinement. Be aware that timing results are highly machine dependent.
from __future__ import print_function
from numpy import *
from pyimsl.math.ctime import ctime
from pyimsl.math.generateTestCoordinate import generateTestCoordinate
from pyimsl.math.linSolGenCoordinate import linSolGenCoordinate
from pyimsl.math.matMulRectCoordinate import matMulRectCoordinate
from pyimsl.math.vectorNorm import vectorNorm
n = 500
c = 50
# Get the coefficient matrix
a = generateTestCoordinate(n, c)
for i in range(0, len(a)):
a[i][2] *= 0.05
# Set a predetermined solution
mod_five = empty((n), dtype=double)
for i in range(0, n, 1):
mod_five[i] = (float)(i % 5)
# Choose b so that x will approximate mod_five
b = matMulRectCoordinate("A*x",
aMatrix={'nrowa': n, 'ncola': n, 'a': a},
xVector=mod_five)
# Time the factor/solve
time_zero_drop_tol = ctime()
nz_zero_drop_tol = []
x = linSolGenCoordinate(a, b,
numNonzerosInFactor=nz_zero_drop_tol)
time_zero_drop_tol = max(
ctime() - time_zero_drop_tol, .001) # make sure non-zero
# Compute max abolute error
index = []
error_zero_drop_tol = vectorNorm(x,
secondVector=mod_five,
infNorm=index)
# Solve the same problem, with drop
# tolerance = 0.005
time_nonzero_drop_tol = ctime()
lu_factor = []
nz_nonzero_drop_tol = []
x = linSolGenCoordinate(a, b,
returnSparseLuFactor=lu_factor,
dropTolerance=0.005,
numNonzerosInFactor=nz_nonzero_drop_tol)
time_nonzero_drop_tol = ctime() - time_nonzero_drop_tol
# Compute max abolute error
error_nonzero_drop_tol = vectorNorm(x, secondVector=mod_five,
infNorm=index)
# Solve the same problem with IR, use last factorization
time_nonzero_drop_tol_IR = ctime()
x = linSolGenCoordinate(a, b, supplySparseLuFactor=lu_factor[0],
freeSparseLuFactor=True,
solveOnly=True,
iterativeRefinement=True)
time_nonzero_drop_tol_IR = ctime() - time_nonzero_drop_tol_IR
# Compute max abolute error
error_nonzero_drop_tol_IR = vectorNorm(x,
secondVector=mod_five,
infNorm=index)
# Print errors and ratio of execution times
print("drop tolerance = 0.0")
print("\tabsolute error = %e" % error_zero_drop_tol)
print("\tfillin = %d\n" % nz_zero_drop_tol[0])
print("drop tolerance = 0.005")
print("\tabsolute error = %e" % error_nonzero_drop_tol)
print("\tfillin = %d\n" % nz_nonzero_drop_tol[0])
print("drop tolerance = 0.005 (with IR)")
print("\tabsolute error = %e" % error_nonzero_drop_tol_IR)
print("\tfillin = %d\n" % nz_nonzero_drop_tol[0])
try:
ratio = time_nonzero_drop_tol / time_zero_drop_tol
except ZeroDivisionError:
ratio = 1.0
print("time_nonzero_drop_tol / time_zero_drop_tol = %f" % ratio)
try:
ratio = time_nonzero_drop_tol_IR / time_zero_drop_tol
except ZeroDivisionError:
ratio = 1.0
print("time_nonzero_drop_tol_IR / time_zero_drop_tol = %f" % ratio)
Output¶
drop tolerance = 0.0
absolute error = 1.820766e-14
fillin = 9530
drop tolerance = 0.005
absolute error = 2.699567e+00
fillin = 8656
drop tolerance = 0.005 (with IR)
absolute error = 5.329071e-15
fillin = 8656
time_nonzero_drop_tol / time_zero_drop_tol = 1.035835
time_nonzero_drop_tol_IR / time_zero_drop_tol = 1.207803
Notice the absolute error when iterative refinement is not used. Also note that iterative refinement itself can be quite expensive. In this case, for example, the IR solve took approximately as much time as the factorization. For this problem the use of a drop high drop tolerance and iterative refinement was able to reduce fill-in by 10 percent at a time cost double that of the default usage. In tight memory situations, such a trade-off may be acceptable. Users should be aware that a drop tolerance can be chosen large enough, introducing large errors into LU, to prevent convergence of iterative refinement.