Solves a sparse system of linear equations Ax = b. Using optional arguments, any of several related computations can be performed. These extra tasks include returning the LU factorization of A computing the solution of Ax = b given an LU factorization setting drop tolerances, and controlling iterative refinement.
#include <imsl.h>
float *imsl_f_lin_sol_gen_coordinate (int n, int nz, Imsl_f_sparse_elem *a, float *b, ..., 0)
The type double function is imsl_d_lin_sol_gen_coordinate.
int n
(Input)
Number of rows in the matrix.
int nz
(Input)
Number of nonzeros in the matrix.
Imsl_f_sparse_elem *a
(Input)
Vector of length nz containing the
location and value of each nonzero entry in the matrix.
float *b
(Input)
Vector of length n containing the
right-hand side.
A pointer to the solution x of the sparse linear system Ax = b. To release this space, use free. If no solution was computed, then NULL is returned.
#include <imsl.h>
float
*imsl_f_lin_sol_gen_coordinate (int n, int nz,
Imsl_f_sparse_elem *a, float *b,
IMSL_RETURN_SPARSE_LU_FACTOR,
Imsl_f_sparse_lu_factor *lu_factor,
IMSL_SUPPLY_SPARSE_LU_FACTOR,
Imsl_f_sparse_lu_factor
*lu_factor,
IMSL_FREE_SPARSE_LU_FACTOR,
IMSL_RETURN_SPARSE_LU_IN_COORD,
Imsl_f_sparse_elem **lu_coordinate,
int **row_pivots, int
**col_pivots,
IMSL_SUPPLY_SPARSE_LU_IN_COORD,
Imsl_f_sparse_elem *lu_coordinate, int *row_pivots,
int
*col_pivots,
IMSL_FACTOR_ONLY,
IMSL_SOLVE_ONLY,
IMSL_RETURN_USER,
float x[],
IMSL_TRANSPOSE,
IMSL_CONDITION, float
*condition,
IMSL_PIVOTING_STRATEGY, Imsl_pivot
method,
IMSL_NUM_OF_SEARCH_ROWS, int
num_search_row,
IMSL_ITERATIVE_REFINEMENT,
IMSL_DROP_TOLERANCE,
float tolerance,
IMSL_HYBRID_FACTORIZATION, float density,
int order_bound,
IMSL_STABILITY_FACTOR, float
s_factor,
IMSL_GROWTH_FACTOR_LIMIT, float
gf_limit,
IMSL_GROWTH_FACTOR, float *gf,
IMSL_SMALLEST_PIVOT,
float *small_pivot
IMSL_NUM_NONZEROS_IN_FACTOR, int
*num_nonzeros,
IMSL_CSC_FORMAT, int *col_ptr, int *row_ind,
float *values,
IMSL_MEMORY_BLOCK_SIZE, int
block_size,
0)
IMSL_RETURN_SPARSE_LU_FACTOR,
Imsl_f_sparse_lu_factor
*lu_factor (Output)
The address of a structure of type
Imsl_f_sparse_lu_factor. The pointers within the structure are
initialized to point to the LU factorization by imsl_f_lin_sol_gen_coordinate.
IMSL_SUPPLY_SPARSE_LU_FACTOR,
Imsl_f_sparse_lu_factor *lu_factor (Input)
The address of a structure of type Imsl_f_sparse_lu_factor. This
structure contains the LU factorization of the input matrix computed by
imsl_f_lin_sol_gen_coordinate
with the IMSL_RETURN_SPARSE_LU_FACTOR
option.
IMSL_FREE_SPARSE_LU_FACTOR,
Before returning, free the linked list data structure containing the
LU factorization of A. Use this option only if the factors are
no longer required.
IMSL_RETURN_SPARSE_LU_IN_COORD,
Imsl_f_sparse_elem **lu_coordinate, int **row_pivots, int **col_pivots
(Output)
The LU factorization is returned in coordinate form. This is
more compact than the internal representation encapsulated in Imsl_f_sparse_lu. The
disadvantage is that during a SOLVE_ONLY call, the
internal representation of the factor must be reconstructed. If however, the
factor is to be stored after the program exits, and loaded again at some
subsequent run, the combination of IMSL_RETURN_LU_IN_COORD
and IMSL_SUPPLY_LU_IN_COORD
is probably the best choice, since the factors are in a format that is simple to
store and read.
IMSL_SUPPLY_SPARSE_LU_IN_COORD,
Imsl_f_sparse_elem *lu_coordinate, int *row_pivots, int *col_pivots
(Output)
Supply the LU factorization in coordinate form. See IMSL_RETURN_SPARSE_LU_IN_COORD
for a description.
IMSL_FACTOR_ONLY,
Compute
the LU factorization of the input matrix and return. The argument b is ignored.
IMSL_SOLVE_ONLY,
Solve
Ax = b given the LU factorization of A. This option
requires the use of option IMSL_SUPPLY_SPARSE_LU_FACTOR
or IMSL_SUPPLY_SPARSE_LU_IN_COORD.
IMSL_RETURN_USER, float x[]
(Output)
A user-allocated array of length n containing the solution
x.
IMSL_TRANSPOSE,
Solve
the problem ATx = b. This option can be
used in conjunction with either of the options that supply the
factorization.
IMSL_CONDITION, float
*condition,
Estimate the L1 condition number of A and return
in the variable condition.
IMSL_PIVOTING_STRATEGY,
Imsl_pivot method
(Input)
Select the pivoting strategy by setting method to one of the
following: IMSL_ROW_MARKOWITZ,
IMSL_COLUMN_MARKOWITZ,
or IMSL_SYMMETRIC_MARKOWITZ.
Default:
IMSL_SYMMETRIC_MARKOWITZ.
IMSL_NUM_OF_SEARCH_ROWS, int
num_search_row (Input)
The number of rows which have the
least number of nonzero elements that will be searched for a pivot
element.
Default: num_search_row = 3
IMSL_ITERATIVE_REFINEMENT,
Select
this option if iterative refinement is desired.
IMSL_DROP_TOLERANCE, float tolerance
(Input)
Possible fill-in is checked against tolerance. If the absolute value
of the new element is less than tolerance, it will be
discarded.
Default: tolerance = 0.0
IMSL_HYBRID_FACTORIZATION, float density, int
order_bound,
Enable the function to switch to a dense factorization
method when the density of the active submatrix reaches 0.0 £ density £ 1.0 and the order of the active submatrix
is less than or equal to order_bound.
IMSL_STABILITY_FACTOR, float s_factor
(Input)
The absolute value of the pivot element must be bigger than the
largest element in absolute value in its row divided by s_factor.
Default:
s_factor =
10.0
IMSL_GROWTH_FACTOR_LIMIT, float gf_limit
(Input)
The computation stops if the growth factor exceeds gf_limit.
Default:
gf_limit =
1.0e16
IMSL_GROWTH_FACTOR, float *gf
(Output)
Argument gf is calculated as
the largest element in absolute value at any stage of the Gaussian elimination
divided by the largest element in absolute value in A.
IMSL_SMALLEST_PIVOT, float
*small_pivot (Output)
A pointer to the value of the pivot
element of smallest magnitude that occurred during the factorization.
IMSL_NUM_NONZEROS_IN_FACTOR, int
*num_nonzeros (Output)
A pointer to a scalar containing
the total number of nonzeros in the factor.
IMSL_CSC_FORMAT, int *col_ptr, int *row_ind, float *values
(Input)
Accept the coefficient matrix in compressed sparse column (CSC)
format.
See the main “Introduction” chapter of this manual
for a discussion of this storage scheme.
IMSL_MEMORY_BLOCKSIZE, int blocksize
(Input)
If space must be allocated for fill-in, allocate enough space for
blocksize new
nonzero elements.
Default: blocksize = nz
The function imsl_f_lin_sol_gen_coordinate solves a system of linear equations Ax = b, where A is sparse. In its default use, it solves the so-called one off problem, by first performing an LU factorization of A using the improved generalized symmetric Markowitz pivoting scheme. The factor L is not stored explicitly because the saxpy operations performed during the elimination are extended to the right-hand side, along with any row interchanges. Thus, the system Ly = b is solved implicitly. The factor U is then passed to a triangular solver which computes the solution x from Ux = y.
If a sequence of systems Ax = b are to be
solved where A is unchanged, it is usually more efficient to compute
the factorization once, and perform multiple forward and back solves with the
various right-hand sides. In this case, the factor L is explicitly
stored and a record of all row as well as column interchanges is made. The solve
step then solves the two triangular systems Ly = b and Ux
= y. The user specifies either the IMSL_RETURN_SPARSE_LU_FACTOR
or the IMSL_RETURN_LU_IN_COORD
option to retrieve the factorization, then calls the function subsequently with
different right-hand sides, passing the factorization back in using either
IMSL_SUPPLY_SPARSE_LU_FACTOR
or IMSL_SUPPLY_SPARSE_LU_IN_COORD
in conjunction with IMSL_SOLVE_ONLY.
If IMSL_RETURN_SPARSE_LU_FACTOR
is used, the final call to imsl_lin_sol_gen_coordinate
should include IMSL_FREE_SPARSE_LU_FACTOR
to release the heap used to store L and U.
If the solution to ATx = b is required, specify the option IMSL_TRANSPOSE. This keyword only alters the forward elimination and back substitution so that the operations UTy = b and LTx = y are performed to obtain the solution. So, with one call to produce the factorization, solutions to both Ax = b and ATx = b can be obtained.
The option IMSL_CONDITION
is used to calculate and return an estimation of the
L1 condition number
of A. The algorithm used is due to Higham. Specification of IMSL_CONDITION
causes a complete L to be computed and stored, even if a one off
problem is being solved. This is due to the fact that Higham’s method requires
solution to problems of the form Az = r and ATz =
r.
The default pivoting strategy is symmetric Markowitz. If a row or column oriented problem is encountered, there may be some reduction in fill-in by selecting either IMSL_ROW_MARKOWITZ or IMSL_COLUMN_MARKOWITZ. The Markowitz strategy will search a pre-elected number of row or columns for pivot candidates. The default number is three, but this can be changed by using IMSL_NUM_OF_SEARCH_ROWS.
The option IMSL_DROP_TOLERANCE can be used to set a tolerance which can reduce fill-in. This works by preventing any new fill element which has magnitude less than the specified drop tolerance from being added to the factorization. Since this can introduce substantial error into the factorization, it is recommended that IMSL_ITERATIVE_REFINEMENT be used to recover more accuracy in the final solution. The trade-off is between space savings from the drop tolerance and the extra time needed in repeated solve steps needed for refinement.
The function imsl_f_lin_sol_gen_coordinate provides the option of switching to a dense factorization method at some point during the decomposition. This option is enabled by choosing IMSL_HYBRID_FACTORIZATION. One of the two parameters required by this option, density, specifies a minimum density for the active submatrix before a format switch will occur. A density of 1.0 indicates complete fill-in. The other parameter, order_bound, places an upper bound of the order of the active submatrix which will be converted to dense format. This is used to prevent a switch from occurring too early, possibly when the O(n3) nature of the dense factorization will cause performance degradation. Note that this option can significantly increase heap storage requirements.
As an example, consider the following matrix:

Let xT = (1, 2, 3, 4, 5, 6) so that Ax = (10, 7, 45, 33, −34, 31)T. The number of nonzeros in A is nz = 15.
#include <imsl.h>
#include <stdlib.h>
main()
{
Imsl_f_sparse_elem a[] = {0, 0, 10.0,
1, 1, 10.0,
1, 2, -3.0,
1, 3, -1.0,
2, 2, 15.0,
3, 0, -2.0,
3, 3, 10.0,
3, 4, -1.0,
4, 0, -1.0,
4, 3, -5.0,
4, 4, 1.0,
4, 5, -3.0,
5, 0, -1.0,
5, 1, -2.0,
5, 5, 6.0};
float b[] =
{10.0, 7.0, 45.0, 33.0, -34.0, 31.0};
int n = 6;
int nz = 15;
float *x;
x = imsl_f_lin_sol_gen_coordinate (n, nz, a, b, 0);
imsl_f_write_matrix ("solution", 1, n, x, 0);
free (x);
}
solution
1 2 3 4 5 6
1 2 3 4 5 6
This examples sets A = E(1000, 10). A linear system is solved and the LU factorization returned. Then a second linear system is solved, using the same coefficient matrix A just factored. Maximum absolute errors and execution time ratios are printed, showing that forward and back solves take approximately 10 percent of the computation time of a factor and solve. This ratio can vary greatly, depending on the order of the coefficient matrix, the initial number of nonzeros, and especially on the amount of fill-in produced during the elimination. Be aware that timing results are highly machine dependent.
#include <imsl.h>
#include <stdlib.h>
main()
{
Imsl_f_sparse_elem *a;
Imsl_f_sparse_lu_factor lu_factor;
float *b;
float *x;
float *mod_five;
float *mod_ten;
float error_factor_solve;
float error_solve;
double time_factor_solve;
double time_solve;
int n = 1000;
int c = 10;
int i;
int nz;
int index;
/* Get the coefficient matrix */
a =
imsl_f_generate_test_coordinate (n, c, &nz, 0);
/* Set two different predetermined solutions */
mod_five =
(float*) malloc (n*sizeof(*mod_five));
mod_ten = (float*) malloc (n*sizeof(*mod_ten));
for (i=0; i<n; i++) {
mod_five[i] = (float) (i % 5);
mod_ten[i] = (float) (i % 10);
}
/* Choose b so that x will approximate mod_five */
b =
imsl_f_mat_mul_rect_coordinate ("A*x",
IMSL_A_MATRIX, n, n, nz, a,
IMSL_X_VECTOR, n, mod_five,
0);
/* Time the factor/solve */
time_factor_solve = imsl_ctime();
x = imsl_f_lin_sol_gen_coordinate (n, nz, a, b,
IMSL_RETURN_SPARSE_LU_FACTOR, &lu_factor,
0);
time_factor_solve = imsl_ctime() - time_factor_solve;
/* Compute max abolute error */
error_factor_solve = imsl_f_vector_norm (n, x,
IMSL_SECOND_VECTOR, mod_five,
IMSL_INF_NORM, &index,
0);
free (mod_five);
free (b);
free (x);
/* Get new right hand side -- b = A * mod_ten */
b =
imsl_f_mat_mul_rect_coordinate ("A*x",
IMSL_A_MATRIX, n, n, nz, a,
IMSL_X_VECTOR, n, mod_ten,
0);
/* Use the previously computed factorization
to solve Ax = b */
time_solve =
imsl_ctime();
x = imsl_f_lin_sol_gen_coordinate (n, nz, a, b,
IMSL_SUPPLY_SPARSE_LU_FACTOR, &lu_factor,
IMSL_SOLVE_ONLY,
0);
time_solve = imsl_ctime() - time_solve;
error_solve = imsl_f_vector_norm (n, x,
IMSL_SECOND_VECTOR, mod_ten,
IMSL_INF_NORM, &index,
0);
free (mod_ten);
free (b);
free (x);
/* Print errors and ratio of execution times */
printf
("absolute error (factor/solve) = %e\n",
error_factor_solve);
printf ("absolute error (solve) = %e\n", error_solve);
printf ("time_solve/time_factor_solve = %f\n",
time_solve/time_factor_solve);
}
absolute error (factor/solve) = 9.179115e-05
absolute error (solve) = 2.160072e-04
time_solve/time_fator_solve = 0.093750
This example solves a system Ax = b, where A = E (500, 50). Then, the same system is solved using a large drop tolerance. Finally, using the factorization just computed, the same linear system is solved with iterative refinement. Be aware that timing results are highly machine dependent.
#include <imsl.h>
#include <stdlib.h>
main()
{
Imsl_f_sparse_elem *a;
Imsl_f_sparse_lu_factor lu_factor;
float *b;
float *x;
float *mod_five;
float error_zero_drop_tol;
float error_nonzero_drop_tol;
float error_nonzero_drop_tol_IR;
double time_zero_drop_tol;
double time_nonzero_drop_tol;
double time_nonzero_drop_tol_IR;
int nz_nonzero_drop_tol;
int nz_zero_drop_tol;
int n = 500;
int c = 50;
int i;
int nz;
int index;
/* Get the coefficient matrix */
a =
imsl_f_generate_test_coordinate (n, c, &nz, 0);
for (i=0; i<nz; i++) a[i].val *= 0.05;
/* Set a predetermined solution */
mod_five =
(float*) malloc (n*sizeof(*mod_five));
for (i=0; i<n; i++)
mod_five[i] = (float) (i % 5);
/* Choose b so that x will approximate mod_five */
b =
imsl_f_mat_mul_rect_coordinate ("A*x",
IMSL_A_MATRIX, n, n, nz, a,
IMSL_X_VECTOR, n, mod_five,
0);
/* Time the factor/solve */
time_zero_drop_tol = imsl_ctime();
x = imsl_f_lin_sol_gen_coordinate (n, nz, a, b,
IMSL_NUM_NONZEROS_IN_FACTOR, &nz_zero_drop_tol,
0);
time_zero_drop_tol = imsl_ctime() - time_zero_drop_tol;
/* Compute max abolute error */
error_zero_drop_tol = imsl_f_vector_norm (n, x,
IMSL_SECOND_VECTOR, mod_five,
IMSL_INF_NORM, &index,
0);
free (x);
/* Solve the same problem, with drop
tolerance = 0.005 */
time_nonzero_drop_tol = imsl_ctime();
x = imsl_f_lin_sol_gen_coordinate (n, nz, a, b,
IMSL_RETURN_SPARSE_LU_FACTOR, &lu_factor,
IMSL_DROP_TOLERANCE, 0.005,
IMSL_NUM_NONZEROS_IN_FACTOR, &nz_nonzero_drop_tol,
0);
time_nonzero_drop_tol = imsl_ctime() - time_nonzero_drop_tol;
/* Compute max abolute error */
error_nonzero_drop_tol = imsl_f_vector_norm (n, x,
IMSL_SECOND_VECTOR, mod_five,
IMSL_INF_NORM, &index,
0);
free (x);
/* Solve the same problem with IR, use last
factorization */
time_nonzero_drop_tol_IR = imsl_ctime();
x = imsl_f_lin_sol_gen_coordinate (n, nz, a, b,
IMSL_SUPPLY_SPARSE_LU_FACTOR, &lu_factor,
IMSL_SOLVE_ONLY,
IMSL_ITERATIVE_REFINEMENT,
0);
time_nonzero_drop_tol_IR = imsl_ctime() - time_nonzero_drop_tol_IR;
/* Compute max abolute error */
error_nonzero_drop_tol_IR = imsl_f_vector_norm (n, x,
IMSL_SECOND_VECTOR, mod_five,
IMSL_INF_NORM, &index,
0);
free (x);
free (b);
/* Print errors and ratio of execution times */
printf ("drop
tolerance = 0.0\n");
printf ("\tabsolute error = %e\n", error_zero_drop_tol);
printf ("\tfillin = %d\n\n", nz_zero_drop_tol);
printf ("drop
tolerance = 0.005\n");
printf ("\tabsolute error = %e\n", error_nonzero_drop_tol);
printf ("\tfillin = %d\n\n", nz_nonzero_drop_tol);
printf ("drop
tolerance = 0.005 (with IR)\n");
printf ("\tabsolute error = %e\n", error_nonzero_drop_tol_IR);
printf ("\tfillin = %d\n\n", nz_nonzero_drop_tol);
printf
("time_nonzero_drop_tol/time_zero_drop_tol = %f\n",
time_nonzero_drop_tol/time_zero_drop_tol);
printf ("time_nonzero_drop_tol_IR/time_zero_drop_tol = %f\n",
time_nonzero_drop_tol_IR/time_zero_drop_tol);
}
drop tolerance = 0.0
absolute error = 3.814697e-06
fillin = 9530
drop tolerance = 0.005
absolute error = 2.699481e+00
fillin = 8656
drop tolerance = 0.005 (with IR)
absolute error = 1.907349e-06
fillin = 8656
time_nonzero_drop_tol/time_zero_drop_tol =
1.086957
time_nonzero_drop_tol_IR/time_zero_drop_tol = 0.840580
Notice the absolute error when iterative refinement is not used. Also note that iterative refinement itself can be quite expensive. In this case, for example, the IR solve took approximately as much time as the factorization. For this problem the use of a drop high drop tolerance and iterative refinement was able to reduce fill-in by 10 percent at a time cost double that of the default usage. In tight memory situations, such a trade-off may be acceptable. Users should be aware that a drop tolerance can be chosen large enough, introducing large errors into LU, to prevent convergence of iterative refinement.
|
Visual Numerics, Inc. PHONE: 713.784.3131 FAX:713.781.9260 |