lin_sol_gen_coordinate

CNLMath : Linear Systems : lin_sol_gen_coordinate

Synopsis

Required Arguments

Return Value

Synopsis with Optional Arguments

Optional Arguments

Description

Examples

more...

Solves a sparse system of linear equations Ax = b. Using optional arguments, any of several related computations can be performed. These extra tasks include returning the LU factorization of A, computing the solution of Ax = b given an LU factorization, setting drop tolerances, and controlling iterative refinement.

Synopsis

#include <imsl.h>

float *imsl_f_lin_sol_gen_coordinate (int n, int nz, Imsl_f_sparse_elem *a, float *b, ..., 0)

The type double function is imsl_d_lin_sol_gen_coordinate.

Required Arguments

int n (Input)
Number of rows in the matrix.

int nz (Input)
Number of nonzeros in the matrix.

Imsl_f_sparse_elem *a (Input)
Vector of length nz containing the location and value of each nonzero entry in the matrix.

float *b (Input)
Vector of length n containing the right-hand side.

Return Value

A pointer to the solution x of the sparse linear system Ax = b. To release this space, use imsl_free. If no solution was computed, then NULL is returned.

Synopsis with Optional Arguments

#include <imsl.h>

float *imsl_f_lin_sol_gen_coordinate (int n, int nz, Imsl_f_sparse_elem *a, float *b,

IMSL_RETURN_SPARSE_LU_FACTOR, Imsl_f_sparse_lu_factor *lu_factor,

IMSL_SUPPLY_SPARSE_LU_FACTOR, Imsl_f_sparse_lu_factor *lu_factor,

IMSL_FREE_SPARSE_LU_FACTOR,

IMSL_RETURN_SPARSE_LU_IN_COORD, Imsl_f_sparse_elem **lu_coordinate, int **row_pivots, int **col_pivots,

IMSL_SUPPLY_SPARSE_LU_IN_COORD, int nzlu, Imsl_f_sparse_elem *lu_coordinate, int *row_pivots, int *col_pivots,

IMSL_FACTOR_ONLY,

IMSL_SOLVE_ONLY,

IMSL_RETURN_USER, float x[],

IMSL_TRANSPOSE,

IMSL_CONDITION, float *condition,

IMSL_PIVOTING_STRATEGY, Imsl_pivot method,

IMSL_NUMBER_OF_SEARCH_ROWS, int num_search_row,

IMSL_ITERATIVE_REFINEMENT,

IMSL_DROP_TOLERANCE, float tolerance,

IMSL_HYBRID_FACTORIZATION, float density, int order_bound,

IMSL_STABILITY_FACTOR, float s_factor,

IMSL_GROWTH_FACTOR_LIMIT, float gf_limit,

IMSL_GROWTH_FACTOR, float *gf,

IMSL_SMALLEST_PIVOT, float *small_pivot

IMSL_NUM_NONZEROS_IN_FACTOR, int *num_nonzeros,

IMSL_CSC_FORMAT, int *col_ptr, int *row_ind, float *values,

IMSL_MEMORY_BLOCK_SIZE, int block_size,

Optional Arguments

IMSL_RETURN_SPARSE_LU_FACTOR, Imsl_f_sparse_lu_factor *lu_factor (Output)
The address of a structure of type Imsl_f_sparse_lu_factor. The pointers within the structure are initialized to point to the LU factorization by imsl_f_lin_sol_gen_coordinate.

IMSL_SUPPLY_SPARSE_LU_FACTOR, Imsl_f_sparse_lu_factor *lu_factor (Input)
The address of a structure of type Imsl_f_sparse_lu_factor. This structure contains the LU factorization of the input matrix computed by imsl_f_lin_sol_gen_coordinate with the IMSL_RETURN_SPARSE_LU_FACTOR option.

IMSL_FREE_SPARSE_LU_FACTOR,
Before returning, free the linked list data structure containing the LU factorization of A. Use this option only if the factors are no longer required.

IMSL_RETURN_SPARSE_LU_IN_COORD, Imsl_f_sparse_elem **lu_coordinate, int **row_pivots, int **col_pivots (Output)
The LU factorization is returned in coordinate form in an array of length nz in lu_coordinate. This is more compact than the internal representation encapsulated in Imsl_f_sparse_lu_factor. The disadvantage is that during a SOLVE_ONLY call, the internal representation of the factor must be reconstructed. If however, the factor is to be stored after the program exits, and loaded again at some subsequent run, the combination of IMSL_RETURN_LU_IN_COORD and IMSL_SUPPLY_LU_IN_COORD is probably the best choice, since the factors are in a format that is simple to store and read.

IMSL_SUPPLY_SPARSE_LU_IN_COORD, int nzlu, Imsl_f_sparse_elem *lu_coordinate, int *row_pivots, int *col_pivots (Input)
Supply the LU factorization in coordinate form. See IMSL_RETURN_SPARSE_LU_IN_COORD for a description.

IMSL_FACTOR_ONLY,
Compute the LU factorization of the input matrix and return. The argument b is ignored.

IMSL_SOLVE_ONLY,
Solve Ax = b given the LU factorization of A. This option requires the use of option IMSL_SUPPLY_SPARSE_LU_FACTOR or IMSL_SUPPLY_SPARSE_LU_IN_COORD.

IMSL_RETURN_USER, float x[] (Output)
A user-allocated array of length n containing the solution x.

IMSL_TRANSPOSE,
Solve the problem ATx = b. This option can be used in conjunction with either of the options that supply the factorization.

IMSL_CONDITION, float *condition,
Estimate the L1 condition number of A and return in the variable condition.

IMSL_PIVOTING_STRATEGY, Imsl_pivot method (Input)
Select the pivoting strategy by setting method to one of the following: IMSL_ROW_MARKOWITZ, IMSL_COLUMN_MARKOWITZ, or IMSL_SYMMETRIC_MARKOWITZ.
Default: IMSL_SYMMETRIC_MARKOWITZ.

IMSL_NUMBER_OF_SEARCH_ROWS, int num_search_row (Input)
The number of rows which have the least number of nonzero elements that will be searched for a pivot element.
Default: num_search_row = 3.

IMSL_ITERATIVE_REFINEMENT,
Select this option if iterative refinement is desired.

IMSL_DROP_TOLERANCE, float tolerance (Input)
Possible fill-in is checked against tolerance. If the absolute value of the new element is less than tolerance, it will be discarded.
Default: tolerance = 0.0.

IMSL_HYBRID_FACTORIZATION, float density, int order_bound,
Enable the function to switch to a dense factorization method when the density of the active submatrix reaches 0.0 ≤ density ≤ 1.0 and the order of the active submatrix is less than or equal to order_bound.

IMSL_STABILITY_FACTOR, float s_factor (Input)
The absolute value of the pivot element must be bigger than the largest element in absolute value in its row divided by s_factor.
Default: s_factor = 10.0.

IMSL_GROWTH_FACTOR_LIMIT, float gf_limit (Input)
The computation stops if the growth factor exceeds gf_limit.
Default: gf_limit = 1.0e16.

IMSL_GROWTH_FACTOR, float *gf (Output)
Argument gf is calculated as the largest element in absolute value at any stage of the Gaussian elimination divided by the largest element in absolute value in A.

IMSL_SMALLEST_PIVOT, float *small_pivot (Output)
A pointer to the value of the pivot element of smallest magnitude that occurred during the factorization.

IMSL_NUM_NONZEROS_IN_FACTOR, int *num_nonzeros (Output)
A pointer to a scalar containing the total number of nonzeros in the factor.

IMSL_CSC_FORMAT, int *col_ptr, int *row_ind, float *values (Input)
Accept the coefficient matrix in Compressed Sparse Column (CSC) Format. See the main “Introduction” chapter of this manual for a discussion of this storage scheme.

IMSL_MEMORY_BLOCK_SIZE, int blocksize (Input)
If space must be allocated for fill-in, allocate enough space for blocksize new nonzero elements.
Default: blocksize = nz.

Description

The function imsl_f_lin_sol_gen_coordinate solves a system of linear equations Ax = b, where A is sparse. In its default use, it solves the so-called one off problem, by first performing an LU factorization of A using the improved generalized symmetric Markowitz pivoting scheme. The factor L is not stored explicitly because the saxpy operations performed during the elimination are extended to the right-hand side, along with any row interchanges. Thus, the system Ly = b is solved implicitly. The factor U is then passed to a triangular solver which computes the solution x from Ux = y.

If a sequence of systems Ax = b are to be solved where A is unchanged, it is usually more efficient to compute the factorization once, and perform multiple forward and back solves with the various right-hand sides. In this case, the factor L is explicitly stored and a record of all row as well as column interchanges is made. The solve step then solves the two triangular systems Ly = b and Ux = y. The user specifies either the IMSL_RETURN_SPARSE_LU_FACTOR or the IMSL_RETURN_LU_IN_COORD option to retrieve the factorization, then calls the function subsequently with different right-hand sides, passing the factorization back in using either IMSL_SUPPLY_SPARSE_LU_FACTOR or IMSL_SUPPLY_SPARSE_LU_IN_COORD in conjunction with IMSL_SOLVE_ONLY. If IMSL_RETURN_SPARSE_LU_FACTOR is used, the final call to imsl_lin_sol_gen_coordinate should include IMSL_FREE_SPARSE_LU_FACTOR to release the heap used to store L and U.

If the solution to ATx = b is required, specify the option IMSL_TRANSPOSE. This keyword only alters the forward elimination and back substitution so that the operations UTy = b and LTx = y are performed to obtain the solution. So, with one call to produce the factorization, solutions to both Ax = b and ATx = b can be obtained.

The option IMSL_CONDITION is used to calculate and return an estimation of the L1 condition number of A. The algorithm used is due to Higham. Specification of IMSL_CONDITION causes a complete L to be computed and stored, even if a one off problem is being solved. This is due to the fact that Higham’s method requires solution to problems of the form Az = r and ATz = r.

The default pivoting strategy is symmetric Markowitz. If a row or column oriented problem is encountered, there may be some reduction in fill-in by selecting either IMSL_ROW_MARKOWITZ or IMSL_COLUMN_MARKOWITZ. The Markowitz strategy will search a pre-elected number of row or columns for pivot candidates. The default number is three, but this can be changed by using IMSL_NUM_OF_SEARCH_ROWS.

The option IMSL_DROP_TOLERANCE can be used to set a tolerance which can reduce fill-in. This works by preventing any new fill element which has magnitude less than the specified drop tolerance from being added to the factorization. Since this can introduce substantial error into the factorization, it is recommended that IMSL_ITERATIVE_REFINEMENT be used to recover more accuracy in the final solution. The trade-off is between space savings from the drop tolerance and the extra time needed in repeated solve steps needed for refinement.

The function imsl_f_lin_sol_gen_coordinate provides the option of switching to a dense factorization method at some point during the decomposition. This option is enabled by choosing IMSL_HYBRID_FACTORIZATION. One of the two parameters required by this option, density, specifies a minimum density for the active submatrix before a format switch will occur. A density of 1.0 indicates complete fill-in. The other parameter, order_bound, places an upper bound on the order of the active submatrix which will be converted to dense format. This is used to prevent a switch from occurring too early, possibly when the O(n3) nature of the dense factorization will cause performance degradation. Note that this option can significantly increase heap storage requirements.

Examples

Example 1

As an example, consider the following matrix:

Let xT = (1, 2, 3, 4, 5, 6) so that Ax = (10, 7, 45, 33, -34, 31)T. The number of nonzeros in A is nz = 15.

#include <imsl.h>

int main()

{

Imsl_f_sparse_elem a[] =

{0, 0, 10.0,

1, 1, 10.0,

1, 2, -3.0,

1, 3, -1.0,

2, 2, 15.0,

3, 0, -2.0,

3, 3, 10.0,

3, 4, -1.0,

4, 0, -1.0,

4, 3, -5.0,

4, 4, 1.0,

4, 5, -3.0,

5, 0, -1.0,

5, 1, -2.0,

5, 5, 6.0};

float b[] = {10.0, 7.0, 45.0, 33.0, -34.0, 31.0};

int n = 6;

int nz = 15;

float *x;

x = imsl_f_lin_sol_gen_coordinate (n, nz, a, b,

0);

imsl_f_write_matrix ("solution", 1, n, x,

0);

imsl_free (x);

}

Output

solution

1 2 3 4 5 6

Example 2

This examples sets A = E(1000, 10). A linear system is solved and the LU factorization returned. Then a second linear system is solved, using the same coefficient matrix A just factored. Maximum absolute errors and execution time ratios are printed, showing that forward and back solves take approximately 10 percent of the computation time of a factor and solve. This ratio can vary greatly, depending on the order of the coefficient matrix, the initial number of nonzeros, and especially on the amount of fill-in produced during the elimination. Be aware that timing results are highly machine dependent.

#include <imsl.h>

#include <stdio.h>

#include <stdlib.h>

int main()

{

Imsl_f_sparse_elem *a;

Imsl_f_sparse_lu_factor lu_factor;

float *b;

float *x;

float *mod_five;

float *mod_ten;

float error_factor_solve;

float error_solve;

double time_factor_solve;

double time_solve;

int n = 1000;

int c = 10;

int i;

int nz;

int index;

/* Get the coefficient matrix */

a = imsl_f_generate_test_coordinate (n, c, &nz, 0);

/* Set two different predetermined solutions */

mod_five = (float*) malloc (n*sizeof(*mod_five));

mod_ten = (float*) malloc (n*sizeof(*mod_ten));

for (i=0; i<n; i++) {

mod_five[i] = (float) (i % 5);

mod_ten[i] = (float) (i % 10);

}

/* Choose b so that x will approximate mod_five */

b = (float *) imsl_f_mat_mul_rect_coordinate ("A*x",

IMSL_A_MATRIX, n, n, nz, a,

IMSL_X_VECTOR, n, mod_five,

0);

/* Time the factor/solve */

time_factor_solve = imsl_ctime();

x = imsl_f_lin_sol_gen_coordinate (n, nz, a, b,

IMSL_RETURN_SPARSE_LU_FACTOR, &lu_factor,

0);

time_factor_solve = imsl_ctime() - time_factor_solve;

/* Compute max absolute error */

error_factor_solve = imsl_f_vector_norm (n, x,

IMSL_SECOND_VECTOR, mod_five,

IMSL_INF_NORM, &index,

0);

free (mod_five);

imsl_free (b);

imsl_free (x);

/* Get new right hand side -- b = A * mod_ten */

b = (float *) imsl_f_mat_mul_rect_coordinate ("A*x",

IMSL_A_MATRIX, n, n, nz, a,

IMSL_X_VECTOR, n, mod_ten,

0);

/* Use the previously computed factorization

to solve Ax = b */

time_solve = imsl_ctime();

x = imsl_f_lin_sol_gen_coordinate (n, nz, a, b,

IMSL_SUPPLY_SPARSE_LU_FACTOR, &lu_factor,

IMSL_SOLVE_ONLY,

0);

time_solve = imsl_ctime() - time_solve;

error_solve = imsl_f_vector_norm (n, x,

IMSL_SECOND_VECTOR, mod_ten,

IMSL_INF_NORM, &index,

0);

free (mod_ten);

imsl_free (b);

imsl_free (x);

/* Print errors and ratio of execution times */

printf ("absolute error (factor/solve) = %e\n",

error_factor_solve);

printf ("absolute error (solve) = %e\n", error_solve);

printf ("time_solve/time_factor_solve = %f\n",

time_solve/time_factor_solve);

}

Output

absolute error (factor/solve) = 9.179115e-05

absolute error (solve) = 2.160072e-04

time_solve/time_fator_solve = 0.093750

Example 3

This example solves a system Ax = b, where A = E (500, 50). Then, the same system is solved using a large drop tolerance. Finally, using the factorization just computed, the same linear system is solved with iterative refinement. Be aware that timing results are highly machine dependent.

#include <imsl.h>

#include <stdio.h>

#include <stdlib.h>

int main()

{

Imsl_f_sparse_elem *a;

Imsl_f_sparse_lu_factor lu_factor;

float *b;

float *x;

float *mod_five;

float error_zero_drop_tol;

float error_nonzero_drop_tol;

float error_nonzero_drop_tol_IR;

double time_zero_drop_tol;

double time_nonzero_drop_tol;

double time_nonzero_drop_tol_IR;

int nz_nonzero_drop_tol;

int nz_zero_drop_tol;

int n = 500;

int c = 50;

int i;

int nz;

int index;

/* Get the coefficient matrix */

a = imsl_f_generate_test_coordinate (n, c, &nz, 0);

for (i=0; i<nz; i++) a[i].val *= 0.05;

/* Set a predetermined solution */

mod_five = (float*) malloc (n*sizeof(*mod_five));

for (i=0; i<n; i++)

mod_five[i] = (float) (i % 5);

/* Choose b so that x will approximate mod_five */

b = imsl_f_mat_mul_rect_coordinate ("A*x",

IMSL_A_MATRIX, n, n, nz, a,

IMSL_X_VECTOR, n, mod_five,

0);

/* Time the factor/solve */

time_zero_drop_tol = imsl_ctime();

x = imsl_f_lin_sol_gen_coordinate (n, nz, a, b,

IMSL_NUM_NONZEROS_IN_FACTOR, &nz_zero_drop_tol,

0);

time_zero_drop_tol = imsl_ctime() - time_zero_drop_tol;

/* Compute max abolute error */

error_zero_drop_tol = imsl_f_vector_norm (n, x,

IMSL_SECOND_VECTOR, mod_five,

IMSL_INF_NORM, &index,

0);

imsl_free (x);

/* Solve the same problem, with drop

tolerance = 0.005 */

time_nonzero_drop_tol = imsl_ctime();

x = imsl_f_lin_sol_gen_coordinate (n, nz, a, b,

IMSL_RETURN_SPARSE_LU_FACTOR, &lu_factor,

IMSL_DROP_TOLERANCE, 0.005,

IMSL_NUM_NONZEROS_IN_FACTOR, &nz_nonzero_drop_tol,

0);

time_nonzero_drop_tol = imsl_ctime() - time_nonzero_drop_tol;

/* Compute max abolute error */

error_nonzero_drop_tol = imsl_f_vector_norm (n, x,

IMSL_SECOND_VECTOR, mod_five,

IMSL_INF_NORM, &index,

0);

imsl_free (x);

/* Solve the same problem with IR, use last

factorization */

time_nonzero_drop_tol_IR = imsl_ctime();

x = imsl_f_lin_sol_gen_coordinate (n, nz, a, b,

IMSL_SUPPLY_SPARSE_LU_FACTOR, &lu_factor,

IMSL_SOLVE_ONLY,

IMSL_ITERATIVE_REFINEMENT,

0);

time_nonzero_drop_tol_IR = imsl_ctime() - time_nonzero_drop_tol_IR;

/* Compute max abolute error */

error_nonzero_drop_tol_IR = imsl_f_vector_norm (n, x,

IMSL_SECOND_VECTOR, mod_five,

IMSL_INF_NORM, &index,

0);

imsl_free (x);

imsl_free (b);

/* Print errors and ratio of execution times */

printf ("drop tolerance = 0.0\n");

printf ("\tabsolute error = %e\n", error_zero_drop_tol);

printf ("\tfillin = %d\n\n", nz_zero_drop_tol);

printf ("drop tolerance = 0.005\n");

printf ("\tabsolute error = %e\n", error_nonzero_drop_tol);

printf ("\tfillin = %d\n\n", nz_nonzero_drop_tol);

printf ("drop tolerance = 0.005 (with IR)\n");

printf ("\tabsolute error = %e\n", error_nonzero_drop_tol_IR);

printf ("\tfillin = %d\n\n", nz_nonzero_drop_tol);

printf ("time_nonzero_drop_tol/time_zero_drop_tol = %f\n",

time_nonzero_drop_tol/time_zero_drop_tol);

printf ("time_nonzero_drop_tol_IR/time_zero_drop_tol = %f\n",

time_nonzero_drop_tol_IR/time_zero_drop_tol);

}

Output

drop tolerance = 0.0

absolute error = 3.814697e-06

fillin = 9530

drop tolerance = 0.005

absolute error = 2.699481e+00

fillin = 8656

drop tolerance = 0.005 (with IR)

absolute error = 1.907349e-06

fillin = 8656

time_nonzero_drop_tol/time_zero_drop_tol = 1.086957

time_nonzero_drop_tol_IR/time_zero_drop_tol = 0.840580

Notice the absolute error when iterative refinement is not used. Also note that iterative refinement itself can be quite expensive. In this case, for example, the IR solve took approximately as much time as the factorization. For this problem the use of a drop high drop tolerance and iterative refinement was able to reduce fill-in by 10 percent at a time cost double that of the default usage. In tight memory situations, such a trade-off may be acceptable. Users should be aware that a drop tolerance can be chosen large enough, introducing large errors into LU, to prevent convergence of iterative refinement.