superlu

Chapter 1: Linear Systems > superlu

superlu

Computes the LU factorization of a general sparse matrix by a column method and solves the real sparse linear system of equations .

Synopsis

#include <imsl.h>

float *imsl_f_superlu (int n, int nz, Imsl_f_sparse_elem a[], float b[],…,0)

void imsl_f_superlu_factor_free (Imsl_f_super_lu_factor *factor)

The type double functions are imsl_d_superlu and imsl_d_superlu_factor_free.

Required Arguments

int n (Input)
The order of the input matrix.

int nz (Input)
Number of nonzeros in the matrix.

Imsl_f_sparse_elem a[] (Input)
Array of length nz containing the location and value of each nonzero entry in the matrix. See the main “Introduction” chapter of this manual for an explanation of the Imsl_f_sparse_elem structure.

float b[] (Input)
Array of length n containing the right-hand side.

Return Value

A pointer to the solution of the sparse linear system . To release this space, use imsl_free. If no solution was computed, then NULL is returned.

Synopsis with Optional Arguments

#include <imsl.h>

float *imsl_f_superlu (int n, int nz, Imsl_f_sparse_elem a[], float b[],

IMSL_EQUILIBRATE, int equilibrate,

IMSL_COLUMN_ORDERING_METHOD, Imsl_col_ordering method,

IMSL_COLPERM_VECTOR, int permc[],

IMSL_TRANSPOSE, int transpose,

IMSL_ITERATIVE_REFINEMENT, int refine,

IMSL_FACTOR_SOLVE, int factsol,

IMSL_DIAG_PIVOT_THRESH, double diag_pivot_thresh,

IMSL_SYMMETRIC_MODE, int symm_mode,

IMSL_PERFORMANCE_TUNING, int sp_ienv[],

IMSL_CSC_FORMAT, int HB_col_ptr[], int HB_row_ind [],float HB_values[],

IMSL_CSC_FORMAT, int HB_col_ptr[], int HB_row_ind[], float HB_values[],

IMSL_SUPPLY_SPARSE_LU_FACTOR, Imsl_f_super_lu_factor lu_factor_supplied,

IMSL_RETURN_SPARSE_LU_FACTOR, Imsl_f_super_lu_factor *lu_factor_returned,

IMSL_CONDITION, float *condition,

IMSL_PIVOT_GROWTH_FACTOR, float *recip_pivot_growth,

IMSL_FORWARD_ERROR_BOUND, float *ferr,

IMSL_BACKWARD_ERROR, float *berr,

IMSL_RETURN_USER, float x[],

Optional Arguments

IMSL_EQUILIBRATE, int equilibrate (Input)
Specifies if the input matrix A should be equilibrated before factorization.

equilibrate	Description
0	Do not equilibrate A before factorization
1	Equilibrate A before factorization.

Default: equilibrate = 0

IMSL_COLUMN_ORDERING_METHOD, Imsl_col_ordering method (Input)
The column ordering method used to preserve sparsity prior to the factorization process. Select the ordering method by setting method to one of the following:

method	Description
IMSL_NATURAL	Natural ordering, i.e.the column ordering of the input matrix.
IMSL_MMD_ATA	Minimum degree ordering on the structure of .
IMSL_MMD_AT_PLUS_A	Minimum degree ordering on the structure of .
IMSL_COLAMD	Column approximate minimum degree ordering.
IMSL_PERMC	Use ordering given in permutation vector permc, which is input by the user through optional argument IMSL_COLPERM_VECTOR. Vector permc is a permutation of the numbers 0,1,…,n-1.

Default: method = IMSL_COLAMD

IMSL_COLPERM_VECTOR, int permc[] (Input)
Array of length n which defines the permutation matrix before postordering. This argument is required if IMSL_COLUMN_ORDERING_METHOD with method = IMSL_PERMC is used. Otherwise, it is ignored.

IMSL_TRANSPOSE, int transpose (Input)
Indicates if the transposed problem is to be solved. This option can be used in conjunction with either of the options that supply the factorization.

transpose	Description
0	Solve .
1	Solve .

Default: transpose = 0

IMSL_ITERATIVE_REFINEMENT, int refine (Input)
Indicates if iterative refinement is desired.

refine	Description
0	No iterative refinement.
1	Do iterative refinement.

Default: refine = 1

IMSL_FACTOR_SOLVE, int factsol (Input)
Indicates if the LU factorization, the solution of a linear system or both are to be computed.

factsol	Description
0	Compute the LU factorization of the input matrix A and solve the system .
1	Only compute the LU factorization of the input matrix and return. The LU factorization is returned via optional argument IMSL_RETURN_SPARSE_LU_FACTOR. Input argument b is ignored.
2	Only solve given the LU factorization of . The LU factorization of must be supplied via optional argument IMSL_SUPPLY_SPARSE_LU_FACTOR. Input argument a is ignored unless iterative refinement, computation of the condition number or computation of the reciprocal pivot growth factor is required.

Default: factsol = 0

IMSL_DIAG_PIVOT_THRESH, double diag_pivot_thresh (Input)
Specifies the threshold used for a diagonal entry to be an acceptable pivot, 0.0 diag_pivot_thresh 1.0.
Default: diag_pivot_thresh = 1.0

IMSL_SYMMETRIC_MODE, int symm_mode (Input)
Indicates if the symmetric mode option is to be used. This mode should be applied if the input matrix is diagonally dominant or nearly so. The user should then define a small diagonal pivot threshold (e.g. 0.0 or 0.01) via option IMSL_DIAG_PIVOT_THRESH and choose an -based column permutation algorithm (e.g. column permutation method IMSL_MMD_AT_PLUS_A).

symm_mode	Description
0	Do not use symmetric mode option.
1	Use symmetric mode option.

Default: symm_mode = 0

IMSL_PERFORMANCE_TUNING, int sp_ienv[] (Input)
Array of length 6 containing positive parameters that allow the user to tune the performance of the matrix factorization algorithm.

i	Description of sp_ienv[i]
0	The panel size. Default: sp_ienv[0] = 10
1	The relaxation parameter to control supernode amalgamation. Default: sp_ienv[1] = 5
2	The maximum allowable size for a supernode. Default: sp_ienv[2] = 100
3	The minimum row dimension to be used for 2D blocking. Default: sp_ienv[3] = 200
4	The minimum column dimension to be used for 2D blocking. Default: sp_ienv[4] = 40
5	The estimated fill factor for L and U, compared to A. Default: sp_ienv[5] = 20

IMSL_CSC_FORMAT, int HB_col_ptr[], int HB_row_ind[], float HB_values[] (Input)
Accept the coefficient matrix in compressed sparse column (CSC) format. See the main “Introduction” chapter of this manual for a discussion of this storage scheme.

IMSL_SUPPLY_SPARSE_LU_FACTOR, Imsl_f_super_lu_factor lu_factor_supplied (Input)
A structure of type Imsl_f_super_lu_factor containing the LU factorization of the input matrix computed with the IMSL_RETURN_SPARSE_LU_FACTOR option. See the Description section for a definition of this structure. To free the memory allocated within this structure, use function imsl_f_superlu_factor_free.

IMSL_RETURN_SPARSE_LU_FACTOR, Imsl_f_super_lu_factor *lu_factor_returned (Output)
The address of a structure of type Imsl_f_super_lu_factor containing the LU factorization of the input matrix. See the Description section for a definition of this structure. To free the memory allocated within this structure, use function imsl_f_superlu_factor_free.

IMSL_CONDITION, float *condition (Output)
The estimate of the reciprocal condition number of matrix a after equilibration (if done).

IMSL_PIVOT_GROWTH_FACTOR, float *recip_pivot_growth (Output)
The reciprocal pivot growth factor

If recip_pivot_growth is much less than 1, the stability of the LU factorization could be poor.

IMSL_FORWARD_ERROR_BOUND, float *ferr (Output)
The estimated forward error bound for the solution vector x. This option requires argument IMSL_ITERATIVE_REFINEMENT set to 1.

IMSL_BACKWARD_ERROR, float *berr (Output)
The componentwise relative backward error of the solution vector x. This option requires argument IMSL_ITERATIVE_REFINEMENT set to 1.

IMSL_RETURN_USER, float x[] (Output)
A user-allocated array of length n containing the solution x of the linear system.

Description

Consider the sparse linear system of equations

Here, is a general square, nonsingular by sparse matrix, and and are vectors of length . All entries in , and are of real type.

Gaussian elimination, applied to the system above, can be shortly described as follows:

1. Compute a triangular factorization . Here, and are positive definite diagonal matrices to equilibrate the system and and are permutation matrices to ensure numerical stability and preserve sparsity. is a unit lower triangular matrix and is an upper triangular matrix.

2. Solve by evaluating

This is done efficiently by multiplying from right to left in the last expression: Scale the rows of by . Multiplying means permuting the rows of . Multiplying means solving the triangular system of equations with matrix by substitution. Similarly, multiplying means solving the triangular system with .

Function imsl_f_superlu handles step 1 above by default or if optional argument IMSL_FACTOR_SOLVE is used and set to 1. More precisely, before is solved, the following steps are performed:

1. Equilibrate matrix , i.e. compute diagonal matrices and so that is “better conditioned” than , i.e. is less sensitive to perturbations in than is to perturbations in .

2. Order the columns of to increase the sparsity of the computed and factors, i.e. replace by where is a column permutation matrix.

3. Compute the LU factorization of . For numerical stability, the rows of are eventually permuted through the factorization process by scaled partial pivoting, leading to the decomposition . The LU factorization is done by a left looking supernode-panel algorithm with 2-D blocking. See Demmel, Eisenstat, Gilbert et al. (1999) for further information on this technique.

4. Compute the reciprocal pivot growth factor

where and denote the -th column of matrices and , respectively.

5. Estimate the reciprocal of the condition number of matrix .

During the solution process, this information is used to perform the following steps:

1. Solve the system using the computed triangular L and U factors.

2. Iteratively refine the solution, again using the computed triangular factors. This is equivalent to Newton’s method.

3. Compute forward and backward error bounds for the solution vector .

Some of the steps mentioned above are optional. Their settings can be controlled by the appropriate optional arguments of function imsl_f_superlu.

Function imsl_f_superlu uses a supernodal storage scheme for the LU factorization of matrix A. The factorization is contained in structure Imsl_f_super_lu_factor and two sub-structures. Following is a short description of these structures:

typedef struct{

int nnz; /* Number of nonzeros in the matrix */

float *nzval; /* Array of nonzero values packed by column
*/

int *rowind; /* Array of row indices of the nonzeros */

   int *colptr;              /* colptr[j] stores the location in nzval[]
                                and rowind[] which starts column j. It
                                has ncol+1 entries, and colptr[ncol]
                                points to the first free location in
                                arrays nzval[] and rowind[]. */

} Imsl_f_hb_format;

typedef struct{

int nnz; /* Number of nonzeros in the supernodal
matrix */

int nsuper; /* Index of the last supernode */

float *nzval; /* Array of nonzero values packed by column
*/

   int *nzval_colptr;        /* Array of length ncol+1; nzval_colptr[j]
                                stores the location in nzval which starts
                                column j. nzval_colptr[ncol] points to
                                the first free location in arrays
                                nzval[] and nzval_colptr[]. */

int *rowind; /* Array of compressed row indices of
rectangular supernodes */

   int *rowind_colptr;       /* Array of length ncol+1;
                                rowind_colptr[sup_to_col[s]] stores the
                                location in rowind[] which starts
                                all columns in supernode s, and
                                rowind_colptr[ncol] points to the first
                                free location in rowind[]. */

   int *col_to_sup;          /* Array of length ncol+1; col_to_sup[j] is
                                the supernode number to which column j
                                belongs. Only the first ncol entries in
                                col_to_sup[] are defined. */

   int *sup_to_col;          /* Array of length ncol+1; sup_to_col[s]
                                points to the starting column of the s-th
                                supernode. Only the first nsuper+2
                                entries in sup_to_col[] are defined, and
                                sup_to_col[nsuper+1] = ncol+1. */

} Imsl_f_sc_format;

typedef struct{

int nrow; /* number of rows of matrix A */

int ncol; /* number of columns of matrix A */

int equilibration_method; /* The method used to equilibrate A:

0 – No equilibration

1 – Row equilibration.

2 – Column equilibration

3 – Both row and column equilibration */

float *rowscale; /* Array of length nrow containing the row

scale factors for A */

float *columnscale; /* Array of length ncol containing the
column scale factors for A */

   int *rowperm;             /* Row permutation array of length nrow
                                describing the row permutation matrix Pr
                              */

int *colperm; /* Column permutation array of length ncol

describing the column permutation matrix
Pc */

   Imsl_f_hb_format *U;      /* The part of the U factor of A outside the
                                supernodal blocks, stored in Harwell-
                                Boeing format */

   Imsl_f_sc_format *L;      /* The L factor of A, stored in supernodal
                                format as block lower triangular matrix
                              */

} Imsl_f_super_lu_factor;

Structure Imsl_d_super_lu_factor and its two sub-structures are defined similarly by replacing float by double, Imsl_f_hb_format by Imsl_d_hb_format and Imsl_f_sc_format by Imsl_d_sc_format in their definitions.

For a definition of supernodes and its use in sparse LU factorization, see the SuperLU Users’ guide (1999) and J.W. Demmel, S. C. Eisenstat et al. (1999).

As an example, consider the matrix

taken from the SuperLU Users’ guide (1999).

Factorization of this matrix via imsl_f_superlu using natural column ordering, no equilibration and setting sp_ienv[1] from its default value 5 to 1 results in the following LU decomposition:

Considering the filled matrix F (I denoting the identity matrix)

the supernodal structure of the factors of matrix A can be described by

where denotes a nonzero entry in the th supernode and denotes a nonzero entry in the th column of outside the supernodal block.

Therefore, in a supernodal storage scheme the supernodal part of matrix F is stored as the lower block-diagonal matrix

and the part outside the supernodes as the upper triangular matrix

This is in accordance with the output for structure Imsl_f_super_lu_factor:

Equilibration method: 0

Scale vectors:

rowscale: 1.000000 1.000000 1.000000 1.000000 1.000000
columnscale: 1.000000 1.000000 1.000000 1.000000 1.000000

Permutation vectors:

colperm: 0 1 2 3 4
rowperm: 0 1 2 3 4

Harwell-Boeing matrix U:

nrow 5, ncol 5, nnz 11

nzval: 21.000000 -13.263157 7.578947 21.000000

rowind: 0 1 2 0

colptr: 0 0 0 1 4 4

Supernodal matrix L:

nrow 5, ncol 5, nnz 11, nsuper 2

nzval:

0 0 1.900000e+001

1 0 6.315789e-001

4 0 6.315789e-001

1 1 2.100000e+001

2 1 5.714286e-001

4 1 5.714286e-001

1 2 -1.326316e+001

2 2 2.357895e+001

4 2 -2.410714e-001

3 3 5.000000e+000

4 3 -7.714285e-001

3 4 2.100000e+001

4 4 3.420000e+001

nzval_colptr: 0 3 6 9 11 13

rowind: 0 1 4 1 2 4 3 4

rowind_colptr: 0 3 6 6 8 8

col_to_sup: 0 1 1 2 2

sup_to_col: 0 1 3 5

Function imsl_f_superlu is based on the SuperLU code written by Demmel, Gilbert, Li et al. For more detailed explanations of the factorization and solve steps, see the SuperLU User’s Guide (1999).

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

(1) Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

(2) Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

(3) Neither the name of Lawrence Berkeley National Laboratory, U.S. Dept. of Energy nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Examples

Example 1

The LU factorization of the sparse 6×6 matrix

is computed.

Let , so that and

The LU factorization of A is used to solve the sparse linear systems and .

#include <imsl.h>

int main(){

Imsl_f_sparse_elem a[] = { 0, 0, 10.0,

1, 1, 10.0,

1, 2, -3.0,

1, 3, -1.0,

2, 2, 15.0,

3, 0, -2.0,

3, 3, 10.0,

3, 4, -1.0,

4, 0, -1.0,

4, 3, -5.0,

4, 4, 1.0,

4, 5, -3.0,

5, 0, -1.0,

5, 1, -2.0,

5, 5, 6.0};

float b1[] = {10.0, 7.0, 45.0, 33.0, -34.0, 31.0};

float b2[] = { -9.0, 8.0, 39.0, 13.0, 1.0, 21.0 };

int n = 6, nz = 15;

float *x = NULL;

x = imsl_f_superlu (n, nz, a, b1, 0);

imsl_f_write_matrix ("solution to A*x = b1", 1, n, x, 0);

imsl_free (x);

x = imsl_f_superlu (n, nz, a, b2, IMSL_TRANSPOSE, 1, 0);

imsl_f_write_matrix ("solution to A^T*x = b2", 1, n, x, 0);

imsl_free (x);

}

Output

solution to A*x = b

1 2 3 4 5 6

solution to A^T*x = b2

1 2 3 4 5 6

Example 2

This example uses the matrix A = E(1000,10) to show how the LU factorization of A can be used to solve a linear system with the same coefficient matrix A but different right-hand side. Maximum absolute errors are printed. After the computations, the space allocated for the LU factorization is freed via function imsl_f_superlu_factor_free.

#include <imsl.h>

int main(){

Imsl_f_sparse_elem *a;

Imsl_f_super_lu_factor lu_factor;

float *b, *x, *mod_five, *mod_ten;

float error_factor_solve, error_solve;

int n = 1000, c = 10;

int i, nz, index;

/* Get the coefficient matrix */

a = imsl_f_generate_test_coordinate (n, c, &nz, 0);

/* Set two different predetermined solutions */

mod_five = (float*) malloc (n*sizeof(*mod_five));

mod_ten = (float*) malloc (n*sizeof(*mod_ten));

for (i=0; i<n; i++) {

mod_five[i] = (float) (i % 5);

mod_ten[i] = (float) (i % 10);

}

/* Choose b so that x will approximate mod_five */

b = imsl_f_mat_mul_rect_coordinate ("A*x",

IMSL_A_MATRIX, n, n, nz, a,

IMSL_X_VECTOR, n, mod_five, 0);

/* Solve Ax = b */

x = imsl_f_superlu (n, nz, a, b,

IMSL_RETURN_SPARSE_LU_FACTOR, &lu_factor, 0);

/* Compute max absolute error */

error_factor_solve = imsl_f_vector_norm (n, x,

IMSL_SECOND_VECTOR, mod_five,

IMSL_INF_NORM, &index,

0);

imsl_free (mod_five);

imsl_free (b);

imsl_free (x);

/* Get new right hand side -- b = A * mod_ten */

b = imsl_f_mat_mul_rect_coordinate ("A*x",

IMSL_A_MATRIX, n, n, nz, a,

IMSL_X_VECTOR, n, mod_ten,

0);

/* Use the previously computed factorization

to solve Ax = b */

x = imsl_f_superlu (n, nz, a, b,

IMSL_SUPPLY_SPARSE_LU_FACTOR, lu_factor,

IMSL_FACTOR_SOLVE, 2,

0);

error_solve = imsl_f_vector_norm (n, x,

IMSL_SECOND_VECTOR, mod_ten,

IMSL_INF_NORM, &index,

0);

imsl_free (mod_ten);

imsl_free (b);

imsl_free (x);

imsl_free (a);

/* Free sparse LU structure */

imsl_f_superlu_factor_free (&lu_factor);

/* Print errors */

printf ("absolute error (factor/solve) = %e\n",

error_factor_solve);

printf ("absolute error (solve) = %e\n", error_solve);

}

Output

absolute error (factor/solve) = 1.835823e-005

absolute error (solve) = 2.002716e-005

Warning Errors

IMSL_ILL_CONDITIONED The input matrix is too ill-conditioned. An estimate of the reciprocal of its L1 condition number is “rcond” = #. The solution might not be accurate.

Fatal Errors

IMSL_SINGULAR_MATRIX The input matrix is singular.

Contact Support