Implementation
Basic Linear Algebra Subprograms
The IMSL C Numerical Library uses many Basic Linear Algebra Subprograms (BLAS) throughout the product. These functions are named using IMSL conventions and are used internally; they are not directly accessible to the user.
NVIDIA Corp. has implemented certain Level 1, 2, and 3 BLAS in the NVIDIA CUDA Toolkit. The NVIDIA external names and argument protocols differ from those used by the IMSL C Numerical Library, so wrappers have been written that allow the IMSL C Numerical Library to access selected routines in the NVIDIA CUDA Toolkit.
Table 12.9 documents an enumeration of those BLAS for which a CUDA Toolkit implementation is provided in the IMSL C Numerical Library. Each enumerator is named by prefixing the name of the BLAS function with ‘IMSL_CUDA_’.
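The naming convention can be illustrated with a minimal C sketch. The helper `cuda_enum_name` is hypothetical and is not part of the IMSL C Numerical Library; it only shows how an enumerator name in Table 12.9 is derived from a BLAS function name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not an IMSL function): build the enumerator name
   for a BLAS routine by prefixing its name with "IMSL_CUDA_".
   Returns 0 on success, -1 if the output buffer is too small. */
static int cuda_enum_name(const char *blas_name, char *out, size_t out_size)
{
    assert(blas_name != NULL && out != NULL);
    int written = snprintf(out, out_size, "IMSL_CUDA_%s", blas_name);
    return (written < 0 || (size_t)written >= out_size) ? -1 : 0;
}
```

For example, passing "DGEMM" with a 32-byte buffer yields the string "IMSL_CUDA_DGEMM", which matches the corresponding entry in Table 12.9.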
Transforms
NVIDIA CUDA Toolkit implementations of complex two-dimensional FFT (Fast Fourier Transform) functions can be accessed when using functions imsl_c_fft_2d_complex and imsl_z_fft_2d_complex. The enumerations that enable the user to manipulate the parameters used by these functions are documented in Table 12.9.
Utility Functions
There are three utility functions provided in the IMSL C Math Library that can be used to help manage the use of the NVIDIA CUDA Toolkit. These utilities appear in Table 12.10 and are described in more detail in their corresponding function descriptions.
Note: Some NVIDIA hardware does not provide double precision arithmetic. Because the double precision functions are included in the NVIDIA CUDA Toolkit library, those functions appear to execute correctly even though they do not return correct results. When the IMSL software detects that correct results were not returned, it prints a warning error message and uses the IMSL equivalent of the function, which does not use the GPU. The user can eliminate this error by using function imsl_cuda_set to set the threshold value to zero.
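The fallback behavior described in the note can be modeled with a small, self-contained C sketch. Everything here is hypothetical: the `Demo_*` types, `demo_cuda_set`, and `demo_choose_path` are illustrative stand-ins, not the IMSL API. The sketch also assumes that a threshold of zero means the GPU path is never attempted, and that the threshold is otherwise compared against a problem size; the note implies the first point but states neither explicitly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for two Table 12.9 entries; not the IMSL enum. */
typedef enum { DEMO_CUDA_SGEMM, DEMO_CUDA_DGEMM, DEMO_CUDA_COUNT } Demo_cuda;

typedef enum { DEMO_PATH_CPU, DEMO_PATH_GPU } Demo_path;

/* Per-function thresholds. ASSUMPTION: 0 means "never use the GPU";
   otherwise the GPU is tried when the problem size n reaches the threshold. */
static int demo_threshold[DEMO_CUDA_COUNT] = { 64, 64 };

/* Models the role the note gives to imsl_cuda_set: store a threshold. */
static void demo_cuda_set(Demo_cuda name, int n)
{
    assert((int)name >= 0 && (int)name < DEMO_CUDA_COUNT);
    demo_threshold[name] = (n < 0) ? 0 : n;
}

/* Choose the execution path for a problem of size n.
   gpu_result_ok models the library's check of the GPU output;
   *warned is set to 1 when a fallback warning would be issued. */
static Demo_path demo_choose_path(Demo_cuda name, int n,
                                  int gpu_result_ok, int *warned)
{
    int t = demo_threshold[name];
    *warned = 0;
    if (t <= 0 || n < t) {
        return DEMO_PATH_CPU;   /* GPU disabled or problem too small */
    }
    if (!gpu_result_ok) {
        *warned = 1;            /* incorrect GPU result detected */
        return DEMO_PATH_CPU;   /* fall back to the non-GPU version */
    }
    return DEMO_PATH_GPU;
}
```

In this model, a double precision call on hardware without double precision support falls back to the CPU with a warning. After `demo_cuda_set(DEMO_CUDA_DGEMM, 0)` the same call goes straight to the CPU path and no warning is raised.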
Table 12.9 — Enumerations of NVIDIA Toolkit-Enabled Functions
IMSL_CUDA_SGEMV | IMSL_CUDA_DGER  | IMSL_CUDA_STRSM
IMSL_CUDA_SGER  | IMSL_CUDA_DSYR  | IMSL_CUDA_DTRSM
IMSL_CUDA_SSYR  | IMSL_CUDA_DGEMM | IMSL_CUDA_C_FFT_2D_COMPLEX
IMSL_CUDA_SGEMM | IMSL_CUDA_SGBMV | IMSL_CUDA_Z_FFT_2D_COMPLEX
IMSL_CUDA_DGEMV | IMSL_CUDA_DGBMV |
Table 12.10 — NVIDIA CUDA Toolkit Utilities
|
|
|
Required NVIDIA Copyright Notice:
© 2005–2011 by NVIDIA Corporation. All rights reserved.
Portions of the NVIDIA SGEMM and DGEMM library routines were written by Vasily Volkov and are subject to the Modified Berkeley Software Distribution License as follows:
Copyright (©) 2007-09, Regents of the University of California
All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. (See CUDA Toolkit 4.0, CUBLAS Library, April, 2011, for these remaining conditions.)