10.3 1D Texturing
For illustrative purposes, we will deal with 1D textures in detail and then expand the discussion to include 2D and 3D textures.
10.3.1 Texture Setup
The data in textures can consist of 1, 2, or 4 components of any of the following types:
- Signed or unsigned 8-, 16-, or 32-bit integers,
- 16-bit floating point values, or
- 32-bit floating point values.
In the .cu file (whether using the CUDA runtime or the driver API), the texture reference is declared as follows:
texture<ReturnType, Dimension, ReadMode> Name;
where ReturnType is the value returned by the texture intrinsic; Dimension is 1, 2, or 3 for 1D, 2D, or 3D, respectively; and ReadMode is an optional template parameter that defaults to cudaReadModeElementType. The read mode only affects integer-valued texture data: by default, the texture passes back integers when the texture data is integer-valued, promoting them to 32-bit if necessary. But when cudaReadModeNormalizedFloat is specified as the read mode, 8- or 16-bit integers are promoted to floating point values in the range [0.0, 1.0] (for unsigned integers) or [-1.0, 1.0] (for signed integers) according to the formulas below.
| Format | Conversion Formula to Float |
| --- | --- |
| char c | max(c/127.0, -1.0) |
| unsigned char c | c/255.0 |
| short s | max(s/32767.0, -1.0) |
| unsigned short s | s/65535.0 |

| Texture Type | Intrinsic |
| --- | --- |
| 1D texture | tex1D(float x); |
| 2D texture | tex2D(float x, float y); |
| 3D texture | tex3D(float x, float y, float z); |
| 1D layered texture | tex1DLayered(float x, int layer); |
| 2D layered texture | tex2DLayered(float x, float y, int layer); |

Table 9-2. Texture intrinsics
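As a concrete illustration of how the read mode changes what the texture intrinsic returns for the same underlying data, consider the following declarations (a minimal sketch; the names tex8u and tex8uNorm are hypothetical):

texture<unsigned char, 1, cudaReadModeElementType> tex8u;
// tex1D(tex8u, x) returns the unsigned char texel value itself.

texture<unsigned char, 1, cudaReadModeNormalizedFloat> tex8uNorm;
// tex1D(tex8uNorm, x) returns a float in [0.0, 1.0], computed as c/255.0.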
Texture references have file scope and behave similarly to global variables. They cannot be created, destroyed, or passed as parameters, so wrapping them in higher-level abstractions must be undertaken with care.
CUDA Runtime
Before invoking a kernel that uses a texture, the texture must be bound to a CUDA array or device memory by calling cudaBindTexture(), cudaBindTexture2D(), or cudaBindTextureToArray(). Due to the language integration of the CUDA runtime, the texture can be referenced by name, e.g.:
texture<float, 2, cudaReadModeElementType> tex;
...
CUDART_CHECK(cudaBindTextureToArray(tex, texArray));
Once the texture is bound, kernels that use that texture reference will read from the bound memory until the texture binding is changed.
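To make the whole sequence concrete, here is a minimal runtime-API sketch that creates a CUDA array, binds it, and launches a kernel that reads through the texture (the kernel name FetchTexel and the variables N, hostData, and devOut are hypothetical; error checking is elided):

texture<float, 1, cudaReadModeElementType> tex1;

__global__ void FetchTexel(float *out, float x)
{
    *out = tex1D(tex1, x);  // read the texel at coordinate x
}

// Host code
cudaArray *texArray;
cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
cudaMallocArray(&texArray, &desc, N);                 // 1D array of N floats
cudaMemcpyToArray(texArray, 0, 0, hostData,
                  N*sizeof(float), cudaMemcpyHostToDevice);
cudaBindTextureToArray(tex1, texArray);
FetchTexel<<<1,1>>>(devOut, 0.5f);
cudaUnbindTexture(tex1);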
Driver API
When a texture is declared in a .cu file, driver applications must query it using cuModuleGetTexRef(). In the driver API, the immutable attributes of the texture must be set explicitly, and they must agree with the assumptions used by the compiler to generate the code. For most textures, this just means the format must agree with the format declared in the .cu file; the exception is when textures are set up to promote integers or 16-bit floating point values to normalized 32-bit floating point values.
The cuTexRefSetFormat() function is used to specify the format of the data in the texture:
CUresult CUDAAPI cuTexRefSetFormat(CUtexref hTexRef, CUarray_format fmt, int NumPackedComponents);
The array formats are as follows:
| Enumeration Value | Type |
| --- | --- |
| CU_AD_FORMAT_UNSIGNED_INT8 | unsigned char |
| CU_AD_FORMAT_UNSIGNED_INT16 | unsigned short |
| CU_AD_FORMAT_UNSIGNED_INT32 | unsigned int |
| CU_AD_FORMAT_SIGNED_INT8 | signed char |
| CU_AD_FORMAT_SIGNED_INT16 | short |
| CU_AD_FORMAT_SIGNED_INT32 | int |
| CU_AD_FORMAT_HALF | half (IEEE 754 “binary16” format) |
| CU_AD_FORMAT_FLOAT | float |
NumPackedComponents specifies the number of components in each texture element. It may be 1, 2, or 4.
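Putting these pieces together, a driver API application might set up a texture reference as follows (a minimal sketch; hModule, hArray, and the texture name "tex1" are hypothetical, and error checking is elided):

CUtexref texRef;

// Look up the texture reference declared in the .cu file.
cuModuleGetTexRef(&texRef, hModule, "tex1");

// The format must agree with the declaration: 1 component of float.
cuTexRefSetFormat(texRef, CU_AD_FORMAT_FLOAT, 1);

// Bind the CUDA array containing the texture data.
cuTexRefSetArray(texRef, hArray, CU_TRSA_OVERRIDE_FORMAT);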
16-bit floats are a special data type, well suited to representing image data with high fidelity: with 10 bits of floating point mantissa (effectively 11 bits of precision for normalized numbers), there is enough precision to represent data generated by most sensors, and 5 bits of exponent give enough dynamic range to represent starlight and sunlight in the same image. Most floating point architectures do not include native instructions to process 16-bit floats, and CUDA is no exception. The texture hardware promotes 16-bit floats to 32-bit floats automatically, and CUDA kernels can convert between 16- and 32-bit floats with the __float2half_rn() and __half2float() intrinsics.
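As an illustration, the following kernel sketch round-trips float data through 16-bit floats using those intrinsics (the kernel name RoundTripHalf is hypothetical; this assumes a toolkit where the intrinsics operate on unsigned short, the representation used before cuda_fp16.h introduced a dedicated __half type):

__global__ void RoundTripHalf(float *out, const float *in, size_t N)
{
    for (size_t i = blockIdx.x*blockDim.x + threadIdx.x;
         i < N;
         i += blockDim.x*gridDim.x)
    {
        unsigned short h = __float2half_rn(in[i]);  // float -> half, round-to-nearest
        out[i] = __half2float(h);                   // half -> float, exact
    }
}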