1. topsdnn API Reference

This is the API reference documentation for the topsDNN library. It lists the data types and functions of each library.

1.1. OpsInferLibrary

This entity contains the routines related to topsDNN context creation and destruction, tensor descriptor management, tensor utility routines, and the inference portion of common machine learning algorithms.

1.1.1. OpsInferEnumeration

These are the enumeration types in the topsdnn_ops_infer library.

enum topsdnnStatus_t

Values:

enumerator TOPSDNN_STATUS_SUCCESS
enumerator TOPSDNN_STATUS_NOT_INITIALIZED
enumerator TOPSDNN_STATUS_ALLOC_FAILED
enumerator TOPSDNN_STATUS_BAD_PARAM
enumerator TOPSDNN_STATUS_INTERNAL_ERROR
enumerator TOPSDNN_STATUS_INVALID_VALUE
enumerator TOPSDNN_STATUS_ARCH_MISMATCH
enumerator TOPSDNN_STATUS_MAPPING_ERROR
enumerator TOPSDNN_STATUS_EXECUTION_FAILED
enumerator TOPSDNN_STATUS_NOT_SUPPORTED
enumerator TOPSDNN_STATUS_LICENSE_ERROR
enumerator TOPSDNN_STATUS_RUNTIME_PREREQUISITE_MISSING
enumerator TOPSDNN_STATUS_RUNTIME_IN_PROGRESS
enumerator TOPSDNN_STATUS_RUNTIME_FP_OVERFLOW
enumerator TOPSDNN_STATUS_VERSION_MISMATCH

enum topsdnnDataType_t

Values:

enumerator TOPSDNN_DATA_FLOAT
enumerator TOPSDNN_DATA_EF32
enumerator TOPSDNN_DATA_DOUBLE
enumerator TOPSDNN_DATA_HALF
enumerator TOPSDNN_DATA_BFLOAT16
enumerator TOPSDNN_DATA_INT8
enumerator TOPSDNN_DATA_INT32
enumerator TOPSDNN_DATA_END

enum topsdnnTensorFormat_t

Values:

enumerator TOPSDNN_TENSOR_NCHW
enumerator TOPSDNN_TENSOR_NHWC

enum topsdnnNanPropagation_t

topsdnnNanPropagation_t is an enumerated type used to indicate whether a given routine should propagate NaN values.

Values:

enumerator TOPSDNN_NOT_PROPAGATE_NAN

NaN values are not propagated.

enumerator TOPSDNN_PROPAGATE_NAN

NaN values are propagated.
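The difference between the two modes can be sketched in plain C for a ReLU-style operation. This is an illustration only: `nan_mode_t` and `relu_with_nan_mode` are hypothetical names, and mapping a NaN input to 0 in the non-propagating mode is an assumption, not documented topsDNN behaviour.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for topsdnnNanPropagation_t; not the library's type. */
typedef enum { NAN_NOT_PROPAGATE = 0, NAN_PROPAGATE = 1 } nan_mode_t;

/* In-place ReLU over count values, with an explicit NaN policy. */
void relu_with_nan_mode(float *data, size_t count, nan_mode_t mode) {
    size_t i;
    assert(data != NULL || count == 0);
    for (i = 0; i < count; ++i) {
        float x = data[i];
        if (x != x) { /* this comparison is true only when x is NaN */
            /* Assumption: the non-propagating mode replaces NaN with 0. */
            if (mode == NAN_NOT_PROPAGATE) {
                data[i] = 0.0f;
            }
        } else if (x < 0.0f) {
            data[i] = 0.0f;
        }
    }
}
```

In this sketch a NaN input stays NaN under `NAN_PROPAGATE` and becomes 0 under `NAN_NOT_PROPAGATE`; all other inputs are treated identically in both modes.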

enum topsdnnMathType_t

Values:

enumerator TOPSDNN_DEFAULT_MATH
enumerator TOPSDNN_TENSOR_OP_MATH
enumerator TOPSDNN_TENSOR_OP_MATH_ALLOW_CONVERSION
enumerator TOPSDNN_FMA_MATH

enum topsdnnDeterminism_t

Values:

enumerator TOPSDNN_NON_DETERMINISTIC
enumerator TOPSDNN_DETERMINISTIC

enum topsdnnActivationMode_t

topsdnnActivationMode_t is an enumerated type used to select the neuron activation function used in topsdnnActivationForward(), topsdnnActivationBackward(), and topsdnnConvolutionBiasActivationForward().

Values:

enumerator TOPSDNN_ACTIVATION_SIGMOID

Selects the sigmoid function.

enumerator TOPSDNN_ACTIVATION_RELU

Selects the rectified linear function.

enumerator TOPSDNN_ACTIVATION_TANH

Selects the hyperbolic tangent function.

enumerator TOPSDNN_ACTIVATION_CLIPPED_RELU

Selects the clipped rectified linear function.

enumerator TOPSDNN_ACTIVATION_ELU

Selects the exponential linear function.

enumerator TOPSDNN_ACTIVATION_IDENTITY

Selects the identity function, intended for bypassing the activation step in topsdnnConvolutionBiasActivationForward().

enumerator TOPSDNN_ACTIVATION_SWISH

Selects the swish function.

enumerator TOPSDNN_ACTIVATION_LEAKY_RELU

Selects the leaky rectified linear function.

enum topsdnnConvolutionFwdAlgo_t

Values:

enumerator TOPSDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_GEMM
enumerator TOPSDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM
enumerator TOPSDNN_CONVOLUTION_FWD_ALGO_GEMM
enumerator TOPSDNN_CONVOLUTION_FWD_ALGO_DIRECT
enumerator TOPSDNN_CONVOLUTION_FWD_ALGO_FFT
enumerator TOPSDNN_CONVOLUTION_FWD_ALGO_FFT_TILING
enumerator TOPSDNN_CONVOLUTION_FWD_ALGO_WINOGRAD
enumerator TOPSDNN_CONVOLUTION_FWD_ALGO_WINOGRAD_NONFUSED
enumerator TOPSDNN_CONVOLUTION_FWD_ALGO_COUNT

enum topsdnnConvolutionBwdFilterAlgo_t

Values:

enumerator TOPSDNN_CONVOLUTION_BWD_FILTER_ALGO_0
enumerator TOPSDNN_CONVOLUTION_BWD_FILTER_ALGO_1
enumerator TOPSDNN_CONVOLUTION_BWD_FILTER_ALGO_FFT
enumerator TOPSDNN_CONVOLUTION_BWD_FILTER_ALGO_3
enumerator TOPSDNN_CONVOLUTION_BWD_FILTER_ALGO_WINOGRAD
enumerator TOPSDNN_CONVOLUTION_BWD_FILTER_ALGO_WINOGRAD_NONFUSED
enumerator TOPSDNN_CONVOLUTION_BWD_FILTER_ALGO_FFT_TILING
enumerator TOPSDNN_CONVOLUTION_BWD_FILTER_ALGO_COUNT

enum topsdnnConvolutionBwdDataAlgo_t

Values:

enumerator TOPSDNN_CONVOLUTION_BWD_DATA_ALGO_0
enumerator TOPSDNN_CONVOLUTION_BWD_DATA_ALGO_1
enumerator TOPSDNN_CONVOLUTION_BWD_DATA_ALGO_FFT
enumerator TOPSDNN_CONVOLUTION_BWD_DATA_ALGO_FFT_TILING
enumerator TOPSDNN_CONVOLUTION_BWD_DATA_ALGO_WINOGRAD
enumerator TOPSDNN_CONVOLUTION_BWD_DATA_ALGO_WINOGRAD_NONFUSED
enumerator TOPSDNN_CONVOLUTION_BWD_DATA_ALGO_COUNT

enum topsdnnPoolingMode_t

Values:

enumerator TOPSDNN_POOLING_MAX
enumerator TOPSDNN_POOLING_AVERAGE_COUNT_INCLUDE_PADDING
enumerator TOPSDNN_POOLING_AVERAGE_COUNT_EXCLUDE_PADDING
enumerator TOPSDNN_POOLING_MAX_DETERMINISTIC

enum topsdnnSoftmaxAlgorithm_t

Values:

enumerator TOPSDNN_SOFTMAX_FAST
enumerator TOPSDNN_SOFTMAX_ACCURATE
enumerator TOPSDNN_SOFTMAX_LOG

enum topsdnnSoftmaxMode_t

Values:

enumerator TOPSDNN_SOFTMAX_MODE_INSTANCE
enumerator TOPSDNN_SOFTMAX_MODE_CHANNEL

enum topsdnnReduceTensorOp_t

Values:

enumerator TOPSDNN_REDUCE_TENSOR_ADD
enumerator TOPSDNN_REDUCE_TENSOR_MUL
enumerator TOPSDNN_REDUCE_TENSOR_MIN
enumerator TOPSDNN_REDUCE_TENSOR_MAX
enumerator TOPSDNN_REDUCE_TENSOR_AMAX
enumerator TOPSDNN_REDUCE_TENSOR_AVG
enumerator TOPSDNN_REDUCE_TENSOR_NORM1
enumerator TOPSDNN_REDUCE_TENSOR_NORM2
enumerator TOPSDNN_REDUCE_TENSOR_MUL_NO_ZEROS

enum topsdnnReduceTensorIndices_t

Values:

enumerator TOPSDNN_REDUCE_TENSOR_NO_INDICES
enumerator TOPSDNN_REDUCE_TENSOR_FLATTENED_INDICES

enum topsdnnIndicesType_t

Values:

enumerator TOPSDNN_32BIT_INDICES
enumerator TOPSDNN_64BIT_INDICES
enumerator TOPSDNN_16BIT_INDICES
enumerator TOPSDNN_8BIT_INDICES

enum topsdnnOpTensorOp_t

Values:

enumerator TOPSDNN_OP_TENSOR_ADD
enumerator TOPSDNN_OP_TENSOR_MUL
enumerator TOPSDNN_OP_TENSOR_MIN
enumerator TOPSDNN_OP_TENSOR_MAX
enumerator TOPSDNN_OP_TENSOR_SQRT
enumerator TOPSDNN_OP_TENSOR_NOT

enum topsdnnBatchNormMode_t

Values:

enumerator TOPSDNN_BATCHNORM_PER_ACTIVATION
enumerator TOPSDNN_BATCHNORM_SPATIAL
enumerator TOPSDNN_BATCHNORM_SPATIAL_PERSISTENT

enum topsdnnLRNMode_t

topsdnnLRNMode_t is an enumerated type used to specify the mode of operation in topsdnnLRNCrossChannelForward() and topsdnnLRNCrossChannelBackward().

Values:

enumerator TOPSDNN_LRN_CROSS_CHANNEL_DIM1

LRN computation is performed across the tensor’s dimension dimA[1].

1.1.2. OpsInferFunctions

These are the API functions in the topsdnn_ops_infer library.

size_t TOPSDNN_EXPORT topsdnnGetVersion (void)

This function returns the version number of the TOPSDNN library. It returns the value of TOPSDNN_VERSION defined in the topsdnn.h header file.

Returns

Version number of size_t type.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnCreate (topsdnnHandle_t *handle)

This function initializes the TOPSDNN library and creates a handle to an opaque structure holding the TOPSDNN library context. It allocates hardware resources on the host and device and must be called prior to making any other TOPSDNN library calls.

Parameters

handle – Output. Pointer to the location in which to store the address of the allocated topsDNN handle. For more information, refer to topsdnnHandle_t.

Returns

TOPSDNN_STATUS_SUCCESS TOPSDNN handle was created successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM Invalid (NULL) input pointer supplied.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnDestroy (topsdnnHandle_t handle)

This function releases the resources used by the TOPSDNN handle.

Parameters

handle – Input. The TOPSDNN handle to be destroyed.

Returns

TOPSDNN_STATUS_SUCCESS The TOPSDNN context destruction was successful.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnSetStream (topsdnnHandle_t handle, topsStream_t streamId)

This function sets the user’s TOPS stream in the topsDNN handle. The new stream is used to launch topsDNN GCU kernels, or to synchronize with this stream when topsDNN kernels are launched in internal streams. If the topsDNN library stream is not set, all kernels use the default (NULL) stream. Setting the user stream in the topsDNN handle guarantees the issue-order execution of topsDNN calls and other GCU kernels launched in the same stream.

Parameters
  • handle – Pointer to the topsDNN handle.

  • streamId – New TOPS stream to be written to the topsDNN handle.

Returns

TOPSDNN_STATUS_BAD_PARAM Invalid (NULL) handle.

Returns

TOPSDNN_STATUS_MAPPING_ERROR Mismatch between the user stream and the topsDNN handle context.

Returns

TOPSDNN_STATUS_SUCCESS The new stream was set successfully.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnGetStream (topsdnnHandle_t handle, topsStream_t *streamId)

This function retrieves the user TOPS stream programmed in the topsDNN handle. When the user’s TOPS stream is not set in the topsDNN handle, this function reports the default (NULL) stream.

Parameters
  • handle – Pointer to the topsDNN handle.

  • streamId – Pointer to the location where the current TOPS stream from the topsDNN handle should be stored.

Returns

TOPSDNN_STATUS_BAD_PARAM Invalid (NULL) handle.

Returns

TOPSDNN_STATUS_SUCCESS The stream identifier was retrieved successfully.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnCreateTensorDescriptor (topsdnnTensorDescriptor_t *tensorDesc)

This function creates a generic tensor descriptor object by allocating the memory needed to hold its opaque structure. The data is initialized to all zeros.

Parameters

tensorDesc – Output. Pointer to pointer where the address to the allocated tensor descriptor object should be stored.

Returns

TOPSDNN_STATUS_SUCCESS The object was created successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM Invalid input argument.

Returns

TOPSDNN_STATUS_ALLOC_FAILED The resources could not be allocated.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnDestroyTensorDescriptor (topsdnnTensorDescriptor_t tensorDesc)

This function destroys a previously created tensor descriptor object. When the input pointer is NULL, this function performs no destroy operation.

Parameters

tensorDesc – Input. Pointer to the tensor descriptor object to be destroyed.

Returns

TOPSDNN_STATUS_SUCCESS The object was destroyed successfully.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnSetTensor4dDescriptor (topsdnnTensorDescriptor_t tensorDesc, topsdnnTensorFormat_t format, topsdnnDataType_t dataType, int n, int c, int h, int w)

This function initializes a previously created generic tensor descriptor object into a 4D tensor. The strides of the four dimensions are inferred from the format parameter and set in such a way that the data is contiguous in memory with no padding between dimensions.

Parameters
  • tensorDesc – Input/Output. Handle to a previously created tensor descriptor.

  • format – Input. Type of format.

  • dataType – Input. Data type.

  • n – Input. Number of images.

  • c – Input. Number of feature maps per image.

  • h – Input. Height of each feature map.

  • w – Input. Width of each feature map.

Returns

TOPSDNN_STATUS_SUCCESS The object was set successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM At least one of the parameters n, c, h, w was negative, or format or dataType has an invalid enumerant value.

Returns

TOPSDNN_STATUS_NOT_SUPPORTED The total size of the tensor descriptor exceeds the maximum limit of 2 Giga-elements.
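To make the stride inference concrete, the following C sketch computes the strides of a fully packed 4D tensor for the two layouts in topsdnnTensorFormat_t. It is not the library's code: `tensor_format_t`, `strides4d_t`, and `packed_strides_4d` are hypothetical names, strides are assumed to be counted in elements, and the extents are assumed to be positive.

```c
#include <assert.h>

/* Hypothetical stand-in for topsdnnTensorFormat_t; not the library's type. */
typedef enum { FORMAT_NCHW = 0, FORMAT_NHWC = 1 } tensor_format_t;

/* Strides, in elements, of the n, c, h and w dimensions. */
typedef struct { int n, c, h, w; } strides4d_t;

/* Strides of a contiguous 4D tensor with no padding between dimensions. */
strides4d_t packed_strides_4d(tensor_format_t format, int c, int h, int w) {
    strides4d_t s;
    assert(c > 0 && h > 0 && w > 0);
    if (format == FORMAT_NCHW) {
        /* Width varies fastest, then height, then channels. */
        s.w = 1;
        s.h = w;
        s.c = h * w;
        s.n = c * h * w;
    } else {
        /* NHWC: channels vary fastest, then width, then height. */
        s.c = 1;
        s.w = c;
        s.h = w * c;
        s.n = h * w * c;
    }
    return s;
}
```

For c = 3, h = 4, w = 5 the sketch gives strides (n, c, h, w) = (60, 20, 5, 1) for NCHW and (60, 1, 15, 3) for NHWC. These are the kind of values topsdnnGetTensor4dDescriptor reports through nStride, cStride, hStride, and wStride.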

topsdnnStatus_t TOPSDNN_EXPORT topsdnnGetTensor4dDescriptor (const topsdnnTensorDescriptor_t tensorDesc, topsdnnDataType_t *dataType, int *n, int *c, int *h, int *w, int *nStride, int *cStride, int *hStride, int *wStride)

This function queries the parameters of the previously initialized Tensor4d descriptor object.

Parameters
  • tensorDesc – Input. Handle to a previously initialized tensor descriptor.

  • dataType – Output. Data type.

  • n – Output. Number of images.

  • c – Output. Number of feature maps per image.

  • h – Output. Height of each feature map.

  • w – Output. Width of each feature map.

  • nStride – Output. Stride between two consecutive images.

  • cStride – Output. Stride between two consecutive feature maps.

  • hStride – Output. Stride between two consecutive rows.

  • wStride – Output. Stride between two consecutive columns.

Returns

TOPSDNN_STATUS_SUCCESS The operation succeeded.

Returns

TOPSDNN_STATUS_BAD_PARAM Invalid input argument.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnSetTensorNdDescriptor (topsdnnTensorDescriptor_t tensorDesc, topsdnnDataType_t dataType, int nbDims, const int dimA[], const int strideA[])

This function initializes a previously created generic tensor descriptor object.

Parameters
  • tensorDesc – Input/Output. Handle to a previously created tensor descriptor.

  • dataType – Input. Data type.

  • nbDims – Input. Number of dimensions of the tensor.

  • dimA – Input. Array of nbDims elements that contains the size of the tensor in each dimension.

  • strideA – Input. Array of nbDims elements that contains the stride of the tensor in each dimension.

Returns

TOPSDNN_STATUS_SUCCESS The operation succeeded.

Returns

TOPSDNN_STATUS_BAD_PARAM At least one of the elements of the array dimA was negative or zero, or dataType has an invalid enumerant value.

Returns

TOPSDNN_STATUS_NOT_SUPPORTED The parameter nbDims is outside the range [4, TOPSDNN_DIM_MAX], or the total size of the tensor descriptor exceeds the maximum limit of 2 Giga-elements.
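The documented error conditions can be mirrored by a caller-side pre-check. The sketch below is hypothetical: `check_result_t` and `check_nd_dims` are not part of topsDNN, the value of TOPSDNN_DIM_MAX is passed in as `dimMax` because it is not given here, "2 Giga-elements" is assumed to mean 2^31 elements, and the order in which the library performs its checks is not documented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes that mirror the documented status values. */
typedef enum { CHECK_OK, CHECK_BAD_PARAM, CHECK_NOT_SUPPORTED } check_result_t;

/* Pre-validates the dimension arguments before a descriptor is set. */
check_result_t check_nd_dims(int nbDims, const int dimA[], int dimMax) {
    /* Assumption: "2 Giga-elements" means 2^31 elements. */
    const int64_t max_elements = (int64_t)1 << 31;
    int64_t total = 1;
    int i;
    assert(dimA != NULL);
    if (nbDims < 4 || nbDims > dimMax) {
        return CHECK_NOT_SUPPORTED;
    }
    for (i = 0; i < nbDims; ++i) {
        if (dimA[i] <= 0) {
            return CHECK_BAD_PARAM;
        }
    }
    for (i = 0; i < nbDims; ++i) {
        total *= dimA[i];
        if (total > max_elements) {
            return CHECK_NOT_SUPPORTED;
        }
    }
    return CHECK_OK;
}
```

Checking in the caller gives a clearer error message than a bare status code, but the status returned by topsdnnSetTensorNdDescriptor remains authoritative.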

topsdnnStatus_t TOPSDNN_EXPORT topsdnnGetTensorNdDescriptor (const topsdnnTensorDescriptor_t tensorDesc, int nbDimsRequested, topsdnnDataType_t *dataType, int *nbDims, int dimA[], int strideA[])

This function retrieves values stored in a previously initialized TensorNd descriptor object.

Parameters
  • tensorDesc – Input. Handle to a previously initialized tensor descriptor.

  • nbDimsRequested – Input. Number of dimensions to extract from a given tensor descriptor. It is also the minimum size of the arrays dimA and strideA. If this number is greater than the resulting nbDims[0], only nbDims[0] dimensions will be returned.

  • dataType – Output. Data type.

  • nbDims – Output. Actual number of dimensions of the tensor will be returned in nbDims[0].

  • dimA – Output. Array of at least nbDimsRequested elements that will be filled with the dimensions from the provided tensor descriptor.

  • strideA – Output. Array of at least nbDimsRequested elements that will be filled with the strides from the provided tensor descriptor.

Returns

TOPSDNN_STATUS_SUCCESS The results were returned successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM Either tensorDesc or nbDims pointer is NULL.
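The interaction between nbDimsRequested and the actual number of dimensions can be sketched as a bounded copy. `copy_requested_dims` is a hypothetical helper, not a topsDNN function; it models only the rule that at most the actual number of dimensions is written.

```c
#include <assert.h>
#include <stddef.h>

/* Copies min(nbDimsRequested, actualNbDims) entries and returns that count. */
int copy_requested_dims(int nbDimsRequested, int actualNbDims,
                        const int srcDimA[], int dstDimA[]) {
    int count;
    int i;
    assert(srcDimA != NULL && dstDimA != NULL);
    assert(nbDimsRequested >= 0 && actualNbDims >= 0);
    count = (nbDimsRequested < actualNbDims) ? nbDimsRequested : actualNbDims;
    for (i = 0; i < count; ++i) {
        dstDimA[i] = srcDimA[i];
    }
    return count;
}
```

A caller that sizes dimA and strideA to nbDimsRequested elements therefore never sees a write past the end of its arrays, whichever of the two numbers is larger.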

topsdnnStatus_t TOPSDNN_EXPORT topsdnnGetTensorSizeInBytes (const topsdnnTensorDescriptor_t tensorDesc, size_t *size)

This function returns the size of the tensor in memory with respect to the given descriptor. It can be used to determine the amount of device memory to allocate to hold that tensor.

Parameters
  • tensorDesc – Input. Handle to a previously initialized tensor descriptor.

  • size – Output. Size in bytes needed to hold the tensor in device memory.

Returns

TOPSDNN_STATUS_SUCCESS The results were returned successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM An input pointer is NULL.
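One plausible way to derive such a size from dimensions and strides is to locate the last addressable element and add one. The sketch below assumes that approach; `strided_size_in_bytes` is a hypothetical helper, the element size is supplied by the caller (for example 4 for a 32-bit float), and topsDNN may compute the value differently.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes spanned by a strided tensor: (offset of last element + 1) * elemSize. */
size_t strided_size_in_bytes(int nbDims, const int dimA[], const int strideA[],
                             size_t elemSize) {
    size_t lastOffset = 0;
    int i;
    assert(nbDims > 0 && dimA != NULL && strideA != NULL);
    for (i = 0; i < nbDims; ++i) {
        assert(dimA[i] > 0 && strideA[i] > 0);
        /* The last index along dimension i is dimA[i] - 1. */
        lastOffset += (size_t)(dimA[i] - 1) * (size_t)strideA[i];
    }
    return (lastOffset + 1) * elemSize;
}
```

For a packed 2x3x4x5 tensor with strides (60, 20, 5, 1) and 4-byte elements the sketch yields 480 bytes, which equals the element count (120) times the element size.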

topsdnnStatus_t TOPSDNN_EXPORT topsdnnCreateActivationDescriptor (topsdnnActivationDescriptor_t *activationDesc)

This function creates an activation descriptor object by allocating the memory needed to hold its opaque structure.

Returns

TOPSDNN_STATUS_SUCCESS The object was created successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM An input parameter was outside the valid range.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnSetActivationDescriptor (topsdnnActivationDescriptor_t activationDesc, topsdnnActivationMode_t mode, topsdnnNanPropagation_t reluNanOpt, double coef)

This function initializes a previously created generic activation descriptor object.

Parameters
  • activationDesc – Input/Output. Handle to a previously created activation descriptor.

  • mode – Input. Enumerant to specify the activation mode.

  • reluNanOpt – Input. Enumerant to specify the NaN propagation mode.

  • coef – Input. Floating point number.

Returns

TOPSDNN_STATUS_SUCCESS The object was set successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM An input parameter was outside the valid range.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnGetActivationDescriptor (const topsdnnActivationDescriptor_t activationDesc, topsdnnActivationMode_t *mode, topsdnnNanPropagation_t *reluNanOpt, double *coef)

This function queries a previously initialized generic activation descriptor object.

Parameters
  • activationDesc – Input. Handle to a previously created activation descriptor.

  • mode – Output. Enumerant to specify the activation mode.

  • reluNanOpt – Output. Enumerant to specify the NaN propagation mode.

  • coef – Output. Floating point number.

Returns

TOPSDNN_STATUS_SUCCESS The object was queried successfully.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnDestroyActivationDescriptor (topsdnnActivationDescriptor_t activationDesc)

This function destroys a previously created activation descriptor object.

Returns

TOPSDNN_STATUS_SUCCESS The object was destroyed successfully.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnActivationForward (topsdnnHandle_t handle, topsdnnActivationDescriptor_t activationDesc, const void *alpha, const topsdnnTensorDescriptor_t xDesc, const void *x, const void *beta, const topsdnnTensorDescriptor_t yDesc, void *y)

This routine applies a specified neuron activation function element-wise over each input value.

Parameters
  • handle – Input. Handle to a previously created topsDNN context.

  • activationDesc – Input. Handle to a previously created activation descriptor.

  • alpha – Input. Pointer to a scaling factor (in host memory) used to blend the layer output value with the prior value in the destination tensor.

  • xDesc – Input. Handle to the previously initialized input tensor descriptor.

  • x – Input. Data pointer to device memory associated with the tensor descriptor xDesc.

  • beta – Input. Pointer to a scaling factor (in host memory) used to blend the layer output value with the prior value in the destination tensor.

  • yDesc – Input. Handle to the previously initialized output tensor descriptor.

  • y – Output. Data pointer to device memory associated with the output tensor descriptor yDesc.

Returns

TOPSDNN_STATUS_SUCCESS The function launched successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM An input parameter was outside the valid range.
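The element-wise behaviour can be sketched on the host in plain C for the ReLU family of modes. Everything here is illustrative: `act_mode_t`, `apply_activation`, and `activation_forward` are hypothetical names, the use of coef as the clipping ceiling and as the leaky slope is an assumption, and the blend dstValue = alpha*resultValue + beta*priorDstValue is the formula given for topsdnnPoolingForward(), assumed here to apply to activation as well.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of activation modes; not the library's enum. */
typedef enum { ACT_RELU, ACT_CLIPPED_RELU, ACT_LEAKY_RELU, ACT_IDENTITY } act_mode_t;

/* Applies one activation to one value. */
float apply_activation(act_mode_t mode, float x, float coef) {
    switch (mode) {
    case ACT_RELU:
        return (x > 0.0f) ? x : 0.0f;
    case ACT_CLIPPED_RELU:
        /* Assumption: coef is the upper clipping bound. */
        if (x < 0.0f) return 0.0f;
        return (x < coef) ? x : coef;
    case ACT_LEAKY_RELU:
        /* Assumption: coef is the slope for negative inputs. */
        return (x > 0.0f) ? x : coef * x;
    default:
        return x; /* identity */
    }
}

/* y[i] = alpha * f(x[i]) + beta * y[i]; prior y is ignored when beta is 0. */
void activation_forward(act_mode_t mode, float coef, float alpha,
                        const float *x, float beta, float *y, size_t count) {
    size_t i;
    assert(x != NULL && y != NULL);
    for (i = 0; i < count; ++i) {
        float prior = (beta == 0.0f) ? 0.0f : beta * y[i];
        y[i] = alpha * apply_activation(mode, x[i], coef) + prior;
    }
}
```

With alpha = 1 and beta = 0 the destination simply receives the activated values; a non-zero beta accumulates into whatever y already holds.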

topsdnnStatus_t TOPSDNN_EXPORT topsdnnCreateFilterDescriptor (topsdnnFilterDescriptor_t *filterDesc)

This function creates a filter descriptor object by allocating the memory needed to hold its opaque structure.

Parameters

filterDesc – Pointer to an opaque structure holding the description of a filter dataset.

Returns

TOPSDNN_STATUS_SUCCESS The object was created successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM Invalid input argument.

Returns

TOPSDNN_STATUS_ALLOC_FAILED The resources could not be allocated.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnDestroyFilterDescriptor (topsdnnFilterDescriptor_t filterDesc)

This function destroys a filter object.

Parameters

filterDesc – Pointer to an opaque structure holding the description of a filter dataset.

Returns

TOPSDNN_STATUS_SUCCESS The object was destroyed successfully.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnSetFilter4dDescriptor (topsdnnFilterDescriptor_t filterDesc, topsdnnDataType_t dataType, topsdnnTensorFormat_t format, int k, int c, int h, int w)

This function initializes a previously created filter descriptor object into a 4D filter. The layout of the filters must be contiguous in memory.

Parameters
  • filterDesc – Handle to a previously created filter descriptor.

  • dataType – Data type.

  • format – Type of the filter layout format, NCHW or NHWC.

  • k – Number of output feature maps.

  • c – Number of input feature maps.

  • h – Height of each filter.

  • w – Width of each filter.

Returns

TOPSDNN_STATUS_SUCCESS The object was set successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM At least one of the parameters k, c, h, w was negative, or format or dataType has an invalid enumerant value.

Returns

TOPSDNN_STATUS_NOT_SUPPORTED The total size of the tensor descriptor exceeds the maximum limit of 2 Giga-elements.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnGetFilter4dDescriptor (const topsdnnFilterDescriptor_t filterDesc, topsdnnDataType_t *dataType, topsdnnTensorFormat_t *format, int *k, int *c, int *h, int *w)

This function queries the parameters of the previously initialized filter descriptor object.

Parameters
  • filterDesc – Handle to a previously created filter descriptor.

  • dataType – Data type.

  • format – Type of the filter layout format.

  • k – Number of output feature maps.

  • c – Number of input feature maps.

  • h – Height of each filter.

  • w – Width of each filter.

Returns

TOPSDNN_STATUS_SUCCESS The operation succeeded.

Returns

TOPSDNN_STATUS_BAD_PARAM Invalid input argument.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnSetFilterNdDescriptor (topsdnnFilterDescriptor_t filterDesc, topsdnnDataType_t dataType, topsdnnTensorFormat_t format, int nbDims, const int filterDimA[])

This function initializes a previously created filter descriptor object. The layout of the filters must be contiguous in memory.

Parameters
  • filterDesc – Input/Output. Handle to a previously created filter descriptor.

  • dataType – Input. Data type.

  • format – Input. Type of the filter layout format.

  • nbDims – Input. Number of dimensions of the filter.

  • filterDimA – Input. Array of dimension nbDims containing the size of the filter for each dimension.

Returns

TOPSDNN_STATUS_SUCCESS The operation succeeded.

Returns

TOPSDNN_STATUS_BAD_PARAM At least one of the elements of the array filterDimA was negative or zero, or dataType or format has an invalid enumerant value.

Returns

TOPSDNN_STATUS_NOT_SUPPORTED The parameter nbDims is outside the range [4, TOPSDNN_DIM_MAX], or the total size of the tensor descriptor exceeds the maximum limit of 2 Giga-elements.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnGetFilterNdDescriptor (const topsdnnFilterDescriptor_t filterDesc, int nbDimsRequested, topsdnnDataType_t *dataType, topsdnnTensorFormat_t *format, int *nbDims, int filterDimA[])

This function queries a previously initialized FilterNd descriptor object.

Parameters
  • filterDesc – Input. Handle to a previously initialized filter descriptor.

  • nbDimsRequested – Input. Number of dimensions of the expected filter descriptor. It is also the minimum size of the array filterDimA needed to hold the results.

  • dataType – Output. Data type.

  • format – Output. Type of format.

  • nbDims – Output. Actual number of dimensions of the filter.

  • filterDimA – Output. Array of at least nbDimsRequested elements that will be filled with the filter parameters from the provided filter descriptor.

Returns

TOPSDNN_STATUS_SUCCESS The results were returned successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM Either the filterDesc or the nbDims pointer is NULL.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnCreatePoolingDescriptor (topsdnnPoolingDescriptor_t *poolingDesc)

This function creates a pooling descriptor object by allocating the memory needed to hold its opaque structure.

Parameters

poolingDesc – Pointer to the pooling descriptor.

Returns

TOPSDNN_STATUS_SUCCESS The object was created successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM An input parameter was outside the valid range.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnSetPooling2dDescriptor (topsdnnPoolingDescriptor_t poolingDesc, topsdnnPoolingMode_t mode, topsdnnNanPropagation_t maxpoolingNanOpt, int windowHeight, int windowWidth, int verticalPadding, int horizontalPadding, int verticalStride, int horizontalStride)

This function initializes a previously created generic pooling descriptor object into a 2D description.

Parameters
  • poolingDesc – Input/Output. Handle to a previously created pooling descriptor.

  • mode – Input. Enumerant to specify the pooling mode.

  • maxpoolingNanOpt – Input. Enumerant to specify the NaN propagation mode.

  • windowHeight – Input. Height of the pooling window.

  • windowWidth – Input. Width of the pooling window.

  • verticalPadding – Input. Size of vertical padding.

  • horizontalPadding – Input. Size of horizontal padding.

  • verticalStride – Input. Pooling vertical stride.

  • horizontalStride – Input. Pooling horizontal stride.

Returns

TOPSDNN_STATUS_SUCCESS The object was set successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM At least one of the parameters windowHeight, windowWidth, verticalStride, horizontalStride is negative, or mode or maxpoolingNanOpt has an invalid enumerant value.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnSetPoolingNdDescriptor (topsdnnPoolingDescriptor_t poolingDesc, const topsdnnPoolingMode_t mode, const topsdnnNanPropagation_t maxpoolingNanOpt, int nbDims, const int windowDimA[], const int paddingA[], const int strideA[])

This function initializes a previously created generic pooling descriptor object.

Parameters
  • poolingDesc – Input/Output. Handle to a previously created pooling descriptor.

  • mode – Input. Enumerant to specify the pooling mode.

  • maxpoolingNanOpt – Input. Enumerant to specify the NaN propagation mode.

  • nbDims – Input. Number of dimensions of the pooling operation. Must be greater than zero.

  • windowDimA – Input. Array of dimension nbDims containing the window size for each dimension. The value of array elements must be greater than zero.

  • paddingA – Input. Array of dimension nbDims containing the padding size for each dimension. Negative padding is allowed.

  • strideA – Input. Array of dimension nbDims containing the striding size for each dimension. The value of array elements must be greater than zero (meaning, negative striding size is not allowed).

Returns

TOPSDNN_STATUS_SUCCESS The object was initialized successfully.

Returns

TOPSDNN_STATUS_NOT_SUPPORTED If (nbDims > TOPSDNN_DIM_MAX-2).

Returns

TOPSDNN_STATUS_BAD_PARAM At least one of the following conditions is met: The descriptor poolingDesc is invalid.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnGetPooling2dDescriptor (const topsdnnPoolingDescriptor_t poolingDesc, topsdnnPoolingMode_t *mode, topsdnnNanPropagation_t *maxpoolingNanOpt, int *windowHeight, int *windowWidth, int *verticalPadding, int *horizontalPadding, int *verticalStride, int *horizontalStride)

This function queries a previously created 2D pooling descriptor object.

Parameters
  • poolingDesc – Input. Handle to a previously created pooling descriptor.

  • mode – Output. Enumerant to specify the pooling mode.

  • maxpoolingNanOpt – Output. Enumerant to specify the NaN propagation mode.

  • windowHeight – Output. Height of the pooling window.

  • windowWidth – Output. Width of the pooling window.

  • verticalPadding – Output. Size of vertical padding.

  • horizontalPadding – Output. Size of horizontal padding.

  • verticalStride – Output. Pooling vertical stride.

  • horizontalStride – Output. Pooling horizontal stride.

Returns

TOPSDNN_STATUS_SUCCESS The object was queried successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM At least one of the following conditions is met: poolingDesc has not been initialized.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnGetPoolingNdDescriptor (const topsdnnPoolingDescriptor_t poolingDesc, int nbDimsRequested, topsdnnPoolingMode_t *mode, topsdnnNanPropagation_t *maxpoolingNanOpt, int *nbDims, int windowDimA[], int paddingA[], int strideA[])

This function queries a previously initialized generic pooling descriptor object.

Parameters
  • poolingDesc – Input. Handle to a previously created pooling descriptor.

  • nbDimsRequested – Input. Number of dimensions of the expected pooling descriptor. It is also the minimum size of the arrays windowDimA, paddingA, and strideA needed to hold the results.

  • mode – Output. Enumerant to specify the pooling mode.

  • maxpoolingNanOpt – Output. Enumerant to specify the NaN propagation mode.

  • nbDims – Output. Actual number of dimensions of the pooling descriptor.

  • windowDimA – Output. Array of at least nbDimsRequested elements that will be filled with the window parameters from the provided pooling descriptor.

  • paddingA – Output. Array of at least nbDimsRequested elements that will be filled with the padding parameters from the provided pooling descriptor.

  • strideA – Output. Array of at least nbDimsRequested elements that will be filled with the stride parameters from the provided pooling descriptor.

Returns

TOPSDNN_STATUS_SUCCESS The object was queried successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM At least one of the following conditions is met: The descriptor poolingDesc is invalid. nbDimsRequested is negative. At least one of the parameters nbDims, windowDimA, paddingA, strideA is invalid.

Returns

TOPSDNN_STATUS_NOT_SUPPORTED The parameter nbDimsRequested is greater than TOPSDNN_DIM_MAX.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnDestroyPoolingDescriptor (topsdnnPoolingDescriptor_t poolingDesc)

This function destroys a previously created pooling descriptor object.

Parameters

poolingDesc – Pooling descriptor to be destroyed.

Returns

TOPSDNN_STATUS_SUCCESS The object was destroyed successfully.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnGetPooling2dForwardOutputDim (const topsdnnPoolingDescriptor_t poolingDesc, const topsdnnTensorDescriptor_t inputDesc, int *outN, int *outC, int *outH, int *outW)

This function provides the output dimensions of a tensor after 2d pooling has been applied.

Parameters
  • poolingDesc – Input. Handle to a previously initialized pooling descriptor.

  • inputDesc – Input. Handle to the previously initialized input tensor descriptor.

  • outN – Output. Number of images in the output.

  • outC – Output. Number of channels in the output.

  • outH – Output. Height of images in the output.

  • outW – Output. Width of images in the output.

Returns

TOPSDNN_STATUS_SUCCESS The function launched successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM At least one of the following conditions is met: poolingDesc has not been initialized. poolingDesc or inputDesc has an invalid number of dimensions (2 and 4 respectively are required). At least one of the parameters outN, outC, outH, outW is invalid.
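The spatial output sizes are conventionally obtained with a floor-division formula, shown below in plain C. This reference does not state the rounding rule topsDNN uses, so the formula is an assumption, and `pooled_extent` is a hypothetical helper. The n and c extents are expected to pass through unchanged, since topsdnnPoolingForward() rejects input and output tensors whose n or c differ.

```c
#include <assert.h>

/* Output extent along one spatial axis:
   1 + (inputExtent + 2*padding - window) / stride, rounded down. */
int pooled_extent(int inputExtent, int padding, int window, int stride) {
    assert(window > 0 && stride > 0);
    assert(inputExtent + 2 * padding >= window);
    return 1 + (inputExtent + 2 * padding - window) / stride;
}
```

For example, a 224-wide input with a window of 2, stride 2, and no padding gives 112, and a 7-wide input with a window of 3, stride 2, and padding 1 gives 4.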

topsdnnStatus_t TOPSDNN_EXPORT topsdnnGetPoolingNdForwardOutputDim (const topsdnnPoolingDescriptor_t poolingDesc, const topsdnnTensorDescriptor_t inputDesc, int nbDims, int outDimA[])

This function provides the output dimensions of a tensor after Nd pooling has been applied.

Parameters
  • poolingDesc – Input. Handle to a previously initialized pooling descriptor.

  • inputDesc – Input. Handle to the previously initialized input tensor descriptor.

  • nbDims – Input. Number of dimensions in which pooling is to be applied.

  • outDimA – Output. Array of nbDims output dimensions.

Returns

TOPSDNN_STATUS_SUCCESS The function launched successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM At least one of the following conditions is met: poolingDesc has not been initialized. The value of nbDims is inconsistent with the dimensionality of poolingDesc and inputDesc. outDimA is invalid.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnPoolingForward (topsdnnHandle_t handle, const topsdnnPoolingDescriptor_t poolingDesc, const void *alpha, const topsdnnTensorDescriptor_t xDesc, const void *x, const void *beta, const topsdnnTensorDescriptor_t yDesc, void *y)

This function computes pooling of input values (for example, the maximum or the average of several adjacent values, depending on the pooling mode) to produce an output with smaller height and/or width.

Parameters
  • handle – Input. Handle to a previously created topsDNN context.

  • poolingDesc – Input. Pooling descriptor.

  • alpha, beta – Input. Pointers to scaling factors (in host memory) used to blend the computation result with prior value in the output layer as follows: dstValue = alpha[0]*resultValue + beta[0]*priorDstValue

  • xDesc – Input. Handle to the previously initialized input tensor descriptor. Must be of type TOPSDNN_DATA_FLOAT / TOPSDNN_DATA_HALF / TOPSDNN_DATA_INT8.

  • x – Input. Data pointer to device memory associated with the tensor descriptor xDesc.

  • yDesc – Input. Handle to the previously initialized output tensor descriptor. Must be of type TOPSDNN_DATA_FLOAT / TOPSDNN_DATA_HALF / TOPSDNN_DATA_INT8.

  • y – Output. Data pointer to device memory associated with the output tensor descriptor yDesc.

Returns

TOPSDNN_STATUS_SUCCESS The function launched successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM At least one of the following conditions is met: The dimensions n, c of the input tensor and output tensors differ. The datatype of the input tensor and output tensors differs.

Returns

TOPSDNN_STATUS_EXECUTION_FAILED The function failed to launch on the device.

Returns

TOPSDNN_STATUS_NOT_SUPPORTED The function does not support the provided configuration.
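
The alpha/beta blending described above can be sketched on the host as follows. This is a CPU reference for a single H x W plane with no padding, not the topsDNN implementation; max_pool2d_blend is a hypothetical name, and skipping the read of y when beta[0] is zero is a choice of this sketch rather than documented behavior.

```c
#include <assert.h>

/* CPU reference sketch (not the topsDNN implementation): 2D max pooling on
 * one H x W plane with no padding, followed by the documented blend
 *   dstValue = alpha[0]*resultValue + beta[0]*priorDstValue. */
void max_pool2d_blend(const float *alpha, const float *x, int h, int w,
                      int window, int stride, const float *beta, float *y)
{
    assert(alpha && beta && x && y);
    assert(window > 0 && stride > 0 && h >= window && w >= window);
    int outH = 1 + (h - window) / stride;
    int outW = 1 + (w - window) / stride;
    for (int oy = 0; oy < outH; ++oy) {
        for (int ox = 0; ox < outW; ++ox) {
            /* Maximum over the window anchored at (oy*stride, ox*stride). */
            float best = x[(oy * stride) * w + ox * stride];
            for (int ky = 0; ky < window; ++ky) {
                for (int kx = 0; kx < window; ++kx) {
                    float v = x[(oy * stride + ky) * w + (ox * stride + kx)];
                    if (v > best) best = v;
                }
            }
            float *dst = &y[oy * outW + ox];
            /* Sketch choice: do not read the prior value when beta is 0. */
            float prior = (beta[0] == 0.0f) ? 0.0f : beta[0] * (*dst);
            *dst = alpha[0] * best + prior;
        }
    }
}
```

With alpha[0] = 1 and beta[0] = 0 the output is the plain pooled result; other values accumulate into whatever y already holds.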

topsdnnStatus_t TOPSDNN_EXPORT topsdnnSoftmaxForward (topsdnnHandle_t handle, topsdnnSoftmaxAlgorithm_t algorithm, topsdnnSoftmaxMode_t mode, const void *alpha, const topsdnnTensorDescriptor_t xDesc, const void *x, const void *beta, const topsdnnTensorDescriptor_t yDesc, void *y)

This routine computes the softmax function.

Parameters
  • handle – Input. Handle to a previously created TopsDNN context.

  • algorithm – Input. Enumerant to specify the softmax algorithm.

  • mode – Input. Enumerant to specify the softmax mode.

  • alpha, beta – Input. Pointers to scaling factors (in host memory) used to blend the computation result with prior value in the output layer as follows: dstValue = alpha[0]*resultValue + beta[0]*priorDstValue

  • xDesc – Input. Handle to the previously initialized input tensor descriptor.

  • x – Input. Data pointer to device memory associated with the tensor descriptor xDesc.

  • yDesc – Input. Handle to the previously initialized output tensor descriptor.

  • y – Output. Data pointer to device memory associated with the output tensor descriptor yDesc.

Returns

TOPSDNN_STATUS_SUCCESS The function launched successfully.

Returns

TOPSDNN_STATUS_NOT_SUPPORTED The function does not support the provided configuration.

Returns

TOPSDNN_STATUS_BAD_PARAM At least one of the following conditions is met: The dimensions n, c, h, w of the input tensor and output tensors differ. The datatype of the input tensor and output tensors differs. The parameters algorithm or mode have an invalid enumerant value.

Returns

TOPSDNN_STATUS_EXECUTION_FAILED The function failed to launch on the device.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnCreateReduceTensorDescriptor (topsdnnReduceTensorDescriptor_t *reduceTensorDesc)

This function creates a reduce tensor descriptor object by allocating the memory needed to hold its opaque structure.

Parameters

reduceTensorDesc – Output. Reduce tensor descriptor.

Returns

TOPSDNN_STATUS_SUCCESS The object was created successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM reduceTensorDesc is a NULL pointer.

Returns

TOPSDNN_STATUS_ALLOC_FAILED The resources could not be allocated.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnSetReduceTensorDescriptor (topsdnnReduceTensorDescriptor_t reduceTensorDesc, topsdnnReduceTensorOp_t reduceTensorOp, topsdnnDataType_t reduceTensorCompType, topsdnnNanPropagation_t reduceTensorNanOpt, topsdnnReduceTensorIndices_t reduceTensorIndices, topsdnnIndicesType_t reduceTensorIndicesType)

This function initializes a previously created reduce tensor descriptor object.

Parameters
  • reduceTensorDesc – Input/Output. Handle to a previously created reduce tensor descriptor.

  • reduceTensorOp – Input. Enumerant to specify the reduce tensor operation.

  • reduceTensorCompType – Input. Enumerant to specify the computation datatype of the reduction.

  • reduceTensorNanOpt – Input. Enumerant to specify the Nan propagation mode.

  • reduceTensorIndices – Input. Enumerant to specify the reduced tensor indices.

  • reduceTensorIndicesType – Input. Enumerant to specify the reduce tensor indices type.

Returns

TOPSDNN_STATUS_SUCCESS The object was set successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM reduceTensorDesc is NULL, or reduceTensorOp, reduceTensorCompType, reduceTensorNanOpt, reduceTensorIndices or reduceTensorIndicesType has an invalid enumerant value.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnGetReduceTensorDescriptor (const topsdnnReduceTensorDescriptor_t reduceTensorDesc, topsdnnReduceTensorOp_t *reduceTensorOp, topsdnnDataType_t *reduceTensorCompType, topsdnnNanPropagation_t *reduceTensorNanOpt, topsdnnReduceTensorIndices_t *reduceTensorIndices, topsdnnIndicesType_t *reduceTensorIndicesType)

This function queries a previously initialized reduce tensor descriptor object.

Parameters
  • reduceTensorDesc – Input. Pointer to a previously initialized reduce tensor descriptor object.

  • reduceTensorOp – Output. Enumerant to specify the reduce tensor operation.

  • reduceTensorCompType – Output. Enumerant to specify the computation datatype of the reduction.

  • reduceTensorNanOpt – Output. Enumerant to specify the Nan propagation mode.

  • reduceTensorIndices – Output. Enumerant to specify the reduced tensor indices.

  • reduceTensorIndicesType – Output. Enumerant to specify the reduce tensor indices type.

Returns

TOPSDNN_STATUS_SUCCESS The object was queried successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM reduceTensorDesc is NULL.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnDestroyReduceTensorDescriptor (topsdnnReduceTensorDescriptor_t tensorDesc)

This function destroys a previously created reduce tensor descriptor object. When the input pointer is NULL, this function performs no destroy operation.

Parameters

tensorDesc – Input. Pointer to the reduce tensor descriptor object to be destroyed.

Returns

TOPSDNN_STATUS_SUCCESS The object was destroyed successfully.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnGetReductionIndicesSize (topsdnnHandle_t handle, const topsdnnReduceTensorDescriptor_t reduceDesc, const topsdnnTensorDescriptor_t aDesc, const topsdnnTensorDescriptor_t cDesc, size_t *sizeInBytes)

This is a helper function to return the minimum size of the index space to be passed to the reduction given the input and output tensors.

Parameters
  • handle – Input. Handle to a previously created TopsDNN library descriptor.

  • reduceDesc – Input. Pointer to a previously initialized reduce tensor descriptor object.

  • aDesc – Input. Pointer to the input tensor descriptor.

  • cDesc – Input. Pointer to the output tensor descriptor.

  • sizeInBytes – Output. Minimum size of the index space to be passed to the reduction.

Returns

TOPSDNN_STATUS_SUCCESS The index space size is returned successfully.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnGetReductionWorkspaceSize (topsdnnHandle_t handle, const topsdnnReduceTensorDescriptor_t reduceDesc, const topsdnnTensorDescriptor_t aDesc, const topsdnnTensorDescriptor_t cDesc, size_t *sizeInBytes)

This is a helper function to return the minimum size of the workspace to be passed to the reduction given the input and output tensors.

Parameters
  • handle – Input. Handle to a previously created TopsDNN library descriptor.

  • reduceDesc – Input. Pointer to a previously initialized reduce tensor descriptor object.

  • aDesc – Input. Pointer to the input tensor descriptor.

  • cDesc – Input. Pointer to the output tensor descriptor.

  • sizeInBytes – Output. Minimum size of the workspace to be passed to the reduction.

Returns

TOPSDNN_STATUS_SUCCESS The workspace size is returned successfully.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnReduceTensor (topsdnnHandle_t handle, const topsdnnReduceTensorDescriptor_t reduceTensorDesc, void *indices, size_t indicesSizeInBytes, void *workspace, size_t workspaceSizeInBytes, const void *alpha, const topsdnnTensorDescriptor_t aDesc, const void *a, const void *beta, const topsdnnTensorDescriptor_t cDesc, void *c)

This function reduces tensor A by implementing the equation C = alpha * reduce op ( A ) + beta * C, given tensors A and C and scaling factors alpha and beta. The reduction op to use is indicated by the descriptor reduceTensorDesc. All ops are listed by the topsdnnReduceTensorOp_t enum. Each dimension of the output tensor C must match the corresponding dimension of the input tensor A or must be equal to 1. The dimensions equal to 1 indicate the dimensions of A to be reduced. The implementation will generate indices for the min and max ops only, as indicated by the topsdnnReduceTensorIndices_t enum of the reduceTensorDesc. Requesting indices for the other reduction ops results in an error. The data type of the indices is indicated by the topsdnnIndicesType_t enum; currently only the 32-bit (unsigned int) type is supported. The indices returned by the implementation are not absolute indices but relative to the dimensions being reduced. The indices are also flattened, meaning they are not coordinate tuples. The HALF and INT8 data types may be mixed with the FLOAT data types. In these cases, the computation enum of reduceTensorDesc is required to be of type FLOAT.

Parameters
  • handle – Input. Handle to a previously created TopsDNN context.

  • reduceTensorDesc – Input. Handle to a previously initialized reduce tensor descriptor.

  • indices – Output. Handle to a previously allocated space for writing indices.

  • indicesSizeInBytes – Input. Size of the above previously allocated space.

  • workspace – Input. Handle to a previously allocated space for the reduction implementation.

  • workspaceSizeInBytes – Input. Size of the above previously allocated space.

  • alpha, beta – Input. Pointers to scaling factors (in host memory) used to blend the computation result with prior value in the output layer as follows: dstValue = alpha[0]*resultValue + beta[0]*priorDstValue

  • aDesc, cDesc – Input. Handle to a previously initialized tensor descriptor.

  • a – Input. Pointer to data of the tensor described by the aDesc descriptor.

  • c – Input/Output. Pointer to data of the tensor described by the cDesc descriptor.

Returns

TOPSDNN_STATUS_SUCCESS The function executed successfully.

Returns

TOPSDNN_STATUS_NOT_SUPPORTED The function does not support the provided configuration. See the following for some examples of non-supported configurations: The dimensions of the input tensor and the output tensor are above 8. reduceTensorCompType is not set as stated above.

Returns

TOPSDNN_STATUS_BAD_PARAM The corresponding dimensions of the input and output tensors all match, or the conditions in the above paragraphs are unmet.

Returns

TOPSDNN_STATUS_INVALID_VALUE The allocations for the indices or workspace are insufficient.

Returns

TOPSDNN_STATUS_EXECUTION_FAILED The function failed to launch on the device.
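
To make the "relative, flattened" index convention concrete, here is a minimal host-side sketch of a MAX reduction. It is a CPU reference under stated assumptions (row-major storage, a 3D tensor, reduction over the last two dimensions), not the topsDNN implementation, and reduce_max_3d is a hypothetical name.

```c
#include <assert.h>

/* CPU reference sketch (not the topsDNN implementation): MAX reduction of a
 * row-major tensor A of shape [d0, d1, d2] into C of shape [d0, 1, 1].
 * The dimensions of C equal to 1 (here d1 and d2) are the ones reduced.
 * indices[i] is flattened over the reduced dimensions only (j*d2 + k),
 * so it is relative to the reduced slice, not an absolute offset into A. */
void reduce_max_3d(const float *alpha, const float *a, int d0, int d1, int d2,
                   const float *beta, float *c, unsigned int *indices)
{
    assert(alpha && beta && a && c && indices);
    assert(d0 > 0 && d1 > 0 && d2 > 0);
    for (int i = 0; i < d0; ++i) {
        const float *slice = a + i * d1 * d2;
        float best = slice[0];
        unsigned int bestIdx = 0;
        for (int j = 0; j < d1; ++j) {
            for (int k = 0; k < d2; ++k) {
                unsigned int flat = (unsigned int)(j * d2 + k);
                if (slice[flat] > best) {
                    best = slice[flat];
                    bestIdx = flat;
                }
            }
        }
        /* C = alpha * reduce op (A) + beta * C */
        c[i] = alpha[0] * best + beta[0] * c[i];
        indices[i] = bestIdx;
    }
}
```

For A of shape [2, 2, 3], a maximum found at coordinates (j = 1, k = 1) of the second slice is reported as index 4, not as the absolute offset 10 and not as the tuple (1, 1).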

topsdnnStatus_t TOPSDNN_EXPORT topsdnnCreateOpTensorDescriptor (topsdnnOpTensorDescriptor_t *opTensorDesc)

This function creates an opTensor descriptor object by allocating the memory needed to hold its opaque structure.

Parameters

opTensorDesc – opTensor description.

Returns

TOPSDNN_STATUS_SUCCESS The object was created successfully.

Returns

TOPSDNN_STATUS_ALLOC_FAILED The resources could not be allocated.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnSetOpTensorDescriptor (topsdnnOpTensorDescriptor_t opTensorDesc, topsdnnOpTensorOp_t opTensorOp, topsdnnDataType_t opTensorCompType, topsdnnNanPropagation_t opTensorNanOpt)

This function initializes a previously created generic opTensor descriptor object.

Parameters
  • opTensorDesc – opTensor description.

  • opTensorOp – Enumerant to specify the tensor pointwise math operation.

  • opTensorCompType – Enumerant to specify the computation datatype for this tensor pointwise math operation.

  • opTensorNanOpt – Enumerant to specify the NAN propagation policy.

Returns

TOPSDNN_STATUS_SUCCESS The object was set successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM At least one of the following conditions is met: Invalid (NULL) opTensorDesc supplied. opTensorOp or opTensorCompType or opTensorNanOpt has an invalid enumerant value.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnGetOpTensorDescriptor (const topsdnnOpTensorDescriptor_t opTensorDesc, topsdnnOpTensorOp_t *opTensorOp, topsdnnDataType_t *opTensorCompType, topsdnnNanPropagation_t *opTensorNanOpt)

This function queries a previously initialized generic opTensor descriptor object.

Parameters
  • opTensorDesc – opTensor description.

  • opTensorOp – Enumerant to specify the tensor pointwise math operation.

  • opTensorCompType – Enumerant to specify the computation datatype for this tensor pointwise math operation.

  • opTensorNanOpt – Enumerant to specify the NAN propagation policy.

Returns

TOPSDNN_STATUS_SUCCESS The object was queried successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM An invalid (NULL) input pointer was supplied.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnDestroyOpTensorDescriptor (topsdnnOpTensorDescriptor_t opTensorDesc)

This function destroys a previously created opTensor descriptor object.

Parameters

opTensorDesc – opTensor description.

Returns

TOPSDNN_STATUS_SUCCESS The descriptor was destroyed successfully.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnOpTensor (topsdnnHandle_t handle, topsdnnOpTensorDescriptor_t opTensorDesc, const void *alpha1, const topsdnnTensorDescriptor_t aDesc, const void *A, const void *alpha2, const topsdnnTensorDescriptor_t bDesc, const void *B, const void *beta, const topsdnnTensorDescriptor_t cDesc, void *C)

This function implements the equation C = op(alpha1[0] * A, alpha2[0] * B) + beta[0] * C, given the tensors A, B, and C and the scaling factors alpha1, alpha2, and beta. Currently-supported ops are listed by the topsdnnOpTensorOp_t enum.

Parameters
  • handle – Handle to a previously created topsDNN context.

  • opTensorDesc – opTensor description.

  • alpha1 – Pointer to the scaling factor for the tensor pointed to by A.

  • aDesc – Handle to the previously initialized input tensor descriptor.

  • A – Data pointer to device memory associated with the input tensor descriptor aDesc.

  • alpha2 – Pointer to the scaling factor for the tensor pointed to by B.

  • bDesc – Handle to the previously initialized input tensor descriptor.

  • B – Data pointer to device memory associated with the input tensor descriptor bDesc.

  • beta – Pointer to the scaling factor for the tensor pointed to by C.

  • cDesc – Handle to the previously initialized input/output tensor descriptor.

  • C – Data pointer to device memory associated with the input/output tensor descriptor cDesc.

Returns

TOPSDNN_STATUS_SUCCESS The function launched successfully.

Returns

TOPSDNN_STATUS_NOT_SUPPORTED Unsupported broadcast type or convert type.

Returns

TOPSDNN_STATUS_BAD_PARAM At least one of the following conditions is met: Invalid (NULL) input pointer supplied. Invalid topsdnnTensorDescriptor supplied. opTensorOp or opTensorCompType or opTensorNanOpt has an invalid enumerant value.

Returns

TOPSDNN_STATUS_EXECUTION_FAILED The function failed to launch on the device.
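
The equation C = op(alpha1[0] * A, alpha2[0] * B) + beta[0] * C can be sketched element by element on the host. The op selector below is a hypothetical local enum; the actual operations are those of topsdnnOpTensorOp_t, which are not reproduced here. The sketch assumes same-shaped float tensors with no broadcasting and is not the topsDNN implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical op selector for this sketch only; the real values are those
 * of topsdnnOpTensorOp_t. */
typedef enum {
    SKETCH_OP_ADD,
    SKETCH_OP_MUL,
    SKETCH_OP_MIN,
    SKETCH_OP_MAX
} sketch_op_t;

/* C = op(alpha1[0]*A, alpha2[0]*B) + beta[0]*C over n same-shaped elements
 * (no broadcasting). */
void op_tensor_ref(sketch_op_t op, const float *alpha1, const float *A,
                   const float *alpha2, const float *B,
                   const float *beta, float *C, size_t n)
{
    assert(alpha1 && alpha2 && beta && A && B && C);
    for (size_t i = 0; i < n; ++i) {
        float lhs = alpha1[0] * A[i];
        float rhs = alpha2[0] * B[i];
        float r = 0.0f;
        switch (op) {
        case SKETCH_OP_ADD: r = lhs + rhs; break;
        case SKETCH_OP_MUL: r = lhs * rhs; break;
        case SKETCH_OP_MIN: r = lhs < rhs ? lhs : rhs; break;
        case SKETCH_OP_MAX: r = lhs > rhs ? lhs : rhs; break;
        }
        C[i] = r + beta[0] * C[i];
    }
}
```

Note that the scaling factors are applied to A and B before the operation, while beta scales the prior contents of C afterwards.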

topsdnnStatus_t TOPSDNN_EXPORT topsdnnAddTensor (topsdnnHandle_t handle, const void *alpha, const topsdnnTensorDescriptor_t aDesc, const void *A, const void *beta, const topsdnnTensorDescriptor_t cDesc, void *C)

This function adds the scaled values of a bias tensor to another tensor.

Parameters
  • handle – Handle to a previously created topsDNN context.

  • alpha – Pointer to the scaling factor for the tensor pointed to by A.

  • aDesc – Handle to the previously initialized input tensor descriptor.

  • A – Data pointer to device memory associated with the input tensor descriptor aDesc.

  • beta – Pointer to the scaling factor for the tensor pointed to by C.

  • cDesc – Handle to the previously initialized input/output tensor descriptor.

  • C – Data pointer to device memory associated with the input/output tensor descriptor cDesc.

Returns

TOPSDNN_STATUS_SUCCESS The query was successful.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnScaleTensor (topsdnnHandle_t handle, const topsdnnTensorDescriptor_t yDesc, void *y, const void *alpha)

This function scales all the elements of a tensor by a given factor.

Parameters
  • handle – Handle to a previously created topsDNN context.

  • yDesc – Handle to the previously initialized input tensor descriptor.

  • y – Data pointer to device memory associated with the output tensor descriptor yDesc.

  • alpha – Pointer to the scaling factor (in host memory).

Returns

TOPSDNN_STATUS_SUCCESS

topsdnnStatus_t TOPSDNN_EXPORT topsdnnDeriveBNTensorDescriptor (topsdnnTensorDescriptor_t derivedBnDesc, const topsdnnTensorDescriptor_t xDesc, topsdnnBatchNormMode_t mode)

This function derives a secondary tensor descriptor for the batch normalization scale, invVariance, bnBias, and bnScale subtensors from the layer’s x data descriptor. Use the tensor descriptor produced by this function as the bnScaleBiasMeanVarDesc parameter for the topsdnnBatchNormalizationForwardInference() and topsdnnBatchNormalizationForwardTraining() functions, and as the bnScaleBiasDiffDesc parameter in the topsdnnBatchNormalizationBackward() function.

The resulting dimensions will be:
  • 1xCx1x1 for 4D and 1xCx1x1x1 for 5D for BATCHNORM_MODE_SPATIAL mode.

  • 1xCxHxW for 4D and 1xCxDxHxW for 5D for BATCHNORM_MODE_PER_ACTIVATION mode.

For HALF input data type the resulting tensor descriptor will have a FLOAT type. For other data types, it will have the same type as the input data.

Note: Only 4D tensors are supported. The derivedBnDesc should be first created using topsdnnCreateTensorDescriptor(). xDesc is the descriptor for the layer’s x data and has to be set up with proper dimensions prior to calling this function.

Parameters
  • derivedBnDesc – Output. Handle to a previously created tensor descriptor.

  • xDesc – Input. Handle to a previously created and initialized layer’s x data descriptor.

  • mode – Input. Batch normalization layer mode of operation (only BATCHNORM_MODE_SPATIAL is currently supported).

Returns

TOPSDNN_STATUS_SUCCESS The computation was performed successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM Invalid Batch Normalization mode.
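
The shape and type rules stated for this function can be expressed as a small host-side sketch for the 4D case. The enums are hypothetical stand-ins for topsdnnBatchNormMode_t and topsdnnDataType_t, and derive_bn_dims_4d is an illustrative helper, not a topsDNN function.

```c
#include <assert.h>

/* Hypothetical stand-ins for topsdnnBatchNormMode_t and topsdnnDataType_t;
 * the names are local to this sketch. */
typedef enum { SKETCH_BN_SPATIAL, SKETCH_BN_PER_ACTIVATION } sketch_bn_mode_t;
typedef enum { SKETCH_FLOAT, SKETCH_HALF, SKETCH_DOUBLE } sketch_dtype_t;

/* Derive the 4D scale/bias/mean/variance shape from an N x C x H x W input:
 * 1xCx1x1 for spatial mode, 1xCxHxW for per-activation mode. */
void derive_bn_dims_4d(const int xDim[4], sketch_dtype_t xType,
                       sketch_bn_mode_t mode,
                       int bnDim[4], sketch_dtype_t *bnType)
{
    assert(xDim && bnDim && bnType);
    bnDim[0] = 1;
    bnDim[1] = xDim[1];
    bnDim[2] = (mode == SKETCH_BN_SPATIAL) ? 1 : xDim[2];
    bnDim[3] = (mode == SKETCH_BN_SPATIAL) ? 1 : xDim[3];
    /* HALF inputs get FLOAT parameters; other types are kept as-is. */
    *bnType = (xType == SKETCH_HALF) ? SKETCH_FLOAT : xType;
}
```

For example, an 8x64x56x56 HALF input in spatial mode maps to a 1x64x1x1 FLOAT descriptor.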

topsdnnStatus_t TOPSDNN_EXPORT topsdnnBatchNormalizationForwardInference (topsdnnHandle_t handle, topsdnnBatchNormMode_t mode, const void *alpha, const void *beta, const topsdnnTensorDescriptor_t xDesc, const void *x, const topsdnnTensorDescriptor_t yDesc, void *y, const topsdnnTensorDescriptor_t bnScaleBiasMeanVarDesc, const void *bnScale, const void *bnBias, const void *estimatedMean, const void *estimatedVariance, double epsilon)

This function performs the forward batch normalization layer computation for the inference phase. This layer is based on the paper Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, S. Ioffe, C. Szegedy, 2015.

Parameters
  • handle – Input. Handle to a previously created TopsDNN library descriptor. For more information, refer to topsdnnHandle_t.

  • mode – Input. Mode of operation (spatial or per-activation). For more information, refer to topsdnnBatchNormMode_t.

  • alpha, beta – Inputs. Pointers to scaling factors (in host memory) used to blend the layer output value with prior value in the destination tensor as follows: dstValue = alpha[0]*srcValue + beta[0]*priorDstValue. Currently, only (alpha == 1 and beta == 0) is supported.

  • xDesc, yDesc – Input. Handles to the previously initialized tensor descriptors.

  • x – Input. Data pointer to device memory associated with the tensor descriptor xDesc, for the layer’s x input data. Currently only NHWC format is supported.

  • y – Input/Output. Data pointer to device memory associated with the tensor descriptor yDesc, for the y output of the batch normalization layer. Currently only NHWC format is supported.

  • bnScaleBiasMeanVarDesc, bnScale, bnBias – Inputs. Tensor descriptors and pointers in device memory for the batch normalization scale and bias parameters (in the original paper bias is referred to as beta and scale as gamma).

  • estimatedMean, estimatedVariance – Inputs. Mean and variance tensors (these have the same descriptor as the bias and scale). The resultRunningMean and resultRunningVariance, accumulated during the training phase from the topsdnnBatchNormalizationForwardTraining() call, should be passed as inputs here.

  • epsilon – Input. Epsilon value used in the batch normalization formula. Its value should be convertible to FLOAT and equal to or greater than the value defined for TOPSDNN_BN_MIN_EPSILON in topsdnn.h.

Returns

TOPSDNN_STATUS_SUCCESS The computation was performed successfully.

Returns

TOPSDNN_STATUS_NOT_SUPPORTED The function does not support the provided configuration.

Returns

TOPSDNN_STATUS_BAD_PARAM At least one of the following conditions is met: One of the pointers alpha, beta, x, y, bnScale, bnBias, estimatedMean, estimatedVariance is NULL. The number of xDesc or yDesc tensor descriptor dimensions is not within the range of 4 to 5. bnScaleBiasMeanVarDesc dimensions are not 1xCx1x1 for 4D and 1xCx1x1x1 for 5D for spatial mode, and are not 1xCxHxW for 4D and 1xCxDxHxW for 5D for per-activation mode. epsilon value is less than TOPSDNN_BN_MIN_EPSILON. Dimensions or data types mismatch for xDesc, yDesc.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnDropoutGetStatesSize (topsdnnHandle_t handle, size_t *sizeInBytes)

This function is used to query the amount of space required to store the states of the random number generators used by the topsdnnDropoutForward() function.

Parameters
  • handle – Input. Handle to a previously created topsDNN context.

  • sizeInBytes – Output. Amount of device memory needed to store random generator states.

Returns

TOPSDNN_STATUS_SUCCESS The query was successful.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnDropoutGetReserveSpaceSize (topsdnnTensorDescriptor_t xDesc, size_t *sizeInBytes)

This function is used to query the amount of reserve needed to run dropout with the input dimensions given by xDesc.

Parameters
  • xDesc – Input. Handle to a previously initialized tensor descriptor, describing input to a dropout operation.

  • sizeInBytes – Output. Amount of device memory needed as reserve space to be able to run dropout with an input tensor descriptor specified by xDesc.

Returns

TOPSDNN_STATUS_SUCCESS The query was successful.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnCreateDropoutDescriptor (topsdnnDropoutDescriptor_t *dropoutDesc)

This function creates a generic dropout descriptor object by allocating the memory needed to hold its opaque structure.

Returns

TOPSDNN_STATUS_SUCCESS The object was created successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM An input parameter was out of the valid range.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnDestroyDropoutDescriptor (topsdnnDropoutDescriptor_t dropoutDesc)

This function destroys a previously created dropout descriptor object.

Returns

TOPSDNN_STATUS_SUCCESS The object was destroyed successfully.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnGetDropoutDescriptor (topsdnnDropoutDescriptor_t dropoutDesc, topsdnnHandle_t handle, float *dropout, void **states, unsigned long long *seed)

This function queries the fields of a previously initialized dropout descriptor.

Parameters
  • dropoutDesc – Input. Previously initialized dropout descriptor.

  • handle – Input. Handle to a previously created topsDNN context.

  • dropout – Output. The probability with which the value from input is set to 0 during the dropout layer.

  • states – Output. Pointer to user-allocated device memory that holds random number generator states.

  • seed – Output. Seed used to initialize random number generator states.

Returns

TOPSDNN_STATUS_SUCCESS The call was successful.

Returns

TOPSDNN_STATUS_BAD_PARAM An input parameter was out of the valid range.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnSetDropoutDescriptor (topsdnnDropoutDescriptor_t dropoutDesc, topsdnnHandle_t handle, float dropout, void *states, size_t stateSizeInBytes, unsigned long long seed)

This function initializes a previously created dropout descriptor.

Parameters
  • dropoutDesc – Input/Output. Previously created dropout descriptor object.

  • handle – Input. Handle to a previously created topsDNN context.

  • dropout – Input. The probability with which the value from input is set to zero during the dropout layer.

  • states – Output. Pointer to user-allocated device memory that will hold random number generator states.

  • stateSizeInBytes – Input. Specifies the size in bytes of the provided memory for the states.

  • seed – Input. Seed used to initialize random number generator states.

Returns

TOPSDNN_STATUS_SUCCESS The call was successful.

Returns

TOPSDNN_STATUS_BAD_PARAM An input parameter was out of the valid range.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnRestoreDropoutDescriptor (topsdnnDropoutDescriptor_t dropoutDesc, topsdnnHandle_t handle, float dropout, void *states, size_t stateSizeInBytes, unsigned long long seed)

This function restores a dropout descriptor to a previously saved-off state.

Parameters
  • dropoutDesc – Input/Output. Previously created dropout descriptor object.

  • handle – Input. Handle to a previously created topsDNN context.

  • dropout – Input. The probability with which the value from input is set to zero during the dropout layer.

  • states – Input. Pointer to device memory that holds random number generator states.

  • stateSizeInBytes – Input. Size in bytes of buffer holding random number generator states.

  • seed – Input. Seed used in prior calls to topsdnnSetDropoutDescriptor()

Returns

TOPSDNN_STATUS_SUCCESS The call was successful.

Returns

TOPSDNN_STATUS_BAD_PARAM An input parameter was out of the valid range.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnDropoutForward (topsdnnHandle_t handle, const topsdnnDropoutDescriptor_t dropoutDesc, const topsdnnTensorDescriptor_t xDesc, const void *x, const topsdnnTensorDescriptor_t yDesc, void *y, void *reserveSpace, size_t reserveSpaceSizeInBytes)

This function performs forward dropout operation over x returning results in y.

Parameters
  • handle – Input. Handle to a previously created topsDNN context.

  • dropoutDesc – Input. Previously created dropout descriptor object.

  • xDesc – Input. Handle to a previously initialized tensor descriptor.

  • x – Input. Pointer to data of the tensor described by the xDesc descriptor.

  • yDesc – Input. Handle to a previously initialized tensor descriptor.

  • y – Output. Pointer to data of the tensor described by the yDesc descriptor.

  • reserveSpace – Output. Pointer to user-allocated device memory used by this function.

  • reserveSpaceSizeInBytes – Input. Specifies the size in bytes of the provided memory for the reserve space.

Returns

TOPSDNN_STATUS_SUCCESS The call was successful.

Returns

TOPSDNN_STATUS_BAD_PARAM An input parameter was out of the valid range.
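
A minimal host-side sketch of what a forward dropout pass does is shown below. It is not the topsDNN implementation: the linear congruential generator stands in for the device random number generator states, the byte mask stands in for the reserve space, and scaling the kept values by 1/(1 - dropout) is an assumption taken from common dropout implementations rather than from this reference. All function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Small linear congruential generator standing in for the device RNG
 * states; purely illustrative. */
static uint32_t lcg_next(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return *state;
}

/* CPU reference sketch (not the topsDNN implementation): zero each element
 * with probability `dropout` and scale survivors by 1/(1 - dropout)
 * (assumed convention). mask plays the role of the reserve space:
 * 1 = kept, 0 = dropped. */
void dropout_forward_ref(float dropout, unsigned long long seed,
                         const float *x, float *y, unsigned char *mask,
                         size_t n)
{
    assert(x && y && mask);
    assert(dropout >= 0.0f && dropout < 1.0f);
    uint32_t state = (uint32_t)seed;
    float scale = 1.0f / (1.0f - dropout);
    for (size_t i = 0; i < n; ++i) {
        /* Uniform value in [0, 1) built from the top 24 bits. */
        float u = (float)(lcg_next(&state) >> 8) / 16777216.0f;
        mask[i] = (u >= dropout) ? 1 : 0;
        y[i] = mask[i] ? x[i] * scale : 0.0f;
    }
}
```

Because the generator is seeded explicitly, calling the sketch twice with the same seed reproduces the same mask, which mirrors the purpose of saving and restoring the seed and states in the descriptor functions above.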

topsdnnStatus_t TOPSDNN_EXPORT topsdnnTransformTensor (topsdnnHandle_t handle, const void *alpha, const topsdnnTensorDescriptor_t xDesc, const void *x, const void *beta, const topsdnnTensorDescriptor_t yDesc, void *y)

This function copies the scaled data from one tensor to another tensor with a different layout. Those descriptors need to have the same dimensions but not necessarily the same strides. The input and output tensors must not overlap in any way (meaning, tensors cannot be transformed in place). This function can be used to convert a tensor with an unsupported format to a supported one.

Parameters
  • handle – Input. Handle to a previously created TopsDNN context.

  • alpha, beta – Input. Pointers to scaling factors (in host memory) used to blend the source value with prior value in the destination tensor as follows: dstValue = alpha[0]*srcValue + beta[0]*priorDstValue. Currently, only alpha[0] == 1 && beta[0] == 0 is supported.

  • xDesc – Input. Handle to a previously initialized tensor descriptor. Currently only NHWC or NCHW is supported.

  • x – Input. Pointer to data of the tensor described by the xDesc descriptor.

  • yDesc – Input. Handle to a previously initialized tensor descriptor. Currently only NHWC or NCHW is supported.

  • y – Output. Pointer to data of the tensor described by the yDesc descriptor.

Returns

TOPSDNN_STATUS_SUCCESS The function launched successfully.

Returns

TOPSDNN_STATUS_NOT_SUPPORTED The function does not support the provided configuration.

Returns

TOPSDNN_STATUS_BAD_PARAM The dimensions n, c, h, w or the dataType of the two tensor descriptors are different.

Returns

TOPSDNN_STATUS_EXECUTION_FAILED The function failed to launch on the GCU.
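
As an illustration of a layout transform, the sketch below copies an NCHW tensor into a separate NHWC buffer on the host. It corresponds to alpha[0] == 1 and beta[0] == 0, assumes densely packed float data, and is not the topsDNN implementation; nchw_to_nhwc is a hypothetical name.

```c
#include <assert.h>

/* CPU reference sketch (not the topsDNN implementation): copy a packed
 * NCHW tensor into a separate packed NHWC buffer. The buffers must not
 * overlap, matching the no-in-place requirement stated above. */
void nchw_to_nhwc(const float *x, int n, int c, int h, int w, float *y)
{
    assert(x && y && x != y);
    for (int in = 0; in < n; ++in) {
        for (int ic = 0; ic < c; ++ic) {
            for (int ih = 0; ih < h; ++ih) {
                for (int iw = 0; iw < w; ++iw) {
                    int src = ((in * c + ic) * h + ih) * w + iw;
                    int dst = ((in * h + ih) * w + iw) * c + ic;
                    y[dst] = x[src];
                }
            }
        }
    }
}
```

Both buffers describe the same logical n x c x h x w tensor; only the stride order differs, which is why the descriptors must agree on dimensions but not on strides.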

topsdnnStatus_t TOPSDNN_EXPORT topsdnnCreateLRNDescriptor (topsdnnLRNDescriptor_t *lrnDesc)

This function allocates the memory needed to hold the data required for LRN layer operation and returns a descriptor used with subsequent layer forward and backward calls.

Parameters

lrnDesc – LRN descriptor.

Returns

TOPSDNN_STATUS_SUCCESS The object was created successfully.

Returns

TOPSDNN_STATUS_ALLOC_FAILED The resources could not be allocated.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnSetLRNDescriptor (topsdnnLRNDescriptor_t lrnDesc, unsigned lrnN, double lrnAlpha, double lrnBeta, double lrnK)

This function initializes a previously created LRN descriptor object.

Parameters
  • lrnDesc – Output. Handle to a previously created LRN descriptor.

  • lrnN – Input. Normalization window width in elements.

  • lrnAlpha – Input. Value of the alpha variance scaling parameter in the normalization formula.

  • lrnBeta – Input. Value of the beta power parameter in the normalization formula.

  • lrnK – Input. Value of the k parameter in the normalization formula.

Returns

TOPSDNN_STATUS_SUCCESS The object was set successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM An input parameter was out of the valid range.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnGetLRNDescriptor (topsdnnLRNDescriptor_t lrnDesc, unsigned *lrnN, double *lrnAlpha, double *lrnBeta, double *lrnK)

This function retrieves values stored in the previously initialized LRN descriptor object.

Parameters
  • lrnDesc – Input. Handle to a previously created LRN descriptor.

  • lrnN – Output. Pointers to receive values of parameters stored in the descriptor object.

  • lrnAlpha – Output. Pointers to receive values of parameters stored in the descriptor object.

  • lrnBeta – Output. Pointers to receive values of parameters stored in the descriptor object.

  • lrnK – Output. Pointers to receive values of parameters stored in the descriptor object.

Returns

TOPSDNN_STATUS_SUCCESS Function completed successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM An input parameter was out of the valid range.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnDestroyLRNDescriptor (topsdnnLRNDescriptor_t lrnDesc)

This function destroys a previously created LRN descriptor object.

Parameters

lrnDesc – Input. Handle to a previously created LRN descriptor.

Returns

TOPSDNN_STATUS_SUCCESS The function returned successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM An input parameter was out of the valid range.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnLRNCrossChannelForward (topsdnnHandle_t handle, topsdnnLRNDescriptor_t lrnDesc, topsdnnLRNMode_t lrnMode, const void *alpha, const topsdnnTensorDescriptor_t xDesc, const void *x, const void *beta, const topsdnnTensorDescriptor_t yDesc, void *y)

This function performs the forward LRN layer computation.

Parameters
  • handle – Input. Handle to a previously created topsDNN library descriptor.

  • lrnDesc – Input. Handle to a previously created LRN descriptor.

  • lrnMode – Input. LRN layer mode of operation.

  • alpha – Input. Pointer to a scaling factor (in host memory) used to blend the layer output value with prior value in the destination tensor.

  • xDesc – Input. Tensor descriptor objects for the input tensors.

  • x – Input. Input tensor data pointer in device memory.

  • beta – Input. Pointer to a scaling factor (in host memory) used to blend the layer output value with prior value in the destination tensor.

  • yDesc – Input. Tensor descriptor objects for the output tensors.

  • y – Output. Output tensor data pointer in device memory.

Returns

TOPSDNN_STATUS_SUCCESS The computation was performed successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM An input parameter was out of the valid range.

1.2. OpsTrainLibrary

This entity contains common training routines and algorithms.

1.2.1. OpsTrainFunctions

These are the API functions in the topsdnn_ops_train library.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnActivationBackward (topsdnnHandle_t handle, topsdnnActivationDescriptor_t activationDesc, const void *alpha, const topsdnnTensorDescriptor_t yDesc, const void *y, const topsdnnTensorDescriptor_t dyDesc, const void *dy, const topsdnnTensorDescriptor_t xDesc, const void *x, const void *beta, const topsdnnTensorDescriptor_t dxDesc, void *dx)

This function computes the gradient of a neuron activation function.

Parameters
  • handle – Handle to a previously created topsDNN context.

  • alpha – Pointers to scaling factors (in host memory) used to blend the computation result with prior value in the output layer.

  • yDesc – Handle to the previously initialized input tensor descriptor.

  • y – Data pointer to device memory associated with the tensor descriptor yDesc.

  • dyDesc – Handle to the previously initialized input differential tensor descriptor.

  • dy – Data pointer to device memory associated with the tensor descriptor dyDesc.

  • xDesc – Handle to the previously initialized output tensor descriptor.

  • x – Data pointer to device memory associated with the output tensor descriptor xDesc.

  • beta – Pointers to scaling factors (in host memory) used to blend the computation result with prior value in the output layer.

  • dxDesc – Handle to the previously initialized output differential tensor descriptor.

  • dx – Data pointer to device memory associated with the output tensor descriptor dxDesc.

Returns

TOPSDNN_STATUS_SUCCESS The computation was performed successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM An input parameter was out of the valid range.
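
The alpha/beta blending described above can be illustrated with a small CPU reference. This is only a sketch: it assumes a ReLU activation and the blend dx = alpha * gradient + beta * priorDx, and the function name relu_backward_ref is hypothetical, not part of topsDNN.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU reference for a ReLU backward pass with alpha/beta blending.
 * Assumption: dx = alpha * (x > 0 ? dy : 0) + beta * priorDx. */
void relu_backward_ref(const float *x, const float *dy, float *dx,
                       size_t count, float alpha, float beta)
{
    assert(x && dy && dx);
    for (size_t i = 0; i < count; ++i) {
        float grad = x[i] > 0.0f ? dy[i] : 0.0f;
        /* When beta is 0 the prior value is never read, so dx may be uninitialized. */
        dx[i] = alpha * grad + (beta != 0.0f ? beta * dx[i] : 0.0f);
    }
}
```

With alpha = 1 and beta = 0 the result simply overwrites dx; a non-zero beta accumulates into the existing gradient buffer.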

topsdnnStatus_t TOPSDNN_EXPORT topsdnnBatchNormalizationForwardTraining (topsdnnHandle_t handle, topsdnnBatchNormMode_t mode, const void *alpha, const void *beta, const topsdnnTensorDescriptor_t inputDesc, const void *input, const topsdnnTensorDescriptor_t outputDesc, void *output, const topsdnnTensorDescriptor_t bnScaleBiasMeanVarDesc, const void *bnScale, const void *bnBias, double exponentialAverageFactor, void *resultRunningMean, void *resultRunningVariance, double epsilon, void *resultSaveMean, void *resultSaveInvVariance)

This function performs the forward batch normalization layer computation for the training phase.

Parameters
  • handle – Input. Handle to a previously created topsDNN library descriptor.

  • mode – Mode of operation (spatial or per-activation).

  • alpha – Input. Pointer to the scaling factor (in host memory) used to blend the layer output value with the prior value in the destination tensor.

  • beta – Input. Pointer to the scaling factor (in host memory) used to blend the layer output value with the prior value in the destination tensor.

  • inputDesc – Input. Tensor descriptor objects for the input tensors.

  • input – Input. Input tensor data pointer in device memory.

  • outputDesc – Output. Tensor descriptor objects for the output tensors.

  • output – Output. Output tensor data pointer in device memory.

  • bnScaleBiasMeanVarDesc – Input. Shared tensor descriptor for the secondary tensors, derived by topsdnnDeriveBNTensorDescriptor(). The dimensions of this tensor descriptor depend on the normalization mode.

  • bnScale – Input. Pointer in device memory to the batch normalization scale parameters.

  • bnBias – Input. Pointer in device memory to the batch normalization bias parameters.

  • exponentialAverageFactor – Input. Factor used in the moving average computation.

  • resultRunningMean – Input/Output. Running mean tensor, used for forward inference. This pointer can be NULL, but only if resultRunningVariance is also NULL.

  • resultRunningVariance – Input/Output. Running variance tensor, used for forward inference. This pointer can be NULL, but only if resultRunningMean is also NULL.

  • epsilon – Input. Epsilon value used in the batch normalization formula.

  • resultSaveMean – Output. Optional cache to save the intermediate mean computed during the forward pass.

  • resultSaveInvVariance – Output. Optional cache to save the intermediate inverse variance computed during the forward pass.

Returns

TOPSDNN_STATUS_SUCCESS The computation was performed successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM An input parameter was out of the valid range.
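
As a rough illustration of what the training-phase forward pass computes, the following CPU sketch handles the spatial mode on a contiguous NCHW float tensor. Several points are assumptions rather than documented behavior: the saved inverse variance is taken to be 1/sqrt(variance + epsilon), the running mean is updated as running * (1 - factor) + mean * factor, and alpha/beta blending and the running variance are omitted. The names bn_forward_training_ref and sqrt_newton are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Newton-iteration square root, used only to keep this sketch free of libm. */
static double sqrt_newton(double v)
{
    if (v <= 0.0) return 0.0;
    double r = v > 1.0 ? v : 1.0;
    for (int i = 0; i < 64; ++i) r = 0.5 * (r + v / r);
    return r;
}

/* Hypothetical CPU reference: spatial batch normalization, NCHW layout,
 * HW = H * W elements per channel plane. */
void bn_forward_training_ref(const float *in, float *out, int N, int C, int HW,
                             const float *scale, const float *bias,
                             double factor, float *runningMean, double epsilon,
                             float *saveMean, float *saveInvVar)
{
    assert(in && out && scale && bias && N > 0 && C > 0 && HW > 0);
    const double m = (double)N * HW;
    for (int c = 0; c < C; ++c) {
        double mean = 0.0, var = 0.0;
        for (int n = 0; n < N; ++n)
            for (int p = 0; p < HW; ++p)
                mean += in[((size_t)n * C + c) * HW + p];
        mean /= m;
        for (int n = 0; n < N; ++n)
            for (int p = 0; p < HW; ++p) {
                double d = in[((size_t)n * C + c) * HW + p] - mean;
                var += d * d;
            }
        var /= m;                          /* biased (population) variance */
        assert(var + epsilon > 0.0);
        const double invStd = 1.0 / sqrt_newton(var + epsilon);
        for (int n = 0; n < N; ++n)
            for (int p = 0; p < HW; ++p) {
                size_t i = ((size_t)n * C + c) * HW + p;
                out[i] = (float)(scale[c] * (in[i] - mean) * invStd + bias[c]);
            }
        /* Optional outputs, mirroring the NULL-able pointers described above. */
        if (runningMean)
            runningMean[c] = (float)(runningMean[c] * (1.0 - factor) + mean * factor);
        if (saveMean) saveMean[c] = (float)mean;
        if (saveInvVar) saveInvVar[c] = (float)invStd;
    }
}
```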

topsdnnStatus_t TOPSDNN_EXPORT topsdnnPoolingBackward (topsdnnHandle_t handle, const topsdnnPoolingDescriptor_t poolingDesc, const void *alpha, const topsdnnTensorDescriptor_t yDesc, const void *y, const topsdnnTensorDescriptor_t dyDesc, const void *dy, const topsdnnTensorDescriptor_t xDesc, const void *x, const void *beta, const topsdnnTensorDescriptor_t dxDesc, void *dx)
topsdnnStatus_t TOPSDNN_EXPORT topsdnnBatchNormalizationBackward (topsdnnHandle_t handle, topsdnnBatchNormMode_t mode, const void *alphaDataDiff, const void *betaDataDiff, const void *alphaParamDiff, const void *betaParamDiff, const topsdnnTensorDescriptor_t input_x_desc, const void *input_x, const topsdnnTensorDescriptor_t input_dy_desc, const void *input_dy, const topsdnnTensorDescriptor_t output_dx_desc, void *output_dx, const topsdnnTensorDescriptor_t bnScaleBiasDiffDesc, const void *bnScale, void *resultBnScaleDiff, void *resultBnBiasDiff, double epsilon, const void *SaveMean, const void *SaveInvVariance)

This function performs the backward batch normalization layer computation.

Parameters
  • handle – Input. Handle to a previously created topsDNN library descriptor.

  • mode – Mode of operation (spatial or per-activation).

  • alphaDataDiff – Input. Pointer to the scaling factor (in host memory) used to blend output_dx with the prior value in the destination tensor.

  • betaDataDiff – Input. Pointer to the scaling factor (in host memory) used to blend output_dx with the prior value in the destination tensor.

  • alphaParamDiff – Input. Pointer to the scaling factor (in host memory) used to blend resultBnScaleDiff/resultBnBiasDiff with the prior value in the destination tensor.

  • betaParamDiff – Input. Pointer to the scaling factor (in host memory) used to blend resultBnScaleDiff/resultBnBiasDiff with the prior value in the destination tensor.

  • input_x_desc – Input. Tensor descriptor objects for the input tensors.

  • input_x – Input. Input tensor data pointer in device memory.

  • input_dy_desc – Input. Tensor descriptor objects for the input tensors.

  • input_dy – Input. Input tensor data pointer in device memory for the backpropagated differential dy input.

  • output_dx_desc – Input. Tensor descriptor objects for the output tensors.

  • output_dx – Output. Output tensor data pointer in device memory for the resulting differential output with respect to input_x.

  • bnScaleBiasDiffDesc – Input. Shared tensor descriptor for the secondary tensors, derived by topsdnnDeriveBNTensorDescriptor(). The dimensions of this tensor descriptor depend on the normalization mode.

  • bnScale – Input. Pointer in device memory to the batch normalization scale parameters.

  • resultBnScaleDiff – Output. Pointer in device memory to the resulting scale differentials computed by this routine.

  • resultBnBiasDiff – Output. Pointer in device memory to the resulting bias differentials computed by this routine.

  • epsilon – Input. Epsilon value used in the batch normalization formula.

  • SaveMean – Input. Optional cache containing the saved intermediate mean computed during the forward pass.

  • SaveInvVariance – Input. Optional cache containing the saved intermediate inverse variance computed during the forward pass.

Returns

TOPSDNN_STATUS_SUCCESS The computation was performed successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM An input parameter was out of the valid range.
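
The relationship between the inputs and the three results (output_dx, resultBnScaleDiff, resultBnBiasDiff) can be sketched on the CPU as follows. This uses the standard batch normalization gradient for the spatial mode on a contiguous NCHW tensor and assumes the saved inverse variance is 1/sqrt(variance + epsilon); the four alpha/beta blending factors are left out for brevity. The name bn_backward_ref is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU reference: spatial batch normalization backward, NCHW layout,
 * HW = H * W elements per channel plane, using statistics saved by the forward pass. */
void bn_backward_ref(const float *x, const float *dy, float *dx,
                     int N, int C, int HW, const float *scale,
                     const float *saveMean, const float *saveInvVar,
                     float *dScale, float *dBias)
{
    assert(x && dy && dx && scale && saveMean && saveInvVar && dScale && dBias);
    assert(N > 0 && C > 0 && HW > 0);
    const double m = (double)N * HW;
    for (int c = 0; c < C; ++c) {
        const double mean = saveMean[c], invStd = saveInvVar[c];
        double sumDy = 0.0, sumDyXhat = 0.0;
        for (int n = 0; n < N; ++n)
            for (int p = 0; p < HW; ++p) {
                size_t i = ((size_t)n * C + c) * HW + p;
                double xhat = (x[i] - mean) * invStd;   /* normalized input */
                sumDy += dy[i];
                sumDyXhat += dy[i] * xhat;
            }
        dBias[c] = (float)sumDy;        /* gradient w.r.t. the bias  */
        dScale[c] = (float)sumDyXhat;   /* gradient w.r.t. the scale */
        for (int n = 0; n < N; ++n)
            for (int p = 0; p < HW; ++p) {
                size_t i = ((size_t)n * C + c) * HW + p;
                double xhat = (x[i] - mean) * invStd;
                dx[i] = (float)(scale[c] * invStd / m *
                                (m * dy[i] - sumDy - xhat * sumDyXhat));
            }
    }
}
```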

topsdnnStatus_t TOPSDNN_EXPORT topsdnnSoftmaxBackward (topsdnnHandle_t handle, topsdnnSoftmaxAlgorithm_t algorithm, topsdnnSoftmaxMode_t mode, const void *alpha, const topsdnnTensorDescriptor_t yDesc, const void *y, const topsdnnTensorDescriptor_t dyDesc, const void *dy, const void *beta, const topsdnnTensorDescriptor_t dxDesc, void *dx)

This function computes the gradient of the softmax function.

Parameters
  • handle – Handle to a previously created topsDNN context.

  • algorithm – Enumerant to specify the softmax algorithm.

  • mode – Enumerant to specify the softmax mode.

  • alpha – Pointer to the scaling factor (in host memory) used to blend the computation result with the prior value in the output layer.

  • yDesc – Handle to the previously initialized input tensor descriptor.

  • y – Data pointer to device memory associated with the tensor descriptor yDesc.

  • dyDesc – Handle to the previously initialized input differential tensor descriptor.

  • dy – Data pointer to device memory associated with the tensor descriptor dyDesc.

  • beta – Pointer to the scaling factor (in host memory) used to blend the computation result with the prior value in the output layer.

  • dxDesc – Handle to the previously initialized output differential tensor descriptor.

  • dx – Data pointer to device memory associated with the output tensor descriptor dxDesc.

Returns

TOPSDNN_STATUS_SUCCESS The function launched successfully.

Returns

TOPSDNN_STATUS_NOT_SUPPORTED The function does not support the provided configuration.

Returns

TOPSDNN_STATUS_BAD_PARAM The input parameters are invalid.
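
For orientation, the gradient of a plain softmax can be written as dx_i = y_i * (dy_i - sum_j dy_j * y_j). The CPU sketch below applies that formula row by row. It assumes a standard (non-log) softmax over the last dimension and the blend dx = alpha * gradient + beta * priorDx; which topsdnnSoftmaxAlgorithm_t or topsdnnSoftmaxMode_t value this corresponds to is not stated in this section. The name softmax_backward_ref is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU reference: softmax backward over the last dimension.
 * y holds the forward softmax output, dy the incoming gradient. */
void softmax_backward_ref(const float *y, const float *dy, float *dx,
                          int rows, int cols, float alpha, float beta)
{
    assert(y && dy && dx && rows >= 0 && cols > 0);
    for (int r = 0; r < rows; ++r) {
        const size_t base = (size_t)r * cols;
        double dot = 0.0;
        for (int j = 0; j < cols; ++j)
            dot += (double)dy[base + j] * y[base + j];
        for (int j = 0; j < cols; ++j) {
            float grad = (float)(y[base + j] * (dy[base + j] - dot));
            /* When beta is 0 the prior value of dx is never read. */
            dx[base + j] = alpha * grad + (beta != 0.0f ? beta * dx[base + j] : 0.0f);
        }
    }
}
```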

topsdnnStatus_t TOPSDNN_EXPORT topsdnnDropoutBackward (topsdnnHandle_t handle, const topsdnnDropoutDescriptor_t dropoutDesc, const topsdnnTensorDescriptor_t dyDesc, const void *dy, const topsdnnTensorDescriptor_t dxDesc, void *dx, void *reserveSpace, size_t reserveSpaceSizeInBytes)

This function performs backward dropout operation over dy returning results in dx.

Parameters
  • handle – Handle to a previously created topsDNN context.

  • dyDesc – Handle to the previously initialized input differential tensor descriptor.

  • dy – Data pointer to device memory associated with the tensor descriptor dyDesc.

  • dxDesc – Handle to the previously initialized output differential tensor descriptor.

  • dx – Data pointer to device memory associated with the output tensor descriptor dxDesc.

  • reserveSpace – Pointer to user-allocated device memory used by this function.

  • reserveSpaceSizeInBytes – Specifies the size in bytes of the provided memory for the reserve space.

Returns

TOPSDNN_STATUS_SUCCESS The computation was performed successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM An input parameter was out of the valid range.
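
Conceptually, backward dropout re-applies the mask generated in the forward pass. The sketch below assumes inverted dropout (kept elements scaled by 1 / (1 - p)) and a mask stored as one byte per element; the actual layout of reserveSpace is opaque and may differ. The name dropout_backward_ref and the mask and p parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU reference: dx = dy * mask / (1 - p).
 * mask[i] is 1 for elements kept in the forward pass and 0 for dropped ones. */
void dropout_backward_ref(const float *dy, const unsigned char *mask,
                          float *dx, size_t count, float p)
{
    assert(dy && mask && dx);
    assert(p >= 0.0f && p < 1.0f);
    const float keepScale = 1.0f / (1.0f - p);
    for (size_t i = 0; i < count; ++i)
        dx[i] = mask[i] ? dy[i] * keepScale : 0.0f;
}
```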

topsdnnStatus_t TOPSDNN_EXPORT topsdnnLRNCrossChannelBackward (topsdnnHandle_t handle, topsdnnLRNDescriptor_t lrnDesc, topsdnnLRNMode_t lrnMode, const void *alpha, const topsdnnTensorDescriptor_t yDesc, const void *y, const topsdnnTensorDescriptor_t dyDesc, const void *dy, const topsdnnTensorDescriptor_t xDesc, const void *x, const void *beta, const topsdnnTensorDescriptor_t dxDesc, void *dx)

This function performs the backward LRN layer computation.

Parameters
  • handle – Handle to a previously created topsDNN context.

  • lrnDesc – Handle to a previously initialized LRN parameter descriptor.

  • lrnMode – Input. LRN layer mode of operation.

  • alpha – Pointer to the scaling factor (in host memory) used to blend the computation result with the prior value in the output layer.

  • yDesc – Handle to the previously initialized input tensor descriptor.

  • y – Data pointer to device memory associated with the tensor descriptor yDesc.

  • dyDesc – Handle to the previously initialized input differential tensor descriptor.

  • dy – Data pointer to device memory associated with the tensor descriptor dyDesc.

  • xDesc – Handle to the previously initialized output tensor descriptor.

  • x – Data pointer to device memory associated with the output tensor descriptor xDesc.

  • beta – Pointers to scaling factors (in host memory) used to blend the computation result with prior value in the output layer.

  • dxDesc – Handle to the previously initialized output differential tensor descriptor.

  • dx – Data pointer to device memory associated with the output tensor descriptor dxDesc.

Returns

TOPSDNN_STATUS_SUCCESS The computation was performed successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM An input parameter was out of the valid range.

1.3. OpsInferExLibrary

This entity contains common device functions in the topsdnn_ops_infer_ex library.

typedef topsdnnResizeDescriptor *topsdnnResizeDescriptor_t
struct topsdnnResizeDescriptor

1.3.1. OpsInferExEnumeration

These are the enumeration types in the topsdnn_ops_infer_ex library.

enum topsdnnMemcpyKind_t

TOPSDNN memory copy types

Values:

enumerator topsdnnMemcpyHostToDevice

Host -> Device

enumerator topsdnnMemcpyDeviceToHost

Device -> Host

enumerator topsdnnMemcpyDeviceToDevice

Device -> Device

enum topsdnnResizeCoordTransMode_t

Values:

enumerator TOPSDNN_RESIZE_HALF_PIXEL
enumerator TOPSDNN_RESIZE_ASYMMETRIC
enumerator TOPSDNN_RESIZE_PYTORCH_HALF_PIXEL
enumerator TOPSDNN_RESIZE_TF_HALF_PIXEL
enumerator TOPSDNN_RESIZE_ALIGN_CORNERS
enumerator TOPSDNN_RESIZE_TF_CROP_AND_RESIZE
enumerator TOPSDNN_RESIZE_CoordinateTransformationModeCount
enum topsdnnResizeInterpolationMode_t

Values:

enumerator TOPSDNN_RESIZE_NEAREST
enumerator TOPSDNN_RESIZE_BILINEAR
enumerator TOPSDNN_RESIZE_CUBIC
enum topsdnnResizeNearestMode_t

Values:

enumerator TOPSDNN_RESIZE_SAMPLE
enumerator TOPSDNN_RESIZE_ROUND_PREFER_FLOOR
enumerator TOPSDNN_RESIZE_ROUND_PREFER_CEIL
enumerator TOPSDNN_RESIZE_FLOOR
enumerator TOPSDNN_RESIZE_CEIL
enumerator TOPSDNN_RESIZE_NearestModeCount
enum topsdnnTopkCmpMode_t

Values:

enumerator TOPSDNN_TOPK_TYPE_INVALID
enumerator TOPSDNN_TOPK_TYPE_GT
enumerator TOPSDNN_TOPK_TYPE_LT
enum topsdnnTopkStableMode_t

Values:

enumerator TOPSDNN_TOPK_STABLE_INVALID
enumerator TOPSDNN_TOPK_STABLE
enumerator TOPSDNN_TOPK_INSTABLE

1.3.2. OpsInferExFunctions

These are the API functions in the topsdnn_ops_infer_ex library.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnGetDeviceCount (int *count)
topsdnnStatus_t TOPSDNN_EXPORT topsdnnSetDevice (int device)
topsdnnStatus_t TOPSDNN_EXPORT topsdnnGetDevice (int *device)
topsdnnStatus_t TOPSDNN_EXPORT topsdnnMalloc (void **devPtr, size_t size)
topsdnnStatus_t TOPSDNN_EXPORT topsdnnFree (void *devPtr)
topsdnnStatus_t TOPSDNN_EXPORT topsdnnMemcpy (void *dst, const void *src, size_t count, topsdnnMemcpyKind_t kind)
topsdnnStatus_t TOPSDNN_EXPORT topsdnnCreateResizeDescriptor (topsdnnResizeDescriptor_t *resizeDesc)

This function creates a resize descriptor object by allocating the memory needed to hold its opaque structure.

Parameters

resizeDesc – Output. Pointer to receive the created resize descriptor.

Returns

TOPSDNN_STATUS_SUCCESS The object was created successfully.

Returns

TOPSDNN_STATUS_ALLOC_FAILED The resources could not be allocated.

Returns

TOPSDNN_STATUS_BAD_PARAM The descriptor resizeDesc is invalid.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnDestroyResizeDescriptor (topsdnnResizeDescriptor_t resizeDesc)

This function destroys a previously created resize descriptor object.

Parameters

resizeDesc – Input. Handle to a previously created resize descriptor.

Returns

TOPSDNN_STATUS_SUCCESS The object was destroyed successfully.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnSetResizeDescriptor (topsdnnResizeDescriptor_t resizeDesc, topsdnnResizeInterpolationMode_t interpolationMode, topsdnnResizeCoordTransMode_t transMode, topsdnnResizeNearestMode_t nearestMode)

This function initializes a previously created resize descriptor object.

Parameters
  • resizeDesc – Input/Output. Handle to a previously created resize descriptor.

  • interpolationMode – Input. Enumerant to specify the resize interpolation mode.

  • transMode – Input. Enumerant to specify the resize coordinate transformation mode.

  • nearestMode – Input. Enumerant to specify the resize nearest mode.

Returns

TOPSDNN_STATUS_SUCCESS The object was set successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM At least one of the following conditions is met: the descriptor resizeDesc is invalid; at least one of the parameters interpolationMode, transMode, or nearestMode is invalid.

Returns

TOPSDNN_STATUS_NOT_SUPPORTED The function does not support the provided configuration.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnGetResizeDescriptor (topsdnnResizeDescriptor_t resizeDesc, topsdnnResizeInterpolationMode_t *interpolationMode, topsdnnResizeCoordTransMode_t *transMode, topsdnnResizeNearestMode_t *nearestMode)

This function queries a previously created resize descriptor object.

Parameters
  • resizeDesc – Input. Handle to a previously created resize descriptor.

  • interpolationMode – Output. Enumerant to specify the resize interpolation mode.

  • transMode – Output. Enumerant to specify the resize coordinate transformation mode.

  • nearestMode – Output. Enumerant to specify the resize nearest mode.

Returns

TOPSDNN_STATUS_SUCCESS The object was queried successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM At least one of the following conditions is met: resizeDesc has not been initialized; at least one of the parameters interpolationMode, transMode, or nearestMode is invalid.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnResizeForward (topsdnnHandle_t handle, topsdnnResizeDescriptor_t resizeDesc, const topsdnnTensorDescriptor_t xDesc, const void *x, const topsdnnTensorDescriptor_t scaleDesc, const void *scale, const topsdnnTensorDescriptor_t sizesDesc, const void *sizes, const topsdnnTensorDescriptor_t yDesc, const void *y)

This function performs forward resize operation over x returning results in y.

Parameters
  • handle – Input. Handle to a previously created topsDNN context.

  • resizeDesc – Input. Resize descriptor.

  • xDesc – Input. Handle to the previously initialized input tensor descriptor. Must be of type TOPSDNN_DATA_FLOAT, TOPSDNN_DATA_HALF, or TOPSDNN_DATA_INT8 (only nearest resize interpolation is supported).

  • x – Input. Data pointer to device memory associated with the tensor descriptor xDesc.

  • scaleDesc – Input. Handle to the previously initialized scale tensor descriptor. Must be of dim 4.

  • scale – Input. Data pointer to device memory associated with the tensor descriptor scaleDesc.

  • sizesDesc – Input. Handle to the previously initialized sizes tensor descriptor. Must be of dim 4.

  • sizes – Input. Data pointer to device memory associated with the tensor descriptor sizesDesc. Note: this parameter is currently not supported and must be [0,0,0,0].

  • yDesc – Input. Handle to the previously initialized output tensor descriptor. Must be of type TOPSDNN_DATA_FLOAT, TOPSDNN_DATA_HALF, or TOPSDNN_DATA_INT8 (only nearest resize interpolation is supported).

  • y – Output. Data pointer to device memory associated with the output tensor descriptor yDesc.

Returns

TOPSDNN_STATUS_SUCCESS The function launched successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM At least one of the following conditions is met: the dimensions n, c of the input and output tensors differ; the datatypes of the input and output tensors differ.

Returns

TOPSDNN_STATUS_EXECUTION_FAILED The function failed to launch on the device.

Returns

TOPSDNN_STATUS_NOT_SUPPORTED The function does not support the provided configuration.
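
To make the resize semantics more concrete, here is a CPU sketch of nearest-neighbor resizing. It assumes the enumerator names follow the ONNX Resize conventions, so that TOPSDNN_RESIZE_ASYMMETRIC maps an output coordinate to dst / scale and TOPSDNN_RESIZE_FLOOR rounds that down; this mapping is an assumption, not something this reference states. The n and c dimensions are folded into a single planes count, consistent with the requirement that they match between input and output. The name resize_nearest_ref is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU reference: nearest-neighbor resize of planes (= n * c)
 * contiguous H x W images, asymmetric coordinate transform, floor rounding. */
void resize_nearest_ref(const float *x, int planes, int inH, int inW,
                        float *y, int outH, int outW)
{
    assert(x && y && planes >= 0);
    assert(inH > 0 && inW > 0 && outH > 0 && outW > 0);
    for (int p = 0; p < planes; ++p)
        for (int oh = 0; oh < outH; ++oh) {
            /* floor(oh / scaleH) with scaleH = outH / inH, in integer arithmetic */
            int ih = (int)(((long long)oh * inH) / outH);
            for (int ow = 0; ow < outW; ++ow) {
                int iw = (int)(((long long)ow * inW) / outW);
                y[((size_t)p * outH + oh) * outW + ow] =
                    x[((size_t)p * inH + ih) * inW + iw];
            }
        }
}
```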

topsdnnStatus_t TOPSDNN_EXPORT topsdnnSelectForward (topsdnnHandle_t handle, const topsdnnTensorDescriptor_t xDesc, const void *x, const void *alpha, const topsdnnTensorDescriptor_t lhsDesc, const void *lhs, const topsdnnTensorDescriptor_t rhsDesc, const void *rhs, const void *beta, const topsdnnTensorDescriptor_t yDesc, const void *y)

This function implements the Select operation (see the XLA operation semantics): y = (x[index] == 1 ? lhs[index] : rhs[index]) * alpha * beta.

Parameters
  • handle – Handle to a previously created topsDNN context.

  • xDesc – Handle to the previously initialized input tensor descriptor.

  • x – Data pointer to device memory associated with the input tensor descriptor xDesc.

  • alpha – Pointer to a scaling factor applied to the output tensor.

  • lhsDesc – Handle to the previously initialized input tensor descriptor.

  • lhs – Data pointer to device memory associated with the input tensor descriptor lhsDesc.

  • rhsDesc – Handle to the previously initialized input tensor descriptor.

  • rhs – Data pointer to device memory associated with the input tensor descriptor rhsDesc.

  • beta – Pointer to a scaling factor applied to the output tensor.

  • yDesc – Handle to the previously initialized output tensor descriptor.

  • y – Data pointer to device memory associated with the output tensor descriptor yDesc.

Returns

TOPSDNN_STATUS_SUCCESS The call was successful.

Returns

TOPSDNN_STATUS_NOT_SUPPORTED The function does not support the provided configuration.

Returns

TOPSDNN_STATUS_BAD_PARAM At least one of the following requirements is violated: the tensors lhs and rhs must have the same shape; the tensors lhs, rhs, and output must have the same datatype; the datatype of tensor x must be int8.
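
The element-wise rule given for this function can be sketched on the CPU as follows. The sketch takes the formula literally, including the multiplication by both alpha and beta, and uses float data with an int8 predicate; the name select_ref is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical CPU reference:
 * y[i] = (x[i] == 1 ? lhs[i] : rhs[i]) * alpha * beta */
void select_ref(const int8_t *x, const float *lhs, const float *rhs,
                float *y, size_t count, float alpha, float beta)
{
    assert(x && lhs && rhs && y);
    for (size_t i = 0; i < count; ++i)
        y[i] = (x[i] == 1 ? lhs[i] : rhs[i]) * alpha * beta;
}
```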

topsdnnStatus_t TOPSDNN_EXPORT topsdnnMaskForward (topsdnnHandle_t handle, const topsdnnTensorDescriptor_t xDesc, const void *x, const void *alpha, const topsdnnTensorDescriptor_t lhsDesc, const void *lhs, const topsdnnTensorDescriptor_t rhsDesc, const void *rhs, const void *beta, const topsdnnTensorDescriptor_t yDesc, const void *y)

This function implements the equation y = (x[index] == 1 ? lhs[index] : rhs[0]) * alpha * beta.

Parameters
  • handle – Handle to a previously created topsDNN context.

  • xDesc – Handle to the previously initialized input tensor descriptor.

  • x – Data pointer to device memory associated with the input tensor descriptor xDesc.

  • alpha – Pointer to a scaling factor applied to the output tensor.

  • lhsDesc – Handle to the previously initialized input tensor descriptor.

  • lhs – Data pointer to device memory associated with the input tensor descriptor lhsDesc.

  • rhsDesc – Handle to the previously initialized input tensor descriptor.

  • rhs – Data pointer to device memory associated with the input tensor descriptor rhsDesc.

  • beta – Pointer to a scaling factor applied to the output tensor.

  • yDesc – Handle to the previously initialized output tensor descriptor.

  • y – Data pointer to device memory associated with the output tensor descriptor yDesc.

Returns

TOPSDNN_STATUS_SUCCESS The call was successful.

Returns

TOPSDNN_STATUS_NOT_SUPPORTED The function does not support the provided configuration.

Returns

TOPSDNN_STATUS_BAD_PARAM At least one of the following requirements is violated: the tensors lhs, rhs, and output must have the same datatype; the tensor rhs must have rank 1 and contain exactly one element; the datatype of tensor x must be int8.
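
A typical use of this operation is to replace masked-out positions with a single fill value. The CPU sketch below follows the equation literally, with rhs holding exactly one element; the name mask_ref is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical CPU reference:
 * y[i] = (x[i] == 1 ? lhs[i] : rhs[0]) * alpha * beta
 * rhs points to a single fill value. */
void mask_ref(const int8_t *x, const float *lhs, const float *rhs,
              float *y, size_t count, float alpha, float beta)
{
    assert(x && lhs && rhs && y);
    const float fill = rhs[0];
    for (size_t i = 0; i < count; ++i)
        y[i] = (x[i] == 1 ? lhs[i] : fill) * alpha * beta;
}
```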

topsdnnStatus_t TOPSDNN_EXPORT topsdnnCreateTopkDescriptor (topsdnnTopkDescriptor_t *topkDesc)

This function creates a topk tensor descriptor object by allocating the memory needed to hold its opaque structure.

Parameters

topkDesc – Output. topk tensor descriptor.

Returns

TOPSDNN_STATUS_SUCCESS The object was created successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM topkDesc is a NULL pointer.

Returns

TOPSDNN_STATUS_ALLOC_FAILED The resources could not be allocated.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnDestroyTopkDescriptor (topsdnnTopkDescriptor_t topkDesc)

This function destroys a previously created topk tensor descriptor object. When the input pointer is NULL, this function performs no destroy operation.

Parameters

topkDesc – Input. Pointer to the topk tensor descriptor object to be destroyed.

Returns

TOPSDNN_STATUS_SUCCESS The object was destroyed successfully.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnGetTopkWorkspaceSize (const topsdnnTopkDescriptor_t topkDesc, const topsdnnTensorDescriptor_t xDesc, topsdnnIndicesType_t indexDataType, size_t *sizeInBytes)

This helper function returns the minimum size of the workspace to be passed to the topk operation, given the input and output tensors. Currently it just returns 0, because max/min is not supported.

Parameters
  • topkDesc – Input. Pointer to a previously initialized topk tensor descriptor object.

  • xDesc – Input. Pointer to the input tensor descriptor.

  • indexDataType – Input. Enumerant to specify the topk tensor indices type.

  • sizeInBytes – Output. Minimum size of the workspace to be passed to the topk operation.

Returns

TOPSDNN_STATUS_SUCCESS The workspace size is returned successfully.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnSetTopkDescriptor (topsdnnTopkDescriptor_t topkDesc, topsdnnTopkCmpMode_t cmpMode, topsdnnTopkStableMode_t stableMode, topsdnnIndicesType_t indicesType, int k, int axis)

This function initializes a previously created topk tensor descriptor object.

Parameters
  • topkDesc – Input/Output. Handle to a previously created topk tensor descriptor.

  • cmpMode – Input. Enumerant to specify whether the first k elements are selected by maximum or by minimum.

  • stableMode – Input. Enumerant to specify the stable or non-stable sort mode. The stable mode is implemented with a selection sorting algorithm; the non-stable mode is implemented with a bitonic sorting algorithm.

  • indicesType – Input. Enumerant to specify the topk tensor indices type.

  • k – Input. Specifies the k value.

  • axis – Input. Specifies the dimension along which topk is computed.

Returns

TOPSDNN_STATUS_SUCCESS The object was set successfully.

Returns

TOPSDNN_STATUS_NOT_SUPPORTED The function does not support the provided configuration.

Returns

TOPSDNN_STATUS_BAD_PARAM topkDesc is NULL.
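
The effect of the comparison and stability settings can be illustrated with a small selection-based top-k on a 1-D array. This is a CPU sketch only: the largest flag is assumed to correspond to TOPSDNN_TOPK_TYPE_GT (and 0 to TOPSDNN_TOPK_TYPE_LT), and stability is taken to mean that ties keep their original index order. The name topk_stable_ref is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU reference: stable top-k by repeated selection.
 * largest != 0 picks the k largest values, otherwise the k smallest.
 * Ties are resolved in favor of the lower index. */
void topk_stable_ref(const float *x, int n, int k, int largest,
                     float *values, int *indices)
{
    assert(x && values && indices && k >= 0 && k <= n);
    for (int out = 0; out < k; ++out) {
        int best = -1;
        for (int i = 0; i < n; ++i) {
            int taken = 0;                 /* skip elements already selected */
            for (int j = 0; j < out; ++j)
                if (indices[j] == i) { taken = 1; break; }
            if (taken) continue;
            /* Strict comparison keeps the earliest index among equal values. */
            if (best < 0 || (largest ? x[i] > x[best] : x[i] < x[best]))
                best = i;
        }
        values[out] = x[best];
        indices[out] = best;
    }
}
```

The nested scan is quadratic in k and is meant for readability, not performance.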

topsdnnStatus_t TOPSDNN_EXPORT topsdnnGetTopkDescriptor (topsdnnTopkDescriptor_t topkDesc, topsdnnTopkCmpMode_t *cmpMode, topsdnnTopkStableMode_t *stableMode, topsdnnIndicesType_t *indicesType, int *k, int *axis)

This function queries a previously initialized topk tensor descriptor object.

Parameters
  • topkDesc – Input. Pointer to a previously initialized topk tensor descriptor object.

  • cmpMode – Output. Enumerant to specify the compare type of the topk.

  • stableMode – Output. Enumerant to specify the stable sort or non-stable sort mode.

  • indicesType – Output. Enumerant to specify the topk tensor indices type.

  • k – Output. K value.

  • axis – Output. The dimension of topk to be taken.

Returns

TOPSDNN_STATUS_SUCCESS The object was queried successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM topkDesc is NULL.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnTopkTensor (topsdnnHandle_t handle, topsdnnTopkDescriptor_t TopkDesc, void *indices, size_t indicesSizeInBytes, void *workspace, size_t workspaceSizeInBytes, const void *alpha, const topsdnnTensorDescriptor_t xDesc, const void *x, const void *beta, const topsdnnTensorDescriptor_t yDesc, void *y)
topsdnnStatus_t TOPSDNN_EXPORT topsdnnCreateConcatenateDescriptor (topsdnnConcatenateDescriptor_t *concatenateDesc)

topsdnnConcatenateDescriptor_t is an opaque structure containing the description of a tensor concatenation. Use topsdnnCreateConcatenateDescriptor() to create an instance of this descriptor, topsdnnSetConcatenateDescriptor() to initialize it, and topsdnnDestroyConcatenateDescriptor() to destroy it.

Parameters

concatenateDesc – Output. Pointer to receive the created concatenate descriptor.

Returns

TOPSDNN_STATUS_SUCCESS The object was created successfully.

Returns

TOPSDNN_STATUS_ALLOC_FAILED The resources could not be allocated.

Returns

TOPSDNN_STATUS_BAD_PARAM The descriptor is invalid.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnDestroyConcatenateDescriptor (topsdnnConcatenateDescriptor_t concatenateDesc)

This function destroys a previously created concatenate tensor descriptor object. When the input pointer is NULL, this function performs no destroy operation.

Parameters

concatenateDesc – Input. Pointer to the concatenate tensor descriptor object to be destroyed.

Returns

TOPSDNN_STATUS_SUCCESS The object was destroyed successfully.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnSetConcatenateDescriptor (topsdnnConcatenateDescriptor_t concatenateDesc, int concatenate_dim)

This function initializes a previously created concatenate tensor descriptor object.

Parameters
  • concatenateDesc – Input/Output. Handle to a previously created concatenate tensor descriptor.

  • concatenate_dim – Input. Specify a dimension to do concatenate.

Returns

TOPSDNN_STATUS_SUCCESS The object was set successfully.

Returns

TOPSDNN_STATUS_NOT_SUPPORTED The function does not support the provided configuration.

Returns

TOPSDNN_STATUS_BAD_PARAM concatenateDesc is NULL.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnGetConcatenateDescriptor (topsdnnConcatenateDescriptor_t concatenateDesc, int *concatenate_dim)

This function is used to query information in concatenateDesc.

Parameters
  • concatenateDesc – Input. Handle to a previously initialized concatenate tensor descriptor.

  • concatenate_dim – Output. Returns the dimension to do concatenate in concatenateDesc.

Returns

TOPSDNN_STATUS_SUCCESS The object was queried successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM concatenateDesc is NULL.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnConcatenateTensor (topsdnnHandle_t handle, const topsdnnConcatenateDescriptor_t concatenateDesc, const void *alpha1, const topsdnnTensorDescriptor_t aDesc, const void *A, const void *alpha2, const topsdnnTensorDescriptor_t bDesc, const void *B, const void *beta, const topsdnnTensorDescriptor_t cDesc, void *C)

This function concatenates input tensors A and B along a specific dimension and outputs the result as tensor C. Datatypes, ranks, and formats must be the same among A, B, and C. The rank must be no more than 4, and each corresponding dim size must be the same except for concatenate_dim. Currently, the size of each dim must be less than 65536.

Parameters
  • handle – Input. Handle to a previously created topsDNN context.

  • concatenateDesc – Input. Descriptor object with operation information.

  • alpha1, alpha2, beta – Input. Pointers to scaling factors (in host memory) used to blend the source value with the prior value in the destination tensor as follows: dstValue = alpha1[0]*srcValue1 + alpha2[0]*srcValue2 + beta[0]*priorDstValue. Currently, only alpha1[0] == 1 && alpha2[0] == 1 && beta[0] == 0 is supported.

  • aDesc – Input. Handle to a previously initialized tensor descriptor, describing the first operand.

  • A – Input. Pointer to data of the tensor described by the aDesc descriptor.

  • bDesc – Input. Handle to a previously initialized tensor descriptor, describing the second operand.

  • B – Input. Pointer to data of the tensor described by the bDesc descriptor.

  • cDesc – Input. Handle to a previously initialized tensor descriptor, describing the result.

  • C – Output. Pointer to data of the tensor described by the cDesc descriptor.

Returns

TOPSDNN_STATUS_SUCCESS The call was successful.

Returns

TOPSDNN_STATUS_NOT_SUPPORTED The function does not support the provided configuration.

Returns

TOPSDNN_STATUS_BAD_PARAM At least one of the following conditions is met: a NULL pointer occurs in any of the parameters; the datatype or rank of the input and output tensors differ; the strides of the input tensors differ, or the layouts of the input tensors and the output tensor differ; concatenate_dim is negative or greater than rank-1.
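
The shape rules above can be checked against a small CPU reference. The sketch assumes contiguous row-major float tensors and the only supported scaling (alpha1 = alpha2 = 1, beta = 0), so it reduces to block copies; the name concat_ref and its dims-array parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical CPU reference: concatenate contiguous row-major tensors A and B
 * along dimension dim into C. All dims except dim must match. */
void concat_ref(const float *A, const int *aDims,
                const float *B, const int *bDims,
                int rank, int dim, float *C)
{
    assert(A && B && C && aDims && bDims);
    assert(rank >= 1 && rank <= 4 && dim >= 0 && dim < rank);
    size_t outer = 1, innerA = 1, innerB = 1;
    for (int d = 0; d < rank; ++d) {
        if (d != dim) assert(aDims[d] == bDims[d]);
        if (d < dim) {
            outer *= (size_t)aDims[d];
        } else {
            innerA *= (size_t)aDims[d];
            innerB *= (size_t)bDims[d];
        }
    }
    /* For each outer index, copy A's slab followed by B's slab. */
    for (size_t o = 0; o < outer; ++o) {
        float *dst = C + o * (innerA + innerB);
        memcpy(dst, A + o * innerA, innerA * sizeof(float));
        memcpy(dst + innerA, B + o * innerB, innerB * sizeof(float));
    }
}
```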

1.4. CnnInferLibrary

This entity contains all routines related to convolutional neural networks needed at inference time.

1.4.1. CnnInferStruction

These are the structure types in the topsdnn_cnn_infer library.

typedef struct topsdnnConvolutionFwdAlgoPerfStruct topsdnnConvolutionFwdAlgoPerf_t
struct topsdnnConvolutionFwdAlgoPerfStruct

1.4.2. CnnInferEnumeration

These are the enumeration types in the topsdnn_cnn_infer library.

enum topsdnnConvolutionMode_t

Values:

enumerator TOPSDNN_CONVOLUTION
enumerator TOPSDNN_CROSS_CORRELATION

1.4.3. CnnInferFunctions

These are the API functions in the topsdnn_cnn_infer library.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnCreateDotDescriptor (topsdnnDotDescriptor_t *dotDesc)

This function creates a dot descriptor object by allocating the memory needed to hold its opaque structure.

Parameters

dotDesc – Output. Pointer to receive the created dot descriptor.

Returns

TOPSDNN_STATUS_SUCCESS The object was created successfully.

Returns

TOPSDNN_STATUS_ALLOC_FAILED The resources could not be allocated.

Returns

TOPSDNN_STATUS_BAD_PARAM dotDesc is NULL.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnDestroyDotDescriptor (topsdnnDotDescriptor_t dotDesc)

Destroys a previously created dot descriptor object.

Parameters

dotDesc – Handle to a previously created dot descriptor.

Returns

TOPSDNN_STATUS_SUCCESS The descriptor was destroyed successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM dotDesc is NULL.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnSetDotDescriptor (topsdnnDotDescriptor_t dotDesc, int lhsContractingDimension, int rhsContractingDimension, int BatchDimensionNum, int BatchDimensionMap[][2])

Initializes a previously created dot descriptor object.

Parameters
  • dotDesc – Handle to a previously created dot descriptor.

  • lhsContractingDimension – Contracting dimension of the left operand.

  • rhsContractingDimension – Contracting dimension of the right operand.

  • BatchDimensionNum – The number of batch dimensions.

  • BatchDimensionMap – The map of batch dimensions.

Returns

TOPSDNN_STATUS_SUCCESS The object was set successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM At least one of the following conditions is met:
  • The descriptor dotDesc is NULL.

  • lhsContractingDimension < 0.

  • rhsContractingDimension < 0.

  • BatchDimensionNum < 0.

  • The number of rows in BatchDimensionMap is not equal to BatchDimensionNum.

  • Any element of BatchDimensionMap is negative.

Returns

TOPSDNN_STATUS_NOT_SUPPORTED At least one of the following conditions is met:
  • lhsContractingDimension >= 4.

  • rhsContractingDimension >= 4.

  • BatchDimensionNum > 3.

  • Any element of BatchDimensionMap is >= 3.
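
The documented limits for topsdnnSetDotDescriptor can be checked on the host before calling into the library. The following is a minimal sketch that mirrors only the numeric conditions listed above. The names check_t and check_dot_params are hypothetical, and the order in which the checks run is an assumption, since the reference does not specify it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes mirroring the documented status categories. */
typedef enum { CHECK_OK, CHECK_BAD_PARAM, CHECK_NOT_SUPPORTED } check_t;

/* Pre-validates arguments against the documented numeric limits. */
check_t check_dot_params(int lhs_dim, int rhs_dim, int batch_num,
                         int batch_map[][2])
{
    /* Documented BAD_PARAM conditions on the scalar arguments. */
    if (lhs_dim < 0 || rhs_dim < 0 || batch_num < 0) return CHECK_BAD_PARAM;
    /* Assumption: a missing map is an error when batch dimensions exist. */
    if (batch_num > 0 && batch_map == NULL) return CHECK_BAD_PARAM;
    /* Documented NOT_SUPPORTED conditions on the scalar arguments. */
    if (lhs_dim >= 4 || rhs_dim >= 4 || batch_num > 3) return CHECK_NOT_SUPPORTED;

    /* Invariant: the checks above bound the number of map rows we read. */
    assert(batch_num <= 3);
    for (int i = 0; i < batch_num; ++i) {
        for (int j = 0; j < 2; ++j) {
            if (batch_map[i][j] < 0) return CHECK_BAD_PARAM;
            if (batch_map[i][j] >= 3) return CHECK_NOT_SUPPORTED;
        }
    }
    return CHECK_OK;
}
```

A check like this only reproduces the listed conditions; the library remains the authority on what it accepts.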

topsdnnStatus_t TOPSDNN_EXPORT topsdnnGetDotDescriptor (topsdnnDotDescriptor_t dotDesc, int *lhsContractingDimension, int *rhsContractingDimension, int *BatchDimensionNum, int BatchDimensionMap[][2])

Queries a previously initialized dot descriptor object.

Parameters
  • dotDesc – Handle to a previously created dot descriptor.

  • lhsContractingDimension – Contracting dimension of the left operand.

  • rhsContractingDimension – Contracting dimension of the right operand.

  • BatchDimensionNum – The number of batch dimensions.

  • BatchDimensionMap – The map of batch dimensions.

Returns

TOPSDNN_STATUS_SUCCESS The operation was successful.

Returns

TOPSDNN_STATUS_BAD_PARAM At least one of the following is NULL: dotDesc, lhsContractingDimension, rhsContractingDimension, BatchDimensionNum, BatchDimensionMap.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnDot (topsdnnHandle_t handle, const void *alpha, const topsdnnTensorDescriptor_t aDesc, const void *A, const topsdnnTensorDescriptor_t bDesc, const void *B, topsdnnDotDescriptor_t dotDesc, const void *beta, const topsdnnTensorDescriptor_t cDesc, void *C)

Performs the sum of products over contracting dimensions.

Parameters
  • handle – Handle to a previously created topsDNN context.

  • alpha – Pointer to a scaling factor. Must not be NULL.

  • aDesc – Handle to a previously initialized input tensor descriptor.

  • A – Data pointer to device memory described by aDesc.

  • bDesc – Handle to a previously initialized input tensor descriptor.

  • B – Data pointer to device memory described by bDesc.

  • dotDesc – Previously initialized dot descriptor.

  • beta – Pointer to a scaling factor. The value must be 0.0.

  • cDesc – Handle to a previously initialized output tensor descriptor.

  • C – Data pointer to device memory described by cDesc that carries the result of topsdnnDot.

Returns

TOPSDNN_STATUS_SUCCESS The operation was launched successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM At least one of the following conditions is met:
  • Any of aDesc, bDesc, or cDesc is NULL.

  • Any of A, B, C, or alpha is NULL, or beta is not 0.0.

  • The dimensions of aDesc, bDesc, cDesc, and dotDesc do not match.

  • Any dimension of aDesc, bDesc, or cDesc is less than 1.

  • Any of M, K, or N is less than or equal to 0.

Returns

TOPSDNN_STATUS_NOT_SUPPORTED At least one of the following conditions is met:
  • The data type is not in the supported data type list.

  • M, N, or K exceeds 16777216.

  • The batch size exceeds 16777216.

  • The batch, M, and N dimensions are not sequential.

  • The batch dimensions are not all in the most significant dimensions.

Returns

TOPSDNN_STATUS_MAPPING_ERROR An error occurs during the texture object creation associated with the filter data.

Returns

TOPSDNN_STATUS_EXECUTION_FAILED The function failed to launch on the GCU.
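
To make "sum of products over contracting dimensions" concrete, here is a host-side reference for the simplest case: two row-major 2-D float operands and no batch dimensions. It is a sketch of the general technique and not the library's implementation. The function dot_2d is hypothetical, and the sketch leaves out alpha, beta, and batch dimensions.

```c
#include <assert.h>
#include <stddef.h>

/* Reference "dot" for two row-major 2-D float arrays.
 * lhs_contract / rhs_contract select which dimension (0 or 1) of each
 * operand is summed over; the remaining dimensions form the M x N result. */
void dot_2d(const float *a, size_t a_rows, size_t a_cols, int lhs_contract,
            const float *b, size_t b_rows, size_t b_cols, int rhs_contract,
            float *c)
{
    assert(lhs_contract == 0 || lhs_contract == 1);
    assert(rhs_contract == 0 || rhs_contract == 1);
    size_t k_len = lhs_contract ? a_cols : a_rows;
    size_t m_len = lhs_contract ? a_rows : a_cols;
    size_t n_len = rhs_contract ? b_rows : b_cols;
    /* The contracted extents of both operands must agree. */
    assert(k_len == (rhs_contract ? b_cols : b_rows));

    for (size_t m = 0; m < m_len; ++m) {
        for (size_t n = 0; n < n_len; ++n) {
            float acc = 0.0f;
            for (size_t k = 0; k < k_len; ++k) {
                float av = lhs_contract ? a[m * a_cols + k] : a[k * a_cols + m];
                float bv = rhs_contract ? b[n * b_cols + k] : b[k * b_cols + n];
                acc += av * bv;
            }
            c[m * n_len + n] = acc;
        }
    }
}
```

With lhs_contract = 1 and rhs_contract = 0 this reduces to an ordinary matrix product; with rhs_contract = 1 the right operand is used as if transposed.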

topsdnnStatus_t TOPSDNN_EXPORT topsdnnCreateConvolutionDescriptor (topsdnnConvolutionDescriptor_t *convDesc)

This function creates a convolution descriptor object by allocating the memory needed to hold its opaque structure.

Parameters

convDesc – Pointer to the convolution descriptor to create.

Returns

TOPSDNN_STATUS_SUCCESS The object was created successfully.

Returns

TOPSDNN_STATUS_ALLOC_FAILED The resources could not be allocated.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnDestroyConvolutionDescriptor (topsdnnConvolutionDescriptor_t convDesc)

This function destroys a previously created convolution descriptor object.

Parameters

convDesc – Handle to the convolution descriptor to destroy.

Returns

TOPSDNN_STATUS_SUCCESS The descriptor was destroyed successfully.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnSetConvolutionNdDescriptor (topsdnnConvolutionDescriptor_t convDesc, int arrayLength, const int padA[], const int filterStrideA[], const int dilationA[], topsdnnConvolutionMode_t mode, topsdnnDataType_t dataType)

This function initializes a previously created generic convolution descriptor object into an n-D correlation. That same convolution descriptor can be reused in the backward path provided it corresponds to the same layer. The convolution computation will be done in the specified dataType, which can potentially differ from the data type of the input/output tensors.

Parameters
  • convDesc – Input/Output. Handle to a previously created convolution descriptor.

  • arrayLength – Input. Dimension of the convolution. Currently only 2 (2D convolution) and 3 (3D convolution) are supported.

  • padA – Input. Array of dimension arrayLength containing the zero-padding size for each dimension. For every dimension, the padding represents the number of extra zeros implicitly concatenated at the start and at the end of every element of that dimension. Currently 0 <= padA[i] <= 25 is supported. NOTE: only 0 <= padA[i] <= 1 is supported when the convolution algorithm is set to TOPSDNN_CONVOLUTION_FWD_ALGO_WINOGRAD or TOPSDNN_CONVOLUTION_FWD_ALGO_WINOGRAD_NONFUSED (not supported now).

  • filterStrideA – Input. Array of dimension arrayLength containing the filter stride for each dimension. For every dimension, the filter stride represents the number of elements to slide to reach the start of the filtering window of the next point. GroupCount=1: currently 1 <= filterStrideA[i] <= 7 is supported. GroupCount=Input Feature: currently 1 <= filterStrideA[i] <= 7 is supported. GroupCount=Other: currently 1 <= filterStrideA[i] <= 2 is supported. NOTE: only filterStrideA[i] == 1 is supported when the convolution algorithm is set to TOPSDNN_CONVOLUTION_FWD_ALGO_WINOGRAD or TOPSDNN_CONVOLUTION_FWD_ALGO_WINOGRAD_NONFUSED (not supported now).

  • dilationA – Input. Array of dimension arrayLength containing the dilation factor for each dimension. GroupCount=1: currently 1 <= dilationA[i] <= 25 is supported. GroupCount=Input Feature: currently 1 <= dilationA[i] <= 7 is supported. GroupCount=Other: currently 1 <= dilationA[i] <= 2 is supported. Note that a filter stride greater than 1 and a dilation greater than 1 cannot be combined, so only set dilationA[i] > 1 when all the filterStrideA[i] values are equal to 1. NOTE: only dilationA[i] == 1 is supported when the convolution algorithm is set to TOPSDNN_CONVOLUTION_FWD_ALGO_WINOGRAD or TOPSDNN_CONVOLUTION_FWD_ALGO_WINOGRAD_NONFUSED (not supported now).

  • mode – Input. Selects between TOPSDNN_CONVOLUTION and TOPSDNN_CROSS_CORRELATION. Currently only TOPSDNN_CROSS_CORRELATION is supported, for the common machine learning use case.

  • dataType – Input. Selects the data type in which the computation will be done.

Returns

TOPSDNN_STATUS_SUCCESS The object was set successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM At least one of the following conditions is met:
  • The descriptor convDesc is NULL.

  • arrayLength is negative.

  • The enumerant mode has an invalid value.

  • The enumerant dataType has an invalid value.

  • One of the elements of padA is strictly negative.

  • One of the elements of filterStrideA is negative or zero.

  • One of the elements of dilationA is negative or zero.

Returns

TOPSDNN_STATUS_NOT_SUPPORTED arrayLength is greater than TOPSDNN_DIM_MAX.
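
The ranges documented for padA, filterStrideA, and dilationA interact with the group count and with the Winograd algorithm, so it can help to collect them in one host-side check. The sketch below encodes only the limits stated in the parameter descriptions above. The names group_kind_t and conv_nd_params_supported are hypothetical, and GROUP_INPUT_FEATURE is this sketch's reading of "GroupCount=Input Feature".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical classification of the group count, following the three
 * cases named in the parameter descriptions. */
typedef enum { GROUP_ONE, GROUP_INPUT_FEATURE, GROUP_OTHER } group_kind_t;

/* Returns true when the arguments fall inside the documented ranges. */
bool conv_nd_params_supported(int array_length, const int pad[],
                              const int stride[], const int dilation[],
                              group_kind_t group, bool winograd)
{
    assert(pad != NULL && stride != NULL && dilation != NULL);
    /* Only 2D and 3D convolutions are documented as supported. */
    if (array_length != 2 && array_length != 3) return false;

    int max_pad = winograd ? 1 : 25;
    int max_stride = (group == GROUP_OTHER) ? 2 : 7;
    int max_dilation = (group == GROUP_ONE) ? 25
                     : (group == GROUP_INPUT_FEATURE) ? 7 : 2;
    bool any_stride = false, any_dilation = false;

    for (int i = 0; i < array_length; ++i) {
        if (pad[i] < 0 || pad[i] > max_pad) return false;
        if (stride[i] < 1 || stride[i] > max_stride) return false;
        if (dilation[i] < 1 || dilation[i] > max_dilation) return false;
        /* Winograd requires unit stride and unit dilation. */
        if (winograd && (stride[i] != 1 || dilation[i] != 1)) return false;
        if (stride[i] > 1) any_stride = true;
        if (dilation[i] > 1) any_dilation = true;
    }
    /* Stride > 1 and dilation > 1 cannot be combined. */
    return !(any_stride && any_dilation);
}
```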

topsdnnStatus_t TOPSDNN_EXPORT topsdnnGetConvolutionNdDescriptor (const topsdnnConvolutionDescriptor_t convDesc, int arrayLengthRequested, int *arrayLength, int padA[], int filterStrideA[], int dilationA[], topsdnnConvolutionMode_t *mode, topsdnnDataType_t *dataType)

This function queries a previously initialized convolution descriptor object.

Parameters
  • convDesc – Input. Handle to a previously created convolution descriptor.

  • arrayLengthRequested – Input. Dimension of the expected convolution descriptor. It is also the minimum size of the arrays padA, filterStrideA, and dilationA in order to be able to hold the results.

  • arrayLength – Output. Actual dimension of the convolution descriptor.

  • padA – Output. Array of dimension of at least arrayLengthRequested that will be filled with the padding parameters from the provided convolution descriptor.

  • filterStrideA – Output. Array of dimension of at least arrayLengthRequested that will be filled with the filter stride from the provided convolution descriptor.

  • dilationA – Output. Array of dimension of at least arrayLengthRequested that will be filled with the dilation parameters from the provided convolution descriptor.

  • mode – Output. Convolution mode of the provided descriptor.

  • dataType – Output. Datatype of the provided descriptor.

Returns

TOPSDNN_STATUS_SUCCESS The query was successful.

Returns

TOPSDNN_STATUS_BAD_PARAM At least one of the following conditions is met: The descriptor convDesc is NULL. The arrayLengthRequested is negative.

Returns

TOPSDNN_STATUS_NOT_SUPPORTED The arrayLengthRequested is greater than TOPSDNN_DIM_MAX-2.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnSetConvolutionGroupCount (topsdnnConvolutionDescriptor_t convDesc, int groupCount)

This function allows the user to specify the number of groups to be used in the associated convolution.

Parameters
  • convDesc – Input/Output. Handle to a previously created convolution descriptor.

  • groupCount – Integer larger than 1.

Returns

TOPSDNN_STATUS_SUCCESS The group count was set successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM An invalid convolution descriptor was provided.
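
The group count constrains the channel counts of the tensors used with the descriptor. The BAD_PARAM conditions listed for topsdnnConvolutionForward require the output channel count to be a multiple of the group count, and the filter's input feature maps to match the input per group. A minimal sketch of that shape check follows. The function grouped_conv_shapes_ok is hypothetical, and the divisibility requirement on the input channels is an assumption based on common grouped-convolution practice.

```c
#include <assert.h>
#include <stdbool.h>

/* Checks channel counts for a grouped convolution.
 * filter_in_channels is the per-group input channel count stored in the
 * filter descriptor. */
bool grouped_conv_shapes_ok(int in_channels, int out_channels,
                            int filter_in_channels, int group_count)
{
    assert(group_count >= 1);
    /* Assumption: input channels split evenly across groups. */
    if (in_channels % group_count != 0) return false;
    /* Documented: output channels must be a multiple of the group count. */
    if (out_channels % group_count != 0) return false;
    /* Documented: input feature maps per group must match the filter. */
    return filter_in_channels == in_channels / group_count;
}
```

For example, 64 input channels with a group count of 2 call for a filter with 32 input channels per group.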

topsdnnStatus_t TOPSDNN_EXPORT topsdnnGetConvolutionGroupCount (const topsdnnConvolutionDescriptor_t convDesc, int *groupCount)

This function returns the group count specified in the given convolution descriptor.

Parameters
  • convDesc – Input. Handle to a previously created convolution descriptor.

  • groupCount – Output. Group count previously set with topsdnnSetConvolutionGroupCount.

Returns

TOPSDNN_STATUS_SUCCESS The group count was returned successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM An invalid convolution descriptor was provided.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnConvolutionForward (topsdnnHandle_t handle, const void *alpha, const topsdnnTensorDescriptor_t xDesc, const void *x, const topsdnnFilterDescriptor_t wDesc, const void *w, const topsdnnConvolutionDescriptor_t convDesc, topsdnnConvolutionFwdAlgo_t algo, void *workSpace, size_t workSpaceSizeInBytes, const void *beta, const topsdnnTensorDescriptor_t yDesc, void *y)

This function executes convolutions or cross-correlations over x using filters specified with w, returning results in y. Scaling factors alpha and beta can be used to scale the input tensor and the output tensor respectively.

Parameters
  • handle – Input. Handle to a previously created TopsDNN context.

  • alpha, beta – Input. Pointers to scaling factors (in host memory) used to blend the computation result with the prior value in the output layer as follows: dstValue = alpha[0]*result + beta[0]*priorDstValue. When algo is TOPSDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_GEMM, alpha is supported when the data type is FP32, FP16, or INT8. When algo is TOPSDNN_CONVOLUTION_FWD_ALGO_WINOGRAD, alpha is supported when the data type is FP32 or FP16. beta is not currently supported. Set alpha to 1 and beta to 0 in unsupported conditions.

  • xDesc – Input. Handle to a previously initialized tensor descriptor.

  • x – Input. Data pointer to GCU memory associated with the tensor descriptor xDesc.

  • wDesc – Input. Handle to a previously initialized filter descriptor.

  • w – Input. Data pointer to GCU memory associated with the filter descriptor wDesc.

  • convDesc – Input. Previously initialized convolution descriptor.

  • algo – Input. Enumerant that specifies which convolution algorithm should be used to compute the results. Currently only TOPSDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_GEMM, TOPSDNN_CONVOLUTION_FWD_ALGO_WINOGRAD, and TOPSDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM (conv3d) are supported.

  • workSpace – Input. Data pointer to GCU memory to a workspace needed to be able to execute the specified algorithm. If no workspace is needed for a particular algorithm, that pointer can be NULL.

  • workSpaceSizeInBytes – Input. Specifies the size in bytes of the provided workSpace.

  • yDesc – Input. Handle to a previously initialized tensor descriptor.

  • y – Input/Output. Data pointer to GCU memory associated with the tensor descriptor yDesc that carries the result of the convolution.

Returns

TOPSDNN_STATUS_SUCCESS The operation was launched successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM At least one of the following conditions is met:
  • At least one of the following is NULL: handle, xDesc, wDesc, convDesc, yDesc, x, w, y, alpha, beta.

  • xDesc and yDesc have a non-matching number of dimensions.

  • xDesc and wDesc have a non-matching number of dimensions.

  • xDesc has fewer than three dimensions.

  • xDesc’s number of dimensions is not equal to convDesc array length + 2.

  • xDesc and wDesc have a non-matching number of input feature maps per image (or group in case of grouped convolutions).

  • yDesc or wDesc indicate an output channel count that isn’t a multiple of group count (if group count has been set in convDesc).

  • xDesc, wDesc, and yDesc have a non-matching data type.

  • For some spatial dimension, wDesc has a spatial size that is larger than the input spatial size (including zero-padding size).

Returns

TOPSDNN_STATUS_NOT_SUPPORTED At least one of the following conditions is met:
  • xDesc or yDesc have negative tensor striding.

  • xDesc, wDesc, or yDesc has a number of dimensions that is not 4 or 5.

  • yDesc spatial sizes do not match the expected size.

  • The chosen algo does not support the parameters provided; see above for an exhaustive list of parameters supported for each algo.

Returns

TOPSDNN_STATUS_MAPPING_ERROR An error occurs during the texture object creation associated with the filter data.

Returns

TOPSDNN_STATUS_EXECUTION_FAILED The function failed to launch on the GCU.
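
The blending rule dstValue = alpha[0]*result + beta[0]*priorDstValue given for alpha and beta can be written out element by element. The following host-side sketch applies the formula literally to float buffers. The function blend_output is hypothetical and only illustrates the arithmetic, not how the library applies the scaling on the device.

```c
#include <assert.h>
#include <stddef.h>

/* Applies dst[i] = alpha * result[i] + beta * dst[i] to count elements.
 * dst holds the prior output values on entry and the blended values on exit. */
void blend_output(float alpha, float beta, const float *result,
                  float *dst, size_t count)
{
    assert(result != NULL && dst != NULL);
    for (size_t i = 0; i < count; ++i) {
        dst[i] = alpha * result[i] + beta * dst[i];
    }
}
```

With alpha = 1 and beta = 0, the values recommended for unsupported conditions, the output is the computation result and the prior contents of the output buffer do not contribute.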

topsdnnStatus_t TOPSDNN_EXPORT topsdnnConvolutionBiasActivationForward (topsdnnHandle_t handle, const void *alpha1, const topsdnnTensorDescriptor_t xDesc, const void *x, const topsdnnFilterDescriptor_t wDesc, const void *w, const topsdnnConvolutionDescriptor_t convDesc, topsdnnConvolutionFwdAlgo_t algo, void *workSpace, size_t workSpaceSizeInBytes, const void *alpha2, const topsdnnTensorDescriptor_t zDesc, const void *z, const topsdnnTensorDescriptor_t biasDesc, const void *bias, const topsdnnActivationDescriptor_t activationDesc, const topsdnnTensorDescriptor_t yDesc, void *y)

topsdnnStatus_t TOPSDNN_EXPORT topsdnnGetConvolution2dForwardOutputDim (const topsdnnConvolutionDescriptor_t convDesc, const topsdnnTensorDescriptor_t inputTensorDesc, const topsdnnFilterDescriptor_t filterDesc, int *n, int *c, int *h, int *w)

This function returns the dimensions of the resulting 4D tensor of a 2D convolution, given the convolution descriptor, the input tensor descriptor, and the filter descriptor. This function can help to set up the output tensor and allocate the proper amount of memory before launching the actual convolution.

Parameters
  • convDesc – a previously created convolution descriptor.

  • inputTensorDesc – a previously initialized tensor descriptor.

  • filterDesc – a previously initialized filter descriptor.

  • n – Number of output images.

  • c – Number of output feature maps per image.

  • h – Height of each output feature map.

  • w – Width of each output feature map.
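
The reference does not state how h and w are derived. As a sketch, assuming the conventional output-size formula used by convolution libraries of this style, each spatial output dimension can be computed from the input size, padding, filter size, stride, and dilation. The function conv_out_dim is hypothetical.

```c
#include <assert.h>

/* Conventional output size along one spatial dimension:
 *   out = 1 + (in + 2*pad - ((filter - 1)*dilation + 1)) / stride
 * Assumes the dilated filter fits inside the padded input. */
int conv_out_dim(int in, int pad, int filter, int stride, int dilation)
{
    assert(in >= 1 && pad >= 0 && filter >= 1);
    assert(stride >= 1 && dilation >= 1);
    int effective = (filter - 1) * dilation + 1; /* dilated filter extent */
    assert(in + 2 * pad >= effective);
    return 1 + (in + 2 * pad - effective) / stride;
}
```

Under this assumption, a 224-wide input with pad 3, a 7-wide filter, stride 2, and dilation 1 gives a 112-wide output. The value returned by the library should be treated as authoritative when sizing the output tensor.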

Returns

TOPSDNN_STATUS_SUCCESS The operation was launched successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM At least one of the following conditions is met:
  • At least one of the following is NULL: handle, xDesc, wDesc, convDesc, yDesc, zDesc, biasDesc, activationDesc, xData, wData, yData, zData, bias, alpha1, alpha2.

  • xDesc and yDesc have a non-matching number of dimensions.

  • xDesc and wDesc have a non-matching number of dimensions.

  • xDesc and zDesc have a non-matching number of dimensions.

  • xDesc has fewer than three dimensions.

  • xDesc’s number of dimensions is not equal to convDesc array length + 2.

  • xDesc and wDesc have a non-matching number of input feature maps per image (or group in case of grouped convolutions).

  • yDesc or wDesc indicate an output channel count that isn’t a multiple of group count (if group count has been set in convDesc).

  • xDesc, wDesc, and yDesc have a non-matching data type.

  • For some spatial dimension, wDesc has a spatial size that is larger than the input spatial size (including zero-padding size).

Returns

TOPSDNN_STATUS_NOT_SUPPORTED At least one of the following conditions is met:
  • xDesc or yDesc have negative tensor striding.

  • xDesc, wDesc, or yDesc has a number of dimensions that is not 4 or 5.

  • yDesc spatial sizes do not match the expected size.

  • The chosen algo does not support the parameters provided; see above for an exhaustive list of parameters supported for each algo.

  • The second stride of biasDesc is not equal to one.

  • The first dimension of biasDesc is not equal to one.

  • The second dimension of biasDesc and the first dimension of filterDesc are not equal.

  • The data type of biasDesc does not correspond to the data type of yDesc as listed in the above data types table.

  • zDesc and destDesc do not match.

  • Algo is not TOPSDNN_CONVOLUTION_FWD_ALGO_WINOGRAD.

Returns

TOPSDNN_STATUS_MAPPING_ERROR An error occurs during the texture object creation associated with the filter data.

Returns

TOPSDNN_STATUS_EXECUTION_FAILED The function failed to launch on the GCU.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnGetConvolutionForwardWorkspaceSize (topsdnnHandle_t handle, const topsdnnTensorDescriptor_t xDesc, const topsdnnFilterDescriptor_t wDesc, const topsdnnConvolutionDescriptor_t convDesc, const topsdnnTensorDescriptor_t yDesc, topsdnnConvolutionFwdAlgo_t algo, size_t *sizeInBytes)

This function returns the amount of DTU memory workspace the user needs to allocate to be able to call topsdnnConvolutionForward() with the specified algorithm.

Parameters
  • handle – a previously created topsdnn context.

  • xDesc – the previously initialized x tensor descriptor.

  • wDesc – a previously initialized filter descriptor.

  • convDesc – Previously initialized convolution descriptor.

  • yDesc – the previously initialized y tensor descriptor.

  • algo – Enumerant that specifies the chosen convolution algorithm.

  • sizeInBytes – Amount of DTU memory needed as workspace to be able to execute a forward convolution with the specified algo.

Returns

TOPSDNN_STATUS_BAD_PARAM One or more of the descriptors has not been created correctly, or there is a mismatch between the feature maps of xDesc and wDesc.

Returns

TOPSDNN_STATUS_SUCCESS The query was successful.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnSetConvolution2dDescriptor (topsdnnConvolutionDescriptor_t convDesc, int pad_h, int pad_w, int u, int v, int dilation_h, int dilation_w, topsdnnConvolutionMode_t mode, topsdnnDataType_t dataType)

This function initializes a previously created convolution descriptor object into a 2D correlation. This function assumes that the tensor and filter descriptors correspond to the forward convolution path and checks if their settings are valid. That same convolution descriptor can be reused in the backward path provided it corresponds to the same layer.

Parameters
  • convDesc – a previously created convolution descriptor.

  • pad_h – Zero-padding height: number of rows of zeros implicitly concatenated onto the top and onto the bottom of input images.

  • pad_w – Zero-padding width: number of columns of zeros implicitly concatenated onto the left and onto the right of input images.

  • u – Vertical filter stride.

  • v – Horizontal filter stride.

  • dilation_h – Filter height dilation.

  • dilation_w – Filter width dilation.

  • mode – Selects between TOPSDNN_CONVOLUTION and TOPSDNN_CROSS_CORRELATION.

  • dataType – Selects the data type in which the computation will be done.

Returns

TOPSDNN_STATUS_SUCCESS The object was set successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM At least one of the following conditions is met:
  • The descriptor convDesc is NULL.

  • One of the parameters pad_h, pad_w is strictly negative.

  • One of the parameters u, v is negative or zero.

  • One of the parameters dilation_h, dilation_w is negative or zero.

  • The parameter mode has an invalid enumerant value.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnGetConvolution2dDescriptor (const topsdnnConvolutionDescriptor_t convDesc, int *pad_h, int *pad_w, int *u, int *v, int *dilation_h, int *dilation_w, topsdnnConvolutionMode_t *mode, topsdnnDataType_t *dataType)

This function queries a previously initialized 2D convolution descriptor object.

Parameters
  • convDesc – a previously created convolution descriptor.

  • pad_h – Zero-padding height: number of rows of zeros implicitly concatenated onto the top and onto the bottom of input images.

  • pad_w – Zero-padding width: number of columns of zeros implicitly concatenated onto the left and onto the right of input images.

  • u – Vertical filter stride.

  • v – Horizontal filter stride.

  • dilation_h – Filter height dilation.

  • dilation_w – Filter width dilation.

  • mode – Convolution mode of the provided descriptor.

  • dataType – Data type of the provided descriptor.

Returns

TOPSDNN_STATUS_SUCCESS The query was successful.

Returns

TOPSDNN_STATUS_BAD_PARAM The parameter convDesc is NULL.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnFindConvolutionForwardAlgorithm (topsdnnHandle_t handle, const topsdnnTensorDescriptor_t xDesc, const topsdnnFilterDescriptor_t wDesc, const topsdnnConvolutionDescriptor_t convDesc, const topsdnnTensorDescriptor_t yDesc, const int requestedAlgoCount, int *returnedAlgoCount, topsdnnConvolutionFwdAlgoPerf_t *perfResults)

This function attempts all algorithms available for topsdnnConvolutionForward(). It will attempt both the provided convDesc mathType and TOPSDNN_DEFAULT_MATH (assuming the two differ).

Parameters
  • handle – a previously created topsdnn context.

  • xDesc – the previously initialized x tensor descriptor.

  • wDesc – a previously initialized filter descriptor.

  • convDesc – Previously initialized convolution descriptor.

  • yDesc – the previously initialized y tensor descriptor.

  • requestedAlgoCount – The maximum number of elements to be stored in perfResults.

  • returnedAlgoCount – The number of output elements stored in perfResults.

  • perfResults – A user-allocated array to store performance metrics sorted ascending by compute time.

Returns

TOPSDNN_STATUS_SUCCESS The query was successful.

Returns

TOPSDNN_STATUS_BAD_PARAM At least one of the following conditions is met:
  • handle is not allocated properly.

  • xDesc, wDesc, or yDesc are not allocated properly.

  • xDesc, wDesc, or yDesc has fewer than 1 dimension.

  • Either returnedAlgoCount or perfResults is NULL.

  • requestedAlgoCount is less than 1.

Returns

TOPSDNN_STATUS_ALLOC_FAILED This function was unable to allocate memory to store sample input, filters and output.

Returns

TOPSDNN_STATUS_INTERNAL_ERROR At least one of the following conditions is met:
  • The function was unable to allocate necessary timing objects.

  • The function was unable to deallocate necessary timing objects.

  • The function was unable to deallocate sample input, filters and output.
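
Because perfResults is sorted ascending by compute time, a caller can walk it from the front and take the first entry that fits a workspace budget. The fields of topsdnnConvolutionFwdAlgoPerf_t are not listed in this reference, so the sketch below uses a hypothetical stand-in struct, algo_perf_t, with assumed fields. Only the selection logic is the point.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one perfResults entry; the real structure's
 * fields are not documented here. */
typedef struct {
    int algo;               /* algorithm identifier (assumed field) */
    float time_ms;          /* measured compute time (assumed field) */
    size_t workspace_bytes; /* workspace requirement (assumed field) */
} algo_perf_t;

/* Returns the index of the fastest entry whose workspace fits the limit,
 * or -1 if none fits. Relies on results being sorted by ascending time. */
int pick_algo(const algo_perf_t *results, int count, size_t workspace_limit)
{
    assert(results != NULL || count == 0);
    for (int i = 0; i < count; ++i) {
        if (results[i].workspace_bytes <= workspace_limit) {
            return i;
        }
    }
    return -1;
}
```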

topsdnnStatus_t TOPSDNN_EXPORT topsdnnConvolutionBackwardData (topsdnnHandle_t handle, const void *alpha, const topsdnnFilterDescriptor_t wDesc, const void *w, const topsdnnTensorDescriptor_t dyDesc, const void *dy, const topsdnnConvolutionDescriptor_t convDesc, topsdnnConvolutionBwdDataAlgo_t algo, void *workSpace, size_t workSpaceSizeInBytes, const void *beta, const topsdnnTensorDescriptor_t dxDesc, void *dx)

This function computes the convolution data gradient of the tensor dy, where y is the output of the forward convolution in topsdnnConvolutionForward(). It uses the specified algo, and returns the results in the output tensor dx. Scaling factors alpha and beta can be used to scale the computed result or accumulate with the current dx.

Parameters
  • handle – Input. Handle to a previously created TopsDNN context. For more information, refer to topsdnnHandle_t.

  • alpha, beta – Input. Pointers to scaling factors (in host memory) used to blend the computation result with prior value in the output layer as follows: dstValue = alpha[0]*result + beta[0]*priorDstValue

  • wDesc – Input. Handle to a previously initialized filter descriptor. For more information, refer to topsdnnFilterDescriptor_t.

  • w – Input. Data pointer to GCU memory associated with the filter descriptor wDesc.

  • dyDesc – Input. Handle to the previously initialized input differential tensor descriptor. For more information, refer to topsdnnTensorDescriptor_t.

  • dy – Input. Data pointer to GCU memory associated with the input differential tensor descriptor dyDesc.

  • convDesc – Input. Previously initialized convolution descriptor. For more information, refer to topsdnnConvolutionDescriptor_t.

  • algo – Input. Enumerant that specifies which backward data convolution algorithm should be used to compute the results. For more information, refer to topsdnnConvolutionBwdDataAlgo_t.

  • workSpace – Input. Data pointer to GCU memory to a workspace needed to be able to execute the specified algorithm. If no workspace is needed for a particular algorithm, that pointer can be NULL.

  • workSpaceSizeInBytes – Input. Specifies the size in bytes of the provided workSpace.

  • dxDesc – Input. Handle to the previously initialized output tensor descriptor.

  • dx – Input/Output. Data pointer to GCU memory associated with the output tensor descriptor dxDesc that carries the result.

Returns

TOPSDNN_STATUS_SUCCESS The operation was launched successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM At least one of the following conditions is met:
  • At least one of the following is NULL: handle, dyDesc, wDesc, convDesc, dxDesc, dy, w, dx, alpha, beta.

  • wDesc and dyDesc have a non-matching number of dimensions.

  • wDesc and dxDesc have a non-matching number of dimensions.

  • wDesc has fewer than three dimensions.

  • wDesc, dxDesc, and dyDesc have a non-matching data type.

  • wDesc and dxDesc have a non-matching number of input feature maps per image (or group in case of grouped convolutions).

  • dyDesc spatial sizes do not match with the expected size as determined by topsdnnGetConvolutionNdForwardOutputDim.

Returns

TOPSDNN_STATUS_NOT_SUPPORTED At least one of the following conditions is met:
  • dyDesc or dxDesc have negative tensor striding.

  • dyDesc, wDesc, or dxDesc has a number of dimensions that is not 4 or 5.

  • The chosen algo does not support the parameters provided; see above for an exhaustive list of parameters that support each algo.

  • dyDesc or wDesc indicate an output channel count that isn’t a multiple of group count (if group count has been set in convDesc).

Returns

TOPSDNN_STATUS_MAPPING_ERROR An error occurs during the texture object creation associated with the filter data.

Returns

TOPSDNN_STATUS_EXECUTION_FAILED The function failed to launch on the GCU.
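
To illustrate what the data gradient is, here is a 1-D host-side reference. It assumes a cross-correlation forward pass of the form y[i] = sum over k of w[k] * x[i*stride + k - pad], and applies the documented alpha/beta blending to dx. The function conv1d_backward_data is hypothetical and makes no claim about how the library computes the result.

```c
#include <assert.h>
#include <stddef.h>

/* 1-D data gradient for a cross-correlation forward pass
 *   y[i] = sum_k w[k] * x[i*stride + k - pad].
 * Each dx[j] gathers w[k] * dy[i] over all (i, k) that touched x[j],
 * then blends with the prior dx[j] using alpha and beta. */
void conv1d_backward_data(const float *w, int k_len,
                          const float *dy, int out_len,
                          int stride, int pad, float alpha, float beta,
                          float *dx, int in_len)
{
    assert(w != NULL && dy != NULL && dx != NULL);
    assert(stride >= 1 && pad >= 0);
    for (int j = 0; j < in_len; ++j) {
        float grad = 0.0f;
        for (int i = 0; i < out_len; ++i) {
            /* Filter tap that connected x[j] to y[i] in the forward pass. */
            int k = j + pad - i * stride;
            if (k >= 0 && k < k_len) {
                grad += w[k] * dy[i];
            }
        }
        dx[j] = alpha * grad + beta * dx[j];
    }
}
```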

1.5. CnnTrainLibrary

This entity contains all routines related to convolutional neural networks needed during training time.

1.5.1. CnnTrainFunctions

These are the API functions in the topsdnn_cnn_train library.

topsdnnStatus_t TOPSDNN_EXPORT topsdnnConvolutionBackwardFilter (topsdnnHandle_t handle, const void *alpha, const topsdnnTensorDescriptor_t xDesc, const void *x, const topsdnnTensorDescriptor_t dyDesc, const void *dy, const topsdnnConvolutionDescriptor_t convDesc, topsdnnConvolutionBwdFilterAlgo_t algo, void *workSpace, size_t workSpaceSizeInBytes, const void *beta, const topsdnnFilterDescriptor_t dwDesc, void *dw)

This function computes the convolution weight (filter) gradient of the tensor dy, where y is the output of the forward convolution in topsdnnConvolutionForward. It uses the specified algo, and returns the results in the output tensor dw. Scaling factors alpha and beta can be used to scale the computed result or accumulate with the current dw.

Parameters
  • handle – Input. Handle to a previously created TopsDNN context.

  • alpha, beta – Input. Pointers to scaling factors (in host memory) used to blend the computation result with prior value in the output layer as follows: dstValue = alpha[0]*result + beta[0]*priorDstValue

  • xDesc – Input. Handle to a previously initialized tensor descriptor.

  • x – Input. Data pointer to GCU memory associated with the tensor descriptor xDesc.

  • dyDesc – Input. Handle to the previously initialized input differential tensor descriptor.

  • dy – Input. Data pointer to GCU memory associated with the backpropagation gradient tensor descriptor dyDesc.

  • convDesc – Input. Previously initialized convolution descriptor.

  • algo – Input. Enumerant that specifies which convolution algorithm should be used to compute the results. Currently only TOPSDNN_CONVOLUTION_BWD_FILTER_ALGO_1 is supported.

  • workSpace – Input. Data pointer to GCU memory to a workspace needed to be able to execute the specified algorithm. If no workspace is needed for a particular algorithm, that pointer can be NULL.

  • workSpaceSizeInBytes – Input. Specifies the size in bytes of the provided workSpace.

  • dwDesc – Input. Handle to a previously initialized filter gradient descriptor.

  • dw – Input/Output. Data pointer to GCU memory associated with the filter gradient descriptor dwDesc that carries the result.

Returns

TOPSDNN_STATUS_SUCCESS The operation was launched successfully.

Returns

TOPSDNN_STATUS_BAD_PARAM At least one of the following conditions is met:
  • At least one of the following is NULL: handle, xDesc, dyDesc, convDesc, dwDesc, x, dy, dw, alpha, or beta.

  • xDesc and dyDesc have a non-matching number of dimensions.

  • xDesc and dwDesc have a non-matching number of dimensions.

  • xDesc has fewer than three dimensions.

  • xDesc, dyDesc, and dwDesc have a non-matching data type.

  • xDesc and dwDesc have a non-matching number of input feature maps per image (or group in case of grouped convolutions).

  • dyDesc or dwDesc indicate an output channel count that isn’t a multiple of group count (if group count has been set in convDesc).

Returns

TOPSDNN_STATUS_NOT_SUPPORTED At least one of the following conditions is met:
  • xDesc or dyDesc have negative tensor striding.

  • xDesc, dyDesc, or dwDesc has a number of dimensions that is not 4 or 5.

  • The chosen algo does not support the parameters provided; see above for an exhaustive list of parameters supported for each algo.

Returns

TOPSDNN_STATUS_MAPPING_ERROR An error occurs during the texture object creation associated with the filter data.

Returns

TOPSDNN_STATUS_EXECUTION_FAILED The function failed to launch on the GCU.
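
The filter gradient can be illustrated the same way in one dimension. The sketch below assumes a cross-correlation forward pass y[i] = sum over k of w[k] * x[i*stride + k - pad] and accumulates into dw using the documented alpha/beta blending. The function conv1d_backward_filter is hypothetical and serves as a reference for the arithmetic only.

```c
#include <assert.h>
#include <stddef.h>

/* 1-D filter gradient for a cross-correlation forward pass
 *   y[i] = sum_k w[k] * x[i*stride + k - pad].
 * Each dw[k] sums x[i*stride + k - pad] * dy[i] over all outputs i,
 * skipping positions that fall in the zero padding. */
void conv1d_backward_filter(const float *x, int in_len,
                            const float *dy, int out_len,
                            int stride, int pad, float alpha, float beta,
                            float *dw, int k_len)
{
    assert(x != NULL && dy != NULL && dw != NULL);
    assert(stride >= 1 && pad >= 0);
    for (int k = 0; k < k_len; ++k) {
        float grad = 0.0f;
        for (int i = 0; i < out_len; ++i) {
            int j = i * stride + k - pad; /* input index seen by tap k */
            if (j >= 0 && j < in_len) {
                grad += x[j] * dy[i];
            }
        }
        dw[k] = alpha * grad + beta * dw[k];
    }
}
```

Setting beta to 1 accumulates the new gradient onto the values already in dw, which matches the "accumulate with the current dw" use described for this function.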