2. API Function
2.1. Driver
This section describes the driver initialization and version query functions.
- 
TOPS_PUBLIC_API topsError_t topsInit(unsigned int flags)
- Explicitly initializes the TOPS runtime.
- Most TOPS APIs implicitly initialize the TOPS runtime. This API provides control over the timing of the initialization.
- 
TOPS_PUBLIC_API topsError_t topsDriverGetVersion(int *driverVersion)
- Returns the approximate TOPS driver version.
- The version is returned as (1000 * major + 10 * minor). For example, topsrider 2.2 is represented by 2020.
- Parameters
- driverVersion – [out] 
- Returns
- topsSuccess, topsErrorInvalidValue 
 
- 
TOPS_PUBLIC_API topsError_t topsRuntimeGetVersion(int *runtimeVersion)
- Returns the approximate TOPS runtime version.
- The version is returned as (1000 * major + 10 * minor). For example, topsrider 2.2 is represented by 2020.
- Parameters
- runtimeVersion – [out] 
- Returns
- topsSuccess, topsErrorInvalidValue 
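The packed version format described for topsDriverGetVersion and topsRuntimeGetVersion can be unpacked with plain integer arithmetic. The following is a minimal sketch; the helper names are hypothetical and are not part of the TOPS API.

```c
#include <assert.h>

/* Hypothetical helpers (not part of the TOPS API): split a packed version
 * of the form 1000 * major + 10 * minor into its components. */
static int version_major(int packed) { return packed / 1000; }
static int version_minor(int packed) { return (packed % 1000) / 10; }

/* Inverse of the helpers above. The minor number must fit below 100,
 * otherwise it would spill into the major field. */
static int version_pack(int major, int minor) {
    assert(minor >= 0 && minor < 100);
    return 1000 * major + 10 * minor;
}
```

In an application, the packed value would be the integer written by topsDriverGetVersion or topsRuntimeGetVersion; the documented example value 2020 unpacks to major 2, minor 2.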
 
- 
TOPS_PUBLIC_API topsError_t topsDeviceGet(topsDevice_t *device, int ordinal)
- Returns a handle to a compute device.
- Parameters
- device – [out] 
- ordinal – [in] 
 
- Returns
- topsSuccess, topsErrorInvalidDevice 
 
- 
TOPS_PUBLIC_API topsError_t topsDeviceComputeCapability(int *major, int *minor, topsDevice_t device)
- Returns the compute capability of the device.
- Parameters
- major – [out] 
- minor – [out] 
- device – [in] 
 
- Returns
- topsSuccess, topsErrorInvalidDevice 
 
- 
TOPS_PUBLIC_API topsError_t topsDeviceGetName(char *name, int len, topsDevice_t device)
- Returns an identifier string for the device.
- Warning: these versions are ignored.
- Parameters
- name – [out] 
- len – [in] 
- device – [in] 
 
- Returns
- topsSuccess, topsErrorInvalidDevice 
 
- 
TOPS_PUBLIC_API topsError_t topsDeviceGetPCIBusId(char *pciBusId, int len, int device)
- Returns a PCI bus id string for the device. This overload takes an int device id.
- Parameters
- pciBusId – [out] 
- len – [in] 
- device – [in] 
 
- Returns
- topsSuccess, topsErrorInvalidDevice 
 
- 
TOPS_PUBLIC_API topsError_t topsDeviceGetByPCIBusId(int *device, const char *pciBusId)
- Returns a handle to a compute device.
- Parameters
- device – [out] handle 
- pciBusId – [in] 
 
- Returns
- topsSuccess, topsErrorInvalidDevice, topsErrorInvalidValue 
 
- 
TOPS_PUBLIC_API topsError_t topsDeviceTotalMem(size_t *bytes, topsDevice_t device)
- Returns the total amount of memory on the device.
- Parameters
- bytes – [out] 
- device – [in] 
 
- Returns
- topsSuccess, topsErrorInvalidDevice 
 
2.2. Device
This section describes the device management functions of the TOPS runtime API.
- 
TOPS_PUBLIC_API topsError_t topsDeviceSynchronize(void)
- Waits on all active streams on the current device.
- When this command is invoked, the host thread blocks until all commands in the streams associated with the device have completed. TOPS does not yet support multiple blocking modes.
- Returns
- topsSuccess 
 
- 
TOPS_PUBLIC_API topsError_t topsDeviceReset(void)
- Discards the state of the current device and resets it to a fresh state.
- Calling this function deletes all streams created, memory allocated, kernels running, and events created. Make sure that no other thread is using the device or the streams, memory, kernels, or events associated with the current device.
- Returns
- topsSuccess 
 
- 
TOPS_PUBLIC_API topsError_t topsSetDevice(int deviceId)
- Sets the default device to be used for subsequent TOPS API calls from this thread.
- Sets deviceId as the default device for the calling host thread. Valid device ids are 0…(topsGetDeviceCount()-1).
- Many TOPS APIs implicitly use the “default device”:
- Any device memory subsequently allocated from this host thread (using topsMalloc) will be allocated on the default device.
- Any streams or events created from this host thread will be associated with the default device.
- Any kernels launched from this host thread (using topsLaunchKernel) will be executed on the default device (unless a specific stream is specified, in which case the device associated with that stream will be used).
- This function may be called from any host thread. Multiple host threads may use the same device. This function does no synchronization with the previous or new device, and has very little runtime overhead. Applications can use topsSetDevice to quickly switch the default device before making a TOPS runtime call which uses the default device.
- The default device is stored in thread-local storage for each thread. Thread-pool implementations may inherit the default device of the previous thread. A good practice is to always call topsSetDevice at the start of a TOPS coding sequence to establish a known standard device.
- Parameters
- deviceId – [in] Valid device in range 0…(topsGetDeviceCount()-1). 
- Returns
- topsSuccess, topsErrorInvalidDevice, topsErrorDeviceAlreadyInUse
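The per-thread default device described for topsSetDevice can be pictured with a small toy model. This is only a sketch of the documented behavior, not the TOPS implementation: every model_ name, the error values, the device count of 2, and the initial default of device 0 are assumptions made for illustration.

```c
#include <assert.h>

/* Toy model only. The model_ names and error values are placeholders,
 * not the real topsError_t constants. */
enum model_error { MODEL_SUCCESS = 0, MODEL_INVALID_DEVICE = 1 };

/* Assumption: pretend the system exposes two devices. */
static int model_device_count = 2;

/* One copy per host thread, as with thread-local storage.
 * Assumption: a thread starts with device 0 as its default. */
static _Thread_local int model_default_device = 0;

/* Mirrors the documented range check: valid ids are 0 .. count-1.
 * Setting the device is just a store, with no synchronization. */
static enum model_error model_set_device(int device_id) {
    if (device_id < 0 || device_id >= model_device_count)
        return MODEL_INVALID_DEVICE;
    model_default_device = device_id;
    return MODEL_SUCCESS;
}

/* Returns the calling thread's default device. */
static int model_get_device(void) { return model_default_device; }
```

In this model a rejected id leaves the thread's default unchanged, which is why establishing the device explicitly at the start of a thread's work gives a known starting point.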
 
- 
TOPS_PUBLIC_API topsError_t topsGetDevice(int *deviceId)
- Returns the default device id for the calling host thread.
- TOPS maintains a default device for each thread using thread-local storage. This device is used implicitly for TOPS runtime APIs called by this thread. topsGetDevice returns in *deviceId the default device for the calling host thread.
- Parameters
- deviceId – [out] *deviceId is written with the default device 
- Returns
- topsSuccess, topsErrorInvalidDevice, topsErrorInvalidValue 
 
- 
TOPS_PUBLIC_API topsError_t topsGetDeviceCount(int *count)
- Returns the number of compute-capable devices.
- Returns in *count the number of devices that are able to run compute commands. If there are no such devices, topsGetDeviceCount returns topsErrorNoDevice. If one or more devices are found, topsGetDeviceCount returns topsSuccess.
- Parameters
- count – [out] Returns the number of compute-capable devices.
- Returns
- topsSuccess, topsErrorNoDevice 
 
- 
TOPS_PUBLIC_API topsError_t topsDeviceGetAttribute(int *pi, topsDeviceAttribute_t attr, int deviceId)
- Queries a specific device attribute.
- Parameters
- pi – [out] pointer to value to return 
- attr – [in] attribute to query 
- deviceId – [in] which device to query for information 
 
- Returns
- topsSuccess, topsErrorInvalidDevice, topsErrorInvalidValue 
 
- 
TOPS_PUBLIC_API topsError_t topsGetDeviceProperties(topsDeviceProp_t *prop, int deviceId)
- Returns device properties.
- Populates *prop with information for the specified device.
- Parameters
- prop – [out] written with device properties 
- deviceId – [in] which device to query for information 
 
- Returns
- topsSuccess, topsErrorInvalidDevice 
 
- 
TOPS_PUBLIC_API topsError_t topsDeviceGetLimit(size_t *pValue, enum topsLimit_t limit)
- Gets a resource limit of the current device.
- Parameters
- pValue – [out] 
- limit – [in] 
 
- Returns
- topsSuccess, topsErrorUnsupportedLimit, topsErrorInvalidValue
- Note: Currently, only topsLimitMallocHeapSize is available.
 
- 
TOPS_PUBLIC_API topsError_t topsGetDeviceFlags(unsigned int *flags)
- Gets the flags set for the current device.
- Parameters
- flags – [out] 
- Returns
- topsSuccess, topsErrorInvalidDevice, topsErrorInvalidValue 
 
- 
TOPS_PUBLIC_API topsError_t topsSetDeviceFlags(unsigned flags)
- Changes the behavior of the current device according to the flags passed.
- topsDeviceScheduleSpin: The TOPS runtime will actively spin in the thread which submitted the work until the command completes. This offers the lowest latency, but will consume a CPU core and may increase power.
- topsDeviceScheduleYield: The TOPS runtime will yield the CPU to the system so that other tasks can use it. This may increase the latency to detect completion but will consume less power and is friendlier to other tasks in the system.
- topsDeviceScheduleBlockingSync: This is a synonym for topsDeviceScheduleYield.
- topsDeviceScheduleAuto: Use a heuristic to select between Spin and Yield modes. If the number of TOPS contexts is greater than the number of logical processors in the system, use Spin scheduling. Otherwise use Yield scheduling.
- topsDeviceMapHost: Allow mapping host memory. On GCU, this is always allowed and the flag is ignored.
- topsDeviceLmemResizeToMax: Warning: GCU silently ignores this flag.
- Parameters
- flags – [in] The schedule flags impact how TOPS waits for the completion of a command running on a device. 
- Returns
- topsSuccess, topsErrorInvalidDevice, topsErrorSetOnActiveProcess
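The topsDeviceScheduleAuto rule, exactly as worded in the description of topsSetDeviceFlags, reduces to a single comparison. The sketch below only restates that sentence in code; the enum and function names are hypothetical, and the actual runtime heuristic may differ.

```c
#include <assert.h>

/* Hypothetical names for illustration; not TOPS flag values. */
enum sched_mode { SCHED_SPIN, SCHED_YIELD };

/* Restates the documented rule: more contexts than logical processors
 * selects Spin, anything else selects Yield. */
static enum sched_mode model_auto_schedule(int num_contexts, int num_logical_cpus) {
    assert(num_contexts >= 0 && num_logical_cpus > 0);
    return num_contexts > num_logical_cpus ? SCHED_SPIN : SCHED_YIELD;
}
```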
 
- 
TOPS_PUBLIC_API topsError_t topsChooseDevice(int *device, const topsDeviceProp_t *prop)
- Returns the device that matches the given topsDeviceProp_t.
- Parameters
- device – [out] The device ID 
- prop – [in] The device properties pointer 
 
- Returns
- topsSuccess, topsErrorInvalidValue 
 
- 
TOPS_PUBLIC_API topsError_t topsIpcGetMemHandle(topsIpcMemHandle_t *handle, void *devPtr)
- Gets an interprocess memory handle for an existing device memory allocation.
- Takes a pointer to the base of an existing device memory allocation created with topsMalloc and exports it for use in another process. This is a lightweight operation and may be called multiple times on an allocation without adverse effects.
- If a region of memory is freed with topsFree and a subsequent call to topsMalloc returns memory with the same device address, topsIpcGetMemHandle will return a unique handle for the new memory.
- Parameters
- handle – - Pointer to user allocated topsIpcMemHandle to return the handle in. 
- devPtr – - Base pointer to previously allocated device memory 
 
- Returns
- topsSuccess, topsErrorInvalidHandle, topsErrorOutOfMemory, topsErrorMapFailed
 
- 
TOPS_PUBLIC_API topsError_t topsIpcOpenMemHandle(void **devPtr, topsIpcMemHandle_t handle, unsigned int flags)
- Opens an interprocess memory handle exported from another process and returns a device pointer usable in the local process.
- Maps memory exported from another process with topsIpcGetMemHandle into the current device address space. For contexts on different devices, topsIpcOpenMemHandle can attempt to enable peer access between the devices as if the user called topsDeviceEnablePeerAccess. This behavior is controlled by the topsIpcMemLazyEnablePeerAccess flag. topsDeviceCanAccessPeer can determine if a mapping is possible.
- Contexts that may open topsIpcMemHandles are restricted in the following way: topsIpcMemHandles from each device in a given process may only be opened by one context per device per other process.
- Memory returned from topsIpcOpenMemHandle must be freed with topsIpcCloseMemHandle.
- Calling topsFree on an exported memory region before calling topsIpcCloseMemHandle in the importing context will result in undefined behavior.
- Note: No guarantees are made about the address returned in *devPtr. In particular, multiple processes may not receive the same address for the same handle.
- Parameters
- devPtr – - Returned device pointer 
- handle – - topsIpcMemHandle to open 
- flags – - Flags for this operation. currently only flag 0 is supported 
 
- Returns
- topsSuccess, topsErrorMapFailed, topsErrorInvalidHandle, topsErrorTooManyPeers 
 
- 
TOPS_PUBLIC_API topsError_t topsIpcCloseMemHandle(void *devPtr)
- Closes memory mapped with topsIpcOpenMemHandle.
- Unmaps memory returned by topsIpcOpenMemHandle. The original allocation in the exporting process as well as imported mappings in other processes will be unaffected.
- Any resources used to enable peer access will be freed if this is the last mapping using them.
- Parameters
- devPtr – - Device pointer returned by topsIpcOpenMemHandle 
- Returns
- topsSuccess, topsErrorMapFailed, topsErrorInvalidHandle
 
- 
TOPS_PUBLIC_API topsError_t topsIpcGetEventHandle(topsIpcEventHandle_t *handle, topsEvent_t event)
- Gets an opaque interprocess handle for an event.
- This opaque handle may be copied into other processes and opened with topsIpcOpenEventHandle. topsEventRecord, topsEventSynchronize, and topsEventQuery may then be used in remote processes. topsStreamWaitEvent can be called only in the local process. Operations on the imported event after the exported event has been freed with topsEventDestroy will result in undefined behavior.
- Parameters
- handle – [out] Pointer to topsIpcEventHandle to return the opaque event handle 
- event – [in] Event allocated with topsEventInterprocess and topsEventDisableTiming flags 
 
- Returns
- topsSuccess, topsErrorInvalidConfiguration, topsErrorInvalidValue
 
- 
TOPS_PUBLIC_API topsError_t topsIpcOpenEventHandle(topsEvent_t *event, topsIpcEventHandle_t handle)
- Opens an interprocess event handle.
- Opens an interprocess event handle exported from another process with topsIpcGetEventHandle. The returned topsEvent_t behaves like a locally created event with the topsEventDisableTiming flag specified. This event must be freed with topsEventDestroy. Operations on the imported event after the exported event has been freed with topsEventDestroy will result in undefined behavior. If the function is called within the same process in which the handle was returned by topsIpcGetEventHandle, it returns topsErrorInvalidContext.
- Parameters
- event – [out] Pointer to topsEvent_t to return the event 
- handle – [in] The opaque interprocess handle to open 
 
- Returns
- topsSuccess, topsErrorInvalidValue, topsErrorInvalidContext 
 
- 
TOPS_PUBLIC_API topsError_t topsIpcOpenEventHandleExt(topsEvent_t *event, topsIpcEventHandle_t handle, topsTopologyMapType map, int port)
- Opens an interprocess event handle with a topsTopologyMapType.
- Opens an interprocess event handle exported from another process with topsIpcGetEventHandle. The returned topsEvent_t behaves like a locally created event with the topsEventDisableTiming flag specified. This event must be freed with topsEventDestroy. Operations on the imported event after the exported event has been freed with topsEventDestroy will result in undefined behavior. If the function is called within the same process in which the handle was returned by topsIpcGetEventHandle, it returns topsErrorInvalidContext.
- Note: This function is only supported when the event is exported for a remote process on a different device. If the event is exported for a remote process on the same device, use topsIpcOpenEventHandle instead.
- Parameters
- event – [out] Pointer to topsEvent_t to return the event 
- handle – [in] The opaque interprocess handle to open 
- map – [in] The link that is expected to pass 
- port – [in] ESL port id 
 
- Returns
- topsSuccess, topsErrorInvalidValue, topsErrorInvalidContext 
 
2.3. Error
This section describes the error handling functions of the TOPS runtime API.
- 
TOPS_PUBLIC_API topsError_t topsGetLastError(void)
- Returns the last error returned by any TOPS runtime API call and resets the stored error code to topsSuccess.
- Returns the last error that has been returned by any of the runtime calls in the same host thread, and then resets the saved error to topsSuccess.
- See also
- topsGetErrorString, topsGetLastError, topsPeekAtLastError, topsError_t
- Returns
- Return code from the last TOPS runtime call made from the active host thread
 
- 
TOPS_PUBLIC_API topsError_t topsPeekAtLastError(void)
- Returns the last error returned by any TOPS runtime API call.
- Returns the last error that has been returned by any of the runtime calls in the same host thread. Unlike topsGetLastError, this function does not reset the saved error code.
- See also
- topsGetErrorString, topsGetLastError, topsPeekAtLastError, topsError_t
- Returns
- The error code currently saved for the calling host thread (topsSuccess if no error is saved)
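The difference between topsGetLastError and topsPeekAtLastError comes down to whether the saved per-thread error is cleared on read. The following toy model sketches only that difference; the model_ names are hypothetical, 0 stands in for topsSuccess, and the error value 7 used in the note below is arbitrary.

```c
#include <assert.h>

/* Toy model only: 0 stands in for topsSuccess, any other value for an error. */
static _Thread_local int model_saved_error = 0;

/* Models a runtime call that fails with the given code on this thread. */
static int model_failing_call(int code) {
    assert(code != 0);
    model_saved_error = code;
    return code;
}

/* Peek: report the saved error and leave it in place. */
static int model_peek_at_last_error(void) { return model_saved_error; }

/* Get: report the saved error, then reset it to success. */
static int model_get_last_error(void) {
    int code = model_saved_error;
    model_saved_error = 0;
    return code;
}
```

After a failing call with code 7, repeated peeks keep returning 7, the first get returns 7, and every later get or peek returns 0 until another call fails.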
 
- 
TOPS_PUBLIC_API const char *topsGetErrorName(topsError_t tops_error)
- Returns the name of the specified error code in text form.
- See also
- topsGetErrorString, topsGetLastError, topsPeekAtLastError, topsError_t
- Parameters
- tops_error – Error code to convert to name. 
- Returns
- const char pointer to the NULL-terminated error name 
 
- 
TOPS_PUBLIC_API const char *topsGetErrorString(topsError_t topsError)
- Returns a text message that explains the error which occurred.
- See also
- topsGetErrorName, topsGetLastError, topsPeekAtLastError, topsError_t
- Warning: This function returns the name of the error (same as topsGetErrorName).
- Parameters
- topsError – Error code to convert to string. 
- Returns
- const char pointer to the NULL-terminated error string 
 
2.4. Stream
This section describes the stream management functions of the TOPS runtime API.
- 
typedef void (*topsStreamCallback_t)(topsStream_t stream, topsError_t status, void *userData)
- Stream callback function type.
- 
TOPS_PUBLIC_API topsError_t topsStreamCreate(topsStream_t *stream)
- Creates an asynchronous stream.
- Creates a new asynchronous stream. stream returns an opaque handle that can be used to reference the newly created stream in subsequent topsStream* commands. The stream is allocated on the heap and remains allocated even if the handle goes out of scope. To release the memory used by the stream, the application must call topsStreamDestroy.
- Parameters
- stream – [inout] Pointer to new stream 
- Returns
- topsSuccess, topsErrorInvalidValue 
 
- 
TOPS_PUBLIC_API topsError_t topsStreamCreateWithFlags(topsStream_t *stream, unsigned int flags)
- Creates an asynchronous stream.
- Creates a new asynchronous stream. stream returns an opaque handle that can be used to reference the newly created stream in subsequent topsStream* commands. The stream is allocated on the heap and remains allocated even if the handle goes out of scope. To release the memory used by the stream, the application must call topsStreamDestroy. flags controls the behavior of the stream. See topsStreamDefault, topsStreamNonBlocking.
- Parameters
- stream – [inout] Pointer to new stream 
- flags – [in] to control stream creation. 
 
- Returns
- topsSuccess, topsErrorInvalidValue 
 
- 
TOPS_PUBLIC_API topsError_t topsStreamDestroy(topsStream_t stream)
- Destroys the specified stream.
- If commands are still executing on the specified stream, some may complete execution before the queue is deleted.
- The queue may be destroyed while some commands are still in flight, or may wait for all commands queued to the stream before destroying it.
- Note: The stream resource should be released before the process exits.
- Parameters
- stream – [in] stream to destroy 
- Returns
- topsSuccess, topsErrorInvalidHandle
 
- 
TOPS_PUBLIC_API topsError_t topsStreamQuery(topsStream_t stream)
- Returns topsSuccess if all of the operations in the specified stream have completed, or topsErrorNotReady if not.
- This is thread-safe and returns a snapshot of the current state of the queue. However, if other host threads are sending work to the stream, the status may change immediately after the function is called. It is typically used for debugging.
- Parameters
- stream – [in] stream to query 
- Returns
- topsSuccess, topsErrorNotReady, topsErrorInvalidHandle
 
- 
TOPS_PUBLIC_API topsError_t topsStreamSynchronize(topsStream_t stream)
- Waits for all commands in the stream to complete.
- This command is host-synchronous: the host will block until the specified stream is empty.
- This command follows standard null-stream semantics. Specifically, specifying the null stream will cause the command to wait for other streams on the same device to complete all pending operations.
- This command honors the topsDeviceLaunchBlocking flag, which controls whether the wait is active or blocking.
- Parameters
- stream – [in] stream identifier. 
- Returns
- topsSuccess, topsErrorInvalidHandle
 
- 
TOPS_PUBLIC_API topsError_t topsStreamWaitEvent(topsStream_t stream, topsEvent_t event, unsigned int flags)
- Makes the specified compute stream wait for an event.
- This function inserts a wait operation into the specified stream. All future work submitted to stream will wait until event reports completion before beginning execution.
- This function only waits for commands in the current stream to complete. Notably, this function does not implicitly wait for commands in the default stream to complete, even if the specified stream is created with topsStreamNonBlocking = 0.
- Parameters
- stream – [in] stream to make wait. 
- event – [in] event to wait on 
- flags – [in] control operation [must be 0] 
 
- Returns
- topsSuccess, topsErrorInvalidHandle
 
- 
TOPS_PUBLIC_API topsError_t topsStreamAddCallback(topsStream_t stream, topsStreamCallback_t callback, void *userData, unsigned int flags)
- Adds a callback to be called on the host after all currently enqueued items in the stream have completed. For each topsStreamAddCallback call, a callback will be executed exactly once. The callback will block later work in the stream until it is finished.
- See also
- topsStreamCreate, topsStreamQuery, topsStreamSynchronize, topsStreamWaitEvent, topsStreamDestroy
- Parameters
- stream – [in] - Stream to add callback to 
- callback – [in] - The function to call once preceding stream operations are complete 
- userData – [in] - User specified data to be passed to the callback function 
- flags – [in] - topsStreamDefault: non-blocking stream execution; topsStreamCallbackBlocking: stream blocks until callback is completed. 
 
- Returns
- topsSuccess, topsErrorInvalidHandle, topsErrorNotSupported
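A callback passed to topsStreamAddCallback must match the topsStreamCallback_t signature, and userData is the usual way to hand it per-call state. The sketch below shows one possible callback. The two typedefs and SKETCH_SUCCESS are stand-ins so the code compiles without the TOPS headers (the real definitions come from the runtime and may differ), and batch_state and on_batch_done are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins so the sketch compiles without the TOPS headers.
 * Assumptions: an opaque pointer handle and an int error code. */
typedef void *topsStream_t;
typedef int topsError_t;
#define SKETCH_SUCCESS 0 /* assumed value of topsSuccess */

/* Callback type as documented in this section. */
typedef void (*topsStreamCallback_t)(topsStream_t stream, topsError_t status,
                                     void *userData);

/* Hypothetical per-callback state handed over through userData. */
struct batch_state {
    int completed;
    int failed;
};

/* Counts how many callbacks reported success and how many reported an error. */
static void on_batch_done(topsStream_t stream, topsError_t status, void *userData) {
    struct batch_state *st = userData;
    (void)stream; /* unused in this sketch */
    if (status == SKETCH_SUCCESS)
        st->completed++;
    else
        st->failed++;
}
```

With the real runtime, on_batch_done and a pointer to a batch_state would be passed as the callback and userData arguments of topsStreamAddCallback. The state object must stay alive until the callback has run.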
 
- 
TOPS_PUBLIC_API topsError_t topsStreamWriteValue32(topsDeviceptr_t dst, int value, unsigned int flags)
- Writes a value to local device memory.
- Parameters
- dst – [in] - The device address to write to. 
- value – [in] - The value to write. 
- flags – [in] - Reserved for future expansion; must be 0. 
 
- Returns
- topsSuccess, topsErrorInvalidDevicePointer 
 
- 
TOPS_PUBLIC_API topsError_t topsStreamWriteValue32Async(topsDeviceptr_t dst, int value, unsigned int flags, topsStream_t stream __dparm(0))
- Writes a value to local device memory asynchronously.
- Parameters
- dst – [in] - The device address to write to. 
- value – [in] - The value to write. 
- flags – [in] - Reserved for future expansion; must be 0. 
- stream – [in] - The stream to synchronize on the memory location. 
 
- Returns
- topsSuccess, topsErrorInvalidDevicePointer 
 
- 
TOPS_PUBLIC_API topsError_t topsStreamWaitValue32(topsDeviceptr_t dst, int value, unsigned int flags)
- Waits on a memory location.
- Parameters
- dst – [in] - The memory location to wait on. 
- value – [in] - The value to compare with the memory location. 
- flags – [in] - Reserved for future expansion; must be 0. 
 
- Returns
- topsSuccess, topsErrorInvalidDevicePointer 
 
- 
TOPS_PUBLIC_API topsError_t topsStreamWaitValue32Async(topsDeviceptr_t dst, int value, unsigned int flags, topsStream_t stream __dparm(0))
- Waits on a memory location asynchronously.
- Parameters
- dst – [in] - The memory location to wait on. 
- value – [in] - The value to compare with the memory location. 
- flags – [in] - Reserved for future expansion; must be 0. 
- stream – [in] - The stream to synchronize on the memory location. 
 
- Returns
- topsSuccess, topsErrorInvalidDevicePointer 
 
2.5. Event
This section describes the event management functions of the TOPS runtime API.
- 
TOPS_PUBLIC_API topsError_t topsEventCreateWithFlags(topsEvent_t *event, unsigned flags)
- Creates an event object with the specified flags.
- Creates an event object for the current device with the specified flags. Valid values include:
- topsEventDefault: Default event creation flag. The event will use active synchronization and will support timing. Active synchronization provides the lowest possible latency at the expense of dedicating a CPU to poll on the event.
- topsEventBlockingSync: Specifies that the event should use blocking synchronization. A host thread that uses topsEventSynchronize() to wait on an event created with this flag will block until the event actually completes.
- topsEventDisableTiming: Specifies that the created event does not need to record timing data. Events created with this flag specified and the topsEventBlockingSync flag not specified will provide the best performance when used with topsStreamWaitEvent() and topsEventQuery().
- topsEventInterprocess: Specifies that the created event may be used as an interprocess event by topsIpcGetEventHandle(). topsEventInterprocess must be specified along with topsEventDisableTiming.
- Note: This function may also return error codes from previous, asynchronous launches.
- Parameters
- event – [inout] Returns the newly created event. 
- flags – [in] Flags to control event behavior. 
 
- Returns
- topsSuccess, topsErrorNotInitialized, topsErrorInvalidValue, topsErrorLaunchFailure, topsErrorOutOfMemory
 
- 
TOPS_PUBLIC_API topsError_t topsEventCreate(topsEvent_t *event)
- Creates an event object.
- Creates an event object for the current device using topsEventDefault.
- See also
- topsEventRecord, topsEventQuery, topsEventSynchronize, topsEventDestroy, topsEventElapsedTime
- Parameters
- event – [inout] Returns the newly created event. 
- Returns
- topsSuccess, topsErrorNotInitialized, topsErrorInvalidValue, topsErrorLaunchFailure, topsErrorOutOfMemory
 
- 
TOPS_PUBLIC_API topsError_t topsEventRecord(topsEvent_t event, topsStream_t stream)
- Records an event in the specified stream.
- Captures in event the contents of stream at the time of this call. event and stream must be on the same TOPS context. Calls such as topsEventQuery() or topsStreamWaitEvent() will then examine or wait for completion of the work that was captured. Uses of stream after this call do not modify event.
- topsEventRecord() can be called multiple times on the same event and will overwrite the previously captured state. Other APIs such as topsStreamWaitEvent() use the most recently captured state at the time of the API call, and are not affected by later calls to topsEventRecord(). Before the first call to topsEventRecord(), an event represents an empty set of work, so for example topsEventQuery() would return topsSuccess.
- topsEventQuery() or topsEventSynchronize() must be used to determine when the event transitions from “recording” (after topsEventRecord() is called) to “recorded” (when timestamps are set, if requested).
- Events which are recorded in a non-NULL stream will transition from the “recording” to the “recorded” state when they reach the head of the specified stream, after all previous commands in that stream have completed executing.
- If topsEventRecord() has been previously called on this event, then this call will overwrite any existing state in event.
- If this function is called on an event that is currently being recorded, results are undefined: either outstanding recording may save state into the event, and the order is not guaranteed.
- See also
- topsEventCreate, topsEventQuery, topsEventSynchronize, topsEventDestroy, topsEventElapsedTime
- Parameters
- event – [in] event to record. 
- stream – [in] stream in which to record event. 
 
- Returns
- topsSuccess, topsErrorInvalidValue, topsErrorNotInitialized, topsErrorInvalidHandle, topsErrorLaunchFailure
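The event life cycle described for topsEventRecord (an unrecorded event is an empty set of work, then “recording”, then “recorded”) can be sketched as a three-state toy model. All names and values below are hypothetical, and the model covers only the state transitions, not timestamps or streams.

```c
#include <assert.h>

/* Toy model only; names and values are not TOPS constants. */
enum ev_state { EV_UNRECORDED, EV_RECORDING, EV_RECORDED };
enum ev_result { EV_SUCCESS = 0, EV_NOT_READY = 1 };

struct model_event {
    enum ev_state state;
};

/* Recording always overwrites whatever state the event held before. */
static void model_record(struct model_event *e) { e->state = EV_RECORDING; }

/* Models the event reaching the head of its stream after all earlier
 * commands in that stream have finished. */
static void model_reach_head(struct model_event *e) {
    if (e->state == EV_RECORDING)
        e->state = EV_RECORDED;
}

/* An unrecorded event is an empty set of work, so it counts as complete. */
static enum ev_result model_query(const struct model_event *e) {
    return e->state == EV_RECORDING ? EV_NOT_READY : EV_SUCCESS;
}
```

In this model a query succeeds before the first record, reports not-ready between a record and the event reaching the head of its stream, and succeeds again afterwards. Recording the same event a second time puts it back into the not-ready state.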
 
- 
TOPS_PUBLIC_API topsError_t topsEventDestroy(topsEvent_t event)
- Destroys the specified event.
- Releases memory associated with the event. An event may be destroyed before it is complete (i.e., while topsEventQuery() would return topsErrorNotReady). If the event is recording but has not completed recording when topsEventDestroy() is called, the function will return immediately and any associated resources will automatically be released asynchronously at completion.
- See also
- topsEventCreate, topsEventQuery, topsEventSynchronize, topsEventRecord, topsEventElapsedTime
- Note: Use of the handle after this call is undefined behavior.
- Parameters
- event – [in] Event to destroy. 
- Returns
- topsSuccess, topsErrorNotInitialized, topsErrorInvalidValue, topsErrorLaunchFailure
 
- 
TOPS_PUBLIC_API topsError_t topsEventSynchronize(topsEvent_t event)
- Waits for an event to complete.
- This function will block until the event is ready, waiting for all previous work in the stream specified when event was recorded with topsEventRecord().
- If topsEventRecord() has not been called on event, this function returns immediately.
- Note: This function needs to support the topsEventBlockingSync parameter.
- Parameters
- event – [in] Event on which to wait. 
- Returns
- topsSuccess, topsErrorInvalidValue, topsErrorNotInitialized, topsErrorInvalidHandle, topsErrorLaunchFailure
 
- 
TOPS_PUBLIC_API topsError_t topsEventElapsedTime(float *ms, topsEvent_t start, topsEvent_t stop)
- Returns the elapsed time between two events.
- Computes the elapsed time between two events. Time is computed in ms, with a resolution of approximately 1 us.
- Events which are recorded in a NULL stream will block until all commands on all other streams complete execution, and then record the timestamp.
- Events which are recorded in a non-NULL stream will record their timestamp when they reach the head of the specified stream, after all previous commands in that stream have completed executing. Thus the time that the event recorded may be significantly after the host calls topsEventRecord().
- If topsEventRecord() has not been called on either event, then topsErrorInvalidHandle is returned. If topsEventRecord() has been called on both events, but the timestamp has not yet been recorded on one or both events (that is, topsEventQuery() would return topsErrorNotReady on at least one of the events), then topsErrorNotReady is returned.
- Note: Events passed to topsExtLaunchKernelGGL/topsExtLaunchKernel for a kernel dispatch are not explicitly recorded and should only be used to get the elapsed time for that specific launch. If events are used across multiple dispatches, for example start and stop events from different topsExtLaunchKernelGGL/topsExtLaunchKernel calls, they are treated as invalid unrecorded events and topsEventElapsedTime returns topsErrorInvalidHandle.
- Parameters
- ms – [out] : Return time between start and stop in ms. 
- start – [in] : Start event. 
- stop – [in] : Stop event. 
 
- Returns
- topsSuccess, topsErrorInvalidValue, topsErrorNotReady, topsErrorInvalidHandle, topsErrorNotInitialized, topsErrorLaunchFailure
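The order in which topsEventElapsedTime reports its documented error cases can be sketched as a small decision function: never-recorded events first, then events whose timestamp is still pending, then the subtraction. This is a toy model under stated assumptions; the struct, its fields, and the result names are hypothetical and do not reflect how the runtime stores events.

```c
#include <assert.h>

/* Toy model only; result names are not the real topsError_t constants. */
enum et_result { ET_SUCCESS = 0, ET_NOT_READY, ET_INVALID_HANDLE };

/* Hypothetical view of an event for timing purposes. */
struct timed_event {
    int recorded_once;   /* record has been called at least once */
    int has_timestamp;   /* the timestamp has actually been written */
    double timestamp_ms; /* meaningful only when has_timestamp is set */
};

/* Follows the documented order of checks: an event that was never recorded
 * is an invalid handle, a pending timestamp is not-ready, otherwise the
 * elapsed time in milliseconds is written to *ms. */
static enum et_result model_elapsed(float *ms, const struct timed_event *start,
                                    const struct timed_event *stop) {
    if (!start->recorded_once || !stop->recorded_once)
        return ET_INVALID_HANDLE;
    if (!start->has_timestamp || !stop->has_timestamp)
        return ET_NOT_READY;
    *ms = (float)(stop->timestamp_ms - start->timestamp_ms);
    return ET_SUCCESS;
}
```

A caller that receives the not-ready result would typically synchronize on the stop event and ask again. The invalid-handle result signals a logic error, because at least one event was never recorded.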
 
- 
TOPS_PUBLIC_API topsError_t topsEventQuery(topsEvent_t event)
- Queries the event status.
- Queries the status of the specified event. This function returns topsErrorNotReady if the commands in the appropriate stream (specified to topsEventRecord()) have not yet completed. If that work has completed, or if topsEventRecord() was not called on the event, then topsSuccess is returned.
- See also
- topsEventCreate, topsEventRecord, topsEventDestroy, topsEventSynchronize, topsEventElapsedTime
- Parameters
- event – [in] Event to query. 
- Returns
- topsSuccess, topsErrorNotReady, topsErrorInvalidHandle, topsErrorInvalidValue, topsErrorNotInitialized, topsErrorLaunchFailure
 
2.6. Memory
This section describes the memory management functions of the TOPS runtime API.
- 
TOPS_PUBLIC_API topsError_t topsPointerGetAttributes(topsPointerAttribute_t *attributes, const void *ptr)
- Returns attributes for the specified pointer.
- Parameters
- attributes – [out] attributes for the specified pointer 
- ptr – [in] pointer to get attributes for 
 
- Returns
- topsSuccess, topsErrorInvalidDevice, topsErrorInvalidValue 
 
- 
TOPS_PUBLIC_API topsError_t topsPointerGetAttribute(void *data, topsPointer_attribute attribute, topsDeviceptr_t ptr)
- Returns information about the specified pointer.
- Parameters
- data – [inout] returned pointer attribute value 
- attribute – [in] attribute to query for 
- ptr – [in] pointer to get attributes for 
 
- Returns
- topsSuccess, topsErrorInvalidDevice, topsErrorInvalidValue 
 
- 
TOPS_PUBLIC_API topsError_t topsDrvPointerGetAttributes(unsigned int numAttributes, topsPointer_attribute *attributes, void **data, topsDeviceptr_t ptr)
- Returns information about the specified pointer.
- Parameters
- numAttributes – [in] number of attributes to query for 
- attributes – [in] attributes to query for 
- data – [inout] a two-dimensional array containing pointers to memory locations where the result of each attribute query will be written 
- ptr – [in] pointer to get attributes for 
 
- Returns
- topsSuccess, topsErrorInvalidDevice, topsErrorInvalidValue 
 
- 
TOPS_PUBLIC_API topsError_t topsMalloc(void **ptr, size_t size)¶
- Allocate memory on the default accelerator. - If size is 0, no memory is allocated, *ptr returns non-nullptr, and topsSuccess is returned. - See also - Parameters
- ptr – [out] Pointer to the allocated memory 
- size – [in] Requested memory size 
 
- Returns
- topsSuccess, topsErrorOutOfMemory, topsErrorInvalidValue (bad context, null *ptr) 
 
- 
TOPS_PUBLIC_API topsError_t topsExtCodecMemHandle(void **pointer, uint64_t dev_addr, size_t size)¶
- Convert device memory to an efcodec memory handle. - If size is 0, no memory is allocated, *pointer returns non-nullptr, and topsSuccess is returned. - See also - Parameters
- pointer – [out] Pointer to the allocated memory handle 
- dev_addr – [in] Requested memory device address 
- size – [in] Requested memory size 
 
- Returns
- topsSuccess, topsErrorOutOfMemory, topsErrorInvalidValue (bad context, null *pointer) 
 
- 
TOPS_PUBLIC_API topsError_t topsExtMallocWithFlags(void **ptr, size_t sizeBytes, unsigned int flags)¶
- Allocate memory on the default accelerator. - If sizeBytes is 0, no memory is allocated, *ptr returns non-nullptr, and topsSuccess is returned. - See also - Parameters
- ptr – [out] Pointer to the allocated memory 
- sizeBytes – [in] Requested memory size 
- flags – [in] Memory allocation flags. Only topsDeviceMallocDefault, topsMallocTopDown, topsMallocForbidMergeMove, and topsMallocPreferHighSpeedMem are supported. 
 
- Returns
- topsSuccess, topsErrorOutOfMemory, topsErrorInvalidValue (bad context, null *ptr) 
 
- 
TOPS_PUBLIC_API topsError_t topsHostMalloc(void **ptr, size_t size, unsigned int flags)¶
- Allocate device-accessible, page-locked host memory. - If size is 0, no memory is allocated, *ptr returns nullptr, and topsSuccess is returned. - See also - Parameters
- ptr – [out] Pointer to the allocated host pinned memory 
- size – [in] Requested memory size 
- flags – [in] Type of host memory allocation 
 
- Returns
- topsSuccess, topsErrorOutOfMemory 
 
- 
TOPS_PUBLIC_API topsError_t topsHostGetDevicePointer(void **devPtr, void *hostPtr, unsigned int flags)¶
- Get Device pointer from Host Pointer allocated through topsHostMalloc. - See also - Parameters
- devPtr – [out] Device Pointer mapped to passed host pointer 
- hostPtr – [in] Host Pointer allocated through topsHostMalloc 
- flags – [in] Flags to be passed for extension 
 
- Returns
- topsSuccess, topsErrorInvalidValue, topsErrorOutOfMemory 
 
- 
TOPS_PUBLIC_API topsError_t topsHostGetFlags(unsigned int *flagsPtr, void *hostPtr)¶
- Return flags associated with host pointer. - See also - Parameters
- flagsPtr – [out] Memory location to store flags 
- hostPtr – [in] Host Pointer allocated through topsHostMalloc 
 
- Returns
- topsSuccess, topsErrorInvalidValue 
 
- 
TOPS_PUBLIC_API topsError_t topsHostRegister(void *hostPtr, size_t sizeBytes, unsigned int flags)¶
- Register host memory so it can be accessed from the current device. - Flags: - topsHostRegisterDefault Memory is Mapped and Portable 
- topsHostRegisterPortable Memory is considered registered by all contexts. TOPS only supports one context so this is always assumed true. 
- topsHostRegisterMapped Map the allocation into the address space for the current device. The device pointer can be obtained with topsHostGetDevicePointer. 
 - After registering the memory, use topsHostGetDevicePointer to obtain the mapped device pointer. On many systems, the mapped device pointer will have a different value than the mapped host pointer. Applications must use the device pointer in device code, and the host pointer in host code. - On some systems, registered memory is pinned. On other systems, registered memory may not actually be pinned but uses OS or hardware facilities to allow GCU access to the host memory. - Developers are strongly encouraged to register memory blocks which are aligned to the host cache-line size (typically 64 bytes, but the value can be obtained from the CPUID instruction). - If registering non-aligned pointers, the application must take care when registering pointers from the same cache line on different devices. TOPS’s coarse-grained synchronization model does not guarantee correct results if different devices write to different parts of the same cache block; typically one of the writes will “win” and overwrite data from the other registered memory region. - Parameters
- hostPtr – [in] Pointer to host memory to be registered. 
- sizeBytes – [in] size of the host memory 
- flags – [in] See above. 
 
- Returns
- topsSuccess, topsErrorOutOfMemory 
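The cache-line alignment advice for topsHostRegister can be checked on the host before registering. The following is a minimal sketch in plain C, assuming a 64-byte cache line; the names CACHE_LINE_BYTES, is_line_aligned and round_up_to_line are hypothetical helpers for illustration and are not part of the TOPS API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache-line size. 64 bytes is typical, but the real value should
 * be queried on the target host rather than taken from this constant. */
#define CACHE_LINE_BYTES ((size_t)64)

/* Hypothetical helper: true when both the start address and the size of a
 * block are multiples of the cache-line size, so the block does not share
 * a cache line with neighbouring data. */
static bool is_line_aligned(const void *ptr, size_t size)
{
    return ((uintptr_t)ptr % CACHE_LINE_BYTES) == 0u &&
           (size % CACHE_LINE_BYTES) == 0u;
}

/* Hypothetical helper: round a byte count up to a whole number of cache
 * lines. Overflow for sizes close to SIZE_MAX is not handled here. */
static size_t round_up_to_line(size_t size)
{
    return (size + (CACHE_LINE_BYTES - 1u)) / CACHE_LINE_BYTES * CACHE_LINE_BYTES;
}
```

One possible use is to allocate the host block with C11 aligned_alloc(64, round_up_to_line(n)) and confirm is_line_aligned on the result before passing the pointer and size to topsHostRegister.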
 
- 
TOPS_PUBLIC_API topsError_t topsHostUnregister(void *hostPtr)¶
- Un-register host pointer. - See also - Parameters
- hostPtr – [in] Host pointer previously registered with topsHostRegister 
- Returns
- Error code 
 
- 
TOPS_PUBLIC_API topsError_t topsFree(void *ptr)¶
- Free memory allocated by the tops memory allocation API. This API performs an implicit topsDeviceSynchronize() call. If pointer is NULL, the tops runtime is initialized and topsSuccess is returned. - See also - Parameters
- ptr – [in] Pointer to memory to be freed 
- Returns
- topsSuccess, topsErrorInvalidDevicePointer (if pointer is invalid, including host pointers allocated with topsHostMalloc) 
 
- 
TOPS_PUBLIC_API topsError_t topsHostFree(void *ptr)¶
- Free memory allocated by the tops host memory allocation API. This API performs an implicit topsDeviceSynchronize() call. If pointer is NULL, the tops runtime is initialized and topsSuccess is returned. - See also - Parameters
- ptr – [in] Pointer to memory to be freed 
- Returns
- topsSuccess, topsErrorInvalidValue (if pointer is invalid, including device pointers allocated with topsMalloc) 
 
- 
TOPS_PUBLIC_API topsError_t topsMemcpy(void *dst, const void *src, size_t sizeBytes, topsMemcpyKind kind)¶
- Copy data from src to dst. - It supports copies from host to device, device to host, device to device, and host to host. The src and dst must not overlap. - For topsMemcpy, the copy is always performed by the current device (set by topsSetDevice). For multi-GCU or peer-to-peer configurations, it is recommended to set the current device to the device where the src data is physically located. For optimal peer-to-peer copies, the copy device must be able to access the src and dst pointers (by calling topsDeviceEnablePeerAccess with the copy agent as the current device and src/dst as the peerDevice argument). If this is not done, topsMemcpy will still work, but will perform the copy using a staging buffer on the host. Calling topsMemcpy with dst and src pointers that do not match the topsMemcpyKind results in undefined behavior. - See also - topsMalloc, topsFree, topsHostMalloc, topsHostFree, topsMemGetAddressRange, topsMemGetInfo, topsHostGetDevicePointer, topsMemcpyDtoD, topsMemcpyDtoDAsync, topsMemcpyDtoH, topsMemcpyDtoHAsync, topsMemcpyHtoD, topsMemcpyHtoDAsync - Parameters
- dst – [out] Destination address to copy to 
- src – [in] Source address to copy from 
- sizeBytes – [in] Data size in bytes 
- kind – [in] Memory copy type 
 
- Returns
- topsSuccess, topsErrorInvalidValue, topsErrorMemoryFree, topsErrorUnknown 
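The requirement that src and dst must not overlap can be guarded on the caller side when both pointers live in the same address space (for example host to host, or device to device on one device). The sketch below is plain C; ranges_overlap is a hypothetical helper, not a TOPS function, and it says nothing about pointers from different address spaces.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of the TOPS API): returns true when the
 * byte ranges [a, a + size) and [b, b + size) share at least one byte.
 * Addresses are compared as integers. */
static bool ranges_overlap(const void *a, const void *b, size_t size)
{
    uintptr_t pa = (uintptr_t)a;
    uintptr_t pb = (uintptr_t)b;

    if (size == 0) {
        return false; /* empty ranges never overlap */
    }
    /* Distance between the two start addresses, computed without
     * underflow by always subtracting the smaller from the larger. */
    uintptr_t gap = (pa > pb) ? (pa - pb) : (pb - pa);
    return gap < size;
}
```

A caller might reject the copy, or fall back to a temporary buffer, when ranges_overlap(dst, src, sizeBytes) is true.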
 
- 
TOPS_PUBLIC_API topsError_t topsMemcpyWithStream(void *dst, const void *src, size_t sizeBytes, topsMemcpyKind kind, topsStream_t stream)¶
- Copy data from src to dst. - It supports copies from host to device, device to host, device to device, and host to host. The src and dst must not overlap. - See also - topsMalloc, topsFree, topsHostMalloc, topsHostFree, topsMemGetAddressRange, topsMemGetInfo, topsHostGetDevicePointer, topsMemcpyDtoD, topsMemcpyDtoDAsync, topsMemcpyDtoH, topsMemcpyDtoHAsync, topsMemcpyHtoD, topsMemcpyHtoDAsync - Parameters
- dst – [out] Destination address to copy to 
- src – [in] Source address to copy from 
- sizeBytes – [in] Data size in bytes 
- kind – [in] Memory copy type 
- stream – [in] Stream to enqueue this operation. 
 
- Returns
- topsSuccess, topsErrorInvalidValue, topsErrorMemoryFree, topsErrorUnknown 
 
- 
TOPS_PUBLIC_API topsError_t topsMemcpyHtoD(topsDeviceptr_t dst, void *src, size_t sizeBytes)¶
- Copy data from Host to Device. - See also - topsMalloc, topsFree, topsHostMalloc, topsHostFree, topsMemGetAddressRange, topsMemGetInfo, topsHostGetDevicePointer, topsMemcpyDtoD, topsMemcpyDtoDAsync, topsMemcpyDtoH, topsMemcpyDtoHAsync, topsMemcpyHtoD, topsMemcpyHtoDAsync - Parameters
- dst – [out] Destination address to copy to 
- src – [in] Source address to copy from 
- sizeBytes – [in] Data size in bytes 
 
- Returns
- topsSuccess, topsErrorDeInitialized, topsErrorNotInitialized, topsErrorInvalidContext, topsErrorInvalidValue 
 
- 
TOPS_PUBLIC_API topsError_t topsMemcpyDtoH(void *dst, topsDeviceptr_t src, size_t sizeBytes)¶
- Copy data from Device to Host. - See also - topsMalloc, topsFree, topsHostMalloc, topsHostFree, topsMemGetAddressRange, topsMemGetInfo, topsHostGetDevicePointer, topsMemcpyDtoD, topsMemcpyDtoDAsync, topsMemcpyDtoHAsync, topsMemcpyHtoD, topsMemcpyHtoDAsync - Parameters
- dst – [out] Destination address to copy to 
- src – [in] Source address to copy from 
- sizeBytes – [in] Data size in bytes 
 
- Returns
- topsSuccess, topsErrorDeInitialized, topsErrorNotInitialized, topsErrorInvalidContext, topsErrorInvalidValue 
 
- 
TOPS_PUBLIC_API topsError_t topsMemcpyDtoD(topsDeviceptr_t dst, topsDeviceptr_t src, size_t sizeBytes)¶
- Copy data from Device to Device. - See also - topsMalloc, topsFree, topsHostMalloc, topsHostFree, topsMemGetAddressRange, topsMemGetInfo, topsHostGetDevicePointer, topsMemcpyDtoDAsync, topsMemcpyDtoH, topsMemcpyDtoHAsync, topsMemcpyHtoD, topsMemcpyHtoDAsync - Parameters
- dst – [out] Destination address to copy to 
- src – [in] Source address to copy from 
- sizeBytes – [in] Data size in bytes 
 
- Returns
- topsSuccess, topsErrorDeInitialized, topsErrorNotInitialized, topsErrorInvalidContext, topsErrorInvalidValue 
 
- 
TOPS_PUBLIC_API topsError_t topsMemcpyHtoDAsync(topsDeviceptr_t dst, void *src, size_t sizeBytes, topsStream_t stream)¶
- Copy data from Host to Device asynchronously. - See also - topsMalloc, topsFree, topsHostMalloc, topsHostFree, topsMemGetAddressRange, topsMemGetInfo, topsHostGetDevicePointer, topsMemcpyDtoD, topsMemcpyDtoDAsync, topsMemcpyDtoH, topsMemcpyDtoHAsync, topsMemcpyHtoD - Parameters
- dst – [out] Destination address to copy to 
- src – [in] Source address to copy from 
- sizeBytes – [in] Data size in bytes 
- stream – [in] Stream to enqueue this operation. 
 
- Returns
- topsSuccess, topsErrorDeInitialized, topsErrorNotInitialized, topsErrorInvalidContext, topsErrorInvalidValue 
 
- 
TOPS_PUBLIC_API topsError_t topsMemcpyDtoHAsync(void *dst, topsDeviceptr_t src, size_t sizeBytes, topsStream_t stream)¶
- Copy data from Device to Host asynchronously. - See also - topsMalloc, topsFree, topsHostMalloc, topsHostFree, topsMemGetAddressRange, topsMemGetInfo, topsHostGetDevicePointer, topsMemcpyDtoD, topsMemcpyDtoDAsync, topsMemcpyDtoH, topsMemcpyHtoD, topsMemcpyHtoDAsync - Parameters
- dst – [out] Destination address to copy to 
- src – [in] Source address to copy from 
- sizeBytes – [in] Data size in bytes 
- stream – [in] Stream to enqueue this operation. 
 
- Returns
- topsSuccess, topsErrorDeInitialized, topsErrorNotInitialized, topsErrorInvalidContext, topsErrorInvalidValue 
 
- 
TOPS_PUBLIC_API topsError_t topsMemcpyDtoDAsync(topsDeviceptr_t dst, topsDeviceptr_t src, size_t sizeBytes, topsStream_t stream)¶
- Copy data from Device to Device asynchronously. - See also - topsMalloc, topsFree, topsHostMalloc, topsHostFree, topsMemGetAddressRange, topsMemGetInfo, topsHostGetDevicePointer, topsMemcpyDtoD, topsMemcpyDtoH, topsMemcpyDtoHAsync, topsMemcpyHtoD, topsMemcpyHtoDAsync - Parameters
- dst – [out] Destination address to copy to 
- src – [in] Source address to copy from 
- sizeBytes – [in] Data size in bytes 
- stream – [in] Stream to enqueue this operation. 
 
- Returns
- topsSuccess, topsErrorDeInitialized, topsErrorNotInitialized, topsErrorInvalidContext, topsErrorInvalidValue 
 
- 
TOPS_PUBLIC_API topsError_t topsModuleGetGlobal(topsDeviceptr_t *dptr, size_t *bytes, topsModule_t hmod, const char *name)¶
- Returns a global pointer from a module. Returns in *dptr and *bytes the pointer and size of the global symbol located in module hmod. If no variable of that name exists, it returns topsErrorNotFound. Both parameters dptr and bytes are optional. If one of them is NULL, it is ignored and topsSuccess is returned. - Parameters
- dptr – [out] Returns global device pointer 
- bytes – [out] Returns global size in bytes 
- hmod – [in] Module to retrieve global from 
- name – [in] Name of global to retrieve 
 
- Returns
- topsSuccess, topsErrorInvalidValue, topsErrorNotFound, topsErrorInvalidContext 
 
- 
TOPS_PUBLIC_API topsError_t topsGetSymbolAddress(void **devPtr, const void *symbol)¶
- Gets device pointer associated with symbol on the device. - Parameters
- devPtr – [out] pointer to the device address associated with the symbol 
- symbol – [in] pointer to the symbol of the device 
 
- Returns
- topsSuccess, topsErrorInvalidValue 
 
- 
TOPS_PUBLIC_API topsError_t topsGetSymbolSize(size_t *size, const void *symbol)¶
- Gets the size of the given symbol on the device. - Parameters
- symbol – [in] pointer to the device symbol 
- size – [out] pointer to the size 
 
- Returns
- topsSuccess, topsErrorInvalidValue 
 
- TOPS_PUBLIC_API topsError_t topsMemcpyToSymbol (const void *symbol, const void *src, size_t sizeBytes, size_t offset __dparm(0), topsMemcpyKind kind __dparm(topsMemcpyHostToDevice))
- Copies data to the given symbol on the device. Symbol TOPS APIs allow a kernel to define a device-side data symbol which can be accessed on the host side. The symbol can be in __constant or device space. Note that the symbol name needs to be encased in the TOPS_SYMBOL macro. This also applies to topsMemcpyFromSymbol, topsGetSymbolAddress, and topsGetSymbolSize. - Parameters
- symbol – [out] pointer to the device symbol 
- src – [in] pointer to the source address 
- sizeBytes – [in] size in bytes to copy 
- offset – [in] offset in bytes from start of symbol 
- kind – [in] type of memory transfer 
 
- Returns
- topsSuccess, topsErrorInvalidValue 
 
- TOPS_PUBLIC_API topsError_t topsMemcpyToSymbolAsync (const void *symbol, const void *src, size_t sizeBytes, size_t offset, topsMemcpyKind kind, topsStream_t stream __dparm(0))
- Copies data to the given symbol on the device asynchronously. - Parameters
- symbol – [out] pointer to the device symbol 
- src – [in] pointer to the source address 
- sizeBytes – [in] size in bytes to copy 
- offset – [in] offset in bytes from start of symbol 
- kind – [in] type of memory transfer 
- stream – [in] stream identifier 
 
- Returns
- topsSuccess, topsErrorInvalidValue 
 
- TOPS_PUBLIC_API topsError_t topsMemcpyFromSymbol (void *dst, const void *symbol, size_t sizeBytes, size_t offset __dparm(0), topsMemcpyKind kind __dparm(topsMemcpyDeviceToHost))
- Copies data from the given symbol on the device. - Parameters
- dst – [out] Pointer to destination memory address 
- symbol – [in] pointer to the symbol address on the device 
- sizeBytes – [in] size in bytes to copy 
- offset – [in] offset in bytes from the start of symbol 
- kind – [in] type of memory transfer 
 
- Returns
- topsSuccess, topsErrorInvalidValue 
 
- TOPS_PUBLIC_API topsError_t topsMemcpyFromSymbolAsync (void *dst, const void *symbol, size_t sizeBytes, size_t offset, topsMemcpyKind kind, topsStream_t stream __dparm(0))
- Copies data from the given symbol on the device asynchronously. - Parameters
- dst – [out] Pointer to destination memory address 
- symbol – [in] pointer to the symbol address on the device 
- sizeBytes – [in] size in bytes to copy 
- offset – [in] offset in bytes from the start of symbol 
- kind – [in] type of memory transfer 
- stream – [in] stream identifier 
 
- Returns
- topsSuccess, topsErrorInvalidValue 
 
- TOPS_PUBLIC_API topsError_t topsMemcpyAsync (void *dst, const void *src, size_t sizeBytes, topsMemcpyKind kind, topsStream_t stream __dparm(0))
- Copy data from src to dst asynchronously. - For multi-GCU or peer-to-peer configurations, it is recommended to use a stream attached to the device where the src data is physically located. For optimal peer-to-peer copies, the copy device must be able to access the src and dst pointers (by calling topsDeviceEnablePeerAccess with the copy agent as the current device and src/dst as the peerDevice argument). If this is not done, topsMemcpyAsync will still work, but will perform the copy using a staging buffer on the host. - See also - topsMalloc, topsFree, topsHostMalloc, topsHostFree, topsMemGetAddressRange, topsMemGetInfo, topsHostGetDevicePointer, topsMemcpyDtoD, topsMemcpyDtoH, topsMemcpyDtoHAsync, topsMemcpyHtoD, topsMemcpyHtoDAsync - Warning - If the host source or destination memory is not pinned, the memory copy will be performed synchronously. For best performance, use topsHostMalloc to allocate host memory that is transferred asynchronously. - Warning - topsMemcpyAsync does not support overlapped H2D and D2H copies. For topsMemcpyAsync, the copy is always performed by the device associated with the specified stream. - Parameters
- dst – [out] Destination address to copy to 
- src – [in] Source address to copy from 
- sizeBytes – [in] Data size in bytes 
- kind – [in] type of memory transfer 
- stream – [in] stream identifier 
 
- Returns
- topsSuccess, topsErrorInvalidValue, topsErrorMemoryFree, topsErrorUnknown 
 
- 
TOPS_PUBLIC_API topsError_t topsMemset(void *dst, int value, size_t sizeBytes)¶
- Fills the first sizeBytes bytes of the memory area pointed to by dst with the constant byte value value. - Parameters
- dst – [out] Dst Data being filled 
- value – [in] Constant value to be set 
- sizeBytes – [in] Data size in bytes 
 
- Returns
- topsSuccess, topsErrorInvalidValue, topsErrorNotInitialized 
 
- 
TOPS_PUBLIC_API topsError_t topsMemsetD8(topsDeviceptr_t dest, unsigned char value, size_t count)¶
- Fills the first count bytes of the memory area pointed to by dest with the constant byte value value. - Parameters
- dest – [out] Data ptr to be filled 
- value – [in] Constant value to be set 
- count – [in] Number of values to be set 
 
- Returns
- topsSuccess, topsErrorInvalidValue, topsErrorNotInitialized 
 
- TOPS_PUBLIC_API topsError_t topsMemsetD8Async (topsDeviceptr_t dest, unsigned char value, size_t count, topsStream_t stream __dparm(0))
- Fills the first count bytes of the memory area pointed to by dest with the constant byte value value. - topsMemsetD8Async() is asynchronous with respect to the host, so the call may return before the memset is complete. The operation can optionally be associated to a stream by passing a non-zero stream argument. If stream is non-zero, the operation may overlap with operations in other streams. - Parameters
- dest – [out] Data ptr to be filled 
- value – [in] Constant value to be set 
- count – [in] Number of values to be set 
- stream – [in] Stream identifier 
 
- Returns
- topsSuccess, topsErrorInvalidValue, topsErrorNotInitialized 
 
- 
TOPS_PUBLIC_API topsError_t topsMemsetD16(topsDeviceptr_t dest, unsigned short value, size_t count)¶
- Fills the first count 16-bit values of the memory area pointed to by dest with the constant short value value. - Parameters
- dest – [out] Data ptr to be filled 
- value – [in] Constant value to be set 
- count – [in] Number of values to be set 
 
- Returns
- topsSuccess, topsErrorInvalidValue, topsErrorNotInitialized 
 
- TOPS_PUBLIC_API topsError_t topsMemsetD16Async (topsDeviceptr_t dest, unsigned short value, size_t count, topsStream_t stream __dparm(0))
- Fills the first count 16-bit values of the memory area pointed to by dest with the constant short value value. - topsMemsetD16Async() is asynchronous with respect to the host, so the call may return before the memset is complete. The operation can optionally be associated to a stream by passing a non-zero stream argument. If stream is non-zero, the operation may overlap with operations in other streams. - Parameters
- dest – [out] Data ptr to be filled 
- value – [in] Constant value to be set 
- count – [in] Number of values to be set 
- stream – [in] Stream identifier 
 
- Returns
- topsSuccess, topsErrorInvalidValue, topsErrorNotInitialized 
 
- 
TOPS_PUBLIC_API topsError_t topsMemsetD32(topsDeviceptr_t dest, int value, size_t count)¶
- Fills the memory area pointed to by dest with the constant integer value for specified number of times. - Parameters
- dest – [out] Data being filled 
- value – [in] Constant value to be set 
- count – [in] Number of values to be set 
 
- Returns
- topsSuccess, topsErrorInvalidValue, topsErrorNotInitialized 
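The D8, D16 and D32 variants take a count of values rather than a byte size, so the number of bytes written depends on the element width. The following plain C sketch models that reading on ordinary host arrays. It is a host-side reference model only, under the assumption that count means "number of elements"; model_memset_d16, model_memset_d32 and fill_bytes are hypothetical names, and none of this code touches device memory or calls the TOPS runtime.

```c
#include <stddef.h>
#include <stdint.h>

/* Host-side model of a 16-bit fill: writes `count` elements, i.e.
 * count * 2 bytes. Not the TOPS implementation. */
static void model_memset_d16(uint16_t *dest, uint16_t value, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        dest[i] = value;
    }
}

/* Host-side model of a 32-bit fill: writes `count` elements, i.e.
 * count * 4 bytes. The real API takes an int value; an unsigned
 * type is used here only to keep the model simple. */
static void model_memset_d32(uint32_t *dest, uint32_t value, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        dest[i] = value;
    }
}

/* Bytes touched by a fill of `count` elements of `elem_bytes` each. */
static size_t fill_bytes(size_t count, size_t elem_bytes)
{
    return count * elem_bytes;
}
```

Under this model, a destination buffer for a D32 fill needs at least fill_bytes(count, 4) bytes, while topsMemset takes the byte size directly.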
 
- TOPS_PUBLIC_API topsError_t topsMemsetAsync (void *dst, int value, size_t sizeBytes, topsStream_t stream __dparm(0))
- Fills the first sizeBytes bytes of the memory area pointed to by dst with the constant byte value value. - topsMemsetAsync() is asynchronous with respect to the host, so the call may return before the memset is complete. The operation can optionally be associated to a stream by passing a non-zero stream argument. If stream is non-zero, the operation may overlap with operations in other streams. - Parameters
- dst – [out] Pointer to device memory 
- value – [in] Value to set for each byte of specified memory 
- sizeBytes – [in] Size in bytes to set 
- stream – [in] Stream identifier 
 
- Returns
- topsSuccess, topsErrorInvalidValue, topsErrorMemoryFree 
 
- TOPS_PUBLIC_API topsError_t topsMemsetD32Async (topsDeviceptr_t dst, int value, size_t count, topsStream_t stream __dparm(0))
- Fills the memory area pointed to by dst with the constant integer value for specified number of times. - topsMemsetD32Async() is asynchronous with respect to the host, so the call may return before the memset is complete. The operation can optionally be associated to a stream by passing a non-zero stream argument. If stream is non-zero, the operation may overlap with operations in other streams. - Parameters
- dst – [out] Pointer to device memory 
- value – [in] Constant value to be set 
- count – [in] Number of values to be set 
- stream – [in] Stream identifier 
 
- Returns
- topsSuccess, topsErrorInvalidValue, topsErrorMemoryFree 
 
- 
TOPS_PUBLIC_API topsError_t topsMemGetInfo(size_t *free, size_t *total)¶
- Query memory info. - Return snapshot of free memory, and total allocatable memory on the device. - Returns in *free a snapshot of the current free memory. - Warning - The free memory only accounts for memory allocated by this process and may be optimistic. - Returns
- topsSuccess, topsErrorInvalidDevice, topsErrorInvalidValue 
 
- 
TOPS_PUBLIC_API topsError_t topsMemPtrGetInfo(void *ptr, size_t *size)¶
- Query memory pointer info. Return size of the memory pointer. - Parameters
- size – [out] The size of memory pointer. 
- ptr – [in] Pointer to memory for query. 
 
- Returns
- topsSuccess, topsErrorInvalidDevice, topsErrorInvalidValue 
 
- 
TOPS_PUBLIC_API topsError_t topsMemGetAddressRange(topsDeviceptr_t *pbase, size_t *psize, topsDeviceptr_t dptr)¶
- Get information on memory allocations. - Parameters
- pbase – [out] Base pointer address 
- psize – [out] Size of allocation 
- dptr – [in] Device Pointer 
 
- Returns
- topsSuccess, topsErrorInvalidDevicePointer 
 
2.7. PeerToPeer¶
This section describes the PeerToPeer device memory access functions of TOPS runtime API.
- 
TOPS_PUBLIC_API topsError_t topsDeviceCanAccessPeer(int *canAccess, int deviceId, int peerDeviceId)¶
- Checks if peer/ESL access between two devices is possible. - Parameters
- deviceId – [in] device id 
- peerDeviceId – [in] peer device id 
- canAccess – [out] access between two devices. bit[7~0]: each bit indicates the corresponding port status (1 link, 0 no-link). bit[15~8]: p2p link type (0 no-p2p-link, 1 PCIe switch link, 2 RCs link). bit[23~16]: cluster-as-device type (1 cluster as device). 
 
- Returns
- topsSuccess, topsErrorInvalidValue 
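The bit layout of canAccess described above can be unpacked with a few shifts and masks. The sketch below is plain C and follows only the field positions listed in the parameter description; the struct peer_access_info and the functions decode_can_access and port_is_linked are hypothetical names introduced for illustration, not TOPS types or calls.

```c
#include <stdint.h>

/* Hypothetical container for the three documented fields. */
typedef struct {
    uint8_t port_mask;     /* bits 7..0: one bit per port, 1 = link */
    uint8_t p2p_link_type; /* bits 15..8: 0 none, 1 PCIe switch, 2 RCs */
    uint8_t cluster_type;  /* bits 23..16: 1 = cluster as device */
} peer_access_info;

/* Split the value written to *canAccess into its documented fields. */
static peer_access_info decode_can_access(int can_access)
{
    uint32_t v = (uint32_t)can_access;
    peer_access_info info;

    info.port_mask = (uint8_t)(v & 0xFFu);
    info.p2p_link_type = (uint8_t)((v >> 8) & 0xFFu);
    info.cluster_type = (uint8_t)((v >> 16) & 0xFFu);
    return info;
}

/* Returns 1 when the given port (0..7) reports a link, else 0. */
static int port_is_linked(peer_access_info info, unsigned int port)
{
    return port < 8u && ((info.port_mask >> port) & 1u) != 0u;
}
```

For example, a value of 0x010205 would decode to ports 0 and 2 linked, link type 2 (RCs link), and cluster type 1.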
 
- 
TOPS_PUBLIC_API topsError_t topsDeviceEnablePeerAccess(int peerDeviceId, unsigned int flags)¶
- Set peer/ESL access property. - Parameters
- peerDeviceId – [in] peer device id 
- flags – [in] access property 
 
- Returns
- topsSuccess, topsErrorInvalidValue 
 
- 
TOPS_PUBLIC_API topsError_t topsDeviceEnablePeerAccessRegion(int peerDeviceId, void *peerDevPtr, size_t size, void **devPtr)¶
- Set peer access property for peer device’s region. - Parameters
- peerDeviceId – [in] peer device id 
- peerDevPtr – [in] the start device ptr of the peer device’s region 
- size – [in] the size of the peer device’s region 
- devPtr – [out] the p2p mapped device address 
 
- Returns
- topsSuccess, topsErrorInvalidValue 
 
- 
TOPS_PUBLIC_API topsError_t topsDeviceDisablePeerAccessRegion(int peerDeviceId, void *peerDevPtr, size_t size)¶
- Destroy peer access property for peer device’s region. - Parameters
- peerDeviceId – [in] peer device id 
- peerDevPtr – [in] the start device ptr of the peer device’s region 
- size – [in] the size of the peer device’s region 
 
- Returns
- topsSuccess, topsErrorInvalidValue 
 
- 
TOPS_PUBLIC_API topsError_t topsMemcpyPeer(void *dst, int dstDevice, const void *src, int srcDevice, size_t sizeBytes)¶
- Copies memory from one device to memory on another device. - For topsMemcpyPeer, the copy is always performed by the current device (set by topsSetDevice). For multi-GCU or peer-to-peer configurations, it is recommended to set the current device to the device where the src data is physically located. For optimal peer-to-peer copies, the copy device must be able to access the src and dst pointers (by calling topsDeviceEnablePeerAccess with the copy agent as the current device and src/dst as the peerDevice argument). - Parameters
- dst – [out] Destination address to copy to 
- dstDevice – [in] Dst device id 
- src – [in] Source address to copy from 
- srcDevice – [in] Src device id 
- sizeBytes – [in] Data size in bytes 
 
- Returns
- topsSuccess, topsErrorInvalidValue, topsErrorMemoryFree, topsErrorUnknown 
 
- 
TOPS_PUBLIC_API topsError_t topsMemcpyPeerAsync(void *dst, int dstDevice, const void *src, int srcDevice, size_t sizeBytes, topsStream_t stream)¶
- Copies memory from one device to memory on another device asynchronously. - For multi-GCU or peer-to-peer configurations, it is recommended to use a stream attached to the device where the src data is physically located. For optimal peer-to-peer copies, the copy device must be able to access the src and dst pointers (by calling topsDeviceEnablePeerAccess with the copy agent as the current device and src/dst as the peerDevice argument). - Parameters
- dst – [out] Destination address to copy to 
- dstDevice – [in] Dst device id 
- src – [in] Source address to copy from 
- srcDevice – [in] Src device id 
- sizeBytes – [in] Data size in bytes 
- stream – [in] Stream identifier 
 
- Returns
- topsSuccess, topsErrorInvalidValue, topsErrorMemoryFree, topsErrorUnknown 
 
- 
TOPS_PUBLIC_API topsError_t topsMemcpyPeerExt(void *dst, int dstDevice, const void *src, int srcDevice, size_t sizeBytes, topsTopologyMapType map, int port)¶
- Copies memory from one device to memory on another device with special priority. - For topsMemcpyPeerExt, the copy is always performed by the current device (set by topsSetDevice). For multi-GCU or peer-to-peer configurations, it is recommended to set the current device to the device where the src data is physically located. For optimal peer-to-peer copies, the copy device must be able to access the src and dst pointers (by calling topsDeviceEnablePeerAccess with the copy agent as the current device and src/dst as the peerDevice argument). - Parameters
- dst – [out] Destination address to copy to 
- dstDevice – [in] Dst device id 
- src – [in] Source address to copy from 
- srcDevice – [in] Src device id 
- sizeBytes – [in] Data size in bytes 
- map – [in] The link that the copy is expected to pass through 
- port – [in] ESL port id 
 
- Returns
- topsSuccess, topsErrorInvalidValue, topsErrorMemoryFree, topsErrorUnknown 
 
- 
TOPS_PUBLIC_API topsError_t topsMemcpyPeerExtAsync(void *dst, int dstDevice, const void *src, int srcDevice, size_t sizeBytes, topsTopologyMapType map, int port, topsStream_t stream)¶
- Copies memory from one device to memory on another device asynchronously with special priority. - For topsMemcpyPeerExtAsync, the copy is always performed by the current device (set by topsSetDevice). For multi-GCU or peer-to-peer configurations, it is recommended to set the current device to the device where the src data is physically located. For optimal peer-to-peer copies, the copy device must be able to access the src and dst pointers (by calling topsDeviceEnablePeerAccess with the copy agent as the current device and src/dst as the peerDevice argument). - Parameters
- dst – [out] Destination address to copy to 
- dstDevice – [in] Dst device id 
- src – [in] Source address to copy from 
- srcDevice – [in] Src device id 
- sizeBytes – [in] Data size in bytes 
- map – [in] The link that the copy is expected to pass through 
- port – [in] ESL port id 
- stream – [in] Stream identifier 
 
- Returns
- topsSuccess, topsErrorInvalidValue, topsErrorMemoryFree, topsErrorUnknown 
 
2.8. Module¶
This section describes the module management functions of TOPS runtime API.
- 
TOPS_PUBLIC_API topsError_t topsModuleLoad(topsModule_t *module, const char *fname)¶
- Loads code object from file into a topsModule_t. - Parameters
- fname – [in] 
- module – [out] 
 
- Returns
- topsSuccess, topsErrorInvalidValue, topsErrorInvalidContext, topsErrorFileNotFound, topsErrorOutOfMemory, topsErrorSharedObjectInitFailed, topsErrorNotInitialized 
 
- 
TOPS_PUBLIC_API topsError_t topsModuleUnload(topsModule_t module)¶
- Frees the module. - Parameters
- module – [in] 
- Returns
- topsSuccess, topsErrorInvalidValue. On success, the module is freed and the code objects associated with it are destroyed. 
 
- 
TOPS_PUBLIC_API topsError_t topsModuleGetFunction(topsFunction_t *function, topsModule_t module, const char *kname)¶
- Function with kname will be extracted if present in module. - Parameters
- module – [in] 
- kname – [in] 
- function – [out] 
 
- Returns
- topsSuccess, topsErrorInvalidValue, topsErrorInvalidContext, topsErrorNotInitialized, topsErrorNotFound 
 
- 
TOPS_PUBLIC_API topsError_t topsFuncGetAttributes(struct topsFuncAttributes *attr, const void *func)¶
- Find out attributes for a given function. - Parameters
- attr – [out] 
- func – [in] 
 
- Returns
- topsSuccess, topsErrorInvalidValue, topsErrorInvalidDeviceFunction 
 
- 
TOPS_PUBLIC_API topsError_t topsFuncGetAttribute(int *value, topsFunction_attribute attrib, topsFunction_t hfunc)¶
- Find out a specific attribute for a given function. - Parameters
- value – [out] 
- attrib – [in] 
- hfunc – [in] 
 
- Returns
- topsSuccess, topsErrorInvalidValue, topsErrorInvalidDeviceFunction 
 
- 
TOPS_PUBLIC_API topsError_t topsModuleLoadData(topsModule_t *module, const void *image)¶
- Builds a module from a code object which resides in host memory. image is a pointer to that location. - Parameters
- image – [in] 
- module – [out] 
 
- Returns
- topsSuccess, topsErrorNotInitialized, topsErrorOutOfMemory 
 
- 
TOPS_PUBLIC_API topsError_t topsModuleLoadDataEx(topsModule_t *module, const void *image, unsigned int numOptions, topsJitOption *options, void **optionValues)¶
- Builds a module from a code object which resides in host memory. image is a pointer to that location. Options are not used; topsModuleLoadData is called. - Parameters
- image – [in] 
- module – [out] 
- numOptions – [in] Number of options 
- options – [in] Options for JIT 
- optionValues – [in] Option values for JIT 
 
- Returns
- topsSuccess, topsErrorNotInitialized, topsErrorOutOfMemory 
 
- 
TOPS_PUBLIC_API topsError_t topsModuleLaunchKernel(topsFunction_t f, unsigned int gridDimX, unsigned int gridDimY, unsigned int gridDimZ, unsigned int blockDimX, unsigned int blockDimY, unsigned int blockDimZ, unsigned int sharedMemBytes, topsStream_t stream, void **kernelParams, void **extra)¶
- Launches kernel f with the given launch parameters and shared memory on stream, with arguments passed through kernelParams or extra. - Note that TOPS does not support a kernel launch whose total work-item count in any dimension (gridDim x blockDim) is 2^32 or more, so gridDim.x * blockDim.x, gridDim.y * blockDim.y and gridDim.z * blockDim.z must each be less than 2^32. - Warning - The kernelParams argument is not yet implemented in TOPS; use extra instead. Refer to tops_porting_driver_api.md for sample usage. - Parameters
- f – [in] Kernel to launch. 
- gridDimX – [in] X grid dimension specified as multiple of blockDimX. 
- gridDimY – [in] Y grid dimension specified as multiple of blockDimY. 
- gridDimZ – [in] Z grid dimension specified as multiple of blockDimZ. 
- blockDimX – [in] X block dimension specified in work-items. 
- blockDimY – [in] Y block dimension specified in work-items. 
- blockDimZ – [in] Z block dimension specified in work-items. 
- sharedMemBytes – [in] Amount of dynamic shared memory to allocate for this kernel. The TOPS-Clang compiler provides support for extern shared declarations. 
- stream – [in] Stream where the kernel should be dispatched. May be 0, in which case the default stream is used with associated synchronization rules. 
- kernelParams – [in] Not yet implemented in TOPS; use extra instead. 
- extra – [in] Pointer to kernel arguments. These are passed directly to the kernel and must be in the memory layout and alignment expected by the kernel. 
 
- Returns
- topsSuccess, topsInvalidDevice, topsErrorNotInitialized, topsErrorInvalidValue 
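The 2^32 limit described above can be checked on the host before a launch. The following is a minimal sketch, not part of the TOPS API: `Dim3`, `launchDimsWithinLimit` and `assertLaunchDims` are hypothetical names introduced here for illustration, and `Dim3` only stands in for the runtime's `dim3` type so that the snippet compiles without the TOPS headers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the runtime's dim3 type (assumption, not the real type).
struct Dim3 {
  unsigned int x, y, z;
};

// Returns true when grid * block stays below 2^32 in every dimension.
// The products are computed in 64 bits so they cannot wrap around.
inline bool launchDimsWithinLimit(Dim3 grid, Dim3 block) {
  constexpr std::uint64_t kLimit = std::uint64_t{1} << 32;
  auto ok = [&](unsigned int g, unsigned int b) {
    return static_cast<std::uint64_t>(g) * b < kLimit;
  };
  return ok(grid.x, block.x) && ok(grid.y, block.y) && ok(grid.z, block.z);
}

// Debug-build guard that could be called just before a launch.
inline void assertLaunchDims(Dim3 grid, Dim3 block) {
  assert(launchDimsWithinLimit(grid, block) && "gridDim * blockDim must be < 2^32");
}
```

Computing the product in 64 bits matters here: a 32-bit multiplication would silently wrap and could make an oversized launch look valid.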
 
- 
TOPS_PUBLIC_API topsError_t topsLaunchCooperativeKernel(const void *f, dim3 gridDim, dim3 blockDimX, void **kernelParams, size_t sharedMemBytes, topsStream_t stream)¶
- Launches kernel f with the given launch parameters and shared memory on stream, with arguments passed through kernelParams; thread blocks can cooperate and synchronize as they execute. - Note that TOPS does not support a kernel launch whose total work-item count in any dimension (gridDim x blockDim) is 2^32 or more. - Parameters
- f – [in] Kernel to launch. 
- gridDim – [in] Grid dimensions specified as multiple of blockDim. 
- blockDim – [in] Block dimensions specified in work-items 
- kernelParams – [in] A list of kernel arguments 
- sharedMemBytes – [in] Amount of dynamic shared memory to allocate for this kernel. The TOPS-Clang compiler provides support for extern shared declarations. 
- stream – [in] Stream where the kernel should be dispatched. May be 0, in which case the default stream is used with associated synchronization rules. 
 
- Returns
- topsSuccess, topsInvalidDevice, topsErrorNotInitialized, topsErrorInvalidValue, topsErrorCooperativeLaunchTooLarge 
 
2.9. Clang¶
This section describes the API to support the triple-chevron syntax.
- TOPS_PUBLIC_API topsError_t topsConfigureCall (dim3 gridDim, dim3 blockDim, size_t sharedMem __dparm(0), topsStream_t stream __dparm(0))
- Configures a kernel launch. - Note that TOPS does not support a kernel launch whose total work-item count in any dimension (gridDim x blockDim) is 2^32 or more. - Parameters
- gridDim – [in] grid dimension specified as multiple of blockDim. 
- blockDim – [in] block dimensions specified in work-items 
- sharedMem – [in] Amount of dynamic shared memory to allocate for this kernel. The TOPS-Clang compiler provides support for extern shared declarations. 
- stream – [in] Stream where the kernel should be dispatched. May be 0, in which case the default stream is used with associated synchronization rules. 
 
- Returns
- topsSuccess, topsInvalidDevice, topsErrorNotInitialized, topsErrorInvalidValue 
 
- 
TOPS_PUBLIC_API topsError_t topsSetupArgument(const void *arg, size_t size, size_t offset)¶
- Set a kernel argument. - Parameters
- arg – [in] Pointer to the argument in host memory. 
- size – [in] Size of the argument. 
- offset – [in] Offset of the argument on the argument stack. 
 
- Returns
- topsSuccess, topsInvalidDevice, topsErrorNotInitialized, topsErrorInvalidValue 
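The offset parameter positions each argument on the argument stack. The reference does not state the alignment rule, so the sketch below assumes natural alignment (each argument placed at a multiple of its own `alignof`); the layout the kernel actually expects takes precedence. `alignUp` and `pushArgument` are hypothetical helpers, not TOPS functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Rounds offset up to the next multiple of alignment (must be a power of two).
inline std::size_t alignUp(std::size_t offset, std::size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (offset + alignment - 1) & ~(alignment - 1);
}

// Hypothetical helper: appends one argument to a packed byte buffer at its
// naturally aligned offset (an assumption) and returns that offset.
template <typename T>
std::size_t pushArgument(std::vector<unsigned char>& buffer, const T& value) {
  const std::size_t offset = alignUp(buffer.size(), alignof(T));
  buffer.resize(offset + sizeof(T));  // any padding bytes are zero-filled
  std::memcpy(buffer.data() + offset, &value, sizeof(T));
  return offset;
}
```

Under this assumption, the offset returned for each argument is the value that would be passed as offset, and sizeof(T) the value passed as size.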
 
- 
TOPS_PUBLIC_API topsError_t topsLaunchByPtr(const void *func)¶
- Launch a kernel. - Parameters
- func – [in] Kernel to launch. 
- Returns
- topsSuccess, topsInvalidDevice, topsErrorNotInitialized, topsErrorInvalidValue 
 
- TOPS_PUBLIC_API topsError_t __topsPushCallConfiguration (dim3 gridDim, dim3 blockDim, size_t sharedMem __dparm(0), topsStream_t stream __dparm(0))
- Pushes the configuration of a kernel launch. - Note that TOPS does not support a kernel launch whose total work-item count in any dimension (gridDim x blockDim) is 2^32 or more. - Parameters
- gridDim – [in] grid dimension specified as multiple of blockDim. 
- blockDim – [in] block dimensions specified in work-items 
- sharedMem – [in] Amount of dynamic shared memory to allocate for this kernel. The TOPS-Clang compiler provides support for extern shared declarations. 
- stream – [in] Stream where the kernel should be dispatched. May be 0, in which case the default stream is used with associated synchronization rules. 
 
- Returns
- topsSuccess, topsInvalidDevice, topsErrorNotInitialized, topsErrorInvalidValue 
 
- 
TOPS_PUBLIC_API topsError_t __topsPopCallConfiguration(dim3 *gridDim, dim3 *blockDim, size_t *sharedMem, topsStream_t *stream)¶
- Pops the configuration of a kernel launch. - Note that TOPS does not support a kernel launch whose total work-item count in any dimension (gridDim x blockDim) is 2^32 or more. - Parameters
- gridDim – [out] grid dimension specified as multiple of blockDim. 
- blockDim – [out] block dimensions specified in work-items 
- sharedMem – [out] Amount of dynamic shared memory to allocate for this kernel. The TOPS-Clang compiler provides support for extern shared declarations. 
- stream – [out] Stream where the kernel should be dispatched. May be 0, in which case the default stream is used with associated synchronization rules. 
 
- Returns
- topsSuccess, topsInvalidDevice, topsErrorNotInitialized, topsErrorInvalidValue 
 
- TOPS_PUBLIC_API topsError_t topsLaunchKernel (const void *function_address, dim3 numBlocks, dim3 dimBlocks, void **args, size_t sharedMemBytes __dparm(0), topsStream_t stream __dparm(0))
- C-compliant kernel launch API. - Parameters
- function_address – [in] Kernel stub function pointer. 
- numBlocks – [in] Number of blocks. 
- dimBlocks – [in] Dimension of a block. 
- args – [in] Kernel arguments. 
- sharedMemBytes – [in] Amount of dynamic shared memory to allocate for this kernel. The TOPS-Clang compiler provides support for extern shared declarations. 
- stream – [in] Stream where the kernel should be dispatched. May be 0, in which case the default stream is used with associated synchronization rules. 
 
- Returns
- topsSuccess, topsErrorInvalidValue, topsInvalidDevice 
 
2.10. Runtime¶
This section describes the runtime compilation functions of the TOPS runtime API.
- 
enum topsrtcResult¶
- Values: - 
enumerator TOPSRTC_SUCCESS¶
 - 
enumerator TOPSRTC_ERROR_OUT_OF_MEMORY¶
 - 
enumerator TOPSRTC_ERROR_PROGRAM_CREATION_FAILURE¶
 - 
enumerator TOPSRTC_ERROR_INVALID_INPUT¶
 - 
enumerator TOPSRTC_ERROR_INVALID_PROGRAM¶
 - 
enumerator TOPSRTC_ERROR_INVALID_OPTION¶
 - 
enumerator TOPSRTC_ERROR_COMPILATION¶
 - 
enumerator TOPSRTC_ERROR_BUILTIN_OPERATION_FAILURE¶
 - 
enumerator TOPSRTC_ERROR_NAME_EXPRESSION_NOT_VALID¶
 - 
enumerator TOPSRTC_ERROR_INTERNAL_ERROR¶
 
- 
typedef enum topsrtcResult topsrtcResult
- 
typedef struct _topsrtcProgram *topsrtcProgram¶
- 
TOPS_PUBLIC_API const char *topsrtcGetErrorString(topsrtcResult result)¶
- Returns a string message describing the error that occurred. - See also - topsrtcResult - Warning - If the topsrtcResult value is not a defined code, it returns “Invalid TOPSRTC error code”. - Parameters
- result – [in] TOPSRTC API result code. 
- Returns
- const char pointer to the message string for the given topsrtcResult code. 
 
- 
TOPS_PUBLIC_API topsrtcResult topsrtcVersion(int *major, int *minor)¶
- Sets the output parameters major and minor with the TOPSRTC version. - Parameters
- major – [out] TOPS Runtime Compilation major version number. 
- minor – [out] TOPS Runtime Compilation minor version number. 
 
 
- 
TOPS_PUBLIC_API topsrtcResult topsrtcAddNameExpression(topsrtcProgram program, const char *name_expression)¶
- Adds the given name expression to the runtime compilation program. - If name_expression is NULL, it returns TOPSRTC_ERROR_INVALID_INPUT. - See also - topsrtcResult - Parameters
- program – [in] runtime compilation program instance. 
- name_expression – [in] const char pointer to the name expression. 
 
- Returns
- TOPSRTC_SUCCESS 
 
- 
TOPS_PUBLIC_API topsrtcResult topsrtcCompileProgram(topsrtcProgram program, int num_options, const char **options)¶
- Compiles the given runtime compilation program. - If the compiler fails to build the runtime compilation program, it returns TOPSRTC_ERROR_COMPILATION. - See also - topsrtcResult - Parameters
- program – [in] runtime compilation program instance. 
- num_options – [in] number of compiler options. 
- options – [in] compiler options as a const array of strings. 
 
- Returns
- TOPSRTC_SUCCESS 
 
- 
TOPS_PUBLIC_API topsrtcResult topsrtcCreateProgram(topsrtcProgram *program, const char *source, const char *name, int num_headers, const char **headers, const char **include_names)¶
- Creates an instance of topsrtcProgram from the given input parameters and stores it in the output parameter program. - If any input parameter is invalid, it returns TOPSRTC_ERROR_INVALID_INPUT or TOPSRTC_ERROR_INVALID_PROGRAM. - If it fails to create the program, it returns TOPSRTC_ERROR_PROGRAM_CREATION_FAILURE. - See also - topsrtcResult - Parameters
- program – [inout] runtime compilation program instance. 
- source – [in] const char pointer to the program source. 
- name – [in] const char pointer to the program name. 
- num_headers – [in] number of headers. 
- headers – [in] array of strings pointing to headers. 
- include_names – [in] array of strings pointing to names included in program source. 
 
- Returns
- TOPSRTC_SUCCESS 
 
- 
TOPS_PUBLIC_API topsrtcResult topsrtcDestroyProgram(topsrtcProgram *program)¶
- Destroys an instance of given topsrtcProgram. - If program is NULL, it will return TOPSRTC_ERROR_INVALID_INPUT. - See also - topsrtcResult - Parameters
- program – [in] runtime compilation program instance. 
- Returns
- TOPSRTC_SUCCESS 
 
- 
TOPS_PUBLIC_API topsrtcResult topsrtcGetLoweredName(topsrtcProgram program, const char *name_expression, const char **lowered_name)¶
- Gets the lowered (mangled) name from an instance of topsrtcProgram for the given name expression and stores it in the output parameter lowered_name. - If any input parameter is an invalid nullptr, it returns TOPSRTC_ERROR_INVALID_INPUT. - If name_expression is not found, it returns TOPSRTC_ERROR_NAME_EXPRESSION_NOT_VALID. - If it fails to get lowered_name from the program, it returns TOPSRTC_ERROR_COMPILATION. - See also - topsrtcResult - Parameters
- program – [in] runtime compilation program instance. 
- name_expression – [in] const char pointer to the name expression. 
- lowered_name – [inout] const char array to the lowered (mangled) name. 
 
- Returns
- TOPSRTC_SUCCESS 
 
- 
TOPS_PUBLIC_API topsrtcResult topsrtcGetProgramLog(topsrtcProgram program, char *log)¶
- Gets the log generated by the runtime compilation program instance. - See also - topsrtcResult - Parameters
- program – [in] runtime compilation program instance. 
- log – [out] memory pointer to the generated log. 
 
- Returns
- TOPSRTC_SUCCESS 
 
- 
TOPS_PUBLIC_API topsrtcResult topsrtcGetProgramLogSize(topsrtcProgram program, size_t *log_size)¶
- Gets the size of log generated by the runtime compilation program instance. - See also - topsrtcResult - Parameters
- program – [in] runtime compilation program instance. 
- log_size – [out] size of generated log. 
 
- Returns
- TOPSRTC_SUCCESS 
 
- 
TOPS_PUBLIC_API topsrtcResult topsrtcGetCode(topsrtcProgram program, char *code)¶
- Gets the compiled binary produced by the runtime compilation program instance. - See also - topsrtcResult - Parameters
- program – [in] runtime compilation program instance. 
- code – [out] char pointer to binary. 
 
- Returns
- TOPSRTC_SUCCESS 
 
- 
TOPS_PUBLIC_API topsrtcResult topsrtcGetCodeSize(topsrtcProgram program, size_t *codeSizeRet)¶
- Gets the size of the compiled binary produced by the runtime compilation program instance. - See also - topsrtcResult - Parameters
- program – [in] runtime compilation program instance. 
- codeSizeRet – [out] the size of the binary. 
 
- Returns
- TOPSRTC_SUCCESS 
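The log functions above come as a size/data pair, which suggests a query-then-fetch pattern: ask for the size, allocate a buffer, then retrieve the log. The sketch below illustrates that pattern only. So that it compiles on its own, it defines stand-in stubs with the signatures listed in this section; those stubs are not the real library, and their behavior (including the assumption that the reported size counts a trailing NUL) is invented for illustration. `fetchProgramLog` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// --- Stand-in stubs (NOT the real library) so the sketch compiles alone. ---
// Real code would include the TOPS runtime compilation header instead.
enum topsrtcResult {
  TOPSRTC_SUCCESS, TOPSRTC_ERROR_OUT_OF_MEMORY,
  TOPSRTC_ERROR_PROGRAM_CREATION_FAILURE, TOPSRTC_ERROR_INVALID_INPUT,
  TOPSRTC_ERROR_INVALID_PROGRAM, TOPSRTC_ERROR_INVALID_OPTION,
  TOPSRTC_ERROR_COMPILATION, TOPSRTC_ERROR_BUILTIN_OPERATION_FAILURE,
  TOPSRTC_ERROR_NAME_EXPRESSION_NOT_VALID, TOPSRTC_ERROR_INTERNAL_ERROR
};
struct _topsrtcProgram { std::string log; };  // fake program holding a fake log
typedef struct _topsrtcProgram* topsrtcProgram;

topsrtcResult topsrtcGetProgramLogSize(topsrtcProgram program, std::size_t* log_size) {
  if (!program || !log_size) return TOPSRTC_ERROR_INVALID_INPUT;
  *log_size = program->log.size() + 1;  // assumption: size includes the NUL
  return TOPSRTC_SUCCESS;
}
topsrtcResult topsrtcGetProgramLog(topsrtcProgram program, char* log) {
  if (!program || !log) return TOPSRTC_ERROR_INVALID_INPUT;
  std::memcpy(log, program->log.c_str(), program->log.size() + 1);
  return TOPSRTC_SUCCESS;
}
// --- End of stubs. ---

// Query the log size, allocate a buffer of that size, then fetch the log.
inline bool fetchProgramLog(topsrtcProgram program, std::string& out) {
  std::size_t size = 0;
  if (topsrtcGetProgramLogSize(program, &size) != TOPSRTC_SUCCESS) return false;
  out.assign(size, '\0');
  if (size == 0) return true;
  if (topsrtcGetProgramLog(program, &out[0]) != TOPSRTC_SUCCESS) return false;
  out.resize(std::strlen(out.c_str()));  // drop the terminator, if one was written
  return true;
}
```

The same two-step shape would apply to topsrtcGetCodeSize followed by topsrtcGetCode, with a byte buffer in place of the string.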
 
2.11. Extension¶
This section describes the runtime extension (Ext) functions of the TOPS runtime API.
- 
TOPS_PUBLIC_API topsError_t topsMemorySetDims(const void *devPtr, int64_t *dims, size_t dims_count)¶
- Sets the dimensions for device memory. - Limitation: devPtr must have been allocated by topsMalloc*. - Parameters
- devPtr – [in] The device memory to be set. 
- dims – [in] The dimension to be set. 
- dims_count – [in] The dimension size to be set. 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsMemoryGetDims(const void *devPtr, int64_t *dims, size_t *dims_count)¶
- Gets the dimensions of device memory. - Limitation: devPtr must have been allocated by topsMalloc*. - Parameters
- devPtr – [in] The device memory to be queried. 
- dims – [out] Pointer to the list that receives the dimensions. 
- dims_count – [out] Pointer that receives the number of dimensions (rank). 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsCreateExecutable(topsExecutable_t *exe, const void *bin, size_t size)¶
- Creates an executable from the specified binary data and size. - Note: Ownership of the new executable is transferred to the caller, who must call topsDestroyExecutable to destroy it when it is no longer in use. - Parameters
- bin – [in] The pointer to the binary data. 
- size – [in] The size of the binary data. 
- exe – [out] Pointer to get new executable. 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsCreateExecutableFromFile(topsExecutable_t *exe, const char *filepath)¶
- Creates an executable from the specified file. - Note: Ownership of the new executable is transferred to the caller, who must call topsDestroyExecutable to destroy it when it is no longer in use. - Parameters
- filepath – [in] The name of binary file. 
- exe – [out] Pointer to get new executable. 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsDestroyExecutable(topsExecutable_t exe)¶
- Destroy and clean up an executable. - Parameters
- exe – [in] Pointer to executable to destroy. 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsCreateResource(topsResource_t *res, topsResourceRequest_t req)¶
- Creates a resource bundle for the specified request. If the device is in reset status, this method retries until the reset finishes. - Note: Ownership of the new resource bundle is transferred to the caller, who must call topsDestroyResource to destroy it when it is no longer in use. - Parameters
- res – [out] Pointer to get new resource bundle or nullptr if failed. 
- req – [in] Requested resource of allocated resource bundle. 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsCreateResourceForExecutable(topsResource_t *res, topsExecutable_t exe)¶
- Creates a new resource bundle with the specified target resource. If the device is in reset status, this method retries until the reset finishes. - Note: Ownership of the new resource bundle is transferred to the caller, who must call topsDestroyResource to destroy it when it is no longer in use. - Parameters
- res – [out] Pointer to get new resource bundle or nullptr if failed. 
- exe – [in] Pointer to the executable to get target pointer which contains the requested resource. 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsDestroyResource(topsResource_t res)¶
- Destroy and clean up a resource bundle. - Parameters
- res – [in] Pointer to resource bundle to destroy. 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsResourceBundleGetAttribute(topsResource_t res, topsResourceBundleInfoType_t type, uint64_t *data)¶
- Query resource bundle attribute. - Parameters
- res – [in] Pointer to resource bundle to query. 
- type – [in] Type to query. 
- data – [out] Pointer to query output data. 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsMallocForResource(void **ptr, size_t size, topsResource_t res)¶
- Allocates memory on the res_bundle affinity memory bank. - If size is 0, no memory is allocated, *ptr is set to a non-null value, and topsSuccess is returned. - Use topsFree to release ptr. - Parameters
- ptr – [out] Pointer to the allocated memory 
- size – [in] Requested memory size 
- res – [in] res_bundle 
 
- Returns
- topsSuccess, topsErrorOutOfMemory, topsErrorInvalidValue (bad context, null *ptr) 
 
- 
TOPS_PUBLIC_API topsError_t topsLaunchExecutableV2(topsExecutable_t exe, topsResource_t res, void **inputs, size_t input_count, int64_t *input_dims, size_t *input_rank, void **outputs, size_t output_count, topsStream_t stream)¶
- Asynchronously runs an executable. - Runs an executable with the given inputs and outputs for training. - Note: by default this uses runExe in Operator mode (dynamic shapes are not supported); if res is nullptr, the default res_bundle is used. - Parameters
- exe – [in] Pointer to executable object. 
- res – [in] Pointer to res_bundle object. 
- inputs – [in] Inputs of executable. 
- input_count – [in] Inputs count of executable. 
- input_dims – [in] Inputs dims List of executable. 
- input_rank – [in] Inputs dims rank of executable. 
- outputs – [in] Outputs of executable. 
- output_count – [in] Outputs count of executable. 
- stream – [in] stream identifier. 
 
- Returns
- topsSuccess on success, or other on failure. 
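The launch functions take input shapes as two flat arrays, input_dims and input_rank. The reference does not spell out their layout. The sketch below assumes that input_dims holds the dimensions of all inputs back to back and that input_rank[i] gives the number of dimensions of input i; treat this as an assumption to verify against the runtime. `FlatShapes`, `flattenShapes` and `shapeOf` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Flattened shape description (assumed layout): dims holds every input's
// dimensions back to back; rank[i] is how many entries belong to input i.
struct FlatShapes {
  std::vector<std::int64_t> dims;
  std::vector<std::size_t> rank;
};

// Builds the flattened form from one shape vector per input.
inline FlatShapes flattenShapes(const std::vector<std::vector<std::int64_t>>& shapes) {
  FlatShapes flat;
  for (const auto& shape : shapes) {
    flat.rank.push_back(shape.size());
    flat.dims.insert(flat.dims.end(), shape.begin(), shape.end());
  }
  return flat;
}

// Recovers the shape of input i from the flattened form.
inline std::vector<std::int64_t> shapeOf(const FlatShapes& flat, std::size_t i) {
  assert(i < flat.rank.size());
  std::size_t begin = 0;
  for (std::size_t k = 0; k < i; ++k) begin += flat.rank[k];
  const auto first = flat.dims.begin() + static_cast<std::ptrdiff_t>(begin);
  return std::vector<std::int64_t>(first, first + static_cast<std::ptrdiff_t>(flat.rank[i]));
}
```

With this layout, `flat.dims.data()` and `flat.rank.data()` would be the pointers passed as input_dims and input_rank, and `flat.rank.size()` would match input_count.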
 
- 
TOPS_PUBLIC_API topsError_t topsLaunchExecutableV3(topsExecutable_t exe, topsResource_t res, void **inputs, size_t input_count, int64_t *input_dims, size_t *input_rank, void **outputs, size_t output_count, int64_t *output_dims, size_t *output_rank, topsStream_t stream)¶
- Asynchronously runs an executable. - Runs an executable with the given inputs and outputs for training. - Limitation: for performance, output_dims/output_rank should be set to nullptr. If they are set to non-null values, topsLaunchExecutableV3 inserts a blocking host-callback operation to obtain output_dims and output_rank. - Note: by default this uses runExe in Graph mode (dynamic shapes are supported); if res is nullptr, the default res_bundle is used. - Parameters
- exe – [in] Pointer to executable object. 
- res – [in] Pointer to res_bundle object. 
- inputs – [in] Inputs of executable. 
- input_count – [in] Inputs count of executable. 
- input_dims – [in] Inputs dims List of executable. 
- input_rank – [in] Inputs dims rank of executable. 
- outputs – [in] Outputs of executable. 
- output_count – [in] Outputs count of executable. 
- output_dims – [out] Outputs dims List of executable. 
- output_rank – [out] Outputs dims rank of executable. 
- stream – [in] stream identifier. 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsLaunchExecutable(topsExecutable_t exe, topsResource_t res, void **inputs, size_t input_count, int64_t *input_dims, size_t *input_rank, void **outputs, size_t output_count, int64_t *output_dims, size_t *output_rank, topsStream_t stream)¶
- Asynchronously runs an executable. - Runs an executable with the given inputs and outputs for training. - Limitation: for performance, output_dims/output_rank should be set to nullptr. If they are set to non-null values, topsLaunchExecutable may synchronize to obtain output_dims and output_rank. - Note: by default this uses runExe in Graph mode (dynamic shapes are supported); if res is nullptr, the default res_bundle is used. - Parameters
- exe – [in] Pointer to executable object. 
- res – [in] Pointer to res_bundle object. 
- inputs – [in] Inputs of executable. 
- input_count – [in] Inputs count of executable. 
- input_dims – [in] Inputs dims List of executable. 
- input_rank – [in] Inputs dims rank of executable. 
- outputs – [in] Outputs of executable. 
- output_count – [in] Outputs count of executable. 
- output_dims – [out] Outputs dims List of executable. 
- output_rank – [out] Outputs dims rank of executable. 
- stream – [in] stream identifier. 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsLaunchExecutableWithConstData(topsExecutable_t exe, topsResource_t res, void **inputs, size_t input_count, int64_t *input_dims, size_t *input_rank, void **outputs, size_t output_count, int64_t *output_dims, size_t *output_rank, void **const_datas, size_t const_datas_count, topsStream_t stream)¶
- Asynchronously runs an executable. - Runs an executable with the given inputs and outputs for training. - Limitation: for performance, output_dims/output_rank should be set to nullptr. If they are set to non-null values, topsLaunchExecutableWithConstData may synchronize to obtain output_dims and output_rank. - Note: by default this uses runExe in Graph mode (dynamic shapes are supported); if res is nullptr, the default res_bundle is used. - Parameters
- exe – [in] Pointer to executable object. 
- res – [in] Pointer to res_bundle object. 
- inputs – [in] Inputs of executable. 
- input_count – [in] Inputs count of executable. 
- input_dims – [in] Inputs dims List of executable. 
- input_rank – [in] Inputs dims rank of executable. 
- outputs – [in] Outputs of executable. 
- output_count – [in] Outputs count of executable. 
- output_dims – [out] Outputs dims List of executable. 
- output_rank – [out] Outputs dims rank of executable. 
- const_datas – [in] Const_datas of executable. 
- const_datas_count – [in] Const_datas count of executable. 
- stream – [in] stream identifier. 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsLaunchExecutableWithConstDataV2(topsExecutable_t exe, topsResource_t res, void **inputs, size_t input_count, int64_t *input_dims, size_t *input_rank, void **outputs, size_t output_count, void **const_datas, size_t const_datas_count, topsStream_t stream)¶
- Asynchronously runs an executable. - Runs an executable with the given inputs and outputs for training. - Note: by default this uses runExe in Operator mode (dynamic shapes are not supported); if res is nullptr, the default res_bundle is used. - Parameters
- exe – [in] Pointer to executable object. 
- res – [in] Pointer to res_bundle object. 
- inputs – [in] Inputs of executable. 
- input_count – [in] Inputs count of executable. 
- input_dims – [in] Inputs dims List of executable. 
- input_rank – [in] Inputs dims rank of executable. 
- outputs – [in] Outputs of executable. 
- output_count – [in] Outputs count of executable. 
- const_datas – [in] Const_datas of executable. 
- const_datas_count – [in] Const_datas count of executable. 
- stream – [in] stream identifier. 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsLaunchExecutableWithConstDataV3(topsExecutable_t exe, topsResource_t res, void **inputs, size_t input_count, int64_t *input_dims, size_t *input_rank, void **outputs, size_t output_count, int64_t *output_dims, size_t *output_rank, void **const_datas, size_t const_datas_count, topsStream_t stream)¶
- Asynchronously runs an executable. - Runs an executable with the given inputs and outputs for training. - Limitation: for performance, output_dims/output_rank should be set to nullptr. If they are set to non-null values, topsLaunchExecutableWithConstDataV3 inserts a blocking host-callback operation to obtain output_dims and output_rank. - Note: by default this uses runExe in Graph mode (dynamic shapes are supported); if res is nullptr, the default res_bundle is used. - Parameters
- exe – [in] Pointer to executable object. 
- res – [in] Pointer to res_bundle object. 
- inputs – [in] Inputs of executable. 
- input_count – [in] Inputs count of executable. 
- input_dims – [in] Inputs dims List of executable. 
- input_rank – [in] Inputs dims rank of executable. 
- outputs – [in] Outputs of executable. 
- output_count – [in] Outputs count of executable. 
- output_dims – [out] Outputs dims List of executable. 
- output_rank – [out] Outputs dims rank of executable. 
- const_datas – [in] Const_datas of executable. 
- const_datas_count – [in] Const_datas count of executable. 
- stream – [in] stream identifier. 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsExecutableGetConstManagedData(topsExecutable_t exe, unsigned int *numOptions, char **name, void **address, uint64_t *size, int64_t *uid, void **flag)¶
- Gets the const managed data. - Note: call this function twice; the first call obtains numOptions and the second obtains the remaining outputs. - Parameters
- numOptions – [out] Pointer to ConstManagedData array size. 
- exe – [in] Pointer to executable object. 
- name – [out] ConstManaged pair of name. 
- address – [out] ConstManaged pair of address. 
- size – [out] ConstManaged address pair of size. 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsExecutableUpdateConstantKey(topsExecutable_t exe)¶
- Updates the constant section key. - Note: this must be called before saving the executable to a file. - Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsExecutableUpdateRuntimeResource(topsExecutable_t exe, topsResource_t res, topsStream_t stream)¶
- Updates the executable runtime resource. - Note: call this after refit; it partially updates the constants via h2d (host-to-device) copy. - Parameters
- exe – [in] Pointer to executable object. 
- res – [in] Pointer to res_bundle object. 
- stream – [in] stream identifier. 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsExecutableLoadConstData(topsExecutable_t exe, topsResource_t res, void **&dev_ptr, size_t *dev_ptr_count)¶
- Loads constant data. - Note: the pointer variable dev_ptr must be initialized to nullptr before calling this API. When finished, call topsFree on each dev_ptr[index] to free the device memory, then delete dev_ptr. - Parameters
- exe – [in] Pointer to executable object. 
- res – [in] Pointer to res_bundle object. 
- dev_ptr – [out] Dev_mem of executable. 
- dev_ptr_count – [out] Dev_mem count of executable. 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsExecutableGetSubFuncInfo(topsExecutable_t exe, char **info_raw_data, size_t *info_size, int **param_info, int *param_count)¶
- Gets executable sub-function information. - Note: use only in refit stage 2. The user should parse info_raw_data to get detailed sub-function information. Example: param_info[2, 1, 3, 2, 3, 1] means subfuncA needs 2 inputs and 1 output, subfuncB needs 3 inputs and 2 outputs, and subfuncC needs 3 inputs and 1 output. - Parameters
- exe – [in] Pointer to executable object 
- info_raw_data – [out] raw data of sub function information 
- info_size – [out] size of info_raw_data 
- param_info – [out] count list of input and output for each sub function 
- param_count – [out] total count of input and output 
 
- Returns
- topsSuccess on success, or other on failure. 
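The param_info example above describes a flat list of (inputs, outputs) pairs, one pair per sub function. A minimal sketch of splitting such a list follows. `SubFuncArity` and `splitParamInfo` are hypothetical names. The sketch takes the number of entries in the list explicitly, because the reference does not state whether param_count is that entry count or the sum of the listed values.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Input and output counts of one sub function.
struct SubFuncArity {
  int inputs;
  int outputs;
};

// Splits a flat [in0, out0, in1, out1, ...] list into one pair per sub function.
// entry_count is the number of ints in param_info and must be even.
inline std::vector<SubFuncArity> splitParamInfo(const int* param_info, std::size_t entry_count) {
  assert(param_info != nullptr || entry_count == 0);
  assert(entry_count % 2 == 0);
  std::vector<SubFuncArity> result;
  result.reserve(entry_count / 2);
  for (std::size_t i = 0; i + 1 < entry_count; i += 2) {
    result.push_back({param_info[i], param_info[i + 1]});
  }
  return result;
}
```

For the list in the note, {2, 1, 3, 2, 3, 1}, this yields three entries corresponding to subfuncA, subfuncB and subfuncC.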
 
- 
TOPS_PUBLIC_API topsError_t topsExecutableGetRefitFlag(topsExecutable_t exe, int *refit_flag)¶
- Gets the executable refit flag. - Parameters
- exe – [in] Pointer to executable object. 
- refit_flag – [out] Flag indicating whether this executable is refittable. 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsExecutableCallSubFunc(topsExecutable_t exe, const char *func_name, void **inputs, size_t input_count, int64_t *input_dims, size_t *input_rank, void **outputs, size_t output_count, topsStream_t stream)¶
- Calls a sub function in the executable. - Limitation: inputs and outputs must be allocated by topsMalloc/topsHostMalloc. - Note: use only in refit stage 2 for constant preprocessing. - Parameters
- exe – [in] Pointer to executable object. 
- func_name – [in] sub function name 
- inputs – [in] Inputs of sub function. 
- input_count – [in] Inputs count of sub function. 
- input_dims – [in] Inputs dims List of sub function. 
- input_rank – [in] Inputs dims rank of sub function. 
- outputs – [in] Outputs of sub function. 
- output_count – [in] Outputs count of sub function. 
- stream – [in] stream identifier. 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsDeviceSetResource(topsResource_t res)¶
- Sets the default resource_bundle. - Limitation: this sets the global resource_bundle of the current thread; set the resource_bundle before calling any other TOPS API. - Parameters
- res – [in] Pointer to res_bundle object. 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsConstBufferGet(uint64_t hash_key, void *init_data, size_t size, void **ptr, topsStream_t stream)¶
- Gets a const buffer. - Gets or initializes the const buffer identified by hash_key. - Parameters
- hash_key – [in] Key that identifies the const buffer. 
- init_data – [in] Pointer to const raw buffer. 
- size – [in] Size of the const buffer. 
- ptr – [out] Pointer to get const buffer. 
- stream – [in] stream identifier. 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsConstBufferPut(uint64_t hash_key, topsStream_t stream)¶
- Put const buffer. - Parameters
- hash_key – [in] value to const buffer special key. 
- stream – [in] stream identifier. 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsExecutableQueryInfo(topsExecutable_t exe, topsExecutableInfoType_t info_type, uint64_t *data)¶
- Queries executable info. - Limitation: the caller must allocate the memory for data. - Parameters
- exe – [in] Pointer to executable object. 
- info_type – [in] Type to query. 
- data – [out] Pointer to query output data. 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsExecutableQueryInfoV2(topsExecutable_t exe, topsExecutableInfoType_t info_type, int64_t *data)¶
- Queries executable info. - Limitation: the caller must allocate the memory for data. - Parameters
- exe – [in] Pointer to executable object. 
- info_type – [in] Type to query. 
- data – [out] Pointer to query output data. 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsExecutableQueryInputName(topsExecutable_t exe, int index, char **name)¶
- Query executable input name. - Parameters
- exe – [in] Pointer to executable object. 
- index – [in] Specify which input to query. 
- name – [out] The name of the input. 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsExecutableQueryOutputName(topsExecutable_t exe, int index, char **name)¶
- Query executable output name. - Parameters
- exe – [in] Pointer to executable object. 
- index – [in] Specify which output to query. 
- name – [out] The name of the output. 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsExecutableSaveToFile(topsExecutable_t exe, const char *path)¶
- Save executable to a specified file. - Parameters
- exe – [in] Pointer to executable object. 
- path – [in] The name of the file to be used to save executable. 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsExtMallocWithBank(void **ptr, size_t size, uint64_t bank)¶
- Allocates memory on the specified memory bank. - If size is 0, no memory is allocated, *ptr is set to a non-null value, and topsSuccess is returned. - Note: the default bank is 0. - Use topsFree to release ptr. - Parameters
- ptr – [out] Pointer to the allocated memory 
- size – [in] Requested memory size 
- bank – [in] memory bank 
 
- Returns
- topsSuccess, topsErrorOutOfMemory, topsErrorInvalidValue (bad context, null *ptr) 
 
- 
TOPS_PUBLIC_API topsError_t topsExtMallocWithBankV2(void **ptr, size_t size, uint64_t bank, unsigned int flags)¶
- Allocate memory on the specified memory bank. - If size is 0, no memory is allocated, *ptr returns non-nullptr, and topsSuccess is returned. - NOTE: the default bank is 0. - Use topsFree to release ptr. - Parameters
- ptr – [out] Pointer to the allocated memory 
- size – [in] Requested memory size 
- bank – [in] memory bank 
- flags – [in] Type of memory allocation. Only topsDeviceMallocDefault, topsMallocTopDown, and topsMallocForbidMergeMove are supported. 
 
- Returns
- topsSuccess, topsErrorOutOfMemory, topsErrorInvalidValue (bad context, null *ptr) 
 
- 
TOPS_PUBLIC_API topsError_t topsExtMallocWithAffinity(void **ptr, size_t size, uint64_t bank, unsigned int flags)¶
- Allocate memory on the logical memory bank. - If size is 0, no memory is allocated, *ptr returns non-nullptr, and topsSuccess is returned. - NOTE: the default bank is 0; the logical bank is mapped to a physical bank. - Use topsFree to release ptr. - Parameters
- ptr – [out] Pointer to the allocated memory 
- size – [in] Requested memory size 
- bank – [in] memory bank 
- flags – [in] Type of memory allocation. Only topsDeviceMallocDefault, topsMallocTopDown, and topsMallocForbidMergeMove are supported. 
 
- Returns
- topsSuccess, topsErrorOutOfMemory, topsErrorInvalidValue (bad context, null *ptr) 
 
- 
TOPS_PUBLIC_API topsError_t topsExtLaunchCooperativeKernelMultiCluster(const topsExtLaunchParams_t *launchParamsList, int numClusters, dim3 gridDim, dim3 blockDim, topsStream_t stream)¶
- Asynchronously launch kernel. - Parameters
- launchParamsList – [in] Pointer to launch params List. 
- numClusters – [in] Number of clusters to use. 
- gridDim – [in] Grid Dimension to use. 
- blockDim – [in] Block Dimension to use. 
- stream – [in] stream identifier. 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsExtSetProfileMeta(uint8_t *meta, uint32_t size, int64_t compilation_id)¶
- Set profile meta data. - Parameters
- meta – [in] Pointer to profile meta data. 
- size – [in] Profile meta data size in bytes. 
- compilation_id – [in] Compilation id. 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsScatterMemoryGetInfo(const void *dev_ptr, int index, int64_t *win_pos, size_t *win_size, int64_t *map_ctrl, size_t *map_size)¶
- Get scatter memory config: win_pos and map_ctrl. - This interface only works when dev_ptr is created with topsMallocScatter and initialized with topsScatterSetSubMem. - Limitation: the user needs to allocate memory for the win_pos and map_ctrl arrays. - Parameters
- dev_ptr – [in] Pointer to scatter memory. 
- index – [in] Index to sub mem. 
- win_pos – [in] Pointer to sub memory window position. 
- win_size – [in] Pointer to sub memory window position size. 
- map_ctrl – [in] Pointer to sub memory map ctrl. 
- map_size – [in] Pointer to sub memory map ctrl size. 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsScatterMemoryGetSubNum(const void *dev_ptr, size_t *size)¶
- Get the number of sub memories in a scatter memory. - This interface only works when dev_ptr is created with topsMallocScatter and initialized with topsScatterSetSubMem. - Parameters
- dev_ptr – [in] Pointer to scatter memory. 
- size – [in] Pointer to sub memory number size. 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsMallocScatter(void **ptr, size_t size)¶
- Creates scatter memory on local device. - Creates a device memory object for holding a list of scattered DeviceMemory objects. Memory is not actually allocated. The user can call topsScatterPopulateSub to allocate sub device memory and topsScatterGetSubMem to query sub memory objects. Scatter DeviceMemory handles are processed automatically by runtime APIs such as topsMemcpy, so a scatter memory can be viewed as a plain memory buffer. - Use topsFree to release ptr. - A scatter DeviceMemory is invalid until it is fully constructed with sub DeviceMemory objects that are split correctly in both size and dimension (all related DeviceMemory objects must have invoked SetDims). - Parameters
- ptr – [in] Pointer to the allocated memory 
- size – [in] Requested creation size in bytes. 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsScatterClearSubMemory(const void *dev_ptr)¶
- Clear submemory. - This interface clears all submemory belonging to the scatter memory. - Note: This API will not be supported in the future. - Parameters
- dev_ptr – [in] Pointer to scatter memory. 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsScatterSetSubMem(const void *dev_ptr, void *sub_dptr, int64_t *win_pos, size_t win_size, int64_t *map_ctrl, size_t map_size)¶
- Set a submemory to construct scatter memory. - This interface only works when dev_ptr is created with the topsMallocScatter API. - Note: This API will not be supported in the future. - Parameters
- dev_ptr – [in] Pointer to scatter memory. 
- sub_dptr – [in] The submemory object that the user creates. It must have invoked SetDims. 
- win_pos – [in] The anchor of window in the scatter memory holding this submemory. 
- win_size – [in] The size of window position, the max size is 8. 
- map_ctrl – [in] The dimension remap of reshaping. It’s a natural number up to rank. 
- map_size – [in] The size of map ctrl, the max size is 8. 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsScatterPopulateSub(const void *dev_ptr, size_t bpe, uint64_t bank, int64_t *dims, int64_t *win_pos, int64_t *map_ctrl, size_t rank_size)¶
- Populate scatter memory with sub-memory. - This interface only works when dev_ptr is created with the topsMallocScatter API. Each invocation populates one sub-memory. The parent scatter memory owns this sub-memory, so all populated sub-memories are freed when the parent is freed. topsScatterGetSubMem does NOT retain the sub-memory. - Parameters
- dev_ptr – [in] Pointer to scatter memory. 
- bpe – [in] memory bpe of single sub-memory. 
- bank – [in] memory bank of single sub-memory. 
- dims – [in] The dimension to be set. 
- win_pos – [in] The anchor of window in the scatter memory holding this submemory. 
- map_ctrl – [in] The dimension remap of reshaping. It’s a natural number up to rank. 
- rank_size – [in] The size of map ctrl/dims/win_position, the max size is 8. 
 
- Returns
- topsSuccess on success, or other on failure. 
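A caller may want to sanity-check the dims, win_pos, and map_ctrl arrays before passing them to topsScatterPopulateSub. The following is a minimal sketch of such a check, not part of the TOPS API. The helper names `scatter_sub_args_ok` and `scatter_sub_bytes` are hypothetical, and the sketch assumes that bpe means bytes per element and that map_ctrl entries are 0-based dimension indices below the rank; neither assumption is confirmed by the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The parameter docs state "the max size is 8" for rank_size. */
#define SCATTER_MAX_RANK 8

/* Hypothetical helper (not a TOPS API): returns 1 if the arrays look
 * consistent for a single sub-memory, 0 otherwise. */
static int scatter_sub_args_ok(const int64_t *dims, const int64_t *win_pos,
                               const int64_t *map_ctrl, size_t rank_size) {
    if (dims == NULL || win_pos == NULL || map_ctrl == NULL) return 0;
    if (rank_size == 0 || rank_size > SCATTER_MAX_RANK) return 0;
    for (size_t i = 0; i < rank_size; ++i) {
        if (dims[i] <= 0) return 0;    /* every dimension must be positive */
        if (win_pos[i] < 0) return 0;  /* window anchor cannot be negative */
        /* Assumption: map_ctrl holds 0-based dimension indices below rank. */
        if (map_ctrl[i] < 0 || (uint64_t)map_ctrl[i] >= rank_size) return 0;
    }
    return 1;
}

/* Hypothetical helper: byte size of one sub-memory, assuming bpe is
 * "bytes per element". Overflow is not checked in this sketch. */
static uint64_t scatter_sub_bytes(size_t bpe, const int64_t *dims,
                                  size_t rank_size) {
    uint64_t bytes = (uint64_t)bpe;
    for (size_t i = 0; i < rank_size; ++i) {
        assert(dims[i] > 0);
        bytes *= (uint64_t)dims[i];
    }
    return bytes;
}
```

Running the check first keeps malformed arguments from reaching the runtime, where the only feedback is a non-success return code.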
 
- 
TOPS_PUBLIC_API topsError_t topsScatterInplace(const void *dev_ptr, size_t sub_count, size_t bpe, uint64_t *bank_list, int64_t **dims_list, int64_t **win_pos_list, int64_t **map_ctrl_list, size_t rank_size)¶
- Inplace scatter memory with sub-memory description. - This interface only works when dev_ptr is created with the topsMallocScatter API. The parent scatter memory owns this sub-memory, so all populated sub-memories are freed when the parent is freed. - topsScatterGetSubMem does NOT retain the sub-memory. - Parameters
- dev_ptr – [in] Pointer to scatter memory. If dev_ptr already has sub-memory, that sub-memory must have been allocated by topsScatterPopulateSub. 
- sub_count – [in] Number of sub memory 
- bpe – [in] memory bpe of single sub-memory. 
- bank_list – [in] memory bank of sub-memory list. 
- dims_list – [in] The dimension list to be set. 
- win_pos_list – [in] The anchor of window list in the scatter memory holding this submemory. 
- map_ctrl_list – [in] The dimension remap of reshaping list. It’s a natural number up to rank. 
- rank_size – [in] The size of map ctrl/dims/win_position, the max size is 8. 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsScatterGetSubMem(const void *dev_ptr, int index, void **sub_dptr)¶
- Get a submemory from scatter memory. - This interface only works when DeviceMemory is created with the topsMallocScatter API. The user can also hold the sub DeviceMemory directly instead of getting it from the scatter memory. - Parameters
- dev_ptr – [in] Pointer to scatter memory. 
- index – [in] The index returned by ScatterSetSubMemory. 
- sub_dptr – [in] The submemory object that user set by topsScatterSetSubMem. 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsMemoryReduceAsync(void *lhs, void *rhs, void *result, topsReduceOpType op, topsReduceDataType dtype, uint32_t element_cnt, topsStream_t stream)¶
- Asynchronously request reduce service. - Request reduce service on Device. The lhs, rhs and result should be on the same device - Parameters
- lhs – [in] Left Hand Side device memory address object. 
- rhs – [in] Right Hand Side device memory object. 
- result – [in] Result device memory object. 
- op – [in] Reduce op type. 
- dtype – [in] Reduce data type. 
- element_cnt – [in] Count of data to do the calculation 
- stream – [in] stream identifier. 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsMemCachePrefetch(topsDeviceptr_t dptr, size_t size)¶
- Prefetch the contents of device memory to the L2 buffer. The API is available on GCU 3.0 products only and has no effect on other products. A device memory range is defined by a device memory pointer and a size. The range can equal the original device memory or be a sub-range of it. Upper-level software must guarantee that sub-memories do not overlap. Before prefetching, make sure the contents of the device memory are ready. - Parameters
- dptr – [in] Global device pointer 
- size – [in] Global size in bytes 
 
- Returns
- topsSuccess, topsErrorInvalidValue, topsErrorMemoryAllocation 
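Because the caller is responsible for keeping prefetched sub-ranges disjoint, a small host-side check can be useful before issuing a batch of prefetches. The sketch below is illustrative only: `mem_range_t`, `ranges_overlap`, and `any_overlap` are hypothetical names, not TOPS APIs, and it treats each range as a half-open interval of bytes starting at a numeric address (it assumes the device pointer can be compared as an integer and does not guard against address overflow).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one prefetch request: [base, base + size). */
typedef struct {
    uintptr_t base;
    size_t size;
} mem_range_t;

/* Returns 1 if the two half-open ranges share at least one byte. */
static int ranges_overlap(mem_range_t a, mem_range_t b) {
    if (a.size == 0 || b.size == 0) return 0;  /* empty ranges never overlap */
    return a.base < b.base + b.size && b.base < a.base + a.size;
}

/* Returns 1 if any pair in the list overlaps. The pairwise scan is O(n^2),
 * which is acceptable for the short lists this sketch targets. */
static int any_overlap(const mem_range_t *ranges, size_t count) {
    for (size_t i = 0; i < count; ++i) {
        for (size_t j = i + 1; j < count; ++j) {
            if (ranges_overlap(ranges[i], ranges[j])) return 1;
        }
    }
    return 0;
}
```

Adjacent ranges, where one ends exactly where the next begins, are reported as non-overlapping.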
 
- 
TOPS_PUBLIC_API topsError_t topsMemCacheInvalidate(topsDeviceptr_t dptr, size_t size)¶
- Invalidate L2 buffer cache data. The API is available on GCU 3.0 products only and has no effect on other products. The API deletes L2 buffer cache data that was prefetched through topsMemCachePrefetch(). The original device memory remains unchanged. A device memory range is defined by a device memory pointer and a size. - Parameters
- dptr – [in] Global device pointer 
- size – [in] Global size in bytes 
 
- Returns
- topsSuccess, topsErrorInvalidValue, topsErrorMemoryAllocation 
 
- 
TOPS_PUBLIC_API topsError_t topsExtGetMcAvailableMemSize(uint64_t mc, uint64_t *size)¶
- Get available memory size for mc - Parameters
- mc – [in] mc index. 
- size – [out] Pointer to available size for this mc. 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsExecutableGetSectionCount(topsExecutable_t exe, topsExtExecutableSectionHeaderType_t sh_type, uint64_t *count)¶
- Get section count of the executable. - Note: calling with sh_type = topsExtExecutableSectionHeaderType::SHT_LAST_TYPE returns the total section count of the executable. - Parameters
- exe – [in] Pointer to executable object. 
- sh_type – [in] Section header type. 
- count – [out] Pointer to count of the section. 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsExecutableGetSectionInfo(topsExecutable_t exe, topsExtExecutableSectionHeaderType_t sh_type, int mc, topsExtExecutableSectionInfo_t *info)¶
- Get specific section information of the executable - Parameters
- exe – [in] Pointer to executable object. 
- sh_type – [in] Section header type. 
- mc – [in] mc index. 
- info – [out] section information. 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsGetMemUsageInfo(size_t *current_used_bytes, size_t *peek_used_bytes, int bank_total)¶
- Get memory usage information. - Parameters
- current_used_bytes – [out] Current size of used memory in bytes. 
- peak_used_bytes – [out] Peak size of used memory in bytes. 
- bank_total – [in] The total number of memory bank. 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsGetAffinityBankList(int *affinity_bank_list, int bank_total)¶
- Get memory bank list of affinity - Parameters
- affinity_bank_list – [out] Affinity bank list of memory. 
- bank_total – [in] The total number of memory bank. 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsGetAffinityBankListV2(topsResource_t res, int *affinity_bank_list, int bank_total)¶
- Get memory bank list of affinity - Parameters
- res – [in] Pointer to res_bundle object. 
- affinity_bank_list – [out] Affinity bank list of memory. 
- bank_total – [in] The total number of memory bank. 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsMemGetInfoExt(size_t *used_list, size_t *free_list, size_t *max_available_list, size_t *total_list, int bank_total)¶
- Query memory usage info list. - Returns lists of used, free, max available, and total memory for each memory bank. - Parameters
- used_list – [out] Pointer to used memory list. 
- free_list – [out] Pointer to free memory list. 
- max_available_list – [out] Pointer to max_available memory list. 
- total_list – [out] Pointer to total memory list. 
- bank_total – [in] The total number of memory bank. 
 
- Returns
- topsSuccess, topsErrorInvalidDevice, topsErrorInvalidValue 
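Since the results are reported per bank, callers often want device-wide totals. The sketch below sums caller-provided lists of length bank_total. It is a host-side illustration only: `bank_totals_t` and `sum_bank_lists` are hypothetical names, and the assumption that each list holds exactly one entry per bank, in bytes, is inferred from the parameter names rather than stated above.

```c
#include <stddef.h>

/* Hypothetical aggregate of the per-bank lists. */
typedef struct {
    size_t used;
    size_t free_bytes;
    size_t total;
} bank_totals_t;

/* Sums the per-bank values. Assumes every list has bank_total entries. */
static bank_totals_t sum_bank_lists(const size_t *used_list,
                                    const size_t *free_list,
                                    const size_t *total_list,
                                    int bank_total) {
    bank_totals_t totals = {0, 0, 0};
    for (int i = 0; i < bank_total; ++i) {
        totals.used += used_list[i];
        totals.free_bytes += free_list[i];
        totals.total += total_list[i];
    }
    return totals;
}
```

In real use the three arrays would be the buffers filled by topsMemGetInfoExt after it returns topsSuccess.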
 
- 
TOPS_PUBLIC_API topsError_t topsKernelSignalMaxNumGet(int *num)¶
- Get the maximum kernel signal number on the current device. - Parameters
- num – [out] Pointer to a variable allocated by the caller. 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsKernelSignalAvailableNumGet(int *num)¶
- Get the available kernel signal number on the current device. - Parameters
- num – [out] Pointer to a variable allocated by the caller. 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsKernelSignalAlloc(int *handle)¶
- Get a kernel signal handle allocated by the KMD on the current device. - Parameters
- handle – [out] Receives the allocated handle. 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsKernelSignalFree(int handle)¶
- Free a kernel signal handle allocated by the KMD on the current device. - Parameters
- handle – [in] The kernel signal handle. 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsKernelSignalRead(int handle, uint32_t *value)¶
- Get the kernel signal value indicated by handle. - Parameters
- handle – [in] The kernel signal handle. 
- value – [out] Receives the kernel signal value. 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsKernelSignalWrite(int handle, uint32_t value)¶
- Set the kernel signal value indicated by handle. - Parameters
- handle – [in] The kernel signal handle. 
- value – [in] The value of the kernel signal. 
 
- Returns
- topsSuccess on success, or other on failure. 
 
- 
TOPS_PUBLIC_API topsError_t topsSetDeviceAndResourceReservation(int deviceId, topsResourceRequestV2_t *resRequest)¶
- Set default device to be used for subsequent tops API calls from this thread. - Sets device as the default device for the calling host thread. Valid device ids are 0…(topsGetDeviceCount()-1). - Many TOPS APIs implicitly use the “default device”: - Any device memory subsequently allocated from this host thread (using topsMalloc) will be allocated on device. 
- Any streams or events created from this host thread will be associated with device. 
- Any kernels launched from this host thread (using topsLaunchKernel) will be executed on device (unless a specific stream is specified, in which case the device associated with that stream will be used). 
 - This function may be called from any host thread. Multiple host threads may use the same device. This function does no synchronization with the previous or new device, and has very little runtime overhead. Applications can use topsSetDevice to quickly switch the default device before making a TOPS runtime call which uses the default device. - The default device is stored in thread-local storage for each thread. Thread-pool implementations may inherit the default device of the previous thread. A good practice is to always call topsSetDevice at the start of a TOPS coding sequence to establish a known standard device. - Parameters
- deviceId – [in] Valid device in range 0…(topsGetDeviceCount()-1). 
- resRequest – [in] Create a device by applying for a specified number of threads. 
 
- Returns
- topsSuccess, topsErrorInvalidDevice, topsErrorDeviceAlreadyInUse