2. API Function¶
2.1. Driver¶
This section describes the Driver Initialization and Version.
-
TOPS_PUBLIC_API topsError_t topsInit(unsigned int flags)¶
Explicitly initializes the TOPS runtime.
Most TOPS APIs implicitly initialize the TOPS runtime. This API provides control over the timing of the initialization.
-
TOPS_PUBLIC_API topsError_t topsDriverGetVersion(int *driverVersion)¶
Returns the approximate TOPS driver version.
The version is returned as (1000 major + 10 minor). For example, topsrider 2.2 would be represented by 2020.
See also
- Parameters
driverVersion – [out]
- Returns
topsSuccess, topsErrorInvalidValue
-
TOPS_PUBLIC_API topsError_t topsRuntimeGetVersion(int *runtimeVersion)¶
Returns the approximate TOPS Runtime version.
The version is returned as (1000 major + 10 minor). For example, topsrider 2.2 would be represented by 2020.
See also
- Parameters
runtimeVersion – [out]
- Returns
topsSuccess, topsErrorInvalidValue
-
TOPS_PUBLIC_API topsError_t topsDeviceGet(topsDevice_t *device, int ordinal)¶
Returns a handle to a compute device.
- Parameters
device – [out]
ordinal – [in]
- Returns
topsSuccess, topsErrorInvalidDevice
-
TOPS_PUBLIC_API topsError_t topsDeviceComputeCapability(int *major, int *minor, topsDevice_t device)¶
Returns the compute capability of the device.
- Parameters
major – [out]
minor – [out]
device – [in]
- Returns
topsSuccess, topsErrorInvalidDevice
-
TOPS_PUBLIC_API topsError_t topsDeviceGetName(char *name, int len, topsDevice_t device)¶
Returns an identifier string for the device.
Warning
these versions are ignored.
- Parameters
name – [out]
len – [in]
device – [in]
- Returns
topsSuccess, topsErrorInvalidDevice
-
TOPS_PUBLIC_API topsError_t topsDeviceGetPCIBusId(char *pciBusId, int len, int device)¶
Returns a PCI Bus Id string for the device, overloaded to take int device ID.
- Parameters
pciBusId – [out]
len – [in]
device – [in]
- Returns
topsSuccess, topsErrorInvalidDevice
-
TOPS_PUBLIC_API topsError_t topsDeviceGetByPCIBusId(int *device, const char *pciBusId)¶
Returns a handle to a compute device.
- Parameters
device – [out] handle
pciBusId – [in]
- Returns
topsSuccess, topsErrorInvalidDevice, topsErrorInvalidValue
-
TOPS_PUBLIC_API topsError_t topsDeviceTotalMem(size_t *bytes, topsDevice_t device)¶
Returns the total amount of memory on the device.
- Parameters
bytes – [out]
device – [in]
- Returns
topsSuccess, topsErrorInvalidDevice
2.2. Device¶
This section describes the device management functions of TOPS runtime API.
-
TOPS_PUBLIC_API topsError_t topsDeviceSynchronize(void)¶
Waits on all active streams on current device.
When this command is invoked, the host thread gets blocked until all the commands associated with streams associated with the device. TOPS does not support multiple blocking modes (yet!).
See also
- Returns
topsSuccess
-
TOPS_PUBLIC_API topsError_t topsDeviceReset(void)¶
The state of current device is discarded and updated to a fresh state.
Calling this function deletes all streams created, memory allocated, kernels running, events created. Make sure that no other thread is using the device or streams, memory, kernels, events associated with the current device.
See also
- Returns
topsSuccess
-
TOPS_PUBLIC_API topsError_t topsSetDevice(int deviceId)¶
Set default device to be used for subsequent tops API calls from this thread.
Sets
device
as the default device for the calling host thread. Valid device id’s are 0… (topsGetDeviceCount()-1).Many TOPS APIs implicitly use the “default device” :
Any device memory subsequently allocated from this host thread (using topsMalloc) will be allocated on device.
Any streams or events created from this host thread will be associated with device.
Any kernels launched from this host thread (using topsLaunchKernel) will be executed on device (unless a specific stream is specified, in which case the device associated with that stream will be used).
This function may be called from any host thread. Multiple host threads may use the same device. This function does no synchronization with the previous or new device, and has very little runtime overhead. Applications can use topsSetDevice to quickly switch the default device before making a TOPS runtime call which uses the default device.
The default device is stored in thread-local-storage for each thread. Thread-pool implementations may inherit the default device of the previous thread. A good practice is to always call topsSetDevice at the start of TOPS coding sequency to establish a known standard device.
See also
- Parameters
deviceId – [in] Valid device in range 0…(topsGetDeviceCount()-1).
- Returns
topsSuccess, topsErrorInvalidDevice, #topsErrorDeviceAlreadyInUse
-
TOPS_PUBLIC_API topsError_t topsGetDevice(int *deviceId)¶
Return the default device id for the calling host thread.
TOPS maintains an default device for each thread using thread-local-storage. This device is used implicitly for TOPS runtime APIs called by this thread. topsGetDevice returns in *
device
the default device for the calling host thread.- Parameters
deviceId – [out] *deviceId is written with the default device
- Returns
topsSuccess, topsErrorInvalidDevice, topsErrorInvalidValue
-
TOPS_PUBLIC_API topsError_t topsGetDeviceCount(int *count)¶
Return number of compute-capable devices.
Returns in
*count
the number of devices that have ability to run compute commands. If there are no such devices, then topsGetDeviceCount will return topsErrorNoDevice. If 1 or more devices can be found, then topsGetDeviceCount returns topsSuccess.- Parameters
[output] – count Returns number of compute-capable devices.
- Returns
topsSuccess, topsErrorNoDevice
-
TOPS_PUBLIC_API topsError_t topsDeviceGetAttribute(int *pi, topsDeviceAttribute_t attr, int deviceId)¶
Query for a specific device attribute.
- Parameters
pi – [out] pointer to value to return
attr – [in] attribute to query
deviceId – [in] which device to query for information
- Returns
topsSuccess, topsErrorInvalidDevice, topsErrorInvalidValue
-
TOPS_PUBLIC_API topsError_t topsGetDeviceProperties(topsDeviceProp_t *prop, int deviceId)¶
Returns device properties.
Populates topsGetDeviceProperties with information for the specified device.
- Parameters
prop – [out] written with device properties
deviceId – [in] which device to query for information
- Returns
topsSuccess, topsErrorInvalidDevice
-
TOPS_PUBLIC_API topsError_t topsResourceReservationGetAttribute(void *data, topsResourceReservationAttribute_t attr, int clusterId)¶
Query for a attribute of current device’s resource reservation.
- Parameters
data – [out] pointer to value to return
attr – [in] attribute to query
clusterId – [in] which cluster to query for information
- Returns
topsSuccess, topsErrorInvalidDevice, topsErrorInvalidValue
-
TOPS_PUBLIC_API topsError_t topsDeviceGetLimit(size_t *pValue, enum topsLimit_t limit)¶
Get Resource limits of current device.
- Parameters
pValue – [out]
limit – [in]
- Returns
topsSuccess, #topsErrorUnsupportedLimit, topsErrorInvalidValue Note: Currently, only topsLimitMallocHeapSize is available
-
TOPS_PUBLIC_API topsError_t topsGetDeviceFlags(unsigned int *flags)¶
Gets the flags set for current device.
- Parameters
flags – [out]
- Returns
topsSuccess, topsErrorInvalidDevice, topsErrorInvalidValue
-
TOPS_PUBLIC_API topsError_t topsSetDeviceFlags(unsigned flags)¶
The current device behavior is changed according the flags passed.
topsDeviceScheduleSpin : TOPS runtime will actively spin in the thread which submitted the work until the command completes. This offers the lowest latency, but will consume a CPU core and may increase power.
topsDeviceScheduleYield : The TOPS runtime will yield the CPU to system so that other tasks can use it. This may increase latency to detect the completion but will consume less power and is friendlier to other tasks in the system.
topsDeviceScheduleBlockingSync : This is a synonym for topsDeviceScheduleYield.
topsDeviceScheduleAuto : Use a heuristic to select between Spin and Yield modes. If the number of TOPS contexts is greater than the number of logical processors in the system, use Spin scheduling. Else use Yield scheduling.
topsDeviceMapHost : Allow mapping host memory. On GCU, this is always allowed and the flag is ignored.
topsDeviceLmemResizeToMax :
Warning
GCU silently ignores this flag.
- Parameters
flags – [in] The schedule flags impact how TOPS waits for the completion of a command running on a device.
- Returns
topsSuccess, topsErrorInvalidDevice, #topsErrorSetOnActiveProcess
-
TOPS_PUBLIC_API topsError_t topsChooseDevice(int *device, const topsDeviceProp_t *prop)¶
Device which matches topsDeviceProp_t is returned.
- Parameters
device – [out] The device ID
prop – [in] The device properties pointer
- Returns
topsSuccess, topsErrorInvalidValue
-
TOPS_PUBLIC_API topsError_t topsIpcGetMemHandle(topsIpcMemHandle_t *handle, void *devPtr)¶
Gets an interprocess memory handle for an existing device memory allocation.
Takes a pointer to the base of an existing device memory allocation created with topsMalloc and exports it for use in another process. This is a lightweight operation and may be called multiple times on an allocation without adverse effects.
If a region of memory is freed with topsFree and a subsequent call to topsMalloc returns memory with the same device address, topsIpcGetMemHandle will return a unique handle for the new memory.
- Parameters
handle – - Pointer to user allocated topsIpcMemHandle to return the handle in.
devPtr – - Base pointer to previously allocated device memory
- Returns
topsSuccess, topsErrorInvalidHandle, topsErrorOutOfMemory, topsErrorMapFailed,
-
TOPS_PUBLIC_API topsError_t topsIpcOpenMemHandle(void **devPtr, topsIpcMemHandle_t handle, unsigned int flags)¶
Opens an interprocess memory handle exported from another process and returns a device pointer usable in the local process.
Maps memory exported from another process with topsIpcGetMemHandle into the current device address space. For contexts on different devices topsIpcOpenMemHandle can attempt to enable peer access between the devices as if the user called topsDeviceEnablePeerAccess. This behavior is controlled by the topsIpcMemLazyEnablePeerAccess flag. topsDeviceCanAccessPeer can determine if a mapping is possible.
Contexts that may open topsIpcMemHandles are restricted in the following way. topsIpcMemHandles from each device in a given process may only be opened by one context per device per other process.
Memory returned from topsIpcOpenMemHandle must be freed with topsIpcCloseMemHandle.
Calling topsFree on an exported memory region before calling topsIpcCloseMemHandle in the importing context will result in undefined behavior.
Note
No guarantees are made about the address returned in
*devPtr
. In particular, multiple processes may not receive the same address for the samehandle
.- Parameters
devPtr – - Returned device pointer
handle – - topsIpcMemHandle to open
flags – - Flags for this operation. currently only flag 0 is supported
- Returns
topsSuccess, topsErrorMapFailed, topsErrorInvalidHandle, topsErrorTooManyPeers
-
TOPS_PUBLIC_API topsError_t topsIpcCloseMemHandle(void *devPtr)¶
Close memory mapped with topsIpcOpenMemHandle.
Unmaps memory returned by topsIpcOpenMemHandle. The original allocation in the exporting process as well as imported mappings in other processes will be unaffected.
Any resources used to enable peer access will be freed if this is the last mapping using them.
- Parameters
devPtr – - Device pointer returned by topsIpcOpenMemHandle
- Returns
topsSuccess, topsErrorMapFailed, topsErrorInvalidHandle,
-
TOPS_PUBLIC_API topsError_t topsIpcGetEventHandle(topsIpcEventHandle_t *handle, topsEvent_t event)¶
Gets an opaque interprocess handle for an event.
This opaque handle may be copied into other processes and opened with topsIpcOpenEventHandle. Then topsEventRecord, topsEventSynchronize, topsEventQuery may be used in remote processes. The topsStreamWaitEvent is called in local processes only. Operations on the imported event after the exported event has been freed with topsEventDestroy will result in undefined behavior.
- Parameters
handle – [out] Pointer to topsIpcEventHandle to return the opaque event handle
event – [in] Event allocated with topsEventInterprocess and topsEventDisableTiming flags
- Returns
topsSuccess, #topsErrorInvalidConfiguration, topsErrorInvalidValue
-
TOPS_PUBLIC_API topsError_t topsIpcOpenEventHandle(topsEvent_t *event, topsIpcEventHandle_t handle)¶
Opens an interprocess event handle.
Opens an interprocess event handle exported from another process with topsIpcGetEventHandle. The returned topsEvent_t behaves like a locally created event with the topsEventDisableTiming flag specified. This event need be freed with topsEventDestroy. Operations on the imported event after the exported event has been freed with topsEventDestroy will result in undefined behavior. If the function is called within the same process where handle is returned by topsIpcGetEventHandle, it will return topsErrorInvalidContext.
- Parameters
event – [out] Pointer to topsEvent_t to return the event
handle – [in] The opaque interprocess handle to open
- Returns
topsSuccess, topsErrorInvalidValue, topsErrorInvalidContext
-
TOPS_PUBLIC_API topsError_t topsIpcOpenEventHandleExt(topsEvent_t *event, topsIpcEventHandle_t handle, topsTopologyMapType map, int port)¶
Opens an interprocess event handle with topsTopologyMapType.
Opens an interprocess event handle exported from another process with topsIpcGetEventHandle. The returned topsEvent_t behaves like a locally created event with the topsEventDisableTiming flag specified. This event need be freed with topsEventDestroy. Operations on the imported event after the exported event has been freed with topsEventDestroy will result in undefined behavior. If the function is called within the same process where handle is returned by topsIpcGetEventHandle, it will return topsErrorInvalidContext.
Note. This function is only supported when the event is exported for a remote process on a different device. If the event is exported for a remote process on the same device, please use topsIpcOpenEventHandle instead.
- Parameters
event – [out] Pointer to topsEvent_t to return the event
handle – [in] The opaque interprocess handle to open
map – [in] The link that is expected to pass
port – [in] ESL port id
- Returns
topsSuccess, topsErrorInvalidValue, topsErrorInvalidContext
-
TOPS_PUBLIC_API topsError_t topsOpenEventHandle(topsEvent_t *event, topsEvent_t handle)¶
Opens an event for the same process.
Opens a same process event handle exported from another device with topsEventCreateWithFlags. The returned topsEvent_t behaves like a locally created event with the topsEventDisableTiming flag specified. This event need be freed with topsEventDestroy. Operations on the imported event after the exported event has been freed with topsEventDestroy will result in undefined behavior.
- Parameters
event – [out] Pointer to topsEvent_t to return the event
handle – [in] The another device event handle to open
- Returns
topsSuccess, topsErrorInvalidValue, topsErrorInvalidContext
-
TOPS_PUBLIC_API topsError_t topsOpenEventHandleExt(topsEvent_t *event, topsEvent_t handle, topsTopologyMapType map, int port)¶
Opens an event for the same process with topsTopologyMapType.
Opens a same process event handle exported from another device with topsEventCreateWithFlags. The returned topsEvent_t behaves like a locally created event with the topsEventDisableTiming flag specified. This event need be freed with topsEventDestroy. Operations on the imported event after the exported event has been freed with topsEventDestroy will result in undefined behavior.
- Parameters
event – [out] Pointer to topsEvent_t to return the event
handle – [in] The another device event handle to open
map – [in] The link that is expected to pass
port – [in] ESL port id
- Returns
topsSuccess, topsErrorInvalidValue, topsErrorInvalidContext
2.3. Error¶
This section describes the error handling functions of TOPS runtime API.
-
TOPS_PUBLIC_API topsError_t topsGetLastError(void)¶
Return last error returned by any TOPS runtime API call and resets the stored error code to topsSuccess.
Returns the last error that has been returned by any of the runtime calls in the same host thread, and then resets the saved error to topsSuccess.
See also
topsGetErrorString, topsGetLastError, topsPeekAtLastError, topsError_t
- Returns
return code from last TOPS called from the active host thread
-
TOPS_PUBLIC_API topsError_t topsPeekAtLastError(void)¶
Return last error returned by any TOPS runtime API call.
Returns the last error that has been returned by any of the runtime calls in the same host thread. Unlike topsGetLastError, this function does not reset the saved error code.
See also
topsGetErrorString, topsGetLastError, topsPeekAtLastError, topsError_t
- Returns
topsSuccess
-
TOPS_PUBLIC_API const char *topsGetErrorName(topsError_t tops_error)¶
Return name of the specified error code in text form.
See also
topsGetErrorString, topsGetLastError, topsPeekAtLastError, topsError_t
- Parameters
tops_error – Error code to convert to name.
- Returns
const char pointer to the NULL-terminated error name
-
TOPS_PUBLIC_API const char *topsGetErrorString(topsError_t topsError)¶
Return handy text string message to explain the error which occurred.
See also
topsGetErrorName, topsGetLastError, topsPeekAtLastError, topsError_t
Warning
: This function returns the name of the error (same as topsGetErrorName)
- Parameters
topsError – Error code to convert to string.
- Returns
const char pointer to the NULL-terminated error string
2.4. Stream¶
This section describes the stream management functions of TOPS runtime API.
-
typedef void (*topsStreamCallback_t)(topsStream_t stream, topsError_t status, void *userData)¶
Stream CallBack struct
-
TOPS_PUBLIC_API topsError_t topsStreamCreate(topsStream_t *stream)¶
Create an asynchronous stream.
Create a new asynchronous stream.
stream
returns an opaque handle that can be used to reference the newly created stream in subsequent topsStream* commands. The stream is allocated on the heap and will remain allocated even if the handle goes out-of-scope. To release the memory used by the stream, application must call topsStreamDestroy.- Parameters
stream – [inout] Pointer to new stream
- Returns
topsSuccess, topsErrorInvalidValue
- Returns
topsSuccess, topsErrorInvalidValue
-
TOPS_PUBLIC_API topsError_t topsStreamCreateWithFlags(topsStream_t *stream, unsigned int flags)¶
Create an asynchronous stream.
Create a new asynchronous stream.
stream
returns an opaque handle that can be used to reference the newly created stream in subsequent topsStream* commands. The stream is allocated on the heap and will remain allocated even if the handle goes out-of-scope. To release the memory used by the stream, application must call topsStreamDestroy. Flags controls behavior of the stream. See topsStreamDefault, topsStreamNonBlocking.- Parameters
stream – [inout] Pointer to new stream
flags – [in] to control stream creation.
- Returns
topsSuccess, topsErrorInvalidValue
- Returns
topsSuccess, topsErrorInvalidValue
-
TOPS_PUBLIC_API topsError_t topsStreamDestroy(topsStream_t stream)¶
Destroys the specified stream.
Destroys the specified stream.
If commands are still executing on the specified stream, some may complete execution before the queue is deleted.
The queue may be destroyed while some commands are still inflight, or may wait for all commands queued to the stream before destroying it.
Note: stream resource should be released before process exit
- Parameters
stream – [in] stream to destroy
- Returns
topsSuccess #topsErrorInvalidHandle
-
TOPS_PUBLIC_API topsError_t topsStreamQuery(topsStream_t stream)¶
Return topsSuccess if all of the operations in the specified
stream
have completed, or topsErrorNotReady if not.This is thread-safe and returns a snapshot of the current state of the queue. However, if other host threads are sending work to the stream, the status may change immediately after the function is called. It is typically used for debug.
- Parameters
stream – [in] stream to query
- Returns
topsSuccess, topsErrorNotReady, #topsErrorInvalidHandle
-
TOPS_PUBLIC_API topsError_t topsStreamSynchronize(topsStream_t stream)¶
Wait for all commands in stream to complete.
This command is host-synchronous : the host will block until the specified stream is empty.
This command follows standard null-stream semantics. Specifically, specifying the null stream will cause the command to wait for other streams on the same device to complete all pending operations.
This command honors the topsDeviceLaunchBlocking flag, which controls whether the wait is active or blocking.
See also
- Parameters
stream – [in] stream identifier.
- Returns
topsSuccess, #topsErrorInvalidHandle
-
TOPS_PUBLIC_API topsError_t topsStreamWaitEvent(topsStream_t stream, topsEvent_t event, unsigned int flags)¶
Make the specified compute stream wait for an event.
This function inserts a wait operation into the specified stream. All future work submitted to
stream
will wait untilevent
reports completion before beginning execution.This function only waits for commands in the current stream to complete. Notably,, this function does not implicit wait for commands in the default stream to complete, even if the specified stream is created with topsStreamNonBlocking = 0.
- Parameters
stream – [in] stream to make wait.
event – [in] event to wait on
flags – [in] control operation [must be 0]
- Returns
topsSuccess, #topsErrorInvalidHandle
-
TOPS_PUBLIC_API topsError_t topsStreamAddCallback(topsStream_t stream, topsStreamCallback_t callback, void *userData, unsigned int flags)¶
Adds a callback to be called on the host after all currently enqueued items in the stream have completed. For each topsStreamAddCallback call, a callback will be executed exactly once. The callback will block later work in the stream until it is finished.
See also
topsStreamCreate, topsStreamQuery, topsStreamSynchronize, topsStreamWaitEvent, topsStreamDestroy
- Parameters
stream – [in] - Stream to add callback to
callback – [in] - The function to call once preceding stream operations are complete
userData – [in] - User specified data to be passed to the callback function
flags – [in] - topsStreamDefault: non-blocking stream execution; topsStreamCallbackBlocking: stream blocks until callback is completed.
- Returns
topsSuccess, #topsErrorInvalidHandle, topsErrorNotSupported
-
TOPS_PUBLIC_API topsError_t topsStreamWriteValue32(topsDeviceptr_t dst, int value, unsigned int flags)¶
Write a value to local device memory.
- Parameters
dst – [in] - The device address to write to.
value – [in] - The value to write.
flags – [in] - Reserved for future expansion; must be 0.
- Returns
topsSuccess, topsErrorInvalidDevicePointer
- TOPS_PUBLIC_API topsError_t topsStreamWriteValue32Async (topsDeviceptr_t dst, int value, unsigned int flags, topsStream_t stream __dparm(0))
Write a value to local device memory async.
- Parameters
dst – [in] - The device address to write to.
value – [in] - The value to compare with the memory location.
flags – [in] - Reserved for future expansion; must be 0.
stream – [in] - The stream to synchronize on the memory location.
- Returns
topsSuccess, topsErrorInvalidDevicePointer
-
TOPS_PUBLIC_API topsError_t topsStreamWaitValue32(topsDeviceptr_t dst, int value, unsigned int flags)¶
Wait on a memory location.
- Parameters
dst – [in] - The memory location to wait on.
value – [in] - The value to compare with the memory location.
flags – [in] - Reserved for future expansion; must be 0.
- Returns
topsSuccess, topsErrorInvalidDevicePointer
- TOPS_PUBLIC_API topsError_t topsStreamWaitValue32Async (topsDeviceptr_t dst, int value, unsigned int flags, topsStream_t stream __dparm(0))
Wait on a memory location async.
- Parameters
dst – [in] - The memory location to wait on.
value – [in] - The value to compare with the memory location.
flags – [in] - Reserved for future expansion; must be 0.
stream – [in] - The stream to synchronize on the memory location.
- Returns
topsSuccess, topsErrorInvalidDevicePointer
2.5. Event¶
This section describes the event management functions of TOPS runtime API.
-
TOPS_PUBLIC_API topsError_t topsEventCreateWithFlags(topsEvent_t *event, unsigned flags)¶
Create an event object with the specified flags.
Creates an event object for the current device with the specified flags. Valid values include: -topsEventDefault: Default event create flag. The event will use active synchronization and will support timing. Blocking synchronization provides lowest possible latency at the expense of dedicating a CPU to poll on the event. -topsEventBlockingSync: Specifies that event should use blocking synchronization. A host thread that uses topsEventSynchronize() to wait on an event created with this flag will block until the event actually completes. -topsEventDisableTiming: Specifies that the created event does not need to record timing data. Events created with this flag specified and the topsEventBlockingSync flag not specified will provide the best performance when used with topsStreamWaitEvent() and topsEventQuery(). -topsEventInterprocess: Specifies that the created event may be used as an interprocess event by topsIpcGetEventHandle(). topsEventInterprocess must be specified along with topsEventDisableTiming.
Note
Note that this function may also return error codes from previous, asynchronous launches.
- Parameters
event – [inout] Returns the newly created event.
flags – [in] Flags to control event behavior.
- Returns
topsSuccess, #topsErrorNotInitialized, topsErrorInvalidValue, topsErrorLaunchFailure, #topsErrorOutOfMemory
-
TOPS_PUBLIC_API topsError_t topsEventCreate(topsEvent_t *event)¶
Create an event object.
Creates an event object for the current device using topsEventDefault.
See also
topsEventRecord, topsEventQuery, topsEventSynchronize, topsEventDestroy, topsEventElapsedTime
- Parameters
event – [inout] Returns the newly created event.
- Returns
topsSuccess, #topsErrorNotInitialized, topsErrorInvalidValue, topsErrorLaunchFailure, #topsErrorOutOfMemory
-
TOPS_PUBLIC_API topsError_t topsEventRecord(topsEvent_t event, topsStream_t stream)¶
Record an event in the specified stream.
Captures in event the contents of stream at the time of this call. event and stream must be on the same TOPS context. Calls such as topsEventQuery() or topsStreamWaitEvent() will then examine or wait for completion of the work that was captured. Uses of stream after this call do not modify event.
topsEventRecord() can be called multiple times on the same event and will overwrite the previously captured state. Other APIs such as topsStreamWaitEvent() use the most recently captured state at the time of the API call, and are not affected by later calls to topsEventRecord(). Before the first call to topsEventRecord(), an event represents an empty set of work, so for example topsEventQuery() would return topsSuccess.
topsEventQuery() or topsEventSynchronize() must be used to determine when the event transitions from “recording” (after topsEventRecord() is called) to “recorded” (when timestamps are set, if requested).
Events which are recorded in a non-NULL stream will transition to from recording to “recorded” state when they reach the head of the specified stream, after all previous commands in that stream have completed executing.
If topsEventRecord() has been previously called on this event, then this call will overwrite any existing state in event.
If this function is called on an event that is currently being recorded, results are undefined
either outstanding recording may save state into the event, and the order is not guaranteed.
See also
topsEventCreate, topsEventQuery, topsEventSynchronize, topsEventDestroy, topsEventElapsedTime
- Parameters
event – [in] event to record.
stream – [in] stream in which to record event.
- Returns
topsSuccess, topsErrorInvalidValue, #topsErrorNotInitialized, #topsErrorInvalidHandle, topsErrorLaunchFailure
-
TOPS_PUBLIC_API topsError_t topsEventDestroy(topsEvent_t event)¶
Destroy the specified event.
Releases memory associated with the event. An event may be destroyed before it is complete (i.e., while topsEventQuery() would return topsErrorNotReady). If the event is recording but has not completed recording when topsEventDestroy() is called, the function will return immediately and any associated resources will automatically be released asynchronously at completion.
See also
topsEventCreate, topsEventQuery, topsEventSynchronize, topsEventRecord, topsEventElapsedTime
Note
Use of the handle after this call is undefined behavior.
- Parameters
event – [in] Event to destroy.
- Returns
topsSuccess, #topsErrorNotInitialized, topsErrorInvalidValue, topsErrorLaunchFailure
-
TOPS_PUBLIC_API topsError_t topsEventSynchronize(topsEvent_t event)¶
Wait for an event to complete.
This function will block until the event is ready, waiting for all previous work in the stream specified when event was recorded with topsEventRecord().
If topsEventRecord() has not been called on
event
, this function returns immediately.Note:This function needs to support topsEventBlockingSync parameter.
- Parameters
event – [in] Event on which to wait.
- Returns
topsSuccess, topsErrorInvalidValue, #topsErrorNotInitialized, #topsErrorInvalidHandle, topsErrorLaunchFailure
-
TOPS_PUBLIC_API topsError_t topsEventElapsedTime(float *ms, topsEvent_t start, topsEvent_t stop)¶
Return the elapsed time between two events.
Computes the elapsed time between two events. Time is computed in ms, with a resolution of approximately 1 us.
Events which are recorded in a NULL stream will block until all commands on all other streams complete execution, and then record the timestamp.
Events which are recorded in a non-NULL stream will record their timestamp when they reach the head of the specified stream, after all previous commands in that stream have completed executing. Thus the time that the event recorded may be significantly after the host calls topsEventRecord().
If topsEventRecord() has not been called on either event, then #topsErrorInvalidHandle is returned. If topsEventRecord() has been called on both events, but the timestamp has not yet been recorded on one or both events (that is, topsEventQuery() would return topsErrorNotReady on at least one of the events), then topsErrorNotReady is returned.
Note, for TOPS Events used in kernel dispatch using topsExtLaunchKernelGGL/topsExtLaunchKernel, events passed in topsExtLaunchKernelGGL/topsExtLaunchKernel are not explicitly recorded and should only be used to get elapsed time for that specific launch. In case events are used across multiple dispatches, for example, start and stop events from different topsExtLaunchKernelGGL/ topsExtLaunchKernel calls, they will be treated as invalid unrecorded events, TOPS will throw error “topsErrorInvalidHandle” from topsEventElapsedTime.
- Parameters
ms – [out] : Return time between start and stop in ms.
start – [in] : Start event.
stop – [in] : Stop event.
- Returns
topsSuccess, topsErrorInvalidValue, topsErrorNotReady, #topsErrorInvalidHandle, #topsErrorNotInitialized, topsErrorLaunchFailure
-
TOPS_PUBLIC_API topsError_t topsEventQuery(topsEvent_t event)¶
Query event status.
Query the status of the specified event. This function will return topsErrorNotReady if all commands in the appropriate stream (specified to topsEventRecord()) have completed. If that work has not completed, or if topsEventRecord() was not called on the event, then topsSuccess is returned.
See also
topsEventCreate, topsEventRecord, topsEventDestroy, topsEventSynchronize, topsEventElapsedTime
- Parameters
event – [in] Event to query.
- Returns
topsSuccess, topsErrorNotReady, #topsErrorInvalidHandle, topsErrorInvalidValue, #topsErrorNotInitialized, topsErrorLaunchFailure
2.6. Memory¶
This section describes the memory management functions of TOPS runtime API.
-
TOPS_PUBLIC_API topsError_t topsPointerGetAttributes(topsPointerAttribute_t *attributes, const void *ptr)¶
Return attributes for the specified pointer.
See also
- Parameters
attributes – [out] attributes for the specified pointer
ptr – [in] pointer to get attributes for
- Returns
topsSuccess, topsErrorInvalidDevice, topsErrorInvalidValue
-
TOPS_PUBLIC_API topsError_t topsPointerGetAttribute(void *data, topsPointer_attribute attribute, topsDeviceptr_t ptr)¶
Returns information about the specified pointer.
See also
- Parameters
data – [inout] returned pointer attribute value
attribute – [in] attribute to query for
ptr – [in] pointer to get attributes for
- Returns
topsSuccess, topsErrorInvalidDevice, topsErrorInvalidValue
-
TOPS_PUBLIC_API topsError_t topsDrvPointerGetAttributes(unsigned int numAttributes, topsPointer_attribute *attributes, void **data, topsDeviceptr_t ptr)¶
Returns information about the specified pointer.
See also
- Parameters
numAttributes – [in] number of attributes to query for
attributes – [in] attributes to query for
data – [inout] a two-dimensional containing pointers to memory locations where the result of each attribute query will be written to
ptr – [in] pointer to get attributes for
- Returns
topsSuccess, topsErrorInvalidDevice, topsErrorInvalidValue
-
TOPS_PUBLIC_API topsError_t topsMalloc(void **ptr, size_t size)¶
Allocate memory on the default accelerator.
If size is 0, no memory is allocated, *ptr returns non-nullptr, and topsSuccess is returned.
See also
- Parameters
ptr – [out] Pointer to the allocated memory
size – [in] Requested memory size
- Returns
topsSuccess, #topsErrorOutOfMemory, topsErrorInvalidValue (bad context, null *ptr)
-
TOPS_PUBLIC_API topsError_t topsExtCodecMemHandle(void **pointer, uint64_t dev_addr, size_t size)¶
convert device memory to efcodec memory handle
If size is 0, no memory is allocated, *ptr returns non-nullptr, and topsSuccess is returned.
See also
- Parameters
ptr – [out] Pointer to the allocated memory handle
dev_addr – [in] Requested memory device address
size – [in] Requested memory size
- Returns
topsSuccess, #topsErrorOutOfMemory, topsErrorInvalidValue (bad context, null *ptr)
-
TOPS_PUBLIC_API topsError_t topsExtMallocWithFlags(void **ptr, size_t sizeBytes, unsigned int flags)¶
Allocate memory on the default accelerator.
If size is 0, no memory is allocated, *ptr returns non-nullptr, and topsSuccess is returned.
See also
- Parameters
ptr – [out] Pointer to the allocated memory
sizeBytes – [in] Requested memory size
flags – [in] Type of memory allocation flags only support topsDeviceMallocDefault/topsMallocTopDown/ topsMallocForbidMergeMove/topsMallocPreferHighSpeedMem
- Returns
topsSuccess, #topsErrorOutOfMemory, topsErrorInvalidValue (bad context, null *ptr)
-
TOPS_PUBLIC_API topsError_t topsHostMalloc(void **ptr, size_t size, unsigned int flags)¶
Allocate device accessible page locked host memory.
If size is 0, no memory is allocated, *ptr returns nullptr, and topsSuccess is returned.
See also
- Parameters
ptr – [out] Pointer to the allocated host pinned memory
size – [in] Requested memory size
flags – [in] Type of host memory allocation
- Returns
topsSuccess, #topsErrorOutOfMemory
-
TOPS_PUBLIC_API topsError_t topsHostGetDevicePointer(void **devPtr, void *hostPtr, unsigned int flags)¶
Get Device pointer from Host Pointer allocated through topsHostMalloc.
See also
- Parameters
devPtr – [out] Device Pointer mapped to passed host pointer
hostPtr – [in] Host Pointer allocated through topsHostMalloc
flags – [in] Flags to be passed for extension
- Returns
topsSuccess, topsErrorInvalidValue, #topsErrorOutOfMemory
-
TOPS_PUBLIC_API topsError_t topsHostGetFlags(unsigned int *flagsPtr, void *hostPtr)¶
Return flags associated with host pointer.
See also
- Parameters
flagsPtr – [out] Memory location to store flags
hostPtr – [in] Host Pointer allocated through topsHostMalloc
- Returns
topsSuccess, topsErrorInvalidValue
-
TOPS_PUBLIC_API topsError_t topsHostRegister(void *hostPtr, size_t sizeBytes, unsigned int flags)¶
Register host memory so it can be accessed from the current device.
Flags:
topsHostRegisterDefault Memory is Mapped and Portable
topsHostRegisterPortable Memory is considered registered by all contexts. TOPS only supports one context so this is always assumed true.
topsHostRegisterMapped Map the allocation into the address space for the current device. The device pointer can be obtained with topsHostGetDevicePointer.
After registering the memory, use topsHostGetDevicePointer to obtain the mapped device pointer. On many systems, the mapped device pointer will have a different value than the mapped host pointer. Applications must use the device pointer in device code, and the host pointer in device code.
On some systems, registered memory is pinned. On some systems, registered memory may not be actually be pinned but uses OS or hardware facilities to all GCU access to the host memory.
Developers are strongly encouraged to register memory blocks which are aligned to the host cache-line size. (typically 64-bytes but can be obtains from the CPUID instruction).
If registering non-aligned pointers, the application must take care when register pointers from the same cache line on different devices. TOPS’s coarse-grained synchronization model does not guarantee correct results if different devices write to different parts of the same cache block - typically one of the writes will “win” and overwrite data from the other registered memory region.
- Parameters
hostPtr – [out] Pointer to host memory to be registered.
sizeBytes – [in] size of the host memory
flags. – [in] See below.
- Returns
topsSuccess, #topsErrorOutOfMemory
-
TOPS_PUBLIC_API topsError_t topsHostUnregister(void *hostPtr)¶
Un-register host pointer.
See also
- Parameters
hostPtr – [in] Host pointer previously registered with topsHostRegister
- Returns
Error code
-
TOPS_PUBLIC_API topsError_t topsFree(void *ptr)¶
Free memory allocated by the tops memory allocation API. This API performs an implicit topsDeviceSynchronize() call. If pointer is NULL, the tops runtime is initialized and topsSuccess is returned.
See also
- Parameters
ptr – [in] Pointer to memory to be freed
- Returns
topsSuccess
- Returns
topsErrorInvalidDevicePointer (if pointer is invalid, including host pointers allocated with topsHostMalloc)
-
TOPS_PUBLIC_API topsError_t topsHostFree(void *ptr)¶
Free memory allocated by the tops host memory allocation API This API performs an implicit topsDeviceSynchronize() call. If pointer is NULL, the tops runtime is initialized and topsSuccess is returned.
See also
- Parameters
ptr – [in] Pointer to memory to be freed
- Returns
topsSuccess, topsErrorInvalidValue (if pointer is invalid, including device pointers allocated with topsMalloc)
-
TOPS_PUBLIC_API topsError_t topsMemcpy(void *dst, const void *src, size_t sizeBytes, topsMemcpyKind kind)¶
Copy data from src to dst.
It supports memory from host to device, device to host, device to device and host to host The src and dst must not overlap.
For topsMemcpy, the copy is always performed by the current device (set by topsSetDevice). For multi-gcu or peer-to-peer configurations, it is recommended to set the current device to the device where the src data is physically located. For optimal peer-to-peer copies, the copy device must be able to access the src and dst pointers (by calling topsDeviceEnablePeerAccess with copy agent as the current device and src/dest as the peerDevice argument. if this is not done, the topsMemcpy will still work, but will perform the copy using a staging buffer on the host. Calling topsMemcpy with dst and src pointers that do not match the topsMemcpyKind results in undefined behavior.
See also
topsMalloc, topsFree, topsHostMalloc, topsHostFree, topsMemGetAddressRange, topsMemGetInfo, topsHostGetDevicePointer, topsMemcpyDtoD, topsMemcpyDtoDAsync, topsMemcpyDtoH, topsMemcpyDtoHAsync, topsMemcpyHtoD, topsMemcpyHtoDAsync
- Parameters
dst – [out] Data being copy to
src – [in] Data being copy from
sizeBytes – [in] Data size in bytes
kind – [in] Memory copy type
- Returns
topsSuccess, topsErrorInvalidValue, #topsErrorMemoryFree, #topsErrorUnknown
-
TOPS_PUBLIC_API topsError_t topsMemcpyWithStream(void *dst, const void *src, size_t sizeBytes, topsMemcpyKind kind, topsStream_t stream)¶
Copy data from src to dst.
It supports memory from host to device, device to host, device to device and host to host The src and dst must not overlap.
See also
topsMalloc, topsFree, topsHostMalloc, topsHostFree, topsMemGetAddressRange, topsMemGetInfo, topsHostGetDevicePointer, topsMemcpyDtoD, topsMemcpyDtoDAsync, topsMemcpyDtoH, topsMemcpyDtoHAsync, topsMemcpyHtoD, topsMemcpyHtoDAsync
- Parameters
dst – [out] Data being copy to
src – [in] Data being copy from
sizeBytes – [in] Data size in bytes
kind – [in] Memory copy type
stream – [in] Stream to enqueue this operation.
- Returns
topsSuccess, topsErrorInvalidValue, #topsErrorMemoryFree, #topsErrorUnknown
-
TOPS_PUBLIC_API topsError_t topsMemcpyHtoD(topsDeviceptr_t dst, void *src, size_t sizeBytes)¶
Copy data from Host to Device.
See also
topsMalloc, topsFree, topsHostMalloc, topsHostFree, topsMemGetAddressRange, topsMemGetInfo, topsHostGetDevicePointer, topsMemcpyDtoD, topsMemcpyDtoDAsync, topsMemcpyDtoH, topsMemcpyDtoHAsync, topsMemcpyHtoD, topsMemcpyHtoDAsync
- Parameters
dst – [out] Data being copy to
src – [in] Data being copy from
sizeBytes – [in] Data size in bytes
- Returns
topsSuccess, #topsErrorDeInitialized, #topsErrorNotInitialized, topsErrorInvalidContext, topsErrorInvalidValue
-
TOPS_PUBLIC_API topsError_t topsMemcpyDtoH(void *dst, topsDeviceptr_t src, size_t sizeBytes)¶
Copy data from Device to Host.
See also
topsMalloc, topsFree, topsHostMalloc, topsHostFree, topsMemGetAddressRange, topsMemGetInfo, topsHostGetDevicePointer, topsMemcpyDtoD, topsMemcpyDtoDAsync, topsMemcpyDtoHAsync, topsMemcpyHtoD, topsMemcpyHtoDAsync
- Parameters
dst – [out] Data being copy to
src – [in] Data being copy from
sizeBytes – [in] Data size in bytes
- Returns
topsSuccess, #topsErrorDeInitialized, #topsErrorNotInitialized, topsErrorInvalidContext, topsErrorInvalidValue
-
TOPS_PUBLIC_API topsError_t topsMemcpyDtoD(topsDeviceptr_t dst, topsDeviceptr_t src, size_t sizeBytes)¶
Copy data from Device to Device.
See also
topsMalloc, topsFree, topsHostMalloc, topsHostFree, topsMemGetAddressRange, topsMemGetInfo, topsHostGetDevicePointer, topsMemcpyDtoDAsync, topsMemcpyDtoH, topsMemcpyDtoHAsync, topsMemcpyHtoD, topsMemcpyHtoDAsync
- Parameters
dst – [out] Data being copy to
src – [in] Data being copy from
sizeBytes – [in] Data size in bytes
- Returns
topsSuccess, #topsErrorDeInitialized, #topsErrorNotInitialized, topsErrorInvalidContext, topsErrorInvalidValue
-
TOPS_PUBLIC_API topsError_t topsMemcpyHtoDAsync(topsDeviceptr_t dst, void *src, size_t sizeBytes, topsStream_t stream)¶
Copy data from Host to Device asynchronously.
See also
topsMalloc, topsFree, topsHostMalloc, topsHostFree, topsMemGetAddressRange, topsMemGetInfo, topsHostGetDevicePointer, topsMemcpyDtoD, topsMemcpyDtoDAsync, topsMemcpyDtoH, topsMemcpyDtoHAsync, topsMemcpyHtoD
- Parameters
dst – [out] Data being copy to
src – [in] Data being copy from
sizeBytes – [in] Data size in bytes
stream – [in] Stream to enqueue this operation.
- Returns
topsSuccess, #topsErrorDeInitialized, #topsErrorNotInitialized, topsErrorInvalidContext, topsErrorInvalidValue
-
TOPS_PUBLIC_API topsError_t topsMemcpyDtoHAsync(void *dst, topsDeviceptr_t src, size_t sizeBytes, topsStream_t stream)¶
Copy data from Device to Host asynchronously.
See also
topsMalloc, topsFree, topsHostMalloc, topsHostFree, topsMemGetAddressRange, topsMemGetInfo, topsHostGetDevicePointer, topsMemcpyDtoD, topsMemcpyDtoDAsync, topsMemcpyDtoH, topsMemcpyHtoD, topsMemcpyHtoDAsync
- Parameters
dst – [out] Data being copy to
src – [in] Data being copy from
sizeBytes – [in] Data size in bytes
stream – [in] Stream to enqueue this operation.
- Returns
topsSuccess, #topsErrorDeInitialized, #topsErrorNotInitialized, topsErrorInvalidContext, topsErrorInvalidValue
-
TOPS_PUBLIC_API topsError_t topsMemcpyDtoDAsync(topsDeviceptr_t dst, topsDeviceptr_t src, size_t sizeBytes, topsStream_t stream)¶
Copy data from Device to Device asynchronously.
See also
topsMalloc, topsFree, topsHostMalloc, topsHostFree, topsMemGetAddressRange, topsMemGetInfo, topsHostGetDevicePointer, topsMemcpyDtoD, topsMemcpyDtoH, topsMemcpyDtoHAsync, topsMemcpyHtoD, topsMemcpyHtoDAsync
- Parameters
dst – [out] Data being copy to
src – [in] Data being copy from
sizeBytes – [in] Data size in bytes
stream – [in] Stream to enqueue this operation.
- Returns
topsSuccess, #topsErrorDeInitialized, #topsErrorNotInitialized, topsErrorInvalidContext, topsErrorInvalidValue
-
TOPS_PUBLIC_API topsError_t topsModuleGetGlobal(topsDeviceptr_t *dptr, size_t *bytes, topsModule_t hmod, const char *name)¶
Returns a global pointer from a module. Returns in *dptr and *bytes the pointer and size of the global symbol located in module hmod. If no variable of that name exists, it returns topsErrorNotFound. Both parameters dptr and bytes are optional. If one of them is NULL, it is ignored and topsSuccess is returned.
- Parameters
dptr – [out] Returns global device pointer
bytes – [out] Returns global size in bytes
hmod – [in] Module to retrieve global from
name – [in] Name of global to retrieve
- Returns
topsSuccess, topsErrorInvalidValue, #topsErrorNotFound, topsErrorInvalidContext
-
TOPS_PUBLIC_API topsError_t topsGetSymbolAddress(void **devPtr, const void *symbol)¶
Gets device pointer associated with symbol on the device.
- Parameters
devPtr – [out] pointer to the device associated the symbol
symbol – [in] pointer to the symbol of the device
- Returns
topsSuccess, topsErrorInvalidValue
-
TOPS_PUBLIC_API topsError_t topsGetSymbolSize(size_t *size, const void *symbol)¶
Gets the size of the given symbol on the device.
- Parameters
symbol – [in] pointer to the device symbol
size – [out] pointer to the size
- Returns
topsSuccess, topsErrorInvalidValue
- TOPS_PUBLIC_API topsError_t topsMemcpyToSymbol (const void *symbol, const void *src, size_t sizeBytes, size_t offset __dparm(0), topsMemcpyKind kind __dparm(topsMemcpyHostToDevice))
Copies data to the given symbol on the device. Symbol TOPS APIs allow a kernel to define a device-side data symbol which can be accessed on the host side. The symbol can be in __constant or device space. Note that the symbol name needs to be encased in the TOPS_SYMBOL macro. This also applies to topsMemcpyFromSymbol, topsGetSymbolAddress, and topsGetSymbolSize.
- Parameters
symbol – [out] pointer to the device symbol
src – [in] pointer to the source address
sizeBytes – [in] size in bytes to copy
offset – [in] offset in bytes from start of symbol
kind – [in] type of memory transfer
- Returns
topsSuccess, topsErrorInvalidValue
- TOPS_PUBLIC_API topsError_t topsMemcpyToSymbolAsync (const void *symbol, const void *src, size_t sizeBytes, size_t offset, topsMemcpyKind kind, topsStream_t stream __dparm(0))
Copies data to the given symbol on the device asynchronously.
- Parameters
symbol – [out] pointer to the device symbol
src – [in] pointer to the source address
sizeBytes – [in] size in bytes to copy
offset – [in] offset in bytes from start of symbol
kind – [in] type of memory transfer
stream – [in] stream identifier
- Returns
topsSuccess, topsErrorInvalidValue
- TOPS_PUBLIC_API topsError_t topsMemcpyFromSymbol (void *dst, const void *symbol, size_t sizeBytes, size_t offset __dparm(0), topsMemcpyKind kind __dparm(topsMemcpyDeviceToHost))
Copies data from the given symbol on the device.
- Parameters
dptr – [out] Returns pointer to destination memory address
symbol – [in] pointer to the symbol address on the device
sizeBytes – [in] size in bytes to copy
offset – [in] offset in bytes from the start of symbol
kind – [in] type of memory transfer
- Returns
topsSuccess, topsErrorInvalidValue
- TOPS_PUBLIC_API topsError_t topsMemcpyFromSymbolAsync (void *dst, const void *symbol, size_t sizeBytes, size_t offset, topsMemcpyKind kind, topsStream_t stream __dparm(0))
Copies data from the given symbol on the device asynchronously.
- Parameters
dptr – [out] Returns pointer to destination memory address
symbol – [in] pointer to the symbol address on the device
sizeBytes – [in] size in bytes to copy
offset – [in] offset in bytes from the start of symbol
kind – [in] type of memory transfer
stream – [in] stream identifier
- Returns
topsSuccess, topsErrorInvalidValue
- TOPS_PUBLIC_API topsError_t topsMemcpyAsync (void *dst, const void *src, size_t sizeBytes, topsMemcpyKind kind, topsStream_t stream __dparm(0))
Copy data from src to dst asynchronously.
For multi-gcu or peer-to-peer configurations, it is recommended to use a stream which is a attached to the device where the src data is physically located. For optimal peer-to-peer copies, the copy device must be able to access the src and dst pointers (by calling topsDeviceEnablePeerAccess with copy agent as the current device and src/dest as the peerDevice argument. if this is not done, the topsMemcpy will still work, but will perform the copy using a staging buffer on the host.
See also
topsMalloc, topsFree, topsHostMalloc, topsHostFree, topsMemGetAddressRange, topsMemGetInfo, topsHostGetDevicePointer, topsMemcpyDtoD, topsMemcpyDtoH, topsMemcpyDtoHAsync, topsMemcpyHtoD, topsMemcpyHtoDAsync
Warning
If host or dest are not pinned, the memory copy will be performed synchronously. For best performance, use topsHostMalloc to allocate host memory that is transferred asynchronously.
Warning
topsMemcpyAsync does not support overlapped H2D and D2H copies. For topsMemcpy, the copy is always performed by the device associated with the specified stream.
- Parameters
dst – [out] Data being copy to
src – [in] Data being copy from
sizeBytes – [in] Data size in bytes
kind – [in] type of memory transfer
stream – [in] stream identifier
- Returns
topsSuccess, topsErrorInvalidValue, #topsErrorMemoryFree, #topsErrorUnknown
-
TOPS_PUBLIC_API topsError_t topsMemset(void *dst, int value, size_t sizeBytes)¶
Fills the first sizeBytes bytes of the memory area pointed to by dest with the constant byte value value.
- Parameters
dst – [out] Dst Data being filled
value – [in] Constant value to be set
sizeBytes – [in] Data size in bytes
- Returns
topsSuccess, topsErrorInvalidValue, #topsErrorNotInitialized
-
TOPS_PUBLIC_API topsError_t topsMemsetD8(topsDeviceptr_t dest, unsigned char value, size_t count)¶
Fills the first sizeBytes bytes of the memory area pointed to by dest with the constant byte value value.
- Parameters
dst – [out] Data ptr to be filled
value – [in] Constant value to be set
count – [in] Number of values to be set
- Returns
topsSuccess, topsErrorInvalidValue, #topsErrorNotInitialized
- TOPS_PUBLIC_API topsError_t topsMemsetD8Async (topsDeviceptr_t dest, unsigned char value, size_t count, topsStream_t stream __dparm(0))
Fills the first sizeBytes bytes of the memory area pointed to by dest with the constant byte value value.
topsMemsetD8Async() is asynchronous with respect to the host, so the call may return before the memset is complete. The operation can optionally be associated to a stream by passing a non-zero stream argument. If stream is non-zero, the operation may overlap with operations in other streams.
- Parameters
dest – [out] Data ptr to be filled
value – [in] Constant value to be set
count – [in] Number of values to be set
stream – [in] - Stream identifier
- Returns
topsSuccess, topsErrorInvalidValue, #topsErrorNotInitialized
-
TOPS_PUBLIC_API topsError_t topsMemsetD16(topsDeviceptr_t dest, unsigned short value, size_t count)¶
Fills the first sizeBytes bytes of the memory area pointed to by dest with the constant short value value.
- Parameters
dest – [out] Data ptr to be filled
value – [in] Constant value to be set
count – [in] Number of values to be set
- Returns
topsSuccess, topsErrorInvalidValue, #topsErrorNotInitialized
- TOPS_PUBLIC_API topsError_t topsMemsetD16Async (topsDeviceptr_t dest, unsigned short value, size_t count, topsStream_t stream __dparm(0))
Fills the first sizeBytes bytes of the memory area pointed to by dest with the constant short value value.
topsMemsetD16Async() is asynchronous with respect to the host, so the call may return before the memset is complete. The operation can optionally be associated to a stream by passing a non-zero stream argument. If stream is non-zero, the operation may overlap with operations in other streams.
- Parameters
dest – [out] Data ptr to be filled
value – [in] Constant value to be set
count – [in] Number of values to be set
stream – [in] - Stream identifier
- Returns
topsSuccess, topsErrorInvalidValue, #topsErrorNotInitialized
-
TOPS_PUBLIC_API topsError_t topsMemsetD32(topsDeviceptr_t dest, int value, size_t count)¶
Fills the memory area pointed to by dest with the constant integer value for specified number of times.
- Parameters
dest – [out] Data being filled
value – [in] Constant value to be set
count – [in] Number of values to be set
- Returns
topsSuccess, topsErrorInvalidValue, #topsErrorNotInitialized
- TOPS_PUBLIC_API topsError_t topsMemsetAsync (void *dst, int value, size_t sizeBytes, topsStream_t stream __dparm(0))
Fills the first sizeBytes bytes of the memory area pointed to by dev with the constant byte value value.
topsMemsetAsync() is asynchronous with respect to the host, so the call may return before the memset is complete. The operation can optionally be associated to a stream by passing a non-zero stream argument. If stream is non-zero, the operation may overlap with operations in other streams.
- Parameters
dst – [out] Pointer to device memory
value – [in] - Value to set for each byte of specified memory
sizeBytes – [in] - Size in bytes to set
stream – [in] - Stream identifier
- Returns
topsSuccess, topsErrorInvalidValue, #topsErrorMemoryFree
- TOPS_PUBLIC_API topsError_t topsMemsetD32Async (topsDeviceptr_t dst, int value, size_t count, topsStream_t stream __dparm(0))
Fills the memory area pointed to by dev with the constant integer value for specified number of times.
topsMemsetD32Async() is asynchronous with respect to the host, so the call may return before the memset is complete. The operation can optionally be associated to a stream by passing a non-zero stream argument. If stream is non-zero, the operation may overlap with operations in other streams.
- Parameters
dst – [out] Pointer to device memory
value – [in] - Value to set for each byte of specified memory
count – [in] - number of values to be set
stream – [in] - Stream identifier
- Returns
topsSuccess, topsErrorInvalidValue, #topsErrorMemoryFree
-
TOPS_PUBLIC_API topsError_t topsMemGetInfo(size_t *free, size_t *total)¶
Query memory info.
Return snapshot of free memory, and total allocatable memory on the device.
Returns in *free a snapshot of the current free memory.
Warning
The free memory only accounts for memory allocated by this process and may be optimistic.
- Returns
topsSuccess, topsErrorInvalidDevice, topsErrorInvalidValue
-
TOPS_PUBLIC_API topsError_t topsMemPtrGetInfo(void *ptr, size_t *size)¶
Query memory pointer info. Return size of the memory pointer.
- Parameters
size – [out] The size of memory pointer.
ptr – [in] Pointer to memory for query.
- Returns
topsSuccess, topsErrorInvalidDevice, topsErrorInvalidValue
-
TOPS_PUBLIC_API topsError_t topsMemGetAddressRange(topsDeviceptr_t *pbase, size_t *psize, topsDeviceptr_t dptr)¶
Get information on memory allocations.
- Parameters
pbase – [out] - Base pointer address
psize – [out] - Size of allocation
dptr- – [in] Device Pointer
- Returns
topsSuccess, topsErrorInvalidDevicePointer
2.7. PeerToPeer¶
This section describes the PeerToPeer device memory access functions of TOPS runtime API.
-
TOPS_PUBLIC_API topsError_t topsDeviceCanAccessPeer(int *canAccess, int deviceId, int peerDeviceId)¶
Checks if peer/esl access between two devices is possible.
- Parameters
deviceId – [in] - device id
peerDeviceId – [in] - peer device id
canAccess – [out] - access between two devices, bit[7~0] : Each bit indicating corresponding port status: 1 link, 0 no-link. bit[15~8] : p2p link type: bit8 PCIe switch link, bit9 RCs link, bit10 Die to Die, all 0 means no-p2p-link. bit[23~16] :cluster as device type: 1 cluster as device.
- Returns
topsSuccess, topsErrorInvalidValue
-
TOPS_PUBLIC_API topsError_t topsDeviceEnablePeerAccess(int peerDeviceId, unsigned int flags)¶
Set peer/esl access property.
- Parameters
peerDeviceId – [in] - peer device id
flags – [in] - access property
- Returns
topsSuccess, topsErrorInvalidValue
-
TOPS_PUBLIC_API topsError_t topsDeviceEnablePeerAccessRegion(int peerDeviceId, void *peerDevPtr, size_t size, void **devPtr)¶
Set peer access property for peer device’s region.
- Parameters
peerDeviceId – [in] - peer device id
peerDevPtr – [in] - the start device ptr of the peer device’s region
size – [in] - the size of the peer device’s region
devPtr – [out] - the p2p mapped device address
- Returns
topsSuccess, topsErrorInvalidValue
-
TOPS_PUBLIC_API topsError_t topsDeviceDisablePeerAccessRegion(int peerDeviceId, void *peerDevPtr, size_t size)¶
destroy peer access property for peer device’s region.
- Parameters
peerDeviceId – [in] - peer device id
peerDevPtr – [in] - the start device ptr of the peer device’s region
size – [in] - the size of the peer device’s region
- Returns
topsSuccess, topsErrorInvalidValue
-
TOPS_PUBLIC_API topsError_t topsMemcpyPeer(void *dst, int dstDevice, const void *src, int srcDevice, size_t sizeBytes)¶
Copies memory from one device to memory on another device.
For topsMemcpyPeer, the copy is always performed by the current device (set by topsSetDevice). For multi-gcu or peer-to-peer configurations, it is recommended to set the current device to the device where the src data is physically located. For optimal peer-to-peer copies, the copy device must be able to access the src and dst pointers (by calling topsDeviceEnablePeerAccess with copy agent as the current device and src/dest as the peerDevice argument.
- Parameters
dst – [out] Data being copy to
dstDevice – [in] Dst device id
src – [in] Data being copy from
srcDevice – [in] Src device id
sizeBytes – [in] Data size in bytes
- Returns
topsSuccess, topsErrorInvalidValue, #topsErrorMemoryFree, #topsErrorUnknown
-
TOPS_PUBLIC_API topsError_t topsMemcpyPeerAsync(void *dst, int dstDevice, const void *src, int srcDevice, size_t sizeBytes, topsStream_t stream)¶
Copies memory from one device to memory on another device asynchronously.
For multi-gcu or peer-to-peer configurations, it is recommended to use a stream which is a attached to the device where the src data is physically located. For optimal peer-to-peer copies, the copy device must be able to access the src and dst pointers (by calling topsDeviceEnablePeerAccess with copy agent as the current device and src/dest as the peerDevice argument.
- Parameters
dst – [out] Data being copy to
dstDevice – [in] Dst device id
src – [in] Data being copy from
srcDevice – [in] Src device id
sizeBytes – [in] Data size in bytes
stream – [in] Stream identifier
- Returns
topsSuccess, topsErrorInvalidValue, #topsErrorMemoryFree, #topsErrorUnknown
-
TOPS_PUBLIC_API topsError_t topsMemcpyPeerExt(void *dst, int dstDevice, const void *src, int srcDevice, size_t sizeBytes, topsTopologyMapType map, int port)¶
Copies memory from one device to memory on another device with special priority.
For topsMemcpyPeerExt, the copy is always performed by the current device (set by topsSetDevice). For multi-gcu or peer-to-peer configurations, it is recommended to set the current device to the device where the src data is physically located. For optimal peer-to-peer copies, the copy device must be able to access the src and dst pointers (by calling topsDeviceEnablePeerAccess with copy agent as the current device and src/dest as the peerDevice argument.
- Parameters
dst – [out] Data being copy to
dstDevice – [in] Dst device id
src – [in] Data being copy from
srcDevice – [in] Src device id
sizeBytes – [in] Data size in bytes
map – [in] The link that is expected to pass
port – [in] ESL port id
- Returns
topsSuccess, topsErrorInvalidValue, #topsErrorMemoryFree, #topsErrorUnknown
-
TOPS_PUBLIC_API topsError_t topsMemcpyPeerExtAsync(void *dst, int dstDevice, const void *src, int srcDevice, size_t sizeBytes, topsTopologyMapType map, int port, topsStream_t stream)¶
Copies memory from one device to memory on another device asynchronously with special priority.
For topsMemcpyPeerExtAsync, the copy is always performed by the current device (set by topsSetDevice). For multi-gcu or peer-to-peer configurations, it is recommended to set the current device to the device where the src data is physically located. For optimal peer-to-peer copies, the copy device must be able to access the src and dst pointers (by calling topsDeviceEnablePeerAccess with copy agent as the current device and src/dest as the peerDevice argument.
- Parameters
dst – [out] Data being copy to
dstDevice – [in] Dst device id
src – [in] Data being copy from
srcDevice – [in] Src device id
sizeBytes – [in] Data size in bytes
map – [in] The link that is expected to pass
port – [in] ESL port id
stream – [in] Stream identifier
- Returns
topsSuccess, topsErrorInvalidValue, #topsErrorMemoryFree, #topsErrorUnknown
2.8. Module¶
This section describes the module management functions of TOPS runtime API.
-
TOPS_PUBLIC_API topsError_t topsModuleLoad(topsModule_t *module, const char *fname)¶
Loads code object from file into a topsModule_t.
- Parameters
fname – [in]
module – [out]
- Returns
topsSuccess, topsErrorInvalidValue, topsErrorInvalidContext, topsErrorFileNotFound, topsErrorOutOfMemory, topsErrorSharedObjectInitFailed, topsErrorNotInitialized
-
TOPS_PUBLIC_API topsError_t topsModuleUnload(topsModule_t module)¶
Frees the module.
- Parameters
module – [in]
- Returns
topsSuccess, topsErrorInvalidValue module is freed and the code objects associated with it are destroyed
-
TOPS_PUBLIC_API topsError_t topsModuleGetFunction(topsFunction_t *function, topsModule_t module, const char *kname)¶
Function with kname will be extracted if present in module.
- Parameters
module – [in]
kname – [in]
function – [out]
- Returns
topsSuccess, topsErrorInvalidValue, topsErrorInvalidContext, topsErrorNotInitialized, topsErrorNotFound,
-
TOPS_PUBLIC_API topsError_t topsFuncGetAttributes(struct topsFuncAttributes *attr, const void *func)¶
Find out attributes for a given function.
- Parameters
attr – [out]
func – [in]
- Returns
topsSuccess, topsErrorInvalidValue, topsErrorInvalidDeviceFunction
-
TOPS_PUBLIC_API topsError_t topsFuncGetAttribute(int *value, topsFunction_attribute attrib, topsFunction_t hfunc)¶
Find out a specific attribute for a given function.
- Parameters
value – [out]
attrib – [in]
hfunc – [in]
- Returns
topsSuccess, topsErrorInvalidValue, topsErrorInvalidDeviceFunction
-
TOPS_PUBLIC_API topsError_t topsModuleLoadData(topsModule_t *module, const void *image)¶
builds module from code object which resides in host memory. Image is pointer to that location.
- Parameters
image – [in]
module – [out]
- Returns
topsSuccess, topsErrorNotInitialized, topsErrorOutOfMemory, topsErrorNotInitialized
-
TOPS_PUBLIC_API topsError_t topsModuleLoadDataEx(topsModule_t *module, const void *image, unsigned int numOptions, topsJitOption *options, void **optionValues)¶
builds module from code object which resides in host memory. Image is pointer to that location. Options are not used. topsModuleLoadData is called.
- Parameters
image – [in]
module – [out]
numOptions – [in] Number of options
options – [in] Options for JIT
optionValues – [in] Option values for JIT
- Returns
topsSuccess, topsErrorNotInitialized, topsErrorOutOfMemory, topsErrorNotInitialized
-
TOPS_PUBLIC_API topsError_t topsModuleLaunchKernel(topsFunction_t f, unsigned int gridDimX, unsigned int gridDimY, unsigned int gridDimZ, unsigned int blockDimX, unsigned int blockDimY, unsigned int blockDimZ, unsigned int sharedMemBytes, topsStream_t stream, void **kernelParams, void **extra)¶
launches kernel f with launch parameters and shared memory on stream with arguments passed to kernelparams or extra
Please note, TOPS does not support kernel launch with total work items defined in dimension with size gridDim x blockDim >= 2^32. So gridDim.x * blockDim.x, gridDim.y * blockDim.y and gridDim.z * blockDim.z are always less than 2^32.
Warning
kernellParams argument is not yet implemented in TOPS. Please use extra instead. Please refer to tops_porting_driver_api.md for sample usage.
- Parameters
f – [in] Kernel to launch.
gridDimX – [in] X grid dimension specified as multiple of blockDimX.
gridDimY – [in] Y grid dimension specified as multiple of blockDimY.
gridDimZ – [in] Z grid dimension specified as multiple of blockDimZ.
blockDimX – [in] X block dimensions specified in work-items
blockDimY – [in] Y grid dimension specified in work-items
blockDimZ – [in] Z grid dimension specified in work-items
sharedMemBytes – [in] Amount of dynamic shared memory to allocate for this kernel. The TOPS-Clang compiler provides support for extern shared declarations.
stream – [in] Stream where the kernel should be dispatched. May be 0, in which case the default stream is used with associated synchronization rules.
kernelParams – [in]
extra – [in] Pointer to kernel arguments. These are passed directly to the kernel and must be in the memory layout and alignment expected by the kernel.
- Returns
topsSuccess, topsInvalidDevice, topsErrorNotInitialized, topsErrorInvalidValue
-
TOPS_PUBLIC_API topsError_t topsLaunchCooperativeKernel(const void *f, dim3 gridDim, dim3 blockDimX, void **kernelParams, size_t sharedMemBytes, topsStream_t stream)¶
launches kernel f with launch parameters and shared memory on stream with arguments passed to kernelparams or extra, where thread blocks can cooperate and synchronize as they execute
Please note, TOPS does not support kernel launch with total work items defined in dimension with size gridDim x blockDim >= 2^32.
- Parameters
f – [in] Kernel to launch.
gridDim – [in] Grid dimensions specified as multiple of blockDim.
blockDim – [in] Block dimensions specified in work-items
kernelParams – [in] A list of kernel arguments
sharedMemBytes – [in] Amount of dynamic shared memory to allocate for this kernel. The TOPS-Clang compiler provides support for extern shared declarations.
stream – [in] Stream where the kernel should be dispatched. May be 0, in which case the default stream is used with associated synchronization rules.
- Returns
topsSuccess, topsInvalidDevice, topsErrorNotInitialized, topsErrorInvalidValue, topsErrorCooperativeLaunchTooLarge
2.9. Clang¶
This section describes the API to support the triple-chevron syntax.
- TOPS_PUBLIC_API topsError_t topsConfigureCall (dim3 gridDim, dim3 blockDim, size_t sharedMem __dparm(0), topsStream_t stream __dparm(0))
Configure a kernel launch.
Please note, TOPS does not support kernel launch with total work items defined in dimension with size gridDim x blockDim >= 2^32.
- Parameters
gridDim – [in] grid dimension specified as multiple of blockDim.
blockDim – [in] block dimensions specified in work-items
sharedMem – [in] Amount of dynamic shared memory to allocate for this kernel. The TOPS-Clang compiler provides support for extern shared declarations.
stream – [in] Stream where the kernel should be dispatched. May be 0, in which case the default stream is used with associated synchronization rules.
- Returns
topsSuccess, topsInvalidDevice, topsErrorNotInitialized, topsErrorInvalidValue
-
TOPS_PUBLIC_API topsError_t topsSetupArgument(const void *arg, size_t size, size_t offset)¶
Set a kernel argument.
- Parameters
arg – [in] Pointer the argument in host memory.
size – [in] Size of the argument.
offset – [in] Offset of the argument on the argument stack.
- Returns
topsSuccess, topsInvalidDevice, topsErrorNotInitialized, topsErrorInvalidValue
-
TOPS_PUBLIC_API topsError_t topsLaunchByPtr(const void *func)¶
Launch a kernel.
- Parameters
func – [in] Kernel to launch.
- Returns
topsSuccess, topsInvalidDevice, topsErrorNotInitialized, topsErrorInvalidValue
- TOPS_PUBLIC_API topsError_t __topsPushCallConfiguration (dim3 gridDim, dim3 blockDim, size_t sharedMem __dparm(0), topsStream_t stream __dparm(0))
Push configuration of a kernel launch.
Please note, TOPS does not support kernel launch with total work items defined in dimension with size gridDim x blockDim >= 2^32.
- Parameters
gridDim – [in] grid dimension specified as multiple of blockDim.
blockDim – [in] block dimensions specified in work-items
sharedMem – [in] Amount of dynamic shared memory to allocate for this kernel. The TOPS-Clang compiler provides support for extern shared declarations.
stream – [in] Stream where the kernel should be dispatched. May be 0, in which case the default stream is used with associated synchronization rules.
- Returns
topsSuccess, topsInvalidDevice, topsErrorNotInitialized, topsErrorInvalidValue
-
TOPS_PUBLIC_API topsError_t __topsPopCallConfiguration(dim3 *gridDim, dim3 *blockDim, size_t *sharedMem, topsStream_t *stream)¶
Pop configuration of a kernel launch.
Please note, TOPS does not support kernel launch with total work items defined in dimension with size gridDim x blockDim >= 2^32.
- Parameters
gridDim – [out] grid dimension specified as multiple of blockDim.
blockDim – [out] block dimensions specified in work-items
sharedMem – [out] Amount of dynamic shared memory to allocate for this kernel. The TOPS-Clang compiler provides support for extern shared declarations.
stream – [out] Stream where the kernel should be dispatched. May be 0, in which case the default stream is used with associated synchronization rules.
- Returns
topsSuccess, topsInvalidDevice, topsErrorNotInitialized, topsErrorInvalidValue
- TOPS_PUBLIC_API topsError_t topsLaunchKernel (const void *function_address, dim3 numBlocks, dim3 dimBlocks, void **args, size_t sharedMemBytes __dparm(0), topsStream_t stream __dparm(0))
C compliant kernel launch API.
- Parameters
function_address – [in] - kernel stub function pointer.
numBlocks – [in] - number of blocks
dimBlocks – [in] - dimension of a block
args – [in] - kernel arguments
sharedMemBytes – [in] - Amount of dynamic shared memory to allocate for this kernel. The TOPS-Clang compiler provides support for extern shared declarations.
stream – [in] - Stream where the kernel should be dispatched. May be 0, in which case the default stream is used with associated synchronization rules.
- Returns
topsSuccess, topsErrorInvalidValue, topsInvalidDevice
-
TOPS_PUBLIC_API topsError_t topsLaunchKernelExC(const topsLaunchConfig_t *config, const void *func, void **args)¶
C compliant kernel launch API.
- Parameters
config – [in] - Launch Configuration.
func – [in] - Kernel to launch
args – [in] - Array of pointers to kernel parameters
- Returns
topsSuccess, topsErrorInvalidValue, topsInvalidDevice
2.10. Runtime¶
This section describes the runtime compilation functions of TOPS runtime API
-
enum topsrtcResult¶
Values:
-
enumerator TOPSRTC_SUCCESS¶
-
enumerator TOPSRTC_ERROR_OUT_OF_MEMORY¶
-
enumerator TOPSRTC_ERROR_PROGRAM_CREATION_FAILURE¶
-
enumerator TOPSRTC_ERROR_INVALID_INPUT¶
-
enumerator TOPSRTC_ERROR_INVALID_PROGRAM¶
-
enumerator TOPSRTC_ERROR_INVALID_OPTION¶
-
enumerator TOPSRTC_ERROR_COMPILATION¶
-
enumerator TOPSRTC_ERROR_BUILTIN_OPERATION_FAILURE¶
-
enumerator TOPSRTC_ERROR_NAME_EXPRESSION_NOT_VALID¶
-
enumerator TOPSRTC_ERROR_INTERNAL_ERROR¶
-
enumerator TOPSRTC_SUCCESS¶
-
typedef enum topsrtcResult topsrtcResult
-
typedef struct _topsrtcProgram *topsrtcProgram¶
-
TOPS_PUBLIC_API const char *topsrtcGetErrorString(topsrtcResult result)¶
Returns a string message to describing the error which occurred.
See also
topsrtcResult
Warning
If the topsrtc result is defined, it will return “Invalid TOPSRTC error code”
- Parameters
result – [in] TOPSRTC API result code.
- Returns
const char message string for the given topsrtcResult code.
-
TOPS_PUBLIC_API topsrtcResult topsrtcVersion(int *major, int *minor)¶
Sets the output parameters major and minor with the TOPSRTC version.
- Parameters
major – [out] TOPS Runtime Compilation major version number.
minor – [out] TOPS Runtime Compilation minor version number.
-
TOPS_PUBLIC_API topsrtcResult topsrtcAddNameExpression(topsrtcProgram program, const char *name_expression)¶
Adds the given name exprssion to the runtime compilation program.
If const char pointer is NULL, it will return TOPSRTC_ERROR_INVALID_INPUT.
See also
topsrtcResult
- Parameters
program – [in] runtime compilation program instance.
name_expression – [in] const char pointer to the name expression.
- Returns
TOPSRTC_SUCCESS
-
TOPS_PUBLIC_API topsrtcResult topsrtcCompileProgram(topsrtcProgram program, int num_options, const char **options)¶
Compiles the given runtime compilation program.
If the compiler failed to build the runtime compilation program, it will return TOPSRTC_ERROR_COMPILATION.
See also
topsrtcResult
- Parameters
program – [in] runtime compilation program instance.
num_options – [in] number of compiler options.
options – [in] compiler options as const array of strins.
- Returns
TOPSRTC_SUCCESS
-
TOPS_PUBLIC_API topsrtcResult topsrtcCreateProgram(topsrtcProgram *program, const char *source, const char *name, int num_headers, const char **headers, const char **include_names)¶
Creates an instance of topsrtcProgram with the given input parameters, and sets the output topsrtcProgram program with it.
Any invalide input parameter, it will return TOPSRTC_ERROR_INVALID_INPUT or TOPSRTC_ERROR_INVALID_PROGRAM.
If failed to create the program, it will return TOPSRTC_ERROR_PROGRAM_CREATION_FAILURE.
See also
topsrtcResult
- Parameters
program – [inout] runtime compilation program instance.
source – [in] const char pointer to the program source.
name – [in] const char pointer to the program name.
num_headers – [in] number of headers.
headers – [in] array of strings pointing to headers.
include_names – [in] array of strings pointing to names included in program source.
- Returns
TOPSRTC_SUCCESS
-
TOPS_PUBLIC_API topsrtcResult topsrtcDestroyProgram(topsrtcProgram *program)¶
Destroys an instance of given topsrtcProgram.
If program is NULL, it will return TOPSRTC_ERROR_INVALID_INPUT.
See also
topsrtcResult
- Parameters
program – [in] runtime compilation program instance.
- Returns
TOPSRTC_SUCCESS
-
TOPS_PUBLIC_API topsrtcResult topsrtcGetLoweredName(topsrtcProgram program, const char *name_expression, const char **lowered_name)¶
Gets the lowered (mangled) name from an instance of topsrtcProgram with the given input parameters, and sets the output lowered_name with it.
If any invalide nullptr input parameters, it will return TOPSRTC_ERROR_INVALID_INPUT
If name_expression is not found, it will return TOPSRTC_ERROR_NAME_EXPRESSION_NOT_VALID
If failed to get lowered_name from the program, it will return TOPSRTC_ERROR_COMPILATION.
See also
topsrtcResult
- Parameters
program – [in] runtime compilation program instance.
name_expression – [in] const char pointer to the name expression.
lowered_name – [inout] const char array to the lowered (mangled) name.
- Returns
TOPSRTC_SUCCESS
-
TOPS_PUBLIC_API topsrtcResult topsrtcGetProgramLog(topsrtcProgram program, char *log)¶
Gets the log generated by the runtime compilation program instance.
See also
topsrtcResult
- Parameters
program – [in] runtime compilation program instance.
log – [out] memory pointer to the generated log.
- Returns
TOPSRTC_SUCCESS
-
TOPS_PUBLIC_API topsrtcResult topsrtcGetProgramLogSize(topsrtcProgram program, size_t *log_size)¶
Gets the size of log generated by the runtime compilation program instance.
See also
topsrtcResult
- Parameters
program – [in] runtime compilation program instance.
log_size – [out] size of generated log.
- Returns
TOPSRTC_SUCCESS
-
TOPS_PUBLIC_API topsrtcResult topsrtcGetCode(topsrtcProgram program, char *code)¶
Gets the pointer of compilation binary by the runtime compilation program instance.
See also
topsrtcResult
- Parameters
program – [in] runtime compilation program instance.
code – [out] char pointer to binary.
- Returns
TOPSRTC_SUCCESS
-
TOPS_PUBLIC_API topsrtcResult topsrtcGetCodeSize(topsrtcProgram program, size_t *codeSizeRet)¶
Gets the size of compilation binary by the runtime compilation program instance.
See also
topsrtcResult
- Parameters
program – [in] runtime compilation program instance.
code – [out] the size of binary.
- Returns
TOPSRTC_SUCCESS
2.11. Extension¶
This section describes the runtime Ext functions of TOPS runtime API
-
TOPS_PUBLIC_API topsError_t topsMemorySetDims(const void *devPtr, int64_t *dims, size_t dims_count)¶
Set dimension for device memory.
Limitation: devPtr only support allocated by topsMalloc*
- Parameters
devPtr – [in] The device memory to be set.
dims – [in] The dimension to be set.
dims_count – [in] The dimension size to be set.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsMemoryGetDims(const void *devPtr, int64_t *dims, size_t *dims_count)¶
Get dimension of device memory.
Limitation: devPtr only support allocated by topsMalloc*
- Parameters
devPtr – [in] The device memory to be set.
dims – [out] The dimension pointer list to be get.
dims_count – [out] The dimension rank pointer list to be get.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsCreateExecutable(topsExecutable_t *exe, const void *bin, size_t size)¶
Create an executable with specified binary data and size.
Note: Ownership of the pointer to new executable is transferred to the caller. Caller need call topsDestroyExecutable to destroy this pointer when no longer use it.
- Parameters
bin – [in] The pointer to the binary data.
size – [in] The size of the binary data.
exe – [out] Pointer to get new executable.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsCreateExecutableFromFile(topsExecutable_t *exe, const char *filepath)¶
Create an executable with specified file.
Note: Ownership of the pointer to new executable is transferred to the caller. Caller need call topsDestroyExecutable to destroy this pointer when no longer use it.
- Parameters
filepath – [in] The name of binary file.
exe – [out] Pointer to get new executable.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsDestroyExecutable(topsExecutable_t exe)¶
Destroy and clean up an executable.
- Parameters
exe – [in] Pointer to executable to destroy.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsCreateResource(topsResource_t *res, topsResourceRequest_t req)¶
Create a resource bundle with specified request. If device is in reset status, this method will retry until reset finish.
Note: Ownership of the pointer to new resource bundle is transferred to the caller. Caller need call topsDestroyResource to destroy this pointer when no longer use it.
- Parameters
res – [out] Pointer to get new resource bundle or nullptr if failed.
req – [in] Requested resource of allocated resource bundle.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsCreateResourceForExecutable(topsResource_t *res, topsExecutable_t exe)¶
Create a new resource bundle with specified target resource. If device is in reset status, this method will retry until reset finish.
Note: Ownership of the pointer to new resource bundle is transferred to the caller. Caller need call topsDestroyResource to destroy this pointer when no longer use it.
- Parameters
res – [out] Pointer to get new resource bundle or nullptr if failed.
exe – [in] Pointer to the executable to get target pointer which contains the requested resource.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsDestroyResource(topsResource_t res)¶
Destroy and clean up a resource bundle.
- Parameters
res – [in] Pointer to resource bundle to destroy.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsResourceBundleGetAttribute(topsResource_t res, topsResourceBundleInfoType_t type, uint64_t *data)¶
Query resource bundle attribute.
- Parameters
res – [in] Pointer to resource bundle to query.
type – [in] Type to query.
data – [out] Pointer to query output data.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsMallocForResource(void **ptr, size_t size, topsResource_t res)¶
Allocate memory on the res_bundle affinity memory bank.
If size is 0, no memory is allocated, *ptr returns non-nullptr, and topsSuccess is returned.
Use topsFree to release ptr
- Parameters
ptr – [out] Pointer to the allocated memory
size – [in] Requested memory size
res – [in] res_bundle
- Returns
topsSuccess, #topsErrorOutOfMemory, topsErrorInvalidValue (bad context, null *ptr)
-
TOPS_PUBLIC_API topsError_t topsLaunchExecutableV2(topsExecutable_t exe, topsResource_t res, void **inputs, size_t input_count, int64_t *input_dims, size_t *input_rank, void **outputs, size_t output_count, topsStream_t stream)¶
Asynchronously run a executable.
Run an executable with given inputs and outputs for training.
Note: default use runExe with Operator mode (not support dynamic shape), and if res is nullptr, use default res_bundle
- Parameters
exe – [in] Pointer to executable object.
res – [in] Pointer to res_bundle object.
inputs – [in] Inputs of executable.
input_count – [in] Inputs count of executable.
input_dims – [in] Inputs dims List of executable.
input_rank – [in] Inputs dims rank of executable.
outputs – [in] Outputs of executable.
output_count – [in] Outputs count of executable.
stream – [in] stream identifier.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsLaunchExecutableV3(topsExecutable_t exe, topsResource_t res, void **inputs, size_t input_count, int64_t *input_dims, size_t *input_rank, void **outputs, size_t output_count, int64_t *output_dims, size_t *output_rank, topsStream_t stream)¶
Asynchronously run a executable.
Run an executable with given inputs and outputs for training.
Limitation: for performance, output_dims/output_rank should be set to nullptr. if users set output_dims/output_rank as no-zero, topsLaunchExecutableV3 will insert a blocking mode hostcallback operation to get output_dims and output_rank
Note: default use runExe with Graph mode(support dynamic shape), and if res is nullptr, use default res_bundle
- Parameters
exe – [in] Pointer to executable object.
res – [in] Pointer to res_bundle object.
inputs – [in] Inputs of executable.
input_count – [in] Inputs count of executable.
input_dims – [in] Inputs dims List of executable.
input_rank – [in] Inputs dims rank of executable.
outputs – [in] Outputs of executable.
output_count – [in] Outputs count of executable.
output_dims – [out] Outputs dims List of executable.
output_rank – [out] Outputs dims rank of executable.
stream – [in] stream identifier.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsLaunchExecutableV4(topsExecutable_t exe, topsResource_t res, void **inputs, size_t input_count, int64_t *input_dims, size_t *input_rank, void **outputs, size_t output_count, int64_t *output_dims, size_t *output_rank, void **stack_datas, size_t stacks_count, topsStream_t stream)¶
Asynchronously run a executable.
Run an executable with given inputs and outputs for training.
Limitation: for performance, output_dims/output_rank should be set to nullptr. if users set output_dims/output_rank as no-zero, topsLaunchExecutableV3 will insert a blocking mode hostcallback operation to get output_dims and output_rank
Note: default use runExe with Graph mode(support dynamic shape), and if res is nullptr, use default res_bundle
- Parameters
exe – [in] Pointer to executable object.
res – [in] Pointer to res_bundle object.
inputs – [in] Inputs of executable.
input_count – [in] Inputs count of executable.
input_dims – [in] Inputs dims List of executable.
input_rank – [in] Inputs dims rank of executable.
outputs – [in] Outputs of executable.
output_count – [in] Outputs count of executable.
output_dims – [out] Outputs dims List of executable.
output_rank – [out] Outputs dims rank of executable.
stack_datas – [in] stack_datas of executable.
stacks_count – [in] stacks_datas count of executable.
stream – [in] stream identifier.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsLaunchExecutable(topsExecutable_t exe, topsResource_t res, void **inputs, size_t input_count, int64_t *input_dims, size_t *input_rank, void **outputs, size_t output_count, int64_t *output_dims, size_t *output_rank, topsStream_t stream)¶
Asynchronously run a executable.
Run an executable with given inputs and outputs for training.
Limitation: for performance, output_dims/output_rank should be set to nullptr. if user set output_dims/output_rank no-zero, topsLaunchExecutable may synchronize to get output_dims and output_rank
Note: default use runExe with Graph mode(support dynamic shape), and if res is nullptr, use default res_bundle
- Parameters
exe – [in] Pointer to executable object.
res – [in] Pointer to res_bundle object.
inputs – [in] Inputs of executable.
input_count – [in] Inputs count of executable.
input_dims – [in] Inputs dims List of executable.
input_rank – [in] Inputs dims rank of executable.
outputs – [in] Outputs of executable.
output_count – [in] Outputs count of executable.
output_dims – [out] Outputs dims List of executable.
output_rank – [out] Outputs dims rank of executable.
stream – [in] stream identifier.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsLaunchExecutableWithConstData(topsExecutable_t exe, topsResource_t res, void **inputs, size_t input_count, int64_t *input_dims, size_t *input_rank, void **outputs, size_t output_count, int64_t *output_dims, size_t *output_rank, void **const_datas, size_t const_datas_count, topsStream_t stream)¶
Asynchronously run a executable.
Run an executable with given inputs and outputs for training.
Limitation: for performance, output_dims/output_rank should be set to nullptr. if user set output_dims/output_rank no-zero, topsLaunchExecutableWithConstData may synchronize to get output_dims and output_rank
Note: default use runExe with Graph mode(support dynamic shape), and if res is nullptr, use default res_bundle
- Parameters
exe – [in] Pointer to executable object.
res – [in] Pointer to res_bundle object.
inputs – [in] Inputs of executable.
input_count – [in] Inputs count of executable.
input_dims – [in] Inputs dims List of executable.
input_rank – [in] Inputs dims rank of executable.
outputs – [in] Outputs of executable.
output_count – [in] Outputs count of executable.
output_dims – [out] Outputs dims List of executable.
output_rank – [out] Outputs dims rank of executable.
const_datas – [in] Const_datas of executable.
const_datas_count – [in] Const_datas count of executable.
stream – [in] stream identifier.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsLaunchExecutableWithConstDataV2(topsExecutable_t exe, topsResource_t res, void **inputs, size_t input_count, int64_t *input_dims, size_t *input_rank, void **outputs, size_t output_count, void **const_datas, size_t const_datas_count, topsStream_t stream)¶
Asynchronously run a executable.
Run an executable with given inputs and outputs for training.
Note: default use runExe with Operator mode (not support dynamic shape), and if res is nullptr, use default res_bundle
- Parameters
exe – [in] Pointer to executable object.
res – [in] Pointer to res_bundle object.
inputs – [in] Inputs of executable.
input_count – [in] Inputs count of executable.
input_dims – [in] Inputs dims List of executable.
input_rank – [in] Inputs dims rank of executable.
outputs – [in] Outputs of executable.
output_count – [in] Outputs count of executable.
const_datas – [in] Const_datas of executable.
const_datas_count – [in] Const_datas count of executable.
stream – [in] stream identifier.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsLaunchExecutableWithConstDataV3(topsExecutable_t exe, topsResource_t res, void **inputs, size_t input_count, int64_t *input_dims, size_t *input_rank, void **outputs, size_t output_count, int64_t *output_dims, size_t *output_rank, void **const_datas, size_t const_datas_count, topsStream_t stream)¶
Asynchronously run a executable.
Run an executable with given inputs and outputs for training.
Limitation: for performance, output_dims/output_rank should be set to nullptr. if users set output_dims/output_rank as no-zero, topsLaunchExecutableWithConstDataV3 will insert a blocking mode hostcallback operation to get output_dims and output_rank
Note: default use runExe with Graph mode(support dynamic shape), and if res is nullptr, use default res_bundle
- Parameters
exe – [in] Pointer to executable object.
res – [in] Pointer to res_bundle object.
inputs – [in] Inputs of executable.
input_count – [in] Inputs count of executable.
input_dims – [in] Inputs dims List of executable.
input_rank – [in] Inputs dims rank of executable.
outputs – [in] Outputs of executable.
output_count – [in] Outputs count of executable.
output_dims – [out] Outputs dims List of executable.
output_rank – [out] Outputs dims rank of executable.
const_datas – [in] Const_datas of executable.
const_datas_count – [in] Const_datas count of executable.
stream – [in] stream identifier.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsLaunchExecutableWithConstDataV4(topsExecutable_t exe, topsResource_t res, void **inputs, size_t input_count, int64_t *input_dims, size_t *input_rank, void **outputs, size_t output_count, int64_t *output_dims, size_t *output_rank, void **const_datas, size_t const_datas_count, void **stack_datas, size_t stacks_count, topsStream_t stream)¶
Asynchronously run a executable.
Run an executable with given inputs and outputs for training.
Limitation: for performance, output_dims/output_rank should be set to nullptr. if users set output_dims/output_rank as no-zero, topsLaunchExecutableWithConstDataV3 will insert a blocking mode hostcallback operation to get output_dims and output_rank
Note: default use runExe with Graph mode(support dynamic shape), and if res is nullptr, use default res_bundle
- Parameters
exe – [in] Pointer to executable object.
res – [in] Pointer to res_bundle object.
inputs – [in] Inputs of executable.
input_count – [in] Inputs count of executable.
input_dims – [in] Inputs dims List of executable.
input_rank – [in] Inputs dims rank of executable.
outputs – [in] Outputs of executable.
output_count – [in] Outputs count of executable.
output_dims – [out] Outputs dims List of executable.
output_rank – [out] Outputs dims rank of executable.
const_datas – [in] Const_datas of executable.
const_datas_count – [in] Const_datas count of executable.
stack_datas – [in] stack_datas of executable.
stacks_count – [in] stacks_datas count of executable.
stream – [in] stream identifier.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsExecutableGetConstManagedData(topsExecutable_t exe, unsigned int *numOptions, char **name, void **address, uint64_t *size)¶
Get Const Managed Data.
Note: call twice, first get numOptions, Second get the rest only support hp off
- Parameters
numOptions – [out] Pointer to ConstManagedData array size.
exe – [in] Pointer to executable object.
name – [out] ConstManaged pair of name.
address – [out] ConstManaged pair of address.
size – [out] ConstManaged address pair of size.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsExecutableGetConstManagedDataV2(topsExecutable_t exe, unsigned int *numOptions, char **name, void **address, uint64_t *size, int64_t *uid, void **flag)¶
Get Const Managed Data.
Note: call twice, first get numOptions, Second get the rest support both hp off and hp on
- Parameters
numOptions – [out] Pointer to ConstManagedData array size.
exe – [in] Pointer to executable object.
name – [out] ConstManaged pair of name.
address – [out] ConstManaged pair of address.
size – [out] ConstManaged address pair of size.
uid – [out] Constant uid
flag – [out] Flag used by user to indicate constant partial h2d
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsExecutableUpdateConstantKey(topsExecutable_t exe)¶
Update Constant Section Key.
Note: must called before save executable to file
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsExecutableUpdateRuntimeResource(topsExecutable_t exe, topsResource_t res, topsStream_t stream)¶
Update Executable Runtime Resource.
Note: call after refit, will update constant partially by h2d
- Parameters
exe – [in] Pointer to executable object.
res – [in] Pointer to res_bundle object.
stream – [in] stream identifier.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsExecutableLoadConstData(topsExecutable_t exe, topsResource_t res, void **dev_ptr, size_t *dev_ptr_count)¶
Load constant data.
NOTE: is deprecated because of twice call
- Parameters
exe – [in] Pointer to executable object.
res – [in] Pointer to res_bundle object.
dev_ptr – [out] Dev_mem of executable.
dev_ptr_count – [out] Dev_mem count of executable.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsExecutableLoadConstDataV2(topsExecutable_t exe, topsResource_t res, void **dev_ptr)¶
Load constant data.
NOTE: call topsExecutableQueryInfo to get device mem count to initialize dev_ptr
- Parameters
exe – [in] Pointer to executable object.
res – [in] Pointer to res_bundle object.
dev_ptr – [out] Dev_mem of executable.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsExecutableGetRuntimeOutputShape(topsExecutable_t exe, topsShape_t *inputs_shape, topsShape_t *outputs_shape, bool *infer_success)¶
Get runtime dynamic output shape.
Note: flag indicate shape infer result, true for success and false for failed. for shape infer failed and legacy executable, return static output shape
- Parameters
exe – [in] Pointer to executable object.
inputs_shape – [in] Pointer to input shape.
outputs_shape – [out] Pointer to output shape.
infer_success – [out] flag to indicate shape infer result.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsExecutableGetSubFuncInfo(topsExecutable_t exe, char **info_raw_data, size_t *info_size, int **param_info, int *param_count)¶
get excutable sub function information
Note: use only in refit stage 2, user should parse the Info_RawData to get detailed sub function information example: param_info[2, 1, 3, 2, 3, 1] means subfuncA need 2 inputs, 1 output; subfuncB need 3 inputs, 2 outputs; subfuncC need 3 inputs, 1 output
- Parameters
exe – [in] Pointer to executable object
Info_RawData – [out] raw data of sub function information
Info_size – [out] size of Info_RawData
param_info – [out] count list of input and output for each sub function
param_count – [out] total count of input and output
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsExecutableGetRefitFlag(topsExecutable_t exe, int *refit_flag)¶
get excutable refit flag
- Parameters
exe – [in] Pointer to executable object.
refit_flag – [out] the flag indicate if this executable is refitable
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsExecutableCallSubFunc(topsExecutable_t exe, const char *func_name, void **inputs, size_t input_count, int64_t *input_dims, size_t *input_rank, void **outputs, size_t output_count, topsStream_t stream)¶
call sub function in executable
Limitation: inputs and outputs only support alloc by topsMalloc/topsHostMalloc
Note: use only in refit stage 2 for constant preprocessing
- Parameters
exe – [in] Pointer to executable object.
func_name – [in] sub function name
inputs – [in] Inputs of sub function.
input_count – [in] Inputs count of sub function.
input_dims – [in] Inputs dims List of sub function.
input_rank – [in] Inputs dims rank of sub function.
outputs – [in] Outputs of sub function.
output_count – [in] Outputs count of sub function.
stream – [in] stream identifier.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsDeviceSetResource(topsResource_t res)¶
Set default resource_bundle.
Limitation: set current thread global resource_bundle, set resource_bundle before all tops APIs
- Parameters
res – [in] Pointer to res_bundle object.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsConstBufferGet(uint64_t hash_key, void *init_data, size_t size, void **ptr, topsStream_t stream)¶
Get const buffer.
Get or init const buffer with special hash_key
- Parameters
hash_key – [in] value to const buffer special key.
init_data – [in] Pointer to const raw buffer.
size – [in] Size to const buffer size.
ptr – [out] Pointer to get const buffer.
stream – [in] stream identifier.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsConstBufferPut(uint64_t hash_key, topsStream_t stream)¶
Put const buffer.
- Parameters
hash_key – [in] value to const buffer special key.
stream – [in] stream identifier.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsExecutableQueryInfo(topsExecutable_t exe, topsExecutableInfoType_t info_type, uint64_t *data)¶
Query executable info.
Limitation: user need to allocate memory for data
- Parameters
exe – [in] Pointer to executable object.
info_type – [in] Type to query.
data – [out] Pointer to query output data.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsExecutableQueryInfoV2(topsExecutable_t exe, topsExecutableInfoType_t info_type, int64_t *data)¶
Query executable info.
Limitation: user need to allocate memory for data
NOTE: bool infomation also return by data, 1:true, 0:false
- Parameters
exe – [in] Pointer to executable object.
info_type – [in] Type to query.
data – [out] Pointer to query output data.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsExecutableQueryInfoV3(topsExecutable_t exe, topsExecutableInfoType_t info_type, char **data)¶
Query executable string info.
Limitation: user need to allocate memory for data
- Parameters
exe – [in] Pointer to executable object.
info_type – [in] Type to query.
data – [out] Pointer to query output string info.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsExecutableQueryInputName(topsExecutable_t exe, int index, char **name)¶
Query executable input name.
- Parameters
exe – [in] Pointer to executable object.
index – [in] Specify which input to query.
name – [out] The name of the input.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsExecutableQueryOutputName(topsExecutable_t exe, int index, char **name)¶
Query executable output name.
- Parameters
exe – [in] Pointer to executable object.
index – [in] Specify which output to query.
name – [out] The name of the output.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsExecutableSaveToFile(topsExecutable_t exe, const char *path)¶
Save executable to a specified file.
- Parameters
exe – [in] Pointer to executable object.
path – [in] The name of the file to be used to save executable.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsExtMallocWithBank(void **ptr, size_t size, uint64_t bank)¶
Allocate memory on the specify memory bank.
If size is 0, no memory is allocated, *ptr returns non-nullptr, and topsSuccess is returned.
NOTE: default bank is 0
Use topsFree to release ptr
- Parameters
ptr – [out] Pointer to the allocated memory
size – [in] Requested memory size
bank – [in] memory bank
- Returns
topsSuccess, #topsErrorOutOfMemory, topsErrorInvalidValue (bad context, null *ptr)
-
TOPS_PUBLIC_API topsError_t topsExtMallocWithBankV2(void **ptr, size_t size, uint64_t bank, unsigned int flags)¶
Allocate memory on the specify memory bank.
If size is 0, no memory is allocated, *ptr returns non-nullptr, and topsSuccess is returned.
NOTE: default bank is 0
Use topsFree to release ptr
- Parameters
ptr – [out] Pointer to the allocated memory
size – [in] Requested memory size
bank – [in] memory bank
flags – [in] Type of memory allocation. flags only support topsDeviceMallocDefault/topsMallocTopDown/topsMallocForbidMergeMove
- Returns
topsSuccess, #topsErrorOutOfMemory, topsErrorInvalidValue (bad context, null *ptr)
-
TOPS_PUBLIC_API topsError_t topsExtMallocWithAffinity(void **ptr, size_t size, uint64_t bank, unsigned int flags)¶
Allocate memory on the logical memory bank.
If size is 0, no memory is allocated, *ptr returns non-nullptr, and topsSuccess is returned.
NOTE: default bank is 0, the logical bank will be mapped to physical bank
Use topsFree to release ptr
- Parameters
ptr – [out] Pointer to the allocated memory
size – [in] Requested memory size
bank – [in] memory bank
flags – [in] Type of memory allocation. flags only support topsDeviceMallocDefault/topsMallocTopDown/topsMallocForbidMergeMove
- Returns
topsSuccess, #topsErrorOutOfMemory, topsErrorInvalidValue (bad context, null *ptr)
-
TOPS_PUBLIC_API topsError_t topsExtLaunchCooperativeKernelMultiCluster(const topsExtLaunchParams_t *launchParamsList, int numClusters, dim3 gridDim, dim3 blockDim, topsStream_t stream)¶
Asynchronously launch kernel.
- Parameters
launchParamsList – [in] Pointer to launch params List.
numClusters – [in] Size to use cluster count.
gridDim – [in] Grid Dimension to use.
blockDim – [in] Block Dimension to use.
stream – [in] stream identifier.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsExtSetProfileMeta(uint8_t *meta, uint32_t size, int64_t compilation_id)¶
Set profile meta data.
- Parameters
Pointer – [in] to profile meta data.
profile – [in] meta data size in bytes.
compilation – [in] id.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsExtSetProfileMetas(uint32_t meta_cnt, uint8_t *metas[], uint32_t size[], int64_t compilation_id[])¶
Set profile meta data with array.
- Parameters
meta – [in] data count.
Pointer – [in] to profile meta data array.
profile – [in] meta data size array in bytes.
compilation – [in] id array.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsScatterMemoryGetInfo(const void *dev_ptr, int index, int64_t *win_pos, size_t *win_size, int64_t *map_ctrl, size_t *map_size)¶
get scatter memory config: win_pos and map_ctrl.
This interface only works when dev_ptr is created with topsMallocForScatter and init with topsScatterSetSubMem.
Limitation: user need to allocate memory win_pos/map_ctrl array
- Parameters
dev_ptr – [in] Pointer to scatter memory.
index – [in] Index to sub mem.
win_pos – [in] Pointer to sub memory window position.
win_size – [in] Pointer to sub memory window position size.
map_ctrl – [in] Pointer to sub memory map ctrl.
map_size – [in] Pointer to sub memory map ctrl size.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsScatterMemoryGetSubNum(const void *dev_ptr, size_t *size)¶
Get the sub memory number in a scatter memory.
This interface only works when dev_ptr is created with topsMallocScatter and init with topsScatterSetSubMem.
- Parameters
dev_ptr – [in] Pointer to scatter memory.
Size – [in] Pointer to sub memory number size.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsMallocScatter(void **ptr, size_t size)¶
Creates scatter memory on local device.
Creates device memory object for holding a list of scattered DeviceMemory objects. Memory is not actually allocated. User can call topsScatterPopulateSub to allocate sub device memory and topsScatterGetSubMem to query sub memory objects. Scatter DeviceMemory handles will be automatically processed by runtime API such like topsMemcpy, it can be viewed as a plain memory buffer.
Use topsFree to release ptr
A scatter DeviceMemory is invalid until it’s fully constructed with correctly splitted sub DeviceMemory objects both in size and dimension (all related DeviceMemory objects have invoked SetDims).
- Parameters
ptr – [in] Pointer to the allocated memory
size – [in] Requested creation size in bytes.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsScatterPopulateSub(const void *dev_ptr, size_t bpe, uint64_t bank, int64_t *dims, int64_t *win_pos, int64_t *map_ctrl, size_t rank_size)¶
Populate scatter memory with sub-memory.
This interface only works when dev_ptr is created with topsMallocScatter API. Each invocation populates one sub-memory. The parent scatter memory owns this sub-memory, that all populated sub-memory are freed if parent is free. topsScatterGetSubMem WON’T retain the sub-memory.
- Parameters
dev_ptr – [in] Pointer to scatter memory.
bpe – [in] memory bpe of single sub-memory.
bank – [in] memory bank of single sub-memory.
dims – [in] The dimension to be set.
win_pos – [in] The anchor of window in the scatter memory holding this submemory.
map_ctrl – [in] The dimension remap of reshaping. It’s a natural number up to rank.
rank_size – [in] The size of map ctrl/dims/win_position, the max size is 8.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsScatterInplace(const void *dev_ptr, size_t sub_count, size_t bpe, uint64_t *bank_list, int64_t **dims_list, int64_t **win_pos_list, int64_t **map_ctrl_list, size_t rank_size)¶
Inplace scatter memory with sub-memory description.
This interface only works when dev_ptr is created with topsMallocScatter API. The parent scatter memory owns this sub-memory, that all populated sub-memory are freed if parent is free.
topsScatterGetSubMem WON’T retain the sub-memory.
- Parameters
dev_ptr – [in] Pointer to scatter memory, if dev_ptr has sub_memory, sub memory must be allocated by topsScatterPopulateSub
sub_count – [in] Number of sub memory
bpe – [in] memory bpe of single sub-memory.
bank_list – [in] memory bank of sub-memory list.
dims_list – [in] The dimension list to be set.
win_pos_list – [in] The anchor of window list in the scatter memory holding this submemory.
map_ctrl_list – [in] The dimension remap of reshaping list. It’s a natural number up to rank.
rank_size – [in] The size of map ctrl/dims/win_position, the max size is 8.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsScatterGetSubMem(const void *dev_ptr, int index, void **sub_dptr)¶
Get a submemory from scatter memory .
This interface only works when DeviceMemory is created with topsMallocScatter API. In fact, user can hold the sub DeviceMemory instead of getting it from scatter.
- Parameters
dev_ptr – [in] Pointer to scatter memory.
index – [in] The index returned by ScatterSetSubMemory.
sub_dptr – [in] The submemory object that user set by topsScatterSetSubMem.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsMemoryReduceAsync(void *lhs, void *rhs, void *result, enum topsReduceOpType op, enum topsReduceDataType dtype, uint32_t element_cnt, topsStream_t stream)¶
Asynchronously request reduce service.
Request reduce service on Device. The lhs, rhs and result should be on the same device
- Parameters
lhs – [in] Left Hand Side device memory address object.
rhs – [in] Right Hand Side device memory object.
result – [in] Result device memory object.
op – [in] Reduce op type.
dtype – [in] Reduce data type.
element_cnt – [in] Count of data to do the calculation
stream – [in] stream identifier.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsMemCachePrefetch(topsDeviceptr_t dptr, size_t size)¶
Prefetch the contents of a device memory to L2 buffer. The API is available for GCU 3.0 product only, no effect for other products. A device memory range is defined through device memory pointer and size. A device memory range can be equal to original device memory or a sub range of original device memory. Upper-level software needs to guarantee sub memories have no overlap. Before prefetch, need to make sure the contents on device memory is ready.
- Parameters
dptr – [in] Global device pointer
size – [in] Global size in bytes
- Returns
topsSuccess, topsErrorInvalidValue, topsErrorMemoryAllocation
-
TOPS_PUBLIC_API topsError_t topsMemCacheInvalidate(topsDeviceptr_t dptr, size_t size)¶
Inavlidate L2 buffer cache data. The API is available for GCU 3.0 product only, no effect for other products. The API deletes L2 buffer cache data which it is prefetched through API topsMemCachePrefetch(). The original device memory keeps no change. A device memory range is defined through device memory pointer and size.
- Parameters
dptr – [out] Global device pointer
size – [out] Global size in bytes
- Returns
topsSuccess, topsErrorInvalidValue, topsErrorMemoryAllocation
-
TOPS_PUBLIC_API topsError_t topsMemCacheInvalidateAll(topsStream_t stream)¶
Invalidate all caches of device memory.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsMemCacheFlushAll(topsStream_t stream)¶
Flush all caches of device memory.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsExtGetMcAvailableMemSize(uint64_t mc, uint64_t *size)¶
Get available memory size for mc
- Parameters
mc – [in] mc index.
size – [out] Pointer to available size for this mc.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsExecutableGetSectionCount(topsExecutable_t exe, topsExtExecutableSectionHeaderType_t sh_type, uint64_t *count)¶
Get section count of the executable
Note: call with sh_type = topsExtExecutableSectionHeaderType::SHT_LAST_TYPE will return total section counts of executable
- Parameters
exe – [in] Pointer to executable object.
sh_type – [in] Section header type.
count – [out] Pointer to count of the section.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsExecutableGetStackSize(topsExecutable_t exe, uint64_t *stack_count, uint64_t *stack_size)¶
Get stack size of the executable
- Parameters
exe – [in] Pointer to executable object.
stack – [out] count pointer to count stack size
array – [out] to return stack size for every bank.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsExecutableGetSectionInfo(topsExecutable_t exe, topsExtExecutableSectionHeaderType_t sh_type, int mc, topsExtExecutableSectionInfo_t *info)¶
Get specific section information of the executable
- Parameters
exe – [in] Pointer to executable object.
sh_type – [in] Section header type.
mc – [in] mc index.
info – [out] section information.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsGetMemUsageInfo(size_t *current_used_bytes, size_t *peek_used_bytes, int bank_total)¶
Get memory usage information.
- Parameters
current_used_bytes – [out] Current size of used memory in bytes.
peak_used_bytes – [out] Peak size of used memory in bytes.
bank_total – [in] The total number of memory bank.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsGetAffinityBankList(int *affinity_bank_list, int bank_total)¶
Get memory bank list of affinity
- Parameters
affinity_bank_list – [out] Affinity bank list of memory.
bank_total – [in] The total number of memory bank.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsGetAffinityBankListV2(topsResource_t res, int *affinity_bank_list, int bank_total)¶
Get memory bank list of affinity
- Parameters
res – [in] Pointer to res_bundle object.
affinity_bank_list – [out] Affinity bank list of memory.
bank_total – [in] The total number of memory bank.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsMemGetInfoExt(size_t *used_list, size_t *free_list, size_t *max_available_list, size_t *total_list, int bank_total)¶
Query memory usage info list.
Return list of free, used, max available, and total memory on the memory bank.
- Parameters
used_list – [out] Pointer to used memory list.
free_list – [out] Pointer to free memory list.
max_available_list – [out] Pointer to max_available memory list.
total_list – [out] Pointer to total memory list.
bank_total – [in] The total number of memory bank.
- Returns
topsSuccess, topsErrorInvalidDevice, topsErrorInvalidValue
-
TOPS_PUBLIC_API topsError_t topsKernelSignalMaxNumGet(int *num)¶
get the maximum kernel signal number on current device.
- Parameters
num – [out] a pointer to the variable alloced by caller.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsKernelSignalAvailableNumGet(int *num)¶
get the available kernel signal number on current device.
- Parameters
num – [out] a pointer to the variable alloced by caller.
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsKernelSignalAlloc(int *handle)¶
get a kernel signal handle alloced by kmd on current device.
- Parameters
handle – [out] put output of handle on here
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsKernelSignalFree(int handle)¶
free a kernel signal handle alloced by kmd on current device.
- Parameters
handle – [in] the kernel signal handle
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsKernelSignalRead(int handle, uint32_t *value)¶
get a kernel signal value indicated by handle.
- Parameters
handle – [in] the kernel signal handle
value – [out] put output of value on here
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsKernelSignalWrite(int handle, uint32_t value)¶
set a kernel signal value indicated by handle.
- Parameters
handle – [in] the kernel signal handle
value – [in] the value of kernel signal
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsRoceCreateQueue(uint32_t port, topsDeviceptr_t sq_base_ptr, size_t sq_size, uint8_t sq_user, uint32_t *q_id)¶
create a roce sq.
- Parameters
port – [in] the ROCE port id
sq_base_ptr – [in] the base address of the sq’s ring buffer
sq_size – [in] the size of the sq’s ring buffer
sq_user – [in] the user of the sq
q_id – [out] the allocated sq id
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsRoceQueryQueue(uint32_t port, uint32_t q_id, uint64_t *mac, uint32_t *ip)¶
query the specified sq’s info.
- Parameters
port – [in] the ROCE port id
q_id – [in] the sq id
mac – [out] mac address binded with the sq
id – [out] ip address binded with the sq
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsRoceBindQueuePair(uint32_t port, uint32_t q_id, uint32_t remote_q_id, uint64_t remote_mac, uint32_t remote_ip)¶
Bind a queue pair.
- Parameters
port – [in] the ROCE port id
q_id – [in] the local sq id
remote_q_id – [in] the remote sq id
remote_mac – [in] the remote mac address binded with the remote sq
remote_ip – [in] the remote ip address binded with the remote sq
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsRoceDeleteQueue(uint32_t port, uint32_t q_id)¶
Delete a sq.
- Parameters
port – [in] the ROCE port id
q_id – [in] the local sq id
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsRoceWriteQueue(uint32_t port, uint32_t q_id, void *dst, void *src, size_t size)¶
emit a write request to sq for master mode.
- Parameters
port – [in] the ROCE port id
q_id – [in] the local sq id
dst – [in] destination device ptr
src – [in] source device ptr
size – [in] the write request size
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsGetGlobalRandomSeed(uint64_t *seed)¶
get global random seed.
- Parameters
seed – [out] global random seed
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsSetGlobalRandomSeed(uint64_t seed)¶
set global random seed.
- Parameters
seed – [in] global random seed
- Returns
topsSuccess on success, or other on failure.
-
TOPS_PUBLIC_API topsError_t topsSetDeviceAndResourceReservation(int deviceId, topsResourceReservation_t *rsvd_res)¶
Set default device to be used for subsequent tops API calls from this thread.
Sets
device
as the default device for the calling host thread. Valid device id’s are 0… (topsGetDeviceCount()-1).Many TOPS APIs implicitly use the “default device” :
Any device memory subsequently allocated from this host thread (using topsMalloc) will be allocated on device.
Any streams or events created from this host thread will be associated with device.
Any kernels launched from this host thread (using topsLaunchKernel) will be executed on device (unless a specific stream is specified, in which case the device associated with that stream will be used).
This function may be called from any host thread. Multiple host threads may use the same device. This function does no synchronization with the previous or new device, and has very little runtime overhead. Applications can use topsSetDevice to quickly switch the default device before making a TOPS runtime call which uses the default device.
The default device is stored in thread-local-storage for each thread. Thread-pool implementations may inherit the default device of the previous thread. A good practice is to always call topsSetDevice at the start of TOPS coding sequency to establish a known standard device.
See also
- Parameters
deviceId – [in] Valid device in range 0…(topsGetDeviceCount()-1).
resRequest – [in] Create a device by applying for a specified number of threads.
- Returns
topsSuccess, topsErrorInvalidDevice, #topsErrorDeviceAlreadyInUse