2. Class

2.1. Dims32

This section describes the TopsInference Dimension information API.

using Dims = Dims32

Alias for Dims32.

Warning

This alias might change in the future.

class TopsInference::Dims32
#include <TopsInferRuntime.h>

Dimension information definition.

Public Members

int32_t nbDims = 0

The number of dimensions actually in use.

int32_t dimension[MAX_DIMS] = {0}

The size in each dimension.

Public Static Attributes

static const int32_t MAX_DIMS = 8

The max dimension supported by the class.
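
A minimal usage sketch: filling in a Dims value for a 4-dimensional NCHW tensor (the sizes are illustrative):

TopsInference::Dims dims;   // Dims is an alias for Dims32
dims.nbDims = 4;            // a 4-dimensional tensor
dims.dimension[0] = 1;      // N
dims.dimension[1] = 3;      // C
dims.dimension[2] = 224;    // H
dims.dimension[3] = 224;    // W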

2.2. IErrorManager

This section describes the TopsInference error manager information API.

class TopsInference::IErrorManager
#include <TopsInferRuntime.h>

Error Manager for recording the internal errors.

Note

Examples:

IParser *onnx_parser = create_parser(ParserType::TIF_ONNX);
IErrorManager *error_manager = create_error_manager();

try {
  const char *model = "add.onnx";
  INetwork *network = onnx_parser->readModel(model);
  if (network == nullptr) {
    int32_t error_count = error_manager->getErrorCount();
    for (int32_t i = 0; i < error_count; i++) {
      const char *error_msg = error_manager->getErrorMsg(i);
      TIFStatus error_status = error_manager->getErrorStatus(i);
    }
    error_manager->clear();
  }
} catch (std::exception &e) {
  int32_t error_count = error_manager->getErrorCount();
  for (int32_t i = 0; i < error_count; i++) {
    const char *error_msg = error_manager->getErrorMsg(i);
    TIFStatus error_status = error_manager->getErrorStatus(i);
  }
  error_manager->clear();
}

Warning

The error manager is thread-safe.

Public Functions

virtual TIFStatus getErrorStatus(int32_t index) = 0

Get the Error Status.

See also

Status code.

Parameters

index – The index of the queried error; index < getErrorCount().

Returns

TIFStatus.

virtual int32_t getErrorCount() = 0

Get the error count.

Returns

int32_t The error count in error manager.

virtual const char *getErrorMsg(int32_t index) = 0

Get the error message.

Parameters

index – The index of the queried error; index < getErrorCount().

Returns

const char* The error message.

virtual bool reportStatus(int32_t index) = 0

Print the error status and message as a string.

Parameters

index – The index of the queried error; index < getErrorCount().

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual bool clear() = 0

Clear all recorded errors; call this after you finish querying.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

2.3. ITensor

This section describes the TopsInference ITensor API.

typedef class ITensor *TensorPtr_t
class TopsInference::ITensor
#include <TopsInferRuntime.h>

ITensor information definition. The attributes of inputs and outputs, including the original data pointer, shape, device type, etc., are recorded in ITensor.

Public Functions

virtual void *getOpaque() = 0

Get the data pointer of the ITensor.

Returns

The data pointer.

virtual bool setOpaque(void *opaque) = 0

Set the data pointer of the ITensor.

Parameters

opaque – The data pointer. If the device type is HOST, opaque is a host memory pointer; if the device type is DEVICE, opaque should be a pointer to device memory.

virtual Dims getDims() = 0

Get the dimension.

Returns

Dimension array.

virtual void setDims(Dims dims) = 0

Set dimension.

Parameters

dims – The dimension.

virtual DataDeviceType getDeviceType() = 0

Get the device type of the ITensor buffer.

See also

DataDeviceType.

Returns

The device type of the ITensor buffer.

virtual void setDeviceType(DataDeviceType device_type) = 0

Set the device type of the ITensor buffer.

See also

DataDeviceType.

Parameters

device_type – The device type of the ITensor buffer.

virtual void release() = 0

Release current ITensor, and this interface must be called once you want to delete the ITensor.

virtual DataType getDataType() = 0

Get the data type of the ITensor buffer.

See also

DataType.

Returns

The type of the ITensor buffer.

virtual void setDataType(DataType datatype) = 0

Set the data type of the ITensor buffer.

See also

DataType.

Parameters

datatype – The data type of the ITensor buffer.
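
A minimal sketch of configuring an ITensor as a host-resident FP32 buffer. create_tensor() is a hypothetical factory and DataDeviceType::HOST is assumed to name the host-side device type; obtain the ITensor and the exact enumerators as your integration provides them:

// create_tensor() is hypothetical; obtain the ITensor from your integration.
TopsInference::ITensor *tensor = create_tensor();

TopsInference::Dims dims;
dims.nbDims = 2;
dims.dimension[0] = 1;
dims.dimension[1] = 1000;
tensor->setDims(dims);
tensor->setDataType(TopsInference::DataType::TIF_FP32);
// Assumption: DataDeviceType::HOST names the host-side device type.
tensor->setDeviceType(TopsInference::DataDeviceType::HOST);

static float host_buffer[1000];
tensor->setOpaque(host_buffer);  // host pointer, matching the HOST device type

// ... use the tensor with runV2()/runV3() ...
tensor->release();               // must be called to delete the ITensor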

2.4. IConfig

This section describes the TopsInference Config API.

class TopsInference::IConfig
#include <TopsInferRuntime.h>

Common config definition.

Subclassed by TopsInference::ICalibratorConfig, TopsInference::IEngineConfig, TopsInference::IOptimizerConfig, TopsInference::IParserConfig

Public Functions

virtual bool loadConfig(const char *proto) = 0

Load config from proto buffer file name.

Parameters

proto – The proto buffer file name.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual bool saveConfig(const char *proto) = 0

Save config into proto buffer file name.

Parameters

proto – The proto buffer file name.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.
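
A short sketch of round-tripping a config through a proto buffer file; the file name is illustrative and the config can be any IConfig subclass, e.g. the IParserConfig returned by IParser::getConfig():

TopsInference::IConfig *config = parser->getConfig();
if (!config->saveConfig("parser_config.pb")) {
  // handle save failure
}
// Later, or in another process, restore the same settings:
config->loadConfig("parser_config.pb");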

class TopsInference::IParserConfig : public TopsInference::IConfig
#include <TopsInferRuntime.h>

Config for IParser.

Public Functions

virtual void setSimplify(bool simplify) = 0

Simplify the network when parsing.

Parameters

simplify – If true, simplify the network while parsing.

virtual bool getSimplify() = 0

Check whether simplification is enabled in the parser.

Returns

bool.

Returns

  • true – Simplification is enabled.

  • false – Simplification is disabled.

class TopsInference::IOptimizerConfig : public TopsInference::IConfig
#include <TopsInferRuntime.h>

Config for Optimizer.

Public Functions

virtual bool setCompileOptions(const char *options) = 0

Set the Compile Options.

Parameters

options – Compile options in JSON format.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual const char *getCompileOptions() const = 0

Get the compile options.

Returns

const char* Current compile options.

virtual bool setBuildFlag(BuildFlag flag) = 0

Set the build flag.

See also

BuildFlag.

Parameters

flag – build flag.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual bool setBuildFlag(int64_t flag) = 0

Set the build flag.

Parameters

flag – A value formed by composing multiple BuildFlag values.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual int64_t getBuildFlag() = 0

Get the build flag.

Returns

int64_t BuildFlag.

virtual bool setMaxShapeRange(const char *op_shape) = 0

Set the maximum shape for the specified op with a JSON string.

Note

Examples:

Json::Value max_shape_range_setting;
Json::Value op_max_val;
op_max_val["main"].append("1,3,512,512");
max_shape_range_setting.append(op_max_val);
Json::Value min_shape_range_setting;
Json::Value op_min_val;
op_min_val["main"].append("1,3,112,112");
min_shape_range_setting.append(op_min_val);
std::string max_setting_str = max_shape_range_setting.toStyledString();
std::string min_setting_str = min_shape_range_setting.toStyledString();
assert(optimizer_config->setMaxShapeRange(max_setting_str.c_str()) &&
    "[Error] set max shape range failed!");
assert(optimizer_config->setMinShapeRange(min_setting_str.c_str()) &&
    "[Error] set min shape range failed!");

Parameters

op_shape – Maximum shape settings in JSON format.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual bool setMinShapeRange(const char *op_shape) = 0

Set the minimum shape for the specified op with a JSON string.

See also

setMaxShapeRange.

Parameters

op_shape – Minimum shape settings in JSON format.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual bool setInt8Calibrator(ICalibrator *calibrator) = 0

Configure the calibrator.

Parameters

calibrator – a pointer to the calibrator.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual void setRefitPreprocess(bool refit_preprocess_flag) = 0

Set the refit preprocess mode.

Parameters

refit_preprocess_flag – If true, enable refit with preprocess.
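
A sketch of a typical optimizer configuration, assuming an optimizer from the IOptimizer section below; treating TIF_KTYPE_MIX_FP16 (the mode mentioned under ILayer::setPrecision) as a BuildFlag enumerator is an assumption, and the JSON options string is illustrative:

IOptimizerConfig *config = optimizer->getConfig();
// Assumption: TIF_KTYPE_MIX_FP16 is a BuildFlag enumerator selecting the
// FP32/FP16 mixed-precision mode.
config->setBuildFlag(TopsInference::BuildFlag::TIF_KTYPE_MIX_FP16);
config->setCompileOptions("{\"key\": \"value\"}");  // illustrative JSON options
config->setRefitPreprocess(true);                   // enable refit with preprocess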

class TopsInference::IEngineConfig : public TopsInference::IConfig
#include <TopsInferRuntime.h>

Config for engine.

Public Functions

virtual const char *getEngineVersion() = 0

Get the engine version.

Returns

const char* engine version.

virtual void setAutoBatchMode(bool auto_mode) = 0

Set the auto batch mode.

Parameters

auto_mode – If true, set auto batch mode.

virtual bool isAutoBatchMode() = 0

Get the auto batch mode.

Returns

bool.

Returns

  • true – Auto batch mode is enabled.

  • false – Auto batch mode is disabled.
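
A brief sketch, assuming an engine obtained as described in the IEngine section below:

IEngineConfig *config = engine->getConfig();
config->setAutoBatchMode(true);
if (config->isAutoBatchMode()) {
  // In auto batch mode, IEngine::run() must keep its stream nullptr.
}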

class TopsInference::ICalibratorConfig : public TopsInference::IConfig
#include <TopsInferRuntime.h>

Config for Calibrator.

Public Functions

virtual bool setOpPrecision(const char *op_name, DataType dtype) = 0

Set the op precision used for calibration.

Parameters
  • op_name – op name to be set.

  • dtype – the precision to set for the op.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual bool setOpCalibrateAlgo(const char *op_name, CalibrationAlgoType algo) = 0

Set the op algorithm used for calibration.

Parameters
  • op_name – op name to be set.

  • algo – the calibration algorithm to set for the op.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual bool setOpThreshold(const char *op_name, double thres_val) = 0

Set the op threshold for calibration.

Parameters
  • op_name – op name to be set.

  • thres_val – the threshold value to set for the op.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual bool enableQuantizeOps(const char **op_types, int32_t num) = 0

Set the op types to be quantized.

Parameters
  • op_types – op types to be set.

  • num – the number of op types.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual bool getQuantizeOps(char **op_types, int32_t *num) = 0

Get the op types that will be quantized.

Parameters
  • op_types – op types to be quantized.

  • num – the number of quantized op types.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual bool disableQuantizeOps(const char **op_types, int32_t num) = 0

Exclude the given op types from quantization.

Parameters
  • op_types – op types to be set.

  • num – the number of op types.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual DataType getOpPrecision(const char *op_name) = 0

Get the op precision set in calibration.

Parameters

op_name – op name.

Returns

DataType op precision.

virtual CalibrationAlgoType getOpCalibrateAlgo(const char *op_name) = 0

Get the op algorithm set in calibration.

Parameters

op_name – op name.

Returns

CalibrationAlgoType op algorithm.

virtual double getOpThreshold(const char *op_name) = 0

Get the op threshold set in calibration.

Parameters

op_name – op name.

Returns

double op threshold.
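
An illustrative sketch; the layer name "conv_1", the op type strings, and the threshold value are all hypothetical:

ICalibratorConfig *config = calibrator->getConfig();
config->setOpPrecision("conv_1", TopsInference::DataType::TIF_FP32);  // keep this op in FP32
config->setOpThreshold("conv_1", 6.0);                                // illustrative threshold

// Restrict quantization to certain op types.
const char *op_types[] = {"Conv", "MatMul"};
config->enableQuantizeOps(op_types, 2);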

2.5. ILayer

This section describes the TopsInference Layer information API.

class TopsInference::ILayer
#include <TopsInferRuntime.h>

Layer definition. Base class for all layer classes in a network definition.

Public Functions

virtual LayerType getType() = 0

Get the layer type.

See also

LayerType.

Returns

LayerType.

virtual void setName(const char *name) = 0

Set the layer name; takes effect at compile time.

Parameters

name – layer name.

virtual const char *getName() = 0

Get the layer name.

Returns

const char*.

virtual bool setPrecision(DataType dataType) = 0

Set the layer precision; takes effect at compile time. In TIF_KTYPE_MIX_FP16 mode, users can set TIF_FP32 or TIF_FP16. In TIF_KTYPE_INT8_MIX_FP32 mode, users can set TIF_FP32 or TIF_INT8.

Parameters

dataType – set DataType::TIF_FP32 or DataType::TIF_FP16 or DataType::TIF_INT8.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual DataType getPrecision() = 0

Get the layer precision.

Returns

DataType.

virtual bool isPrecisionSet() = 0

Check whether the layer precision is set or not.

Returns

bool whether the precision has been set.

virtual void resetPrecision() = 0

Reset the layer precision; takes effect at compile time.

2.6. INetwork

This section describes the TopsInference Network definition API.

class TopsInference::INetwork
#include <TopsInferRuntime.h>

Network definition. A network definition to the builder.

Public Functions

virtual int32_t getLayerNum() = 0

Get the layer number.

Returns

int32_t layer number.

virtual ILayer *getLayer(int32_t index) = 0

Get the layer according to the index.

Parameters

index – the index must be less than the layer number.

Returns

ILayer* Pointer to the ILayer.

virtual void dump() = 0

Dump the network information.

virtual ILayer **getLayer(const char *regex_str, int32_t *match_num) = 0

Get layers by layer name; fuzzy matching with a regular expression is also supported.

Parameters
  • regex_str – regex expression.

  • match_num – matched layer number.

Returns

ILayer** Array of matched ILayer pointers.
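
A minimal sketch of inspecting a parsed network and pinning matched layers to FP32; the regular expression "conv.*" is illustrative, and network is an INetwork obtained from IParser::readModel() (next section):

int32_t match_num = 0;
TopsInference::ILayer **layers = network->getLayer("conv.*", &match_num);
for (int32_t i = 0; i < match_num; i++) {
  layers[i]->setPrecision(TopsInference::DataType::TIF_FP32);  // takes effect at compile time
}
network->dump();  // print the network information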

2.7. IParser

This section describes the TopsInference Parser definition API.

class TopsInference::IParser
#include <TopsInferRuntime.h>

Parser definition. IParser is a compiler component that translates an ONNX model into a TopsInference network definition.

Public Functions

virtual INetwork *readModel(const char *model) = 0

Read a model file; ONNX files are supported.

See also

{INetwork}.

Parameters

model – model file.

Returns

INetwork*.

virtual INetwork *readModelFromStr(const char *model, uint32_t model_size) = 0

Read model string.

See also

{INetwork}.

Parameters
  • model – model string.

  • model_size – model size.

Returns

INetwork*.

virtual INetwork *readModelObj(const void *model_obj) = 0

Read a model object. This interface is not ABI compatible. Please use ABI-1 ONNX, compiled with _GLIBCXX_USE_CXX11_ABI=1, which is the default for gcc versions greater than 5.

See also

{INetwork}.

Parameters

model_obj – model object.

Returns

INetwork*.

virtual INetwork *getModel() = 0

Get the model.

Returns

INetwork*.

virtual void setInputNames(const char *val) = 0

Set the input names before reading the model. When there are multiple inputs, names are separated with a comma, such as “a,b”. If not set, the original properties of the network are used.

Parameters

val – input names.

virtual void setInputDtypes(const char *val) = 0

Set the input data types before reading the model. When there are multiple inputs, types are separated with a comma, such as “TIF_FP32,TIF_FP32”. Subnets of the network can be extracted after setting the inputs and outputs. If not set, the original properties of the network are used.

Parameters

val – input dtypes.

virtual void setInputShapes(const char *val) = 0

Set the input shapes before reading the model. When there are multiple inputs, shapes are separated with a colon, such as “3,4:3,4”. If not set, the original properties of the network are used.

Parameters

val – input shapes.

virtual void setOutputNames(const char *val) = 0

Set the output names before reading the model. When there are multiple outputs, names are separated with a comma, such as “a,b”. If not set, the original properties of the network are used.

Parameters

val – output names.

virtual void setOutputDtypes(const char *val) = 0

Set the output data types before reading the model. When there are multiple outputs, types are separated with a comma, such as “TIF_FP32,TIF_FP32”. Subnets of the network can be extracted after setting the inputs and outputs. If not set, the original properties of the network are used.

Parameters

val – output dtypes.

virtual IParserConfig *getConfig() = 0

Get the config pointer used for the parser; use the returned IParserConfig* to set options.

Returns

IParserConfig*.
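
A minimal sketch combining the setters above with the create_parser() call shown in the IErrorManager example; the model file name, input names, shapes, and dtypes are illustrative:

IParser *parser = create_parser(ParserType::TIF_ONNX);
parser->getConfig()->setSimplify(true);             // simplify while parsing
parser->setInputNames("input_a,input_b");           // multiple names: comma-separated
parser->setInputShapes("1,3,224,224:1,3,224,224");  // multiple shapes: colon-separated
parser->setInputDtypes("TIF_FP32,TIF_FP32");
INetwork *network = parser->readModel("model.onnx");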

2.8. IStream

This section describes the TopsInference Stream definition API.

using topsInferStream_t = IStream*
class TopsInference::IStream
#include <TopsInferRuntime.h>

Stream definition. Stream of asynchronous action.

Public Functions

virtual bool synchronize() = 0

Synchronize the stream; wait for all pending asynchronous actions in the stream to complete.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

2.9. IFuture

This section describes the TopsInference Future definition API.

class TopsInference::IFuture
#include <TopsInferRuntime.h>

IFuture is used for asynchronous inference; it describes whether the current output data is ready.

Public Functions

virtual void wait() = 0

Wait until the output data is ready.

virtual bool status() = 0

Get the status of the output data.

Returns

Returns true if the output data is ready; otherwise returns false.

2.10. IEngine

This section describes the TopsInference Engine definition API.

class TopsInference::IEngine
#include <TopsInferRuntime.h>

Executable definition. The serialized engine contains the necessary copies of the weights, the parser, and the network definition.

See also

IOptimizer.

Public Functions

virtual bool saveExecutable(const char *name) = 0

Save engine to file.

Parameters

name – Engine file.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual bool loadExecutable(const char *name) = 0

Load engine from file.

Parameters

name – Engine file.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual bool loadExecFromBuffer(const void *blob, std::size_t size) = 0

Deserialize an engine from buffer.

Parameters
  • blob – The memory that holds the serialized executable.

  • size – The size of the memory.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual Dims getInputShape(int32_t index) = 0

Get the input shape corresponding to the given index; the index argument must be less than the value of getInputNum(). If using dynamic shape mode, Dims may contain -1.

Parameters

index – Each engine may have several input nodes; the index-th input is queried.

Returns

Dims.

virtual Dims getOutputShape(int32_t index) = 0

Get the output shape corresponding to the given index, the index argument must be less than the value of getOutputNum().

Parameters

index – Each engine may have several output nodes; the index-th output is queried.

Returns

Dims.

virtual int32_t getInputNum() = 0

Get the input num.

Returns

int32_t.

virtual int32_t getOutputNum() = 0

Get the output num.

Returns

int32_t.

virtual DataType getInputDataType(int32_t index) = 0

Get the input data type corresponding to the given index, the index argument must be less than the value of getInputNum().

Parameters

index – Each engine may have several input nodes; the index-th input is queried.

Returns

DataType.

virtual DataType getOutputDataType(int32_t index) = 0

Get the output data type corresponding to the given index, the index argument must be less than the value of getOutputNum().

Parameters

index – Each engine may have several output nodes; the index-th output is queried.

Returns

DataType.

virtual Dims getMaxInputShape(uint32_t index) = 0

Get the maximum input shape by index; this interface is only used when the current engine’s index-th input has a dynamic shape. If used in static shape mode, it returns the static shape.

Parameters

index – The input id, the index of the input.

Returns

The maximum input dimension.

virtual Dims getMaxOutputShape(uint32_t index) = 0

Get the maximum output shape by index; this interface is only used when the current engine’s index-th output has a dynamic or unknown shape.

Parameters

index – The output id, the index of the output.

Returns

The maximum output shape; if the maximum input shape was not set when compiling, an error is raised.

virtual Dims getMinInputShape(uint32_t index) = 0

Get the minimum input shape by index; this interface is only used when the current engine’s index-th input has a dynamic shape. If used in static shape mode, it returns the static shape.

Parameters

index – The input id, the index of the input.

Returns

The minimum input dimension.

virtual IEngineConfig *getConfig() = 0

Get the IEngine config pointer; use the returned IEngineConfig* to set options.

Returns

IEngineConfig*.

virtual bool run(void **input, void **output, BufferType buf_type, topsInferStream_t stream = nullptr) = 0

Run on the specified device (cluster).

When running in auto batch mode, async mode is not supported for now, which means that you must keep stream nullptr.

See also

BufferType.

When doing inference with buf_type equal to IN_HOST_OUT_HOST, async mode is not supported for now, which means that you must keep stream nullptr.

Parameters
  • input – The input buffer bound for the current engine; data is arranged in column-major order.

  • output – The output buffer bound for the current engine; data is arranged in column-major order.

  • buf_type – engine run mode.

  • stream – when stream is not nullptr, run asynchronously.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual bool runWithBatch(std::size_t sample_num, void **inputs, void **outputs, BufferType buf_type, topsInferStream_t stream = nullptr, IFuture *future = nullptr) = 0

Run on the specified device (cluster) with a dynamic batch.

See also

BufferType.

Parameters
  • sample_num – the number of samples in the runtime batch.

  • inputs – The sample_num input buffers bound for the current engine; data is arranged in column-major order: [[sample1_input_1, sample2_input_1, … , sample_x_input_1], [sample1_input_2, sample2_input_2, … , sample_x_input_2], …].

  • outputs – The sample_num output buffers bound for the current engine; data is arranged in column-major order.

  • buf_type – engine run mode.

  • stream – when stream is not nullptr, run asynchronously.

  • future – Optional; use it when running in asynchronous mode and you want to know the current request status.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual TIFStatus runV2(IN TensorPtr_t *inputs, INOUT TensorPtr_t *outputs, topsInferStream_t stream = nullptr, IFuture *future = nullptr) = 0

Run on the specified device (cluster) with a dynamic batch.

See also

Status code.

Parameters
  • inputs – The input tensor list pointer for each input.

  • outputs – The output tensor list pointer for each output.

  • stream – When stream is not null, run with asynchronous mode, otherwise with synchronous mode.

  • future – Optional; keep it non-null when you want to know the current request status in asynchronous mode.

Returns

TIFStatus.

virtual bool runV3(IN TensorPtr_t *inputs, INOUT TensorPtr_t *outputs, topsStream_t stream = nullptr, IFuture *future = nullptr) = 0

Run on the specified device (cluster) with a dynamic batch; this interface can be mixed with rt3.0.

Parameters
  • inputs – The input tensor list pointer for each input.

  • outputs – The output tensor list pointer for each output.

  • stream – When stream is not null, run with asynchronous mode, otherwise with synchronous mode.

  • future – Optional; keep it non-null when you want to know the current request status in asynchronous mode.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual const char *getInputName(uint32_t index) = 0

Get the input name by index; the index argument must be less than the value of getInputNum(). The life cycle of the pointer is consistent with that of the engine; the value may change on reload or rebuild.

Parameters

index – The input id, the index of the input.

Returns

The index-th input name.

virtual const char *getOutputName(uint32_t index) = 0

Get the output name by index; the index argument must be less than the value of getOutputNum(). The life cycle of the pointer is consistent with that of the engine; the value may change on reload or rebuild.

Parameters

index – The output id, the index of the output.

Returns

The index-th output name.

virtual size_t getDeviceMemorySize() = 0

Get the device memory size required by the GCU runtime.

Returns

The engine’s required memory size. Returns 0 if the memory size cannot be obtained.

virtual bool shapeInfer(IN const TensorPtr_t *inputs, INOUT TensorPtr_t *outputs) = 0

Infer output shapes. If the model input has a dynamic shape, this API can help derive the output shape; if the shape is static, it returns the static output shape.

See also

TIF_SHAPE_INFER_FAILED

See also

TIF_SHAPE_INFER_INACCURATE

Parameters
  • inputs – The input tensor list pointer for each input.

  • outputs – The output tensor list pointer for each output; inferred shapes are written back to these tensors.

Returns

bool. Returns true on success and false otherwise; the two error types are TIF_SHAPE_INFER_FAILED and TIF_SHAPE_INFER_INACCURATE (see above).

inline virtual bool setIODimensionInfo(const int32_t *inputNIndices, const int32_t *outputNIndices)

Set the N-dimensional information for input and output tensors. If a dimension exists, specify the index of N in that dimension. If it doesn’t exist, set the index to -1. It is not thread-safe. If you do not set the N index information after creating the engine, the N index for both inputs and outputs defaults to 0, which means NHWC/NCHW/NWHC/NXXX formats.

Parameters
  • inputNIndices – The N-dimension index for each input tensor.

  • outputNIndices – The N-dimension index for each output tensor.

Returns

bool if success return true, otherwise return false.

inline virtual void setMaxWorkspaceSize(std::size_t workspaceSize) noexcept

Set the maximum workspace size.

Parameters

workspaceSize – The maximum GCU temporary memory which the engine can use at execution time.

inline virtual std::size_t getMaxWorkspaceSize() const noexcept

Get the maximum workspace size. By default the workspace size is the size of the total global memory on the device.

Returns

The maximum workspace size.
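
An end-to-end synchronous run sketch. create_engine() is a hypothetical factory for an empty IEngine, the file name and buffer sizes are illustrative, and scoping IN_HOST_OUT_HOST under BufferType is an assumption:

// create_engine() is hypothetical; obtain the IEngine from your integration.
IEngine *engine = create_engine();
if (!engine->loadExecutable("model.exec")) {
  // handle load failure
}

// One host buffer per input and per output; sizes are illustrative.
static float in_data[1 * 3 * 224 * 224];
static float out_data[1 * 1000];
void *inputs[] = {in_data};
void *outputs[] = {out_data};

// IN_HOST_OUT_HOST does not support async mode, so the stream stays nullptr.
bool ok = engine->run(inputs, outputs, TopsInference::BufferType::IN_HOST_OUT_HOST);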

2.11. IOptimizer

This section describes the TopsInference Optimizer definition API.

class TopsInference::IOptimizer
#include <TopsInferRuntime.h>

Optimizer definition. The optimizer performs a series of optimizations on the network’s layers.

Public Functions

virtual IEngine *build(INetwork *network) = 0

Build an engine from network.

See also

INetwork.

See also

IEngine.

Note

Examples:

TopsInference::IEngine* engine = optimizer->build(network);

Parameters

network – INetwork pointer.

Returns

IEngine* Pointer to the built IEngine.

virtual IOptimizerConfig *getConfig() = 0

Get the config pointer used for the optimizer; use the returned IOptimizerConfig* to set options.

See also

BuildFlag.

Returns

IOptimizerConfig*.
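
A build-and-save sketch; create_optimizer() is a hypothetical factory, and network is an INetwork from IParser::readModel():

// create_optimizer() is hypothetical; obtain the IOptimizer from your integration.
IOptimizer *optimizer = create_optimizer();
IEngine *engine = optimizer->build(network);
if (engine != nullptr) {
  engine->saveExecutable("model.exec");  // reload later with loadExecutable()
}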

2.12. ICalibrator

This section describes the TopsInference Calibrator definition API.

class TopsInference::ICalibrator

Subclassed by TopsInference::IInt8EntropyCalibrator, TopsInference::IInt8MaxMinCalibrator, TopsInference::IInt8MaxMinEMACalibrator, TopsInference::IInt8PercentCalibrator

Public Functions

virtual int32_t getBatchSize() const noexcept = 0

Get the batch size used for calibration batches.

Returns

The batch size.

virtual bool getBatch(TensorPtr_t data[], const char *names[], int32_t num) noexcept = 0

Get a batch of input for calibration. The batch size of the input must match the batch size returned by getBatchSize().

Parameters
  • data – An array of pointers to host memory containing each network input’s data.

  • names – The names of the network input for each pointer in the binding array.

  • num – The number of pointers in the binding array.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if there are no more batches for calibration.

virtual const void *readCalibrationCache(int64_t &length)

Load a calibration cache. Calibration is potentially expensive, so it can be useful to generate the calibration data once, then use it on subsequent builds of the network. The cache includes the regression cutoff and quantized values used to generate it, and will not be used if these do not match the settings of the current calibrator. However, the network should also be recalibrated if its structure changes or the input data set changes, and it is the responsibility of the application to ensure this.

Parameters

length – The length of the cached data. If there is no data, this should be zero.

Returns

A pointer to the cache, or nullptr if there is no data.

virtual bool writeCalibrationCache(const void *ptr, int64_t length)

Save a calibration cache.

Parameters
  • ptr – A pointer to the data to cache.

  • length – The length in bytes of the data to cache.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual CalibrationAlgoType getAlgorithm() noexcept = 0

Get the algorithm used by this calibrator.

Returns

CalibrationAlgoType The algorithm used by the calibrator.

virtual ICalibratorConfig *getConfig()

Get the config pointer used for calibrator.

Returns

ICalibratorConfig*.

class TopsInference::IInt8EntropyCalibrator : public TopsInference::ICalibrator

Public Functions

inline virtual CalibrationAlgoType getAlgorithm() noexcept override

Get the algorithm used by this calibrator.

Returns

CalibrationAlgoType The algorithm used by the calibrator.

class TopsInference::IInt8MaxMinCalibrator : public TopsInference::ICalibrator

Public Functions

inline virtual CalibrationAlgoType getAlgorithm() noexcept override

Get the algorithm used by this calibrator.

Returns

CalibrationAlgoType The algorithm used by the calibrator.

class TopsInference::IInt8MaxMinEMACalibrator : public TopsInference::ICalibrator

Public Functions

inline virtual CalibrationAlgoType getAlgorithm() noexcept override

Get the algorithm used by this calibrator.

Returns

CalibrationAlgoType The algorithm used by the calibrator.

class TopsInference::IInt8PercentCalibrator : public TopsInference::ICalibrator

Public Functions

inline virtual CalibrationAlgoType getAlgorithm() noexcept override

Get the algorithm used by this calibrator.

Returns

CalibrationAlgoType The algorithm used by the calibrator.

class IInt8Calibrator
#include <TopsInferRuntime.h>

Application-implemented interface for calibration.
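
A minimal sketch of an application-implemented calibrator built on IInt8MaxMinCalibrator (which already provides getAlgorithm()); the batch size is illustrative and the actual data loading is elided:

class MyCalibrator : public TopsInference::IInt8MaxMinCalibrator {
 public:
  int32_t getBatchSize() const noexcept override { return 8; }

  bool getBatch(TopsInference::TensorPtr_t data[], const char *names[],
                int32_t num) noexcept override {
    // Fill each binding in data with the next batch of host input;
    // return false once there are no more batches. (Loading elided.)
    return false;
  }

  const void *readCalibrationCache(int64_t &length) override {
    length = 0;      // no cached calibration data in this sketch
    return nullptr;
  }

  bool writeCalibrationCache(const void *ptr, int64_t length) override {
    // Persist the calibration cache here if desired. (Elided.)
    return true;
  }
};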

2.13. IRefitter

This section describes the TopsInference Refitter definition API.

class TopsInference::IRefitter
#include <TopsInferRuntime.h>

Updates weights in an engine.

Public Functions

virtual int32_t getAllWeights(int32_t size, const char **weightsNames) = 0

Get names of all weights that could be refit.

Parameters
  • size[in] The number of weights names that can be safely written to.

  • weightsNames[out] The names of the weights to be updated, or nullptr for unnamed weights.

Returns

The number of weights that could be refit. It should be called twice: the first call, getAllWeights(0, nullptr), returns the size; then allocate weightsNames with that size and pass size and weightsNames in to get all the weights.

virtual int32_t getMissingWeights(int32_t size, const char **weightsNames) = 0

Get names of missing weights. For example, if some Weights have been set, but the engine was optimized in a way that combines weights, any unsupplied Weights in the combination are considered missing.

Parameters
  • size[in] The number of weights names that can be safely written to.

  • weightsNames[out] The names of the weights to be updated, or nullptr for unnamed weights.

Returns

The number of missing weights. It should be called twice: the first call, getMissingWeights(0, nullptr), returns the size; then allocate weightsNames with that size and pass size and weightsNames in to get the missing weights.

virtual bool setNamedWeights(const char *name, Weights weights) = 0

Specify new weights of given name.

Parameters
  • name[in] The name of the weights to be updated.

  • weights[in] new weight.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if new weights are rejected.

virtual bool getNamedWeights(const char *name, Weights &weights) = 0

Obtain weights of given name.

See also

Weights.

Parameters
  • name[in] The name of the weights to be obtained.

  • weights[inout] The weights; must be initialized with the type.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual bool refitEngine() = 0

Updates associated engine.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual bool isSupportPreprocess() = 0

Check whether the current refitter supports automatic preprocessing.
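
A sketch of the documented two-call pattern followed by a refit; create_refitter() is a hypothetical factory bound to an existing engine, and initializing Weights with its type before the query follows the note on getNamedWeights above:

// create_refitter() is hypothetical; obtain the IRefitter from your integration.
IRefitter *refitter = create_refitter(engine);

// Two-call pattern: first query the count, then fetch the names.
int32_t count = refitter->getAllWeights(0, nullptr);
const char **names = new const char *[count];
refitter->getAllWeights(count, names);

TopsInference::Weights weights;  // assumption: initialize with the proper type first
if (count > 0 && refitter->getNamedWeights(names[0], weights)) {
  // Modify the weight data here, then supply the update:
  refitter->setNamedWeights(names[0], weights);
}
refitter->refitEngine();  // apply the updates to the associated engine
delete[] names;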