2. Class

2.1. Dims32

This section describes the TopsInference Dimension information API.

using Dims = Dims32

Alias for Dims32.

Warning

This alias might change in the future.

class TopsInference::Dims32
#include <TopsInferRuntime.h>

Dimension information definition.

Public Members

int32_t nbDims = 0

The number of dimensions actually in use.

int32_t dimension[MAX_DIMS] = {0}

The size in each dimension.

Public Static Attributes

static const int32_t MAX_DIMS = 8

The max dimension supported by the class.
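
A minimal usage sketch: filling in a Dims value for a 4-dimensional NCHW tensor (the sizes are illustrative):

TopsInference::Dims dims;   // Dims is an alias for Dims32
dims.nbDims = 4;            // a 4-dimensional tensor
dims.dimension[0] = 1;      // N
dims.dimension[1] = 3;      // C
dims.dimension[2] = 224;    // H
dims.dimension[3] = 224;    // W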

2.2. IErrorManager

This section describes the TopsInference error manager information API.

class TopsInference::IErrorManager
#include <TopsInferRuntime.h>

Error Manager for recording the internal errors.

Note

Examples:

IParser *onnx_parser = create_parser(ParserType::TIF_ONNX);
IErrorManager *error_manager = create_error_manager();

try {
  const char *model = "add.onnx";
  INetwork *network = onnx_parser->readModel(model);
  if (network == nullptr) {
    int32_t error_count = error_manager->getErrorCount();
    for (int32_t i = 0; i < error_count; i++) {
      const char *error_msg = error_manager->getErrorMsg(i);
      TIFStatus error_status = error_manager->getErrorStatus(i);
    }
    error_manager->clear();
  }
} catch (std::exception &e) {
  int32_t error_count = error_manager->getErrorCount();
  for (int32_t i = 0; i < error_count; i++) {
    const char *error_msg = error_manager->getErrorMsg(i);
    TIFStatus error_status = error_manager->getErrorStatus(i);
  }
  error_manager->clear();
}

Warning

The error manager is thread-safe.

Public Functions

virtual TIFStatus getErrorStatus(int32_t index) = 0

Get the Error Status.

See also

Status code.

Parameters

index – The index of the queried error; index < getErrorCount().

Returns

TIFStatus.

virtual int32_t getErrorCount() = 0

Get the error count.

Returns

int32_t The error count in error manager.

virtual const char *getErrorMsg(int32_t index) = 0

Get the error message.

Parameters

index – The index of the queried error; index < getErrorCount().

Returns

const char* The error message.

virtual bool reportStatus(int32_t index) = 0

Print the error status and message as a string.

Parameters

index – The index of the queried error; index < getErrorCount().

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual bool clear() = 0

Clear all recorded errors; call this after you finish querying.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

2.3. ITensor

This section describes the TopsInference ITensor API.

typedef class ITensor *TensorPtr_t
class TopsInference::ITensor
#include <TopsInferRuntime.h>

ITensor information definition. The attributes of inputs and outputs, including the original data pointer, shape, device type, etc., are recorded in ITensor.

Public Functions

virtual void *getOpaque() = 0

Get the data pointer of the ITensor.

Returns

The data pointer.

virtual bool setOpaque(void *opaque) = 0

Set the data pointer of the ITensor.

Parameters

opaque – The data pointer. If the device type is HOST, opaque is a host memory pointer; if the device type is DEVICE, opaque should be a pointer to device memory.

virtual Dims getDims() = 0

Get the dimension.

Returns

Dimension array.

virtual void setDims(Dims dims) = 0

Set dimension.

Parameters

dims – The dimension.

virtual DataDeviceType getDeviceType() = 0

Get the device type of the ITensor buffer.

See also

DataDeviceType.

Returns

The device type of the ITensor buffer.

virtual void setDeviceType(DataDeviceType device_type) = 0

Set the device type of the ITensor buffer.

See also

DataDeviceType.

Parameters

device_type – The device type of the ITensor buffer.

virtual void release() = 0

Release current ITensor, and this interface must be called once you want to delete the ITensor.

virtual DataType getDataType() = 0

Get the data type of the ITensor buffer.

See also

DataType.

Returns

The type of the ITensor buffer.

virtual void setDataType(DataType datatype) = 0

Set the data type of the ITensor buffer.

See also

DataType.

Parameters

datatype – The data type of the ITensor buffer.
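
A minimal sketch of configuring an ITensor as a host-resident FP32 buffer. create_tensor() is a hypothetical factory and DataDeviceType::HOST is assumed to name the host-side device type; obtain the ITensor and the exact enumerators as your integration provides them:

// create_tensor() is hypothetical; obtain the ITensor from your integration.
TopsInference::ITensor *tensor = create_tensor();

TopsInference::Dims dims;
dims.nbDims = 2;
dims.dimension[0] = 1;
dims.dimension[1] = 1000;
tensor->setDims(dims);
tensor->setDataType(TopsInference::DataType::TIF_FP32);
// Assumption: DataDeviceType::HOST names the host-side device type.
tensor->setDeviceType(TopsInference::DataDeviceType::HOST);

static float host_buffer[1000];
tensor->setOpaque(host_buffer);  // host pointer, matching the HOST device type

// ... use the tensor with runV2()/runV3() ...
tensor->release();               // must be called to delete the ITensor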

2.4. IConfig

This section describes the TopsInference Config API.

class TopsInference::IConfig
#include <TopsInferRuntime.h>

Common config definition.

Subclassed by TopsInference::ICalibratorConfig, TopsInference::IEngineConfig, TopsInference::IOptimizerConfig, TopsInference::IParserConfig

Public Functions

virtual bool loadConfig(const char *proto) = 0

Load config from proto buffer file name.

Parameters

proto – The proto buffer file name.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual bool saveConfig(const char *proto) = 0

Save config into proto buffer file name.

Parameters

proto – The proto buffer file name.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.
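
A short sketch of round-tripping a config through a proto buffer file; the file name is illustrative and the config can be any IConfig subclass, e.g. the IParserConfig returned by IParser::getConfig():

TopsInference::IConfig *config = parser->getConfig();
if (!config->saveConfig("parser_config.pb")) {
  // handle save failure
}
// Later, or in another process, restore the same settings:
config->loadConfig("parser_config.pb");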

class TopsInference::IParserConfig : public TopsInference::IConfig
#include <TopsInferRuntime.h>

Config for IParser.

Public Functions

virtual void setSimplify(bool simplify) = 0

Simplify the network when parsing.

Parameters

simplify – If true, simplify the network while parsing.

virtual bool getSimplify() = 0

Check whether simplification is enabled in the parser.

Returns

bool.

Returns

  • true – Simplification is enabled.

  • false – Simplification is disabled.

class TopsInference::IOptimizerConfig : public TopsInference::IConfig
#include <TopsInferRuntime.h>

Config for Optimizer.

Public Functions

virtual bool setCompileOptions(const char *options) = 0

Set the Compile Options.

Parameters

options – Compile options in JSON format.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual const char *getCompileOptions() const = 0

Get the compile options.

Returns

const char* Current compile options.

virtual bool setBuildFlag(BuildFlag flag) = 0

Set the build flag.

See also

BuildFlag.

Parameters

flag – build flag.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual bool setBuildFlag(int64_t flag) = 0

Set the build flag.

Parameters

flag – A value formed by composing multiple BuildFlag values.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual int64_t getBuildFlag() = 0

Get the build flag.

Returns

int64_t BuildFlag.

virtual bool setMaxShapeRange(const char *op_shape) = 0

Set the maximum shape for the specified op with a JSON string.

Note

Examples:

Json::Value max_shape_range_setting;
Json::Value op_max_val;
op_max_val["main"].append("1,3,512,512");
max_shape_range_setting.append(op_max_val);
Json::Value min_shape_range_setting;
Json::Value op_min_val;
op_min_val["main"].append("1,3,112,112");
min_shape_range_setting.append(op_min_val);
std::string max_setting_str = max_shape_range_setting.toStyledString();
std::string min_setting_str = min_shape_range_setting.toStyledString();
assert(optimizer_config->setMaxShapeRange(max_setting_str.c_str()) &&
    "[Error] set max shape range failed!");
assert(optimizer_config->setMinShapeRange(min_setting_str.c_str()) &&
    "[Error] set min shape range failed!");

Parameters

op_shape – Maximum shape settings in JSON format.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual bool setMinShapeRange(const char *op_shape) = 0

Set the minimum shape for the specified op with a JSON string.

See also

setMaxShapeRange.

Parameters

op_shape – Minimum shape settings in JSON format.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual bool setInt8Calibrator(ICalibrator *calibrator) = 0

Configure the calibrator.

Parameters

calibrator – a pointer to the calibrator.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual void setRefitPreprocess(bool refit_preprocess_flag) = 0

Set the refit preprocess mode.

Parameters

refit_preprocess_flag – If true, enable refit with preprocess.
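
A sketch of a typical optimizer configuration, assuming an optimizer from the IOptimizer section below; treating TIF_KTYPE_MIX_FP16 (the mode mentioned under ILayer::setPrecision) as a BuildFlag enumerator is an assumption, and the JSON options string is illustrative:

IOptimizerConfig *config = optimizer->getConfig();
// Assumption: TIF_KTYPE_MIX_FP16 is a BuildFlag enumerator selecting the
// FP32/FP16 mixed-precision mode.
config->setBuildFlag(TopsInference::BuildFlag::TIF_KTYPE_MIX_FP16);
config->setCompileOptions("{\"key\": \"value\"}");  // illustrative JSON options
config->setRefitPreprocess(true);                   // enable refit with preprocess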

class TopsInference::IEngineConfig : public TopsInference::IConfig
#include <TopsInferRuntime.h>

Config for engine.

Public Functions

virtual const char *getEngineVersion() = 0

Get the engine version.

Returns

const char* engine version.

virtual void setAutoBatchMode(bool auto_mode) = 0

Set the auto batch mode.

Parameters

auto_mode – If true, set auto batch mode.

virtual bool isAutoBatchMode() = 0

Get the auto batch mode.

Returns

bool.

Returns

  • true – Auto batch mode is enabled.

  • false – Auto batch mode is disabled.
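
A brief sketch, assuming an engine obtained as described in the IEngine section below:

IEngineConfig *config = engine->getConfig();
config->setAutoBatchMode(true);
if (config->isAutoBatchMode()) {
  // In auto batch mode, IEngine::run() must keep its stream nullptr.
}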

class TopsInference::ICalibratorConfig : public TopsInference::IConfig
#include <TopsInferRuntime.h>

Config for Calibrator.

Public Functions

virtual bool setOpPrecision(const char *op_name, DataType dtype) = 0

Set the op precision used for calibration.

Parameters
  • op_name – op name to be set.

  • dtype – the precision to set for the op.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual bool setOpCalibrateAlgo(const char *op_name, CalibrationAlgoType algo) = 0

Set the op algorithm used for calibration.

Parameters
  • op_name – op name to be set.

  • algo – the calibration algorithm to set for the op.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual bool setOpThreshold(const char *op_name, double thres_val) = 0

Set the op threshold for calibration.

Parameters
  • op_name – op name to be set.

  • thres_val – the threshold value to set for the op.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual bool enableQuantizeOps(const char **op_types, int32_t num) = 0

Set the op types to be quantized.

Parameters
  • op_types – op types to be set.

  • num – the number of op types.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual bool getQuantizeOps(char **op_types, int32_t *num) = 0

Get the op types that will be quantized.

Parameters
  • op_types – op types to be quantized.

  • num – the number of quantized op types.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual bool disableQuantizeOps(const char **op_types, int32_t num) = 0

Exclude the given op types from quantization.

Parameters
  • op_types – op types to be set.

  • num – the number of op types.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual DataType getOpPrecision(const char *op_name) = 0

Get the op precision set in calibration.

Parameters

op_name – op name.

Returns

DataType op precision.

virtual CalibrationAlgoType getOpCalibrateAlgo(const char *op_name) = 0

Get the op algorithm set in calibration.

Parameters

op_name – op name.

Returns

CalibrationAlgoType op algorithm.

virtual double getOpThreshold(const char *op_name) = 0

Get the op threshold set in calibration.

Parameters

op_name – op name.

Returns

double op threshold.
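
An illustrative sketch; the layer name "conv_1", the op type strings, and the threshold value are all hypothetical:

ICalibratorConfig *config = calibrator->getConfig();
config->setOpPrecision("conv_1", TopsInference::DataType::TIF_FP32);  // keep this op in FP32
config->setOpThreshold("conv_1", 6.0);                                // illustrative threshold

// Restrict quantization to certain op types.
const char *op_types[] = {"Conv", "MatMul"};
config->enableQuantizeOps(op_types, 2);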

2.5. ILayer

This section describes the TopsInference Layer information API.

class TopsInference::ILayer
#include <TopsInferRuntime.h>

Layer definition. Base class for all layer classes in a network definition.

Public Functions

virtual LayerType getType() = 0

Get the layer type.

See also

LayerType.

Returns

LayerType.

virtual void setName(const char *name) = 0

Set the layer name; takes effect at compile time.

Parameters

name – layer name.

virtual const char *getName() = 0

Get the layer name.

Returns

const char*.

virtual bool setPrecision(DataType dataType) = 0

Set the layer precision; takes effect at compile time. In TIF_KTYPE_MIX_FP16 mode, users can set TIF_FP32 or TIF_FP16. In TIF_KTYPE_INT8_MIX_FP32 mode, users can set TIF_FP32 or TIF_INT8.

Parameters

dataType – set DataType::TIF_FP32 or DataType::TIF_FP16 or DataType::TIF_INT8.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual DataType getPrecision() = 0

Get the layer precision.

Returns

DataType.

virtual bool isPrecisionSet() = 0

Check whether the layer precision is set or not.

Returns

bool whether the precision has been set.

virtual void resetPrecision() = 0

Reset the layer precision; takes effect at compile time.

2.6. INetwork

This section describes the TopsInference Network definition API.

class TopsInference::INetwork
#include <TopsInferRuntime.h>

Network definition. A network definition to the builder.

Public Functions

virtual int32_t getLayerNum() = 0

Get the layer number.

Returns

int32_t layer number.

virtual ILayer *getLayer(int32_t index) = 0

Get the layer according to the index.

Parameters

index – the index must be less than the layer number.

Returns

ILayer* Pointer to the ILayer.

virtual void dump() = 0

Dump the network information.

virtual ILayer **getLayer(const char *regex_str, int32_t *match_num) = 0

Get layers by layer name; fuzzy matching with a regular expression is also supported.

Parameters
  • regex_str – regex expression.

  • match_num – matched layer number.

Returns

ILayer** Array of matched ILayer pointers.
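
A minimal sketch of inspecting a parsed network and pinning matched layers to FP32; the regular expression "conv.*" is illustrative, and network is an INetwork obtained from IParser::readModel() (next section):

int32_t match_num = 0;
TopsInference::ILayer **layers = network->getLayer("conv.*", &match_num);
for (int32_t i = 0; i < match_num; i++) {
  layers[i]->setPrecision(TopsInference::DataType::TIF_FP32);  // takes effect at compile time
}
network->dump();  // print the network information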

2.7. IParser

This section describes the TopsInference Parser definition API.

class TopsInference::IParser
#include <TopsInferRuntime.h>

Parser definition. IParser is a compiler component that translates an ONNX model into a TopsInference network definition.

Public Functions

virtual INetwork *readModel(const char *model) = 0

Read a model file; ONNX files are supported.

See also

{INetwork}.

Parameters

model – model file.

Returns

INetwork*.

virtual INetwork *readModelFromStr(const char *model, uint32_t model_size) = 0

Read model string.

See also

{INetwork}.

Parameters
  • model – model string.

  • model_size – model size.

Returns

INetwork*.

virtual INetwork *readModelObj(const void *model_obj) = 0

Read a model object. This interface is not ABI compatible. Please use ABI-1 ONNX, compiled with _GLIBCXX_USE_CXX11_ABI=1, which is the default for gcc versions greater than 5.

See also

{INetwork}.

Parameters

model_obj – model object.

Returns

INetwork*.

virtual INetwork *getModel() = 0

Get the model.

Returns

INetwork*.

virtual void setInputNames(const char *val) = 0

Set the input names before reading the model. When there are multiple inputs, names are separated with a comma, such as “a,b”. If not set, the original properties of the network are used.

Parameters

val – input names.

virtual void setInputDtypes(const char *val) = 0

Set the input data types before reading the model. When there are multiple inputs, types are separated with a comma, such as “TIF_FP32,TIF_FP32”. Subnets of the network can be extracted after setting the inputs and outputs. If not set, the original properties of the network are used.

Parameters

val – input dtypes.

virtual void setInputShapes(const char *val) = 0

Set the input shapes before reading the model. When there are multiple inputs, shapes are separated with a colon, such as “3,4:3,4”. If not set, the original properties of the network are used.

Parameters

val – input shapes.

virtual void setOutputNames(const char *val) = 0

Set the output names before reading the model. When there are multiple outputs, names are separated with a comma, such as “a,b”. If not set, the original properties of the network are used.

Parameters

val – output names.

virtual void setOutputDtypes(const char *val) = 0

Set the output data types before reading the model. When there are multiple outputs, types are separated with a comma, such as “TIF_FP32,TIF_FP32”. Subnets of the network can be extracted after setting the inputs and outputs. If not set, the original properties of the network are used.

Parameters

val – output dtypes.

virtual IParserConfig *getConfig() = 0

Get the config pointer used for the parser; use the returned IParserConfig* to set options.

Returns

IParserConfig*.
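
A minimal sketch combining the setters above with the create_parser() call shown in the IErrorManager example; the model file name, input names, shapes, and dtypes are illustrative:

IParser *parser = create_parser(ParserType::TIF_ONNX);
parser->getConfig()->setSimplify(true);             // simplify while parsing
parser->setInputNames("input_a,input_b");           // multiple names: comma-separated
parser->setInputShapes("1,3,224,224:1,3,224,224");  // multiple shapes: colon-separated
parser->setInputDtypes("TIF_FP32,TIF_FP32");
INetwork *network = parser->readModel("model.onnx");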

2.8. IStream

This section describes the TopsInference Stream definition API.

using topsInferStream_t = IStream*
class TopsInference::IStream
#include <TopsInferRuntime.h>

Stream definition. Stream of asynchronous action.

Public Functions

virtual bool synchronize() = 0

Synchronize the stream; wait for all pending asynchronous actions in the stream to complete.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

2.9. IFuture

This section describes the TopsInference Future definition API.

class TopsInference::IFuture
#include <TopsInferRuntime.h>

IFuture is used for asynchronous inference; it describes whether the current output data is ready.

Public Functions

virtual void wait() = 0

Wait until the output data is ready.

virtual bool status() = 0

Get the status of the output data.

Returns

Returns true if the output data is ready; otherwise returns false.

2.10. IEngine

This section describes the TopsInference Engine definition API.

class TopsInference::IEngine
#include <TopsInferRuntime.h>

Executable definition. The serialized engine contains the necessary copies of the weights, the parser, and the network definition.

See also

IOptimizer.

Public Functions

virtual bool saveExecutable(const char *name) = 0

Save engine to file.

Parameters

name – Engine file.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual bool loadExecutable(const char *name) = 0

Load engine from file.

Parameters

name – Engine file.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual bool loadExecFromBuffer(const void *blob, std::size_t size) = 0

Deserialize an engine from buffer.

Parameters
  • blob – The memory that holds the serialized executable.

  • size – The size of the memory.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual Dims getInputShape(int32_t index) = 0

Get the input shape corresponding to the given index; the index argument must be less than the value of getInputNum(). If using dynamic shape mode, Dims may contain -1.

Parameters

index – Each engine may have several input nodes; the index-th input is queried.

Returns

Dims.

virtual Dims getOutputShape(int32_t index) = 0

Get the output shape corresponding to the given index, the index argument must be less than the value of getOutputNum().

Parameters

index – Each engine may have several output nodes; the index-th output is queried.

Returns

Dims.

virtual int32_t getInputNum() = 0

Get the input num.

Returns

int32_t.

virtual int32_t getOutputNum() = 0

Get the output num.

Returns

int32_t.

virtual DataType getInputDataType(int32_t index) = 0

Get the input data type corresponding to the given index, the index argument must be less than the value of getInputNum().

Parameters

index – Each engine may have several input nodes; the index-th input is queried.

Returns

DataType.

virtual DataType getOutputDataType(int32_t index) = 0

Get the output data type corresponding to the given index, the index argument must be less than the value of getOutputNum().

Parameters

index – Each engine may have several output nodes; the index-th output is queried.

Returns

DataType.

virtual Dims getMaxInputShape(uint32_t index) = 0

Get the maximum input shape by index; this interface is only used when the current engine’s index-th input has a dynamic shape. If used in static shape mode, it returns the static shape.

Parameters

index – The input id, the index of the input.

Returns

The maximum input dimension.

virtual Dims getMaxOutputShape(uint32_t index) = 0

Get the maximum output shape by index; this interface is only used when the current engine’s index-th output has a dynamic or unknown shape.

Parameters

index – The output id, the index of the output.

Returns

The maximum output shape; if the maximum input shape was not set when compiling, an error is raised.

virtual Dims getMinInputShape(uint32_t index) = 0

Get the minimum input shape by index; this interface is only used when the current engine’s index-th input has a dynamic shape. If used in static shape mode, it returns the static shape.

Parameters

index – The input id, the index of the input.

Returns

The minimum input dimension.

virtual IEngineConfig *getConfig() = 0

Get the IEngine config pointer; use the returned IEngineConfig* to set options.

Returns

IEngineConfig*.

virtual bool run(void **input, void **output, BufferType buf_type, topsInferStream_t stream = nullptr) = 0

Run on the specified device (cluster).

When running in auto batch mode, async mode is not supported for now, which means that you must keep stream nullptr.

See also

BufferType.

When doing inference with buf_type equal to IN_HOST_OUT_HOST, async mode is not supported for now, which means that you must keep stream nullptr.

Parameters
  • input – The input buffer bound for the current engine; data is arranged in column-major order.

  • output – The output buffer bound for the current engine; data is arranged in column-major order.

  • buf_type – engine run mode.

  • stream – when stream is not nullptr, run asynchronously.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual bool runWithBatch(std::size_t sample_num, void **inputs, void **outputs, BufferType buf_type, topsInferStream_t stream = nullptr, IFuture *future = nullptr) = 0

Run on the specified device (cluster) with a dynamic batch.

See also

BufferType.

Parameters
  • sample_num – the number of samples in the runtime batch.

  • inputs – The sample_num input buffers bound for the current engine; data is arranged in column-major order: [[sample1_input_1, sample2_input_1, … , sample_x_input_1], [sample1_input_2, sample2_input_2, … , sample_x_input_2], …].

  • outputs – The sample_num output buffers bound for the current engine; data is arranged in column-major order.

  • buf_type – engine run mode.

  • stream – when stream is not nullptr, run asynchronously.

  • future – Optional; use it when running in asynchronous mode and you want to know the current request status.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual TIFStatus runV2(IN TensorPtr_t *inputs, INOUT TensorPtr_t *outputs, topsInferStream_t stream = nullptr, IFuture *future = nullptr) = 0

Run on the specified device (cluster) with a dynamic batch.

See also

Status code.

Parameters
  • inputs – The input tensor list pointer for each input.

  • outputs – The output tensor list pointer for each output.

  • stream – When stream is not null, run with asynchronous mode, otherwise with synchronous mode.

  • future – Optional; keep it non-null when you want to know the current request status in asynchronous mode.

Returns

TIFStatus.

virtual bool runV3(IN TensorPtr_t *inputs, INOUT TensorPtr_t *outputs, topsStream_t stream = nullptr, IFuture *future = nullptr) = 0

Run on the specified device (cluster) with a dynamic batch; this interface can be mixed with rt3.0.

Parameters
  • inputs – The input tensor list pointer for each input.

  • outputs – The output tensor list pointer for each output.

  • stream – When stream is not null, run with asynchronous mode, otherwise with synchronous mode.

  • future – Optional; keep it non-null when you want to know the current request status in asynchronous mode.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual const char *getInputName(uint32_t index) = 0

Get the input name by index; the index argument must be less than the value of getInputNum(). The life cycle of the pointer is consistent with that of the engine; the value may change on reload or rebuild.

Parameters

index – The input id, the index of the input.

Returns

The index-th input name.

virtual const char *getOutputName(uint32_t index) = 0

Get the output name by index; the index argument must be less than the value of getOutputNum(). The life cycle of the pointer is consistent with that of the engine; the value may change on reload or rebuild.

Parameters

index – The output id, the index of the output.

Returns

The index-th output name.

virtual size_t getDeviceMemorySize() = 0

Get the device memory size required by the GCU runtime.

Returns

The engine’s required memory size. Returns 0 if the memory size cannot be obtained.

virtual bool shapeInfer(IN const TensorPtr_t *inputs, INOUT TensorPtr_t *outputs) = 0

Infer output shapes. If the model input has a dynamic shape, this API can help derive the output shape; if the shape is static, it returns the static output shape.

See also

TIF_SHAPE_INFER_FAILED

See also

TIF_SHAPE_INFER_INACCURATE

Parameters
  • inputs – The input tensor list pointer for each input.

  • outputs – The output tensor list pointer for each output; inferred shapes are written back to these tensors.

Returns

bool. Returns true on success and false otherwise; the two error types are TIF_SHAPE_INFER_FAILED and TIF_SHAPE_INFER_INACCURATE (see above).

inline virtual bool setIODimensionInfo(const int32_t *inputNIndices, const int32_t *outputNIndices)

Set the N-dimensional information for input and output tensors. If a dimension exists, specify the index of N in that dimension. If it doesn’t exist, set the index to -1. It is not thread-safe. If you do not set the N index information after creating the engine, the N index for both inputs and outputs defaults to 0, which means NHWC/NCHW/NWHC/NXXX formats.

Parameters
  • inputNIndices – The N-dimension index for each input tensor.

  • outputNIndices – The N-dimension index for each output tensor.

Returns

bool if success return true, otherwise return false.

inline virtual void setMaxWorkspaceSize(std::size_t workspaceSize) noexcept

Set the maximum workspace size.

Parameters

workspaceSize – The maximum GCU temporary memory which the engine can use at execution time.

inline virtual std::size_t getMaxWorkspaceSize() const noexcept

Get the maximum workspace size. By default the workspace size is the size of the total global memory on the device.

Returns

The maximum workspace size.
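
An end-to-end synchronous run sketch. create_engine() is a hypothetical factory for an empty IEngine, the file name and buffer sizes are illustrative, and scoping IN_HOST_OUT_HOST under BufferType is an assumption:

// create_engine() is hypothetical; obtain the IEngine from your integration.
IEngine *engine = create_engine();
if (!engine->loadExecutable("model.exec")) {
  // handle load failure
}

// One host buffer per input and per output; sizes are illustrative.
static float in_data[1 * 3 * 224 * 224];
static float out_data[1 * 1000];
void *inputs[] = {in_data};
void *outputs[] = {out_data};

// IN_HOST_OUT_HOST does not support async mode, so the stream stays nullptr.
bool ok = engine->run(inputs, outputs, TopsInference::BufferType::IN_HOST_OUT_HOST);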

2.11. IOptimizer

This section describes the TopsInference Optimizer definition API.

class TopsInference::IOptimizer
#include <TopsInferRuntime.h>

Optimizer definition. The optimizer performs a series of optimizations on the network’s layers.

Public Functions

virtual IEngine *build(INetwork *network) = 0

Build an engine from network.

See also

INetwork.

See also

IEngine.

Note

Examples:

TopsInference::IEngine* engine = optimizer->build(network);

Parameters

network – INetwork pointer.

Returns

IEngine* Pointer to the built IEngine.

virtual IOptimizerConfig *getConfig() = 0

Get the config pointer used for the optimizer; use the returned IOptimizerConfig* to set options.

See also

BuildFlag.

Returns

IOptimizerConfig*.
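
A build-and-save sketch; create_optimizer() is a hypothetical factory, and network is an INetwork from IParser::readModel():

// create_optimizer() is hypothetical; obtain the IOptimizer from your integration.
IOptimizer *optimizer = create_optimizer();
IEngine *engine = optimizer->build(network);
if (engine != nullptr) {
  engine->saveExecutable("model.exec");  // reload later with loadExecutable()
}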

2.12. ICalibrator

This section describes the TopsInference Calibrator definition API.

class TopsInference::ICalibrator

Subclassed by TopsInference::IInt8EntropyCalibrator, TopsInference::IInt8MaxMinCalibrator, TopsInference::IInt8MaxMinEMACalibrator, TopsInference::IInt8PercentCalibrator

Public Functions

virtual int32_t getBatchSize() const noexcept = 0

Get the batch size used for calibration batches.

Returns

The batch size.

virtual bool getBatch(TensorPtr_t data[], const char *names[], int32_t num) noexcept = 0

Get a batch of input for calibration. The batch size of the input must match the batch size returned by getBatchSize().

Parameters
  • data – An array of pointers to host memory containing each network input’s data.

  • names – The names of the network input for each pointer in the binding array.

  • num – The number of pointers in the binding array.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if there are no more batches for calibration.

virtual const void *readCalibrationCache(int64_t &length)

Load a calibration cache. Calibration is potentially expensive, so it can be useful to generate the calibration data once, then use it on subsequent builds of the network. The cache includes the regression cutoff and quantized values used to generate it, and will not be used if these do not match the settings of the current calibrator. However, the network should also be recalibrated if its structure changes or the input data set changes, and it is the responsibility of the application to ensure this.

Parameters

length – The length of the cached data. If there is no data, this should be zero.

Returns

A pointer to the cache, or nullptr if there is no data.

virtual bool writeCalibrationCache(const void *ptr, int64_t length)

Save a calibration cache.

Parameters
  • ptr – A pointer to the data to cache.

  • length – The length in bytes of the data to cache.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual CalibrationAlgoType getAlgorithm() noexcept = 0

Get the algorithm used by this calibrator.

Returns

CalibrationAlgoType The algorithm used by the calibrator.

virtual ICalibratorConfig *getConfig()

Get the config pointer used for calibrator.

Returns

ICalibratorConfig*.

class TopsInference::IInt8EntropyCalibrator : public TopsInference::ICalibrator

Public Functions

inline virtual CalibrationAlgoType getAlgorithm() noexcept override

Get the algorithm used by this calibrator.

Returns

CalibrationAlgoType The algorithm used by the calibrator.

class TopsInference::IInt8MaxMinCalibrator : public TopsInference::ICalibrator

Public Functions

inline virtual CalibrationAlgoType getAlgorithm() noexcept override

Get the algorithm used by this calibrator.

Returns

CalibrationAlgoType The algorithm used by the calibrator.

class TopsInference::IInt8MaxMinEMACalibrator : public TopsInference::ICalibrator

Public Functions

inline virtual CalibrationAlgoType getAlgorithm() noexcept override

Get the algorithm used by this calibrator.

Returns

CalibrationAlgoType The algorithm used by the calibrator.

class TopsInference::IInt8PercentCalibrator : public TopsInference::ICalibrator

Public Functions

inline virtual CalibrationAlgoType getAlgorithm() noexcept override

Get the algorithm used by this calibrator.

Returns

CalibrationAlgoType The algorithm used by the calibrator.

class IInt8Calibrator
#include <TopsInferRuntime.h>

Application-implemented interface for calibration.
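
A minimal sketch of an application-implemented calibrator built on IInt8MaxMinCalibrator (which already provides getAlgorithm()); the batch size is illustrative and the actual data loading is elided:

class MyCalibrator : public TopsInference::IInt8MaxMinCalibrator {
 public:
  int32_t getBatchSize() const noexcept override { return 8; }

  bool getBatch(TopsInference::TensorPtr_t data[], const char *names[],
                int32_t num) noexcept override {
    // Fill each binding in data with the next batch of host input;
    // return false once there are no more batches. (Loading elided.)
    return false;
  }

  const void *readCalibrationCache(int64_t &length) override {
    length = 0;      // no cached calibration data in this sketch
    return nullptr;
  }

  bool writeCalibrationCache(const void *ptr, int64_t length) override {
    // Persist the calibration cache here if desired. (Elided.)
    return true;
  }
};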

2.13. IRefitter

This section describes the TopsInference Refitter definition API.

class TopsInference::IRefitter
#include <TopsInferRuntime.h>

Updates weights in an engine.

Public Functions

virtual int32_t getAllWeights(int32_t size, const char **weightsNames) = 0

Get names of all weights that could be refit.

Parameters
  • size[in] The number of weights names that can be safely written to.

  • weightsNames[out] The names of the weights to be updated, or nullptr for unnamed weights.

Returns

The number of weights that could be refit. It should be called twice: the first call, getAllWeights(0, nullptr), returns the size; then allocate weightsNames with that size and pass size and weightsNames in to get all the weights.

virtual int32_t getMissingWeights(int32_t size, const char **weightsNames) = 0

Get names of missing weights. For example, if some Weights have been set, but the engine was optimized in a way that combines weights, any unsupplied Weights in the combination are considered missing.

Parameters
  • size[in] The number of weights names that can be safely written to.

  • weightsNames[out] The names of the weights to be updated, or nullptr for unnamed weights.

Returns

The number of missing weights. It should be called twice: the first call, getMissingWeights(0, nullptr), returns the size; then allocate weightsNames with that size and pass size and weightsNames in to get the missing weights.

virtual bool setNamedWeights(const char *name, Weights weights) = 0

Specify new weights of given name.

Parameters
  • name[in] The name of the weights to be updated.

  • weights[in] new weight.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if new weights are rejected.

virtual bool getNamedWeights(const char *name, Weights &weights) = 0

Obtain weights of given name.

See also

Weights.

Parameters
  • name[in] The name of the weights to be obtained.

  • weights[inout] The weights; must be initialized with the type.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual bool refitEngine() = 0

Updates associated engine.

Returns

bool.

Returns

  • true – Return true if succeed.

  • false – Return false if fail.

virtual bool isSupportPreprocess() = 0

Check whether the current refitter supports automatic preprocessing.
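
A sketch of the documented two-call pattern followed by a refit; create_refitter() is a hypothetical factory bound to an existing engine, and initializing Weights with its type before the query follows the note on getNamedWeights above:

// create_refitter() is hypothetical; obtain the IRefitter from your integration.
IRefitter *refitter = create_refitter(engine);

// Two-call pattern: first query the count, then fetch the names.
int32_t count = refitter->getAllWeights(0, nullptr);
const char **names = new const char *[count];
refitter->getAllWeights(count, names);

TopsInference::Weights weights;  // assumption: initialize with the proper type first
if (count > 0 && refitter->getNamedWeights(names[0], weights)) {
  // Modify the weight data here, then supply the update:
  refitter->setNamedWeights(names[0], weights);
}
refitter->refitEngine();  // apply the updates to the associated engine
delete[] names;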