2. Class¶
2.1. Dims32¶
This section describes the TopsInference Dimension information API.
2.2. IErrorManager¶
This section describes the TopsInference error manager information API.
-
class TopsInference::IErrorManager¶
- #include <TopsInferRuntime.h>
Error Manager for recording the internal errors.
Note
Examples:
IParser *onnx_parser = create_parser(ParserType::TIF_ONNX);
IErrorManager *error_manager = create_error_manager();
try {
  const char *model = "add.onnx";
  INetwork *network = onnx_parser->readModel(model);
  if (network == nullptr) {
    int32_t error_count = error_manager->getErrorCount();
    for (int32_t i = 0; i < error_count; i++) {
      const char *error_msg = error_manager->getErrorMsg(i);
      TIFStatus error_status = error_manager->getErrorStatus(i);
    }
    error_manager->clear();
  }
} catch (std::exception &e) {
  int32_t error_count = error_manager->getErrorCount();
  for (int32_t i = 0; i < error_count; i++) {
    const char *error_msg = error_manager->getErrorMsg(i);
    TIFStatus error_status = error_manager->getErrorStatus(i);
  }
  error_manager->clear();
}
Warning
This class is thread-safe.
Public Functions
-
virtual TIFStatus getErrorStatus(int32_t index) = 0¶
Get the Error Status.
See also
Status code
- Parameters
index – The index for queried error, index < getErrorCount()
- Returns
TIFStatus
-
virtual int32_t getErrorCount() = 0¶
Get the error count.
- Returns
int32_t The error count in the error manager
-
virtual const char *getErrorMsg(int32_t index) = 0¶
Get the error message.
- Parameters
index – The index for queried error, index < getErrorCount()
- Returns
const char* The error message
-
virtual bool reportStatus(int32_t index) = 0¶
Print the error status and message as a string.
- Parameters
index – The index for queried error, index < getErrorCount()
- Returns
bool
- Returns
true – Return true on success
false – Return false on failure
-
virtual bool clear() = 0¶
Clear all recorded errors; call this after finishing queries.
- Returns
bool
- Returns
true – Return true on success
false – Return false on failure
2.3. ITensor¶
This section describes the TopsInference ITensor API.
-
typedef class ITensor *TensorPtr_t¶
-
class TopsInference::ITensor¶
- #include <TopsInferRuntime.h>
ITensor information definition. The attributes of input and output tensors, including the origin data pointer, shape, device type, etc., are recorded in ITensor.
Public Functions
-
virtual bool setOpaque(void *opaque) = 0¶
Set the data pointer of the ITensor.
- Parameters
opaque – The data pointer. If using DeviceType as HOST, opaque represents a HOST memory pointer; if using DeviceType as DEVICE, opaque should be a pointer to DeviceMemory
-
virtual DataDeviceType getDeviceType() = 0¶
Get the device type of the ITensor buffer.
See also
DataDeviceType
- Returns
The device type of the ITensor buffer.
-
virtual void setDeviceType(DataDeviceType device_type) = 0¶
Set the device type of the ITensor buffer.
See also
DataDeviceType
- Parameters
device_type – The device type of the ITensor buffer.
-
virtual void release() = 0¶
Release the current ITensor. This interface must be called when you want to delete the ITensor.
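A minimal usage sketch (assumptions: `tensor` is a TensorPtr_t obtained from the runtime, whose creation API is not covered in this section, and HOST is the DataDeviceType enumerator implied by the descriptions above):
// Sketch only; the HOST enumerator name and tensor origin are assumptions.
std::vector<float> host_buf(1 * 3 * 224 * 224);           // host-side storage
tensor->setDeviceType(TopsInference::DataDeviceType::HOST);
tensor->setOpaque(host_buf.data());                       // HOST: a host memory pointer
// ... use the tensor for inference ...
tensor->release();                                        // required to delete the ITensor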
2.4. IConfig¶
This section describes the TopsInference Config API.
-
class TopsInference::IConfig¶
- #include <TopsInferRuntime.h>
Common config definition.
Subclassed by TopsInference::ICalibratorConfig, TopsInference::IEngineConfig, TopsInference::IOptimizerConfig, TopsInference::IParserConfig
Public Functions
-
virtual bool loadConfig(const char *proto) = 0¶
Load config from a protobuf file.
- Parameters
proto – The protobuf file name to load from
- Returns
bool
- Returns
true – Return true on success
false – Return false on failure
-
virtual bool saveConfig(const char *proto) = 0¶
Save config into a protobuf file.
- Parameters
proto – The protobuf file name to save to
- Returns
bool
- Returns
true – Return true on success
false – Return false on failure
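A hedged round-trip sketch, where the file name is illustrative and `config` stands for any IConfig subclass pointer (for example the IParserConfig* returned by IParser::getConfig()):
const char *proto_file = "tops_config.proto";             // illustrative path
if (!config->saveConfig(proto_file)) {
  // handle save failure
}
if (!config->loadConfig(proto_file)) {
  // handle load failure
}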
-
class TopsInference::IParserConfig : public TopsInference::IConfig¶
- #include <TopsInferRuntime.h>
Config for IParser.
Public Functions
-
virtual void setSimplify(bool simplify) = 0¶
Simplify the network when parsing.
- Parameters
simplify – If true, simplify the network
-
virtual bool getSimplify() = 0¶
Check the simplify status in parser.
- Returns
bool
- Returns
true – Simplification is enabled
false – Simplification is disabled
-
class TopsInference::IOptimizerConfig : public TopsInference::IConfig¶
- #include <TopsInferRuntime.h>
Config for Optimizer.
Public Functions
-
virtual bool setCompileOptions(const char *options) = 0¶
Set the Compile Options.
- Parameters
options – Compile options in JSON format
- Returns
bool
- Returns
true – Return true on success
false – Return false on failure
-
virtual const char *getCompileOptions() const = 0¶
Get the compile options.
- Returns
const char* Current compile options
-
virtual bool setBuildFlag(BuildFlag flag) = 0¶
Set the build flag.
See also
BuildFlag
- Parameters
flag – The build flag to set
- Returns
bool
- Returns
true – Return true on success
false – Return false on failure
-
virtual bool setBuildFlag(int64_t flag) = 0¶
Set the build flag.
- Parameters
flag – A value formed by composing (OR-ing) multiple BuildFlag values
- Returns
bool
- Returns
true – Return true on success
false – Return false on failure
-
virtual int64_t getBuildFlag() = 0¶
Get the build flag.
- Returns
int64_t BuildFlag
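A short sketch of composing flags for the int64_t overload; treating TIF_KTYPE_MIX_FP16 (mentioned in section 2.5) as a BuildFlag enumerator is an assumption:
// Compose multiple BuildFlag values into a single int64_t.
int64_t flags = static_cast<int64_t>(TopsInference::BuildFlag::TIF_KTYPE_MIX_FP16); // assumed enumerator
// flags |= static_cast<int64_t>(another_build_flag);     // OR in further flags as needed
optimizer_config->setBuildFlag(flags);
int64_t current = optimizer_config->getBuildFlag();       // read back the composed value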
-
virtual bool setMaxShapeRange(const char *op_shape) = 0¶
Set the maximum shape dimensions for the specified op with a JSON string.
Note
Examples:
Json::Value max_shape_range_setting;
Json::Value op_max_val;
op_max_val["main"].append("1,3,512,512");
max_shape_range_setting.append(op_max_val);
Json::Value min_shape_range_setting;
Json::Value op_min_val;
op_min_val["main"].append("1,3,112,112");
min_shape_range_setting.append(op_min_val);
std::string max_setting_str = max_shape_range_setting.toStyledString();
std::string min_setting_str = min_shape_range_setting.toStyledString();
assert(optimizer_config->setMaxShapeRange(max_setting_str.c_str()) && "[Error] set max shape range failed!");
assert(optimizer_config->setMinShapeRange(min_setting_str.c_str()) && "[Error] set min shape range failed!");
- Parameters
op_shape – is json format
- Returns
bool
- Returns
true – Return true on success
false – Return false on failure
-
virtual bool setMinShapeRange(const char *op_shape) = 0¶
Set the minimum shape dimensions for the specified op with a JSON string.
See also
setMaxShapeRange
- Parameters
op_shape – is json format
- Returns
bool
- Returns
true – Return true on success
false – Return false on failure
-
virtual bool setInt8Calibrator(ICalibrator *calibrator) = 0¶
Configure the calibrator.
- Parameters
calibrator – A pointer to the calibrator
- Returns
bool
- Returns
true – Return true on success
false – Return false on failure
-
virtual void setRefitPreprocess(bool refit_preprocess_flag) = 0¶
Set the refit preprocess mode.
- Parameters
refit_preprocess_flag – If true, enable refit with preprocess.
-
class TopsInference::IEngineConfig : public TopsInference::IConfig¶
- #include <TopsInferRuntime.h>
Config for engine.
Public Functions
-
virtual const char *getEngineVersion() = 0¶
Get the engine version.
- Returns
const char* engine version
-
virtual void setAutoBatchMode(bool auto_mode) = 0¶
Set the auto batch mode.
- Parameters
auto_mode – If true, set auto batch mode
-
virtual bool isAutoBatchMode() = 0¶
Get the auto batch mode.
- Returns
bool
- Returns
true – Config is auto batch mode
false – Config is not auto batch mode
-
class TopsInference::ICalibratorConfig : public TopsInference::IConfig¶
- #include <TopsInferRuntime.h>
Config for Calibrator.
Public Functions
-
virtual bool setOpPrecision(const char *op_name, DataType dtype) = 0¶
Set the op precision used for calibration.
- Parameters
op_name – The op name to be set
dtype – Set the op to dtype precision
- Returns
bool
- Returns
true – Return true on success
false – Return false on failure
-
virtual bool setOpCalibrateAlgo(const char *op_name, CalibrationAlgoType algo) = 0¶
Set the op algorithm used for calibration.
- Parameters
op_name – The op name to be set
algo – Set the op to the algo algorithm
- Returns
bool
- Returns
true – Return true on success
false – Return false on failure
-
virtual bool setOpThreshold(const char *op_name, double thres_val) = 0¶
Set the op threshold for calibration.
- Parameters
op_name – The op name to be set
thres_val – Set the op threshold to thres_val
- Returns
bool
- Returns
true – Return true on success
false – Return false on failure
-
virtual bool enableQuantizeOps(const char **op_types, int32_t num) = 0¶
Set the op types to be quantized.
- Parameters
op_types – The op types to be set
num – The number of op types in op_types
- Returns
bool
- Returns
true – Return true on success
false – Return false on failure
-
virtual bool getQuantizeOps(char **op_types, int32_t *num) = 0¶
Get the op types which will be quantized.
- Parameters
op_types – The op types to be quantized
num – The number of quantized op types
- Returns
bool
- Returns
true – Return true on success
false – Return false on failure
-
virtual bool disableQuantizeOps(const char **op_types, int32_t num) = 0¶
Exclude the op types which would otherwise be quantized.
- Parameters
op_types – The op types to be set
num – The number of op types in op_types
- Returns
bool
- Returns
true – Return true on success
false – Return false on failure
-
virtual DataType getOpPrecision(const char *op_name) = 0¶
Get the op precision set in calibration.
- Parameters
op_name – op name
- Returns
DataType op precision
-
virtual CalibrationAlgoType getOpCalibrateAlgo(const char *op_name) = 0¶
Get the op algorithm set in calibration.
- Parameters
op_name – op name
- Returns
CalibrationAlgoType op algorithm
-
virtual double getOpThreshold(const char *op_name) = 0¶
Get the op threshold set in calibration.
- Parameters
op_name – op name
- Returns
double op threshold
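A hedged sketch of per-op calibration tuning; the op name "conv_1", the threshold value, and the op-type list are illustrative values:
ICalibratorConfig *calib_config = calibrator->getConfig();
calib_config->setOpPrecision("conv_1", TopsInference::DataType::TIF_FP16);
calib_config->setOpThreshold("conv_1", 6.0);              // illustrative threshold
double thres = calib_config->getOpThreshold("conv_1");    // read back: 6.0
CalibrationAlgoType algo = calib_config->getOpCalibrateAlgo("conv_1");
const char *quant_ops[] = {"Conv", "MatMul"};             // illustrative op types
calib_config->enableQuantizeOps(quant_ops, 2);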
2.5. ILayer¶
This section describes the TopsInference Layer information API.
-
class TopsInference::ILayer¶
- #include <TopsInferRuntime.h>
Layer definition. Base class for all layer classes in a network definition.
Public Functions
-
virtual void setName(const char *name) = 0¶
Set the layer name; takes effect at compile time.
- Parameters
name – layer name
-
virtual const char *getName() = 0¶
Get the layer name.
- Returns
const char*
-
virtual bool setPrecision(DataType dataType) = 0¶
Set the layer precision; takes effect at compile time. In TIF_KTYPE_MIX_FP16 mode, the user can set TIF_FP32 or TIF_FP16. In TIF_KTYPE_INT8_MIX_FP32 mode, the user can set TIF_FP32 or TIF_INT8.
- Parameters
dataType – set DataType::TIF_FP32 or DataType::TIF_FP16 or DataType::TIF_INT8
- Returns
bool
- Returns
true – Return true on success
false – Return false on failure
-
virtual bool isPrecisionSet() = 0¶
Check whether the layer precision is set or not.
- Returns
bool whether the precision has been set
-
virtual void resetPrecision() = 0¶
Reset the layer precision; takes effect at compile time.
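A minimal sketch of per-layer precision control in TIF_KTYPE_MIX_FP16 mode; the layer index and name are illustrative, and layer lookup uses INetwork::getLayer from the next section:
ILayer *layer = network->getLayer(0);                     // illustrative index
layer->setName("first_layer");                            // takes effect at compile time
if (!layer->isPrecisionSet()) {
  layer->setPrecision(TopsInference::DataType::TIF_FP32); // keep this layer in FP32
}
// layer->resetPrecision();                               // undo the override if needed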
2.6. INetwork¶
This section describes the TopsInference Network definition API.
-
class TopsInference::INetwork¶
- #include <TopsInferRuntime.h>
Network definition. A network definition provided to the builder.
Public Functions
-
virtual int32_t getLayerNum() = 0¶
Get the layer number.
- Returns
int32_t layer number
-
virtual ILayer *getLayer(int32_t index) = 0¶
Get the layer according to the index.
- Parameters
index – The index must be less than the layer number
- Returns
ILayer* The ILayer pointer
-
virtual void dump() = 0¶
Dump the network information.
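A short iteration sketch over a parsed network:
int32_t layer_num = network->getLayerNum();
for (int32_t i = 0; i < layer_num; i++) {
  ILayer *layer = network->getLayer(i);                   // index < layer_num
  const char *name = layer->getName();
  // inspect or configure the layer here
}
network->dump();                                          // print the network information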
2.7. IParser¶
This section describes the TopsInference Parser definition API.
-
class TopsInference::IParser¶
- #include <TopsInferRuntime.h>
Parser definition. IParser is a compiler component that translates an ONNX model into a TopsInference network definition.
Public Functions
-
virtual INetwork *readModel(const char *model) = 0¶
Read a model file; ONNX files are supported.
See also
INetwork
- Parameters
model – model file
- Returns
INetwork*
-
virtual INetwork *readModelFromStr(const char *model, uint32_t model_size) = 0¶
Read model string.
See also
INetwork
- Parameters
model – model string
model_size – model size
- Returns
INetwork*
-
virtual INetwork *readModelObj(const void *model_obj) = 0¶
Read a model object. This interface is not ABI compatible. Please use an abi1 ONNX build, compiled with _GLIBCXX_USE_CXX11_ABI=1, which is the default for gcc versions greater than 5.
See also
INetwork
- Parameters
model_obj – The model object
- Returns
INetwork*
-
virtual void setInputNames(const char *val) = 0¶
Set the input names before reading the model. When there are multiple inputs, names are separated with a comma, such as “a,b”. If not set, the original properties of the network are used.
- Parameters
val – input names
-
virtual void setInputDtypes(const char *val) = 0¶
Set the input data types before reading the model. When there are multiple inputs, types are separated with a comma, such as “TIF_FP32,TIF_FP32”. Subnets of the network can be extracted after setting the inputs and outputs. If not set, the original properties of the network are used.
- Parameters
val – input dtypes
-
virtual void setInputShapes(const char *val) = 0¶
Set the input shapes before reading the model. When there are multiple inputs, shapes are separated with a colon, such as “3,4:3,4”. If not set, the original properties of the network are used.
- Parameters
val – input shapes
-
virtual void setOutputNames(const char *val) = 0¶
Set the output names before reading the model. When there are multiple outputs, names are separated with a comma, such as “a,b”. If not set, the original properties of the network are used.
- Parameters
val – output names
-
virtual void setOutputDtypes(const char *val) = 0¶
Set the output data types before reading the model. When there are multiple outputs, types are separated with a comma, such as “TIF_FP32,TIF_FP32”. Subnets of the network can be extracted after setting the inputs and outputs. If not set, the original properties of the network are used.
- Parameters
val – output dtypes
-
virtual IParserConfig *getConfig() = 0¶
Get the config pointer used for the parser; use the returned IParserConfig* to set options.
- Returns
IParserConfig*
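A hedged parsing sketch; the input names, dtypes, shapes, and model path are illustrative values:
IParser *parser = create_parser(ParserType::TIF_ONNX);
IParserConfig *parser_config = parser->getConfig();
parser_config->setSimplify(true);                         // simplify while parsing
parser->setInputNames("input_a,input_b");                 // comma-separated names
parser->setInputDtypes("TIF_FP32,TIF_FP32");              // comma-separated dtypes
parser->setInputShapes("1,3,224,224:1,3,224,224");        // colon-separated shapes
INetwork *network = parser->readModel("model.onnx");
if (network == nullptr) {
  // query the IErrorManager as shown in section 2.2
}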
2.8. IStream¶
This section describes the TopsInference Stream definition API.
-
using topsInferStream_t = IStream*¶
-
class TopsInference::IStream¶
- #include <TopsInferRuntime.h>
Stream definition. A stream of asynchronous actions.
Public Functions
-
virtual bool synchronize() = 0¶
Synchronize the stream; wait for any pending asynchronous actions in the stream.
- Returns
bool
- Returns
true – Return true on success
false – Return false on failure
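A hedged synchronization sketch; create_stream() is a hypothetical factory, since stream creation is documented elsewhere:
topsInferStream_t stream = create_stream();               // hypothetical factory
// enqueue asynchronous work on the stream, e.g. IEngine::run(..., stream)
if (!stream->synchronize()) {                             // wait for pending actions
  // handle synchronization failure
}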
2.9. IFuture¶
This section describes the TopsInference Future definition API.
2.10. IEngine¶
This section describes the TopsInference Engine definition API.
-
class TopsInference::IEngine¶
- #include <TopsInferRuntime.h>
Executable definition. The serialized engine contains the necessary copies of the weights, the parser, and the network definition.
Public Functions
-
virtual bool saveExecutable(const char *name) = 0¶
Save engine to file.
- Parameters
name – Engine file
- Returns
bool
- Returns
true – Return true on success
false – Return false on failure
-
virtual bool loadExecutable(const char *name) = 0¶
Load engine from file.
- Parameters
name – Engine file
- Returns
bool
- Returns
true – Return true on success
false – Return false on failure
-
virtual bool loadExecFromBuffer(const void *blob, std::size_t size) = 0¶
Deserialize an engine from buffer.
- Parameters
blob – The memory that holds the serialized executable
size – The size of the memory
- Returns
bool
- Returns
true – Return true on success
false – Return false on failure
-
virtual Dims getInputShape(int32_t index) = 0¶
Get the configured input shape corresponding to the given index; the index argument must be less than the value of getInputNum(). If using dynamic shape mode, Dims may contain -1.
- Parameters
index – Each engine may have several input nodes; the index-th input
- Returns
Dims
-
virtual Dims getOutputShape(int32_t index) = 0¶
Get the output shape corresponding to the given index; the index argument must be less than the value of getOutputNum().
- Parameters
index – Each engine may have several output nodes; the index-th output
- Returns
Dims
-
virtual int32_t getInputNum() = 0¶
Get the input num.
- Returns
int32_t
-
virtual int32_t getOutputNum() = 0¶
Get the output num.
- Returns
int32_t
-
virtual DataType getInputDataType(int32_t index) = 0¶
Get the input data type corresponding to the given index; the index argument must be less than the value of getInputNum().
- Parameters
index – Each engine may have several input nodes; the index-th input
- Returns
DataType
-
virtual DataType getOutputDataType(int32_t index) = 0¶
Get the output data type corresponding to the given index; the index argument must be less than the value of getOutputNum().
- Parameters
index – Each engine may have several output nodes; the index-th output
- Returns
DataType
-
virtual Dims getMaxInputShape(uint32_t index) = 0¶
Get the maximum input shape by index; this interface is only used when the current engine’s index-th input has a dynamic shape. If used in static shape mode, it will return the static shape.
- Parameters
index – The input id, the index of the input
- Returns
The maximum input dimension
-
virtual Dims getMaxOutputShape(uint32_t index) = 0¶
Get the maximum output shape by index; this interface is only used when the current engine’s index-th output has a dynamic or unknown shape.
- Parameters
index – The output id, the index of the output
- Returns
The maximum output shape. If the maximum input shape has not been set at compile time, an error will be raised
-
virtual Dims getMinInputShape(uint32_t index) = 0¶
Get the minimum input shape by index; this interface is only used when the current engine’s index-th input has a dynamic shape. If used in static shape mode, it will return the static shape.
- Parameters
index – The input id, the index of the input
- Returns
The minimum input dimension.
-
virtual IEngineConfig *getConfig() = 0¶
Get the IEngine config pointer; use the returned IEngineConfig* to set options.
- Returns
IEngineConfig*
-
virtual bool run(void **input, void **output, BufferType buf_type, topsInferStream_t stream = nullptr) = 0¶
Run with a specified device (cluster).
When running in auto batch mode, async mode is not supported yet; you must keep stream nullptr.
See also
BufferType
When doing inference with buf_type equal to IN_HOST_OUT_HOST, async mode is not supported yet; you must keep stream nullptr.
- Parameters
input – The input buffer bound for the current engine; data is arranged in column-major order
output – The output buffer bound for the current engine; data is arranged in column-major order
buf_type – Engine run mode
stream – When stream is not nullptr, run asynchronously
- Returns
bool
- Returns
true – Return true on success
false – Return false on failure
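A hedged synchronous inference sketch with host buffers; buffer allocation is elided, and scoping IN_HOST_OUT_HOST under BufferType is an assumption based on the note above:
int32_t in_num = engine->getInputNum();
int32_t out_num = engine->getOutputNum();
std::vector<void *> inputs(in_num), outputs(out_num);
// allocate each buffer from getInputShape(i)/getOutputShape(i) and
// getInputDataType(i)/getOutputDataType(i) before running
bool ok = engine->run(inputs.data(), outputs.data(),
                      TopsInference::BufferType::IN_HOST_OUT_HOST, // enumerator assumed
                      nullptr);                           // host buffers: stream must be nullptr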
-
virtual bool runWithBatch(std::size_t sample_num, void **inputs, void **outputs, BufferType buf_type, topsInferStream_t stream = nullptr, IFuture *future = nullptr) = 0¶
Run with a specified device (cluster) and dynamic batch.
See also
BufferType
- Parameters
sample_num – The sample number of the runtime batch size
inputs – The sample_num input buffers bound for the current engine; data is arranged in column-major order: [[sample1_input_1, sample2_input_1, … , sample_x_input_1], [sample1_input_2, sample2_input_2, … , sample_x_input_2], …
outputs – The sample_num output buffers bound for the current engine; data is arranged in column-major order
buf_type – Engine run mode
stream – When stream is not nullptr, run asynchronously
future – Future is an optional parameter; when you are using asynchronous mode and want to know the current request status, please use it
- Returns
bool
- Returns
true – Return true on success
false – Return false on failure
-
virtual TIFStatus runV2(IN TensorPtr_t *inputs, INOUT TensorPtr_t *outputs, topsInferStream_t stream = nullptr, IFuture *future = nullptr) = 0¶
Run with a specified device (cluster) and dynamic batch.
See also
Status code
- Parameters
inputs – The input tensor list pointer for each input
outputs – The output tensor list pointer for each output
stream – When stream is not null, run with asynchronous mode, otherwise with synchronous mode
future – Future is an optional parameter; please keep it non-null when you want to know the current request status in asynchronous mode
- Returns
TIFStatus
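A hedged runV2 sketch; create_tensor() is a hypothetical factory for TensorPtr_t, and input_buffer/output_buffer are host buffers prepared elsewhere:
TensorPtr_t in_tensors[1], out_tensors[1];
in_tensors[0] = create_tensor();                          // hypothetical factory
in_tensors[0]->setDeviceType(TopsInference::DataDeviceType::HOST); // enumerator assumed
in_tensors[0]->setOpaque(input_buffer);                   // host pointer prepared earlier
out_tensors[0] = create_tensor();
out_tensors[0]->setOpaque(output_buffer);
TIFStatus status = engine->runV2(in_tensors, out_tensors, nullptr, nullptr); // synchronous
in_tensors[0]->release();
out_tensors[0]->release();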
-
virtual bool runV3(IN TensorPtr_t *inputs, INOUT TensorPtr_t *outputs, topsStream_t stream = nullptr, IFuture *future = nullptr) = 0¶
Run with a specified device (cluster) and dynamic batch; this interface can be mixed with rt3.0.
- Parameters
inputs – The input tensor list pointer for each input
outputs – The output tensor list pointer for each output
stream – When stream is not null, run with asynchronous mode, otherwise with synchronous mode
future – Future is an optional parameter; please keep it non-null when you want to know the current request status in asynchronous mode
- Returns
bool
- Returns
true – Return true on success
false – Return false on failure
-
virtual const char *getInputName(uint32_t index) = 0¶
Get the input name by index. The index argument must be less than the value of getInputNum(). The life cycle of the pointer is consistent with that of the engine; the value may change on reload or rebuild.
- Parameters
index – The input id, the index of the input
- Returns
The index-th input name.
-
virtual const char *getOutputName(uint32_t index) = 0¶
Get the output name by index. The index argument must be less than the value of getOutputNum(). The life cycle of the pointer is consistent with that of the engine; the value may change on reload or rebuild.
- Parameters
index – The output id, the index of the output
- Returns
The index-th output name.
-
virtual size_t getDeviceMemorySize() = 0¶
Get the GCU device memory size required at runtime.
- Returns
The engine’s memory size required. Returns 0 if it fails to get the memory size.
-
virtual bool shapeInfer(IN const TensorPtr_t *inputs, INOUT TensorPtr_t *outputs) = 0¶
Infer the output shapes. If the model input has a dynamic shape, this API can help derive the output shape.
- Parameters
inputs – The input tensor list pointer for each input
outputs – The output tensor list pointer for each output
- Returns
bool Return true on success, otherwise false
-
inline virtual bool setIODimensionInfo(const int32_t *inputNIndices, const int32_t *outputNIndices)¶
Set the N-dimension information for input and output tensors. If the N dimension exists, specify the index of N in that dimension; if it does not exist, set the index to -1. This interface is not thread-safe. If you do not set the N index information after creating the engine, the N index for both inputs and outputs defaults to 0, meaning NHWC/NCHW/NWHC/NXXX formats.
- Parameters
inputNIndices – The N-dimension index for each input tensor
outputNIndices – The N-dimension index for each output tensor
- Returns
bool Return true on success, otherwise false
-
inline virtual void setMaxWorkspaceSize(std::size_t workspaceSize) noexcept¶
Set the maximum workspace size.
- Parameters
workspaceSize – The maximum GCU temporary memory which the engine can use at execution time.
-
inline virtual std::size_t getMaxWorkspaceSize() const noexcept¶
Get the maximum workspace size. By default the workspace size is the size of total global memory in the device.
- Returns
The maximum workspace size.
2.11. IOptimizer¶
This section describes the TopsInference Optimizer definition API.
-
class TopsInference::IOptimizer¶
- #include <TopsInferRuntime.h>
Optimizer definition. The optimizer will perform a series of optimizations on the layers.
Public Functions
-
virtual IEngine *build(INetwork *network) = 0¶
Build an engine from a network.
Note
Examples:
TopsInference::IEngine* engine = optimizer->build(network);
-
virtual IOptimizerConfig *getConfig() = 0¶
Get the config pointer used for the optimizer; use the returned IOptimizerConfig* to set options.
See also
BuildFlag
- Returns
IOptimizerConfig*
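A hedged build-flow sketch; create_optimizer() is a hypothetical factory, the BuildFlag enumerator is assumed, and the engine file name is illustrative:
IOptimizer *optimizer = create_optimizer();               // hypothetical factory
IOptimizerConfig *optimizer_config = optimizer->getConfig();
optimizer_config->setBuildFlag(TopsInference::BuildFlag::TIF_KTYPE_MIX_FP16); // assumed enumerator
TopsInference::IEngine *engine = optimizer->build(network);
engine->saveExecutable("model.exec");                     // persist for later loading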
2.12. ICalibrator¶
This section describes the TopsInference Calibrator definition API.
-
class TopsInference::ICalibrator¶
Subclassed by TopsInference::IInt8EntropyCalibrator, TopsInference::IInt8MaxMinCalibrator, TopsInference::IInt8MaxMinEMACalibrator, TopsInference::IInt8PercentCalibrator
Public Functions
-
virtual int32_t getBatchSize() const noexcept = 0¶
Get the batch size used for calibration batches.
- Returns
The batch size
-
virtual bool getBatch(TensorPtr_t data[], const char *names[], int32_t num) noexcept = 0¶
Get a batch of input for calibration. The batch size of the input must match the batch size returned by getBatchSize().
- Parameters
data – An array of pointers to host memory containing each network input’s data
names – The names of the network inputs for each pointer in the binding array
num – The number of pointers in the binding array
- Returns
bool
- Returns
true – Return true on success
false – Return false if there are no more batches for calibration
-
virtual const void *readCalibrationCache(int64_t &length)¶
Load a calibration cache. Calibration is potentially expensive, so it can be useful to generate the calibration data once, then use it on subsequent builds of the network. The cache includes the regression cutoff and quantized values used to generate it, and will not be used if these do not match the settings of the current calibrator. However, the network should also be recalibrated if its structure changes or the input data set changes, and it is the responsibility of the application to ensure this.
- Parameters
length – The length of the cached data. If there is no data, this should be zero
- Returns
A pointer to the cache, or nullptr if there is no data
-
virtual bool writeCalibrationCache(const void *ptr, int64_t length)¶
Save a calibration cache.
- Parameters
ptr – A pointer to the data to cache
length – The length in bytes of the data to cache
- Returns
bool
- Returns
true – Return true on success
false – Return false on failure
-
virtual CalibrationAlgoType getAlgorithm() noexcept = 0¶
Get the algorithm used by this calibrator.
- Returns
CalibrationAlgoType The algorithm used by the calibrator.
-
virtual ICalibratorConfig *getConfig()¶
Get the config pointer used for calibrator.
- Returns
ICalibratorConfig*
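A hedged sketch of an application-side calibrator built on IInt8EntropyCalibrator (which already supplies getAlgorithm); the batch size and data-feeding logic are illustrative:
class MyCalibrator : public TopsInference::IInt8EntropyCalibrator {
 public:
  int32_t getBatchSize() const noexcept override { return 8; } // illustrative
  bool getBatch(TensorPtr_t data[], const char *names[], int32_t num) noexcept override {
    // Fill each named binding with the next batch of host data;
    // return false once no more calibration batches remain.
    return false;
  }
  const void *readCalibrationCache(int64_t &length) override {
    length = 0;                                           // no cache available
    return nullptr;
  }
  bool writeCalibrationCache(const void *ptr, int64_t length) override {
    return true;                                          // e.g. persist the cache to a file
  }
};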
-
class TopsInference::IInt8EntropyCalibrator : public TopsInference::ICalibrator¶
Public Functions
-
inline virtual CalibrationAlgoType getAlgorithm() noexcept override¶
Get the algorithm used by this calibrator.
- Returns
CalibrationAlgoType The algorithm used by the calibrator.
-
class TopsInference::IInt8MaxMinCalibrator : public TopsInference::ICalibrator¶
Public Functions
-
inline virtual CalibrationAlgoType getAlgorithm() noexcept override¶
Get the algorithm used by this calibrator.
- Returns
CalibrationAlgoType The algorithm used by the calibrator.
-
class TopsInference::IInt8MaxMinEMACalibrator : public TopsInference::ICalibrator¶
Public Functions
-
inline virtual CalibrationAlgoType getAlgorithm() noexcept override¶
Get the algorithm used by this calibrator.
- Returns
CalibrationAlgoType The algorithm used by the calibrator.
-
class TopsInference::IInt8PercentCalibrator : public TopsInference::ICalibrator¶
Public Functions
-
inline virtual CalibrationAlgoType getAlgorithm() noexcept override¶
Get the algorithm used by this calibrator.
- Returns
CalibrationAlgoType The algorithm used by the calibrator.
-
class IInt8Calibrator¶
- #include <TopsInferRuntime.h>
Application-implemented interface for calibration.
2.13. IRefitter¶
This section describes the TopsInference Refitter definition API.
-
class TopsInference::IRefitter¶
- #include <TopsInferRuntime.h>
Updates weights in an engine.
Public Functions
-
virtual int32_t getAllWeights(int32_t size, const char **weightsNames) = 0¶
Get names of all weights that could be refit.
- Parameters
size – [in] The number of weights names that can be safely written to.
weightsNames – [out] The names of the weights to be updated, or nullptr for unnamed weights.
- Returns
The number of weights that could be refit. It should be called twice: the first call, getAllWeights(0, nullptr), returns the size; then allocate weightsNames with that size and pass size and weightsNames in again to get all the weights
-
virtual int32_t getMissingWeights(int32_t size, const char **weightsNames) = 0¶
Get names of missing weights. For example, if some Weights have been set, but the engine was optimized in a way that combines weights, any unsupplied Weights in the combination are considered missing.
- Parameters
size – [in] The number of weights names that can be safely written to.
weightsNames – [out] The names of the weights to be updated, or nullptr for unnamed weights.
- Returns
The number of missing weights. It should be called twice: the first call, getMissingWeights(0, nullptr), returns the size; then allocate weightsNames with that size and pass size and weightsNames in again to get the missing weights
-
virtual bool setNamedWeights(const char *name, Weights weights) = 0¶
Specify new weights of given name.
- Parameters
name – [in] The name of the weights to be updated
weights – [in] new weight
- Returns
bool
- Returns
true – Return true on success
false – Return false if the new weights are rejected
-
virtual bool getNamedWeights(const char *name, Weights &weights) = 0¶
Obtain weights of given name.
- Parameters
name – [in] The name of the weights to be obtained
weights – [inout] The weights; must be initialized with the correct type
- Returns
bool
- Returns
true – Return true on success
false – Return false on failure
-
virtual bool refitEngine() = 0¶
Updates the associated engine.
- Returns
bool
- Returns
true – Return true on success
false – Return false on failure
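A hedged sketch of the documented call-twice pattern followed by a refit; obtaining the IRefitter and the Weights field layout are outside this section, so treat those parts as assumptions:
int32_t n = refitter->getAllWeights(0, nullptr);          // first call: query the count
std::vector<const char *> names(n);
refitter->getAllWeights(n, names.data());                 // second call: fill the names
Weights new_w{};                                          // fields assumed; initialize type/values/count
if (!refitter->setNamedWeights(names[0], new_w)) {
  // the new weights were rejected
}
if (!refitter->refitEngine()) {
  // handle refit failure
}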
-
virtual bool isSupportPreprocess() = 0¶
Check whether the current refitter supports automatic preprocessing.