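5.2. TopsIDEAS gcu debug
Description
When a whole-model run on the GCU mismatches, this tool compares results layer by layer against onnxruntime-cpu, traversing the graph to find the smallest subgraph on which the GCU computes an incorrect result.
Note: this tool can take a long time when the model is large.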
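The search for a smallest failing subgraph can be pictured with a small sketch. It loosely follows the 'linear' mode described in the parameter table (grow the subgraph from the input until the check fails, then drop nodes from the input side until it passes). This is an illustration under strong assumptions, not the tool's implementation: the graph is simplified to a plain chain of node names, and `find_min_failing_window` and the `passes` callback are hypothetical names standing in for "run the subgraph on both backends and compare".

```python
from typing import Callable, Sequence, Tuple


def find_min_failing_window(
    nodes: Sequence[str],
    passes: Callable[[Sequence[str]], bool],
) -> Tuple[int, int]:
    """Return (start, end) such that nodes[start:end] fails the check.

    Hypothetical helper, not a TopsIDEAS API. `passes(subgraph)` stands in
    for running the subgraph on both backends and comparing the results.
    Returns (0, 0) when every prefix of the chain passes.
    """
    # Phase 1: grow the subgraph from the input side until the check fails.
    end = 0
    for i in range(1, len(nodes) + 1):
        if not passes(nodes[:i]):
            end = i
            break
    if end == 0:
        return (0, 0)

    # Phase 2: drop nodes from the input side for as long as the
    # remaining subgraph still fails; stop just before it would pass.
    start = 0
    while start + 1 < end and not passes(nodes[start + 1:end]):
        start += 1
    return (start, end)


# Toy chain in which any subgraph containing "pool1" is treated as failing.
chain = ["conv1", "relu1", "pool1", "conv2", "relu2"]
window = find_min_failing_window(chain, lambda sub: "pool1" not in sub)
failing_nodes = list(chain[window[0]:window[1]])
```

In this toy run the window shrinks to the single node `pool1`. On a real model every call to `passes` means a compile and an inference, which is why the search can be slow for large graphs.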
Command line
Usage
usage: topsideas gcu debug [-h] --input_onnx INPUT_ONNX [--inputs INPUT_META [INPUT_META ...]] [--input_value_range INPUT_VALUE_RANGE [INPUT_VALUE_RANGE ...]]
[--fp16] [--device DEVICE] [--fp32_layers [FP32_LAYERS ...]] [--min_shapes MIN_SHAPES [MIN_SHAPES ...]]
[--max_shapes MAX_SHAPES [MAX_SHAPES ...]] [--resource_mode RESOURCE_MODE] [--compile_options COMPILE_OPTIONS]
[--save_path SAVE_PATH] [--rtol RTOL] [--atol ATOL] [--ntol NTOL] [--cos_sim COS_SIM] [--mode {model,quick,linear}]
[--log_path LOG_PATH] [--inputs_npz INPUTS_NPZ] [--seed SEED] [--try_all]
Parameters
| short | long | default | help |
|---|---|---|---|
| -h | --help | | show this help message and exit |
| | --input_onnx | None | Provide the original ONNX file. |
| | --inputs | [] | Overwrite input shapes or data types. Format: --inputs NAME:SHAPE:DTYPE. For example: --inputs input1 input2:[1,3,224,224]:float32 input3:int32 input4:[]. If omitted, the current model inputs are used. |
| | --input_value_range | [] | Overwrite the random value range of inputs. Format: --input_value_range NAME:[MIN,MAX] |
| | --fp16 | | Enable fp16 mixed precision. Only takes effect with the topsinference backend; onnxruntime-topsinference always enables fp16 mixed precision. |
| | --device | 0 | Device id |
| | --fp32_layers | None | Set layers as fp32 (topsinference) |
| | --min_shapes | [] | Min input shapes. Format: --min_shapes NAME:SHAPE. For example: --min_shapes input1:[1,3,224,224] input2:[1,3,224,224] |
| | --max_shapes | [] | Max input shapes. Format: --max_shapes NAME:SHAPE. For example: --max_shapes input1:[1,3,224,224] input2:[1,3,224,224] |
| | --resource_mode | None | TopsInference compile option, see TopsInference docs for more info |
| | --compile_options | {} | TopsInference compile option, see TopsInference docs for more info |
| | --save_path | ./test_cases | Path to save failing ONNX subgraph test cases |
| | --rtol | 0.01 | Relative tolerance |
| | --atol | 0.01 | Absolute tolerance |
| | --ntol | 0 | Mismatch number tolerance, e.g. ntol=0.01 means 1% mismatch is allowed |
| | --cos_sim | 0 | Use cosine similarity instead of numerical tolerance |
| | --mode | linear | Iteration mode, choices are ['model', 'quick', 'linear']: 'model' runs inference on the whole model and compares the model outputs only; 'quick' first adds all tensors to the outputs and checks them, then searches backward from the failed tensors; 'linear' first adds each node from the input in DFS order until a failure occurs, then deletes each node from the input until the check passes. |
| | --log_path | None | Path to save inference logs |
| | --inputs_npz | None | Path to saved real sampleN_inputs.npz |
| | --seed | None | Seed used to generate random data. Defaults to None, which uses the time as the seed. Ignored if --inputs_npz is given. |
| | --try_all | | The search does not stop when a mismatch is found but continues until the model outputs. Only supports --mode=linear and quick. |
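The exact comparison rule behind --rtol, --atol, --ntol and --cos_sim is not spelled out here, so the following is only a sketch of one plausible reading. It assumes the elementwise rule |actual - expected| <= atol + rtol * |expected| (the convention used by `numpy.isclose`), treats ntol as the allowed fraction of mismatched elements, and assumes a non-zero cos_sim threshold replaces the elementwise check. The function names are hypothetical and not part of TopsIDEAS.

```python
import math
from typing import Sequence


def mismatch_ratio(actual: Sequence[float], expected: Sequence[float],
                   rtol: float = 0.01, atol: float = 0.01) -> float:
    """Fraction of elements with |a - e| > atol + rtol * |e| (assumed rule)."""
    if len(actual) != len(expected):
        raise ValueError("actual and expected must have the same length")
    if not expected:
        return 0.0
    bad = sum(1 for a, e in zip(actual, expected)
              if abs(a - e) > atol + rtol * abs(e))
    return bad / len(expected)


def cosine_similarity(actual: Sequence[float], expected: Sequence[float]) -> float:
    """Cosine of the angle between the two flattened tensors."""
    dot = sum(a * e for a, e in zip(actual, expected))
    norm_a = math.sqrt(sum(a * a for a in actual))
    norm_e = math.sqrt(sum(e * e for e in expected))
    if norm_a == 0.0 or norm_e == 0.0:
        # Two all-zero tensors are treated as identical.
        return 1.0 if norm_a == norm_e else 0.0
    return dot / (norm_a * norm_e)


def outputs_match(actual: Sequence[float], expected: Sequence[float],
                  rtol: float = 0.01, atol: float = 0.01,
                  ntol: float = 0.0, cos_sim: float = 0.0) -> bool:
    """Hypothetical pass/fail decision for one output tensor."""
    # Assumption: a non-zero cos_sim threshold replaces the numeric check.
    if cos_sim > 0:
        return cosine_similarity(actual, expected) >= cos_sim
    return mismatch_ratio(actual, expected, rtol, atol) <= ntol


reference = [1.0, 2.0, 3.0, 4.0]   # stand-in for the onnxruntime-cpu result
candidate = [1.0, 2.0, 3.0, 4.5]   # stand-in for the GCU result
strict = outputs_match(candidate, reference)               # no mismatch allowed
relaxed = outputs_match(candidate, reference, ntol=0.25)   # 25% may mismatch
```

With the default tolerances one of the four elements is out of range, so the strict check fails while allowing a 25% mismatch ratio passes. A cosine threshold is less sensitive to a few outliers, which may be why the example on this page uses --cos_sim for an fp16 run.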
Example
topsideas gcu debug --input_onnx Inception-v1.onnx --fp16 --inputs input:[4,3,224,224] --cos_sim=0.9999
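The --inputs value in this command follows the NAME:SHAPE:DTYPE format from the parameter table, where SHAPE and DTYPE are both optional. As a rough sketch of how such a spec could be split into its parts (not the tool's own parser), assuming input names contain no colon and SHAPE comes before DTYPE; `InputMeta` and `parse_input_meta` are hypothetical names:

```python
from typing import List, NamedTuple, Optional


class InputMeta(NamedTuple):
    name: str
    shape: Optional[List[int]]   # None means "keep the model's shape"
    dtype: Optional[str]         # None means "keep the model's dtype"


def parse_input_meta(spec: str) -> InputMeta:
    """Parse one NAME[:SHAPE][:DTYPE] spec (hypothetical helper).

    Assumes the input name itself contains no ':' character.
    """
    name, *fields = spec.split(":")
    if not name or len(fields) > 2:
        raise ValueError(f"invalid input spec: {spec!r}")

    shape: Optional[List[int]] = None
    dtype: Optional[str] = None
    for field in fields:
        if field.startswith("[") and field.endswith("]"):
            # A bracketed field is the shape; it must come first and only once.
            if shape is not None or dtype is not None:
                raise ValueError(f"invalid input spec: {spec!r}")
            body = field[1:-1].strip()
            shape = [int(dim) for dim in body.split(",")] if body else []
        else:
            # Any other non-empty field is taken as the dtype.
            if dtype is not None or not field:
                raise ValueError(f"invalid input spec: {spec!r}")
            dtype = field
    return InputMeta(name, shape, dtype)


# The spec from the example command, plus the variants listed in the help text.
example = parse_input_meta("input:[4,3,224,224]")
variants = [parse_input_meta(s) for s in
            ("input1", "input2:[1,3,224,224]:float32", "input3:int32", "input4:[]")]
```

Here `example` carries the shape `[4, 3, 224, 224]` and no dtype override, so the model's own input dtype would be kept.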
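The console prints the traversal and inference progress over the ONNX subgraphs and shows whether each subgraph has a numerical computation error.
After the traversal finishes, the failing subgraphs that were found, together with their input and output data, are saved to a folder (see --save_path, default ./test_cases), structured as follows
For example, this is one possible failing subgraph of Inception-v1.onnx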