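Author: Grace Sun, AMD engineer; Source: AMD Developer Community
The Vitis AI Library includes the xdputil tool, which can be used as a debugging aid for board-level development. Its source code is located here: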
https://github.com/Xilinx/Vitis-AI/tree/master/src/vai_library/usefultools
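xdputil is already installed in both the prebuilt official board image and the Vitis AI docker. For a custom target board, follow the installation steps in the Vitis AI Library user guide for the matching version, for example: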
https://docs.amd.com/r/en-US/ug1354-xilinx-ai-sdk/Step-3-Installing-the-...
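To run xdputil in the docker environment, run /usr/bin/python3 -m xdputil. Running xdputil -h prints an overview of its usage.
Most subcommands need DPU and device information, so they only run on the target board. Inside docker, xdputil is typically used to parse and inspect xmodel files.
For the xdputil xmodel subcommand, add -h to see its detailed usage.
Some concrete examples and their command output follow.
Show the xmodel subgraph information, including the input/output tensors and the kernel.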
xdputil xmodel -l yolov6m_pt.xmodel
{
    "subgraphs":[
        {
            "index":0,
            "name":"subgraph_ModelNNDct__ModelNNDct_QuantStub_quant__input_1",
            "device":"USER"
        },
        {
            "index":1,
            "name":"subgraph_ModelNNDct__ModelNNDct_CSPBepBackbone__module__CSPBepBackbone_backbone__RealVGGBlock_stem__Conv2d_conv__input_3",
            "device":"DPU",
            "fingerprint":"0x603000b56011861",
            "DPU Arch":"DPUCVDX8G_ISA3_C32B6",
            "workload":81415208000,
            "input_tensor":[
                {
                    "index":0,
                    "name":"ModelNNDct__ModelNNDct_QuantStub_quant__input_1_fix",
                    "shape":[ 1, 640, 640, 3 ],
                    "fixpos":6
                }
            ],
            "output_tensor":[
                {
                    "index":0,
                    "name":"ModelNNDct__ModelNNDct_Detect__module__Detect_detect__Conv2d__module__ModuleList_reg_preds__ModuleList_1__inputs_7_fix",
                    "shape":[ 1, 40, 40, 4 ],
                    "fixpos":3
                },
                {
                    "index":1,
                    "name":"ModelNNDct__ModelNNDct_Detect__module__Detect_detect__Conv2d__module__ModuleList_cls_preds__ModuleList_2__24287_fix",
                    "shape":[ 1, 20, 20, 80 ],
                    "fixpos":4
                },
                {
                    "index":2,
                    "name":"ModelNNDct__ModelNNDct_Detect__module__Detect_detect__Conv2d__module__ModuleList_reg_preds__ModuleList_2__inputs_11_fix",
                    "shape":[ 1, 20, 20, 4 ],
                    "fixpos":3
                },
                {
                    "index":3,
                    "name":"ModelNNDct__ModelNNDct_Detect__module__Detect_detect__Conv2d__module__ModuleList_cls_preds__ModuleList_0__24021_fix",
                    "shape":[ 1, 80, 80, 80 ],
                    "fixpos":4
                },
                {
                    "index":4,
                    "name":"ModelNNDct__ModelNNDct_Detect__module__Detect_detect__Conv2d__module__ModuleList_reg_preds__ModuleList_0__inputs_3_fix",
                    "shape":[ 1, 80, 80, 4 ],
                    "fixpos":4
                },
                {
                    "index":5,
                    "name":"ModelNNDct__ModelNNDct_Detect__module__Detect_detect__Conv2d__module__ModuleList_cls_preds__ModuleList_1__24154_fix",
                    "shape":[ 1, 40, 40, 80 ],
                    "fixpos":4
                }
            ],
            "reg info":[
                { "name":"REG_0", "context type":"CONST", "size":34244864 },
                { "name":"REG_1", "context type":"WORKSPACE", "size":37017600 },
                { "name":"REG_2", "context type":"DATA_LOCAL_INPUT", "size":1230784 },
                { "name":"REG_3", "context type":"DATA_LOCAL_OUTPUT", "size":705600 }
            ],
            "instruction reg":213312
        },
        {
            "index":2,
            "name":"subgraph_ModelNNDct__ModelNNDct_Detect__module__Detect_detect__Conv2d__module__ModuleList_reg_preds__ModuleList_2__inputs_11_fix_",
            "device":"CPU"
        },
        {
            "index":3,
            "name":"subgraph_ModelNNDct__ModelNNDct_Detect__module__Detect_detect__Conv2d__module__ModuleList_reg_preds__ModuleList_1__inputs_7_fix_",
            "device":"CPU"
        },
        {
            "index":4,
            "name":"subgraph_ModelNNDct__ModelNNDct_Detect__module__Detect_detect__Conv2d__module__ModuleList_reg_preds__ModuleList_0__inputs_3_fix_",
            "device":"CPU"
        },
        {
            "index":5,
            "name":"subgraph_ModelNNDct__ModelNNDct_Detect__module__Detect_detect__Conv2d__module__ModuleList_cls_preds__ModuleList_2__24287_fix_ModelNNDct__ModelNNDct_Detect__module__Detect_detect__inputs",
            "device":"CPU"
        },
        {
            "index":6,
            "name":"subgraph_ModelNNDct__ModelNNDct_Detect__module__Detect_detect__Conv2d__module__ModuleList_cls_preds__ModuleList_1__24154_fix_ModelNNDct__ModelNNDct_Detect__module__Detect_detect__inputs_9",
            "device":"CPU"
        },
        {
            "index":7,
            "name":"subgraph_ModelNNDct__ModelNNDct_Detect__module__Detect_detect__Conv2d__module__ModuleList_cls_preds__ModuleList_0__24021_fix_ModelNNDct__ModelNNDct_Detect__module__Detect_detect__inputs_5",
            "device":"CPU"
        }
    ]
}
Convert an xmodel to another format
Taking -t as an example: xdputil xmodel yolov6m_pt.xmodel -t yolov6_mt_xmodel.txt
The exported .txt file gives the detailed attributes of the input/output tensors, op nodes, and so on.
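The tensor shapes that xdputil xmodel -l reports for the DPU subgraph can be cross-checked against its "reg info" entries. The sketch below is only an illustration: it copies the shapes and register sizes from the yolov6m_pt.xmodel listing and assumes one byte per element (int8 tensors). The helper names total_bytes and reg_size are hypothetical and not part of xdputil.

```python
import json
from math import prod

# Shapes and register sizes copied from the xdputil xmodel -l listing;
# tensor names are left out because this sketch does not need them.
DPU_SUBGRAPH = json.loads("""
{
  "input_tensor":  [{"shape": [1, 640, 640, 3]}],
  "output_tensor": [{"shape": [1, 40, 40, 4]},  {"shape": [1, 20, 20, 80]},
                    {"shape": [1, 20, 20, 4]},  {"shape": [1, 80, 80, 80]},
                    {"shape": [1, 80, 80, 4]},  {"shape": [1, 40, 40, 80]}],
  "reg info": [
    {"name": "REG_2", "context type": "DATA_LOCAL_INPUT",  "size": 1230784},
    {"name": "REG_3", "context type": "DATA_LOCAL_OUTPUT", "size": 705600}
  ]
}
""")


def total_bytes(tensors, bytes_per_element=1):
    """Sum the element counts of all tensors (assumption: 1 byte per element)."""
    return sum(prod(t["shape"]) for t in tensors) * bytes_per_element


def reg_size(subgraph, context_type):
    """Look up the size of the register region with the given context type."""
    return next(r["size"] for r in subgraph["reg info"]
                if r["context type"] == context_type)


in_bytes = total_bytes(DPU_SUBGRAPH["input_tensor"])
out_bytes = total_bytes(DPU_SUBGRAPH["output_tensor"])

print("input tensors :", in_bytes, "bytes, DATA_LOCAL_INPUT :",
      reg_size(DPU_SUBGRAPH, "DATA_LOCAL_INPUT"))
print("output tensors:", out_bytes, "bytes, DATA_LOCAL_OUTPUT:",
      reg_size(DPU_SUBGRAPH, "DATA_LOCAL_OUTPUT"))
```

For this model the six output tensors add up to exactly the DATA_LOCAL_OUTPUT size, while the input tensor is slightly smaller than DATA_LOCAL_INPUT. The listing does not explain the difference; padding or alignment is a plausible guess, not a confirmed cause.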
Show the information for a single operator in the xmodel. The op_name can be taken from the .txt file exported above.
xdputil xmodel yolov6m_pt.xmodel --op ModelNNDct__ModelNNDct_CSPBepBackbone__module__CSPBepBackbone_backbone__RealVGGBlock_stem__Conv2d_conv__input_3
xmodel: yolov6m_pt.xmodel
op_name: ModelNNDct__ModelNNDct_CSPBepBackbone__module__CSPBepBackbone_backbone__RealVGGBlock_stem__Conv2d_conv__input_3
{
    "name" : "ModelNNDct__ModelNNDct_CSPBepBackbone__module__CSPBepBackbone_backbone__RealVGGBlock_stem__Conv2d_conv__input_3",
    "type" : "conv2d-fix",
    "attrs" : {
        "workload" : 270336000,
        "device" : "DPU",
        "bias_term" : true,
        "workload_on_arch" : 635699200,
        "shift_hsigmoid" : -128,
        "nonlinear" : "RELU",
        "kernel" : [ 3, 3 ],
        "dilation" : [ 1, 1 ],
        "hsigmoid_in" : -128,
        "stride" : [ 2, 2 ],
        "out_dim" : 48,
        "channel_augmentation" : 1,
        "pad_mode" : "FLOOR",
        "shift_hswish" : -128,
        "pad" : [ 1, 0, 1, 0 ],
        "in_dim" : 3,
        "group" : 1
    },
    "inputs" : [
        {
            "index" : 0,
            "op_name" : "ModelNNDct__ModelNNDct_QuantStub_quant__input_1_upload_0",
            "tensor_name" : "ModelNNDct__ModelNNDct_QuantStub_quant__input_1_fix_upload_0",
            "shape" : [ 1, 640, 640, 3 ],
            "data_type" : "xint8"
        },
        {
            "index" : 1,
            "op_name" : "ModelNNDct___module_backbone_stem_conv_weight",
            "tensor_name" : "ModelNNDct___module_backbone_stem_conv_weight_fix",
            "shape" : [ 48, 3, 3, 3 ],
            "data_type" : "xint8"
        },
        {
            "index" : 2,
            "op_name" : "ModelNNDct___module_backbone_stem_conv_bias",
            "tensor_name" : "ModelNNDct___module_backbone_stem_conv_bias_fix",
            "shape" : [ 48 ],
            "data_type" : "xint8"
        }
    ],
    "outputs" : {
        "op_name" : "ModelNNDct__ModelNNDct_CSPBepBackbone__module__CSPBepBackbone_backbone__RealVGGBlock_stem__Conv2d_conv__input_3",
        "tensor_name" : "ModelNNDct__ModelNNDct_CSPBepBackbone__module__CSPBepBackbone_backbone__RealVGGBlock_stem__ReLU_relu__input_7_fix",
        "shape" : [ 1, 320, 320, 48 ],
        "data_type" : "xint8",
        "attrs" : {
            "round_mode" : "DPU_ROUND",
            "bit_width" : 8,
            "location" : 1,
            "reg_id" : 1,
            "fix_point" : 4,
            "if_signed" : true,
            "ddr_addr" : 11827200,
            "stride" : [ 4915200, 15360, 48, 1 ]
        }
    }
}
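The output shape and stride in this operator dump can be reproduced from the conv attributes, which is a quick sanity check when reading such output. The sketch below is an illustration under two assumptions: pad_mode FLOOR means the usual round-down convolution size formula, and the output "stride" is the dense NHWC element stride. The function names are hypothetical, and the axis order of "kernel", "stride", and "pad" does not matter here because the values are symmetric.

```python
def conv_out_size(in_size, kernel, stride, pad_before, pad_after, dilation=1):
    """Output length along one spatial axis, rounding down (assumed FLOOR mode)."""
    effective_kernel = dilation * (kernel - 1) + 1
    return (in_size + pad_before + pad_after - effective_kernel) // stride + 1


def dense_strides(shape):
    """Element strides of a densely packed tensor, last axis contiguous."""
    strides = []
    step = 1
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return strides[::-1]


# Values copied from the operator dump above.
in_shape = [1, 640, 640, 3]          # NHWC input
attrs = {"kernel": [3, 3], "stride": [2, 2], "dilation": [1, 1],
         "pad": [1, 0, 1, 0], "out_dim": 48}

# Both spatial axes use the same kernel, stride, dilation and (1, 0) padding.
out_h = conv_out_size(in_shape[1], attrs["kernel"][0], attrs["stride"][0],
                      attrs["pad"][0], attrs["pad"][1], attrs["dilation"][0])
out_w = conv_out_size(in_shape[2], attrs["kernel"][1], attrs["stride"][1],
                      attrs["pad"][2], attrs["pad"][3], attrs["dilation"][1])
out_shape = [in_shape[0], out_h, out_w, attrs["out_dim"]]

print("output shape :", out_shape)
print("dense strides:", dense_strides(out_shape))
```

Both results agree with the "shape" and "stride" fields of the "outputs" entry in the dump.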
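Show the device information, including the DPU configuration, fingerprint, and runtime version. This gives a quick view of the key DPU facts for the current board and helps debug runtime failures related to DPU compatibility.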
xdputil query
Show the DPU register status
xdputil status
Run a benchmark
xdputil benchmark
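The subgraph_index starts from 0, and setting -i to -1 runs the whole graph. The subgraph_index can be taken from the output of xdputil xmodel -l.
If the first subgraph is a USER subgraph, -i 0 reports an error.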
{
"index":0,
"name":"subgraph_ModelNNDct__ModelNNDct_QuantStub_quant__input_1",
"device":"USER"
},
After changing it to -i 1, the benchmark runs normally.
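Picking the index for -i can be automated from the xdputil xmodel -l output. The sketch below is a hedged illustration, not part of xdputil: first_dpu_index and benchmark_command are hypothetical helpers, the listing is reduced to the "index" and "device" fields, and the argument order (xmodel, then -i, then the thread count) is an assumption that should be confirmed with xdputil benchmark -h on your version.

```python
import json

# Reduced copy of the xdputil xmodel -l output: only "index" and "device".
LISTING = """
{"subgraphs": [
  {"index": 0, "device": "USER"},
  {"index": 1, "device": "DPU"},
  {"index": 2, "device": "CPU"},
  {"index": 3, "device": "CPU"}
]}
"""


def first_dpu_index(listing_text):
    """Return the index of the first subgraph whose device is DPU."""
    for subgraph in json.loads(listing_text)["subgraphs"]:
        if subgraph["device"] == "DPU":
            return subgraph["index"]
    raise ValueError("no DPU subgraph found in the listing")


def benchmark_command(xmodel, listing_text, threads=1):
    """Build an argument list for the benchmark (argument order is assumed)."""
    index = first_dpu_index(listing_text)
    return ["xdputil", "benchmark", xmodel, "-i", str(index), str(threads)]


command = benchmark_command("yolov6m_pt.xmodel", LISTING, threads=1)
print(" ".join(command))
```

On the target board, the resulting list could be passed to subprocess.run; here it is only printed so that the sketch runs without a DPU.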
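xdputil run can be used to debug incorrect DPU results by cross-checking the reference values against the DPU inference values. UG1414 gives the detailed steps: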
https://docs.amd.com/r/en-US/ug1414-vitis-ai/DPU-Debug-with-VART
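In summary, xdputil is simple to use and gives a more direct, in-depth view of the compiled model and of the current DPU. It is an effective aid when debugging problems such as the DPU not being found, a fingerprint mismatch, or an accuracy that deviates too much from that of the quantized model.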