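A Guide to Using the xdputil Tool

Author: Grace Sun, AMD engineer; Source: AMD Developer Community

The Vitis AI Library includes the xdputil tool, which serves as a debugging aid for board-level development. Its source code is located at: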

https://github.com/Xilinx/Vitis-AI/tree/master/src/vai_library/usefultools

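xdputil is already installed in both the prebuilt official board images and the Vitis AI docker image. For a custom target board, follow the installation steps in the Vitis AI Library User Guide for the matching version, for example: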

https://docs.amd.com/r/en-US/ug1354-xilinx-ai-sdk/Step-3-Installing-the-...

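To run xdputil in the docker environment, use /usr/bin/python3 -m xdputil. Running xdputil -h prints an overview of the available subcommands.

Most subcommands need DPU and device information, so they run only on the target board. Inside docker, xdputil is generally used to parse and inspect xmodel files.

For the xdputil xmodel subcommand, -h shows further usage details.

Some concrete examples and their output follow.

Display the xmodel's subgraph information, including input/output tensors and kernels.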

xdputil xmodel -l yolov6m_pt.xmodel

{
    "subgraphs":[
        {
            "index":0,
            "name":"subgraph_ModelNNDct__ModelNNDct_QuantStub_quant__input_1",
            "device":"USER"
        },
        {
            "index":1,
            "name":"subgraph_ModelNNDct__ModelNNDct_CSPBepBackbone__module__CSPBepBackbone_backbone__RealVGGBlock_stem__Conv2d_conv__input_3",
            "device":"DPU",
            "fingerprint":"0x603000b56011861",
            "DPU Arch":"DPUCVDX8G_ISA3_C32B6",
            "workload":81415208000,
            "input_tensor":[
                {
                    "index":0,
                    "name":"ModelNNDct__ModelNNDct_QuantStub_quant__input_1_fix",
                    "shape":[
                        1,
                        640,
                        640,
                        3
                    ],
                    "fixpos":6
                }
            ],
            "output_tensor":[
                {
                    "index":0,
                    "name":"ModelNNDct__ModelNNDct_Detect__module__Detect_detect__Conv2d__module__ModuleList_reg_preds__ModuleList_1__inputs_7_fix",
                    "shape":[
                        1,
                        40,
                        40,
                        4
                    ],
                    "fixpos":3
                },
                {
                    "index":1,
                    "name":"ModelNNDct__ModelNNDct_Detect__module__Detect_detect__Conv2d__module__ModuleList_cls_preds__ModuleList_2__24287_fix",
                    "shape":[
                        1,
                        20,
                        20,
                        80
                    ],
                    "fixpos":4
                },
                {
                    "index":2,
                    "name":"ModelNNDct__ModelNNDct_Detect__module__Detect_detect__Conv2d__module__ModuleList_reg_preds__ModuleList_2__inputs_11_fix",
                    "shape":[
                        1,
                        20,
                        20,
                        4
                    ],
                    "fixpos":3
                },
                {
                    "index":3,
                    "name":"ModelNNDct__ModelNNDct_Detect__module__Detect_detect__Conv2d__module__ModuleList_cls_preds__ModuleList_0__24021_fix",
                    "shape":[
                        1,
                        80,
                        80,
                        80
                    ],
                    "fixpos":4
                },
                {
                    "index":4,
                    "name":"ModelNNDct__ModelNNDct_Detect__module__Detect_detect__Conv2d__module__ModuleList_reg_preds__ModuleList_0__inputs_3_fix",
                    "shape":[
                        1,
                        80,
                        80,
                        4
                    ],
                    "fixpos":4
                },
                {
                    "index":5,
                    "name":"ModelNNDct__ModelNNDct_Detect__module__Detect_detect__Conv2d__module__ModuleList_cls_preds__ModuleList_1__24154_fix",
                    "shape":[
                        1,
                        40,
                        40,
                        80
                    ],
                    "fixpos":4
                }
            ],
            "reg info":[
                {
                    "name":"REG_0",
                    "context type":"CONST",
                    "size":34244864
                },
                {
                    "name":"REG_1",
                    "context type":"WORKSPACE",
                    "size":37017600
                },
                {
                    "name":"REG_2",
                    "context type":"DATA_LOCAL_INPUT",
                    "size":1230784
                },
                {
                    "name":"REG_3",
                    "context type":"DATA_LOCAL_OUTPUT",
                    "size":705600
                }
            ],
            "instruction reg":213312
        },
        {
            "index":2,
            "name":"subgraph_ModelNNDct__ModelNNDct_Detect__module__Detect_detect__Conv2d__module__ModuleList_reg_preds__ModuleList_2__inputs_11_fix_",
            "device":"CPU"
        },
        {
            "index":3,
            "name":"subgraph_ModelNNDct__ModelNNDct_Detect__module__Detect_detect__Conv2d__module__ModuleList_reg_preds__ModuleList_1__inputs_7_fix_",
            "device":"CPU"
        },
        {
            "index":4,
            "name":"subgraph_ModelNNDct__ModelNNDct_Detect__module__Detect_detect__Conv2d__module__ModuleList_reg_preds__ModuleList_0__inputs_3_fix_",
            "device":"CPU"
        },
        {
            "index":5,
            "name":"subgraph_ModelNNDct__ModelNNDct_Detect__module__Detect_detect__Conv2d__module__ModuleList_cls_preds__ModuleList_2__24287_fix_ModelNNDct__ModelNNDct_Detect__module__Detect_detect__inputs",
            "device":"CPU"
        },
        {
            "index":6,
            "name":"subgraph_ModelNNDct__ModelNNDct_Detect__module__Detect_detect__Conv2d__module__ModuleList_cls_preds__ModuleList_1__24154_fix_ModelNNDct__ModelNNDct_Detect__module__Detect_detect__inputs_9",
            "device":"CPU"
        },
        {
            "index":7,
            "name":"subgraph_ModelNNDct__ModelNNDct_Detect__module__Detect_detect__Conv2d__module__ModuleList_cls_preds__ModuleList_0__24021_fix_ModelNNDct__ModelNNDct_Detect__module__Detect_detect__inputs_5",
            "device":"CPU"
        }
    ]
}
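Each tensor in this listing carries a fixpos value. The following is a minimal Python sketch of how such a value is typically interpreted, assuming fixpos is the number of fractional bits of a signed 8-bit fixed-point number (so the input scale is 2^fixpos). The helper names quantize and dequantize are illustrative and not part of xdputil, and Python's built-in round() is used for simplicity, which may differ from the DPU's own rounding mode.

```python
def quantize(value, fixpos, bit_width=8):
    """Map a float to a signed fixed-point integer with `fixpos` fractional bits."""
    scale = 2 ** fixpos
    lo = -(2 ** (bit_width - 1))       # -128 for 8 bits
    hi = 2 ** (bit_width - 1) - 1      # 127 for 8 bits
    q = round(value * scale)           # simplification: not the DPU rounding mode
    return max(lo, min(hi, q))         # saturate to the representable range


def dequantize(q, fixpos):
    """Map a fixed-point integer back to a float."""
    return q / (2 ** fixpos)


# The input tensor above has fixpos 6, so one integer step is 1/64.
print(quantize(0.5, 6))     # 32
print(quantize(10.0, 6))    # 127 (saturated)
# An output tensor with fixpos 3 has a step of 1/8.
print(dequantize(24, 3))    # 3.0
```

With fixpos 6 the representable input range is roughly -2.0 to 1.98, which is worth keeping in mind when checking preprocessing against the model's expected input range.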

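Convert the xmodel to another format.

Taking -t as an example:

xdputil xmodel yolov6m_pt.xmodel -t yolov6_mt_xmodel.txt

The exported .txt file provides detailed attributes of the input/output tensors, op nodes, and so on.

Display the information for a single operator in the xmodel. The op_name can be taken from the .txt file exported above.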

xdputil xmodel yolov6m_pt.xmodel --op ModelNNDct__ModelNNDct_CSPBepBackbone__module__CSPBepBackbone_backbone__RealVGGBlock_stem__Conv2d_conv__input_3

xmodel:  yolov6m_pt.xmodel

op_name:  ModelNNDct__ModelNNDct_CSPBepBackbone__module__CSPBepBackbone_backbone__RealVGGBlock_stem__Conv2d_conv__input_3

{
    "name" : "ModelNNDct__ModelNNDct_CSPBepBackbone__module__CSPBepBackbone_backbone__RealVGGBlock_stem__Conv2d_conv__input_3",
    "type" : "conv2d-fix",
    "attrs" : {
        "workload" : 270336000,
        "device" : "DPU",
        "bias_term" : true,
        "workload_on_arch" : 635699200,
        "shift_hsigmoid" : -128,
        "nonlinear" : "RELU",
        "kernel" : [
            3,
            3
        ],
        "dilation" : [
            1,
            1
        ],
        "hsigmoid_in" : -128,
        "stride" : [
            2,
            2
        ],
        "out_dim" : 48,
        "channel_augmentation" : 1,
        "pad_mode" : "FLOOR",
        "shift_hswish" : -128,
        "pad" : [
            1,
            0,
            1,
            0
        ],
        "in_dim" : 3,
        "group" : 1
    },
    "inputs" : [
        {
            "index" : 0,
            "op_name" : "ModelNNDct__ModelNNDct_QuantStub_quant__input_1_upload_0",
            "tensor_name" : "ModelNNDct__ModelNNDct_QuantStub_quant__input_1_fix_upload_0",
            "shape" : [
                1,
                640,
                640,
                3
            ],
            "data_type" : "xint8"
        },
        {
            "index" : 1,
            "op_name" : "ModelNNDct___module_backbone_stem_conv_weight",
            "tensor_name" : "ModelNNDct___module_backbone_stem_conv_weight_fix",
            "shape" : [
                48,
                3,
                3,
                3
            ],
            "data_type" : "xint8"
        },
        {
            "index" : 2,
            "op_name" : "ModelNNDct___module_backbone_stem_conv_bias",
            "tensor_name" : "ModelNNDct___module_backbone_stem_conv_bias_fix",
            "shape" : [
                48
            ],
            "data_type" : "xint8"
        }
    ],
    "outputs" : {
        "op_name" : "ModelNNDct__ModelNNDct_CSPBepBackbone__module__CSPBepBackbone_backbone__RealVGGBlock_stem__Conv2d_conv__input_3",
        "tensor_name" : "ModelNNDct__ModelNNDct_CSPBepBackbone__module__CSPBepBackbone_backbone__RealVGGBlock_stem__ReLU_relu__input_7_fix",
        "shape" : [
            1,
            320,
            320,
            48
        ],
        "data_type" : "xint8",
        "attrs" : {
            "round_mode" : "DPU_ROUND",
            "bit_width" : 8,
            "location" : 1,
            "reg_id" : 1,
            "fix_point" : 4,
            "if_signed" : true,
            "ddr_addr" : 11827200,
            "stride" : [
                4915200,
                15360,
                48,
                1
            ],
        }
    }
}
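The numbers in this dump can be sanity-checked by hand. The sketch below recomputes the output height/width and the tensor stride list from the attributes shown above. It assumes the pad list is read as one (before, after) pair per spatial axis and that "pad_mode" FLOOR means floor division. The function names are illustrative only and not part of any Vitis AI API.

```python
def conv_out_size(in_size, kernel, stride, pad_before, pad_after, dilation=1):
    """Output length along one spatial axis, rounding down (FLOOR)."""
    effective_kernel = dilation * (kernel - 1) + 1
    return (in_size + pad_before + pad_after - effective_kernel) // stride + 1


def contiguous_strides(shape):
    """Element strides of a densely packed tensor, outermost dimension first."""
    strides = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    return strides


# Values from the conv2d-fix op: 640 input, 3x3 kernel, stride 2, pad (1, 0).
side = conv_out_size(640, kernel=3, stride=2, pad_before=1, pad_after=0)
print(side)                                       # 320
print(contiguous_strides([1, side, side, 48]))    # [4915200, 15360, 48, 1]
```

Both results agree with the "shape" and "stride" fields of the output tensor, which suggests the output is stored densely in NHWC order.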

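Display device information, including the DPU configuration, fingerprint, and runtime version. This gives a quick view of the key DPU facts for the current board and helps debug runtime failures related to DPU compatibility.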

xdputil query
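One common compatibility check is comparing the fingerprint compiled into the xmodel (shown by xdputil xmodel -l) with the one the device reports. The layout of the query output is not reproduced here, so this minimal sketch simply takes both values as hexadecimal strings copied by hand. The helper name same_fingerprint and the device-side values are hypothetical.

```python
def same_fingerprint(model_fp, device_fp):
    """Compare two hex fingerprint strings by value, ignoring case and leading zeros."""
    return int(model_fp, 16) == int(device_fp, 16)


# Fingerprint of the DPU subgraph in yolov6m_pt.xmodel, from `xdputil xmodel -l`.
model_fp = "0x603000b56011861"

# Hypothetical device-side values, for illustration only.
print(same_fingerprint(model_fp, "0x0603000B56011861"))   # True
print(same_fingerprint(model_fp, "0x101000016010407"))    # False
```

Comparing by numeric value rather than by string avoids false mismatches caused by formatting differences between the two outputs.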

Display the DPU register status:
xdputil status

Run a benchmark test:
xdputil benchmark [-i subgraph_index]

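subgraph_index starts at 0; setting -i to -1 runs the whole graph. The subgraph_index values can be taken from the output of xdputil xmodel -l.

If the first subgraph is a USER subgraph, as in the listing entry below, -i 0 reports an error.

{
    "index":0,
    "name":"subgraph_ModelNNDct__ModelNNDct_QuantStub_quant__input_1",
    "device":"USER"
},

Changing the option to -i 1 lets the test run normally.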
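Choosing the index can be automated. The following is a minimal Python sketch, assuming the JSON printed by xdputil xmodel -l has been saved as text. It returns the index of the first subgraph whose device is DPU, which is the value to pass to -i. The function name and the abbreviated sample are illustrative only.

```python
import json

# Abbreviated sample in the shape printed by `xdputil xmodel -l` (names shortened).
LISTING = """
{
    "subgraphs": [
        {"index": 0, "name": "subgraph_input", "device": "USER"},
        {"index": 1, "name": "subgraph_backbone", "device": "DPU"},
        {"index": 2, "name": "subgraph_post", "device": "CPU"}
    ]
}
"""


def first_dpu_index(listing_text):
    """Return the index of the first DPU subgraph, or None if there is none."""
    for subgraph in json.loads(listing_text)["subgraphs"]:
        if subgraph.get("device") == "DPU":
            return subgraph["index"]
    return None


print(first_dpu_index(LISTING))   # 1
```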

xdputil run helps debug incorrect DPU results by cross-checking reference values against the DPU inference output. UG1414 gives the detailed steps:
https://docs.amd.com/r/en-US/ug1414-vitis-ai/DPU-Debug-with-VART

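In short, xdputil is simple to use and gives a more direct, in-depth view of the compiled model and the current DPU. It is an effective aid when debugging problems such as a DPU that cannot be found, a fingerprint mismatch, or on-board accuracy that differs too much from the quantized model's accuracy.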
