Author: 沈月红. Reposted from the Ingdan FPGA WeChat official account.
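The PCIe specification places the following requirement on devices: PERST# must deassert 100 ms after the power good of the system has occurred, and a PCI Express port must be ready to link train no more than 20 ms after PERST# has deasserted.
Bitstreams for today's large FPGAs are big, so the time from board power-up to the end of FPGA configuration far exceeds the 100 ms requirement, and the host PC then fails to detect the PCIe device.
Xilinx's PCIe Tandem feature (see PG156 for details) is designed specifically so that a PCIe device can enumerate within 100 ms.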
[Figure]
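To see why a large bitstream misses this window, the rough arithmetic can be sketched in Python. The 120 ms total simply adds the two figures from the specification text quoted above. The bitstream sizes, the 4-bit bus width, and the 50 MHz configuration clock are hypothetical placeholders, not values taken from this design.

```python
# All sizes and rates below are illustrative placeholders, not measured values.
PERST_DEASSERT_MS = 100.0    # power good -> PERST# deassert (from the quoted spec text)
LINK_TRAIN_READY_MS = 20.0   # PERST# deassert -> port ready to link train


def config_time_ms(bitstream_bytes: int, bus_bits: int, clock_hz: float) -> float:
    """Ideal transfer time: total bits divided by bits moved per second."""
    bits = bitstream_bytes * 8
    return bits / (bus_bits * clock_hz) * 1000.0


def meets_budget(load_ms: float, overhead_ms: float = 0.0) -> bool:
    """True if load time plus other boot overhead fits in the 100 ms + 20 ms window."""
    return load_ms + overhead_ms <= PERST_DEASSERT_MS + LINK_TRAIN_READY_MS


# Hypothetical 32 MiB full bitstream versus a hypothetical 2 MiB first-stage image.
full_ms = config_time_ms(32 * 1024 * 1024, bus_bits=4, clock_hz=50e6)
stage1_ms = config_time_ms(2 * 1024 * 1024, bus_bits=4, clock_hz=50e6)
print(f"full bitstream: {full_ms:.1f} ms, fits: {meets_budget(full_ms)}")
print(f"stage 1 only:   {stage1_ms:.1f} ms, fits: {meets_budget(stage1_ms)}")
```

With these placeholder numbers the full bitstream needs about 1342 ms, while the small first-stage image needs about 84 ms. That gap is what Tandem is meant to close.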
Zynq UltraScale+ MPSoC is Xilinx's second-generation multiprocessing SoC. Its PL provides a high-performance PCIe Gen3 IP core for users.
[Figure]
As the figure below shows, an MPSoC loads its boot image differently from a pure FPGA device: the PS first runs the contents of the BootROM, and then loads the FSBL, the bitstream, and the remaining images in order.
[Figure: MPSoC boot image loading sequence]
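As the figure above shows, booting an MPSoC is considerably more complex than configuring a pure FPGA device, and the two differ substantially. This article therefore describes how to implement PL PCIe Tandem loading on an MPSoC device so that it meets the 100 ms device load-time requirement of the PCIe specification.
This design flow was refined by James Shen, building on a method provided by Xilinx AE Iris Yang, and was verified on a board.
Follow the steps below: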
[Figure]
1. The test environment is a ZCU106 V1.1 board with Vivado 2019.1.
2. The PS DDR4 DIMM on the ZCU106 was changed at some point, so for the new DIMM the parameters must be corrected as shown below. Otherwise the system will not boot.
[Figures: three screenshots of the PS DDR4 parameter settings]
3. Build the PCIe XDMA design in the PL.
[Figure: PCIe XDMA design in the PL]
4. Configure the XDMA IP to match the ZCU106 board hardware.
[Figures: two screenshots of the XDMA configuration]
5. Modify the XDC constraints, using the Xilinx PCIe example design as the reference.
[Figure: XDC constraint changes]
6. Configure the ZCU106 QSPI and raise its clock frequency to 300 MHz.
[Figures: two screenshots of the QSPI and clock configuration]
7. Select Tandem PROM in the XDMA configuration GUI.
[Figure: Tandem PROM setting in the XDMA GUI]
8. Set the related constraints in the XDC file.
[Figure: related XDC constraints]
9. Modify the clock-calculation values in xfsbl_qspi.c.
[Figure: changes to xfsbl_qspi.c]
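The effect of the QSPI clock settings in steps 6 and 9 can be estimated with a short Python sketch. It assumes that the 300 MHz in step 6 is the QSPI reference clock and that the controller divides it by a power of two between 2 and 256. Both are assumptions for illustration, as are the flash clock limits used below. The actual values changed in xfsbl_qspi.c are not reproduced here.

```python
# Assumed divider choices: powers of two from 2 to 256.
PRESCALERS = [2, 4, 8, 16, 32, 64, 128, 256]


def pick_prescaler(ref_clk_hz: float, max_sclk_hz: float) -> int:
    """Return the smallest divider whose output clock does not exceed max_sclk_hz."""
    for div in PRESCALERS:
        if ref_clk_hz / div <= max_sclk_hz:
            return div
    raise ValueError("no prescaler brings SCLK under the limit")


def quad_read_mb_per_s(sclk_hz: float, data_lines: int = 4) -> float:
    """Ideal payload rate in MB/s, ignoring command, address, and dummy cycles."""
    return sclk_hz * data_lines / 8 / 1e6


REF_CLK_HZ = 300e6  # assumed meaning of the 300 MHz in step 6
for flash_limit_hz in (100e6, 150e6):  # hypothetical flash SCLK limits
    div = pick_prescaler(REF_CLK_HZ, flash_limit_hz)
    sclk = REF_CLK_HZ / div
    print(f"limit {flash_limit_hz / 1e6:.0f} MHz -> divide by {div}, "
          f"SCLK {sclk / 1e6:.0f} MHz, about {quad_read_mb_per_s(sclk):.1f} MB/s")
```

A smaller divider shortens the time needed to read the first-stage bitstream from flash, provided the flash device and the board routing support the resulting clock.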
10. Remove debug and other print output to save load time.
[Figure: debug print settings]
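A quick estimate shows why print output matters in step 10. The Python sketch below assumes a UART running at 115200 baud with 8 data bits plus one start and one stop bit, and assumes the boot code waits until its characters have been sent. The 600-character log length is a hypothetical figure.

```python
def uart_print_time_ms(n_chars: int, baud: int = 115200, bits_per_char: int = 10) -> float:
    """Time to shift n_chars out of a UART; each character costs bits_per_char bit times."""
    return n_chars * bits_per_char / baud * 1000.0


# Hypothetical boot log of 600 characters.
print(f"{uart_print_time_ms(600):.1f} ms")
```

Under these assumptions, 600 characters take roughly 52 ms, which is more than half of the 100 ms window.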
11. Modify xfsbl_partition_load.c so that it supports loading the two-stage bitstream files.
[Figures: four screenshots of the changes to xfsbl_partition_load.c]
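The reason for loading the bitstream in two stages in step 11 can be illustrated with a simple timeline model in Python. The phase names and every duration below are hypothetical and should be replaced with measurements from real hardware. The sketch only shows that the PCIe block becomes available when the first stage finishes, well before the rest of the PL is configured.

```python
from dataclasses import dataclass


@dataclass
class Phase:
    name: str
    duration_ms: float


def finish_times(phases):
    """Return {phase name: cumulative completion time in ms} for phases run in order."""
    t = 0.0
    out = {}
    for p in phases:
        t += p.duration_ms
        out[p.name] = t
    return out


# Hypothetical durations, for illustration only.
boot = [
    Phase("bootrom", 20.0),
    Phase("fsbl_init", 15.0),
    Phase("stage1_bitstream", 45.0),   # PCIe block and the logic it needs
    Phase("stage2_bitstream", 400.0),  # remainder of the PL design
]
done = finish_times(boot)
DEADLINE_MS = 100.0
print("PCIe ready at", done["stage1_bitstream"], "ms")
print("within deadline:", done["stage1_bitstream"] <= DEADLINE_MS)
print("full PL ready at", done["stage2_bitstream"], "ms")
```

In this model the order of the phases is what matters: anything placed before the first-stage bitstream eats into the enumeration window, while anything placed after it does not.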
12. Set up the ZCU106 board hardware as shown.
[Figures: three images of the ZCU106 hardware settings]
13. Generate the boot image.
[Figures: three screenshots of boot image generation]
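For step 13, a boot image description (BIF) that lists the FSBL followed by two PL partitions might look like the text produced by the Python sketch below. The file names are hypothetical, and the layout is an assumption about how a two-stage image could be described for bootgen. It is not taken from the author's project, so check it against the Bootgen documentation before use.

```python
def make_bif(fsbl_elf: str, stage1_bit: str, stage2_bit: str,
             name: str = "the_ROM_image") -> str:
    """Build BIF text: FSBL first, then the two PL bitstream partitions in load order."""
    lines = [
        f"{name}:",
        "{",
        f"  [bootloader, destination_cpu=a53-0] {fsbl_elf}",
        f"  [destination_device=pl] {stage1_bit}",
        f"  [destination_device=pl] {stage2_bit}",
        "}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical file names.
bif_text = make_bif("fsbl.elf", "stage1.bit", "stage2.bit")
print(bif_text)
```

Keeping the first-stage bitstream directly after the FSBL reflects the goal of the whole flow: the PCIe logic is configured as early as possible.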
14. Download the image to the board using the method that suits the ZCU106 hardware. This completes the procedure.
[Figure: downloading the image to the board]
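Following the flow and requirements above, testing on actual hardware confirmed that the design meets the requirement for PCIe enumeration within 100 ms. It is offered here for reference.
If you would like to discuss issues of this kind or need verification on an actual project, please contact us: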