Hi eamicheal, I tried to build a PYNQ image for the ZCU102, but the bitstream would not load for me; the status light stayed red, and I got a boot partition error even after the PYNQ image generated successfully. So I asked my mentor to provide me a ZCU104, and I got it. Still, it would be helpful if you could share your procedure for building a PYNQ image for the ZCU102. Also, let me know which errors you encountered; if I have already solved any of them, I could help. For your information, I am currently working on the ZCU104 and have seen a lot of errors of a similar kind.
I want to run RISC-V on a Xilinx ZCU102 board. I have looked at various sites, but they provide code for other specific boards, and I am not quite sure how to port it. Since I am a beginner, can you suggest a starting point?
Sorry for the ambiguity. I want to run a RISC-V SoC platform such as lowRISC on an FPGA, but the code in their GitHub repository is targeted at the Nexys 4 DDR board, and I am running into issues converting it to the ZCU102. So I was asking whether there are steps I can follow, such as a list of the interfaces that need to be changed.
You can try Instant SoC from FPGA Cores. The compiler builds an SoC, including a RISC-V processor and peripherals such as UARTs and I2Cs, directly from C++; all peripherals are defined as C++ objects, which makes it very easy to use. I have mostly used it on Artix parts to interface with the AXI-Stream ports on the Ethernet cores.
The defect detection network contains multiple Cross Channel Normalization layers. To support these layers on hardware, the 'LRNBlockGeneration' property of the conv module needs to be turned on in the bitstream used for FPGA inference. The shipping zcu102_single bitstream does not have this property turned on. A new bitstream can be generated using the following lines of code. The generated bitstream can then be used with a dlhdl.Workflow object for inference.
When creating a dlhdl.ProcessorConfig object for an existing shipping bitstream, make sure that the bitstream name matches the data type and the FPGA board that you are targeting. In this example the target FPGA board is the Xilinx ZCU102 SoC board and the data type is single. Update the processor configuration with 'LRNBlockGeneration' turned on and 'SegmentationBlockGeneration' turned off. Turn the latter off to fit the Deep Learning IP on the FPGA and avoid overutilization of resources.
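As a sketch, the configuration steps described above map onto the Deep Learning HDL Toolbox API like this (the property and function names follow the text above; the call to dlhdl.buildProcessor regenerates the bitstream):

    % Start from the shipping single-precision ZCU102 reference configuration.
    hPC = dlhdl.ProcessorConfig('Bitstream','zcu102_single');
    % Turn LRN block generation on and segmentation block generation off.
    hPC.setModuleProperty('conv','LRNBlockGeneration','on');
    hPC.setModuleProperty('conv','SegmentationBlockGeneration','off');
    % Build the custom bitstream from the updated processor configuration.
    dlhdl.buildProcessor(hPC);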
Create an object of the dlhdl.Workflow class. When you create the object, specify the network and the bitstream name. Make sure to use the generated bitstream, which enables processing of Cross Channel Normalization layers on the FPGA. Specify the saved pretrained neural network, snet_defnet, as the network.
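A minimal sketch of the workflow creation; 'dlprocessor.bit' is a placeholder for the generated bitstream file, and hTarget is a dlhdl.Target object (its creation is described later in this section):

    hW = dlhdl.Workflow('Network',snet_defnet, ...
        'Bitstream','dlprocessor.bit', ...   % placeholder bitstream name
        'Target',hTarget);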
To deploy the network on the Xilinx ZCU102 SoC hardware, run the deploy function of the dlhdl.Workflow object. This function uses the output of the compile function to program the FPGA board by using the programming file. It also downloads the network weights and biases. The deploy function starts programming the FPGA device and displays progress messages and the time it takes to deploy the network.
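Assuming the workflow object hW from the sketch above, the compile and deploy steps look like this:

    hW.compile;   % generate the weights, biases, and instructions for the DL processor
    hW.deploy;    % program the FPGA and download the network parameters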
Load an image from the attached testImages folder and resize the image to match the network image input layer dimensions. Run the predict function of the dlhdl.Workflow object to retrieve and display the defect prediction from the FPGA.
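A sketch of the inference step; the file name is hypothetical, not a file from this data set:

    img = imread(fullfile('testImages','defect_sample.png'));   % hypothetical file name
    inputSize = hW.Network.Layers(1).InputSize(1:2);   % assumes the image input layer is first
    img = imresize(img, inputSize);
    prediction = hW.predict(single(img), 'Profile', 'on');
    [~, idx] = max(prediction);   % index of the predicted class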
Create an object of the dlhdl.Workflow class. When you create the object, specify the network and the bitstream name. Make sure to use the generated bitstream, which enables processing of Cross Channel Normalization layers on the FPGA. Specify the saved pretrained neural network, trainedBlemDetNet, as the network.
The trainedBlemDetNet network improves performance to 45 frames per second. The target performance of the deployed network is 100 frames per second while staying within the target resource utilization budget. The resource utilization budget takes into consideration parameters such as memory size and onboard IO. While you can increase the resource utilization budget by choosing a larger board, doing so increases the cost. Instead, improve the deployed network performance and stay within the resource utilization budget by quantizing the network. Quantize and deploy the trainedBlemDetNet network.
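As a sketch, the quantization object can be created with dlquantizer, targeting the FPGA execution environment (assuming the pretrained network is in the workspace):

    dlQuantObj = dlquantizer(trainedBlemDetNet, 'ExecutionEnvironment', 'FPGA');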
Load the data set as an image datastore. The imageDatastore labels the images based on folder names and stores the data. Divide the data into calibration and validation data sets. Use 50% of the images for calibration and 50% of the images for validation. Expedite the calibration and validation process by using a subset of the calibration and validation image sets.
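A sketch of the data preparation; the folder name and subset sizes are placeholders:

    imds = imageDatastore('blemishDataset', ...   % hypothetical folder name
        'IncludeSubfolders', true, 'LabelSource', 'foldernames');
    % 50% of the images for calibration, 50% for validation.
    [calData, valData] = splitEachLabel(imds, 0.5, 'randomized');
    % Use subsets to expedite the calibration and validation process.
    calData = calData.subset(1:20);
    valData = valData.subset(1:5);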
Use the calibrate function to exercise the network with sample inputs and collect range information. The calibrate function collects the dynamic ranges of the weights and biases in the convolution and fully connected layers of the network, and the dynamic ranges of the activations in all layers. It returns a table in which each row contains range information for a learnable parameter of the quantized network.
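Assuming the dlQuantObj and calData objects from the sketches above:

    calResults = dlQuantObj.calibrate(calData);   % table of ranges, one row per learnable parameter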
The trainedBlemDetNet network contains a Cross Channel Normalization layer. To support this layer on hardware, the 'LRNBlockGeneration' property of the conv module needs to be turned on in the bitstream used for FPGA inference. The shipping zcu102_int8 bitstream does not have this property turned on. A new bitstream can be generated using the following lines of code. The generated bitstream can then be used with a dlhdl.Workflow object for inference.
When creating a dlhdl.ProcessorConfig object for an existing shipping bitstream, make sure that the bitstream name matches the data type and the FPGA board that you are targeting. In this example the target FPGA board is the Xilinx ZCU102 SoC board and the data type is int8. Update the processor configuration with 'LRNBlockGeneration' turned on and 'SegmentationBlockGeneration' turned off. Turn the latter off to fit the Deep Learning IP on the FPGA and avoid overutilization of resources.
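The sketch is the same as for the single-precision bitstream, except that it starts from the int8 reference configuration:

    hPC = dlhdl.ProcessorConfig('Bitstream','zcu102_int8');
    hPC.setModuleProperty('conv','LRNBlockGeneration','on');
    hPC.setModuleProperty('conv','SegmentationBlockGeneration','off');
    dlhdl.buildProcessor(hPC);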
Create an object of the dlhdl.Workflow class. When you create the object, specify the network and the bitstream name. Make sure to use the newly generated bitstream, which enables processing of Cross Channel Normalization layers on the FPGA. Specify dlQuantObj, the quantized version of the pretrained trainedBlemDetNet network, as the network.
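A sketch, with 'dlprocessor.bit' again standing in for the name of the newly generated bitstream:

    hW = dlhdl.Workflow('Network',dlQuantObj, ...
        'Bitstream','dlprocessor.bit', ...   % placeholder bitstream name
        'Target',hTarget);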
To test that the quantized network can identify all test cases, load an additional image, resize the image to match the network image input layer dimensions, and run the predict function of the dlhdl.Workflow object to retrieve and display the defect prediction from the FPGA.
Define the target FPGA board programming interface by using the dlhdl.Target object. Create a programming interface with a custom name for your target device and an Ethernet interface to connect the target device to the host computer.
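A minimal sketch; the vendor and interface names follow the text above, and any additional name-value options (such as a device address) depend on your setup:

    hTarget = dlhdl.Target('Xilinx', 'Interface', 'Ethernet');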
The lane detection network contains multiple cross-channel normalization layers. To support these layers on hardware, enable the LRNBlockGeneration property of the conv module in the bitstream that you use for FPGA inference. The shipping zcu102_single bitstream does not have this property enabled. A new bitstream can be generated using the following lines of code. The generated bitstream can then be used with a dlhdl.Workflow object for inference.
When you create a dlhdl.ProcessorConfig object for a reference bitstream, make sure that the bitstream name matches the data type and the FPGA board that you are targeting. In this example the target FPGA board is the Xilinx ZCU102 SoC board and the data type is single. Update the processor configuration with the LRNBlockGeneration property enabled and the SegmentationBlockGeneration property disabled. Disabling the SegmentationBlockGeneration property ensures that the Deep Learning IP fits on the FPGA and avoids overuse of resources. If you are targeting the Xilinx ZC706 board, replace 'zcu102_single' with 'zc706_single' in the first command.
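The corresponding sketch, with the ZC706 alternative noted in a comment:

    hPC = dlhdl.ProcessorConfig('Bitstream','zcu102_single');   % use 'zc706_single' for the ZC706
    hPC.setModuleProperty('conv','LRNBlockGeneration','on');
    hPC.setModuleProperty('conv','SegmentationBlockGeneration','off');
    dlhdl.buildProcessor(hPC);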
To deploy the network on the Xilinx Zynq UltraScale+ MPSoC ZCU102 hardware, run the deploy method of the dlhdl.Workflow object. This method programs the FPGA board by using the output of the compile method and the programming file, downloads the network weights and biases, and displays progress messages and the time it takes to deploy the network.
To reduce the time required to design a custom deep learning network that meets performance requirements, analyze layer-level latencies before deploying the network. Compare deep learning network performance on custom bitstream processor configurations to performance on reference (shipping) bitstream processor configurations.
To retrieve the zcu102_single bitstream configuration, use the dlhdl.ProcessorConfig object. For more information, see dlhdl.ProcessorConfig. To learn about modifiable parameters of the processor configuration, see getModuleProperty and setModuleProperty.
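For example, a sketch that retrieves the reference configuration, reads one conv module property, and estimates layer-level performance for a network (snet_defnet is the network from earlier in this section; the exact getModuleProperty and estimatePerformance signatures may differ in your release):

    hPC = dlhdl.ProcessorConfig('Bitstream','zcu102_single');
    hPC.getModuleProperty('conv','LRNBlockGeneration')   % read one conv module property
    hPC.estimatePerformance(snet_defnet)                 % estimated layer-level latencies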