FPGA Bitstreams, Headless Configuration and Expansion

An introduction to how the Parallella board FPGA can be configured for “headless” operation (no HDMI controller) and alternative expansion.

The Parallella’s Zynq chip provides not only a dual-core ARM processor and peripheral interfaces such as Ethernet and USB, but also a field-programmable gate array (FPGA) that greatly increases the flexibility of the platform. In Zynq parlance these are termed the processing system (PS) and programmable logic (PL) respectively, and their close integration brings many benefits.

Communication with the Epiphany accelerator is via an eLink interface, which on the host side is implemented in the FPGA. The FPGA is also used to implement an HDMI controller and to provide up to 48 pins of general purpose I/O (GPIO) and, as we’ll see, this configuration can easily be changed.

The configuration bitstream

On power-up the FPGA is configured for a particular design via a bitstream file stored in the boot partition of the Micro SD card, named parallella.bit.bin. Simply changing this file for another and rebooting is all that it takes to reconfigure the FPGA, although the Linux kernel must also be configured for the available devices and this is done by updating a second file, devicetree.dtb.

The parallella-bin repository contains releases of all three files that must be located in the boot partition: the FPGA bitstream, Linux kernel and device tree. At present a default configuration is provided with HDMI support, along with a headless release without HDMI.
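As a rough illustration of the file swap described above, here is a minimal Python sketch that copies a release's bitstream and device tree into a mounted boot partition, keeping a backup of anything it replaces. The two file names come from this post. The directory paths in the usage comment and the helper name install_boot_files are hypothetical, and the kernel image is left out because its file name is not given here.

```python
import shutil
from pathlib import Path

# File names as given in the post. The kernel image name is not stated there,
# so it is deliberately not part of the default list.
BOOT_FILES = ("parallella.bit.bin", "devicetree.dtb")


def install_boot_files(release_dir, boot_dir, names=BOOT_FILES):
    """Copy each named file from release_dir into boot_dir.

    Any file that would be overwritten is first saved alongside as <name>.bak.
    Returns the list of installed paths.
    """
    release_dir, boot_dir = Path(release_dir), Path(boot_dir)

    # Refuse a partial update: the bitstream and device tree should match.
    missing = [n for n in names if not (release_dir / n).is_file()]
    if missing:
        raise FileNotFoundError(f"missing in {release_dir}: {', '.join(missing)}")

    installed = []
    for name in names:
        target = boot_dir / name
        if target.exists():
            shutil.copyfile(target, boot_dir / (name + ".bak"))
        shutil.copyfile(release_dir / name, target)
        installed.append(target)
    return installed


# Example call (hypothetical paths, adjust to your own download and mount point):
# install_boot_files("headless_release", "/media/boot")
```

A reboot is still needed afterwards for the new bitstream and device tree to take effect.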

Running headless

With the HDMI configuration I measured the current consumption to be 1038mA, as can be seen above. Changing to the headless configuration brought this down by almost 100mA, to 948mA.

This is by no means the only way to save power, and other options and techniques are available. However, it shows that applications which do not require HDMI can shave almost half a watt off the power consumption straight away. Removing the HDMI controller also frees up space in the FPGA for implementing new custom capabilities.
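The arithmetic behind the "almost half a watt" figure can be sketched in a few lines of Python. The two current readings are the ones quoted above. The 5 V supply voltage is an assumption on my part (the measurements are taken to be at the board's DC input), so treat the wattages as approximate.

```python
# Assumption: the currents were measured at a 5 V supply input.
SUPPLY_VOLTS = 5.0


def power_watts(current_ma, volts=SUPPLY_VOLTS):
    """Convert a current in milliamps to power in watts at the given voltage."""
    return current_ma / 1000.0 * volts


hdmi_ma = 1038      # measured with the default HDMI configuration
headless_ma = 948   # measured with the headless configuration

saving_ma = hdmi_ma - headless_ma
print(f"HDMI:     {power_watts(hdmi_ma):.2f} W")
print(f"Headless: {power_watts(headless_ma):.2f} W")
print(f"Saving:   {saving_ma} mA, about {power_watts(saving_ma):.2f} W")
```

Under that assumption the 90mA difference works out to roughly 0.45 W.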

Configuring expansion

Detail of the version.v file from the parallella_7020_headless project

The pre-built bitstreams currently in the parallella-bin repository provide 24 “bits” of differential I/O that is not connected to the PS (ARM host).

The pre-built bitstreams in the parallella-hw repository provide 48 (Z-7020) or 24 (Z-7010) pins of single-ended GPIO, which are routed to the PEC_FPGA expansion connector and can be accessed from Linux via, for example, the sysfs filesystem (/sys/class/gpio).
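For anyone who has not used the sysfs GPIO interface before, the following is a minimal Python sketch of driving and reading a pin through /sys/class/gpio. The export, direction and value files are the standard kernel sysfs GPIO entries. The GPIO number in the usage comment is a placeholder only: the numbers that correspond to the PEC_FPGA pins depend on the kernel and device tree in use and are not given in this post. The helper names are my own.

```python
from pathlib import Path

GPIO_ROOT = Path("/sys/class/gpio")


def export_gpio(number, root=GPIO_ROOT):
    """Ask the kernel to expose GPIO `number` and return its sysfs directory."""
    root = Path(root)
    pin_dir = root / f"gpio{number}"
    if not pin_dir.exists():
        # Writing the number to 'export' makes the gpioN directory appear.
        (root / "export").write_text(str(number))
    return pin_dir


def write_gpio(number, value, root=GPIO_ROOT):
    """Configure the pin as an output and drive it high (truthy) or low."""
    pin_dir = export_gpio(number, root)
    (pin_dir / "direction").write_text("out")
    (pin_dir / "value").write_text("1" if value else "0")


def read_gpio(number, root=GPIO_ROOT):
    """Configure the pin as an input and return its level as 0 or 1."""
    pin_dir = export_gpio(number, root)
    (pin_dir / "direction").write_text("in")
    return int((pin_dir / "value").read_text().strip())


# Example (placeholder GPIO number, usually needs root privileges):
# write_gpio(54, True)
# print(read_gpio(55))
```

The root parameter exists only so that the functions can be exercised against an ordinary directory without hardware attached.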

The FPGA can also be configured for differential I/O instead of single-ended, whereby a pair of pins is used to convey each signal. This improves resistance to noise and enables interfaces to be run at much higher data rates, the trade-off being that twice as many pins are required per signal. This is configured in the version.v file of the FPGA projects in the parallella-hw repository. If FEATURE_GPIO_DIFF is defined and IOSTD_GPIO is set to LVDS_25 prior to building, you will get differential GPIOs (half the number of single-ended ones).
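To make the pin-count trade-off concrete, here is a tiny Python sketch. The per-device single-ended counts are those quoted above. The differential flag merely stands in for FEATURE_GPIO_DIFF being defined, and nothing here reads or writes the real version.v file.

```python
# Single-ended GPIO pin counts quoted in this post for each Zynq device.
SINGLE_ENDED_PINS = {"Z-7020": 48, "Z-7010": 24}


def gpio_signals(device, differential=False):
    """Return how many independent GPIO signals a build would offer."""
    pins = SINGLE_ENDED_PINS[device]
    # A differential signal occupies a pair of pins, so the count halves.
    return pins // 2 if differential else pins


for device in SINGLE_ENDED_PINS:
    print(device,
          gpio_signals(device), "single-ended or",
          gpio_signals(device, differential=True), "differential")
```

For the Z-7020 this gives 24 differential signals, which agrees with the 24 “bits” of differential I/O mentioned for the parallella-bin bitstreams.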

Note that HDMI projects remain to be added to the parallella-hw repo and defining HDMI in a headless project will not result in a bitstream with HDMI support being built. Also, any changes to projects should be reflected in the VERSION_VALUE defined in version.v, as there may be tools which read the FPGA register value and make assumptions about what connections are present (see versions.txt).

Finally, a peripheral-rich “megaio” configuration is also in the works, which will provide a second Ethernet port (external PHY required), I2C and UART ports, plus SPI and CAN bus.

Building the bitstream

Building an FPGA project using PlanAhead

The FPGA projects are built using the PlanAhead tool and the free-of-charge ISE WebPACK Design Software is sufficient. Depending on which O/S you are using there may be some post-installation steps required, and as a Debian user I found this guide from Aram Kalaydzhyan most helpful, substituting 14.6 with 14.7 when installing the most recent version of ISE.

To build a project simply open it in PlanAhead and select Generate Bitstream. For further details see the project’s README.md file.

Many more possibilities

This post has served simply as a high level exploration of the presently available FPGA configurations, and as such it has not even scratched the surface of what is possible in terms of implementing custom high-speed interfaces and peripherals etc.

Andrew

With thanks to Fred @ Adapteva for clarification on numerous points.

[Edit: corrections for -bin vs. -hw repo bitstreams and added note about VERSION_VALUE define.]

11 Comments

sam says:

May 13, 2014 at 12:29 pm

Nice, I guess that helps with heat as well. Any chance that future software releases could come in full and headless versions? The headless version would not need X11 and so could be a lot slimmer.
AugustoRighetto says:

May 14, 2014 at 1:51 am

Hey Andrew,

I’ve found this document (http://www.xilinx.com/support/documentation/ip_documentation/pcie_7x/v3_0/pg054-7series-pcie.pdf) in Zynq-7000 documentation base.
Is it possible to use the GPIOs available on the Parallella to connect to a PCI Express v3.0 device?

Regards,

Augusto
Andrew Back says:

May 14, 2014 at 1:59 am

I’ve created a headless Debian 7.0 image ( http://elinux.org/Parallella_Debian ), and I guess we could have both options for the Linaro Ubuntu one, similar to how Ubuntu official have Desktop and Server editions. We’ll give some thought to this!
Fred H says:

May 16, 2014 at 9:39 am

Hi Augusto,

Not directly. The two Zynq devices we support on the Parallella board are the 7Z010 and 7Z020, and they do not have the gigabit transceivers needed to support PCIe. The GPIO pins can support no more than 1 gigabit per second per pair, whereas PCIe starts (Gen-1) at 2.5 gigabits per second. It would be possible, however, to make a daughtercard with an intermediary chip to work between the Parallella and PCIe.

We do have a PCIe board on our roadmap, so keep an eye out for an announcement in a few months! Cheers, Fred
Alex says:

May 20, 2014 at 5:02 pm

Would PCIe be off GPIO pins from the Zynq, or directly off the epiphany link? Just thinking about the latency and throughput implications if the need for the data is in the ecores.
fhuettig says:

May 20, 2014 at 5:31 pm

There has to be something with some smarts between the eLink pins and PCIe, as the interface and protocol are completely different. However, that ‘something’ can be done in ways that keep the latency to a minimum. For example, if we used a Zynq on such a board (we might), the data does not necessarily have to go through the ARM processors or even through the Zynq’s SDRAM. There can be a path from a DMA controller directly to the eLinks.
greytery says:

May 31, 2014 at 5:29 pm

Headless FPGA configurations imply a smaller memory footprint for the OS, which means more space in the 1GB DRAM available to share with the Epiphany. The current allocation is 32MB, which is built into the FPGA code. It is easy to increase the amount of shared memory, up to 512MB (and even beyond), but that requires adjusting the FPGA mapping from the (legacy) 32MB allocation.
The headless Debian build is smaller than the standard Ubuntu image, and the minimum Linaro Nano build (http://elinux.org/Parallella_Linaro_Nano) is ~23MB, so plenty of scope for opening the memory gate and still having a powerful Linux OS to interface to the Epiphany.
Spooky – the answer is 42!
Jonathan says:

June 6, 2014 at 11:03 am

That’s great! Mine went from 1040mA down to 780mA when I changed to the headless configuration, so just below 4 Watts!! (it seems to go up to about 850mA under load..)
I’m running Andrew’s Debian…
Jonathan says:

June 6, 2014 at 11:10 am

Oh, and the memory footprint of that system is just 15MB!

                     total   used   free   shared   buffers   cached
Mem:                   969     33    935        0         4       13
-/+ buffers/cache:             15    953
greytery says:

June 6, 2014 at 4:13 pm

15MB is really Low-FAT! Thanks – will definitely try that (as well as the nano build)!
Can’t get my ‘head’ round why we need to have a Full-Fat Linux on the Parallella – that makes it just a souped up Pi. Why not just get 2 or 3 * Pi’s? It’s cheaper!
The ‘natural’ configuration for a Parallella is headless, maybe as part of a cluster.
The Linux OS is there to shovel coal into the boiler.
Frank DG1SBG says:

August 23, 2015 at 6:10 am

Hi –

So, a few months passed since those last messages here. I am really eager to learn more about that PCIe board “down the road”. Any hints / news / plans / roadmap ?

Thx!

Regards
Frank