Cooling and Monitoring the Temperature


While the Parallella board is capable of delivering industry-leading GFLOPS/watt performance (doing more with less energy), it is a small and densely populated board, so it is vital that adequate cooling is provided. Without it, there may be temporary or even permanent failures.

Kickstarter boards and those ordered via the Adapteva shop prior to 10th July were supplied with a heatsink for the Zynq chip only and must always be used with a fan. Very little airflow is required: a gentle current of air across the heatsink is sufficient, and the fan need not be at all powerful.

Boards ordered since 10th July come supplied with a much larger heatsink which is affixed to both the Zynq and Epiphany chips. This configuration can be used without a fan provided that the board is vertically positioned (on its side). If placed horizontally or in an enclosure, a fan must be used.

It has been asked whether the larger “slab” heatsink can be used with pre-10th July boards. Unfortunately this is not possible, due to the placement of some larger capacitors on those boards, unless you have access to a machine shop and are able to mill or drill recesses in the heatsink at the correct positions to accommodate the taller components.

Many variables

Of course, there are many variables that can affect the heat generated by the board and in turn the effectiveness of any heatsink and fan. For example, the:

  • workload, both on the ARM host and Epiphany accelerator;
  • size of the FPGA design (larger “non-standard” designs will generate additional heat);
  • use of an enclosure;
  • ambient temperature.

As such, it is important to monitor the actual temperature of the Zynq chip to see whether additional cooling is required. It should also be one of the first things to check if you are experiencing failures of any sort.

xtemp

The xtemp utility can be used to monitor the temperature of the Zynq chip. This can be built and installed from source, or installed via the parallella-utils package.

There are presently two ways of installing the package.

Quick install as a one-off

$ wget https://launchpad.net/~parallella/+archive/ubuntu/snapshots/+files/parallella-utils_0.0%2B1SNAPSHOT20140710~trusty1_armhf.deb
$ sudo dpkg -i parallella-utils_0.0+1SNAPSHOT20140710~trusty1_armhf.deb

Longer install which gets automatic updates

1. Add the Parallella Snapshots PPA to /etc/apt/sources.list:

deb http://ppa.launchpad.net/parallella/snapshots/ubuntu trusty main
deb-src http://ppa.launchpad.net/parallella/snapshots/ubuntu trusty main

2. Add the signing key to your APT keyring:

$ sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 39A6ED25

3. Update APT

$ sudo apt-get update

4. Install parallella-utils

$ sudo apt-get install parallella-utils

Note that the Snapshots PPA is for development releases of packages. In due course there will also be a Releases PPA with more thoroughly tested releases.

Running the utility is simply a matter of typing “xtemp” at the shell prompt. If you are connected to your board via SSH, be sure to have X11 forwarding enabled (for example, by connecting with ssh -X).

It’s difficult to provide absolute figures at this point. Any temperature below 85°C is probably OK, but it is advisable to keep the reported temperature under 70°C. If you’re experiencing failures and the temperature is above 70°C, it may well be that additional cooling is required.
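
For unattended checks, the 70 and 85 degree figures above can be turned into a small shell helper. This is only a sketch and is not part of xtemp. It assumes that your kernel exposes the Zynq XADC temperature through the Linux IIO sysfs interface at the path shown, which may not be true of every Parallella image. The names classify_temp, read_temp_c, WARN_C, CRIT_C and IIO_DIR are made up for this illustration.

```shell
#!/bin/sh
# Thresholds taken from the article: aim to stay under 70C; below 85C is probably OK.
WARN_C=70
CRIT_C=85

# ASSUMPTION: the XADC is exposed via IIO sysfs at this location on your image.
IIO_DIR="${IIO_DIR:-/sys/bus/iio/devices/iio:device0}"

# classify_temp TEMP_C
# Print "ok", "warning" or "critical" for a whole-degree Celsius reading.
classify_temp() {
    temp_c="$1"
    if [ "$temp_c" -ge "$CRIT_C" ]; then
        echo critical
    elif [ "$temp_c" -ge "$WARN_C" ]; then
        echo warning
    else
        echo ok
    fi
}

# read_temp_c
# Print the current temperature in whole degrees Celsius (truncated).
# ASSUMPTION: the IIO convention (raw + offset) * scale = millidegrees Celsius applies.
read_temp_c() {
    raw=$(cat "$IIO_DIR/in_temp0_raw") || return 1
    offset=$(cat "$IIO_DIR/in_temp0_offset") || return 1
    scale=$(cat "$IIO_DIR/in_temp0_scale") || return 1
    # awk is used because the scale is usually a fractional number.
    awk -v r="$raw" -v o="$offset" -v s="$scale" \
        'BEGIN { printf "%d\n", (r + o) * s / 1000 }'
}
```

With these functions loaded, something like classify_temp "$(read_temp_c)" could be run from cron or a monitoring wrapper, which then decides what to do on “warning” or “critical”. The thresholds are kept as variables so they can be adjusted once firmer figures are available.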

Andrew

7 Comments

  • Thanks for this – knowing the safe temperature range is critical of course!
    But given the consequences of overheating, it would be hugely helpful if you could provide or point to a program that could run in the background and automatically send an alert or reboot or kill processes or do something to keep it from overheating and destroying the system.
    Perhaps thermald would work. See Kernel/PowerManagement/ThermalIssues – Ubuntu Wiki: https://wiki.ubuntu.com/Kernel/PowerManagement/ThermalIssues

  • Ville says:

    This command did not work for me:

    sudo apt-key adv –keyserver keyserver.ubuntu.com–recv-keys 39A6ED25

    It errored out with a message ending with “gpg: Conflicting Commands”

    I ended up going to the keyserver.ubuntu.com website and searching for “parallella”. The search returned the key (39A6ED25) and I was able to copy and paste it to a file (key.txt) and run:

    sudo apt-key add key.txt

  • Josh Tollefson says:

    I modified xtemp to have a command line switch that will just report the temp. I then created a Nagios wrapper script/command that monitors the temperature via the values returned from xtemp. Now that I have safe range values I can put warning/critical values in it.

  • oere says:

    You have to use sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 39A6ED25
    Long command-line options take two dashes (--), with no whitespace between them, in front of keyserver and recv-keys.

  • Andrew Back says:

    Thanks for pointing out the formatting error. Now fixed!

  • Tom says:

    I get unknown option –keyserver with latest hdmi desktop build

  • Tim O'Connor says:

    Hi Josh,

    I am actually after this exact functionality if you wouldn’t mind sharing how you achieved it.

    Cheers,

    Tim
