Parallel Programming With Parallella

Parallel programming can be very difficult, but it can also be very, very simple. Executing a thousand programs that don’t depend on each other in any way is not that difficult if the right infrastructure is available. An operating system performs multiple tasks concurrently all day long without us noticing. Sure, sometimes we’ll run into performance problems or all-out thrashing, but for the most part it works great.

For larger clusters, there have always been job schedulers to help distribute and balance batch workloads across servers. I remember using the LSF (“Load Sharing Facility”) package to abuse our compute clusters with random design verification tests and power simulations on the TigerSHARC DSPs over a decade ago. In a sense this was a trivial type of parallel programming. A user would send out a set of almost identical simulation jobs to the job scheduler, with the only difference being the seed used to simulate the chip.
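To make that usage model concrete, here is a minimal sketch in Python of a batch of jobs that differ only in their seed. It is not the LSF interface and not a real chip simulator: `run_simulation` and `run_batch` are hypothetical names, the "simulation" is a toy random trial, and a thread pool stands in for the cluster nodes.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def run_simulation(seed):
    """Hypothetical stand-in for one simulation job; only the seed differs."""
    rng = random.Random(seed)  # private generator, so jobs share no state
    # Toy workload: count how many of 1000 random trials "fail".
    return sum(1 for _ in range(1000) if rng.random() < 0.01)


def run_batch(seeds, workers=4):
    """Run one job per seed and return {seed: failure_count}."""
    seeds = list(seeds)
    # The thread pool plays the role of the job scheduler's worker nodes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(zip(seeds, pool.map(run_simulation, seeds)))


if __name__ == "__main__":
    for seed, failures in run_batch(range(8)).items():
        print(f"seed={seed} failures={failures}")
```

Because each job builds its own random generator from its seed, the result for a given seed is the same no matter how many workers run the batch or in what order the jobs finish. That independence is what makes this kind of workload easy to spread across machines.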

The simplicity of this “job scheduling” parallel usage model was always in the back of my mind as I started the Epiphany and later the Parallella project. Wouldn’t it be nice if there were a utility that allowed programmers to easily take advantage of massive parallelism to scale out performance for low-hanging-fruit problems that are “embarrassingly parallel”?

During the summer I worked with our summer interns to put together a simple job scheduling demo. Well…to be honest, it was more like me sketching some half-baked ideas on the board and then swiftly abandoning them for a much-needed one-week vacation. When I came back, the students had a basic version working, and it only took one more week of work (under the guidance of Yaniv) to produce the demo you see in the video! Certainly, the outcome is mostly a testament to the hard work and talent of our interns, but in some small way it also demonstrates the power and ease of use of the Epiphany and Parallella platforms.

The key Parallella platform features that enabled this project include:

  • Well-documented Epiphany features that allow programs to be monitored and scheduled at each core.
  • The ability to run completely independent programs on each Epiphany core.
  • Low-latency communication between the ARM processor and the Epiphany coprocessor.
  • A stable Linux distribution that runs on a very capable dual-core ARM A9 processor.

The video demo in this post shows multiple independent applications running on the dual-core ARM processor, with small independent kernel tasks being launched to different Epiphany cores based on availability. The Epiphany resource manager (“ERM”) runs in the background and continuously monitors the Epiphany network traffic and core workload, displaying the Epiphany status through a simple Java-based app.
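The "launch to whichever core is free" idea can be sketched in a few lines of Python. This is only an illustration of availability-based dispatch, not the ERM source and not the Epiphany SDK: the cores are simulated with threads, `dispatch` and `NUM_CORES` are hypothetical names, and the monitoring of network traffic and workload is left out entirely.

```python
import queue
import threading

NUM_CORES = 16  # assumption for illustration; set to match your board


def dispatch(tasks, num_cores=NUM_CORES):
    """Run each task on whichever simulated core is free.

    Returns a list of (core_id, result) pairs in the order tasks were given.
    """
    tasks = list(tasks)
    free_cores = queue.Queue()
    for core_id in range(num_cores):
        free_cores.put(core_id)  # every core starts out idle

    results = [None] * len(tasks)
    threads = []

    def run_on_core(index, task, core_id):
        try:
            results[index] = (core_id, task())
        finally:
            free_cores.put(core_id)  # mark the core as available again

    for index, task in enumerate(tasks):
        core_id = free_cores.get()  # blocks until some core is idle
        thread = threading.Thread(target=run_on_core, args=(index, task, core_id))
        thread.start()
        threads.append(thread)

    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    jobs = [lambda n=n: n * n for n in range(40)]
    for core_id, value in dispatch(jobs, num_cores=4):
        print(f"core {core_id} -> {value}")
```

The queue of idle core IDs is the whole scheduling policy here: a task waits only when every core is busy, and a core goes back into the queue the moment its task finishes. A real resource manager would also need to load the program onto the core and collect status from it, which this sketch does not attempt.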

The full project source code for the demo can be found on GitHub:


Perhaps the most amazing part of the story is that our interns didn’t even know C before they started at Adapteva six weeks earlier!

(from left to right: Xin Mao, Wenlin Song, and Kevin Cheng)

I am confident that stories like this will play out time after time across the globe over the coming year once thousands of Parallella boards reach developers’ hands. Put an open platform in the hands of clever, hard-working developers, and magic happens!



