Parallel Programming With Parallella

Parallel programming can be very difficult, but it can also be very, very simple. Executing a thousand programs that don’t depend on each other in any way is not that difficult if the right infrastructure is available. An operating system performs multiple tasks concurrently all day long without us noticing. Sure, sometimes we’ll run into performance problems or all-out thrashing, but for the most part it works great.

For larger clusters, there have always been job schedulers to help distribute and balance batch workloads across servers. I remember using the LSF (“Load Sharing Facility”) package to abuse our compute clusters with random design verification tests and power simulations on the TigerSHARC DSPs over a decade ago. In a sense this was a trivial type of parallel programming. A user would send out a set of almost identical simulation jobs to the job scheduler, with the only difference being the random seed used to simulate the chip.
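
As a rough illustration of that usage model (not the LSF workflow itself), the sketch below submits the same job several times with only the seed changed. Everything in it is an assumption made for the example: `run_simulation` is a hypothetical stand-in for a real chip simulation, and a local thread pool stands in for the cluster scheduler.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def run_simulation(seed: int) -> float:
    """Hypothetical stand-in for one simulation job; only the seed differs."""
    rng = random.Random(seed)  # private generator, so jobs share no state
    # Pretend the "result" is the mean of 1000 random stimulus values.
    return sum(rng.random() for _ in range(1000)) / 1000


def run_batch(seeds):
    """Submit one job per seed and collect the results keyed by seed."""
    seeds = list(seeds)
    # The pool plays the role of the job scheduler. In CPython a process pool
    # (or a real cluster) would be needed for an actual CPU speed-up.
    with ThreadPoolExecutor(max_workers=4) as pool:
        # map() returns results in input order, so they line up with seeds.
        return dict(zip(seeds, pool.map(run_simulation, seeds)))


if __name__ == "__main__":
    for seed, value in run_batch(range(8)).items():
        print(f"seed={seed} result={value:.4f}")
```

Because the jobs never talk to each other, the only coordination needed is handing out seeds and collecting results, which is what makes this kind of workload so easy to scale.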

The simplicity of this “job scheduling” parallel usage model was always in the back of my mind as I started the Epiphany and later the Parallella project. Wouldn’t it be nice if there was a utility that allowed programmers to easily take advantage of massive parallelism to scale out performance for low-hanging-fruit problems that are “embarrassingly parallel”?

During the summer I worked with our summer interns to put together a simple job scheduling demo. Well…to be honest, it was more like me sketching some half-baked ideas on the board and then swiftly abandoning them for a much-needed one-week vacation. When I came back, the students had a basic version working, and it only took one more week of work (under the guidance of Yaniv) to produce the demo you see in the video! Certainly, the outcome is mostly a testament to the hard work and talent of our interns, but in some small way it also demonstrates the power and ease of use of the Epiphany and Parallella platforms.

The key Parallella platform features that enabled this project include:

  • Well-documented Epiphany features that allow monitoring and scheduling of programs on each core.
  • The ability to run completely independent programs on each Epiphany core.
  • Low-latency communication between the ARM processor and the Epiphany coprocessor.
  • A stable Linux distribution that runs on a very capable dual-core ARM A9 processor.

The video demo in this post shows multiple independent applications running on the dual-core ARM processor, with small independent kernel tasks being launched to different Epiphany cores based on availability. The Epiphany resource manager (“ERM”) runs in the background and continuously monitors the Epiphany network traffic and core workload, displaying the Epiphany status through a simple Java-based app.
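
To make the "launch on whichever core is available" idea concrete, here is a minimal host-side simulation. It is not the ERM code and uses no Epiphany SDK calls: `CorePool`, `run_kernel`, and `dispatch` are hypothetical names, the 16-core count is an assumption, and a plain Python function stands in for loading and running a kernel on a core.

```python
import queue
import threading

NUM_CORES = 16  # assumption: a 16-core Epiphany grid


class CorePool:
    """Hypothetical resource manager: hands out idle core ids, takes them back."""

    def __init__(self, num_cores):
        self._free = queue.Queue()
        for core_id in range(num_cores):
            self._free.put(core_id)

    def acquire(self):
        # Blocks until a core is idle. Queue.get() is thread-safe, so two
        # applications are never handed the same core at the same time.
        return self._free.get()

    def release(self, core_id):
        self._free.put(core_id)


def run_kernel(pool, task, arg, log, log_lock):
    """One 'application': claim a core, run a small task, give the core back."""
    core_id = pool.acquire()
    try:
        result = task(arg)  # stand-in for running a kernel on core_id
        with log_lock:
            log.append((core_id, arg, result))
    finally:
        pool.release(core_id)


def dispatch(tasks, num_cores=NUM_CORES):
    """Run every (function, argument) pair on whichever core frees up first."""
    pool = CorePool(num_cores)
    log, log_lock = [], threading.Lock()
    threads = [
        threading.Thread(target=run_kernel, args=(pool, fn, arg, log, log_lock))
        for fn, arg in tasks
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log


def square(n):
    return n * n


if __name__ == "__main__":
    jobs = [(square, n) for n in range(40)]
    for core_id, arg, result in sorted(dispatch(jobs), key=lambda r: r[1]):
        print(f"core {core_id:2d}: square({arg}) = {result}")
```

The free-core queue is the whole scheduler in this sketch: claiming a core and marking it busy happen in one step, so tasks simply wait their turn when all cores are occupied.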

The full project source code for the demo can be found on GitHub:

  • https://github.com/adapteva/epiphany-examples/tree/master/erm
  • https://github.com/adapteva/epiphany-examples/tree/master/erm_example

Perhaps the most amazing part of the story is that our interns didn’t even know C before they started at Adapteva six weeks earlier!

(from left to right: Xin Mao, Wenlin Song, and Kevin Cheng)

I am confident that stories like this will play out time after time across the globe over the coming year once thousands of Parallella boards reach developers’ hands. Put an open platform in the hands of clever, hard-working developers, and magic happens!

Cheers,

Andreas

6 Comments

  • phil.pidgeon says:

    The source for this has disappeared.

  • phil.pidgeon says:

    found them now…..thanks

  • phil.pidgeon says:

    found in epiphany-examples-2014.11.zip\apps\erm and erm-examples

  • Jacob says:

    Awesome, I’ve ordered a board, hoping to get it soon!

  • adam adam says:

    Can someone actually fix this tool? It has a few issues:
    1) Colors: between green and red there should be yellow 🙂
    2) Having hard-coded paths in epiphany-examples that point to the desktop of some particular user, and that are spread all over the erm sources, is bad. Plus build.sh doesn’t even build the Java code.
    3) Faulty design – the app consists of two modules: one gathers raw data and writes it to the file system, and the second is a Java UI that rapidly parses the file and updates the view accordingly. Well, since I have not seen any stats yet, it looks like we’re taking some nondeterministically spaced samples, writing them to a file, and then reading those samples back at a nondeterministic time and displaying them. It more or less blinks, since it doesn’t have any quality data, not even an average over a time span. There is also absolutely zero checking for writes past array boundaries.

    Now for the good part – thank you for pointing out that monitoring the eMesh is possible (well more in Epiphany IV since E3 lacks uses of any transfer). I know it’s there in the Epiphany architecture, but it’s really easy to miss, and basically this board has like zero tools for monitoring usage.

    The great part is the realization of the Parallella board – I’ve been waiting to get my hands on something like that since my fun days with the Cell Broadband Engine, which I miss so badly – it’s awesome to have you guys. It’s a great piece of hardware for fun, learning, and testing.
    Plus that small local storage of 32 KB 🙂 That just forces users to do one of two things: use a high-level library and lose most of the fun, or learn some parallel programming skills and possibly try their hand at the actor model approach.

    There are some issues of course, like the immaturity of the tools, but the sources are there, except for maybe figuring out the GCC and Linux patches, where I’d like to have an up-to-date GCC with C++17 support, or at least C++14, for Epiphany. I like the simplicity of C, but I like strong typing even better: writing code that is readable and inlines to basically the same code as the C version.

    Thanks again guys – thanks to you all I have something fun and interesting to do in my spare time.
    I have been having plenty of fun since day 1 with Parallella: while I was soldering a fan to the board the phone rang, and, well, my board fell on the floor with lots of things on top of it and the SD slot broke off the board. That was when I had to figure out how to boot it from the network, and boy, at that time the Linux kernel branch was a mess and didn’t have Epiphany support enabled -.- Well, it took me a while to figure it out, enable network support, and write a script to boot the kernel from TFTP, so that I don’t have to walk over to the board again and again just to reconfigure something for a test. U-Boot is nice for that: it allows you to set up the Ethernet, fetch a file over TFTP, and execute it as a U-Boot script – yay, secure as never :D. Back then I was using TFTP + NFS; now I’m using TFTP + USB HDD, which sometimes doesn’t get properly initialized during boot. Maybe it’s something with the power regulator or the USB reset sequence during Linux kernel boot, but it more or less randomly starts with USB initialized or not. Hmm, maybe a timing issue; there appear to be lots of those. One is with CPU frequency scaling, which seems to affect Ethernet, so until this is fixed I advise not enabling it in the kernel. The second one is in the eMesh, where once in a while the 32-bit AXI just can’t get the upper 32 bits right for the mailbox.

  • adam adam says:

    Epiphany resource manager – umm, what?!
    It’s not like you can atomically request access to a resource, so it’s pretty much pointless to ask unless you have a fairly strong guarantee that no one will steal what you’re looking at.
    Basically, unless you plan your system to use multiple apps and split them, you’ll get into trouble.
    What you would like is the ability for computations to auto-magically schedule themselves on whatever grid is available under a given system configuration – the actor model seems to go that way, so maybe Erlang could be put to use? Just a hint 😉
