OpenSHMEM for Epiphany


The Epiphany coprocessor has 16 CPU cores, but they are configured differently from the 16 cores you might find in an Intel Xeon processor. The Xeon is configured as a symmetric multiprocessor (SMP) [0], in which all cores share access to a single main memory. Programming models like OpenMP and OpenCL are better suited to SMP architectures. In contrast, each Epiphany core has its own local memory, and the local memory of every other core is remotely addressable. This configuration has more in common with a distributed cluster and may be programmed with distributed-memory programming models or partitioned global address space (PGAS) [1] models.

Many PGAS languages have been developed, but the primary C/C++ library used in high-performance computing (HPC) has been SHMEM, which was developed by Cray in the 1990s. More recently, the OpenSHMEM committee has standardized the interface to improve compatibility across platforms. The current version, OpenSHMEM 1.3 [2], includes symmetric memory management, remote memory access routines (including non-blocking variants), atomic memory operations, collective routines, point-to-point synchronization, and distributed locking routines.

The US Army Research Laboratory (ARL) has developed ARL OpenSHMEM for Epiphany [3] and released the project as open source software on GitHub [4]. The library can be used as a replacement for the inter-core communication operations in the device-side Epiphany SDK library (e-lib) [5]. Few routines are directly comparable between OpenSHMEM and e-lib; the exception is remote memory copying, where OpenSHMEM outperforms the e-lib implementation by approximately 2-10x across all array sizes. OpenSHMEM also provides convenient interfaces for common communication patterns and operations not found in e-lib. The 2D topology of the Epiphany network-on-chip is abstracted to a single 1D topology in OpenSHMEM with no significant performance impact. Additional performance details appear in the paper [3]. Many of the OpenSHMEM test codes included in the repository also serve as microbenchmarks.
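To make the 2D-to-1D abstraction concrete, here is a minimal sketch of how mesh coordinates on a 4 x 4, 16-core device could be flattened into a single processing element (PE) index and back. The row-major ordering and the names `mesh_coord`, `coord_to_pe`, `pe_to_coord`, and `ring_next_pe` are assumptions made for illustration; they are not taken from the ARL library or from e-lib, whose actual mapping may differ.

```c
#include <assert.h>

/* Assumed mesh dimensions for a 16-core device (4 rows x 4 columns). */
enum { MESH_ROWS = 4, MESH_COLS = 4, NUM_PES = MESH_ROWS * MESH_COLS };

/* Hypothetical type: position of a core in the 2D mesh. */
typedef struct {
    int row;
    int col;
} mesh_coord;

/* Flatten a 2D coordinate into a 1D PE index (row-major order assumed). */
int coord_to_pe(mesh_coord c)
{
    assert(c.row >= 0 && c.row < MESH_ROWS);
    assert(c.col >= 0 && c.col < MESH_COLS);
    return c.row * MESH_COLS + c.col;
}

/* Recover the 2D coordinate from a 1D PE index. */
mesh_coord pe_to_coord(int pe)
{
    assert(pe >= 0 && pe < NUM_PES);
    mesh_coord c = { pe / MESH_COLS, pe % MESH_COLS };
    return c;
}

/* Next PE in a 1D ring, wrapping from the last PE back to PE 0. */
int ring_next_pe(int pe)
{
    assert(pe >= 0 && pe < NUM_PES);
    return (pe + 1) % NUM_PES;
}
```

With a mapping of this kind, application code addresses its peers only by PE number (for example, `ring_next_pe(my_pe)` for a ring exchange) and never needs to handle row and column coordinates directly.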

Please use the library in your Parallella projects and submit bug reports and feature requests on GitHub [4]! In addition to the included tests and examples, there are many OpenSHMEM example codes available online and in the OpenSHMEM documentation [2].

[0] Symmetric multiprocessing (Wikipedia)
[1] Partitioned global address space (Wikipedia)
[2] OpenSHMEM 1.3 Specification (openshmem.org)
[3] Ross J., Richie D. (2016) An OpenSHMEM Implementation for the Adapteva Epiphany Coprocessor. In: Gorentla Venkata M., Imam N., Pophale S., Mintz T. (eds) OpenSHMEM and Related Technologies. Enhancing OpenSHMEM for Hybrid Environments. OpenSHMEM 2016. Lecture Notes in Computer Science, vol 10007. Springer, Cham (online, arXiv)
[4] ARL OpenSHMEM for Epiphany (GitHub)
[5] adapteva/epiphany-libs/e-lib (GitHub)