The COPRTHR® SDK provides libraries and tools for developers targeting accelerators and co-processors. The SDK is freely available under the open-source GPLv3 license. COPRTHR version 1.6 has been updated to support Parallella. In this blog post I present a quick overview of the support for Parallella provided by the SDK.
The SDK provides several APIs for programming Parallella, allowing developers to choose the approach that best meets the needs of their project.
STDCL® is a powerful API for targeting accelerators such as GPUs and Intel MIC processors. For Parallella, STDCL provides a simple API for targeting the Epiphany co-processor using a compute offload programming model. The STDCL implementation leverages OpenCL for portability while providing simpler and more intuitive semantics. STDCL greatly reduces the amount of code required for a particular application and may be used directly in C, C++ and Fortran applications. The STDCL API is recommended for programmers looking for an intuitive API that is easy to use and integrates with a standard compilation model and workflow.
Basic OpenCL support is provided for both the ARM CPU and Epiphany co-processor. The OpenCL implementations are not conformant or complete, but nevertheless provide a practical means of programming Parallella for those familiar with OpenCL. When using the OpenCL API it is important to remember that the Epiphany RISC array is not a GPU, and OpenCL code written for a GPU is unlikely to deliver the best performance if it is simply recycled. A portable API does not imply portable performance.
The new low-level COPRTHR API exports the low-level code used to implement STDCL and OpenCL for programmers interested in a more direct and precise API for the Epiphany co-processor. The low-level API supports malloc(3) based device memory allocation, immediate operations, and scheduled stream and thread programming models. The thread model is a direct extension of Pthreads for co-processors.
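For readers less familiar with Pthreads, the fork/join pattern that the thread model is said to extend can be sketched in plain C. This is a minimal host-side illustration using only standard POSIX calls (`pthread_create`, `pthread_join`); it does not use or reproduce the COPRTHR API, and the names `parallel_scale`, `scale_slice`, `struct slice` and `NUM_THREADS` are hypothetical, chosen only for this example.

```c
#include <pthread.h>
#include <stddef.h>

#define NUM_THREADS 4   /* hypothetical worker count for this sketch */

/* Per-thread work description: a contiguous slice of the array. */
struct slice {
    float *data;
    size_t begin;
    size_t end;
    float factor;
};

/* Thread entry point: scale every element in the assigned slice. */
static void *scale_slice(void *arg)
{
    struct slice *s = arg;
    for (size_t i = s->begin; i < s->end; i++)
        s->data[i] *= s->factor;
    return NULL;
}

/* Fork NUM_THREADS workers over data[0..n) and join them.
   Returns 0 on success or the first pthread_create error code. */
int parallel_scale(float *data, size_t n, float factor)
{
    pthread_t tid[NUM_THREADS];
    struct slice work[NUM_THREADS];
    size_t chunk = (n + NUM_THREADS - 1) / NUM_THREADS;
    int started = 0;
    int rc = 0;

    for (int t = 0; t < NUM_THREADS; t++) {
        size_t begin = (size_t)t * chunk;
        size_t end = begin + chunk;
        if (begin > n) begin = n;   /* clamp slices past the end */
        if (end > n) end = n;
        work[t] = (struct slice){ data, begin, end, factor };
        rc = pthread_create(&tid[t], NULL, scale_slice, &work[t]);
        if (rc != 0)
            break;                  /* stop launching, join what started */
        started++;
    }

    /* Join only the threads that were actually created. */
    for (int t = 0; t < started; t++)
        pthread_join(tid[t], NULL);

    return rc;
}
```

Calling `parallel_scale(v, n, 2.0f)` on a host array doubles each element using four worker threads. On a co-processor the same create/join structure would also have to account for where the data lives, which is presumably where device memory allocation comes into the low-level API.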
Standard Compilation Model
The SDK provides a set of compiler tools that support a standard compilation model and workflow familiar to most programmers. The clcc compiler tool may be used to pre-compile kernels for a range of architectures including Parallella, allowing co-processor device code to be linked directly with host code in a single executable or shared library. The embedded kernels are directly accessible using the STDCL API. The following tools are provided:
clcc – compiler front-end for pre-compiling kernels targeting one or more co-processor devices.
clld – linker allowing recursive linking of individual object files with pre-compiled kernels as well as other binary object manipulation.
clnm – displays embedded kernel symbol information similar to the UNIX
As an example of how easy it is to build applications for Parallella using these tools, the following commands are all that are necessary to build and run a heterogeneous application that uses both the ARM and Epiphany processors:
] clcc -k -o my_kernels.o my_kernel_1.cl my_kernel_2.cl my_kernel_3.cl
] gcc -o my_program.x my_host_code.c my_kernels.o
A few additional libraries are provided by the SDK, including libocl.so and libclrpc.so, both of which may be of interest to Parallella programmers. The libocl.so library is a direct replacement for the conventional OpenCL loader libOpenCL.so with support for more precise platform configuration, call intercept hooks, and CLRPC server support. The libclrpc.so library provides support for networked devices using a Remote Procedure Call (RPC) implementation of OpenCL.
The following resources may be helpful in providing more information: