by ysapir » Sun Apr 27, 2014 7:55 pm
@tnchan - the article analyzes the 2D FFT under certain assumptions. One of them is that the whole image is stored locally on the chip, with no DRAM access. This mode of operation limits the maximum image size to 128x128 (grayscale, complex-float pixels). Another key point in the chosen method was to perform each 1D FFT on a single core. The formula was developed for this use case, and may need some adaptation for other schemes.
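A rough sketch of the arithmetic behind that size limit may help. The core count (16) and the 32 KB of local memory per core are assumptions about the chip, not figures taken from the article, and the helper names are made up for illustration.

```python
# Back-of-the-envelope memory footprint for an on-chip 2D FFT.
# ASSUMPTIONS (not from the article): 16 cores, 32 KB local memory per core.
BYTES_PER_PIXEL = 8              # complex float: two 32-bit floats
CORES = 16                       # assumed core count
LOCAL_MEM_PER_CORE = 32 * 1024   # assumed local memory per core, in bytes


def image_bytes(side):
    """Total bytes needed for a side x side complex-float image."""
    return side * side * BYTES_PER_PIXEL


def per_core_share(side):
    """Bytes each core holds if the image is split evenly across cores."""
    return image_bytes(side) // CORES


for side in (64, 128, 256):
    share = per_core_share(side)
    fraction = share / LOCAL_MEM_PER_CORE
    print(side, image_bytes(side), share, fraction)
```

Under these assumptions a 128x128 image takes a quarter of each core's local memory, while 256x256 would fill it completely, leaving nothing for code, stack, twiddle factors or working buffers.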
It was quite some time ago, but IIRC, the current Epiphany chip can do up to a 1024-point FFT on a single core. Thus, if you allow DRAM access, you could do a 1024x1024 image in (1024/16=) 64 batches. Furthermore, one can perform a 1D FFT using a 2D FFT engine. This means that, by adding a simple intermediate stage, you could use the given method to perform a 128x128=16K-point FFT in a similar time.
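To illustrate the 1D-via-2D idea, here is a minimal sketch in plain Python. It uses the standard Cooley-Tukey "four-step" decomposition, in which the intermediate stage is a twiddle-factor multiplication between the row and column passes; whether that is exactly the stage meant above is an assumption. The function names are hypothetical, and a naive DFT stands in for the per-core 1D FFT kernel.

```python
import cmath


def dft(seq):
    """Naive O(n^2) DFT; a stand-in for the per-core 1D FFT kernel."""
    n = len(seq)
    return [
        sum(seq[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def fft_via_2d(x, n1, n2):
    """Length n1*n2 DFT computed as row DFTs, twiddles, column DFTs."""
    n = n1 * n2
    if len(x) != n:
        raise ValueError("len(x) must equal n1 * n2")

    # Step 1: view the input as an n1 x n2 matrix, a[r][c] = x[r + n1*c].
    a = [[x[r + n1 * c] for c in range(n2)] for r in range(n1)]

    # Step 2: transform every row (length n2).
    b = [dft(row) for row in a]

    # Step 3: the intermediate stage, multiplying by exp(-2*pi*i*r*c/n).
    for r in range(n1):
        for c in range(n2):
            b[r][c] *= cmath.exp(-2j * cmath.pi * r * c / n)

    # Step 4: transform every column (length n1).
    cols = [dft([b[r][c] for r in range(n1)]) for c in range(n2)]

    # Step 5: read out in transposed order, X[n2*k1 + k2] = cols[k2][k1].
    return [cols[k % n2][k // n2] for k in range(n)]
```

Calling `fft_via_2d(x, 128, 128)` on a 16384-element list would correspond to the 16K-point case, though the naive `dft` here is far too slow for that size; small shapes such as `fft_via_2d(x, 4, 4)` are enough to check the result against `dft(x)`.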