SPI code is slow



AT91_SPI_Driver::nWrite8_nRead8 and AT91_SPI_Driver::nWrite16_nRead16 are not thread safe and interrupts during SPI write/read operations will force the system into a HARD_BREAKPOINT(). This can be fixed this calling GLOBAL_LOCK() right after entering those functions.
Overall, the SPI Driver is painfully slow. Writing 8 bit with 10MHz SCLK induces up to 80% overhead. Setting every GPIO Pin to Input -> Alternate Mode -> Output might be a good idea if the pins are multiplexed. But using SPI controlled displays or other performance dependent devices reveals some major drawbacks.
Closed Sep 11, 2014 at 10:45 PM by dankuo
The AT91 driver is no longer under active development. We encourage any users to submit a fix if they have one.


Highfield wrote Sep 9, 2011 at 8:06 AM

I don't see any serious overhead in the SPI, unless you send just one byte at once.
If you try to transfer a byte array at once, the performance is very high, with almost no overhead.
BTW, it is true that it is not thread-safe, but I'd prefer an interruption, instead of hooking the rest of the program during a long transfer.
IMHO, the break-ability of the SPI transfer would not be an issue. It is a synchronous protocol, so there's no problem.

Also, would you explain better what's the GPIO pins workaround? I don't have clear what's the meaning.
Thank you and cheers.

empower wrote Sep 11, 2011 at 7:33 AM

Obviously I didn't raise the original issue but I may be able to answer your question. There is no workaround for this issue.
In the current AT91_SPI_Driver code the pins are switched from their default GPIO function to their alternate (in this case SPI) function for each transaction and afterwards they are switched back to their GPIO function. This Switching back and forth makes small transaction of a few bytes take about 5 times longer than necessary.
In many applications this would be inconsequencial but there are genuine cases where the added overhead just kills the performance.
It appears the driver was written with the anticipation that the user may want to use the pins as GPIO as well as SPI pins. This may well be a valid approach but there must be a better way to maintain this capability without the performance hit of initialising the port for every transaction.

empower wrote Sep 11, 2011 at 10:01 AM

I just did some simple benchmarking on my netduino plus. At 1MHz SPI Clock it should take 32microsecnds to transfer 4 bytes. The execution time for a 4 byte transaction was 331 microseconds and Chip select was active for 67microseconds.

Highfield wrote Sep 11, 2011 at 11:35 AM

I totally agree with you about the (useless) overhead. BTW the SPI itself is very fast indeed, the problem is how it is managed before-and-after the actual transfer.
I'd suggest a different approach:
1) the SPI should be "free" if its "Config" is null. In such a condition, any previous setting of shared I/O will become the default;
2) there should be the ability to instantiate a SPI "without" config, i.e. a parameterless constructor;
3) as soon you set any valid SPI.Configuration instance, the I/O behavior should be overriden by the config itself, WITHOUT any useless clearing of SCLK and MOSI.
4) the SPI signals MUST keep their most recent state, unless they are reset to the config "default";
I've checked a few ago, and I've seen that once you dispose the SPI, the related I/Os are switched to InputPorts automatically (pullup-ed). This can lead to another unexpected behavior if your device need the SS active high.
Again, I wonder why the current implementation has to be so cumbersome. I'd bet that the suggested version would take few rooms also.

empower wrote Sep 11, 2011 at 9:55 PM

I am hesitant to be too critical of the original authors because there is more to consider than the performance of this one driver for one core architecture. There may be good reasons for doing things that way and the few comments in the code hint at some of those reasons.
I still think that the port initialisation code belongs in the constructor and destructor functions.
If the port pins are required for their GPIO functions then that could be achieved by constructing and discarding the SPI object for each transaction.

DoingNZ wrote Sep 11, 2011 at 11:40 PM

What may need some consideration is that the SPI is also a "bus" and that more than one device can be present on the SPI., each device with its own CS.
Currently the SPI_CS pin cannot be changed without creation of new SPI configuration object. Addressing two devices on a single SPI HW functional block requires new obects each time SPI read/write happens.

Maybe the SPI needs a thread safe "busy" Waitable object, and the concept of a collection of configurations objects, one collection for each SPI bus, each device with its own SPI configurtion obect. In the simplest situation with only one device on an SPI HW module, the initialisazion happens in the constructor and destructor, otherwise it happens in the Add/Remove on the collection of configuration objects. When addressing an SPI device, the native code can track the current and requested configuration for the SPI hardware and adjust accordingly. So in the case there are two devices on a single SPI, with the same clock polarity, active CS levels, clock speed etc, but with different CS, then the SPI can simply set the correct CS active before doing the transfer and achieve the lowest overhead possible.

When measruing the overhead, one needs to consider the time it takes to go from managed to unmanaged and back.
If you need a driver that must operate very quickly, then might be best to code that in native C/C++.

Highfield wrote Sep 12, 2011 at 4:34 AM

Well, I wouldn't create such a long chat over this section (we may move our discussion somewhere else).
Anyway, if you take a look at the native code, it is clear enough that the things could be better. IMHO, there was some decision over the SPI driver management, and here is the result.
As far as I see, the hardware SPI is involved (init/uninit) just in the moment of any transaction. This is uncommon, because when you "open" a COM, for instance, none else can re-open the same serial until you close it. Here is not.
I would not sacrifice the performance and the "compatibility" of such an important device like the SPI. If any specific platform have to share some pin, it should have its own driver.
Also I re-confirm how I would like to see the SPI: totally similar to a "COM". Open (and none else can open it), change whatever/whenever the config you want, transfer, finally close.
Notice that the current implementation cannot offer any slave-SPI functionality. My proposal does.

wrote Sep 27, 2011 at 1:15 PM

wrote Jan 24, 2012 at 8:19 PM

wrote Apr 2, 2012 at 7:07 PM

wrote Oct 11, 2012 at 8:27 PM

wrote Dec 19, 2012 at 11:33 AM

wrote Jan 30, 2013 at 8:33 AM

wrote Feb 22, 2013 at 12:45 AM

wrote Sep 11, 2014 at 10:45 PM

wrote Sep 11, 2014 at 10:45 PM