I moved the test code for the virtual functions out into its own function to see if that would confuse the optimiser:
uint32_t TimeCalls(CommandHandler *pCommandHandler, uint32_t uIters)
{
    uint32_t uStartTicks = HAL_GetTick();   // tick count before the loop
    for (uint32_t i = 0; i < uIters; i++)   // unsigned index to match uIters
        pCommandHandler->streamData(buffer, 64, false);
    uint32_t uEndTicks = HAL_GetTick();     // tick count after the loop
    return uEndTicks - uStartTicks;         // elapsed ticks
}
and then called:
uVirtualFpgalTime = TimeCalls(pFpgaHandler, 1000);
uVirtualCountTime = TimeCalls(pCountHandler, 1000000);
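For anyone who wants to try the same measurement off-target, here is a minimal host-side sketch of the harness. It is not the code from the project: the CommandHandler interface, the streamData() signature, the CountHandler body and the name TimeCallsHost are all assumptions made for illustration, and std::chrono::steady_clock stands in for HAL_GetTick().

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>

// Hypothetical interface: the real CommandHandler is not shown in the post,
// so this signature is only a guess based on the call site.
struct CommandHandler {
    virtual ~CommandHandler() = default;
    virtual void streamData(const uint8_t *pData, size_t uLength, bool bLast) = 0;
};

// Hypothetical trivial handler that only counts calls. The counter is
// volatile so the optimiser cannot discard the loop body entirely.
struct CountHandler : CommandHandler {
    volatile uint32_t uCalls = 0;
    void streamData(const uint8_t *, size_t, bool) override { uCalls = uCalls + 1; }
};

static uint8_t buffer[64];

// Calls streamData() uIters times through the base-class pointer and
// returns the elapsed wall-clock time in whole milliseconds.
uint32_t TimeCallsHost(CommandHandler *pCommandHandler, uint32_t uIters)
{
    using Clock = std::chrono::steady_clock;
    const auto start = Clock::now();
    for (uint32_t i = 0; i < uIters; i++)
        pCommandHandler->streamData(buffer, sizeof buffer, false);
    const auto end = Clock::now();
    return static_cast<uint32_t>(
        std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count());
}
```

Calling TimeCallsHost(&handler, 1000000) with a CountHandler instance gives a number comparable in spirit to uVirtualCountTime, although a desktop compiler may well make different inlining and devirtualisation decisions from the embedded toolchain.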
Interestingly, for the CountHandler both TimeCalls() and the call to streamData() are inlined; for the FPGAHandler, TimeCalls() is inlined but the call to streamData() is not.
The FPGAHandler streamData() code is evidently large enough that the compiler sees no benefit in inlining it.
So we can take the FPGAHandler timings as the cost of calling a non-inlined virtual function. I also changed the code so that the FPGA write didn't exit when the length was reached, and upped the iterations to 100,000:
inline = 52393ms
virtual = 54396ms
overhead = (54396-52393)/100000 = 0.02003ms per call
method time = 52393/100000 = 0.52393ms per call
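The arithmetic above can be written out as a small helper, which makes it easy to rerun with new measurements. This is only a sketch: the CallCost struct and PerCallCost() are hypothetical names, not part of the project code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: per-call figures derived from two total timings.
struct CallCost {
    double overheadUs;   // extra time per call for the virtual version, in microseconds
    double methodUs;     // time per call for the inlined version, in microseconds
    double overheadPct;  // overhead as a percentage of the inlined per-call time
};

// inlineMs and virtualMs are the total times in milliseconds for uIters calls.
CallCost PerCallCost(uint32_t inlineMs, uint32_t virtualMs, uint32_t uIters)
{
    assert(uIters > 0 && inlineMs > 0 && virtualMs >= inlineMs);
    CallCost cost;
    cost.overheadUs  = (virtualMs - inlineMs) * 1000.0 / uIters;
    cost.methodUs    = inlineMs * 1000.0 / uIters;
    cost.overheadPct = 100.0 * (virtualMs - inlineMs) / inlineMs;
    return cost;
}
```

With the figures above, PerCallCost(52393, 54396, 100000) gives an overhead of 20.03us on top of a 523.93us method time, which is roughly 3.8%.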
So even though we have about 20us of overhead per call compared to the inlined function, it pales into insignificance next to the 524us or so it takes to process the data (under 4%), and I'd guess that will still hold once that code has been improved.
So each 1ms USB packet will incur about 20us of overhead (2% of the packet period) if we use virtual functions.