I've made a benchmark based on the source code of a lesson (lesson6b.cpp) from
www.scratchapixel.com.
There are separate modules for the 3 test modes:
1. Computing using ordinary procedures - lesson6bench.Mod
2. Computing using the compiler's built-in matrix feature - lesson6benchMatrix.Mod
3. Computing using ordinary procedures, but with two of them (cross product and dot product) implemented using SSE - lesson6benchSSE.Mod
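To illustrate the difference between modes 1 and 3, here is a minimal C++ sketch of a dot product and a cross product written both as ordinary procedures and with SSE intrinsics. It is not the code from the .Mod files: the names Vec3, dot_scalar, cross_scalar, dot_sse and cross_sse are hypothetical, and the SSE variants fall back to the scalar versions when SSE is not available at compile time.

```cpp
#if defined(__SSE__) || defined(_M_X64) || defined(_M_IX86)
#include <xmmintrin.h>
#define HAVE_SSE 1
#endif

// Hypothetical 3-component vector type, used only for this sketch.
struct Vec3 { float x, y, z; };

// Ordinary (scalar) procedures, as in mode 1.
float dot_scalar(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

Vec3 cross_scalar(const Vec3& a, const Vec3& b) {
    return { a.y * b.z - a.z * b.y,
             a.z * b.x - a.x * b.z,
             a.x * b.y - a.y * b.x };
}

#ifdef HAVE_SSE
// Load a Vec3 into lanes 0..2 of an SSE register; lane 3 is zero.
static inline __m128 load3(const Vec3& v) {
    return _mm_set_ps(0.0f, v.z, v.y, v.x);
}

// SSE variants, as in mode 3 (assumed layout: x, y, z, 0).
float dot_sse(const Vec3& a, const Vec3& b) {
    __m128 m = _mm_mul_ps(load3(a), load3(b));                    // (xx, yy, zz, 0)
    __m128 shuf = _mm_shuffle_ps(m, m, _MM_SHUFFLE(2, 3, 0, 1));  // (yy, xx, 0, zz)
    __m128 sums = _mm_add_ps(m, shuf);                            // (xx+yy, xx+yy, zz, zz)
    shuf = _mm_movehl_ps(shuf, sums);                             // lane 0 = zz
    sums = _mm_add_ss(sums, shuf);                                // lane 0 = xx+yy+zz
    return _mm_cvtss_f32(sums);
}

Vec3 cross_sse(const Vec3& a, const Vec3& b) {
    __m128 va = load3(a);
    __m128 vb = load3(b);
    // Rotate lanes to (y, z, x, w).
    __m128 a_yzx = _mm_shuffle_ps(va, va, _MM_SHUFFLE(3, 0, 2, 1));
    __m128 b_yzx = _mm_shuffle_ps(vb, vb, _MM_SHUFFLE(3, 0, 2, 1));
    // c holds the result rotated as (cz, cx, cy, 0).
    __m128 c = _mm_sub_ps(_mm_mul_ps(va, b_yzx), _mm_mul_ps(a_yzx, vb));
    // Rotate once more to get (cx, cy, cz, 0).
    c = _mm_shuffle_ps(c, c, _MM_SHUFFLE(3, 0, 2, 1));
    float out[4];
    _mm_storeu_ps(out, c);
    return { out[0], out[1], out[2] };
}
#else
// Fallback for targets without SSE: reuse the scalar procedures.
float dot_sse(const Vec3& a, const Vec3& b) { return dot_scalar(a, b); }
Vec3 cross_sse(const Vec3& a, const Vec3& b) { return cross_scalar(a, b); }
#endif
```

For a = (1, 2, 3) and b = (4, 5, 6) both variants give a dot product of 32 and a cross product of (-3, 6, -3). Each call operates on a single pair of 3-component vectors, so the SSE versions spend part of their time loading and shuffling lanes.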
All test results were obtained on an Intel Pentium Dual-Core E2160 @ 1.8 GHz, WinAos rev. 1578.
1 (lesson6bench.Mod):
Bucket size: 32 Time elapsed: 640
Bucket size: 64 Time elapsed: 329
Bucket size: 128 Time elapsed: 187
Bucket size: 256 Time elapsed: 109
Bucket size: 512 Time elapsed: 79
Bucket size: 1024 Time elapsed: 93
2 (lesson6benchMatrix.Mod):
ArrayBase: setting runtime library (semi-optimized) default methods.
ArrayBaseOptimized: installing runtime library optimizations:ASM SSE SSE2 done.
Bucket size: 32 Time elapsed: 6344
Bucket size: 64 Time elapsed: 3250
Bucket size: 128 Time elapsed: 1765
Bucket size: 256 Time elapsed: 1078
Bucket size: 512 Time elapsed: 813
Bucket size: 1024 Time elapsed: 828
3 (lesson6benchSSE.Mod):
Bucket size: 32 Time elapsed: 672
Bucket size: 64 Time elapsed: 344
Bucket size: 128 Time elapsed: 187
Bucket size: 256 Time elapsed: 109
Bucket size: 512 Time elapsed: 94
Bucket size: 1024 Time elapsed: 78
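
To compare the three modes directly, the timings above can be turned into ratios relative to mode 1. The following C++ sketch only does that arithmetic on the numbers listed above. The names Sample, kResults, ratio and print_ratios are hypothetical, and since the output does not state the time unit, only the unitless ratios are meaningful.

```cpp
#include <array>
#include <cstdio>

// One row per bucket size: elapsed times copied from the results above.
struct Sample {
    int bucket;  // bucket size
    int plain;   // mode 1: ordinary procedures
    int matrix;  // mode 2: built-in matrix feature
    int sse;     // mode 3: SSE cross/dot product
};

constexpr std::array<Sample, 6> kResults = {{
    {32,   640, 6344, 672},
    {64,   329, 3250, 344},
    {128,  187, 1765, 187},
    {256,  109, 1078, 109},
    {512,   79,  813,  94},
    {1024,  93,  828,  78},
}};

// Ratio of two elapsed times (unitless).
double ratio(int numerator, int denominator) {
    return static_cast<double>(numerator) / denominator;
}

// Print how much slower (>1) or faster (<1) modes 2 and 3 are than mode 1.
void print_ratios() {
    std::printf("%8s %14s %14s\n", "bucket", "matrix/plain", "sse/plain");
    for (const Sample& s : kResults) {
        std::printf("%8d %14.2f %14.2f\n",
                    s.bucket, ratio(s.matrix, s.plain), ratio(s.sse, s.plain));
    }
}
```

Calling print_ratios() from main prints one line per bucket size. On these numbers the matrix version takes roughly 9 to 10 times as long as the ordinary procedures, while the SSE version stays within about 20% of them in either direction.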