Oberon Community Platform Forum
Author Topic: www.scratchapixel.com  (Read 26557 times)
staubesv
Administrator
Sr. Member
*****
Posts: 387



« Reply #15 on: September 08, 2008, 12:24:54 PM »

Unofficially, support for vector/matrix operations is already integrated into the Active Oberon language and the compiler (unofficially = it is part of the current release). The compiler heavily optimizes matrix operations using SSE and multiple cores (see I386.ArrayBaseOptimized.Mod).

For the syntax, see http://nativesystems.inf.ethz.ch/pub/Main/FelixFriedrichPublications/ProgrammingMultilinearAlgebra.pdf. (Don't be confused by the lowercase keywords; they are also an unofficial Active Oberon / PACO feature.)

If you want to implement something that does linear algebra computations, it makes sense to use the new language support for that purpose. It makes the code simpler and more compact, and at the same time gives much better performance.

But be aware that "unofficial" also means: it could change, there is no support, etc.
« Last Edit: September 08, 2008, 12:33:03 PM by staubesv » Logged
sage
Full Member
***
Posts: 170



« Reply #16 on: September 08, 2008, 07:14:37 PM »

Quote
Unofficially, support for vector/matrix operations is already integrated into the Active Oberon language and the compiler (unofficially = it is part of the current release). The compiler heavily optimizes matrix operations using SSE and multiple cores (see I386.ArrayBaseOptimized.Mod).

I ran a small performance test, and the results for the built-in compiler feature are not that good.
Code:
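MODULE VecCrossPerfomance;

IMPORT
	SYSTEM, Utilities, Commands, KernelLog, Kernel;

TYPE
	Vector = ARRAY [3] OF REAL;	(* bracketed length: array type of the unofficial language extension *)
	Vector2 = ARRAY 4 OF REAL;	(* ordinary array, padded to four REALs for the SSE version *)

CONST
	N = 100000000;	(* iterations per benchmark loop *)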

(* Write r to the kernel log as text. *)
PROCEDURE LogReal(r: REAL);
VAR
	a: ARRAY 32 OF CHAR;
BEGIN
	Utilities.FloatToStr(r, 0, 16, 0, a);
	KernelLog.String(a)
END LogReal;

(* Cross product c := a x b, written out component by component. *)
PROCEDURE vecCross(VAR c: Vector; CONST a: Vector; CONST b: Vector);
BEGIN
	c[0] := a[1] * b[2] - a[2] * b[1];
	c[1] := a[2] * b[0] - a[0] * b[2];
	c[2] := a[0] * b[1] - a[1] * b[0]
END vecCross;

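(* Cross product c := a x b with SSE, computing all lanes at once.
	SHUFPS with the same register as both operands permutes its lanes:
	201 selects (y, z, x, w) and 210 selects (z, x, y, w), so the result is
	a.yzx * b.zxy - a.zxy * b.yzx. The fourth lane is computed as well but is not used.
	MOVUPS is used for memory access because the operands are not guaranteed to be 16-byte aligned. *)
PROCEDURE vecCrossSSE(VAR c: Vector2; CONST a: Vector2; CONST b: Vector2);
CODE {SYSTEM.i386, SYSTEM.SSE}
	MOV EBX, c[EBP]
	MOV ECX, a[EBP]
	MOV EDX, b[EBP]
	MOVUPS XMM0, [ECX]
	MOVUPS XMM1, [EDX]
	MOVAPS XMM2, XMM0
	MOVAPS XMM3, XMM1
	SHUFPS XMM0, XMM0, 201
	SHUFPS XMM1, XMM1, 210
	SHUFPS XMM2, XMM2, 210
	SHUFPS XMM3, XMM3, 201
	MULPS XMM0, XMM1
	MULPS XMM2, XMM3
	SUBPS XMM0, XMM2
	MOVUPS [EBX], XMM0
END vecCrossSSE;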

PROCEDURE perfomance*(context: Commands.Context);
VAR
	v1, v2, v3, v4: Vector;
	v21, v22, v5: Vector2;
	t: Kernel.MilliTimer;
	i, elapsed1, elapsed2, elapsed3: LONGINT;
BEGIN
	v1[0] := 3.0; v1[1] := 3.0; v1[2] := -5.0;
	v2[0] := 3.0; v2[1] := 6.0; v2[2] := 0.0;
	(* same values for the SSE version; the fourth element is padding and is
		set to zero so that the SSE multiply does not work on uninitialised data *)
	v21[0] := 3.0; v21[1] := 3.0; v21[2] := -5.0; v21[3] := 0.0;
	v22[0] := 3.0; v22[1] := 6.0; v22[2] := 0.0; v22[3] := 0.0;

	(* 1: built-in operator of the language extension *)
	Kernel.SetTimer(t, 0);
	FOR i := 0 TO N - 1 DO
		v3 := v1 * v2
	END;
	elapsed1 := Kernel.Elapsed(t);

	(* 2: ordinary procedure *)
	Kernel.SetTimer(t, 0);
	FOR i := 0 TO N - 1 DO
		vecCross(v4, v1, v2)
	END;
	elapsed2 := Kernel.Elapsed(t);

	(* 3: hand-written SSE procedure *)
	Kernel.SetTimer(t, 0);
	FOR i := 0 TO N - 1 DO
		vecCrossSSE(v5, v21, v22)
	END;
	elapsed3 := Kernel.Elapsed(t);

	(* report elapsed time and result of each variant *)
	KernelLog.Enter;
	KernelLog.Ln;
	KernelLog.String("Computation of vectors' cross product using built-in feature: ");
	KernelLog.Ln;
	KernelLog.Int(elapsed1, 0); KernelLog.String(" ms");
	KernelLog.Ln;
	KernelLog.String("Result: ");
	KernelLog.Ln;
	LogReal(v3[0]);
	LogReal(v3[1]);
	LogReal(v3[2]);
	KernelLog.Ln;
	KernelLog.Ln;
	KernelLog.String("Computation of vectors' cross product using ordinary procedure: ");
	KernelLog.Ln;
	KernelLog.Int(elapsed2, 0); KernelLog.String(" ms");
	KernelLog.Ln;
	KernelLog.String("Result: ");
	KernelLog.Ln;
	LogReal(v4[0]);
	LogReal(v4[1]);
	LogReal(v4[2]);
	KernelLog.Ln;
	KernelLog.Ln;
	KernelLog.String("Computation of vectors' cross product using SSE procedure: ");
	KernelLog.Ln;
	KernelLog.Int(elapsed3, 0); KernelLog.String(" ms");
	KernelLog.Ln;
	KernelLog.String("Result: ");
	KernelLog.Ln;
	LogReal(v5[0]);
	LogReal(v5[1]);
	LogReal(v5[2]);
	KernelLog.Ln;
	KernelLog.Exit;

END perfomance;

END VecCrossPerfomance.

VecCrossPerfomance.perfomance~

SystemTools.Free VecCrossPerfomance~
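
The shuffle constants 201 and 210 in vecCrossSSE are easier to follow with a scalar model. The Python sketch below is not part of the benchmark and only illustrates the lane permutation. The names shufps_same, cross_sse_model and cross_scalar are hypothetical helpers, the fourth lane is padded with 0.0 as an assumption, and the model covers only the case where SHUFPS uses the same register for both operands, as the procedure does.

```python
def shufps_same(src, imm):
    """Model SHUFPS x, x, imm: each 2-bit field of imm picks the source lane."""
    return [src[(imm >> (2 * lane)) & 3] for lane in range(4)]


def cross_sse_model(a, b):
    """Mirror the register steps of vecCrossSSE on 4-element lists."""
    x0 = shufps_same(a, 201)  # a as (y, z, x, w)
    x1 = shufps_same(b, 210)  # b as (z, x, y, w)
    x2 = shufps_same(a, 210)  # a as (z, x, y, w)
    x3 = shufps_same(b, 201)  # b as (y, z, x, w)
    # MULPS, MULPS, SUBPS: lane-wise x0 * x1 - x2 * x3
    return [p * q - r * s for p, q, r, s in zip(x0, x1, x2, x3)]


def cross_scalar(a, b):
    """Component-wise cross product, same formula as vecCross."""
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


# Test vectors from the benchmark; the fourth lane is padding (assumed 0.0).
a = [3.0, 3.0, -5.0, 0.0]
b = [3.0, 6.0, 0.0, 0.0]

print(shufps_same(["x", "y", "z", "w"], 201))  # lanes selected by 201
print(shufps_same(["x", "y", "z", "w"], 210))  # lanes selected by 210
print(cross_sse_model(a, b)[:3])
print(cross_scalar(a, b))
```

For these inputs both functions give (30, -15, 9), which is what the ordinary and the SSE procedure print in the output below; the built-in run prints different first and third components.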

Output (100000000 operations, P4 3.4 GHz, WinAos rev. 1575):
Quote
ArrayBase: setting runtime library (semi-optimized) default methods.
ArrayBaseOptimized: installing runtime library optimizations:ASM SSE SSE2  done.
{P cpuid= 0, pid= 2220
Computation of vectors' cross product using built-in feature:
5125 ms
Result:
  0.0000000000000000 -15.0000000000000000 -9.0000000000000000

Computation of vectors' cross product using ordinary procedure:
2125 ms
Result:
  30.0000000000000000 -15.0000000000000000  9.0000000000000000

Computation of vectors' cross product using SSE procedure:
1109 ms
Result:
  30.0000000000000000 -15.0000000000000000  9.0000000000000000
}
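
To put the timings above in per-operation terms, here is a small Python sketch of the arithmetic. It assumes the CPU actually ran at its nominal 3.4 GHz during the test, and it counts everything inside the loop (call overhead and loop control included), so the figures are rough upper bounds for the cross product itself. The names per_iteration and timings_ms are made up for this illustration.

```python
CLOCK_HZ = 3.4e9   # nominal P4 clock from the post (assumed constant during the run)
N = 100_000_000    # iterations per loop, as in the module

# Elapsed times in milliseconds, copied from the output above.
timings_ms = {
    "built-in feature": 5125,
    "ordinary procedure": 2125,
    "SSE procedure": 1109,
}


def per_iteration(ms):
    """Return (nanoseconds, clock cycles) spent per loop iteration."""
    ns = ms * 1e6 / N
    cycles = ms / 1000 * CLOCK_HZ / N
    return ns, cycles


for name, ms in timings_ms.items():
    ns, cycles = per_iteration(ms)
    speedup = ms / timings_ms["SSE procedure"]
    print(f"{name}: {ns:.2f} ns, about {cycles:.0f} cycles, "
          f"{speedup:.2f}x the SSE time")
```

Under these assumptions the three variants come out at roughly 174, 72 and 38 cycles per iteration, so the SSE procedure is about 4.6 times faster than the built-in feature and about 1.9 times faster than the ordinary procedure.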
« Last Edit: September 09, 2008, 05:00:00 AM by sage » Logged
sage
Full Member
***
Posts: 170



« Reply #17 on: September 09, 2008, 06:02:19 AM »

Quote
Computation of vectors' cross product using SSE procedure:
1109 ms

The results could be even better: when the processor has to cope with unaligned data (as in vecCrossSSE in particular), it spends about 48 cycles just moving data between the XMM registers and memory.
« Last Edit: September 09, 2008, 07:18:33 AM by sage » Logged