Oberon Community Platform Forum
Author Topic: The new WinAos deluding performance  (Read 5488 times)
danp
Newbie
*
Posts: 37


« on: October 08, 2010, 04:06:02 PM »

Hi,

A few weeks ago I decided to convert at least the server part of my framework to the new WinAos version and syntax. I still have all production applications running on the 3 Apr 2007 version.
I was very happy to see the new GC at work, the better memory management, and no more TCP connections left uncollected after close, and I want to thank Felix and all the people who worked hard on this nice project.

I was not very happy with the new XML system, which moved from the ArrayCollection structure to a more Oberon-style (doubly) linked list. The only fault I can see in the old system is the reallocation needed when the array runs out of bounds. On the other hand, an array is much faster to sort and search. Do you know any list sorting algorithm that matches QuickSort performance? But probably you don't use XML for managing data.
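On the sorting question: merge sort is the usual choice for linked lists, since it runs in O(n log n) like QuickSort and only relinks nodes instead of needing random access. Below is a minimal sketch in Go, not the WinAos XML code; the `Node` type, its integer key and the helper functions are hypothetical. For a doubly linked list, the `prev` pointers could be repaired in one extra pass after sorting.

```go
package main

import "fmt"

// Node is a hypothetical singly linked list element holding a sort key.
type Node struct {
	Key  int
	Next *Node
}

// merge splices two sorted lists into one sorted list by relinking nodes.
func merge(a, b *Node) *Node {
	var head Node // dummy head simplifies appending
	tail := &head
	for a != nil && b != nil {
		if a.Key <= b.Key { // <= keeps equal keys in their original order
			tail.Next = a
			a = a.Next
		} else {
			tail.Next = b
			b = b.Next
		}
		tail = tail.Next
	}
	if a != nil {
		tail.Next = a
	} else {
		tail.Next = b
	}
	return head.Next
}

// MergeSort sorts the list in O(n log n) time and returns the new head.
func MergeSort(head *Node) *Node {
	if head == nil || head.Next == nil {
		return head
	}
	// Find the middle with slow/fast pointers and cut the list in two.
	slow, fast := head, head.Next
	for fast != nil && fast.Next != nil {
		slow, fast = slow.Next, fast.Next.Next
	}
	second := slow.Next
	slow.Next = nil
	return merge(MergeSort(head), MergeSort(second))
}

// fromSlice builds a list from keys (demonstration helper only).
func fromSlice(keys []int) *Node {
	var head *Node
	for i := len(keys) - 1; i >= 0; i-- {
		head = &Node{Key: keys[i], Next: head}
	}
	return head
}

// toSlice collects the keys of a list in order (demonstration helper only).
func toSlice(n *Node) []int {
	var out []int
	for ; n != nil; n = n.Next {
		out = append(out, n.Key)
	}
	return out
}

func main() {
	list := fromSlice([]int{5, 1, 4, 2, 3})
	fmt.Println(toSlice(MergeSort(list)))
}
```

Searching is a different matter: a sorted array allows binary search, while a sorted list still has to be walked node by node.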

During the upgrade I noticed longer compilation times, but who cares if the result is better software. The disappointment arrived as soon as I finished and put it to work with real data.
The test job (a piece of a real one) consists of decoding 115078 bytes of binary data and converting it into a result set of 2084 XML records of this type (just adding elements, no other operation):

<recordset>
    <record id="0">
        <rectype>4</rectype>
        <id5>1</id5>
        <id3>1</id3>
        <time type="d">2010-10-02 06:10:15</time>
        <latitude type="n">2680.924600</latitude>
        <longitude type="n">1666.933100</longitude>
        <direction type="i">235</direction>
        <speed type="i">12</speed>
        <height type="n">13</height>
        <km type="n">6841.452</km>
        <ia2 type="i">920</ia2>
        <eec2 type="n">102.00</eec2>
        <eec1 type="n">2210.25</eec1>
        <engine_temperature type="n">82</engine_temperature>
        <engine_hours type="n">1384.30</engine_hours>
        <dd type="n">100.00</dd>
        <lfc type="n">43883.00</lfc>
    </record>

This is the resulting time table:

Concurrent tasks     Old     New
--------------------------------
               1     6 s    15 s
               5    82 s   533 s

While the old system is very constant at 6 s from the very first run, the new one starts with a first iteration of 50-60 s and then settles at 15 s, but from time to time it goes up to 20 s or more, which is very inconsistent indeed.
Are you aware of this drop in performance? I don't understand whether it is all due to the new XML or to the new runtime.

Thanks,
Dan
« Last Edit: October 08, 2010, 04:09:36 PM by danp » Logged
danp
Newbie
*
Posts: 37


« Reply #1 on: October 25, 2010, 08:31:32 PM »

I have run the same test in UnixAos, and these are the results:

Concurrent tasks   WinAos Old   WinAos New   UnixAos 0.91
---------------------------------------------------------
               1          6 s         15 s            3 s
               5         82 s        533 s          111 s

I was amazed, because UnixAos looks very slow in the user interface, but outside of graphics it rocks. It is able to use all 8 processor threads, reaching 100% CPU usage, and as far as I could understand it has faster memory allocation, but ... unfortunately TCPServices doesn't work, so I cannot use it.

Dan
« Last Edit: October 25, 2010, 08:37:57 PM by danp » Logged
BohdanT
Sr. Member
****
Posts: 271


Life is difficult, but fortunately is short!


« Reply #2 on: October 25, 2010, 09:44:50 PM »

Quote
Concurrent tasks   WinAos Old   WinAos New
------------------------------------------
               1          6 s         15 s
               5         82 s        533 s

Wirth's law
:D :D :D
Logged
staubesv
Administrator
Sr. Member
*****
Posts: 387



« Reply #3 on: October 26, 2010, 02:35:29 PM »

Well... those are indeed bad numbers. It was me who changed the representation of the XML data structure. The reasons for the change were that the A2 component system relies heavily on XML, which posed two problems, both related to memory allocation:

1. A large number of memory blocks slows down the garbage collector too much
For the XML attributes it seemed somewhat overkill to put them into a (dynamically allocated) array. First, even an element with no attributes had a container object (well, that could have been optimized by allocating the container on demand only). Second, since the attributes are represented using objects, it's not much overhead to add previous/next pointers, so another object (dynamic array) is not needed.

2. Traversal of XML data structure should not lead to dynamic memory allocation
This is mainly important because the A2 visual components "are" XML.Element's. The usage of enumerator objects for traversal caused memory allocations when traversing XML structures so that just drawing the screen already caused allocations of memory. That was the reason why the A2 system constantly performed memory allocations even when nothing "special" was going on (e.g. when moving the mouse, XML traversal is required to find out over which component the mouse cursor is located).
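The two points above can be sketched as follows. This is a minimal illustration in Go under stated assumptions, not the A2 implementation: the `Attribute` and `Element` types and all method names are hypothetical. Each attribute carries its own previous/next links, so no container object is needed, and traversal follows those links directly instead of creating an enumerator object.

```go
package main

import "fmt"

// Attribute is a hypothetical XML attribute that carries its own links,
// so no separate container object or dynamic array is required.
type Attribute struct {
	Name, Value string
	prev, next  *Attribute
}

// Element is a hypothetical XML element. An element without attributes
// costs only two nil pointers instead of an empty container.
type Element struct {
	Name        string
	first, last *Attribute
}

// AddAttribute appends a to the element's attribute list by relinking.
func (e *Element) AddAttribute(a *Attribute) {
	a.prev, a.next = e.last, nil
	if e.last == nil {
		e.first = a
	} else {
		e.last.next = a
	}
	e.last = a
}

// FirstAttribute returns the head of the attribute list (nil if empty).
func (e *Element) FirstAttribute() *Attribute { return e.first }

// Next returns the following attribute (nil at the end of the list).
func (a *Attribute) Next() *Attribute { return a.next }

// Lookup walks the links directly; no enumerator object is created.
func (e *Element) Lookup(name string) (string, bool) {
	for a := e.FirstAttribute(); a != nil; a = a.Next() {
		if a.Name == name {
			return a.Value, true
		}
	}
	return "", false
}

func main() {
	e := &Element{Name: "time"}
	e.AddAttribute(&Attribute{Name: "type", Value: "d"})
	value, found := e.Lookup("type")
	fmt.Println(value, found)
}
```

The trade-off is that lookup by position becomes a linear walk, which an array-backed collection would answer by index.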

I have to admit that I haven't performed any benchmarks, and it may be that these changes caused the performance impact...

... so there is some work to do ;-)


Logged
staubesv
Administrator
Sr. Member
*****
Posts: 387



« Reply #4 on: October 27, 2010, 05:17:18 PM »

Ok, the numbers were indeed so bad that some action was required ;-)

Several optimizations are now in place:
- avoid usage of temporary dynamic strings in XML scanner
- directly use dynamic strings from XML scanner to set names, values, ... of XML objects
- experimental: use string pool to avoid allocation of many dynamic strings with same content

In my test, this makes parsing a large XML document about 3x faster on average, so the current WinAos version should outperform the old version. Also, the slow initial run has disappeared, so the results are more consistent now.

Note:
For large XML documents with a regular structure, such as the one you mentioned, it makes a lot of sense to enable the string pool. This means that strings with the same content are re-used instead of allocating a new dynamic string for every occurrence.
For example, even if you use just 10 different element names in the whole document, the previous XML parser would have allocated a new dynamic string for every element instance. So if you had one million elements sharing the same 10 names, one million dynamic strings were allocated.

To enable the string pool: XMLScanner.Scanner.SetStringPooling({0..31}). This will pool all kinds of strings (also data).

Important: Any String returned by any XML.* object must not be mutated! So never change such a string when you have a reference to it. It is fine, however, to set a name/value/whatever to a new(!) dynamic string.
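The idea behind the string pool can be sketched like this. It is a minimal illustration in Go, not the XMLScanner implementation; `StringPool`, `Intern` and `Len` are hypothetical names. One million scanned element names that share 10 distinct values end up as 10 pooled strings.

```go
package main

import "fmt"

// StringPool is a hypothetical pool that hands out one shared string
// per distinct content.
type StringPool struct {
	strings map[string]string
}

// NewStringPool returns an empty pool.
func NewStringPool() *StringPool {
	return &StringPool{strings: make(map[string]string)}
}

// Intern returns the pooled string equal to the scanned bytes. A new
// string is stored only the first time this content is seen.
func (p *StringPool) Intern(scanned []byte) string {
	if s, ok := p.strings[string(scanned)]; ok {
		return s // content already pooled: reuse it
	}
	s := string(scanned) // first occurrence: allocate once
	p.strings[s] = s
	return s
}

// Len reports how many distinct strings the pool holds.
func (p *StringPool) Len() int { return len(p.strings) }

func main() {
	// Ten element names taken from the sample record in this thread.
	names := [][]byte{
		[]byte("rectype"), []byte("id5"), []byte("id3"), []byte("time"),
		[]byte("latitude"), []byte("longitude"), []byte("direction"),
		[]byte("speed"), []byte("height"), []byte("km"),
	}
	pool := NewStringPool()
	for i := 0; i < 1000000; i++ {
		_ = pool.Intern(names[i%len(names)])
	}
	fmt.Println("elements scanned: 1000000, pooled strings:", pool.Len())
}
```

Go strings are immutable, so the hazard described above cannot occur in this sketch. With mutable dynamic strings, changing one pooled string would silently change every name or value that shares it, which is why returned strings must not be mutated.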

« Last Edit: October 27, 2010, 05:19:39 PM by staubesv » Logged
staubesv
Administrator
Sr. Member
*****
Posts: 387



« Reply #5 on: October 28, 2010, 10:21:10 AM »

Quote
decoding 115078 bytes of binary data and convert it in a result set of 2084 XML records
Binary... ok, so the XML parser speed-up won't help here. It is somewhat strange, though, that this has slowed down, since the AddContent operation should not be slower just because content is linked instead of added to a container... well, maybe it's the runtime.

Dan, I do have a few questions:
- Which version of UnixAos did you use in the benchmarks above?
- What exactly does "concurrent tasks" mean? Doing one job with 5 concurrent tasks, or doing the job for 5 different files?

Quote
But on the other side the array is much faster to sort/search
I've just had a look at the old implementation: It didn't provide direct access to the array (only to XML.Container.coll). Have you used a modified version for your purposes?

Logged
soren renner
Global Moderator
Full Member
*****
Posts: 216



« Reply #6 on: October 28, 2010, 07:51:03 PM »

Repository WinAos EXE will not even start on my Linux box: "wine aos.exe" hangs the console.

Repository UnixAos has very frequent GC pauses that interfere with the UI.
Logged
danp
Newbie
*
Posts: 37


« Reply #7 on: October 31, 2010, 02:01:28 PM »

Hi Sven,

Quote
Dan, I do have a few questions:
- What version of UnixAos have you used in the benchmarks above?
UnixAos 0.91, as written above.

Quote
- What exactly does "concurrent tasks" mean? Doing one job with 5 concurrent tasks, or doing the job for 5 different files?
5 simultaneous tasks using the same binary data.

Quote
I've just had a look at the old implementation: It didn't provide direct access to the array (only to XML.Container.coll). Have you used a modified version for your purposes?
Yes, and the ArrayCollection version gives me another advantage: one element can happily belong to several collections. Adding an element that already belongs to another collection is very cheap and harmless. With the list this is not possible unless you destroy the source list. Imagine an SQL select statement that builds a recordset from several tables. In the same way, I build a result recordset by putting together elements from several XML structures read from the database. Now I have two options: destroy the source list (which is fine in many cases) or clone the element if the source list must be kept alive (very expensive).
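The difference can be shown with a small sketch. This is a hypothetical illustration in Go, not the WinAos XML code; `Record` and `List` are made-up types. An array-style collection only stores references, so a record can sit in two collections at once, while a list whose links live inside the record loses the source chain when the record is appended elsewhere.

```go
package main

import "fmt"

// Record is a hypothetical element; next models a link stored in the
// element itself, as in a linked-list based XML structure.
type Record struct {
	ID   int
	next *Record
}

// List is a hypothetical singly linked list whose links live in the records.
type List struct{ first, last *Record }

// Append links r at the end of the list, overwriting r's previous link.
func (l *List) Append(r *Record) {
	r.next = nil
	if l.last == nil {
		l.first = r
	} else {
		l.last.next = r
	}
	l.last = r
}

// IDs walks the list and collects the record IDs in order.
func (l *List) IDs() []int {
	var ids []int
	for r := l.first; r != nil; r = r.next {
		ids = append(ids, r.ID)
	}
	return ids
}

func main() {
	a, b, c := &Record{ID: 1}, &Record{ID: 2}, &Record{ID: 3}

	// Array-style collections hold references: sharing copies one pointer
	// and leaves the source collection untouched.
	source := []*Record{a, b, c}
	result := []*Record{}
	result = append(result, source[0])
	fmt.Println("array:", len(source), "source records,", len(result), "result record")

	// Linked lists: appending a to a second list overwrites its link,
	// which cuts the source list off after a.
	var src, res List
	src.Append(a)
	src.Append(b)
	src.Append(c)
	res.Append(a)
	fmt.Println("list: source", src.IDs(), "result", res.IDs())
}
```

Keeping both lists intact with this layout requires cloning the record before appending it, which is the expensive option mentioned above.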

Thank you for the answer and for the effort to speed up WinAos,
Dan
Logged
danp
Newbie
*
Posts: 37


« Reply #8 on: October 31, 2010, 06:28:12 PM »

I was wondering how a fresh language such as Go would handle the same task. I have written a minimal XML implementation (attached file) and doubled the buffer to 204555 for meaningful results. I wasn't able to run it concurrently, so the 5 iterations are sequential.

1 iteration       305ms
5 iterations    1347ms

All that in a Virtualbox Ubuntu.
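For comparison with the 5 simultaneous tasks of the WinAos test, a concurrent run in Go could be sketched with goroutines and a `sync.WaitGroup`. This is a minimal sketch under stated assumptions, not the attached xml.go: `decode` is a hypothetical stand-in that only counts bytes, and `runConcurrently` is a made-up helper name. Only the buffer size comes from the post.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// decode is a hypothetical stand-in for the real job. It only counts
// the bytes, whereas the real task builds XML records from the data.
func decode(data []byte) int {
	n := 0
	for range data {
		n++
	}
	return n
}

// runConcurrently starts one goroutine per task, all reading the same
// buffer, and returns the wall-clock time until the last one finishes.
func runConcurrently(tasks int, data []byte) (time.Duration, []int) {
	results := make([]int, tasks)
	var wg sync.WaitGroup
	start := time.Now()
	for i := 0; i < tasks; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = decode(data) // each goroutine writes its own slot
		}(i)
	}
	wg.Wait()
	return time.Since(start), results
}

func main() {
	data := make([]byte, 204555) // buffer size taken from the post
	for _, tasks := range []int{1, 5} {
		elapsed, _ := runConcurrently(tasks, data)
		fmt.Printf("%d task(s): %v\n", tasks, elapsed)
	}
}
```

The goroutines share the input buffer read-only and write to separate result slots, so no locking is needed beyond the WaitGroup.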

Dan

* xml.go (1.43 KB - downloaded 259 times.)
« Last Edit: October 31, 2010, 06:31:05 PM by danp » Logged