logic loop vs render loop discussion

jerome · October 7, 2015

Hi,

I'm starting a new topic here, following up on this last post from Fenomas: http://www.html5gamedevs.com/topic/17550-framerate-issues-at-144hz/?p=99321

The goal is to study how to run a constant-rate logic loop (the one that manages, say, physics, AI, external data processing, etc.) alongside the BJS render loop (registerBeforeRender), in terms of timing and, if we can, in terms of performance.

Remember the algorithm proposal of Fenomas :

    var tick_time = 40 // desired ms between ticks.. 40 would be for 25 ticks/second

    function RAF() {
      var now = performance.now()
      var dt = now - _last
      _last = now
      if (game_paused) { return }
      accumulator += dt
      if (accumulator > tick_time) {
        accumulator -= tick_time
        game.tick(tick_time)
        last_tick = now
      }
      var time_since_tick = now - last_tick
      game.render(time_since_tick)
      requestAnimationFrame(RAF)
    }

So please, consider this example : http://www.babylonjs-playground.com/#2KSQ1R#98

As you can see at lines 36-38, I create an SPS into which I add 500 cubes, 500 tetrahedrons and 200 torus knots.

This example runs for now at 60 fps in my Chrome browser.

Adjust the counts for your own machine so that you also get 60 fps, for example the number of torus knots, since they generate the most vertices.

Now have a look at the code starting at line 79:

    var tick = function (tick_time) {
        PS.setParticles();
    };
    var tick_time = 40;
    var now = Date.now();
    var last = Date.now();
    var dt = 0;
    var accumulator = 0;
    var last_tick = 0;

    // animation
    //setInterval(function() { tick(); }, tick_time);
    scene.registerBeforeRender(function () {
        now = Date.now();
        dt = now - last;
        last = now;
        accumulator += dt;
        if (accumulator > tick_time) {
            accumulator -= tick_time;
            //tick(tick_time);
        }
        //tick();
        pl.position = camera.position;
        mesh.rotation.y += 0.001;
    });

I just tried to re-implement Fenomas's solution: the tick() function here just calls SPS.setParticles(), which recomputes all the vertex positions (70800 vertices in a single mesh for my 1200 solids) and recomputes all the normals.

SPS.setParticles() will just call SPS.updateParticle() for each particle and, in my example, will rotate each particle around the axis running from the particle to the SPS center.

For now, all the tick() calls are commented, so nothing happens, the whole mesh just slowly rotates.

We are about to compare three implementations :

#1 : tick() called directly in the render loop as usual
#2 : tick() called from the render loop but only at a fixed given frequency, here every 40 ms
#3 : tick() called outside the render loop, from a js setInterval, every 40 ms

As I don't pass the tick_time parameter to tick() like Fenomas did in his own example, updateParticle() won't take into account the varying delay between two consecutive calls, so the particles might rotate at different speeds from one example to another.
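To make the rotation speed independent of how often tick() runs, the elapsed time could be passed in and used to scale each step. A minimal sketch, assuming a hypothetical angular speed constant and a plain object standing in for an SPS particle (neither comes from the playground):

```javascript
// Hypothetical angular speed, in radians per millisecond (not from the playground).
const ANGULAR_SPEED = 0.001;

// Advance a particle-like object by the time covered by one tick.
// `particle` is a plain stand-in with a `rotation.y` field, not a real SPS particle.
function updateRotation(particle, tickTime) {
  particle.rotation.y += ANGULAR_SPEED * tickTime;
  return particle;
}

// Same simulated duration (400 ms), two different tick rates.
const a = { rotation: { y: 0 } };
const b = { rotation: { y: 0 } };
for (let i = 0; i < 10; i++) updateRotation(a, 40); // 10 ticks of 40 ms
for (let i = 0; i < 25; i++) updateRotation(b, 16); // 25 ticks of 16 ms
```

Both objects end up with (almost exactly) the same angle, because the step is proportional to the time it covers rather than to the number of calls.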

I know that is precisely the reason why we should use a decoupled logic loop, but I just want to compare the performance here.

So don't mind the particle rotation, don't change your screen size, and let's start:

#1 : http://www.babylonjs-playground.com/#2KSQ1R#99 directly in the render loop : average 55 fps, quite constant

#2 : http://www.babylonjs-playground.com/#2KSQ1R#100 at tick_time from the render loop : average 48 fps, quite constant

#3 : http://www.babylonjs-playground.com/#2KSQ1R#101 only with setInterval clocked at tick_time, outside the render loop : average 55 fps, but with much variability from 51 to 60 fps

Unless I made some mistake, it seems really weird that #2 is the slowest while I expected it to be the fastest.

It's also weird that #1 is as fast as #3, because tick() is then called on every registerBeforeRender call ... unless it's a coincidence and my tick takes long enough to delay registerBeforeRender as much as #3 would.

Well, this example is not a pure logic loop example, as the setParticles() function updates the mesh vertices; that is to say, it accesses the BJS core itself and the WebGL part. Maybe another example, where only, say, coordinates or speeds were computed, would have been more pertinent.

Any thoughts ?

fenomas · October 7, 2015

Not sure where your results came from, but for me #1 was the slowest - which is expected, since it's doing more work.

Incidentally I'd avoid trying to measure performance in the playground. There's a ton of unrelated code being run, and e.g. moving the mouse around in the code view has a big impact on performance.

More generally, the point of the loop I suggested was to decouple the logic updates from the renders without breaking physics and without creating jitter (i.e. temporal aliasing). It's not a performance hack, and I don't think it's likely to affect performance much.
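One common way to use the time_since_tick value passed to render() is to interpolate between the two most recent logic states, so that what is drawn moves smoothly even though the logic only advances every tick_time ms. The sketch below is an assumption about what such a render step could do; it is not taken from Fenomas's game code, and the function name is made up for illustration.

```javascript
// Linear interpolation between the previous and the current tick state.
// `timeSinceTick` and `tickTime` are in milliseconds; the blend factor is
// clamped to 1 so a late frame never overshoots the current state.
function interpolate(prev, curr, timeSinceTick, tickTime) {
  const alpha = Math.min(timeSinceTick / tickTime, 1);
  return prev + (curr - prev) * alpha;
}

// A value that moved from 10 to 20 during the last tick,
// drawn 10 ms after that tick with 40 ms ticks.
const drawnX = interpolate(10, 20, 10, 40);
```

Here drawnX sits a quarter of the way between the two states, since a quarter of a tick has elapsed.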

adam · October 7, 2015

Wouldn't you want to use the web workers api for something like this?

adam · October 7, 2015

You should set the accumulator to 0 after each tick or eventually you will always be ticking.

jerome · October 7, 2015

@adam : the tick_time value is subtracted from the accumulator each tick

yep, web workers would be nice here, but as I update a BJS object, it's not easy to do... it would work better with a pure logic process

@fenomas : I agree that the point is to loosely couple the logic and the rendering rather than to hack performance. I know this.

I just wanted to test how the performance would be affected in an almost stressed case (before applying the tick).

You are right about the PG side effects that I just forgot.

So same local tests (a little bigger window size than before) :

# 1 : 60 fps

# 2: 50-52 fps

# 3 : 59-60 fps

#2 still slower, I don't get why ... :angry:

I took the measurements in Chrome, either with the BJS debug layer or with the browser's internal fps display (and no debug layer)

adam · October 7, 2015

jerome said: "@adam : the tick_time value is subtracted from the accumulator each tick"

The accumulator is greater than tick_time. If you are just subtracting tick_time from accumulator, eventually accumulator is always going to be greater than tick_time.

jerome · October 7, 2015

ooops you are right

however, resetting the accumulator to zero doesn't change my former local results.

I don't get why the implementation I expect to be the fastest is the slowest here

adam · October 7, 2015

I'm not seeing that much of a difference between the 3 in Chrome.

fenomas · October 7, 2015

adam said: "Wouldn't you want to use the web workers api for something like this?"

Web workers don't share memory with the main thread, so using them only makes sense if a tick would take longer than the overhead of copying stuff back and forth.

adam said: "You should set the accumulator to 0 after each tick or eventually you will always be ticking."

If you check you'll see that's not the case.

jerome said: "#2 still slower, I don't get why ...

I took the measurements in Chrome, either with the BJS debug layer or with the browser's internal fps display (and no debug layer)"

It's not slower for me, so I can't help. My only advice is: don't look at FPS counters, profile! Chrome has incredible developer tools - you can run a profile and see precisely which functions are taking up time. There's no need to guess what's going on (and indeed, modern JS VMs make that impossible anyway).

adam · October 7, 2015

fenomas said: "If you check you'll see that's not the case."

I'm embarrassed to admit that I did and it eventually ticked on every frame. Anyway, this is not helping Jerome.

jerome · October 7, 2015

I just re-tried at home with my old laptop, editor hidden

#1 and #2 are quite the same at 28 fps

#3 is really faster at 39 fps

CtlAltDel · October 7, 2015

I suggest completely controlling your render from your main loop, and not trying to hack this around the existing Babylon (pre)render stuff.

So just call the appropriate functions at the correct time from your own main loop.

An implementation of such a mainloop:

https://github.com/IceCreamYou/MainLoop.js

And reading material on the whole deal:

http://gafferongames.com/game-physics/fix-your-timestep/

Let Babylon be the renderer and not dictate everything else, unless it has a proper main loop, in which case I said nothing; the above article is still great reading material.
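In the spirit of the fixed-timestep article linked above, a self-owned main loop typically consumes the elapsed frame time in whole ticks with a `while` loop rather than at most one tick per frame. The sketch below is only an illustration under assumptions: the function name, the `state` object and the `maxTicksPerFrame` cap (with its backlog reset) are hypothetical and not taken from MainLoop.js or from the thread.

```javascript
// Advance the simulation by as many fixed ticks as `frameDt` allows.
// `state` holds the accumulator; `tick` is the caller's logic update.
// `maxTicksPerFrame` is an assumed safety cap, not from the thread.
function stepFixed(state, frameDt, tickTime, tick, maxTicksPerFrame = 5) {
  state.accumulator += frameDt;
  let ticks = 0;
  while (state.accumulator >= tickTime && ticks < maxTicksPerFrame) {
    tick(tickTime);
    state.accumulator -= tickTime;
    ticks++;
  }
  // If the cap was hit and a backlog remains, drop it rather than
  // trying to catch up forever.
  if (state.accumulator >= tickTime) {
    state.accumulator = 0;
  }
  return ticks;
}

const state = { accumulator: 0 };
let calls = 0;
const ticksNormal = stepFixed(state, 100, 40, () => { calls++; });  // one long frame
const ticksStall = stepFixed(state, 1000, 40, () => { calls++; });  // a huge stall
```

The renderer would then be called once after stepFixed() in each frame, whatever drives the frames (requestAnimationFrame or engine.runRenderLoop).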

adam · October 7, 2015

If you look at the last one that has the better framerate you should see that the cubes are rotating slower. That tells me that the other examples are calling the tick function more, which would account for the lower fps.

fenomas · October 8, 2015

adam said: "I'm embarrassed to admit that I did and it eventually ticked on every frame. Anyway, this is not helping Jerome."

Each frame it adds dt to the accumulator and subtracts at most tick_time, so the only way it'll fire every frame is if (dt >= tick_time). If that's the case, then you're rendering slower than your desired tick rate, so there's no point in separating the loops in the first place!
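The disagreement above can be checked with a small simulation of the accumulator alone. This is a sketch with a constant frame time and made-up numbers, not a measurement of the playground:

```javascript
// Simulate the accumulator for a constant frame time `dt` (ms).
// Returns how many of the `frames` frames triggered a tick, and the
// largest accumulator value seen at the end of any frame.
function simulateAccumulator(dt, tickTime, frames) {
  let accumulator = 0;
  let ticks = 0;
  let maxLeftover = 0;
  for (let i = 0; i < frames; i++) {
    accumulator += dt;
    if (accumulator > tickTime) {
      accumulator -= tickTime;
      ticks++;
    }
    maxLeftover = Math.max(maxLeftover, accumulator);
  }
  return { ticks, maxLeftover };
}

const fast = simulateAccumulator(16, 40, 1000); // rendering faster than the tick rate
const slow = simulateAccumulator(50, 40, 1000); // rendering slower than the tick rate
```

With 16 ms frames the accumulator never exceeds tick_time at the end of a frame and only some frames tick. With 50 ms frames every frame ticks and the leftover keeps growing, which matches both observations: the loop only ticks on every frame when rendering is already slower than the desired tick rate.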

jerome said: "#1 and #2 are quite the same at 28 fps
#3 is really faster at 39 fps"

You're comparing the framerates of code that does quite different amounts of work. Why not make a counter that runs for 10-20s and counts how many ticks/renders occur? The whole point of loops like this is to have logic and rendering run at consistent, but separate rates, so why not measure whether they're achieving that?
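Such a counter could be built by wrapping the tick and render functions and dividing the call counts by the length of the measuring window. A minimal sketch; the helper names are hypothetical, and the loop at the end merely stands in for a real 10 s run:

```javascript
// Wrap a function so that every call is counted.
function counted(fn) {
  const wrapped = function (...args) {
    wrapped.calls++;
    return fn.apply(this, args);
  };
  wrapped.calls = 0;
  return wrapped;
}

// Convert raw call counts into per-second rates for a measuring window.
function ratesPerSecond(tickFn, renderFn, elapsedMs) {
  const seconds = elapsedMs / 1000;
  return {
    ticksPerSecond: tickFn.calls / seconds,
    rendersPerSecond: renderFn.calls / seconds,
  };
}

// Stand-ins for the real tick() and render(), driven by hand here:
// 600 renders with one tick every third frame, reported over a 10 s window.
const tick = counted(() => {});
const render = counted(() => {});
for (let frame = 0; frame < 600; frame++) {
  if (frame % 3 === 0) tick(40);
  render();
}
const rates = ratesPerSecond(tick, render, 10000);
```

In a real test the wrapped functions would be the ones called from the loop under test, and elapsedMs would come from two performance.now() readings taken at the start and end of the run.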

CtlAltDel said: "So just call the appropriate functions at the correct time from your own main loop.

And reading material on the whole deal:
http://gafferongames.com/game-physics/fix-your-timestep/"

That's basically what I suggested in the previous thread. This thread is Jerome's reply.

jerome · October 8, 2015

Yes, I know that the goal is to have both loops (rendering and logic) at their own consistent rate.

And I read the linked article also. And I like your algo proposal.

I have always been convinced that this is the right way to do it, whatever the choice of implementation.

That's why I didn't focus on whether the logic loop is really triggered at the right frequency, but on how it would impact performance in a really stressed case.

The main mistake I made in my example is that my logic loop is not a "pure" logic loop, as it calls updateVerticesData somewhere; that is to say, it interferes with something related to the rendering, at least with the WebGL buffers.

So my example is really not pertinent to illustrate this loop decoupling.

Imho, it remains important to focus on the real FPS, the one the user can see and feel. If I implemented a logic loop, decoupled or not, consistent or not, that made my application lag or become too slow to be used, I wouldn't have done the job. This is the reason why I wanted to compare the different FPS, those seen by the end user.

However, I might have done something wrong in my implementation of your algo, because it is the slowest in each of my tests, while it is expected to be the fastest on paper.

Maybe I should not have coded it within registerBeforeRender but somewhere else? In engine.runRenderLoop()? No idea what went wrong here ...

This error aside, the setInterval solution is quite good in most cases (it wasn't described in the article because it is specific to JavaScript), though it has a drawback:

when the user hides the window running requestAnimationFrame (changing tab, minimizing the browser, etc.), that callback is no longer called by the browser, while the setInterval callback keeps on being called at its own frequency.

This means the logic keeps on going while the rendering is stopped: imagine the logic computes the mesh positions, for instance ...
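One possible way to work around this drawback is to skip the logic while the page is hidden. The sketch below injects the visibility check so it can run outside a browser; in a page, the check could be the standard document.hidden property. The helper name is hypothetical. Browsers may also throttle timers in background tabs, so the exact call rate while hidden is not guaranteed either way.

```javascript
// Wrap a tick function so it is skipped while the page is hidden.
// `isHidden` is injected so the sketch also runs outside a browser.
function makeGuardedTick(tick, isHidden) {
  return function guardedTick(tickTime) {
    if (isHidden()) return false; // page not visible: skip this tick
    tick(tickTime);
    return true;
  };
}

// Demo with a fake visibility flag.
let hidden = false;
let tickCount = 0;
const guarded = makeGuardedTick(() => { tickCount++; }, () => hidden);
guarded(40);      // visible: runs
hidden = true;
guarded(40);      // hidden: skipped
hidden = false;
guarded(40);      // visible again: runs

// In a browser this could be wired as (not executed here):
//   const guarded = makeGuardedTick(tick, () => document.hidden);
//   setInterval(function () { guarded(tick_time); }, tick_time);
```

Whether skipping is the right behaviour depends on the game: a networked simulation may need to keep ticking, while a single-player scene can usually pause.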

Vousk-prod. · October 8, 2015

To add my useless two cents: I never use registerBeforeRender, if I need to do something directly linked to the engine running rate, I always do that directly in runRenderLoop (although it's not based on any benchmark, just pure intuition :lol: )

fenomas · October 8, 2015

Jerome:

I think of it this way. There are two cases to consider:

1. When things are unstressed, we want render to be called as often as possible (at the RAF rate), and we want tick to be called at a consistent, defined rate.
2. When things are overstressed, we want to call tick as close as possible to the desired rate, and we want to call render at least once per tick.

The accumulator loop I suggested is just meant to meet those two goals. The accumulator is for accuracy over the long run - setInterval will be slightly slower due to how it works.

Now for performance. If things are stressed - meaning render and tick are too slow to call both as often as we'd like - then the best we can do for goal (2) is to simply alternate between render and tick as fast as possible. In that case, the only real way to optimize performance is to minimize idle time (when the browser is just waiting for the next RAF or interval). I suspect that the best way to do this will vary with browser/version/OS.

For real-world content, in my game I find that for very heavy stress, idle time goes to zero, so there's not much to optimize. But with lighter stress, the visible FPS will go below 60 even though there is still idle time, so probably it could be a little better. I suspect that the best thing might be to have a setInterval(0) loop with an accumulator to decide when to call a tick, just to avoid having the browser idle. But I haven't tested with real content.
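The setInterval(0) idea could look roughly like the sketch below: a timer callback that runs as often as the browser allows and uses an accumulator to decide when a tick is due, while rendering stays on requestAnimationFrame. This is an untested-in-browser illustration; the function name and the injected clock are assumptions made so that the logic can be exercised without a browser.

```javascript
// Build a "pump" function meant to be called very frequently
// (e.g. from setInterval(pump, 0)). It fires ticks only when enough
// time has accumulated. `now` is an injected clock returning ms.
function makeTickPump(now, tick, tickTime) {
  let last = now();
  let accumulator = 0;
  return function pump() {
    const t = now();
    accumulator += t - last;
    last = t;
    let fired = 0;
    while (accumulator >= tickTime) {
      accumulator -= tickTime;
      tick(tickTime);
      fired++;
    }
    return fired; // number of ticks run during this call
  };
}

// Demo with a fake clock instead of performance.now().
let fakeTime = 0;
const fired = [];
const pump = makeTickPump(() => fakeTime, () => {}, 40);
fakeTime = 10;  fired.push(pump()); // not enough time yet
fakeTime = 45;  fired.push(pump()); // one tick due
fakeTime = 130; fired.push(pump()); // two ticks due after a long gap

// In a browser this could be wired as (not executed here):
//   const pump = makeTickPump(() => performance.now(), game.tick, 40);
//   setInterval(pump, 0);
```

Because the accumulator carries the remainder over, the long-run tick rate stays tied to the clock rather than to how precisely the timer fires.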

jerome · October 8, 2015

really smart explanation

thank you !
