JCPalmer Posted April 23, 2015
I know optimizing inactive meshes is not a priority, but I thought I would report an observation: making a lot of them inactive / invisible does not reduce CPU usage, or increase FPS when CPU bound:

public setLayerMask(maskId : number) {
    for (var i = this._subs.length - 1; i >= 0; i--) {
        if (this._subs[i] !== null) this._subs[i].setLayerMask(maskId);
    }
    this.layerMask = maskId;
    // need to make sure not pickable, when mask is for suspended level
    this.isVisible = maskId !== DialogSys.SUSPENDED_DIALOG_LAYER;
}

Given that I have never seen line-level CPU or run-count profiling for JavaScript, this may help pin down where much of the CPU overhead of having a mesh really is.

How to reproduce the observation:
1. Run the Dialog Tester scene: https://googledrive.com/host/0B6-s6ZjHyEwUfjlzYXJKMC1zLXdIaV81REJhbjdfRmczQTJFOEpjWWg2SUIwZVRRS0VsR28
2. Click the "Use System Camera" checkbox, which enables the "Dock" button.
3. Turn on the Debug layer with the checkbox.
4. Observe the values of the statistics.
5. Temporarily hide the statistics, so the "Dock" button can be clicked.
6. Re-enable the statistics, and compare.
fenomas Posted April 23, 2015
The "Dock" button looks disabled to me, and clicking it doesn't seem to have an effect. What's it doing?
JCPalmer Posted April 23, 2015
Those buttons are disabled initially. Sorry, I missed a step; added above. Dock pushes a Panel on the modal stack, docked to the top-center of the window. This temporarily makes all the meshes on the main panel inactive, so they cannot be picked.
GameMonetize Posted April 23, 2015
Setting a mesh to inactive SHOULD improve CPU usage: https://github.com/BabylonJS/Babylon.js/blob/master/Babylon/babylon.scene.ts#L1231
It short-circuits CPU-intensive work like frustum clipping.
fenomas Posted April 23, 2015
Yeah, it does spend a lot of time in mesh selection with stuff turned off. Mostly computing world matrices, which happens even when meshes are disabled.
JCPalmer Posted April 23, 2015
Andy,
Looking at the statistics with the "Modal Stack" menu selected, 753 meshes (700 active), I show a "Mesh selection" duration of around 95 ms (after a while). When I click "Dock", it only drops to about 65 ms, with only about 29 active meshes. If it were perfectly linear, it would drop to 95 * (29/700), or 3.9 ms. That is not even close to 65.

Looking at the section DK linked to: yes, it would appear it HAS to be computeWorldMatrix, BUT there is checking at the front to not always do it. I added a counter, which is incremented only when the computation is actually done, then changed my "Input" button to write that number to the console. If nothing changed, I can go minutes between "Input" clicks and the value written to the console is the same. Something else is responsible.

public static nCompWMs = 0;

public computeWorldMatrix(force?: boolean): Matrix {
    if (!force && (this._currentRenderId === this.getScene().getRenderId() || this.isSynchronized(true))) {
        return this._worldMatrix;
    }
    AbstractMesh.nCompWMs++;
    ....
}

I was hoping that there was a defective check for not always doing it. If there were, fixing it would improve everything, not just inactive meshes (not that important).

I really wish JavaScript had a line-level profiler. It is critical for an interpreted language. I had one way back in the mid 80's with the Sharp APL interpreter. It saved my life over and over, even though I had to code my own reports. NetBeans's Java profiler is to die for.
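The linear-scaling estimate in the post above can be worked through as a small calculation. This is only a sketch of the arithmetic: the function name `expectedLinearMs` is hypothetical, and the inputs are the figures quoted in the post (95 ms at 700 active meshes, 65 ms observed at 29).

```typescript
// Hypothetical helper: the duration we would expect if "Mesh selection" cost
// scaled linearly with the number of active meshes.
function expectedLinearMs(baselineMs: number, baselineActive: number, newActive: number): number {
    return baselineMs * (newActive / baselineActive);
}

const expectedMs = expectedLinearMs(95, 700, 29); // about 3.9 ms
const observedMs = 65;                            // reported after clicking "Dock"
const unexplainedMs = observedMs - expectedMs;    // time not explained by active meshes

console.log(`expected ${expectedMs.toFixed(1)} ms, unexplained ${unexplainedMs.toFixed(1)} ms`);
```

Under that assumption, roughly 61 of the 65 ms would be spent on meshes that are not active, which is what points the investigation at per-mesh work done before the active check.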
fenomas Posted April 24, 2015
JC, are you not using the browser profilers? No need to guess where the time is being spent. (PS: Andy meaning me, right?)
JCPalmer Posted April 24, 2015
fenomas,
Thanks, I had never found the profiler before. BTW, yes, I thought you were Andy. Now that I have this, I know why computeWorldMatrix consumes so much CPU when it is not actually doing anything. The check itself, to see whether it needs to do anything, is very expensive. Normally, you want a check meant to save CPU to be as fast as possible. See my profile of the inside of computeWorldMatrix.

This scene can not only generate a huge number of clones, but they are highly nested. All this recursive parent checking for sync is a large waste of time. If the recursion were in _evaluateActiveMeshes and in the opposite direction (parent to child), the parent calling its children would already know whether it was in sync and could pass that down. Not all scenes do as much parenting as this, but overhead from checking is not good.

I will think about this. The computeWorldMatrix step might be made a separate pass through scene.meshes in _evaluateActiveMeshes(). A separate pass would mean activeMeshes would still come out in the same order as before (I know you care about the order for materials).
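The parent-to-child direction proposed above can be sketched roughly as follows. This is not Babylon.js code: `SketchNode`, `updateTree`, and the `recomputed` counter are hypothetical names used only to illustrate passing the "an ancestor changed" state downward instead of having every node climb its parent chain.

```typescript
// Hypothetical, simplified node; not the Babylon.js Node class.
interface SketchNode {
    children: SketchNode[];
    localChanged: boolean; // true if this node's own transform changed this frame
    recomputed: number;    // how many times the world matrix was rebuilt
}

// Walk from parent to child. Each parent tells its children whether it or any
// ancestor changed, so no node has to re-check its parents.
function updateTree(node: SketchNode, ancestorChanged: boolean): void {
    const mustRecompute = ancestorChanged || node.localChanged;
    if (mustRecompute) {
        node.recomputed++;        // stand-in for the real matrix computation
        node.localChanged = false;
    }
    for (const child of node.children) {
        updateTree(child, mustRecompute);
    }
}

// A three-level chain where only the middle node moved.
const leafNode: SketchNode = { children: [], localChanged: false, recomputed: 0 };
const middleNode: SketchNode = { children: [leafNode], localChanged: true, recomputed: 0 };
const rootNode: SketchNode = { children: [middleNode], localChanged: false, recomputed: 0 };

updateTree(rootNode, false);
console.log(rootNode.recomputed, middleNode.recomputed, leafNode.recomputed);
```

In this sketch each node is visited once per pass, and the sync state costs one boolean per call rather than a walk up the hierarchy.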
fenomas Posted April 27, 2015
Hey, okay, hope your thinking is helpful. I also suspect optimizations could be made here. For example, in a typical case for me I often see the scene spend 80% of its time in mesh selection (20% rendering) even when nothing is moving anywhere in the scene, and that's with no more than one level of nesting. One might think that more matrix updates could be skipped, but then I've logged 6-8 bugs lately and I think they've all been due to BJS being too aggressive in skipping matrix updates, so it's presumably not a simple matter.
(And I am Andy, yes, I just didn't know I'd said so.)
JCPalmer Posted April 27, 2015
Your GitHub account is your name. Some of the issues you posted, which I get emails for, could only have come from you. I was on GitHub yesterday. You even have the same avatar.

I spent Friday afternoon testing my theory:
- I modified Node.ts to hide direct access to parent behind getters / setters. This allowed a children : Array<Node> property and a way to maintain it.
- Made changes to abstractMesh.computeWorldMatrix(), adding a skipParentSyncChecking argument, and to isSynchronized too.
- Added a recursive scene function, computeWorldMatrixTree(), and called it for all meshes that either did not have a parent or whose parent was not a mesh, e.g. a camera.

It worked practically on the first run, but somehow the result was unchanged. I try not to keep my mad scientist changes to the repository around too long, so I trashed them with a reset. I save the entire filesystem daily, so I might still have them and could paste the changes if so.
jerome Posted April 27, 2015
Wouldn't this computeWorldMatrix() method be threadable in a worker? Very naive question, since I don't know anything about it...
GameMonetize Posted April 27, 2015
We have to be very cautious here, as fenomas mentioned. Computing the world matrix is expensive, and so Babylon.js uses various ways to skip this step. One of them is obviously the evaluation of active meshes. This is a complex problem, because a world matrix may need updating:
- Because you changed position, rotation, or scaling
- Because a parent, or parent of parent, or parent of parent of parent (and so on) changed its world matrix
- Because you are using billboarding
JCPalmer Posted April 27, 2015
I think that for the vast majority of a scene, meshes DO NOT change every frame, e.g. background meshes. I think the best strategy would be to take away direct access to any property that could cause a recompute, just like I did for Node.parent with getters / setters. The setters could set a simple _isDirty : boolean, and computeWorldMatrix could just check this. The node.children member could handle parent changes. I do not know what a renderId is, so I do not know whether computeWorldMatrix could set the flag back to false, or whether the scene would have to.

I think the code would be a lot cleaner too. Allowing direct access leads to either recomputing everything every frame, or increasingly intricate checking and code that is difficult to follow. The overhead of a getter / setter is probably low, and you only pay for it when you use it.
JCPalmer Posted April 27, 2015
Vector3 would also need getters / setters for x, y & z. So that is a roadblock.
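The dirty-flag idea, and the Vector3 roadblock that follows from it, can be illustrated with a small sketch. Nothing here is Babylon.js code: `Vec3`, `DirtyNode`, and `computeCount` are hypothetical names, and the matrix computation is replaced by a counter.

```typescript
// Hypothetical stand-in for Vector3, with plain public fields.
class Vec3 {
    constructor(public x = 0, public y = 0, public z = 0) {}
}

// Hypothetical node that recomputes only when a setter has marked it dirty.
class DirtyNode {
    private _position = new Vec3();
    private _isDirty = true; // a new node needs one initial computation
    public computeCount = 0; // stand-in for the real matrix work

    get position(): Vec3 {
        return this._position;
    }

    set position(value: Vec3) {
        this._position = value;
        this._isDirty = true;
    }

    computeWorldMatrix(): void {
        if (!this._isDirty) return; // the whole check is one boolean test
        this.computeCount++;
        this._isDirty = false;
    }
}

const dirtyNode = new DirtyNode();
dirtyNode.computeWorldMatrix();           // first computation
dirtyNode.computeWorldMatrix();           // skipped, nothing changed
dirtyNode.position = new Vec3(1, 0, 0);   // setter marks the node dirty
dirtyNode.computeWorldMatrix();           // recomputed
dirtyNode.position.x = 5;                 // bypasses the setter entirely
dirtyNode.computeWorldMatrix();           // skipped, so the matrix is now stale
console.log(dirtyNode.computeCount);
```

The last two calls show the roadblock: writing to a component of the vector never reaches the node's setter, so the flag only works if the vector type itself reports changes.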
GameMonetize Posted April 27, 2015
We already check whether something has changed without having to use getters and setters: this is the goal of isSynchronized, which checks against cached values.

Adding getters / setters would have a performance impact across ALL of the engine, as Vector3 is used everywhere. And I'm pretty sure that even if we removed the isSynchronized stuff, this wouldn't lead to a big performance gain (but perhaps I'm wrong).

One idea: adding an isWorldMatrixFrozen property to mesh. This would be used to block the update of the world matrix. Thoughts?
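Checking against cached values, as described above, might look roughly like the following. This is a simplified sketch and not the actual isSynchronized implementation: `CachedNode`, its fields, and the position-only comparison are assumptions made for illustration.

```typescript
// Hypothetical node that detects changes by comparing against cached values.
class CachedNode {
    public position = { x: 0, y: 0, z: 0 };
    // NaN never equals any number, so the first check always reports "out of sync".
    private _cache = { x: NaN, y: NaN, z: NaN };
    public computeCount = 0; // stand-in for the real matrix work

    isSynchronized(): boolean {
        const p = this.position;
        const c = this._cache;
        return p.x === c.x && p.y === c.y && p.z === c.z;
    }

    computeWorldMatrix(): void {
        if (this.isSynchronized()) return;
        this.computeCount++;
        // Remember the values this computation was based on.
        this._cache.x = this.position.x;
        this._cache.y = this.position.y;
        this._cache.z = this.position.z;
    }
}

const cachedNode = new CachedNode();
cachedNode.computeWorldMatrix(); // first computation
cachedNode.computeWorldMatrix(); // skipped, values match the cache
cachedNode.position.x = 5;       // direct component write, no setter needed
cachedNode.computeWorldMatrix(); // detected by comparison, recomputed
console.log(cachedNode.computeCount);
```

The trade-off in this sketch is that direct writes are always detected, but the comparison runs on every call, which is the per-frame checking cost discussed earlier in the thread.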
Temechon Posted April 27, 2015
This would be great. I would use it right away in my current project (which is HUGE in terms of performance).
GameMonetize Posted April 27, 2015
OK, with the last push I introduced 3 optimizations:
- mesh.freezeWorldMatrix() and mesh.unfreezeWorldMatrix(): a frozen world matrix will never be re-evaluated and is always served from cache
- mesh.alwaysSelectAsActiveMesh = true: frustum clipping is disabled for that mesh, which improves performance in active mesh evaluation (at the cost of losing frustum clipping)
- mesh.isEnabled == false will now block computeWorldMatrix evaluation

Feel free to give feedback!
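A sketch of how the three switches above might be applied from application code. The `OptimizableMesh` interface only mirrors the member names mentioned in the post so the example is self-contained; `setEnabled(false)` as the way to disable a mesh, and the helper names `markStatic` and `suspend`, are assumptions rather than something stated in the thread.

```typescript
// Minimal structural type covering only the members used here (assumed shape).
interface OptimizableMesh {
    freezeWorldMatrix(): void;
    unfreezeWorldMatrix(): void;
    alwaysSelectAsActiveMesh: boolean;
    setEnabled(value: boolean): void; // assumption: how isEnabled gets switched off
}

// Hypothetical helper for meshes that never move, such as terrain.
function markStatic(meshes: OptimizableMesh[], skipFrustumClipping: boolean): void {
    for (const mesh of meshes) {
        mesh.freezeWorldMatrix(); // world matrix is served from cache from now on
        if (skipFrustumClipping) {
            // Only worthwhile for meshes that are known to be on screen anyway.
            mesh.alwaysSelectAsActiveMesh = true;
        }
    }
}

// Hypothetical helper for hierarchies that will not show on any camera for a while.
function suspend(meshes: OptimizableMesh[]): void {
    for (const mesh of meshes) {
        mesh.setEnabled(false); // disabled meshes skip world matrix evaluation
    }
}
```

A frozen mesh that later needs to move would have to be unfrozen first with unfreezeWorldMatrix(), which is why the sketch keeps freezing in an explicit helper rather than applying it to every mesh.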
Temechon Posted April 27, 2015
Cooool, I will test that tomorrow.
JCPalmer Posted April 27, 2015
Application-level optimizations do offer ways for the developer to always do, or never do, things that only they would know about. This is also a bit of an advanced feature, so you would probably want to do it as part of a publishing phase.

I have seen cases where using a pair of methods to turn something on or off was later regretted, e.g. Java Swing's show() & hide(). In that example, they changed to setVisible(boolean). I wonder if a single function like mesh.setFixedWorldMatrix(boolean) might allow for more flexible calling.

For the dialog extension, using isEnabled to completely block computeWorldMatrix evaluation of entire Panel hierarchies that I know will never show on any camera sounds good.
fenomas Posted April 28, 2015
I will have a look at how these affect my scenes later in the week. I would imagine that skipping matrix updates for disabled meshes will go a long way towards solving JC's case, but scenes that regularly have lots of disabled meshes probably aren't so common.

Just to check: freezeWorldMatrix just affects a mesh's matrix w.r.t. the world, and the camera transform is separate on top of that, right?
fenomas Posted April 28, 2015
Okay, I wound up trying these today. First, freezeWorldMatrix is a solid improvement. Here's a scene with a couple thousand meshes (only a few hundred draw calls), which already uses octrees to moderately speed up mesh selection:
Here's the same scene after freezing the terrain:
So yeah, solid improvement! Very cool. With that said, some thoughts:
1. Would mesh.static:Boolean be a better name? It would be hard for casual users to guess the implications of "freezeWorldMatrix", but "static" would be pretty straightforward. There might even be other optimizations one could do with a mesh that the user has declared to be "static".
2. Could BJS initialize the mesh's world matrix when the freeze API is called? It would be most straightforward if the user could create a mesh, set its position/rotation, and then freeze it, but that doesn't work (I assume because the matrix doesn't get made until the next render).
3. It doesn't work for billboards. I know you already alluded to that, but do you think there's any (possibly separate) way that billboarded static meshes could be optimized? I think it's a fairly common use case to have lots of terrain billboards that never move (grass, flowers, etc).
jerome Posted April 28, 2015
Sorry for interrupting this very interesting technical debate, but just for my own knowledge: the worldMatrix is the transformation matrix from the mesh's local space to the world space, and so there is one worldMatrix per mesh. Am I right?
The debate here is about improving performance by not recomputing this worldMatrix each frame for meshes tagged as immutable/static once created (if this feature is possible). Did I get it?
Temechon Posted April 28, 2015
Exactly Jerome, you got it.
GameMonetize Posted April 28, 2015
@fenomas:
1. I want to keep it as a function because it implies some drawbacks that the user has to understand. So I prefer having an explicit function there.
2. Already the case: https://github.com/BabylonJS/Babylon.js/blob/master/Babylon/Mesh/babylon.abstractMesh.ts#L189
3. Billboards need to have a new world matrix per frame, because they are facing the camera.
JCPalmer Posted April 28, 2015
I think I can use the freeze and enabled options for the meshes not on the top of the stack. Also, I started merging meshes. That not only massively reduced this overhead, but also removed 700+ draws. I am sure things will be really fast! It consumes more memory, but you cannot have everything.