Re: 'Tindalos' (Ogre v2.0) SceneManager redesign

Posted: Thu Oct 14, 2010 7:09 pm
by CABAListic
Actually, I don't think assigning patches to someone for testing would be that productive. Instead, when you have time and actually *can* test a patch you spot on the tracker (as in, you either have the time or a ready program which could show the bug and thus test the patch), then do it and leave a comment on the item. This would already help quite a lot when going through the patch queue from our side.

Re: 'Tindalos' (Ogre v2.0) SceneManager redesign

Posted: Thu Oct 14, 2010 8:03 pm
by Wolfmanfx
Yes, of course, but it would make more sense to know which patches are worth going through (I mean, bugfixes are clear, but what about new features?)

Re: 'Tindalos' (Ogre v2.0) SceneManager redesign

Posted: Thu Oct 14, 2010 8:11 pm
by CABAListic
New features basically depend on whether they belong in Ogre, how high the demand for them is, and the quality of the submitted code. On all of these aspects, the opinions of community members are welcome. Again, that would help us judge a submitted patch more quickly.

Re: 'Tindalos' (Ogre v2.0) SceneManager redesign

Posted: Thu Oct 14, 2010 9:23 pm
by jacmoe
sinbad wrote:If you're concerned that patches aren't being reviewed, you can help by reviewing them yourself. Basically, if 3 people in the community have tested a patch and said it works for them (even better if it's on different platforms), it makes the process of central acceptance considerably easier. And if there are problems, helping resolve them by making a fork on BitBucket, applying the patch, then refining it, even giving others in the community a chance to help so that the final version is acceptable in a smooth fashion, all helps these things go quicker.
Exactly! :)
We just need to get the ball rolling.
Cloning the repository and pulling from a fork sounds like a very effective way of working.
That's taking it one step further.
And with Bitbucket now being part of Atlassian, I think the service is about to get better (hopefully). :wink:

Re: 'Tindalos' (Ogre v2.0) SceneManager redesign

Posted: Thu Mar 22, 2012 6:44 pm
by dark_sylinc
I'm here after seeing David Rogers' tweet. Like him, I'm concerned the SM is getting old.

Before reading my post, reading these papers is a must:
I said in another post:
Ogre's performance is below the standards of other AAA engines (Anvil engine, CryEngine, Frostbite 2)
I'm struggling to get 1,000 render calls @ 20 fps, while Anvil engine is doing three times that many render calls at the same frame rate (in both cases, not being GPU bound)

Profiling reveals the compositor wastes a lot of time parsing the scene manager multiple times (when using something other than render_quad) and there are A LOT of cache misses.

The lack of threaded culling makes this even worse. Furthermore, with the DirectX 11 threading model, it's possible to process a scene and batch render calls in multiple threads in a very concurrent way:
  • One thread handles shadow rendering.
  • One thread handles main scene
  • One thread handles environment mapping (i.e. reflections)
Ogre's already struggling to handle a large number of entities in scenes while doing the main scene's & the shadows' rendering in the same thread. When I add env. mapping (one pass, not 6) the number of cache misses inside the scene manager is gigantic.

I'm afraid, as someone suggested, fixing this may require some strong redesign of the Ogre core. For instance, automatic reference counting of pointers goes against a concurrent system. Singletons don't help (as it's very easy for programmers to make a mistake and access a singleton when it isn't safe to do so)
This thread was started by Sinbad as "SM needs 2.0 because of lack of features"; I'm going to say "SM needs 2.0 because of low performance".

Let's analyze the problems:
  • Lack of multi-threading: Reference counting is troublesome
  • Lots of cache misses: In 2001, an octree algorithm was very efficient. Today it contains too many branches. A more "brute force" approach outperforms allegedly "efficient" culling algorithms.
  • Lots of cache misses: The way SceneNode creation is handled is just very non-local in memory.
Solutions proposed:

SceneNode, creation:
SceneNodes go into an array. Plain old std::vector<SceneNode>. SceneManager plugins can send their own queue, but the principle is a large array. Or chunks of arrays similar to this:

Code:

// Two chunks; the outer vector must be sized before its elements are indexed
std::vector<std::vector<SceneNode> > mChunks( 2 );
mChunks[0].reserve( 1000 );
mChunks[1].reserve( 1000 );
A list/set/map with custom allocators to ensure data locality is possible, but they don't thread well.
An array of std::vector<SceneNode*> could be used if the mem. allocator places them contiguously in memory, and is probably a better option (keep reading).
The vector could be kept ordered for faster finds.
Child scene nodes must go in different arrays than the parent scene nodes.
First all parent nodes are updated, then all children nodes, then all their children. Then their bounding boxes are updated bottom-up, just as the Pitfalls of OO paper suggests.
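
A minimal sketch of the per-level layout described above. `FlatNode`, `NodeLevels` and `updateAll` are hypothetical names, not Ogre API, and a single float stands in for a full transform; the bottom-up bounding box pass is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for SceneNode: a local offset plus the index of the
// parent inside the previous level's array (ignored for root nodes).
struct FlatNode
{
    float       localPos;
    float       derivedPos;
    std::size_t parentIdx;
};

// levels[0] holds the roots, levels[1] their children, and so on, so every
// level is one contiguous array that is walked linearly.
typedef std::vector<std::vector<FlatNode> > NodeLevels;

// Update top-down: by the time a level is processed, all its parents are final.
void updateAll( NodeLevels &levels )
{
    for( std::size_t depth = 0; depth < levels.size(); ++depth )
    {
        for( std::size_t i = 0; i < levels[depth].size(); ++i )
        {
            FlatNode &node = levels[depth][i];
            if( depth == 0 )
            {
                node.derivedPos = node.localPos;
            }
            else
            {
                assert( node.parentIdx < levels[depth - 1].size() );
                node.derivedPos = levels[depth - 1][node.parentIdx].derivedPos + node.localPos;
            }
        }
    }
}
```

Note that there is no dirty flag here: every node is recomputed each pass, which is the trade-off the next section argues for.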

SceneNode, updates:
Complete removal of "if( mDirty ) update()" idiom; unless the update() has a loooooot of code to execute inside.

SceneNode, position's & Matrix4 memory layout:
Rather than using Vector3 to store position, I would suggest (optionally) using one array each for X, Y & Z in an SoA arrangement, in which adjacent elements belong to the next SceneNode to parse.
In other words:

Code:

// mem[0] holds every X, mem[1] every Y, mem[2] every Z;
// the second index selects the SceneNode
// SceneNode0
float *posX = &mem[0][0];
float *posY = &mem[1][0];
float *posZ = &mem[2][0];
// SceneNode1
float *posX = &mem[0][1];
float *posY = &mem[1][1];
float *posZ = &mem[2][1];
// SceneNode2
float *posX = &mem[0][2];
float *posY = &mem[1][2];
float *posZ = &mem[2][2];
// SceneNode3
float *posX = &mem[0][3];
float *posY = &mem[1][3];
float *posZ = &mem[2][3];
This way we can, using SSE, update the transforms of 4 scene nodes at the same time.
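
As a rough illustration of why the SoA arrangement suits SSE, here is a portable sketch that moves every node in groups of four. `PositionsSoA` and `translateAll` are made-up names, and plain loops are used instead of intrinsics so the example compiles anywhere; each inner loop touches four adjacent floats of one axis, which is the access pattern a single 4-wide add would cover.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical SoA container: one contiguous array per axis, so element i of
// each array belongs to scene node i.
struct PositionsSoA
{
    std::vector<float> x, y, z;

    explicit PositionsSoA( std::size_t count ) : x( count ), y( count ), z( count ) {}
    std::size_t size() const { return x.size(); }
};

// Translate every node, four at a time.
// Assumes size() is a multiple of 4 (pad the arrays otherwise).
void translateAll( PositionsSoA &pos, float dx, float dy, float dz )
{
    assert( pos.size() % 4 == 0 );
    for( std::size_t i = 0; i < pos.size(); i += 4 )
    {
        for( std::size_t lane = 0; lane < 4; ++lane ) pos.x[i + lane] += dx;
        for( std::size_t lane = 0; lane < 4; ++lane ) pos.y[i + lane] += dy;
        for( std::size_t lane = 0; lane < 4; ++lane ) pos.z[i + lane] += dz;
    }
}
```

With an AoS Vector3 layout the same four X values would be 12 bytes apart, so they could not be loaded into one register without shuffling.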

Multithreading
Each thread would handle a certain amount of SceneNodes to parse for updating.
For example:
  • Thread 1: SceneNodes[0 - 99]
  • Thread 2: SceneNodes[100 - 199]
Singletons: For those who like them, unfortunately they have to go away. Not because they can't coexist, but rather because they encourage non-advanced users to, well, use them, when they kill performance in an MT environment or are simply not safe to access.
An exception is if we implement something like:

Code:

getSingleton( threadId )
I've seen that Boost callbacks have been suggested. My reaction is "hell no!". MT is complex. MT is prone to thread unsafety, MT is hard to make scalable. Debugging MT bugs is a major PITA. The simpler it stays, the better it works.

Updating SceneNodes in multiple threads is dead easy once they're well stored in a vector. Upon creation a threadID is sent to each thread, which is used to index which portion of the vector they have to update. That's pretty much how OpenCL, CUDA & shaders work in general.
Proper care must be taken for:
  • Memory barriers (Hardware out of order execution)
  • Volatile variables in the right places (Compiler reordering execution of key variables)
  • Ensuring children SceneNodes & parent SceneNodes are all updated in the same thread
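
The index-by-threadID scheme above could look roughly like this. It is only a sketch: `NodeRange`, `rangeForThread` and `updateInParallel` are hypothetical names, `std::thread` is my choice rather than anything Ogre provides, and the "transform" is a placeholder multiplication. It also ignores the parent/child constraint from the last bullet, so it only applies to nodes of one hierarchy level.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

// Half-open range [begin, end) of node indices owned by one thread.
struct NodeRange { std::size_t begin, end; };

// Split `count` nodes as evenly as possible across `numThreads` workers.
NodeRange rangeForThread( std::size_t threadId, std::size_t numThreads, std::size_t count )
{
    const std::size_t perThread = ( count + numThreads - 1 ) / numThreads;
    NodeRange r;
    r.begin = std::min( threadId * perThread, count );
    r.end   = std::min( r.begin + perThread, count );
    return r;
}

// Each worker only writes inside its own range, so no locking is needed.
// Assumes derived.size() == local.size().
void updateInParallel( std::vector<float> &derived, const std::vector<float> &local,
                       std::size_t numThreads )
{
    std::vector<std::thread> workers;
    for( std::size_t id = 0; id < numThreads; ++id )
    {
        workers.push_back( std::thread( [&derived, &local, id, numThreads]()
        {
            const NodeRange r = rangeForThread( id, numThreads, derived.size() );
            for( std::size_t i = r.begin; i < r.end; ++i )
                derived[i] = local[i] * 2.0f; // placeholder for the real transform update
        } ) );
    }
    for( std::size_t id = 0; id < workers.size(); ++id )
        workers[id].join(); // after join() the workers' writes are visible to the caller
}
```

A real implementation would keep the worker threads alive between frames instead of spawning them on every update.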
Culling
Like I said in my quote, software occlusion culling is very popular today. It can be highly parallelized (heck, that's why GPUs are so good at rasterization), is extremely cache friendly and very fast.
With SSE and the proper care described above (use one or more large SceneNode arrays, use SoA, remove the 'if dirty', etc.), writing a SW rasterizer that only writes depth and outputs an array of the SceneNodes (or Renderables?) that are visible should be a three-week job, tops. Hi-Z & Z compression can also be implemented by storing the triangle's plane equation for blocks of pixels; that's how GPUs do it.

Multithreading part 2
Updating the SceneNodes in parallel is one task. But parsing a Scene can be parallelized differently:
  • The "main" scene
  • The "shadow texture" scene
  • The "alternate" scenes (i.e. environment mapping for reflections)
All of them can have their culling and render queues parsed in different threads. The compositor should be updated for specifying which render_scene passes are independent of each other so they can be parallelized.
Furthermore, D3D11 allows for grouping all batches for each "scene" in different threads and then dispatching them to the main thread. A fallback for emulating this behavior ourselves must be implemented for D3D9 (I dunno OGL & GL ES status regarding this).

Memory & Resource management
There are different options to consider. All of them mean the removal of automatic reference counting. It's a very lazy & handy solution, but I've seen ref. counts going way high (i.e. 8) because an object was passed down multiple levels down the stack. And that's even happening inside Ogre, when the render queues query for camera visibility. Not to mention Ogre's ref. count implementation contains a level of indirection (= cache misses)
  • Use explicit reference counting. obj->addReference() obj->removeReference(). Havok goes along this way. I don't particularly like this method.
  • Use a load-remove pattern. Specify a set of rules that say when & where an object can or must be loaded/created, and when & where it must be deleted. For example, if the user wants to destroy a Material, offer a utility function that helps track down all the Entities using that material so the user can decide what to do (don't destroy the material, destroy the entities, change their material). Careful developers know that whenever you explicitly write SceneMgr::destroy( xx ) you must be sure you don't have an instance of 'xx' wandering around. I like this idea better; some don't.
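
To make the load-remove idea concrete, here is a small sketch of the kind of utility function mentioned in the last bullet. `SimpleEntity`, `findEntitiesUsing` and `canDestroyMaterial` are hypothetical and far simpler than the real Ogre classes; the point is only that the dependency lookup is explicit instead of being hidden behind a reference count.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical minimal entity; real Ogre entities carry far more state.
struct SimpleEntity
{
    std::string name;
    std::string materialName;
};

// Returns every entity that still refers to `materialName`, so the caller can
// decide whether to keep the material, destroy the entities, or re-assign them.
std::vector<const SimpleEntity*> findEntitiesUsing( const std::vector<SimpleEntity> &entities,
                                                    const std::string &materialName )
{
    std::vector<const SimpleEntity*> users;
    for( std::size_t i = 0; i < entities.size(); ++i )
    {
        if( entities[i].materialName == materialName )
            users.push_back( &entities[i] );
    }
    return users;
}

// Under the rule sketched here, destroying is only allowed once nothing
// references the material any more.
bool canDestroyMaterial( const std::vector<SimpleEntity> &entities,
                         const std::string &materialName )
{
    return findEntitiesUsing( entities, materialName ).empty();
}
```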
D3D11 Readiness (watch out)
One of the key elements that's holding back D3D11 is that Microsoft switched the behavior of const GPU memory. Static buffers are filled when declared.
Ogre assumes that static buffers can be filled after being declared; but D3D11 doesn't support this.
It's normal in Ogre code to see this:

Code:

VertexBuffer vb = createVB( staticTrue );
float *ptr = vb->lock();
..
*ptr++ = myPos[i];
..
vb->unlock();
While D3D11 expects this:

Code:

VertexBuffer vb = createVB( staticTrue, myPos );
So, if anyone decides to accomplish this task of major SM overhaul and sees this code, please change it to the new behavior so D3D11 plugin development can be accelerated.
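
A sketch of what a "fill at creation" interface could look like on the Ogre side. `StaticVertexBuffer` and `createStaticVB` are hypothetical names and the storage is an ordinary std::vector standing in for GPU memory. In real D3D11 the initial contents are passed through the `pInitialData` argument of `ID3D11Device::CreateBuffer`, which is mandatory for `D3D11_USAGE_IMMUTABLE` buffers.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical immutable buffer: contents are supplied once, at creation, and
// there is deliberately no lock()/unlock() pair to write through afterwards.
class StaticVertexBuffer
{
    std::vector<float> mData;
public:
    StaticVertexBuffer( const float *initialData, std::size_t count ) :
        mData( initialData, initialData + count ) {}

    std::size_t size() const  { return mData.size(); }
    const float* data() const { return mData.data(); }
};

// Callers assemble the vertex data in system memory first, then create the
// buffer in one step, mirroring the createVB( staticTrue, myPos ) form above.
StaticVertexBuffer createStaticVB( const std::vector<float> &positions )
{
    return StaticVertexBuffer( positions.data(), positions.size() );
}
```

Because the class exposes only read access, code that tries to fill the buffer after creation fails at compile time instead of at runtime inside the render system.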

Reading this post can cause headaches, I know. I could implement the SM from scratch, reusing old code where necessary to account for these changes, but the truth is I don't have the time & money to do that.

I'll be waiting for feedback.
Cheers
Dark Sylinc

PS: I estimate that an update as large as this for an experienced programmer could take around 4-6 months + testing, which is more than what GSoC provides.

Re: 'Tindalos' (Ogre v2.0) SceneManager redesign

Posted: Thu Mar 22, 2012 7:13 pm
by Kojack
The Pitfalls Of OO is an excellent paper. I wish more papers put so much effort into explaining things.

The particle system in ogre is another area that can benefit from cache consideration. Currently every particle is newed individually (may not be contiguous) then stored in std::lists (a list for active and a list for inactive).
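
For comparison, a contiguous pool could replace the individually newed particles and the two std::lists. This is only a sketch with made-up names (`PooledParticle`, `ParticlePool`), not how Ogre's particle system is structured: live particles occupy the front of one array and dead ones are swapped out of the live range.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical particle; the real Ogre::Particle has more fields.
struct PooledParticle
{
    float x, y, z;
    float timeToLive;
};

// All particles live in one contiguous array. The first mActive entries are
// alive; everything after them is free, so no per-particle new/delete occurs.
class ParticlePool
{
    std::vector<PooledParticle> mParticles;
    std::size_t                 mActive;
public:
    explicit ParticlePool( std::size_t capacity ) : mParticles( capacity ), mActive( 0 ) {}

    std::size_t activeCount() const { return mActive; }

    // Returns a null pointer when the pool is exhausted.
    PooledParticle* emit( float timeToLive )
    {
        if( mActive == mParticles.size() )
            return 0;
        PooledParticle &p = mParticles[mActive++];
        p.x = p.y = p.z = 0.0f;
        p.timeToLive = timeToLive;
        return &p;
    }

    // Age every live particle; a dead one is swapped with the last live one,
    // which keeps the active range dense without shifting the array.
    void update( float dt )
    {
        std::size_t i = 0;
        while( i < mActive )
        {
            mParticles[i].timeToLive -= dt;
            if( mParticles[i].timeToLive <= 0.0f )
                std::swap( mParticles[i], mParticles[--mActive] );
            else
                ++i;
        }
    }
};
```

One consequence of the swap is that pointers returned by emit() do not stay tied to the same particle across update() calls, so callers should not hold on to them.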


I agree with pretty much everything you wrote.
(I'm not really a fan of reference counting or singletons, so that part doesn't worry me)

Re: 'Tindalos' (Ogre v2.0) SceneManager redesign

Posted: Thu Mar 22, 2012 7:20 pm
by CABAListic
While the whole project is certainly not suitable for GSoC, I think it can - and must - be broken down into smaller pieces. For example, I'm not certain the current thread support in Ogre can handle this kind of micro-tasking, given that it was designed primarily for background computations and background resource loading. Even if it can handle it, we'd need some mechanism to ensure that SceneNode updates etc. can never be blocked by background tasks.
This is groundwork that I think would need to be resolved first.

It might also be a good idea to start a new topic. This one's already quite long, and if you now shift attention to other areas, it will get chaotic :)

Re: 'Tindalos' (Ogre v2.0) SceneManager redesign

Posted: Thu Mar 22, 2012 10:03 pm
by Wolfmanfx
@dark_sylinc
I just recently went through the SceneManager code, and one of the main problems is that everything inside OGRE is extremely configurable; for example, you can control whether a RenderQueue gets rendered several times via listeners. The downside is that if you walk down one renderOneFrame() you will notice branches everywhere for maximum flexibility. I mean, nobody is using this Ogre::RenderQueueInvocationSequence; it was just added because one user wanted more control :)

I think first steps should be:
* Remove all kinds of funky features like RenderQueueInvocationSequence, useless listeners and the like.
* Factor out renderSingleObject from the SceneManager.
Maybe we should split out more code, but I am not sure; there is so much code which handles all sorts of shadow stuff.
* Pass -> this class was extremely useful for fixed-function based materials, but nowadays, where FF is removed almost everywhere (DX11, OGL 3.2, GLES 2.0), it's a really fat class where most members are useless for shader-based rendering. Also, pass sorting on hash code is/was a good idea, but the default hash code sorts on the first 2 texture units.
I mean, this was a USP: Ogre splits one pass into several passes when your hardware does not support 8 TUs, but it does not work with shaders.

Regarding your ideas, I like doing a software z-buffer, but this is something which is not that urgent; we should make the overall code simpler, and yes, this means removing features at the beginning. We have to create a plan/roadmap; then it would be easy to spread the workload over more people. Otherwise (if there is no design document) we will never manage to handle such a massive redesign (and this could mean OGRE is out of order in the next couple of years).

Re: 'Tindalos' (Ogre v2.0) SceneManager redesign

Posted: Thu Mar 22, 2012 10:23 pm
by spacegaier
Really good input in the last posts!!! My question now (before we get back to the technical details): How should we drive that from an organizational point of view? I agree that this whole thing needs planning and thinking through, and none of us could ever accomplish this on his/her own.

The first idea that pops into my mind is to create some wiki pages for the different chapters dark_sylinc outlined where we collect the ideas / points agreed upon from a design point of view. Once in some weeks/months/... the Ogre team + experienced users (yes you guys dark_sylinc, Wolfmanfx, ...) will vote for each of those chapters whether they feel comfortable with the design and then we can think about implementation (who, how, and all those questions).

Those wiki pages would also contain links to whitepapers and good forum threads, basically put together everything needed for a good decision voting. I would also think that we could perhaps create some buzz about those planned changes in other communities, to get some input from there as well for the (re-)design.

Comments?

Re: 'Tindalos' (Ogre v2.0) SceneManager redesign

Posted: Thu Mar 22, 2012 10:31 pm
by Xavyiy
Ogre really needs that kind of heavy redesign. And IMO it cannot be postponed any longer.
Like the majority of you, I have the feeling that we're getting old. Ogre is a brilliant piece of software, but it's mainly the same thing as 5 or 6 years ago, and even based on an older design (10-12 years?).

The fact is that this redesign will break backward compatibility, so it has to be done in a very clear and well-defined way if we want developers to be able to update their projects to the new version without too many problems. IMO, since Steve's retirement we've been a little (or not that little..) careless regarding this topic. There have been lots of little improvements, but it seems a large number of them were unplanned, and the most direct proof of that is the lack of a full Ogre 1.8 changelog.

IMO we have to change that. We must define a clear roadmap and follow it, not define it and forget it, as seems to have happened with this one: http://www.ogre3d.org/tikiwiki/ByatisNotes . Divide the work into "little" tasks: write a short overview, implementation details and porting notes. That may take a while, but in 12-18 months Ogre 2.0 may be a reality, and with the changes that dark_sylinc, David Rogers and many others are proposing we can be the reference again, even taking commercial engines into account.

@Wolfmanfx
You're right that a little refactoring must be done in the current SM code since, as you say, there are some useless listeners, but it's very important to allow the 'advanced' developer a lot of control over certain "low-level" parts. That may not make much sense in games or apps that "directly" use Ogre, but it's VERY important in engines (we must remember that Ogre is a graphics engine, so in the end it must be just one part of a bigger engine) or when you're implementing complex pipelines.

Of course the expert user will easily modify the needed Ogre parts, but personally I prefer to use listeners or reimplement classes in order to "control" Ogre without having to modify it.

For example, in the Paradise Engine I've been extending some Ogre parts (for example, writing a material system which internally uses the Ogre material system but implements advanced features like using Paradise Engine object properties as shader parameters automatically, textures, etc...) without having to modify Ogre itself, thanks to the high degree of "customization" offered.

RenderObjectListener::notifyRenderSingleObject(...) has been very useful here, and also being able to directly invoke RenderSystem::_setTextureUnitSettings(...), RenderSystem::bindGpuProgramParameters(...), ...

Xavier

Re: 'Tindalos' (Ogre v2.0) SceneManager redesign

Posted: Thu Mar 22, 2012 10:38 pm
by dark_sylinc
Additionally, regarding multithreading, I've seen Havok having 2 interesting functions:
  • myManagerOwner->markForRead() / myManagerOwner->markForWrite()
  • myManagerOwner->unmarkForRead() / myManagerOwner->unmarkForWrite()
They're 'like asserts': they don't protect threads from concurrent access (i.e. like a mutex); but they warn you when two threads are accessing the same memory or executing the same code at the same time.

This is useful when you're sure you shouldn't need a mutex but you're worried a bug could prove you wrong. Diagnosing MT bugs becomes easier while at the same time you get the performance advantage of not using a mutex. (Note: It's not bulletproof. If you're very, very, very unlucky and the race condition in a particular place is very rare, the assertion will rarely fire; therefore it's not recommended in life-critical instruments.)

I don't know how their implementation works, but I'm sure it must be something like this, as it's very easy to make one (I just wrote it):

Code:

//Windows-only sketch: the Interlocked* functions come from <windows.h>
//This class can NOT handle more than 65,535 threads with read access
class AssertingMutex
{
#ifndef OGRE_NO_ASSERTING_MUTEX
	//The Interlocked* functions operate on a 32-bit aligned volatile LONG
	__declspec(align(4)) volatile LONG mLockCount;
#endif
public:
#ifndef OGRE_NO_ASSERTING_MUTEX
	AssertingMutex() : mLockCount( 0 ) {}

	void markForWrite()
	{
		//Add 65,536
		LONG oldCount = InterlockedExchangeAdd( &mLockCount, 0x00010000 );
		if( oldCount )
		{
			//When having write access, only OURSELVES can access this section. No one else
			OGRE_EXCEPT( Exception::ERR_THREAD_UNSAFE, "Freaking Out at " + dumpCallStack(),
							"AssertingMutex::markForWrite" );
		}
	}

	void unmarkForWrite()
	{
		//Subtract 65,536
		InterlockedExchangeAdd( &mLockCount, -0x00010000 );
	}

	void markForRead()
	{
		//Add 1
		LONG oldCount = InterlockedIncrement( &mLockCount ) - 1;
		//A single writer leaves the count at exactly 0x00010000, hence >= and not >
		if( oldCount >= 0x00010000 )
		{
			//Multiple threads can access the same region for read access, but freak out
			//if a thread had already requested write access
			OGRE_EXCEPT( Exception::ERR_THREAD_UNSAFE, "Attempting to get read access while a thread "
							"locked for write access. Freaking Out at " + dumpCallStack(),
							"AssertingMutex::markForRead" );
		}
	}

	void unmarkForRead()
	{
		//Subtract 1
		InterlockedDecrement( &mLockCount );
	}
#else
	//No-op versions, still public so callers compile unchanged
	void markForWrite() {}
	void unmarkForWrite() {}
	void markForRead() {}
	void unmarkForRead() {}
#endif
};
dumpCallStack() is a function that should dump the stack :P
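
For readers outside Windows, the same counting scheme can be sketched with `std::atomic` (C++11). `PortableAssertingMutex` is a hypothetical name, and a `std::logic_error` stands in for `OGRE_EXCEPT` and the call stack dump; otherwise the logic mirrors the class above.

```cpp
#include <atomic>
#include <stdexcept>

// Readers add 1, writers add 0x10000, so the upper half of the counter flags
// an active writer. Like the original, this detects misuse; it does not block.
class PortableAssertingMutex
{
    enum { WRITER = 0x00010000 };
    std::atomic<int> mLockCount;
public:
    PortableAssertingMutex() : mLockCount( 0 ) {}

    void markForWrite()
    {
        // A writer must be alone: any previous reader or writer is a violation.
        if( mLockCount.fetch_add( WRITER ) != 0 )
            throw std::logic_error( "markForWrite: region already in use" );
    }
    void unmarkForWrite() { mLockCount.fetch_sub( WRITER ); }

    void markForRead()
    {
        // Readers may overlap each other, but never a writer.
        if( mLockCount.fetch_add( 1 ) >= WRITER )
            throw std::logic_error( "markForRead: region locked for write" );
    }
    void unmarkForRead() { mLockCount.fetch_sub( 1 ); }
};
```

As in the original, the counter is left incremented when a check fails, so this is a debugging aid for aborting with a diagnostic rather than something to recover from.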

Re: 'Tindalos' (Ogre v2.0) SceneManager redesign

Posted: Thu Mar 22, 2012 10:54 pm
by dark_sylinc
Wolfmanfx wrote:@dark_sylinc
I just recently went through the SceneManager code, and one of the main problems is that everything inside OGRE is extremely configurable; for example, you can control whether a RenderQueue gets rendered several times via listeners. The downside is that if you walk down one renderOneFrame() you will notice branches everywhere for maximum flexibility. I mean, nobody is using this Ogre::RenderQueueInvocationSequence; it was just added because one user wanted more control :)
Yes I agree, but the problem is not that "it's flexible", but rather "it was designed to be flexible without threading in mind". A multi-threading system can be very flexible, once you get the core base running solid.
Wolfmanfx wrote: Regarding your ideas, I like doing a software z-buffer, but this is something which is not that urgent; we should make the overall code simpler, and yes, this means removing features at the beginning. We have to create a plan/roadmap; then it would be easy to spread the workload over more people. Otherwise (if there is no design document) we will never manage to handle such a massive redesign (and this could mean OGRE is out of order in the next couple of years).
I believe that once the major work on the SM is done, making a fast rasterizer should be a breeze. BTW I think there was a fast open source software implementation out there; if I find it again we could look into integrating it.
Furthermore, alternative scene manager plugins must still be on the table.
I'm not sure a SW Rasterizer occl. culling can outperform an octree in a handheld device yet. Furthermore it's possible that in the future we could see a comeback of tree-based culling with hybrid systems for extremely large games (I'm thinking of world-size levels in... who knows how many years)

I agree that Ogre 2.0 will break backwards compatibility, but the most important thing is that users still keep the same feeling, that they're working with something familiar; like "Oh, that's how Entities are created now" "Oh, it's like when I did XX thing with the SceneNode/Material/Entity/Texture/etc"
Because after all, it's still a graphics engine. Just like cars had the gearshift behind the steering wheel and later next to the driver; but the basics were still the same.

Re: 'Tindalos' (Ogre v2.0) SceneManager redesign

Posted: Thu Mar 22, 2012 11:20 pm
by Xavyiy
dark_sylinc wrote:I agree that Ogre 2.0 will break backwards compatibility, but the most important thing is that users still keep the same feeling, that they're working with something familiar; like "Oh, that's how Entities are created now" "Oh, it's like when I did XX thing with the SceneNode/Material/Entity/Texture/etc"
Because after all, it's still a graphics engine. Just like cars had the gearshift behind the steering wheel and later next to the driver; but the basics were still the same.
+1

Also, it's very important to define the external Ogre usage independently of the internal Ogre behaviour. For example, even if Ogre uses an MT approach for updating scene nodes, animations, scene organization, rendering, etc., the API can still allow the user to create everything as now, in an ST environment, or to create resources (in the sense of entities, materials, scene nodes, etc. rather than loading textures/meshes/scripts) per thread (that would have huge potential), etc.

I think the first thing to do is to define all those not-very-technical ideas/requirements, and then study the real implementation.

Re: 'Tindalos' (Ogre v2.0) SceneManager redesign

Posted: Thu Mar 22, 2012 11:43 pm
by madmarx
Hi,
I have wondered more than once whether I should try to redo the scenemanager on my own, but my main problems were:
1/ the ton of 'unnecessary' features / listeners / functions. It is too big.
2/ the resource system, which ... is bad.
I think we should also avoid the inheritance in scenenodes (nowadays there are nodes/scenenodes/bones ... and it makes things harder to code + slower to execute).

Ok for having 'kind of the same feeling', but not for mimicking thousands of overly complex functions. If the same 'feeling' can be achieved simply with the name of the classes + a few functions that would be much better.

As pointed out by Sinbad, usually one guy does the job and then presents an alpha version. I mostly agree with that way of working. But in that case, it could be difficult to work together to define the concepts/interfaces that we want to keep. Maybe we could work on a bunch of "core class interfaces" (camera/scMgr/entity/scenenodes/subentity/renderable/light) and mostly agree on the interface by vote before any implementation, while also knowing what we want to achieve? I don't think it would be that hard to agree on a "new unpolluted" interface.

EDIT : just read the pitfalls of OO. Glad to see I was right about inheritance ^^

Best regards ,

Pierre

Re: 'Tindalos' (Ogre v2.0) SceneManager redesign

Posted: Fri Mar 23, 2012 1:10 am
by Kojack
I think we should also avoid the inheritance in scenenodes (nowadays there are nodes/scenenodes/bones ... and it makes things harder to code + slower to execute).
Currently there's:
Node/Bone
Node/Bone/TagPoint
Node/SceneNode
Node/SceneNode/OctreeNode (this is what most of us are using, even if you don't realise it)
Node/SceneNode/BSPSceneNode
Node/SceneNode/PCZSceneNode
:)

Node depends on Mesh and Renderable, it has the ability to draw itself as coordsys axis arrows, but otherwise is a non graphical object. SceneNode adds an aabb and a list of movable objects (the actual visible things). TagPoint adds some transform inheritance control to Bone (but node already has similar behaviour) and the ability for a Bone to contain a movable object (which is what SceneNode already does).
I'm sure Node, SceneNode, Bone and Tagpoint could be simplified down to maybe just one or two classes.

OctreeNode (any time you ask for a scenenode, ogre will probably give you one of these instead, the octree scene manager is usually default when its plugin is present) adds to SceneNode a second aabb, 24 corner reals, 8 colours (unsigned longs), 24 indices (shorts) and an octant pointer.
But in fact the only code that uses the second aabb, corners, colours or indices is commented out with "Todo" at the top. The only needed extra member is the octant pointer (so only 4 bytes of the extra 184 are used).

Re: 'Tindalos' (Ogre v2.0) SceneManager redesign

Posted: Fri Mar 23, 2012 1:42 am
by masterfalcon
This is a lot of good feedback. In addition to a redesign for performance and modernity, cleaning up cruft and streamlining the API would be fantastic.

dark_sylinc gave some great links to presentations. Are there any other relevant reading materials concerning design and scene management techniques/research that anyone else recommends?

Re: 'Tindalos' (Ogre v2.0) SceneManager redesign

Posted: Fri Mar 23, 2012 10:17 am
by tuan kuranes
Here's my 2 cents, which would be a way to solve the performance and maintainability problems while keeping the flexibility of the current architecture:

The idea would be to give Ogre entirely decoupled "stages", each stage being one or more plugins:

User side: the user updates their transform tree(s), which fills a transform buffer (asynchronously, finishing the "frame" with an ogre->frameSend() or something)

Ogre side:
  • Transform Stage -> Ogre transforms the buffers (handling page/locality/etc), filling the cull buffers
  • Cull Stage -> Culling per render target fills Ogre renderqueues
  • Shading Stage -> Shade/Render each renderqueue according to its "shading/rendering" type into a dx/gl/etc. command buffer
  • Execute Stage -> merge (or dispatch between GPU/Tiles/etc) all command buffer and execute them (asynchronously)
Each stage would be a plugin that can be swapped, instead of the current huge "render in every possible way" approach using branches in a single monolithic render loop.
Between stages, careful use of multithreading-enabled "copy on write" buffers allows multiple threads to read data from them (memory is cheap).

That way user-oriented "listener" system is replaced by user tweakable/readable plugins.

Performance wise :
  • "command buffer"-like orientation, where commands are Ogre commands for each stage (transform/cull/renderqueue; each stage allows for its own multithreading/data-oriented design/GPGPU opportunities).
  • CUDA/OpenCL/compute etc. core integration => future-oriented engine coupled with data-oriented design. Allows use of those GPGPU things when possible (and/or multithreading), just by making a new plugin at one of the "stages".
Design wise, that allows for flexibility and much simpler, more specialized code:
  • Ogre "shader/renderer" stage as plugin: allows for a forward shader/renderer, a deferred renderer, tile deferred renderer, light forward renderer, megatexture renderer, megamesh renderer, etc. (why not a shadow renderer), so that code branching and code complexity are greatly reduced. That makes it much simpler for users to dive into the code, so advanced users, not needing listeners, can tweak each stage according to their needs.
  • Total decoupling between the "transformation" tree (which therefore moves to user space), "culling" (plugin based), the render queue renderer (plugin based), etc. That will lead to much simpler code that aims to be branchless, and to better DOD/multithreading opportunities.
  • Multiple plugins at once per stage, meaning multiple buffer types per stage: bone animation could be a different plugin than node animation, for instance, so the Ogre "transform buffer stage" should be able to call multiple plugins depending on the transform buffer type; likewise a shadow renderer & a forward renderer (allowing very specialized code for each stage).
  • Easier debugging/performance testing: since the first planned step would be just refactoring the current code into those stages, coding a new plugin for a "stage type" allows easy comparison with the old one (comparing a DOD cull against the current cull, for instance).
  • Easier port to consoles or a new renderer? The final command buffer should be much more straightforward and require less engine tweaking.
  • Ogre becomes less of a black box engine and more of a data pipeline, from transform tree to commands. Fewer singletons needed and fewer constraints. Users could even maintain multiple entirely independent transform trees?
Cons: cannot imagine the mess of a compilation setup and source tree that would be.
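
To give the staged design above a concrete shape, here is a toy sketch. All names (`Stage`, `StageBuffer`, `OffsetTransformStage`, `RangeCullStage`, `runPipeline`) are invented for illustration, the buffers hold plain floats instead of transforms or render commands, and everything runs on one thread; it only shows how swappable stages could pass buffers to each other without shared global state.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical data passed between stages.
typedef std::vector<float> StageBuffer;

// One swappable pipeline stage: reads the previous stage's buffer and fills
// its own output.
class Stage
{
public:
    virtual ~Stage() {}
    virtual void run( const StageBuffer &input, StageBuffer &output ) = 0;
};

// Stand-in for the transform stage: applies a fixed offset to every entry.
class OffsetTransformStage : public Stage
{
    float mOffset;
public:
    explicit OffsetTransformStage( float offset ) : mOffset( offset ) {}
    virtual void run( const StageBuffer &input, StageBuffer &output )
    {
        output.clear();
        for( std::size_t i = 0; i < input.size(); ++i )
            output.push_back( input[i] + mOffset );
    }
};

// Stand-in for the cull stage: keeps only entries inside [0, mMaxVisible].
class RangeCullStage : public Stage
{
    float mMaxVisible;
public:
    explicit RangeCullStage( float maxVisible ) : mMaxVisible( maxVisible ) {}
    virtual void run( const StageBuffer &input, StageBuffer &output )
    {
        output.clear();
        for( std::size_t i = 0; i < input.size(); ++i )
        {
            if( input[i] >= 0.0f && input[i] <= mMaxVisible )
                output.push_back( input[i] );
        }
    }
};

// Runs the stages in order, handing each one the previous stage's output.
StageBuffer runPipeline( const std::vector<Stage*> &stages, const StageBuffer &userInput )
{
    StageBuffer current = userInput;
    for( std::size_t i = 0; i < stages.size(); ++i )
    {
        StageBuffer next;
        stages[i]->run( current, next );
        current.swap( next );
    }
    return current;
}
```

Swapping in a different cull implementation then means passing a different `Stage*`, which is also what would make the side-by-side comparisons mentioned in the list straightforward.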

few links (can add more if needed):

Re: 'Tindalos' (Ogre v2.0) SceneManager redesign

Posted: Fri Mar 23, 2012 7:52 pm
by m2codeGEN
Kojack wrote:
I think we should also avoid the inheritance in scenenodes (nowadays there are nodes/scenenodes/bones ... and it makes things harder to code + slower to execute).
Currently there's:
Node/Bone
Node/Bone/TagPoint
Node/SceneNode
Node/SceneNode/OctreeNode (this is what most of us are using, even if you don't realise it)
Node/SceneNode/BSPSceneNode
Node/SceneNode/PCZSceneNode
:)
In our project there are additional successors of a Node class.
Node/SceneNode/SlSceneNode/[SlGameObject]/[SlGeometrySwitcher]/[SlLodSwitcher] :D

Refactoring the SM will take hard work by professional programmers.

Re: 'Tindalos' (Ogre v2.0) SceneManager redesign

Posted: Fri Mar 23, 2012 8:00 pm
by spacegaier
m2codeGEN wrote:In our project there are additional successors of a Node class.
Node/SceneNode/SlSceneNode/[SlGameObject]/[SlGeometrySwitcher]/[SlLodSwitcher] :D
But those are custom nodes created by you and hence not part of Ogre :) .
m2codeGEN wrote:Refactoring the SM will take hard work from professional programmers.
We've got some really talented people on board.

Re: 'Tindalos' (Ogre v2.0) SceneManager redesign

Posted: Fri Mar 23, 2012 8:18 pm
by dark_sylinc
A few notes based on experience:
Kojack wrote:Currently there's:
Node
Node/Bone
Node/Bone/TagPoint
Node/SceneNode
Almost all Nodes derive into SceneNodes.
The only reason there seems to be a need for "Node" is because 'SceneNode' & 'Bone' derive from it. It could be analyzed if SceneNode & Node could be merged, and make 'Bone' inherit from 'SceneNode' and ignore the rendering part.

TagPoints are really bad. Problems with TagPoints:
  • When using manual LOD, the model uses a different instance of the skeleton, and thus the TagPoint just stops updating.
  • It breaks the "attach to parent node" pattern, as it is attached to the Entity.
  • All Renderables attached to the TagPoint depend on the parent Entity's visibility. For example, if the child has visibility mask 0xFF00 while the parent has 0x00FF, the child node won't be drawn if the viewport's mask is 0xFF00.
Personally, I use a dummy TagPoint, a SceneNode, and a NodeListener to overcome those problems.
Ideally, TagPoints shouldn't exist. They should be regular SceneNodes watching for the Entity's matching Bone and for manual LOD switches. Currently, attaching to a Bone directly doesn't work because they aren't part of the Scene Graph, but that could change.
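
The visibility-mask problem from the list above can be sketched as plain bit tests. This is a simplified model rather than Ogre code: the function names are hypothetical, and whether the child's own flags are also tested in the TagPoint case is an assumption. The point is only that the parent's flags gate the child.

```cpp
#include <cstdint>

// Usual bitmask test: visible when flags and mask share at least one bit.
constexpr bool passesMask(std::uint32_t flags, std::uint32_t viewportMask) {
    return (flags & viewportMask) != 0;
}

// Object on a regular SceneNode: tested on its own flags only.
constexpr bool drawnOnSceneNode(std::uint32_t childFlags,
                                std::uint32_t viewportMask) {
    return passesMask(childFlags, viewportMask);
}

// Object on a TagPoint: it is only reached through the parent Entity, so the
// parent's flags must pass first (assumption: the child's flags are tested too).
constexpr bool drawnOnTagPoint(std::uint32_t parentFlags,
                               std::uint32_t childFlags,
                               std::uint32_t viewportMask) {
    return passesMask(parentFlags, viewportMask) &&
           passesMask(childFlags, viewportMask);
}
```

With the numbers from the example, `drawnOnSceneNode(0xFF00, 0xFF00)` is true, while `drawnOnTagPoint(0x00FF, 0xFF00, 0xFF00)` is false because the parent fails the test.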

So, what I'm saying is that there should be:

Node
Node/SceneNode (Replace TagPoints)
Node/Bone

or

SceneNode
SceneNode/Bone

Re: 'Tindalos' (Ogre v2.0) SceneManager redesign

Posted: Sat Mar 24, 2012 6:50 am
by Kojack
m2codeGEN wrote:In our project there are additional successors of a Node class.
Node/SceneNode/SlSceneNode/[SlGameObject]/[SlGeometrySwitcher]/[SlLodSwitcher]
If your SlSceneNode inherits from SceneNode instead of using composition or aggregation, then hopefully you aren't using a plugin scene manager (or wrote your own scene manager), or just use them in your own code and never let an ogre scene manager see them. Otherwise things will get messy. The default scene manager (when no plugin scene managers are loaded) would be ok, it does work with scene nodes, but all the others (like the octree scene manager) will have trouble.
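
For comparison, a minimal sketch of the composition alternative mentioned above. `SceneNode` here is a tiny stand-in so the snippet compiles on its own (it is not `Ogre::SceneNode`), and `GameObject` is a hypothetical name; the idea is that the game object holds a pointer to a node the scene manager created, rather than being a SceneNode subclass the scene manager has to cope with.

```cpp
#include <string>
#include <utility>

// Stand-in for the engine's scene node type; NOT Ogre::SceneNode.
struct SceneNode {
    float x = 0.0f, y = 0.0f, z = 0.0f;
    void setPosition(float px, float py, float pz) { x = px; y = py; z = pz; }
};

// Composition: game logic lives here and forwards to the node it refers to.
class GameObject {
public:
    GameObject(std::string name, SceneNode* node)
        : name_(std::move(name)), node_(node) {}

    void moveTo(float px, float py, float pz) { node_->setPosition(px, py, pz); }

    const std::string& name() const { return name_; }
    SceneNode* node() const { return node_; }

private:
    std::string name_;
    SceneNode* node_;  // non-owning: the scene manager owns and destroys the node
};
```

Because the node stays whatever type the active scene manager created, this arrangement should not care which scene manager plugin is loaded.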

Re: 'Tindalos' (Ogre v2.0) SceneManager redesign

Posted: Sat Mar 24, 2012 10:23 am
by Kojack
I just took a quick look at sizes. Node is 332 bytes, scenenode is 480, octreenode is 612 (some of those members I mentioned were static, so it's a little less than 184 extra bytes).
Node has a bunch of bools mixed in with the other members, so that's not going to help class packing. I also didn't realise that Node not only has position, scale, orientation, derivedposition, derivedscale and derivedorientation, but there's also initialposition, initialscale and initialorientation. These initial values are for animation tracks. But since not everything is animated, those values would be better off stored in something like the animation track, instead of the node base class.
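
A small illustration of the packing point, using made-up member names rather than the real Node layout: on common ABIs where `float` is 4-byte aligned, each `bool` placed between vector members drags in padding, while grouping the bools at the end lets them share one slot.

```cpp
#include <cstddef>

struct Vec3 { float x, y, z; };

// Bools interleaved with 4-byte-aligned members: each bool is typically
// followed by padding so the next Vec3 starts on an aligned address.
struct InterleavedNode {
    bool needsUpdate;
    Vec3 position;
    bool inheritScale;
    Vec3 scale;
    bool inheritOrientation;
    Vec3 derivedPosition;
};

// Same members, bools grouped together at the end.
struct GroupedNode {
    Vec3 position;
    Vec3 scale;
    Vec3 derivedPosition;
    bool needsUpdate;
    bool inheritScale;
    bool inheritOrientation;
};

// Bytes saved per node by regrouping (value depends on the platform ABI).
constexpr std::size_t bytesSaved() {
    return sizeof(InterleavedNode) - sizeof(GroupedNode);
}
```

The exact sizes depend on the compiler and platform, so it is worth printing `sizeof` for the real classes before and after any reordering.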

Nodes have a cached transform matrix. It would be cool if this could be set directly instead of generated by the transform components. This way people who want shear matrices or post-rotation scaling can do it themselves, and interfacing with libraries (physics libs, etc) that work purely in matrices is easier (no need to change a matrix into ogre transforms, when they are just going to become a matrix again).
(Of course fitting this in with a cache friendly scene manager like in Pitfalls could be a little trickier, but I haven't thought much about it yet).
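
A rough sketch of what "setting the cached matrix directly" could look like. Everything here is hypothetical (`SketchNode`, `setMatrix`, `clearMatrixOverride`), and rotation is left out to keep it short: the node normally builds its matrix from position and scale, but a caller can supply a full matrix, for example one with shear or one coming straight from a physics library.

```cpp
#include <array>

using Matrix4 = std::array<float, 16>;  // row-major 4x4

struct SketchNode {
    float position[3] = {0.0f, 0.0f, 0.0f};
    float scale[3] = {1.0f, 1.0f, 1.0f};

    // Hypothetical: bypass the transform components and use this matrix as-is.
    void setMatrix(const Matrix4& m) {
        cached_ = m;
        overridden_ = true;
    }

    // Go back to deriving the matrix from position and scale.
    void clearMatrixOverride() { overridden_ = false; }

    Matrix4 matrix() const {
        if (overridden_) return cached_;
        return Matrix4{
            scale[0], 0.0f,     0.0f,     position[0],
            0.0f,     scale[1], 0.0f,     position[1],
            0.0f,     0.0f,     scale[2], position[2],
            0.0f,     0.0f,     0.0f,     1.0f};
    }

private:
    Matrix4 cached_{};
    bool overridden_ = false;
};
```

One open design question with an override like this is what `position` and `scale` should report while it is active, since a sheared matrix cannot be decomposed back into those components exactly.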

Re: 'Tindalos' (Ogre v2.0) SceneManager redesign

Posted: Sat Mar 24, 2012 12:51 pm
by m2codeGEN
Kojack wrote: If your SlSceneNode inherits from SceneNode instead of using composition or aggregation, then hopefully you aren't using a plugin scene manager (or wrote your own scene manager), or just use them in your own code and never let an ogre scene manager see them. Otherwise things will get messy. The default scene manager (when no plugin scene managers are loaded) would be ok, it does work with scene nodes, but all the others (like the octree scene manager) will have trouble.
SlSceneManager inherits from the default scene manager. (We don't use the Octree scene manager.)

updated
Moreover, we needed to override the methods Node::addChild and SceneManager::createSceneNode to implement our logic. For example, we added reference counting for copies of nodes.

Re: 'Tindalos' (Ogre v2.0) SceneManager redesign

Posted: Sat Mar 24, 2012 2:06 pm
by Kojack
That's cool then.

I've seen people on here try to inherit from SceneNode without making their own scene manager. The end result will be hard-to-track-down memory corruption.

Re: 'Tindalos' (Ogre v2.0) SceneManager redesign

Posted: Sat Mar 24, 2012 4:11 pm
by duststorm
Forgive me if my question sounds stupid, but after reading dark_sylinc's idea I was wondering whether some ideas, like software culling and optimising for SSE instructions, could have a detrimental effect on speed on the mobile or older devices that Ogre is targeting.
But for any regular computer up to 8 years old this will probably lead to a huge speed increase.