Comments
Hi AllEightUp,
Thanks for your consideration. I'll try to answer some of the issues you mentioned.
There are issues I have with this article, not that anything is wrong, just that it seems a bit misleading in a couple areas due to not being complete
As I stated in the introduction of the article, the techniques are addressed only generally here, and I try to provide references for those who are eager to learn the details. Covering the details of these techniques would need a separate, heavyweight article.
The problem with channels is that during update you have 1-2k worth of channel data likely linear in memory and for each channel you are going to jump into the middle of that memory to read 2-4 pieces of data to compute a single quaternion or matrix depending on what you are doing. Over 30+ channels per animation you start exponentially slowing down as you destroy the cache in the CPU.
I've quoted a part of the article here about animation tracks:
"So a good technique is to keep the bones sequentially in the memory. They should not be separated because of the locality of the references."
It says the bones should be allocated linearly in memory, not the animation tracks. Very detailed characters have around 70 bones. The bones are usually accessed several times while computing animation, skinning and post-processes like IK and blending between ragdoll and animation, so keeping bones sequential can help achieve better performance. However, you're right about the animation tracks. Many of the animation tracks will not be accessed for long periods of time, so we can have a memory manager for loading and unloading them. The memory manager can be tightly coupled to the game's animation system. It can keep the most recently used animations linearly in memory, or it can have a learning process that weights the motion graph transitions if the game uses similar transitional animation techniques. That way the most recent animations (or parts of animations) remain resident in memory linearly.
For animation compression, the sample rate is usually high, so the curves can just use linear interpolation; the compression can be applied during the import phase, and the tracks are constructed at import time.
Again though, there is some balancing to be looked at here, really high quality low error rates (cinematics for instance) typically want the curves for C1 or C2 continuity reasons which linear starts failing to compress well.
I mentioned that users should adjust the error rate, and if they need the continuity they should leave the curves continuous. Users can adjust error rates to get both optimization and a good-looking animation. Here is a quote from the article:
" As you can see in Figure1, the keyframe number 2 and 3 are the same. But there is an extra motion between them. You might need this smoothness for many bones so leave it be if you need it."
You need to take into account the full skeletal chain from the bone to any end points affected by the error. I.e. you might be able to compress channels on the shoulder exceptionally well but you start noticing the hand at the end of the chain under the shoulder jittering
About the jittering: it would not occur for the child bones if you simplify the parent's animation tracks, because the keyframes are always restored relative to each bone's binding pose, and the binding pose does not change at run time. The binding pose is stored relative to the parent bone's space, so simplification should not affect the child bones' motion.
In some rare situations where you have translation tracks, jittering will occur. For example, you have a weapon bone which is a child of the right hand, and the character passes it to the left hand for a short period of the animation. While the weapon bone is in the left hand, the compression can create jittering on it, because it is still the child of the right hand. However, I haven't written anywhere in the article that you should apply the error rate separately for each bone. Here is a related quote from the article:
"The curve simplification should be applied for translation, rotation and scale separately because each of which has its own keyframes and different values. Also their difference is calculated differently"
So it says the scale, translation and rotation error rates should be specified separately. This doesn't mean that you should apply error rates to each bone separately. That can be done, but it could be hard for the user to manage.
Again, thanks for your critique, and again, the techniques are discussed generally here. All of them have many details and tricks when you want to put them into action. Each should be considered separately in its own article, along with its experimental results.
As I mentioned, I was just pointing out the areas which seemed unclear in the article during the initial reading. But in the case of jittering, either we are crossing wires or I believe you are wrong or missing the point: the specific error is not a problem, the accumulation of error is. A really bad example: one of the bosses in a game I did the animation systems for a couple of years back had well over 120 bones, most of which were in long, tentacle-like spiky things which each had about 10 or so bones. The animation system had no bone translations, only rotational data in these spines, but the ends of the tentacles were jittering around like mad, crossing through each other and otherwise not looking like the fluid animations found in Maya. Basically the math works out very simply:
+ ------------------------- >.
The plus is the joint you are introducing a little rotational error to, and the dot at the end is the expected end point. If the bone is 10 meters long (the general length of the spines in that character), and the rotation is off by just 1 degree due to error, the end point of the bone is sin(1 degree)*10 = ~0.17 meters from where it is supposed to be in world space. Break that chain into multiple bones, each having a bit of error, and while each joint's contribution is proportional to the distance between the joint being modified and the final end point, the grand total of error possible can leave the end point upwards of 0.4 meters from where it should actually be located, which is exceptionally noticeable.
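The arithmetic above can be checked with a small planar sketch. It is only an illustration under assumptions that are not in the comment: a straight 2D chain, joint angles given relative to the parent, and hypothetical helper names (`end_point`, `end_point_error`). It compares a single 10 meter bone that is off by 1 degree with a chain of ten 1 meter bones where either only the root or every joint is off by 1 degree.

```python
import math

def end_point(bone_lengths, joint_angles_deg):
    """Planar forward kinematics; each angle is relative to the parent bone."""
    x = y = 0.0
    accumulated = 0.0
    for length, angle in zip(bone_lengths, joint_angles_deg):
        accumulated += math.radians(angle)  # child inherits all parent rotation
        x += length * math.cos(accumulated)
        y += length * math.sin(accumulated)
    return x, y

def end_point_error(bone_lengths, errors_deg):
    """Distance between the end of a straight chain and the same chain
    with a small angular error applied at each joint."""
    exact = end_point(bone_lengths, [0.0] * len(bone_lengths))
    noisy = end_point(bone_lengths, errors_deg)
    return math.dist(exact, noisy)

single_bone = end_point_error([10.0], [1.0])
root_only = end_point_error([1.0] * 10, [1.0] + [0.0] * 9)
every_joint = end_point_error([1.0] * 10, [1.0] * 10)
print(single_bone, root_only, every_joint)
```

The single-bone case comes out at roughly 0.17 meters, matching the figure above. Giving every joint the same 1 degree error is a deliberately pessimistic case; real compression error varies in sign and size per joint.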
So, basically I always suggest anything that covers animation compression also points out that the error is cumulative down bone chains. You don't need to solve the issue in the article, I just believe it is always very important to point this out since it bites most folks at least once or twice when writing compression systems.
Again, I don't expect the article to provide solutions, I was just pointing out that I didn't believe it was particularly clear on the points and could be much better if it spelled things out more explicitly.
If the tentacles were going so crazy that it didn't look like what you made in Maya then the compression settings were probably quite a bit too extreme :)
Generally motions look fine without taking the hierarchy into account when optimizing them using keyframe reduction or so. Depends a lot on the settings. But at decent optimization rates the motion should look just fine. Maybe a tiny bit of foot sliding will happen, but yeah you can improve that by taking parent transforms into account. Definitely good to point that out indeed.
Also, most of the time error on the legs/feet will be more visually disturbing than on, say, the arms, because you will directly notice sliding feet, while you might not notice that the hands are not at the exact locations in, say, an idle motion.
There are a few other techniques you can apply as optimizations, such as motion and skeletal LOD. You can also do some caching when doing keyframe lookups if the keys are not uniformly sampled. You can also simplify certain calculations when no scale is involved, etc., although you kind of mentioned some of those in "do not transform everything". You have to watch out that the cost of figuring out whether something changed is not higher than actually just calculating it, though.
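One way to read the keyframe lookup caching mentioned above is a per-track cursor that remembers the last segment it sampled. The sketch below is an assumption about what such a cache could look like, not something taken from the article or from EmotionFX; `TrackCursor` and its fields are hypothetical names, and scalar values stand in for real key data.

```python
import bisect

class TrackCursor:
    """Samples a track with non-uniform key times, caching the last segment."""

    def __init__(self, times, values):
        assert len(times) == len(values) >= 2
        self.times = times    # strictly increasing key times
        self.values = values  # one scalar per key
        self.index = 0        # cached segment: times[index] .. times[index + 1]

    def sample(self, t):
        times = self.times
        last = len(times) - 2
        t = min(max(t, times[0]), times[-1])  # clamp to the track range
        i = self.index
        if not (times[i] <= t <= times[i + 1]):
            if i < last and times[i + 1] <= t <= times[i + 2]:
                i += 1  # common case: playback moved into the next segment
            else:
                # Fallback for seeks and loops: binary search.
                i = min(max(bisect.bisect_right(times, t) - 1, 0), last)
            self.index = i
        t0, t1 = times[i], times[i + 1]
        u = (t - t0) / (t1 - t0)
        return self.values[i] + u * (self.values[i + 1] - self.values[i])
```

During normal forward playback the cached segment or its neighbour is almost always the right one, so the binary search only runs when time jumps.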
Optimizations in animation graphs / blend trees / networks could also be added.
Maybe a next step could be to go into more detail of some of the things you mentioned, so people can see how to implement these.
Anyway, good job on the article :)
Buck, as mentioned, anything that is 10 meters long with even very minor errors will exhibit the issues no matter how low the compression is. As you say, a little foot sliding is acceptable for a character; now make the player a mouse that sees the foot from two inches, and that foot sliding is no longer acceptable. Everything depends on how you use the result. The source of the error is always there; it's simply something worth pointing out.
Yes I know, but these 10 meter tentacles are 0.001% of the characters you have, an exception. Just wanted to point that out. Also I assume you have configurable compression settings, so you can always either disable compression on such a character or make it less aggressive.
Overall, a good article with some good recommendations. A few comments for consideration:
"Usually the keyframes are stored relative to a pose of the bone named binding pose."
A matter of English language semantics: "stored relative" implies data being placed in memory somewhere with a known relationship to another location. It's not clear if that's your intent. If so, it's not clear what you mean by "relative to a pose." I'm guessing you intend something similar to "Keyframe transformations are commonly local transformations, such that a bone orientation in model space is determined by KeyFrameTransformation * BindPoseTransformation." You should clarify.
Section 1, Skeletal Animation Optimization Techniques, states "These animation tracks and skeletal representation can be optimized in different ways."
In the section "Optimizing Animation Tracks," you do not indicate what the *intent* of the optimization is. That is, there are various reasons for "optimizing" code and data - speed of access, speed of calculation, memory footprint, etc. It appears that section primarily discusses the memory footprint of each track. The intent of your article is to provide optimization "tips and tricks." Tell them explicitly what benefits and penalties result from each optimization technique. Otherwise your information begs the question "Why should I do that?"
By contrast, in the section "Representation of a Skeleton in Memory," you mention access which is "cache-friendly," implying improved speed of access, thus providing readers with a reason why they might consider your recommendations. Similarly, "SSE Instructions" mentions efficiency, and the "downside" that one must be mindful of alignment issues. Good stuff.
Figure 1 and Figure 2: please provide an image with higher contrast and/or wider lines. They appear (on my monitor) as little more than black rectangles. In addition, you mention "keyframes" which don't appear in the images. If you mean the dots on the curves - label them clearly. Make it easy for your readers to make a relationship between the word description and the visual aid.
In your introduction, you imply that Section 2 is targeted at animators, as opposed to programmers. Providing information to provide a basis for discussion between programmers and other project members is an excellent approach.
However, a minor point - Section 2 contains the phrase "... your blend tree gets more complicated and high dimensional ...," which may not mean anything to an animator. I.e., if an animator is given the task of creating partial animations with the caveat: "Don't complicate the blend tree!" I can imagine the animator's response being nothing more than a blank stare. If your intent is to speak to animators, use terms and descriptions that animators use.
Hi Buckeye
Many thanks for your attention. I'll try to update the article and fix the issues you mentioned. I will edit the unclear parts based on your comment.
Hi Buckshag,
Thanks for the comments and the cases you mentioned. All of them are true. BTW I've been following EmotionFX news for several years. I haven't used it before but I know it provides great run-time animation features. Hopefully I will use it in future projects.
Ugh. I reviewed as Incomplete / Unclear. Though I was tempted to set "Extreme Poor Quality"; sadly there is no just "Poor Quality" option.
The reviewer clearly has not enough experience to be talking about this field yet (optimizing skeletal animations).
1. I'd expect cache coherency to be a big part of the section, since it's the elephant in the room. Yet only a paragraph and a minor link were given. No source code examples either.
2. The published lerp formula is in its unoptimized form. It's nice to appreciate what is going on; but we're talking about optimizations here, and the optimized form is nowhere to be seen.
3. "Multithreading the Animation Pipeline": as with cache coherency, only two short paragraphs were spent. "Multithread your code". Gee, why didn't I think of that?
Multithreading is a hard topic. At least you link to an Intel article. Too bad it has broken images all over the place. But at least the source code can be downloaded.
I would've let it pass if it was one of the issues; but considering everything as a whole; I have to bring this up.
4. The article points to the source code of hkVector4f for which the author does not have legal rights to distribute. I'm sure it was well intended, but the link needs to be taken down.
5. "Distance to Camera" tip. So why should we just compute something that cannot be seen? Because animations may affect logic or physics, and hence still need to be computed. Because multiplayer games have more than "one camera". Because what happens offscreen is still relevant in some games (i.e. RTS games; deterministic lockstep based games).
Author fails to mention potential problems for this approach.
6. "Using Dirty Flags For Bones". Updating the bone's local transform is the cheapest part of the whole animation process. Keeping track of dirty bones will only add overhead that will counter any potential benefit.
7. "Update Just When They Are going To Be Rendered". This is very similar to "Distance to Camera" but at least some of the pitfalls are mentioned.
However, frustum culling is vastly overestimated. In a typical game with shadows and/or realtime cubemaps, almost everything gets captured by one camera, be it the main camera or one of the hidden ones, like the shadow mapping camera or one of the 6 cameras needed by cubemapping (which cover a 360° FOV).
Implementing this technique means that, for every pass in the frame (instead of every frame) you need to iterate through all animations that passed culling, check if they're dirty, and update if so. You may as well update them all at once and be done.
If your game is open world with a huge world, then rather than frustum culling, you'll be paging the animations in and out.
8. No mention of GPGPU animation at all. Animation is mostly a bandwidth-intensive operation, and GPUs have a lot of bandwidth, which is why the most dramatic improvements are often seen in this field.
9. No mention of different ways to upload all matrices to the GPU. All bones? Only the ones that actually affect vertices? Upload per skeleton and let each draw using the same skeleton index it? Or (re)upload the bones for every draw?
Each option has its tradeoffs.
Overall the author covers a lot of topics only superficially and then references external links, leaving readers on their own to understand each topic (benefits, pitfalls, how it works, how to implement it for their own applications).
I'm sure the author will some day have gathered enough experience to write a marvelous article, as he seems to be on the right track.
But I can't approve this article today, as is.
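On point 2 above, the comment does not say which form it has in mind. One common rearrangement, shown here as an assumption rather than as the article's or the reviewer's formula, replaces the two multiplies of (1 - t) * a + t * b with a single multiply. The function names are hypothetical.

```python
def lerp_two_multiplies(a, b, t):
    """Textbook form: weights each end point separately."""
    return (1.0 - t) * a + t * b

def lerp_one_multiply(a, b, t):
    """Rearranged form: one subtract, one multiply, one add."""
    return a + t * (b - a)

print(lerp_two_multiplies(2.0, 10.0, 0.25), lerp_one_multiply(2.0, 10.0, 0.25))
```

The single-multiply form is not guaranteed to return exactly b at t = 1 in floating point, which is one reason the two-multiply form is sometimes kept.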
I do agree with the previous comments that some parts are unclear, but as I mentioned several times in the article, this is a general-purpose article. What you are expecting from me is to write a book instead of a general-purpose article. I mentioned that each of these techniques should be covered in a separate article if someone is eager to learn the details.
Multithreading is a hard topic. At least you link to an Intel article. Too bad it has broken images all over the place. But at least the source code can be downloaded.
I addressed Havok animation as well and gave a link to it. It has source code, examples and rich documentation. You can find what you need there. Havok has implemented the same technique in practice.
4. The article points to the source code of hkVector4f for which the author does not have legal rights to distribute. I'm sure it was well intended, but the link needs to be taken down.
The link belongs to projectanarchy. Havok came up with free tool sets for mobile developers. You can download it free there. The hkVector4f is located in the downloaded packages as well. I've put the link there so audiences can check it more easily. It's free for mobile development.
5. "Distance to Camera" tip. So why should we just compute something that cannot be seen? Because animations may affect logic or physics, and hence still need to be computed. Because multiplayer games have more than "one camera". Because what happens offscreen is still relevant in some games (i.e. RTS games; deterministic lockstep based games).
Author fails to mention potential problems for this approach.
It seems that you haven't read the paragraph carefully. Here is a quote from article:
"Here we can define a skeleton map for our current skeleton and select more important bones to be updated and ignore the others. For example when you are far from the camera you don't need to update finger bones or neck bone. You can just update spines, head, arms and legs. These are the bones which can be seen from far"
Yeah, you're right. You need some bones for logic. You have different cameras in multiplayer games, and you need deterministic operations in some types of games. But do finger bones have something to do with this? Do facial bones have something to do with this? You have 28 bones for fingers, 15 bones for the face, 7 bones for the ponytail, 5 bones for extra cloth! Do they have anything to do with logic? With physics? I mentioned you should have a skeleton map to keep the more important bones updated even if they are far from the camera. The important bones affect both visuals and logic. So it seems that you haven't worked with complex structured characters before, because in these kinds of characters many of the bones are just there for better visuals and have nothing to do with logic, and if they are not going to be seen, they should not be calculated.
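The skeleton map idea discussed here could be sketched as a per-bone distance cutoff plus a flag for bones that gameplay depends on. Everything in this sketch is an assumption made for illustration: the `BoneInfo` fields, the bone names and the distances are hypothetical and do not come from the article.

```python
from dataclasses import dataclass

@dataclass
class BoneInfo:
    name: str
    max_update_distance: float       # beyond this camera distance the bone is skipped
    needed_by_gameplay: bool = False  # e.g. a weapon attachment or hit detection

def bones_to_update(skeleton_map, distance_to_camera):
    """Return the names of the bones worth updating at this distance."""
    return [
        bone.name
        for bone in skeleton_map
        if bone.needed_by_gameplay
        or distance_to_camera <= bone.max_update_distance
    ]

skeleton_map = [
    BoneInfo("spine", float("inf")),
    BoneInfo("hand_r", 20.0, needed_by_gameplay=True),
    BoneInfo("finger_index_r", 15.0),
    BoneInfo("jaw", 8.0),
]
print(bones_to_update(skeleton_map, 50.0))
```

With several cameras, as raised in the review, one option is to pass the smallest distance over all cameras, so a bone is skipped only when every camera is far away.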
"Using Dirty Flags For Bones". Updating the bone's local transform is the cheapest part of the whole animation process. Keeping track of dirty bones will only add overhead that will counter any potential benefit.
You might have complex blend trees in which you are blending more than 20 animations in a frame. There may be other procedural calculations like IK/FK blending, computing bone rotation limits in your run-time rig and many other things. They are first calculated in local or parent space, so having a dirty flag can help you a lot here. These calculations are heavy. They have far more overhead than checking dirty flags! So the saving from skipping these heavy calculations when they don't need to be updated is not small at all! To produce realistic characters many of these techniques are needed, and managing their computation is not a trivial decision.
9. No mention of different ways to upload all matrices to the GPU. All bones? Only the ones that actually affect vertices? Upload per skeleton and let each draw using the same skeleton index it? Or (re)upload the bones for every draw?
Each option has its tradeoffs.
This part is related to mesh skinning and I mentioned in the introduction part that this article is not going to talk about mesh skinning. Here is a quote from article:
" This article is not going to talk about mesh skinning optimization and it is just going to talk about skeletal animation optimization techniques. There exists plenty of useful articles about mesh skinning around the web."
Overall, the expectation from you as a reviewer is to read the article more carefully. This one is a general-purpose article. It tries to give enthusiastic readers some cues and introduce some techniques to them generally. I mentioned this several times in the article. So you should not expect details here. Covering the details of each technique would need a separate, heavyweight article.
One article can be more general, like a survey, and another can be more expert, trying to cover all the details and experimental results. You should note that this article is general, and general articles like surveys have their own audiences as well.
Ugh. I reviewed as Incomplete / Unclear. Though I was tempting to set "Extreme Poor Quality"; sadly there is no just "Poor Quality" option.
The reviewer clearly has not enough experience to be talking about this field yet (optimizing skeletal animations).
The first thing is that you haven't paid attention to the note at the end of the page, which states:
"Note: Please offer only positive, constructive comments - we are looking to promote a positive atmosphere where collaboration is valued above all else."
You haven't read the article with patience. The issues you mentioned are very good and useful but the article defined its scope in its different sections.
I really appreciate the comments from Buckeye and Buckshag because they read the article carefully and with patience. They made many valuable comments, raised important issues, and understood what the article is trying to say and to which audience. I'll try to update the article based on their comments.
You mentioned that I clearly don't have enough experience in this field. First I have to say that writing articles is not about promoting ourselves; it's about sharing information with other enthusiastic people. So I'm sorry if I'm going to tell you some of my experiences here. I've been in the gaming industry for almost 9 years. I've focused just on animation in these years and done research and development on many animation techniques. I've also been a user of different animation tools, technologies and engines during this time.
Finally, thanks for your comments. I will update some unclear parts of the article.
The link belongs to projectanarchy. Havok came up with free tool sets for mobile developers. You can download it free there. The hkVector4f is located in the downloaded packages as well. I've put the link there so audiences can check it more easily. It's free for mobile development.
Unfortunately, being free does not mean it can be redistributed freely.
The license Havok grants is very permissive in the sense that allows commercial uses for free.
But they're reserving the rights on how the package is being distributed (which is via their site directly). Project Anarchy belongs to Havok, but the license header on the file makes me still dubious.
Do facial bones have something to do with this? You have 28 bones for fingers, 15 bones for facial, 7 bones for ponytail, 5 bones for extra cloths! Do they have anything to do with logic? with physics?
The question in that example is not why you are animating those bones, but rather why there are models, other than major characters (protagonists, some NPCs, bosses, certain enemies), with detailed finger and facial animation, since such characters often do not exceed ~8.
Yes, you could be in a game with a huge world where eventually there could be hundreds characters with detailed animations. But a system that pages them in/out based on distance to player or area they're in works at a much higher level than an animation system would.
Giving this responsibility to an animation system, the system will start micromanaging bones based on visibility, and other unrelated components will end up doing the same.
These optimizations have to be worked out at a higher level.
A more important tip would be to describe how to prepare a budget guide for artists to follow so that you don't end up with a chaotic performance nightmare: Bone count limits categorized by character relevance, max. number of affecting bones to a single bone (eg. major chars: Up to 4 bones per vertex, unimportant chars: up to 2 bones per vertex, etc), sampling framerate standards (you do cover this one though)
You might have complex blend trees in which you are blending more than 20 animations in a frame. There should be other procedural calculations like IK/FK blending, computing bone rotation limits in your run-time rig and many other things. They are firstly calculated in local or parent space. So having a dirty flag can help you a lot here. These calculations are heavy. They have lots of more overhead than checking for dirty flags! So ignoring these heavy calculations, when you don't need them to be updated is not cheap at all!
I was first going to refute your claim (blending 20 FK animations is cheap).
Then afterwards you threw in IK and constraints. And that's when it hit me. What we have in mind is completely different, and this is basically the same problem as in my comment's closure: you're covering a lot of topics only superficially and then referencing external links, leaving readers on their own to understand the topic.
With no reference implementation, some code, or pictures to put some context, it just ends up being confusing or even misleading. A lot of people will end up with a different idea of what you tried to say.
You mentioned that I clearly have not enough experience in this field. First I have to say that writing articles is not about promoting ourselves and it's about to share information with other enthusiastic people. So I'm sorry if I'm going to tell you some of my experiences here. I've been in the gaming industry for almost 9 years.
And I have no doubt about that. But if you would share that experience with us it would be nice: One thing is to say "don't update skeletons that aren't visible by the camera; trust me it's a performance tip"; another thing is to say "When we were working on <xx, indie game, unreleased project, can't say name due to NDA, etc> we were having hundreds of instances with 10 concurrent animations and 80 bones per instance. In retrospect we shouldn't have allowed our artists to have so many models with so many bones and animations. Even the trolls had 60 bones! Our framerate sucked at 60ms per frame, and we couldn't tell our art team to remake the animations.
So we moved the animation update loop after frustum culling. Our situation improved to 45ms; but it was still unacceptable. The first problem we saw is that shadow mapping cameras were including a lot of other instances that weren't really visible.
So we first made a distance to camera calculation, which was combined with frustum culling. An instance would have to be seen by a camera and be close enough to the main user's camera. After that our times went to 35ms. We then realized that all of our bones' transform were being processed by our complex IK animation system, so we wrote a system that would keep track of which bones needed to be updated...".
I made up that story and the numbers aren't real. But see the difference?
The hostility you sensed in my comment was because you were mentioning techniques which most are just common sense (avoid updating what isn't dirty or non-visible, use multithreading, use SSE, remove redundant keyframes, lossy-remove keyframes until the quality degradation stops being acceptable, prioritize & use different update frequencies) but weren't sharing any details like numbers to convince me it's worth it; or put some context to understand why sometimes it works, why sometimes doesn't. Tell me problems I'd expect to encounter while implementing it myself.
The only time you do that well is when you talked about removing keyframes and motion retargeting.
There are many ways to keep track of bones' dirtiness. Some with higher granularity than others. Some methods are much more cache friendly than others. Just saying "keep track of dirty bones" can end up being a disaster if implemented by someone who is not yet familiar with performance-sensitive code.
If you just throw me some tips and tell me to trust they will always improve my framerate (which I already know they don't always do, or that the cost/benefit ratio is too high) I'm going to think you don't have enough experience in this particular subject (skeletal animation)
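As one concrete instance of the dirty tracking discussed above, here is a minimal sketch that keeps per-bone flags in flat lists and propagates changes from parent to child in a single pass. It is an illustration under stated assumptions, not the article's method: bones are stored parents-first, a single float offset stands in for a full local transform (so composing transforms is just addition), and the `Skeleton` class and its members are hypothetical.

```python
class Skeleton:
    def __init__(self, parents):
        self.parents = parents        # parents[i] < i, and -1 marks the root
        count = len(parents)
        self.local = [1.0] * count    # stand-in for local transforms
        self.world = [0.0] * count    # stand-in for world transforms
        self.dirty = [True] * count   # everything needs a first update
        self.recomputed = 0           # counts world transform recomputations

    def set_local(self, bone, value):
        if self.local[bone] != value:
            self.local[bone] = value
            self.dirty[bone] = True

    def update_world(self):
        # Parents-first ordering lets one linear pass carry changes down the chain.
        changed = [False] * len(self.parents)
        for i, parent in enumerate(self.parents):
            if self.dirty[i] or (parent >= 0 and changed[parent]):
                parent_world = self.world[parent] if parent >= 0 else 0.0
                self.world[i] = parent_world + self.local[i]
                changed[i] = True
                self.recomputed += 1
            self.dirty[i] = False
```

Whether this pays off depends on how expensive the skipped work is compared with the flag checks, which is exactly the disagreement in this thread.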
This part is related to mesh skinning and I mentioned in the introduction part that this article is not going to talk about mesh skinning. Here is a quote from article:
" This article is not going to talk about mesh skinning optimization and it is just going to talk about skeletal animation optimization techniques. There exists plenty of useful articles about mesh skinning around the web."
Overall, the expectation from you as a reviewer is to read the article more carefully. This one is a general purpose article. It tries to give the enthusiastic audiences some cues and introduce some techniques to them generally. I mentioned this several times in the article. So you should not expect details here.
Fair enough. That was a mistake on my behalf.
I found the response by Matias to be really sad and somewhat of an attack, actually. The person put some time and effort into this article and I didn't think it was all that bad. Your response stinks of you being offended that someone would actually attempt to write about a topic that you feel you are more qualified to have written about. An "I am the smartest guy in the room" mentality.
Skeletal animation plays an important role in the gaming industry. There are many techniques which can be used to optimize skeletal animations to make them run more efficiently in real-time scenes. This article aims to address some of these techniques at two levels: implementation level and usage level.
There are issues I have with this article, not that anything is wrong, just that it seems a bit misleading in a couple areas due to not being complete. The issues start with the description of animations as being channels, while correct, that is generally the view from the DCC tool and definitely not the best representation for a game engine. The problem with channels is that during update you have 1-2k worth of channel data likely linear in memory and for each channel you are going to jump into the middle of that memory to read 2-4 pieces of data to compute a single quaternion or matrix depending on what you are doing. Over 30+ channels per animation you start exponentially slowing down as you destroy the cache in the CPU. This issue becomes even more major when you start distributing across multiple cores.
So, in the process of discussing animation optimization I would start by mentioning that no matter what you do there is going to be a balancing act between compression and performance. Channel based animation provides the best compression at the cost of pretty notable performance issues. Collapsing channels to full skeletal rig key frames costs compression since you can't adapt the individual curves to their needed details as easily, but performance is often nearly unbeatable.
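The trade-off described above can be made concrete with two toy layouts. This is a sketch under assumptions of my own choosing, not the commenter's engine code: scalar values stand in for rotations, the function names are hypothetical, and the "collapsed" layout is taken to mean whole-pose frames sampled at a fixed rate.

```python
import bisect

def sample_channels(channels, t):
    """channels: one (times, values) pair per channel, each with its own keys."""
    pose = []
    for times, values in channels:
        # Every channel needs its own search into its own key list.
        i = min(max(bisect.bisect_right(times, t) - 1, 0), len(times) - 2)
        u = min(max((t - times[i]) / (times[i + 1] - times[i]), 0.0), 1.0)
        pose.append(values[i] + u * (values[i + 1] - values[i]))
    return pose

def bake_pose_frames(channels, duration, frame_rate):
    """Collapse all channels into whole-pose frames at a fixed rate."""
    frame_count = int(round(duration * frame_rate)) + 1
    return [sample_channels(channels, f / frame_rate) for f in range(frame_count)]

def sample_pose_frames(frames, frame_rate, t):
    """One index computation, then two adjacent whole poses are blended."""
    f = min(max(t * frame_rate, 0.0), len(frames) - 1)
    i = min(int(f), len(frames) - 2)
    u = f - i
    return [x + u * (y - x) for x, y in zip(frames[i], frames[i + 1])]

channels = [([0.0, 1.0], [0.0, 10.0]), ([0.0, 0.5, 1.0], [0.0, 4.0, 0.0])]
frames = bake_pose_frames(channels, 1.0, 4)
```

In this toy case the channel layout stores 5 keys while the baked layout stores 10 values, which is the compression cost mentioned above. The two samplers agree here only because every channel key lands on a frame time; otherwise baking also loses detail.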
Another item to consider is that very often, using piecewise linear approximations works for animation compression just as well as most curve fitting solutions. It uses more keys to get the same error rates but the per-key data requirement is reduced such that it generally comes out pretty similar in storage size. But the final math used is greatly simplified and as such performance is gained. Again though, there is some balancing to be looked at here: really high quality, low error rates (cinematics for instance) typically want the curves for C1 or C2 continuity reasons, which linear starts failing to compress well.
Finally, when discussing curve compression you can't look at it as a single joint error rate. You need to take into account the full skeletal chain from the bone to any end points affected by the error. I.e. you might be able to compress channels on the shoulder exceptionally well but you start noticing the hand at the end of the chain under the shoulder jittering. You need to compute the error as the mean over all the chain end points in order to prevent hierarchical error accumulation from creating crappy looking animation.
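A piecewise linear reduction like the one described above could be sketched as follows. This is one possible approach, in the style of Ramer-Douglas-Peucker but measuring error in the value only, and it is not claimed to be what the article or the commenter used; `reduce_keys` and the tolerance value are hypothetical.

```python
def reduce_keys(times, values, tolerance):
    """Return the indices of keys to keep so that linear interpolation between
    kept keys stays within `tolerance` of every dropped key."""
    keep = {0, len(times) - 1}
    stack = [(0, len(times) - 1)]
    while stack:
        first, last = stack.pop()
        worst_error, worst_index = 0.0, None
        for i in range(first + 1, last):
            u = (times[i] - times[first]) / (times[last] - times[first])
            predicted = values[first] + u * (values[last] - values[first])
            error = abs(values[i] - predicted)
            if error > worst_error:
                worst_error, worst_index = error, i
        if worst_index is not None and worst_error > tolerance:
            # The segment is too coarse: keep its worst key and split there.
            keep.add(worst_index)
            stack.append((first, worst_index))
            stack.append((worst_index, last))
    return sorted(keep)

kept = reduce_keys([0.0, 1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 2.0, 3.0, 10.0], 0.01)
```

This measures error per track only. It does not account for how error accumulates down a bone chain, which would need the error to be evaluated at the chain end points instead.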
Again though, the article is fine in general I'd just really like to see some of these points mentioned in order to make it clear that this is a fairly high level overview of a single direction of optimization and not a broad coverage of the subject.