So, how can we use those cores? And what does that do for us? Well, first let's take an example and see how it applies to both standard (Synchronous) and Asynchronous techniques.
For this test, we've created an Ability in Able called "Self Destruct". This Ability has no targeting and simply does a sphere query, copies those results to the Ability Context, and then deals damage to all targets. We'll put our character in a level, surrounded by 32 enemies and then fire our Ability and use Unreal's Profiler (which Able has built in support for) to chronicle our times.
- This test was repeated ~8 times and the results were captured once things started stabilizing around the same values, a better case would be to average things out over 1000, or even 10K samples.
- 32 was just a random number of NPCs that fit nicely on the map. Your game may have more or less actors that an ability may affect. Both are fine and supported. There is no hard limit to how many targets an Ability can affect.
- The CalculateDamageForActor method used by the Apply Damage Task just returned the base damage, there was no complex math going on (unfortunately).
- This was run on a Development version of the Editor on Windows, so in Release/Final version the numbers will be completely different (likely much faster due to removal of debug code and more aggressive compiler optimizations).
- As always, Profile your own game/platform as each game and platform has its own requirements and unique hardware.
Now, onto the results.
You can see here, our Max time hit 0.3 milliseconds (for reference, if your framerate target is 30FPS then you have ~33ms per frame to do your work in. If your target is 60FPS, you have ~16ms per frame). Not bad overall, especially when you consider this is what is happening under the hood:
The engine is running all our calculate damage calls one after another on a single core. If our calculate damage for actor call was more expensive, or we targeted more enemies - this time would only increase. Again, this behavior is perfectly fine and our time isn't that bad, but let's see what we can do if we apply a bit of Async work to our Ability.
A ~23% drop in time! Let's look at what's going on under the hood with this call:
There's a bit of overhead with using Async due to the setup, so our simple damage calculation is probably just on the cusp of being not worth it - but we still saved some time by farming things out to other cores so we could quickly move on to simply applying the damage and just asking the various cores for their result when we're ready for it. Pretty good considering this was done by simply checking a box on the Task!
So, What's the Down Side?
To understand that question we need to look at a couple ways Async work is completed within Unreal Engine 4 and, thus within, Able.
Both versions require some overhead in wrapping up everything for the worker threads, so that's also something to be aware of.
Async is a powerful system that can improve performance, but that power requires a bit of care in handling.
- Does your Ability need frame perfect collision (i.e. Simulators, Fighting Games)? If not, then turn on Async for any collision based tasks/features (Targeting, Queries, Sweeps, Raycasts, etc) and see if you can notice a change.
- Does your Ability affect a large number of Actors and/or have some complex damage calculations? Try turning on Async calculate and seeing if things improve.
- Abilities for NPCs should ideally make heavy use Async features as their performance isn't as noticeable/critical.
- Test and Profile! Don't be afraid to enable a feature, which is often just checking a box on a Task, try it out, and profile it if you want some solid numbers. Sometimes the numbers may tell you a different story than you would think naturally (due to any number of reasons).
So, we've taken a very broad look at what Async is and how it can help performance as well as given some tips for its use. Finally, Able was written from the ground up to make use of Async whenever possible, both at the Ability and Task level of execution.