The video games industry is changing its approach to game development. More and more publishers are hiring third-party consultants and experts to reduce development time and costs. Confetti Interactive specializes in advanced real-time graphics research for the video game and movie industries. They have worked with the likes of Intel, Insomniac and Square Enix, providing consultation and engine extension services. Confetti also provides its own lighting, post-FX and skydome middleware: Aura, Pixel Puzzle and Ephemeris, respectively.
GamingBolt got in touch with Wolfgang Engel, CEO and founder of Confetti Interactive. Wolfgang was the lead graphics programmer in RAGE, Rockstar's core technology group. He is also the founder and editor of the ShaderX and GPU Pro book series, has authored several books and articles on real-time rendering, and is a regular contributor to GDC.
We asked Wolfgang a variety of questions. You can read his responses below.
Ravi Sinha: You have a great pedigree in graphics programming, having worked at Rockstar and contributed to several other games. What was the inspiration behind starting your own graphics tools company?
Wolfgang Engel: When I worked at Rockstar, it was organized in a way that there were several game studios and each of them was working on one or two games. Working in the core technology group meant you were working on each of these games to a certain degree and providing middleware. The RAGE engine was essentially that middleware: it was provided to the game development teams. For example, post effects and the skydome were re-used for every game and specifically tuned for each one.
So the idea of RAGE and Rockstar at the time was to have one game engine and seven game studios using it. As far as I remember there were two in Canada, two in the USA and three in England. At some point I thought, 'hey, I can do this with more customers.' So we switched from internal customers to external customers, created our own technology and founded Confetti. The fun part is that what Confetti does today is very similar to what the graphics and RAGE team did at Rockstar.
The other motivation for Confetti was that we expect the game industry to move to a production model similar to the movie industry's. Instead of having a team of 300-400 people, you move towards smaller production companies: you create a production company just for one movie, bring in specialists that have their own companies, and after the movie is done everyone parts ways again. This is happening in the game industry now, and it is more efficient when it comes to production costs.
Ravi Sinha: Yes, once they are done with the contract, they move to other projects.
Wolfgang Engel: Exactly. So you have special effects companies in there who are highly specialized and very efficient. That is one of the underlying ideas for Confetti. Most of the time we do similar things for our customers, and after a while we become quite efficient because we do it so often, so our customers are pretty happy that we can implement features and effects fast, following their individual specifications.
If you look at the list of customers we had in 2013, you will see a lot of cool stuff in there. That was last year, and this year we already have a nice and growing list of new and old customers.
"The most expensive part is the Propagation portion, we can move the Propagation part from the CPU to the GPU or vice versa, depending on the underlying platform, what kind of quality level you plan to have and what kind of game you have. So you can reduce the pressure on the GPU or we have some time left on the CPU, so let’s move the Propagation portion there."
Ravi Sinha: So with Confetti, your main focus is on lighting and special effects.
Wolfgang Engel: Yes, effects and graphics engine development. We usually extend the rendering engines of our customers and then add, for example, skin and screen-space refractions and all the stuff that is cool.
Ravi Sinha: I am also seeing that you have worked with AMD to create Lara Croft’s hair technology, TressFX.
Wolfgang Engel: Yep. And then later on with AMD we worked on Battlefield 4, on CPU/GPU optimization, which worked out well.
Ravi Sinha: Was there a motivation to build a complete all-in-one engine, given that you have extensive experience from working on the RAGE engine when you were at Rockstar?
Wolfgang Engel: You know, an engine probably requires 50+ man-years of time. At RAGE, I believe we were, most of the time, 13 people. With 15 people and three years at hand you can build a competitive engine. The main question that remains is: who is going to pay that team for that time frame?
Ravi Sinha: That is an excellent question.
Wolfgang Engel: Honestly, most people cannot afford this. Having your own engine is great for game development teams because you have control over the technology and you have the people who wrote the code. It's a great thing to have and is totally worth it. But creating a game engine or middleware from scratch is not easy to do. You have to do it like Crytek or Epic Games, alongside one or several games that utilize the features and pay for development.
Ravi Sinha: Basically like EA, where they used their own engine across their games.
Wolfgang Engel: Yes. And then you develop the engine and you face the problem of adapting games to it, since the engine only supports specific types of games out of the box. For example, you can't do a flight simulator with most of the currently available engines.
One of the big opportunities for us is to customize an engine like Unreal Engine 4 that comes with source code. We can customize it to the needs of a certain game type and its visuals.
If you build a game that is not covered out of the box, for example an open-world game, you might start out with an engine like Unreal Engine 4 and then heavily customize it. You need to implement a streaming mechanism that fits your type of game.
Ravi Sinha: Let us talk about Aura. I am sure you are aware that Unreal Engine 4 recently did away with its custom global illumination method, and there is no doubt that GI is still a challenge for developers to implement in their games. How is Aura, your own dynamic global illumination tool, able to provide a technically less expensive solution?
Wolfgang Engel: The solution that Unreal Engine 4 was using was based on Sparse Voxel Octrees. It occupies a lot of memory and is technically very expensive. I think since it was so expensive they lost the motivation to implement it; I don't know the real motivation behind the removal, though. As far as I know, Nvidia's solution is also based on Sparse Voxel Octrees and it has the same challenge. Our solution is based on Light Propagation Volumes.
On the memory side it's much cheaper, and on the performance side we are flexible about the compute units: it can run either on the CPU or on the GPU. The most expensive part is the propagation step, and we can move it from the CPU to the GPU or vice versa, depending on the underlying platform, what kind of quality level you plan to have and what kind of game you have. So if you need to reduce the pressure on the GPU, or you have some time left on the CPU, you move the propagation step there.
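To make the idea concrete, here is a minimal C++ sketch of budget-driven CPU/GPU placement for the propagation step. All names are hypothetical and do not come from Aura; the heuristic is deliberately simplistic.

```cpp
// Hypothetical illustration of the CPU/GPU flexibility described above.
enum class PropagationTarget { CPU, GPU };

struct LightPropagationVolume {
    void propagateOnCpu() { /* e.g. job-system tasks over the grid */ }
    void propagateOnGpu() { /* e.g. one compute dispatch per propagation step */ }
};

// Pick the processor with headroom: if the GPU is the busier of the two,
// move the propagation work to the CPU, and vice versa.
PropagationTarget choosePropagationTarget(float gpuFrameMs, float cpuFrameMs) {
    return (gpuFrameMs > cpuFrameMs) ? PropagationTarget::CPU
                                     : PropagationTarget::GPU;
}

void propagate(LightPropagationVolume& lpv, PropagationTarget target) {
    if (target == PropagationTarget::CPU)
        lpv.propagateOnCpu();
    else
        lpv.propagateOnGpu();
}
```

In practice a middleware would also factor in the platform and the quality setting, but the core design choice is simply that propagation is expressed once and dispatched to whichever processor has frame time to spare.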
Another thing we want to do is go open source. Everyone should soon be able to download the source code for the PC version. Obviously we won't open source the PS4 and Xbox One versions. The reason behind this decision is to make everyone aware of what we are doing. We can get the community involved, and our expectation is that whoever uses it will give us credit. If anyone needs a PS4/Xbox One version then they need to talk to us.
"When we built Aura, we knew roughly what the specifications [of PS4 and Xbox One] would be and that they will be powerful because you can go to the metal so it’s much more powerful than the specs would led you to believe. Sowe built Aura aroundthe performance characteristics of the new consoles."
Ravi Sinha: How many light sources can Aura handle at the same time? Furthermore, what kind of occlusion techniques are you using when simulating a large number of light sources?
Wolfgang Engel: We have one demo where we used 120 light sources. It was built for Ivy Bridge, an integrated GPU, and it ran with shadows and bounce lighting. There is no hard limit; it all depends on how much frame time you want to dedicate, because the cost increases linearly with the number of light sources.
100 light sources is not much, even on low-end platforms. The cost is in rendering the scene from the point of view of each light source into a cube reflective shadow map, and those can be really small, maybe 16x16 or 64x64.
You can cache the cube reflective shadow maps, so you don't update them every frame; one only gets updated when something is moving in the light's area of influence. The number of light sources needs to be balanced against the cached cube reflective shadow maps. The cost is really small, and you can update 5-20 at a time. I think we have a demo where we update 40 of them at the same time, along with lots of moving parts.
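A minimal sketch of that caching scheme, with invented names and structure; a real engine would drive the dirty flags from its scene graph rather than from a plain list.

```cpp
#include <vector>

// Hypothetical sketch: each light keeps a small cached cube reflective
// shadow map (RSM) that is only re-rendered when something moves in the
// light's area of influence, under a hard per-frame update budget.
struct CachedCubeRsm {
    int  lightId  = -1;
    bool dirty    = true;  // set when a mover enters this light's influence
    int  faceSize = 64;    // cube faces can be tiny, e.g. 16x16 or 64x64
};

void renderCubeRsm(CachedCubeRsm& rsm) {
    // Render the scene from the light's point of view into all six faces.
    (void)rsm;
}

void updateRsmCache(std::vector<CachedCubeRsm>& cache, int perFrameBudget) {
    int updated = 0;
    for (CachedCubeRsm& rsm : cache) {
        if (!rsm.dirty) continue;         // cached faces are still valid
        renderCubeRsm(rsm);
        rsm.dirty = false;
        if (++updated >= perFrameBudget)  // e.g. 5-20 updates per frame
            break;
    }
}
```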
Regarding occlusion, we can do volumetric occlusion and per-pixel occlusion, but we can also do screen-space ambient occlusion to add additional shadows in screen space. We can use both, but one should note that the former is more expensive.
If you take a look at our tech demonstrations that compare scenes with and without global illumination, you can see 120 light sources along with debug output that indicates the number of light sources.
Ravi Sinha: I came across an interesting statistic for Aura: it utilizes about 1664 KB on consoles. For clarification purposes, is this for the PS4 and Xbox One or for the last-gen consoles [PS3/360]? If it's for last gen, can you please provide numbers for the new consoles? Also, do the PS4 and Xbox One have different numbers?
Wolfgang Engel: That is actually the setting that you see in those videos. You can increase or decrease the number based on your quality settings. To be more precise, this number covers the size of the reflective shadow maps that are used to cache the data, and it is also for the light propagation volume. The bigger you make your light propagation volume, the more precision you have, but then it also occupies slightly more memory and the cost of updating increases.
That number is a ballpark to demonstrate how small the memory requirements actually are. The main advantage of our system over other GI systems is that we support specular, include characters, allow as many light sources as possible, and our system works really well in open-world environments.
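To get a feel for why the footprint is so small, here is a back-of-the-envelope calculation under assumed parameters (a 32x32x32 grid with one RGBA16F volume texture per spherical harmonics colour channel). The numbers are illustrative only and are not Aura's actual layout; note the 1664 KB figure above also includes the reflective shadow map cache, which this sketch ignores.

```cpp
#include <cstdio>

int main() {
    // Illustrative only: memory footprint of a light propagation volume,
    // assuming a 32^3 grid and three RGBA16F volume textures (one set of
    // 2-band SH coefficients per colour channel).
    const int cells          = 32 * 32 * 32;  // grid resolution (assumed)
    const int texturesPerLpv = 3;             // R, G, B SH coefficients
    const int bytesPerTexel  = 8;             // RGBA16F = 4 x 16-bit
    const int totalBytes     = cells * texturesPerLpv * bytesPerTexel;
    std::printf("LPV footprint: %d KB\n", totalBytes / 1024);  // 768 KB
    return 0;
}
```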
Ravi Sinha: Given that lighting is one of the biggest aspects of games development these days, Aura seems to have an extremely low memory footprint. What kind of optimization methods have you utilized to create a memory-friendly yet powerful global illumination technique?
Wolfgang Engel: If you look at this article, we started out with that and added specular and changed some of the propagation steps fundamentally so that it also covers characters. Compared to other solutions, Aura can also influence characters, which means you get a very consistent scene. When you look at the moving carpet demo, you can see that most companies cannot provide a solution similar to ours, as they can't cover moving objects. We started out with the algorithms described in GPU Pro 2 and enhanced them to cover characters and many light sources.
Ravi Sinha: Staying on the topic of global illumination: since the method is dependent on the GPU and, to an extent, on the CPU, how is the implementation different on the PS4 and Xbox One when compared to a high-end PC, since these next-gen consoles essentially have slower CPUs?
Wolfgang Engel: When we built Aura, we knew roughly what the specifications would be and that they would be powerful, because you can go to the metal, so it's much more powerful than the specs would lead you to believe. So we built Aura around the performance characteristics of the new consoles.
"Building a rendering system with a lot of lights and especially with lot shadows, on a fundamental level, is the main challenge for most people. You can mimic a movie scenario where they place a large number of lights into a scene where they mimic real life lighting with light bouncing."
Ravi Sinha: Let’s talk a bit about Ephemeris. How does it differ from other solutions?
Wolfgang Engel: The biggest advantage of our skydome system is that it's very efficient. Our skydome solution is almost two generations ahead of the one I worked on when I was at Rockstar, and it was written from scratch. We have multiple scattering and volumetric clouds. Most of the calculations can be fetched from look-up tables, so it can give you results in a short amount of time, and I think our competitors are not that fast.
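Precomputed-scattering systems typically bake the expensive integrals into tables at load time and do only a cheap fetch per pixel at runtime. A toy C++ sketch of the fetch side follows; the (height, sun angle) parameterization is a common choice in the literature, not necessarily what Ephemeris actually uses.

```cpp
#include <algorithm>
#include <vector>

// Toy illustration of the look-up-table idea: the scattering integral is
// precomputed into a 2D table; the runtime only samples it.
struct ScatteringLut {
    int width = 256, height = 64;
    std::vector<float> data = std::vector<float>(width * height, 0.0f);

    float sample(float normalizedHeight, float cosSunAngle) const {
        float u = std::clamp(cosSunAngle * 0.5f + 0.5f, 0.0f, 1.0f);
        float v = std::clamp(normalizedHeight, 0.0f, 1.0f);
        int x = static_cast<int>(u * (width - 1));
        int y = static_cast<int>(v * (height - 1));
        return data[y * width + x];  // nearest fetch; real code would filter
    }
};
```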
Ravi Sinha: What kind of improvements have you made to Ephemeris to cater to high-end PCs and the new consoles in terms of ray scattering and physically based rendering?
Wolfgang Engel: The skydome comes with a package of filters, like polarization filters, to achieve an image similar to a real-world camera's. You might want to mix volumetric clouds with regular, basically cheaper clouds, and then, depending on the type of game, you would use more or fewer volumetric clouds. If those clouds never come close to the camera, for example in mostly indoor games, it does not make sense to have too many of them.
Multiple scattering is so cheap in our system that we already do pretty much everything you can do at the highest quality. The clouds are actually the part that is expensive; there you have several solutions, you pick the one that fits your game, and then you spend your frame time on clouds.
In one of our videos you can see how the clouds are lit. Especially for cloud rendering we developed custom solutions to make it efficient and fast. There is another video that shows the integration with the Vision Engine, which shows the lighting on the clouds and how they react with the sunlight.
Ravi Sinha: Another interesting service that your company provides is engine extension. Generally speaking, what kind of requirements do developers have when they approach you for upgrading their engine to a new platform? Consider an example where you are upgrading from the complex Cell architecture of the PS3 to the x86 of the PS4. What challenges does this pose?
Wolfgang Engel: We have worked with many custom engines, with CryEngine, the Vision Engine, Unreal Engine 4 and Unreal Engine 3. Last year we had two Unreal Engine projects along with three CryEngine projects, and the rest were all custom engines. Many clients that come to us say, 'hey, it's a bit dated now and we have a long list of features, can you add those?'
We then give them a quote and a time estimate with two-week milestones. The features we have been asked for include integrating middleware packages into engines, adding special effects and simple case studies. Regarding special effects, we add things like lighting and shadowing, and we support a wide variety of platforms across mobile, PC and the new consoles. Last year we also worked on iOS games.
Ravi Sinha: So basically your solutions allow adding any effects. Does this require additional training on the artists' side?
Wolfgang Engel: Yes, usually we have to work with the artists. The way we work is that we get feedback from artists, so they always keep asking 'can we tune this, can we tune that?' and we then add their requirements. We also have an artist on our side who can prepare example setups.
One thing I wanted to mention is that we visit game studios, talk to the graphics teams and try to understand their requirements. I get invited for two-day consultations, which is cool for me: discussing things with other graphics programmers, I learn a lot by hearing their needs and requirements and providing solutions to them.
Ravi Sinha: So you talked about consultation. What is the number one demand from developers these days?
Wolfgang Engel: Most of the time it's lighting and shadowing solutions. Building a rendering system with a lot of lights, and especially with a lot of shadows, is, on a fundamental level, the main challenge for most people. You can mimic a movie scenario, where they place a large number of lights into a scene to mimic real-life lighting with light bouncing. In video games you can mimic real-life light pretty well, and this is one of the things that we did in Red Dead Redemption. Later on we did that in pretty much every game we worked on. Depending on the needs of a game, we implement highly efficient lighting and shadow systems.
"One of the cool features of modern games is that we have physics, and they have been traditionally implemented on the CPU and as a game developer you have to go back and ask ‘do I have to spend 40% of my CPU time’ on rendering or ‘can I reduce this so that I can use it for physics’ and this is one of the things that DirectX 12 allows you."
Ravi Sinha: What are your thoughts on DX12?
Wolfgang Engel: I like it. It’s great and it’s a fantastic opportunity to raise the bar again. It works with the same piece of hardware, so it’s the same CPU and the same GPU, and certainly we have much more CPU time to spend.
The workload on the CPU decreases substantially, because you can better utilize the cores of the CPU. In this way you are less likely to be CPU limited. One of the cool features of modern games is physics, which has traditionally been implemented on the CPU, and as a game developer you have to go back and ask 'do I have to spend 40% of my CPU time on rendering, or can I reduce this so that I can use it for physics?' This is one of the things that DirectX 12 allows you to do. It makes sure that developers can get more out of the existing hardware.
AMD's Mantle is also going in the same direction and has the same structural idea: reducing CPU time so that it can be freed up for other tasks. One interesting case is multiple-GPU setups, which is kind of an interesting development because when you have multiple GPUs, say two GPUs, you would not expect to be CPU limited. But when the CPU has to feed two really fast GPUs, suddenly it becomes the bottleneck. DirectX 12 and Mantle resolve that situation.
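Much of the CPU win he describes comes from recording command lists on multiple threads instead of funneling all draw calls through one immediate context. A heavily condensed D3D12 sketch of that pattern follows; device, queue and pipeline state creation, error handling, and frame synchronization are all omitted, and the draw recording is only indicated by a comment.

```cpp
#include <d3d12.h>
#include <thread>
#include <vector>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

// Condensed sketch of multithreaded command recording in DirectX 12: each
// worker thread records its own command list, and the main thread submits
// them all at once on the queue.
void recordAndSubmit(ID3D12Device* device, ID3D12CommandQueue* queue,
                     int threadCount) {
    std::vector<ComPtr<ID3D12CommandAllocator>>    allocators(threadCount);
    std::vector<ComPtr<ID3D12GraphicsCommandList>> lists(threadCount);
    std::vector<std::thread>                       workers;

    for (int i = 0; i < threadCount; ++i) {
        device->CreateCommandAllocator(D3D12_COMMAND_LIST_TYPE_DIRECT,
                                       IID_PPV_ARGS(&allocators[i]));
        device->CreateCommandList(0, D3D12_COMMAND_LIST_TYPE_DIRECT,
                                  allocators[i].Get(), nullptr,
                                  IID_PPV_ARGS(&lists[i]));
        workers.emplace_back([&, i] {
            // Each thread records its slice of the scene independently;
            // this is the work a DX11-style immediate context serializes.
            // ... SetPipelineState, resource bindings, draw calls ...
            lists[i]->Close();
        });
    }
    for (auto& w : workers) w.join();

    std::vector<ID3D12CommandList*> raw;
    for (auto& l : lists) raw.push_back(l.Get());
    queue->ExecuteCommandLists(static_cast<UINT>(raw.size()), raw.data());
}
```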
Ravi Sinha: DX 12 sounds similar to AMD’s Mantle. What are your thoughts on this and do you think having more APIs for low level access to GCN will create inconsistency?
Wolfgang Engel: The advantage of Mantle is that it's able to squeeze a few more cycles out of AMD platforms. The advantage of DX12 is that it runs on all GPUs.
Ravi Sinha: DX 12 will obviously have a big impact on PC gaming. But do you think it will have the same level of technical impact on the Xbox One?
Wolfgang Engel: The Xbox One already has an API that is similar to DirectX 12. Microsoft implemented a DirectX 12-like driver on the Xbox One early on, and that freed up a lot of CPU time.
Ravi Sinha: Overall there is a lot of confusion regarding the message Valve is putting out for the SteamBox. With respect to that, how are you approaching graphics tools development for the SteamBox?
Wolfgang Engel: We helped Intel optimize OpenGL ES drivers for the SteamBox, and we had pretty early development kits of Intel's SteamBox [Gigabyte Brix Pro] at the office. I also have one at home and I am impressed by them. It is a very cool mini PC: very small and astonishingly powerful for its size.
From my perspective, the most unfortunate part about the SteamBox is that there are so many different configurations. Honestly, I would have just stayed with the Gigabyte Brix Pro, because it's not very expensive and using that you can give every developer a benchmark. The big differences in configuration will make it hard to sell the whole SteamBox idea. The big advantage of consoles is that you have fixed hardware that you build on, leading to better optimization and code down to the metal.
From my perspective, Valve should have just declared, 'this is our SteamBox for the next 3-5 years,' and every developer would have made sure that this box delivered the best possible experience. Using a box that is not expensive also extends your user base.
Regarding graphics tools development: the SteamBox is Linux, so getting reliable graphics drivers is a challenge. As I said before, we worked with Intel on their graphics drivers, and I think Intel has improved their drivers in the Linux area substantially. It's really astonishing. Over the last few years, Intel has drastically changed driver development. Developers were complaining about driver issues five to six years ago, and today Intel's driver support is in general really good. It is so good that most people actually do not notice it anymore.
The biggest challenge for the SteamBox will be a consistent development environment. As soon as you have different graphics cards on Linux, driver support can be very different, which means you have to check every time, 'what SteamBox am I running on?' While Microsoft makes sure that the drivers are running smoothly on a variety of hardware platforms, with the variety of SteamBoxes this challenge gets greatly increased on Linux.
Ravi Sinha: What is your opinion on ray tracing? Imagination thinks that modern hardware allows ray tracing, which is correct, but does the market have enough high-end PCs to have it included in retail games?
Wolfgang Engel: I think it's a cool solution as long as there is hardware to support it. We have optimized rasterized image rendering over the last 40 years to a point where we have become really good at mimicking real-world environments. We have trained artists to create art and tune games so that they look quite real already.
With Ray Tracing we would have to train art teams again and also get some hardware support.
PowerVR's hardware implementation looks very useful: they start with a deferred lighting system, hold all the data in a G-buffer, and then use ray tracing to add things to the scene like shadows, reflections and global illumination. But the challenge is that as long as everyone is not doing it the same way, you can only run it on PowerVR chips. I don't see other people jumping on that bandwagon at the moment. As long as PowerVR has a large part of the market, which they currently have with the Apple platforms, this might be a very cool target.
A large number of iPhones and iPads running chips capable of ray tracing would be very cool! It might actually introduce ray tracing to the end market. We will see if their solution performs well enough to be used, but in essence it's a very interesting approach that might actually work out.
Ravi Sinha: Do you think the new consoles will be able to support ray tracing in the future? What kind of technical challenges will developers face in implementing ray tracing on them?
Wolfgang Engel: Yes. We have a prototype ray tracer running that could do this. One of the challenges is to get things working with the same art assets that are also used for the rasterized game. This is what we were looking into.
"The memory expensive draw calls can be rendered into eSRAM. When you don’t need so much memory bandwidth, you use the regular system memory. You have to plan ahead, you have to think how you are going to use the memory, in the most optimal way. So eSRAM gives you an advantage if you do this."
Ravi Sinha: At the recent Microsoft BUILD event, Microsoft demonstrated cloud technology in gaming. What are your thoughts on this, and do you think it can possibly give the Xbox One more processing power? Furthermore, what could be the potential hindrances to this tech, since many regions don't have high-speed internet connections?
Wolfgang Engel: This is very cool tech, because you can offload tasks like AI that do not need to be real-time. This way you can relieve the pressure on the CPU and do other things with it. I think this is an advantage for the Xbox One.
Ravi Sinha: You were at Sony's GDC booth this year and you spoke about compute shader optimizations across three AMD GPUs: the Radeon 6770, Radeon 7750 and Radeon 7850. Will those same optimization methods be applicable to the PlayStation 4, since it has a GPU quite similar to a Radeon 7850?
Wolfgang Engel: Yes, and there will be PS4-specific optimizations, not available on PC, that will help increase performance even more.
Ravi Sinha: What are your thoughts on the Xbox One eSRAM's potential for tiled resource streaming along with DX 11.2? Do you think developers will utilize this feature anytime soon?
Wolfgang Engel: eSRAM is very fast memory. In general, the biggest challenge game developers are facing is memory access patterns: while we have a lot of computational power, the cost of memory access has increased substantially over the last ten years compared to the cost of arithmetic instructions. As long as you are in registers you are fine, but as soon as you need to access memory, it becomes slower. So the challenge is to access memory in the most efficient way.
Therefore memory access patterns are the most important optimization strategy. It's not about counting cycles; it's about thinking how we can re-factor an algorithm so that we can access memory in a more efficient way. eSRAM is part of that. For example, with a compute shader you can access cache-speed memory (thread group shared memory), so you can re-factor your algorithm to use this memory better, resulting in substantial speed-ups. With the Xbox One, the introduction of eSRAM follows a similar idea.
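The same refactoring principle is easy to see on the CPU. The following generic illustration (not code from the interview) performs identical arithmetic in both functions, but the second walks memory contiguously and is typically several times faster because it uses every fetched cache line fully.

```cpp
#include <vector>

// Stride-w access: each iteration touches a new cache line, wasting most
// of the bytes the hardware fetched.
void brightenColumnMajor(std::vector<float>& img, int w, int h) {
    for (int x = 0; x < w; ++x)
        for (int y = 0; y < h; ++y)
            img[y * w + x] *= 1.1f;
}

// Unit-stride access: consecutive pixels share cache lines, so the same
// arithmetic generates far less memory traffic.
void brightenRowMajor(std::vector<float>& img, int w, int h) {
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x)
            img[y * w + x] *= 1.1f;
}
```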
The memory-expensive draw calls can be rendered into eSRAM; when you don't need so much memory bandwidth, you use the regular system memory. You have to plan ahead and think about how you are going to use the memory in the most optimal way. So eSRAM gives you an advantage if you do this. For one of our games, we started by creating an Excel sheet that shows how we are going to use eSRAM through the stages of the rendering pipeline. This helped us utilize the speed improvements coming from the eSRAM.
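The spreadsheet he mentions is essentially a per-pass budget against the Xbox One's 32 MB of eSRAM. A trivial sketch of that bookkeeping follows; the render target list is invented for illustration and is not from any Confetti title.

```cpp
#include <cstdio>

// Trivial sketch of eSRAM planning: list the bandwidth-heavy render targets
// of a pipeline stage and check them against the 32 MB eSRAM budget.
struct RenderTarget { const char* name; int width, height, bytesPerPixel; };

int main() {
    const RenderTarget gBufferPass[] = {   // hypothetical 1080p G-buffer
        {"albedo",  1920, 1080, 4},        // RGBA8
        {"normals", 1920, 1080, 4},        // e.g. 10:10:10:2
        {"depth",   1920, 1080, 4},        // D32F
    };
    const long long esramBytes = 32ll * 1024 * 1024;
    long long used = 0;
    for (const RenderTarget& rt : gBufferPass)
        used += 1ll * rt.width * rt.height * rt.bytesPerPixel;
    std::printf("G-buffer pass: %lld KB of %lld KB eSRAM\n",
                used / 1024, esramBytes / 1024);  // ~24 MB of 32 MB
    return 0;
}
```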
Ravi Sinha: Furthermore, how do you think Sony's own custom API for the PS4 will stack up against DX12?
Wolfgang Engel: Sony’s own custom API is more low-level and definitely something that graphics programmers love. It gives you a lot of control. DirectX 12 will be a bit more abstract because it has to work with many different GPUs, while the PS4 API can go down to the metal.
Ravi Sinha: During this year's GDC, Emil Persson, Head of Research at Avalanche Studios, spoke about effectively utilizing ROPs on the PS4 and Xbox One. According to him, the PS4 is able to render 64-bit textures before it becomes bandwidth bound, and the Xbox One 32-bit textures before it runs out of bandwidth. Since you also provide consulting and solutions, how can a developer bring the same 64-bit performance to the Xbox One?
Wolfgang Engel: Those are highly theoretical numbers, because they don't reflect a typical game scenario. In a game you start rendering into shadow maps, which eats up memory bandwidth; then you render into a G-buffer for deferred lighting; then you render into a reflection map, which might be a cube map; then you start rendering lights into a light buffer, followed by a post-effects pipeline render, which might require a lot of memory copies depending on how you implement it.
Take a scenario where you have 20 different render targets per frame and you render them into memory: it is hard to say that this one will be faster than that one, and it is even harder to make general assumptions, especially with the eSRAM in the mix. It highly depends on how you use system memory and eSRAM. You will run a lot of performance captures to see what the bottleneck is, memory access or arithmetic instructions. In most cases I predict memory access patterns will be your biggest challenge.
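A quick back-of-the-envelope calculation shows why that 20-render-target scenario dwarfs any theoretical per-texture ROP figure. The numbers below are invented for illustration and count only the writes, ignoring reads, overdraw and texture fetches.

```cpp
#include <cstdio>

int main() {
    // Illustrative only: write traffic for 20 full-screen 1080p render
    // targets at 4 bytes per pixel, 60 frames per second.
    const long long pixels  = 1920ll * 1080;
    const long long targets = 20, bpp = 4, fps = 60;
    const double gbPerSec = double(pixels * targets * bpp * fps) / 1e9;
    std::printf("Write traffic alone: %.1f GB/s\n", gbPerSec);  // ~10 GB/s
    return 0;
}
```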