Engineering Game Development

Lee Winder, Technical Manager at BlitzTech on Software Engineering, Game Development and Education

Browsing Posts in Continuous Integration

Our current Continuous Integration process works quite nicely and has greatly benefited our team, but these systems can always be extended and improved upon.  The following post will simply cover some of the ideas I have for the future and how they might improve the whole process.

Visual Unit Testing

Visual Unit Testing is the process of creating small, scripted tests that run the game and allow you to test that certain states have been reached.  For example, having a test which runs the game, scripts the input so one character shoots another and then tests if the enemy is dead or the score has been increased would be a very basic test.  It could also be used to perform long soak tests with more intelligent input rather than the kind of tests that are used at the moment.
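To make the idea concrete, here is a minimal sketch of such a test in Python.  It is only an illustration: ToyGame, run_script and the frame-to-buttons script format are all invented for the example and have nothing to do with the BlitzTech engine or its state machine system.

```python
from dataclasses import dataclass


@dataclass
class ToyGame:
    """Hypothetical stand-in for the real game: one enemy and a score."""
    enemy_health: int = 2
    score: int = 0
    frame: int = 0

    def step(self, buttons):
        """Advance one frame, applying whichever buttons are held."""
        self.frame += 1
        if "fire" in buttons and self.enemy_health > 0:
            self.enemy_health -= 1
            if self.enemy_health == 0:
                self.score += 100  # award points for the kill


def run_script(game, script, max_frames=600):
    """Feed scripted input to the game.

    script maps a frame number to the set of buttons held on that frame;
    frames not listed in the script have no input.
    """
    while game.frame < max_frames:
        game.step(script.get(game.frame, set()))
    return game


# Scripted input: press fire on frames 10 and 20, then let the game run on.
script = {10: {"fire"}, 20: {"fire"}}
game = run_script(ToyGame(), script)

# The "test" part: has the expected state been reached?
assert game.enemy_health == 0, "enemy should be dead"
assert game.score == 100, "score should have increased"
```

The test only describes the input and the expected end state, so in principle the same script could be replayed against any platform that can run the game.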

Allowing the process to take screenshots and post them to a web server with the test results would allow a huge number of systems to be tested in the environment they will actually be used in.  Extending this to allow QA Technicians to create rafts of tests would also remove the need to continually test the more mundane aspects of games by hand, such as whether the menu flow behaves as it should or the leaderboards work as intended.

Obviously running this process would require a large amount of time, especially if run on multiple platforms.  But it could easily be hooked into the nightly build process meaning hours of testing would be carried out when no-one was in the office.

Some Arcade titles originally supported simple scripting using Lua to perform automatic controller input, but this wasn’t taken far enough, and the BlitzTech engine now has its own state machine system which would most likely be used instead.

Automated Code Reviews

Teams at Blitz carry out code reviews at different times and to different levels, but one thing that is usually done is checking that the code conforms to the Blitz coding standards (with people moving around teams as they do, a consistent standard is pretty much essential).  But again, checking something against a set of rules is something that an automated system is much better at than our human co-workers.

Adding a code metrics step to the Cruise Control build, so the code is checked after every check-in, could catch code whose standards compliance falls below a certain threshold.  If this was the case, the build would simply fail (and this step could be removed if we had a designer/artist build machine as discussed in a previous post) and the programmers would have to re-examine the code they checked in.

While this could seem quite draconian, it would free up code reviews to concentrate on the bigger picture, rather than getting stuck in the rut of commenting on the use of ‘m_’ or class name prefixes.
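As a rough sketch of what such a step might look like, the Python below counts violations of two made-up rules and fails once a threshold is exceeded.  The rules (a ‘C’ prefix on class names, an ‘m_’ prefix on members) are assumptions for illustration and not the actual Blitz standard, and the member check is deliberately naive: it treats any indented "int x;" style line as a member declaration, which a real tool would not.

```python
import re

# Hypothetical rules, not the real Blitz coding standard.
CLASS_DECL = re.compile(r"^\s*class\s+(\w+)")
MEMBER_DECL = re.compile(r"^\s+(?:int|float|bool)\s+(\w+)\s*;")


def find_violations(source):
    """Return a list of (line_number, message) for each rule broken."""
    violations = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = CLASS_DECL.match(line)
        if match and not match.group(1).startswith("C"):
            violations.append((number, f"class '{match.group(1)}' lacks the 'C' prefix"))
        match = MEMBER_DECL.match(line)
        if match and not match.group(1).startswith("m_"):
            violations.append((number, f"member '{match.group(1)}' lacks the 'm_' prefix"))
    return violations


def build_passes(source, max_violations=0):
    """The build step fails once the violation count exceeds the threshold."""
    return len(find_violations(source)) <= max_violations


SAMPLE = """\
class CPlayer
{
    int m_health;
    float speed;
};
class Enemy
{
    bool m_alive;
};
"""

for number, message in find_violations(SAMPLE):
    print(f"line {number}: {message}")
```

Keeping the threshold as a parameter means a team could start lenient on an existing code base and tighten it over time.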

Extending The Nightly Build

As I mentioned in Extending Cruise Control we collect information on how often the build is broken and the ratio of who breaks the build the most.  We also generate a massive amount of information by using Subversion, such as check-in rates and lines changed per check-in etc.

The nightly build could easily be extended to incorporate all this information and, along with links to the Visual Unit Testing web service, could produce reports on the activity and stability of the game for that day.  This would be a very useful tool for the programming managers, especially as some people already use this information, but it is usually done on an ad-hoc basis when the tools are manually run and the information manually collated.
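As one small piece of such a report, the sketch below counts check-ins per author and per day from XML shaped like the output of `svn log --xml`.  The sample log, the author names and the function names are all made up for the example, and lines changed per check-in is not covered.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Sample shaped like `svn log --xml` output; every entry here is invented.
SAMPLE_LOG = """\
<log>
  <logentry revision="101">
    <author>alice</author>
    <date>2009-06-01T09:12:00.000000Z</date>
    <msg>Add leaderboard screen</msg>
  </logentry>
  <logentry revision="102">
    <author>bob</author>
    <date>2009-06-01T14:40:00.000000Z</date>
    <msg>Fix menu flow</msg>
  </logentry>
  <logentry revision="103">
    <author>alice</author>
    <date>2009-06-02T10:05:00.000000Z</date>
    <msg>Tweak scoring</msg>
  </logentry>
</log>
"""


def checkins_per_author(svn_log_xml):
    """Count log entries for each author."""
    root = ET.fromstring(svn_log_xml)
    return Counter(
        entry.findtext("author", default="(unknown)")
        for entry in root.iter("logentry")
    )


def checkins_per_day(svn_log_xml):
    """Count log entries per calendar day (first 10 characters of the date)."""
    root = ET.fromstring(svn_log_xml)
    return Counter(
        entry.findtext("date", default="")[:10]
        for entry in root.iter("logentry")
    )


print(checkins_per_author(SAMPLE_LOG))
print(checkins_per_day(SAMPLE_LOG))
```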

Automatic CI Support

A large portion (at the moment pretty much all) of my current work is involved with extending the Arcade technology I developed and integrating it into the BlitzTech engine.  It has the ability to automatically generate the required configuration files to set up a Cruise Control server, but the current plug-in system (automated e-mail generation, stats collection etc.) is pretty much set up to do only what it needs to do, without much flexibility.

I fully intend to make all these plug-ins come as standard, allowing developers to cherry pick the elements they want and hook them into their CI process with very little work.  While people may start to ask why I do not use NANT since they are all hooked into CruiseControl.net, I can simply do more, and do it faster, with Ruby.

So I have a few ideas on where I want to go with our Continuous Integration process in the next few months (years?) but there is nothing like lofty goals to keep you going.  Visual Unit Testing is obviously the biggest, and will require the largest amount of time to develop, but will generate the biggest payback once it’s being used to its fullest potential.

So what kind of things would you like to see in a CI process?  Or is there something your process is doing that you think is pretty special?

So far we have a process which guarantees that the game builds and that the assets can be generated.  We have processes for doing this automatically and for making sure people are as up to date as possible.  But there is still one question which none of this answers…

When I run the game, will it actually work?

Games are a mass of systems, all interacting together to produce something that is fun and interesting to play.  It’s QA’s job (on the whole but not exclusively) to make sure that when it plays it plays well and there are no glaring bugs, but if we are automating so much, can we also automate some of the testing?

Unit Testing

A Self Testing Build in this case is really just a fancy term for Unit Testing, something a lot of people have heard of but just as many haven’t.

The basic premise behind this is that at the end of every single compile small pieces of code are run that test various systems in the game.  These are run as post build steps so any changes to the code are instantly tested against the tests that already exist.  If some code has changed that makes one of the tests fail then the whole build fails.  This means Cruise Control fails, the nightly build fails, but most importantly everyone knows it’s broken and they should avoid updating until it is fixed.

The following are a couple of very simple tests that are run every time the code changes to make sure the vectors behave consistently and as expected.  This might look simple (one of the main reasons people don’t unit test is because it will ‘obviously work’).  But with vector code being platform centric, a change on one platform can affect all the others.  The tests are written using UnitTest++ which is the library we use to write all our tests.
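The kind of test meant here can be sketched as follows.  This is a Python stand-in using the standard unittest module rather than UnitTest++, and the Vector3 class is a hypothetical minimal type, not the engine’s vector code.

```python
import math
import unittest


class Vector3:
    """Hypothetical minimal 3D vector, standing in for the engine's type."""

    def __init__(self, x, y, z):
        self.x, self.y, self.z = x, y, z

    def __add__(self, other):
        return Vector3(self.x + other.x, self.y + other.y, self.z + other.z)

    def dot(self, other):
        return self.x * other.x + self.y * other.y + self.z * other.z

    def length(self):
        return math.sqrt(self.dot(self))


class VectorTests(unittest.TestCase):
    def test_addition_is_component_wise(self):
        result = Vector3(1, 2, 3) + Vector3(4, 5, 6)
        self.assertEqual((result.x, result.y, result.z), (5, 7, 9))

    def test_length_of_3_4_0_is_5(self):
        self.assertAlmostEqual(Vector3(3, 4, 0).length(), 5.0)


def run_post_build_tests():
    """Run the suite and return a process exit code (0 = pass, 1 = fail)."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(VectorTests)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return 0 if result.wasSuccessful() else 1
```

A post build step would exit with the value of run_post_build_tests(), so a single failing test is enough to fail the whole build.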

Feedback to the developers tells them something has broken, the test results are displayed in Cruise Control and the person responsible instantly sets about fixing the build.

Game Based Unit Testing

It’s often difficult to think about games from the point of view of Unit Tests, but doing so often results in code being written in a more encapsulated and less coupled way, which in itself results in code being easier to write and maintain.  But as an example, the following could easily be tested in a small suite of Unit Tests…

  • Does an AI character ‘see’ the player when they are in front of them?
  • Is the player ignored if they are outside the enemy’s cone of vision?
  • What state do they enter when they hear a noise?

By testing the underlying logic of the system we can be confident that when something breaks we will have a better idea of where the problem lies than having to look at every system at every level.
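The three questions in the list above map quite directly onto small tests.  The sketch below is in Python, and everything in it is an assumption made for illustration: a 2D world, a 45 degree half-angle, a maximum sight range of 20 units and the state names are not taken from any real game.

```python
import math


def can_see(enemy_pos, enemy_facing, player_pos, half_angle_deg=45.0, max_range=20.0):
    """True if the player is inside the enemy's cone of vision (2D)."""
    dx = player_pos[0] - enemy_pos[0]
    dy = player_pos[1] - enemy_pos[1]
    distance = math.hypot(dx, dy)
    if distance == 0.0:
        return True  # standing on the same spot counts as seen
    if distance > max_range:
        return False
    facing_length = math.hypot(enemy_facing[0], enemy_facing[1])
    # Cosine of the angle between the facing direction and the player.
    cos_angle = (dx * enemy_facing[0] + dy * enemy_facing[1]) / (distance * facing_length)
    return cos_angle >= math.cos(math.radians(half_angle_deg))


def next_state(current_state, heard_noise):
    """Hypothetical rule: an idle or patrolling enemy investigates a noise."""
    if heard_noise and current_state in ("idle", "patrol"):
        return "investigate"
    return current_state


def test_sees_player_in_front():
    assert can_see((0, 0), (1, 0), (5, 0))


def test_ignores_player_outside_cone():
    assert not can_see((0, 0), (1, 0), (-5, 0))  # directly behind
    assert not can_see((0, 0), (1, 0), (0, 5))   # 90 degrees to the side
    assert not can_see((0, 0), (1, 0), (50, 0))  # in front but out of range


def test_noise_changes_state():
    assert next_state("patrol", heard_noise=True) == "investigate"
    assert next_state("attack", heard_noise=True) == "attack"


for test in (test_sees_player_in_front, test_ignores_player_outside_cone, test_noise_changes_state):
    test()
```

Because can_see and next_state take plain values rather than reaching into the rest of the game, they can be tested without loading a level, which is the decoupling benefit mentioned earlier.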

Running Tests On Different Platforms

Unit Tests are only useful if they are run often and on the platforms the code is destined to be used on.  This can cause problems when trying to run Unit Tests on development kits as the code usually needs to be loaded onto the machine before it can be run and the results fed back to the host.  While this might only take 10 seconds, doing it after every build would be a massive drain on a programmer’s time.  This isn’t a problem on the PC platform, but we generally expect the console tests to be run as part of the nightly build when it has all the time it needs to run and feed back the results.

But, even if it isn’t being run on every platform all of the time, we still have a system running on the PC and some testing is still better than no testing at all.

So this is how the Blitz Arcade system adds a level of testing to the Continuous Integration process.  It’s one of the hardest parts of the process because Unit Testing does take a while to get into, and to understand how code can be tested when at first it looks like it is a series of black box objects.

In the final part, I’ll take a look at what I would like the process to do in the future, and the kind of things that would be cool to have in a fully integrated Continuous Integration process.

Games are not just code.  Art, design, music and QA all get in the way, making the whole CI process harder, but actually making the game fun!  It’s just as important that the CI process benefits these guys, otherwise there are still going to be hurdles to getting a solid game released and making the process as smooth as possible.

The Nightly Build

Due to using Cruise Control, we know the build can compile and because we never leave on a failed build we know we can run a full ‘rebuild’ of the game at night.  This guarantees that we do a full rebuild of the game at least once a day, and can remove any niggles that crop up from Cruise Control only running ‘build’ steps.  And since we are doing it when no one is around, we have the time to build all the game assets, including animations, textures, models, music and anything else that we need.  Since this can take hours, it doesn’t matter if we start the build at 11pm, as by 9am the next day everything is up and ready to go.

So we finally have a fully integrated build that is being generated for the start of every working day.  It means that QA have a full build to test at the start of the day, but also the designers, artists and musicians have a fully up-to-date project in which to add their assets and tweak the game play.

But this can be difficult to set up at first.  Since you are automating the entire process, everything needs to be command line driven.  This isn’t always easy with off the shelf tools, or even bespoke software, but the time it can save is definitely worth the effort.  Luckily the BlitzTech SDK is fully configurable using a custom scripting language so we can not only build the assets, but run additional processing steps such as compression or pre-processing of assets.

Generating More Regular Builds

But we still have a slight problem.  What happens if 10 minutes into the day a programmer implements a new feature that the designers want to play with right there and then?  Do they have to wait until tomorrow morning just to get hold of it?  What if the assets being generated by the artists change in a way that is no longer compatible with the build they have?

Since we have a Cruise Control machine constantly building a new version of the game, we can easily add additional steps to the end of the CC process.  We have a step at the end of a successful build that copies the executables to the network and (similar to the failed build mails) sends a mail to the designers and artists telling them a new build is available.

By providing them with a simple tool that allows them to get ‘latest build’ from wherever they want, the designers have access to the latest game features pretty quickly (this is also another reason to trim down the amount of time it takes to finish a Cruise Control build).
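A ‘get latest build’ tool can be very small.  The Python sketch below assumes, purely for illustration, that each successful build is copied to a network share in a folder named after its Subversion revision (r98, r105 and so on); that layout and all the names are hypothetical rather than how the Arcade tool actually works.

```python
import re
import shutil
import tempfile
from pathlib import Path

# Hypothetical naming convention: one folder per build, called r<svn revision>.
BUILD_DIR = re.compile(r"^r(\d+)$")


def latest_build(share):
    """Return the build folder with the highest revision, or None."""
    candidates = []
    for entry in share.iterdir():
        match = BUILD_DIR.match(entry.name)
        if entry.is_dir() and match:
            candidates.append((int(match.group(1)), entry))
    if not candidates:
        return None
    # Compare revisions as numbers so r105 beats r98.
    return max(candidates, key=lambda item: item[0])[1]


def get_latest(share, destination):
    """Copy the newest build locally unless it is already there."""
    build = latest_build(share)
    if build is None:
        raise FileNotFoundError(f"no builds found in {share}")
    target = destination / build.name
    if not target.exists():
        shutil.copytree(build, target)
    return target


def demo():
    """Exercise the tool against a throwaway folder instead of a real share."""
    with tempfile.TemporaryDirectory() as tmp:
        share = Path(tmp) / "share"
        local = Path(tmp) / "local"
        for name in ("r98", "r105", "notes"):
            (share / name).mkdir(parents=True)
        (share / "r105" / "game.exe").write_text("binary")
        local.mkdir()
        fetched = get_latest(share, local)
        return fetched.name, (fetched / "game.exe").read_text()


print(demo())
```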

So the programmers are no longer disrupted with requests for new builds, and the artists on the team are no longer using half-built, half complete builds during the day. It means the full game is being used at all times and is never in an unknown state.

Some teams at Blitz add additional information to their Cruise Control builds and this is something I hope to incorporate into the Arcade process at some point.  Each CC build will display the latest Subversion revision on screen at all times.  By requesting that the programmers specify the SVN revision in bug reports or feature check-ins, designers will instantly know if the build they have does what they need it to do.

Generating The Game Assets

It would also be cool if the Cruise Control build also came with up-to-date assets so QA could get the latest build and go from there.  Obviously time is a serious concern here, but since our entire build process is command line driven, we can add steps to build the assets at the end of a successful build and before copying it to the network.  In our process this is an optional step, as at the start of a project, when it takes minutes to generate the assets this is pretty useful, but near the end of a project this can add hours onto the process and has to be removed.
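The optional asset step can be pictured as a switch on an ordered list of steps.  The Python sketch below uses invented step names and stub actions, so it shows only the ordering and the stop-on-failure behaviour, not our actual build scripts.

```python
def plan_steps(include_assets):
    """Ordered steps to run after a check-in; asset building is optional."""
    steps = ["build code"]
    if include_assets:
        steps.append("build assets")
    steps += ["copy to network", "mail designers and artists"]
    return steps


def run_steps(steps, actions):
    """Run each step in order and stop at the first one that fails.

    actions maps a step name to a callable returning True on success.
    Returns (completed_steps, failed_step_or_None).
    """
    completed = []
    for step in steps:
        if not actions[step]():
            return completed, step
        completed.append(step)
    return completed, None


# Stub actions for illustration: here the asset build is made to fail.
actions = {
    "build code": lambda: True,
    "build assets": lambda: False,
    "copy to network": lambda: True,
    "mail designers and artists": lambda: True,
}

completed, failed = run_steps(plan_steps(include_assets=True), actions)
print(completed, failed)
```

Because the copy and the mail come after the asset step, a failed asset build never reaches the network share.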

One way to avoid this, and make the process more suitable for the artists, is to have a separate build machine, which only generates the release build (which is usually the one used by the artists and designers) and builds only the common asset packages.  This is sometimes done on the bigger projects at Blitz as it can take slightly too long for artists to wait for the programmers’ Cruise Control machine to finish.

Generating Submission Builds

One more thing that comes from this process is our submission builds.  Since we can generate a nightly build automatically (usually by adding a single script as an automated task) we can generate all our builds by running the same script.  This means that our submission builds hook into the same process as everything else, which avoids the most important builds being built in an un-tested environment and makes the process even more secure.

 

So we now have a process that allows us to automatically generate the whole game at any point of the day, and at least once a night.  The designers, artists and very importantly QA can hook into the CI system giving them an up-to-date build at any point and no one is ever playing a build of the game that is half-built or doesn’t have everything that is currently available.

In the next part I want to cover the process of self-testing builds and how this can make a build not only compile, but actually make sure it is doing what it is supposed to be doing.

In the last part I covered the use of CruiseControl.net in the CI process, which is pretty much one of the most important parts of the whole system.  Out of the box, CC.net is pretty useful, but it can be easily extended and this part is going to cover how we use CC.net and what we have done to get a bit more out of it.

Turnaround Time

The biggest problem with Cruise Control is the time it can take to report a new (or more importantly broken) build.  At the start of a project this can be a quick turnaround, but later, when we have masses of code, more platforms and more build configurations, it can quite easily take hours to complete a full build.  Since no-one should be going home on a non-green build, this either means they stay in the office until late, or don’t check in after 2pm.  Neither of these is a suitable solution.

The quickest thing we do is reduce the number of configurations we build.  The ‘profile’ configuration, which is rarely used by anyone, can often be removed, along with full debug builds (since these builds actually become unplayable near the end of a project anyway).  We can also cut down on the number of platforms.  A game might be destined for PS3, but it will probably start life on a PC.  Once everyone has finally moved onto the target platform, a whole platform plus its configurations can be removed.
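The effect of trimming is easy to see if the builds are written out as a platform by configuration matrix.  The Python sketch below uses the three platforms and three configurations of the example project; the function and parameter names are made up for illustration.

```python
from itertools import product

PLATFORMS = ["Xbox360", "PS3", "PC"]
CONFIGS = ["debug", "release", "master"]


def build_matrix(platforms, configs, skip_platforms=(), skip_configs=()):
    """Every platform/configuration pair, minus anything being skipped."""
    return [
        (platform, config)
        for platform, config in product(platforms, configs)
        if platform not in skip_platforms and config not in skip_configs
    ]


# Everything: 3 platforms x 3 configurations.
full = build_matrix(PLATFORMS, CONFIGS)

# Later in the project: drop the debug builds and the starting PC platform.
trimmed = build_matrix(PLATFORMS, CONFIGS, skip_platforms={"PC"}, skip_configs={"debug"})

print(len(full), len(trimmed))
```

Dropping one platform and one configuration takes the example from nine builds per check-in to four.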

Finally, we never do a ‘rebuild’ on Cruise Control.  While a ‘build’ sometimes needs a kick from behind due to linker errors or similar cropping up that full builds generally fix, this generates a massive saving of time, and the full build is always done at night anyway.

One technique that has been used in other teams at Blitz is to have a separate CC machine, which simply builds the release build of the title on the target platform.  Not much use for the programmers, but an excellent time saver for designers and artists, who don’t care if the debug PS3 build is broken.

In one case we reduced the build time for a project from around 2 hours on average to about 20 minutes.  A massive boost to productivity.

Extending Cruise Control Reporting

The Cruise Control Tray tool is useful, but it’s actually quite easy to miss the fail message and people do not always check their service tray to see if the CC is red.  The easiest way we have extended this is for the CC machine to automatically send e-mails to all programmers on a team when the build is broken.  This is apparently very easy to do with NANT, but our extensions are written in Ruby, simply because I had a lot of previously written Ruby libraries that were suitable for the job.
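Our version of this is written in Ruby, but the idea fits in a few lines of any scripting language.  The sketch below is a Python illustration only: the addresses, subject format and function name are invented, and it builds the message without sending it.

```python
from email.message import EmailMessage


def broken_build_mail(project, revision, author, recipients, log_tail):
    """Compose (but do not send) a 'build broken' notification."""
    message = EmailMessage()
    message["Subject"] = f"[BUILD BROKEN] {project} at r{revision}"
    message["From"] = "buildmachine@example.com"  # placeholder address
    message["To"] = ", ".join(recipients)
    message.set_content(
        f"The build broke at revision {revision} (last check-in by {author}).\n"
        "Please do not update until it is fixed.\n\n"
        "Last lines of the build log:\n"
        f"{log_tail}\n"
    )
    return message


mail = broken_build_mail(
    project="Super Space Fighter VII",
    revision=1234,
    author="alice",
    recipients=["alice@example.com", "bob@example.com"],
    log_tail="error: 'm_score' was not declared in this scope",
)
print(mail["Subject"])
```

Sending is then a matter of handing the message to smtplib, for example smtplib.SMTP(host).send_message(mail), with the host being whatever mail server the build machine is allowed to use.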

This can easily be extended to send a message to our Yammer service (which we use for inter-team communication already) and that is something that I hope to do in the future.

Tracking Build Statistics

It’s very easy to say “The build is always broken” but it’s also very useful to be able to have hard evidence for the % of broken versus fixed builds.  Again, Cruise Control gives you the ability to track this by storing the build status as local Environment Variables which can be used to track the state.  Along with easy access to the SVN log, we know who broke the build, when and how (was it a check-in or a forced build?).  By hooking these into a simple PHP graph generation tool, we can produce statistics on the broken vs. working ratio and the number of broken builds.
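The numbers behind such a graph are simple to compute once the build history is recorded.  The Python sketch below assumes a made-up record format (one dict per build with ‘ok’, ‘author’ and ‘forced’ fields) and one possible blame rule, where only the check-in that turns a green build red is counted; neither is taken from our actual tools.

```python
from collections import Counter


def summarise(builds):
    """Summarise a chronological list of build records.

    Each record is a dict with 'ok' (bool), 'author' (str) and
    'forced' (bool, True if the build was triggered manually).
    """
    broken = sum(1 for build in builds if not build["ok"])
    breakers = Counter()
    previous_ok = True
    for build in builds:
        # Only blame the check-in that turned a green build red.
        if previous_ok and not build["ok"] and not build["forced"]:
            breakers[build["author"]] += 1
        previous_ok = build["ok"]
    return {
        "total": len(builds),
        "broken": broken,
        "broken_pct": 100.0 * broken / len(builds) if builds else 0.0,
        "breakers": dict(breakers),
    }


# Invented history for illustration.
HISTORY = [
    {"ok": True, "author": "alice", "forced": False},
    {"ok": False, "author": "bob", "forced": False},
    {"ok": False, "author": "alice", "forced": False},  # already red, not blamed
    {"ok": True, "author": "bob", "forced": False},
    {"ok": False, "author": "carol", "forced": True},   # forced build, not blamed
    {"ok": True, "author": "carol", "forced": False},
    {"ok": False, "author": "bob", "forced": False},
    {"ok": True, "author": "bob", "forced": False},
]

stats = summarise(HISTORY)
print(stats)
```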

I would never make the ‘who breaks the build the most’ information public, but it is an excellent tool for finding problems in the development process or if anyone’s way of working simply isn’t… working.

 

Obviously Cruise Control can have its limitations.  Multiple projects, dependencies and configurations can soon add up and no amount of tweaks and omissions can stop this.  Once the feedback loop is redundant due to it simply taking too long, Cruise Control can lose its impact and other methods need to be used to bring it back on track.

For example, in the worst case some projects can take over 4 hours to complete a set of dependent builds.  If you are right at the end (which unfortunately I am!), the amount of information being given to you is dramatically limited.  But we have made changes which mean this delay still takes place, but other information from the CC machine is fed back much quicker meaning it becomes useful once again.

So that’s generally how we use Cruise Control.  If anyone reading this does anything differently then let me know as I’m always looking for ways to improve the tool’s use and make it more beneficial to the team.

In the next part I’ll look at the process of nightly builds and how that can help the rest of the game, since nothing so far has covered the fact that games are more than just code!

 

Blitz Arcade

I’ve gone over what the main benefits of Continuous Integration can be to a project, so it makes sense to move onto the various processes used at Blitz Arcade and how these benefits actually come through.  As I mentioned, CI processes can come in many different shapes and sizes, but most will have the following process in common (even if they don’t use the same software or ideas).

 

Version Control Software

The basic lynchpin of any integration process is solid version control software, and with that I don’t just mean source control software but all the other assets that are used in the build process.  This can include models and textures, scripting files, sound packages or anything else that is needed to build the game.  Without version control software, how do you get the latest copies of all the game assets or the source?  Copy them off a shared network folder?  E-mail them to shared locations?  Most people will know that simply doesn’t work and if you don’t I’m sure you will learn why soon enough.

It might seem obvious and I would expect even one-man development teams to be using something like Subversion for their code base at the very least, especially with the ease that these systems can be set up, but that is not always the case.

At Blitz we use Subversion for all our source control needs as it’s solid, mature, mostly reliable and is totally free.  Some people prefer Perforce, some people are starting to see the merits of Git, some people might even use SourceSafe (but at least they are using something!).  We also use a variety of SVN 3rd party tools, namely TortoiseSVN and AnkhSVN.

For asset control we use our custom BlitzTech asset control system which allows us to associate a large amount of meta data with each asset and to review the whole history of the individual files.  But even though most source control systems are not really set up for binary data you can still hook the process together if you don’t have access to anything else.

By using these tools we can back track to any point in time and see the project as a whole, get the most up-to-date content that is required for the entire game, and review changes across every asset in the game, all with just the click of a button.

CruiseControl.Net

CruiseControl.NET is a free Automated Continuous Integration server that hooks directly into a set of source control repositories (in our case per project Subversion repositories) and automatically kicks off a build process every time a change to the repositories is detected. This will then build the project for a set of platforms and configurations, meaning that on every check-in our example project, Super Space Fighter VII, will be built on Xbox360, PS3 and PC and will usually build all debug, release and master builds.

Since this is always done on a specific Cruise Control machine, we can guarantee that the builds are being done in the same environment with the same set of conditions every time, regardless of who last checked in their work.  So when it builds on one person’s development machine, but not the Cruise Control machine, then there is something the original developer needs to fix before anyone else can get latest.

Cruise Control also provides a variety of feedback mechanisms to help the developers get the most out of the system.  The CruiseControl.net system tray allows simple notifications if a build breaks on a check-in, which then allows the user to see the build output and to find a solution before checking anything else in.  Since this is constantly running in the background, developers are always informed when a build breaks.

So at the most basic level, we have a system that is totally version controlled, meaning we can step back to any point in time of the project’s life cycle, and a project’s code base that is continually built throughout the day, meaning we now know if we even have a project that compiles from the off.  While this is a great start, Cruise Control as it stands still has problems, namely the time it takes to build a project (our example project has 9 different combinations to build) and the fact that broken builds can still be left unchecked if the developers miss the notifications.

In the next part I’ll cover how we have extended Cruise Control to improve the notifications, speed up the whole process and to track the various build stats that are happening throughout the day.

When you integrate a project, it means taking a snap-shot of its current state and building it into a working piece of software.  This can involve building the latest code base, the assets of the game and anything else that you might need.  Continuous Integration (CI) is the process of building a project and its assets consistently throughout the day or at various points of the project’s life cycle.

It also means automating this process as much as possible.

These integration systems can come in all shapes and sizes, but regardless of what you do and how you do it, you should start to see the benefits very quickly and as you continue to develop it those benefits will start to become more and more pronounced.

In the first part of this series I looked at what a ‘standard’ development process would look like when CI wasn’t used.  Before I move onto discussing the various systems used at Blitz Arcade, I wanted to cover what the ideal set of outcomes should be when a good CI system is in place.

Reducing Risk

  • Because the project is being constantly built and integrated to different levels of completeness, defects in the code are found and fixed sooner.  This reduces risk because the majority of the problems are fixed by the time build day comes around, rather than on the day itself.
  • It becomes much easier to create complete builds of the project because they are done more often and usually on a daily basis rather than every two to four weeks.
  • Most developers have their own setups on their machines.  Because the build is done on a specific CI (or build) machine it reduces dependencies on specific hardware setups.

Reducing Repetitive Tasks

  • Human error will always be a problem in manual systems and that is not going to change anytime soon.  Putting the majority of what is usually the same process being run again and again in the hands of an automated system allows developers to concentrate on what they are good at – high-level tasks that involve making the game a better experience.

Generating Usable Software Continually

  • At any point in time you can create a usable build meaning that designers and artists do not have to pester anyone for a new version.  Depending on the type of system being run, a new build should always be available when needed and this will be as up-to-date as possible.

Better Project Visibility 

  • You always have a good idea of the state and stability of the project meaning you know how long it will take to create stable builds, what the current problems are and where the issues lie.
  • It increases confidence in the build and the project because you always have the latest build of the project to hand, not a half built, half up-to-date version on various different machines.
  • This all allows you to make better decisions because you are more confident when you say how long it will take to implement new features, optimize the process or make staffing changes.

In the next part I’ll start to talk about the specific parts of the Blitz Arcade CI process, what we use, how it works together and eventually how it can be made even better.

Continuous Integration, build processes and testing are all very important parts of the development process, whether it’s a AAA game, small arcade title or a set of middleware.  It’s only in the last couple of years that this has really gone mainstream in the industry (it’s sort of swung in with the whole agile thing which has unfortunately led some people to think it can only be part of an agile process – which is thankfully not true) but it’s been used for decades in various forms throughout the IT industry.

For the next few posts I’m going to detail the CI process we use on all of the titles developed in Blitz Arcade.  I’m going to keep each post short so I can actually get around to writing them without having to write it all in one go.

Shaky Foundations

The best place to start is to describe the situation we have before we even start looking at a CI process so we at least know that we have a problem to solve…

So imagine we have a game in development, let’s call it Super Space Fighter VII, that is being developed for LIVE Arcade, PSN and will be released on PC via Steam.  It is being created by 5 programmers, 3 artists and 2 designers and is going to be a simultaneous release on all platforms, so each milestone requires builds on all three platforms, with no scope to leave a build behind for later.  Development kits are expensive and so to keep costs down we have 2 PS3 dev kits (both for the lead PS3 programmer), 3 Xbox 360 kits and the final programmer works solely on the PC.

This means that if one of the X360 programmers changes some code, they have no way of testing (or compiling if time is an issue) the code on the PS3, so the lead PS3 programmer is going to spend a large amount of time making other people’s code work on his platform, especially as the different compilers all have their own quirks.  This is also the case for the PC developer and any changes that are not tested on the 360.

For testing purposes each programmer will have custom code on their machine (usually wrapped in a #ifdef/#endif block) that if not removed could be checked in, and if a build is done on their machine this code could find its way into the finished product.

So come milestone day – usually a Friday – it is someone’s responsibility to bring together everyone’s work for that month, starting by simply making it all compile then testing the game so that it doesn’t contain any serious crash or gameplay bugs.  Since this can be a long process, people have avoided updating their local copies for the past month because they don’t want to be stopped in their tracks, so this is made even more difficult as everyone’s code will simply be out of step.  So to make it easier, we’ll start the build on Wednesday so it isn’t such a rush (but that means we do lose almost 3 solid days of development).  Some people will try to ease this process by doing weekly builds rather than monthly builds, but you still have the same problems, just spread a bit thinner.

This is fine for the programmers but what about the designers or the artists?  They need constantly updated builds so they can test their new features or see how their new assets sit in the game.  But because people don’t want to stop working to create a new build the artists will simply be given whatever build is on the nearest programmer’s machine, or if they need a specific person’s changes, they have to pester them for another build so they can continue work.  So the designers’ and artists’ work is also never tested on a ‘real’ build, which also means it might have to be manually brought together when a new build is produced.

QA are also at a disadvantage.  What is the point of stress-testing a game when it’s not actually the complete product?  This means testing can only kick off at the end of the milestone month, and that build will be out of date before they even start.

So what we end up with here is a process where the ‘real’ game is never really seen until it is really needed, and if there are serious problems in there they are simply not seen until it is too late.  Since everyone is literally using a custom build of the game every day, no-one has a full picture of where the title is, how stable it is and most importantly how fun it is to play.

And this is where Continuous Integration comes into its own…