Performance

Writing fast code

Driver hell

Recently I've been struggling with the curse of computing: the PC that's running too slow. Once again I chided myself for not spending the time to figure out what was actually wrong so I could fix my installation of Windows, and ordered a nice new large hard disk. A few days later it arrived and was sitting in my PC after a less than professional installation, caused by Dell not providing enough hard disk mounting points for my particular needs, and I set about thinking about how I wanted to partition the disk and what operating systems to put...

The wonders of DDS textures

I was fiddling with some of my 3D code, trying to speed up the loading, and I've hit one of those classic "change the overall way you do something instead of optimising your code" situations. The code was simple enough before: it loaded 25 1024x1024 textures and displayed them. The original data was in GIF format, so I loaded that into a .NET Bitmap and then created a Direct3D Texture object from that for each one. It used about 64MB of video memory and took about 1600ms per texture to load, so I was pretty sure that I could cut down...
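The excerpt does not say which pixel format the DDS files ended up using, so the following is only a rough sketch of the arithmetic involved. It assumes DXT1 block compression (4 bits per pixel) against plain 32-bit pixels, and it ignores mipmaps; the `TextureMemory` class and `BytesFor` helper are made-up names for illustration.

```csharp
using System;

// Hypothetical helper, not from the original post.
public static class TextureMemory
{
    // Bytes needed for the top mip level of one texture at the given bits per pixel.
    public static long BytesFor(int width, int height, int bitsPerPixel)
    {
        return (long)width * height * bitsPerPixel / 8;
    }

    public static void Main()
    {
        const int count = 25; // number of textures mentioned in the post

        // Assumption: uncompressed 32-bit pixels versus DXT1 at 4 bits per pixel.
        long uncompressed = count * BytesFor(1024, 1024, 32);
        long dxt1 = count * BytesFor(1024, 1024, 4);

        Console.WriteLine("32-bit: {0} bytes", uncompressed);
        Console.WriteLine("DXT1:   {0} bytes", dxt1);
    }
}
```

Under those assumptions the 25 textures come to 100MB uncompressed and 12.5MB as DXT1. The post's figure of about 64MB implies a different in-memory format, which the excerpt does not state.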

What's new in the .NET profiling API for version 2

There's an interesting article in MSDN Magazine about what's changed with the profiling API for version 2.0 of .NET. The quick summary for anybody maintaining code that uses the old version is that in-process debugging has gone and been replaced with lots of other ways of doing what you were doing before. Time to play with the changes.

Battlestar Galactica Part 2

OK, still no real content, but I'll comment on this week's Battlestar Galactica instead: the ending will drive at least one of my friends mad, and this is a good thing. This show really isn't trying to be a normal TV show. If they don't decide to make season 2 I'll be disappointed. In other news, I got 7 synchronised video channels working today. I could have got more, but I only had 2 monitors on a bottom of the range Dell PC and I ran out of space to make it a fair test with all the windows fully visible....

Comparing the performance of Managed DX with C#

I've been doing a bit of work on the performance of C# and how it compares to C++, so I'm looking at Managed DirectX as a good platform for doing comparisons. One of the first things I did was attack the DX assemblies with Reflector so that I could see how they were implemented, and this threw up the first potential problem: ensuring that I'm comparing like with like. Consider the code for Vector3.Normalize():

    public void Normalize()
    {
        volatile pinned ref Vector3 local1 = this;
        .D3DXVec3Normalize(local1, local1);
    }

The first thing you notice is that it calls through to the D3DX library, which is of...

More on the benefits of inlining

My previous posts on inlining need some numbers and code to back them up so... Take the following code:

    void Run()
    {
        System::Diagnostics::Debugger::Break(); // Break so we can see the x86 assembly
        int i = Increment1();
        i = Increment2();
        System::Windows::MessageBox::Show(i.ToString()); // Stop the optimiser removing the code
    }

    public int Increment1()
    {
        int p = pos;
        p++;
        if(p > 10)
        {
            p = 0;
        }
        pos = p;
        return pos;
    }

    public int Increment2()
    {
        pos++;
        if(pos > 10)
        {
            pos = 0;
        }
        return pos;
    }

This code demonstrates what I said in a previous post about accessing member variables in an if statement preventing the optimiser from inlining your...
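As a sanity check on the two variants above, here is a minimal sketch that confirms they produce the same values, so any speed difference between them should come down to what the JIT does with each one. The `Counter` and `InliningDemo` names and the `SameSequence` helper are assumptions for illustration; only the two Increment bodies come from the post. It does not measure inlining itself.

```csharp
using System;

// Hypothetical wrapper class; only the two Increment bodies come from the post.
public class Counter
{
    private int pos;

    // Copies the field into a local, works on the local, then writes it back.
    public int Increment1()
    {
        int p = pos;
        p++;
        if (p > 10)
        {
            p = 0;
        }
        pos = p;
        return pos;
    }

    // Reads and writes the field directly, including inside the if statement.
    public int Increment2()
    {
        pos++;
        if (pos > 10)
        {
            pos = 0;
        }
        return pos;
    }
}

public static class InliningDemo
{
    // Returns true when both variants yield the same value at every step.
    public static bool SameSequence(int steps)
    {
        Counter a = new Counter();
        Counter b = new Counter();
        for (int i = 0; i < steps; i++)
        {
            if (a.Increment1() != b.Increment2())
            {
                return false;
            }
        }
        return true;
    }

    public static void Main()
    {
        Console.WriteLine(SameSequence(100)); // prints "True"
    }
}
```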

More on inlining

In a comment to my last post Steve asks what the benefits of inlining code are, and whether it makes a difference. The answer, of course, is that it can speed up your code, but by how much depends on the situation. The code that I was examining was called hundreds of thousands of times per second, so the small overhead of calling into a function was exaggerated to a point where inlining the code gained a saving that could be measured in seconds per minute. The real answer to the question is no, 99% of the time you don't care about the...

Big important gotcha about inlining under the CLR

Right, new blog means being more disciplined about blogging. Expect much more technical detail from now on. I've been experimenting with some code written in C# in order to test that an architecture will be fast enough for a new application, and as part of trying to understand what's going on I've been looking into how the JITer inlines code. The rules are, apparently, quite simple: simple flow control and only 38 bytes of IL code. What this all means is that nice short functions with simple if statements get inlined. Or should do. From what I've observed...
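Since the size rule is stated in bytes of IL, one way to see where a method stands is to ask reflection for its IL body. This is only a sketch: the `IlSize` class, its `Increment` method and the `ILByteCount` helper are invented for illustration, and the count you get depends on the compiler and on debug versus release settings. It says nothing about whether the JIT actually inlines the method.

```csharp
using System;
using System.Reflection;

// Hypothetical example class, not from the original post.
public static class IlSize
{
    private static int pos;

    // A short method with simple flow control, similar in shape to the post's examples.
    public static int Increment()
    {
        pos++;
        if (pos > 10)
        {
            pos = 0;
        }
        return pos;
    }

    // Returns the number of IL bytes in the method body, or 0 if it has no body.
    public static int ILByteCount(MethodInfo method)
    {
        MethodBody body = method.GetMethodBody();
        return body == null ? 0 : body.GetILAsByteArray().Length;
    }

    public static void Main()
    {
        MethodInfo m = typeof(IlSize).GetMethod("Increment");
        Console.WriteLine("{0}: {1} bytes of IL", m.Name, ILByteCount(m));
    }
}
```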