Over engineered build systems from Hell

While I was at Autodesk years ago, we went through various build systems. When I first started, the build was written in Perl, and dependencies were specified in a .csv file. I had no idea how it worked, nor did I care, since I was interested in other things during those years. The build could routinely take an hour and 45 minutes, and it was mind-numbingly boring since we usually had to do it every day, and if you were unlucky, multiple times per day. Even worse, on the build servers the build would routinely take almost 4 hours!

What a horrible tax.

Later on, a hot-shot programmer came along and rewrote the build system to use Ruby and Rake. It was supposed to be faster, which it kind of was. But the complexity was just as bad, and no one knew Ruby or how Rake worked. Then the developer left the company, leaving behind a black box. That silo of information was gone, and no one knew how the build system worked. It took two developers about a year and a half to learn the build system well enough to work on it at the level the original developer had.

To be sure, there were problems with the build. It still took a long time to build the product: somewhere near an hour. About the only thing we gained was the ability to do incremental builds.

But modifying the build was the main problem. The build, written in Ruby, completely reinvented the wheel in so many different areas. To understand this better, you have to understand that the product at the time was built using Microsoft tools because it was based solely on the Microsoft platform. Thus the source project files were in a build language that Microsoft created. That build language was built into Visual Studio and was called MSBuild. So instead of using Microsoft tools to create the build, Ruby and Rake were used. Instead of using Microsoft tools to parse the XML project files, a Ruby gem was used. Instead of using anything from Microsoft to help with the build, everything was reinvented from scratch. Parsing Visual Studio .vcproj (and eventually .vcxproj) files was done tediously, laboriously and mind-numbingly using Rake and some XML gem. Talk about recreating the wheel! So much code was written to duplicate what a simple call to a Microsoft function could have done: retrieve a fully instantiated object with all its properties intact.

Copying files to the build directory was another disaster. It would take around 10 to 12 minutes to copy 7,000 to 14,000 files. (It was originally somewhere near 7,000 files, but grew over time.) It was all written in Ruby code that no one knew how to debug except by putting print statements in.

Another problem was adding build properties. If you wanted to add a build property (a key-value pair), you had to add it in multiple places in the Ruby build, knowing exactly what to modify (in duplicate). It was horrible.

Mixing Ruby with MSBuild was like mixing iron and clay. They don’t mix well at all. It was like a Ruby straitjacket that hindered the build and Visual Studio, upon which it was based.

There had to be a better way.

Eventually, when the frustrations with the build boiled over, I learned MSBuild and figured out how to build max without Ruby. From when I first got a working prototype, it took over a year to get something into the main branch of max, simply due to bureaucratic inertia. There are lots of people in positions of power who simply say no before learning about a subject, which was all too common there. The first liberation was freeing the list of things to build from Ruby. Yes, the list of DLLs and EXEs to build was specified in some arcane Ruby syntax somewhere. The first step was moving that list to a democratic XML file. Now any build language could parse it and find out what to build. The second step was moving the list of files to copy to an XML file. Now any build system could know what files to copy as well.

Once those two things were in place, it was time to put in the build system that I originally came up with during a particular Christmas break.

It was written in pure XML, with one MSBuild extension that was written in C#. All the tools were native to Visual Studio, and what you built on the command line was what was built in Visual Studio. They both used the same properties (using native property sheets) and built in the same way.

What’s more, I found that using native MSBuild tools to copy those 7,000+ files was incredibly fast. In fact, while once debugging through the old Ruby code responsible for copying, I found out why the copy task took 10 minutes. It was using an N factorial algorithm! Given directory A with subdirectories B through Z, it would iterate through all the directories N! times. Each directory was parsed not once but N! times, where N is the number of subdirectories that existed. It was an unholy mess that proves that reinventing the wheel is usually a disaster waiting to happen. Now back to the improvement: with the new MSBuild copy mechanism it took 20 seconds to copy all those files. 20 seconds versus 10 minutes was a big improvement.

Incremental builds also greatly improved. Go ahead and build your product from scratch. Now don’t change a thing and rebuild. If you have a smart build system, it should take just a few seconds and nothing will happen. The build system will be smart enough to essentially report that nothing changed and therefore it did no work. My new build did just that in a matter of seconds. The old build system used to take about 5 minutes to do that (and it still did work anyway…).

Speaking of performance: the time to compile and link the actual files didn’t change much, because that was always in Visual Studio’s corner and not Ruby’s. The improvement in performance came from the copying actions that now took 20 seconds. Also noticeable was the shorter time from when the build started to when the first CPP file was getting compiled. In Ruby/Rake it took quite a few minutes. In the new build it took a few seconds. Eventually, when I got a new SSD, I was able to get the build down to 22 minutes on my machine.

Better yet was the removal of duplication of anything. Everything was native and iron was mixed with iron and clay was mixed with clay. Sort of….

One developer (whom we called Doctor No, because he said no to everything good), holding on to the fantasy that max would be multi-platform someday, would not let Ruby go. So there were in essence two build systems that could do the same thing. In fact, he wanted an option to invoke one build system from the other! So I had to put in an option to invoke MSBuild from Ruby/Rake! This was hobbling MSBuild with an old clunker, kind of like buying a new car and towing the old one around everywhere you go. Yes, extremely stupid and frustrating.

Which goes to show that old ways of thinking die hard, or don’t die at all.

The build at Century Software

Later on I moved to Century Software, a company local to where I live. That was such a fun place. Anyway, their build system for their Windows product was written in Make! Yes, Make, the original, ancient build system that you can’t even find documentation for anymore. I mean literally: I found (I think) one page of documentation somewhere in some professor’s lecture notes. The docs were horrible. The Make implemented here was so old it built one C file at a time. No multi-threading, no parallel builds, nothing. Slow was the operative word here. That, and a build output so verbose it was almost impossible to comprehend. The only good thing about it was that it stopped immediately on the first error.

So eventually I rebuilt that using MSBuild too. It took me a few months in my spare time. No bureaucratic inertia, no one telling me no. I just worked on it in my spare time and eventually I had a complete and fully functioning system for the tinyterm product. This build was the best I’ve ever done, with zero duplication, extremely small project files and a build that was very fast. It went from 45 minutes to a minute and a half.

When writing a product, the build system should be done using the tools that the platform provides. There should be no transmogrifying of the code or the build scripts (as OpenSSL does) before doing the build. When writing Ruby on Rails, use Rake for your build processes. When targeting Microsoft platforms, use MSBuild. When using Java, use Maven. Data that must be shared between build platforms should be in XML so that anything can parse it. And most important of all, distrust must go, and developers and managers must have an open mind to new things. Otherwise the development processes will take so long, be so costly, and exert such a tax that new features will suffer, bugs will not get fixed, and customers will not be served and will take their money elsewhere.

Solving linker error ‘LNK2022: metadata operation failed’

If you are compiling managed C++ projects and have ever encountered this idiotic linker error:

1>xxx.obj : error LNK2022: metadata operation failed (8013118D) : Inconsistent layout information in duplicated types (_PROPSHEETPAGEW): (0x020001f8).

1>xxx.obj : error LNK2022: metadata operation failed (8013118D) : Inconsistent layout information in duplicated types (_PROPSHEETPAGEA): (0x02000206).

then this post is for you.

For some reason, this linker error always comes in pairs in our code. No matter; in our code base it usually manifests itself after rearranging the header includes in a file that is compiled with the /clr compiler switch.

The problem is caused by windows.h being included after other header files:

#pragma unmanaged
#include "file.h"
#include "otherfile.h"
#include <windows.h>

#pragma managed
….

The solution is to make sure that windows.h is included before anything else.

#pragma unmanaged
#include <windows.h>
#include "file.h"
#include "otherfile.h"

#pragma managed

….

Formatting Error in Process Explorer

The other day I was trying to reproduce a bug with 3dsmax involving altered regional settings. Many of our customers, of course, run max in countries where it is common to use the comma as the decimal symbol. This, for instance, is common practice in Quebec, where the majority of people speak French. It seems that some people have been having problems with max crashing on them when their O.S. regional settings are set to English but use a comma instead of a period as the decimal symbol. So in attempting to reproduce this bug, I set my regional settings to use the comma. I didn’t reproduce any crashes with 3dsmax, at least not with our current production version. The change did, however, have some strange unintended fallout with Process Explorer from Sysinternals.

What happened is that all the numbers that Process Explorer reported were either truncated or just plain cut off. So, for instance, instead of reporting 12 Gigabytes of virtual memory for SQLServer.exe, it would simply report 2K. Obviously something was not right there.

Here are some links showing screenshots of Process Explorer in its normal and abnormal states.

Unfortunately, I had been installing and messing with a bunch of other debugging tools (Don’t you love playing with new tools?), and thought the problem was with them. Such was not the case. By reverting my regional settings to use a period for the decimal symbol, everything worked again.
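What Process Explorer actually does internally is not known here. Purely as an illustration of how a decimal-symbol mismatch can chop numbers in general, the sketch below formats a value with a comma as the decimal symbol and then parses it with code that expects a period. The CommaDecimal facet and both function names are hypothetical stand-ins for a real regional setting.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet that uses ',' as the decimal symbol, standing in for a
// regional setting such as the one described above.
class CommaDecimal : public std::numpunct<char>
{
protected:
    char do_decimal_point() const override { return ','; }
};

// Formats a value the way a comma-decimal locale would.
std::string format_with_comma(double value)
{
    std::ostringstream out;
    // The locale takes ownership of the facet.
    out.imbue(std::locale(std::locale::classic(), new CommaDecimal));
    out << value;
    return out.str();
}

// Parses text assuming '.' is the decimal symbol (the classic "C" locale).
// Parsing stops at the first character that cannot be part of a number.
double parse_with_period(const std::string& text)
{
    std::istringstream in(text);
    in.imbue(std::locale::classic());
    double value = 0.0;
    in >> value;
    return value;
}
```

Here format_with_comma(12.5) produces "12,5", and parse_with_period("12,5") returns 12 because parsing stops at the comma. Any code that formats with one convention and parses with the other silently loses part of the number.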

How to Write Native C++ Debugger Visualizers in Visual Studio for Complicated Types

Introduction

This post explains how to change the way the Visual Studio native debugger displays data for very complicated types. It explains how to change this:

debugger_before

to this:

after

In other words, it greatly simplifies displaying native code in the debugger window.

This is not a tutorial on how to modify or customize autoexp.dat in general. This is a very specific tutorial on how to customize the debugger variable display for types that are very complicated and involve deep aggregation or inheritance.

Problem

Last year, I was trying to debug a particular problem and found it difficult to view a particular data type in the Visual Studio debugger. This particular data type was way over-engineered and complicated just for the sake of being complicated. It was a class that simply held a string and did path operations on the string. But it held the string in a complicated morass of class hierarchies that made displaying the actual string very difficult in the Visual Studio native debugger. So I embarked on a task to display the data I wanted in the debugger windows. This is possible by modifying the file:

"C:\Program Files (x86)\Microsoft Visual Studio 9.0\Common7\Packages\Debugger\autoexp.dat"

I have created a demo class hierarchy that is needlessly complex and demonstrates the problem that you have to drill down deep in the VS debugger to see the data. I also included code to run it:

  // NativeVisualizerTest.cpp : Defines the entry point for the console application.
  //
  #include "stdafx.h"
  #include <tchar.h>  // TCHAR, _T and _tmain
  #include <string>
  #include <memory>   // for std::auto_ptr

  // The inner string, taken from std::string
  typedef std::basic_string<TCHAR> StupidString;

  // Another string that makes debugging that much harder
  class SpecialString : public StupidString
  {
  public:
      SpecialString() {}
      SpecialString(const TCHAR* s) : StupidString(s) {}
      ~SpecialString() {}
  };

  // Another class that wraps the SpecialString
  class PathPrivateInternal
  {
  public:
      PathPrivateInternal() {}
      PathPrivateInternal(const TCHAR* s) : mString(s) {}
      ~PathPrivateInternal() {}
  private:
      SpecialString mString;
  };

  // The final class that holds a PathPrivateInternal
  // Visualizing this crazy class in the debugger will NOT be fun!
  class Path
  {
  public:
      Path() : mPtr(new PathPrivateInternal()) {}
      Path(const TCHAR* s) : mPtr(new PathPrivateInternal(s)) {}
      ~Path() {}
  private:
      std::auto_ptr<PathPrivateInternal> mPtr;
  };

  int _tmain(int argc, _TCHAR* argv[])
  {
      StupidString        str(_T("I am a foo bar"));
      SpecialString       strB(_T("I am a foo bar"));
      PathPrivateInternal ppi(_T("I am a foo bar"));
      Path                p(_T("I am a foo bar"));
      return 0;
  }

So in this example, I have three concrete classes, a smart pointer (i.e. std::auto_ptr), and a typedef that have to be deciphered in order to display the string I really want to see. This is really a yucky mess, all for a class that simply holds a string and performs path-like operations on it.

Demonstration

If I put a breakpoint on the return 0; line at the end of _tmain and add p to the watch window, I will see this in the debugger:

debugger_before

Notice how many subtrees I had to open in the debugger in order to see the string? !! Yuck !!

Unraveling the Gordian knot

The trick to displaying this in the debugger is to write a visualizer for each class in this hierarchy. You start from the inner class and work towards the outer class. In this example the inner class is std::basic_string<TCHAR> and the outer class is Path.

The following list shows the order we need to write visualizers:

Type

StupidString
SpecialString
PathPrivateInternal
std::auto_ptr
Path

StupidString

In the case of StupidString we are in luck. autoexp.dat already contains a visualizer for std::basic_string that displays the data nicely.

image

But the other three types do not display helpful information in the preview window.

SpecialString

Most of the work to visualize all the data types takes place here in SpecialString. The class SpecialString isn’t a mere typedef, so there is a little more work to get it to display in the debugger. The trick is to treat it like a std::basic_string. So find the section in autoexp.dat that displays std::basic_string and plagiarize that code like crazy. Except don’t copy the children section.

Here is the ANSI string version of std::basic_string:

  ;------------------------------------------------------------------------------
  ;  std::string/basic_string
  ;------------------------------------------------------------------------------
  std::basic_string<char,*>{
      preview       ( #if(($e._Myres) < ($e._BUF_SIZE)) ( [$e._Bx._Buf,s]) #else ( [$e._Bx._Ptr,s]))
      stringview    ( #if(($e._Myres) < ($e._BUF_SIZE)) ( [$e._Bx._Buf,sb]) #else ( [$e._Bx._Ptr,sb]))
      children
      (
          #if(($e._Myres) < ($e._BUF_SIZE))
          (
              #([actual members]: [$e,!] , #array( expr: $e._Bx._Buf[$i], size: $e._Mysize))
          )
          #else
          (
              #([actual members]: [$e,!],  #array( expr: $e._Bx._Ptr[$i], size: $e._Mysize))
          )
      )
  }

In this code, all that is needed is the "preview" line. Now there is a nice gotcha here. With this visualizer, the program has to be compiled without Unicode in order to display SpecialString properly. That is because the visualizer for std::basic_string plagiarized from above is for <char> types. If I want to compile for Unicode, I need to plagiarize the following code instead (some stuff omitted for clarity’s sake):

  std::basic_string<unsigned short,*>|std::basic_string<wchar_t,*>{
      preview
      (
          #if(($e._Myres) < ($e._BUF_SIZE)) ( [$e._Bx._Buf,su] )
          #else ( [$e._Bx._Ptr,su] )
      )
      stringview
      (
          #if(($e._Myres) < ($e._BUF_SIZE)) ( [$e._Bx._Buf,sub] )
          #else ( [$e._Bx._Ptr,sub] )
      )
  }

The stringview part allows you to display your data in the Text Visualizer dialog. Copy that to the SpecialString visualizer too:

  SpecialString{
      preview       ( #if(($e._Myres) < ($e._BUF_SIZE)) ( [$e._Bx._Buf,s]) #else ( [$e._Bx._Ptr,s]))
      stringview    ( #if(($e._Myres) < ($e._BUF_SIZE)) ( [$e._Bx._Buf,sb]) #else ( [$e._Bx._Ptr,sb]))
  }

image

Now the debugger properly displays the string in the watch windows…

image

Notice now that the last two classes PathPrivateInternal and Path both now show some useful information in their preview windows. This is good, but not the final solution.

PathPrivateInternal

Visualizing class PathPrivateInternal takes just a little work.

  PathPrivateInternal{
      preview (
          [$e.mString]
      )
  }

Notice I only had to specify the member variable mString. There was no need to specify a type to render the data as, since the debugger already knows that mString is of type SpecialString, and it already knows how to render that. Here are the results:

debugger_middle

Path

The last part is to specify the visualizer for class Path.

  Path{
      preview (
          [$e.mPtr]
      )
  }

Which displays this:

image

However, notice that it rendered the text as auto_ptr "I am a foo bar". This is because of the visualizer for auto_ptr, which is also included in autoexp.dat:

  ;------------------------------------------------------------------------------
  ;  std::auto_ptr
  ;------------------------------------------------------------------------------
  std::auto_ptr<*>{
      preview
      (
          #(
              "auto_ptr ",
              (*(($T1 *)$e._Myptr))
          )
      )
      children
      (
          #(
              ptr: (*(($T1 *)$e._Myptr))
          )
      )
  }

Notice the hard-coded string in the preview section. Also notice that the template type in the auto_ptr declaration is a star (*), meaning this visualizer is used for every type that auto_ptr holds. There are a few things that need to be done to fix this. First, simply specialize the visualizer for when auto_ptr is holding a PathPrivateInternal.

  std::auto_ptr<PathPrivateInternal>{
      preview
      (
          [$e._Myptr]
      )
  }

Now, in the autoexp.dat file there are three sections, listed in order:

  1. [AutoExpand]
  2. [Visualizer]
  3. [hresult]

If the custom visualizer for std::auto_ptr<PathPrivateInternal> is placed after std::auto_ptr<*>, the custom visualizer will be ignored. This is because the debugger uses the first visualizer it finds in the autoexp.dat file that satisfies the type criteria, and the star (*) template type unfortunately accepts all types, including std::auto_ptr<PathPrivateInternal>. Therefore put all custom visualizers at the beginning of the [Visualizer] section, NOT the end. After doing that, you will get the results shown below:

image

Here is another view with the Path type expanded:

image
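The ordering rule for std::auto_ptr<*> versus std::auto_ptr<PathPrivateInternal> can be modelled with a toy first-match lookup. This is not the debugger's actual matching code, just a sketch of "first rule that accepts the type wins"; Rule, matches and first_match are made-up names, and the wildcard handling only covers a trailing <*>.

```cpp
#include <string>
#include <vector>

// Hypothetical model of one visualizer entry: a type pattern and a label.
struct Rule
{
    std::string pattern;
    std::string name;
};

// A pattern ending in "<*>" accepts any type that starts with the text up to
// and including '<'. Any other pattern must equal the type exactly.
bool matches(const std::string& pattern, const std::string& type)
{
    const std::string wildcard = "<*>";
    if (pattern.size() >= wildcard.size() &&
        pattern.compare(pattern.size() - wildcard.size(), wildcard.size(), wildcard) == 0)
    {
        const std::string prefix = pattern.substr(0, pattern.size() - 2); // keep '<'
        return type.compare(0, prefix.size(), prefix) == 0;
    }
    return pattern == type;
}

// Returns the name of the first rule that accepts the type, or an empty
// string when none does. Later rules are never consulted after a match.
std::string first_match(const std::vector<Rule>& rules, const std::string& type)
{
    for (const Rule& rule : rules)
    {
        if (matches(rule.pattern, type))
        {
            return rule.name;
        }
    }
    return std::string();
}
```

With the generic std::auto_ptr<*> rule listed first, a lookup for std::auto_ptr<PathPrivateInternal> never reaches the specialized rule. Listing the specialized rule first lets it win while every other auto_ptr type still falls through to the generic one.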

Summary

Here is the final listing of all the visualizers:

  ;------------------------------------------------------------------------------
  ;  My Custom Types
  ;------------------------------------------------------------------------------
  SpecialString{
      preview       ( #if(($e._Myres) < ($e._BUF_SIZE)) ( [$e._Bx._Buf,s]) #else ( [$e._Bx._Ptr,s]))
      stringview    ( #if(($e._Myres) < ($e._BUF_SIZE)) ( [$e._Bx._Buf,sb]) #else ( [$e._Bx._Ptr,sb]))
  }
  PathPrivateInternal{
      preview (
          [$e.mString]
      )
  }
  std::auto_ptr<PathPrivateInternal>{
      preview
      (
          [$e._Myptr]
      )
  }
  Path{
      preview (
          [$e.mPtr]
      )
  }

And remember these main points:

  • Start with inner nested types and work your way outwards.
  • Copy from STL types when needed.
  • Put all custom visualizers at the beginning of the [Visualizer] section in autoexp.dat
  • Use stringview to display text in the Text Visualizer if needed.
  • Use specialized template types to display special cases if needed.