Writing Stable 3ds Max Plugins

I found this document while looking through old files today, and thought I’d share it. It was from a lecture I gave at Autodesk University back in 2012. It applies to 3ds max, but has some points that would be applicable to software development in general.

[Note] I wrote this a long time ago, and today I saw this blog post again. I thought in fairness, I should add a few things. These were the guidelines I came up with based on years of experience, including years of fixing bugs in 3dsmax. While I firmly believe in every single one of them, unfortunately, hardly any of these things ever entered the thoughts of most 3dsmax developers. Most coded all day long blissfully unaware of warning level 4, and only two people ever showed an interest in static analysis. In fact management was, most of the time, completely unsympathetic to these ideas. Even the development managers who used to be programmers simply didn’t care. They just wanted bugs fixed fast. No thought or interest was given to systematically fixing the fence at the top of the cliff. All thoughts were to get ambulances to the dead bodies at the bottom of the cliff as fast as possible. As a result the fences at the top were always full of holes. By the time I left Autodesk in the spring of 2014, only a dozen or so projects compiled at warning level 4. And no systematic static analysis was being done by anyone. I could go on, but that’s a thought for another blog post.

Introduction

Preventing crashes and increasing stability in software is a difficult task. There is no practice nor set of practices that will completely prevent all crashes. However there are a few things that will help reduce errors. With that short introduction let us get started.

Basic Responsibilities

These are basic practices that would apply no matter where you worked and no matter which product you worked on.

Compile at warning level 4

You should compile your plugins at warning level 4. Level 4 warnings can help point out subtle programming errors that can lead to bugs that can be absurdly difficult to discover later on.  This is a free and practically instantaneous way to find bugs early in the development process.

Code compiled at level 4 is better than code compiled at anything less. Level 4 warnings should be turned on, and no warnings should be suppressed.

The 3ds Max SDK compiles cleanly at warning level 4, and has been that way for at least 3 years now.

Case in Point:

We turned on warning level 4 for an old project recently. One level 4 warning pointed to some unreachable code. This was caused by a break statement that had been left in a loop. This problem eventually resulted in a complete feature not working.
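To make that concrete, here is a small sketch of the kind of mistake involved. It is a made-up reconstruction, not the actual 3ds Max code; the function names are hypothetical. With Visual C++ at /W4, the statement after the stray break is the sort of thing reported as unreachable code (warning C4702).

```cpp
#include <cassert>
#include <vector>

// Hypothetical example: count the enabled items in a list.
// Buggy version: a leftover 'break' exits the loop on the first pass,
// so the code after it can never run and the count is always zero.
int CountEnabledBuggy(const std::vector<bool>& enabled)
{
    int count = 0;
    for (bool isEnabled : enabled)
    {
        break;              // left in by mistake
        if (isEnabled)      // unreachable code
            ++count;
    }
    return count;
}

// Fixed version: the stray 'break' is removed.
int CountEnabled(const std::vector<bool>& enabled)
{
    int count = 0;
    for (bool isEnabled : enabled)
    {
        if (isEnabled)
            ++count;
    }
    return count;
}
```

The buggy version compiles and runs without complaint at lower warning levels. It just quietly does nothing, which is exactly how a whole feature can stop working.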

Compile with Static Analysis

The highest edition of Visual Studio comes with a static analyzer called Code Analysis. This feature can be turned on for native or managed code in Visual Studio. Static analysis scrutinizes the code deeply and can help spot bugs. These bugs are more complex than what level 3 or 4 warnings can find. But the problems it reports are usually so fundamental that they can be likened to level 1 or 2 warnings.

Case in Point:

The static analyzer can detect allocation/de-allocation mismatches. For instance we turned it on and found places where memory was allocated with new[] but was de-allocated with delete, instead of delete[]. We found lots of these scattered throughout our large source code base. The advantage of this is that it is so easy to detect. Without static analysis it would take a special tool like BoundsChecker to reveal a memory allocation mismatch, and that would only be after exhaustive testing.
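A minimal sketch of the mismatch and two ways to avoid it. The function names are hypothetical and the mismatched line is left as a comment, since actually running it would be undefined behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Sums 0..count-1 using a manually managed array.
int SumWithRawArray(std::size_t count)
{
    int* values = new int[count];
    int sum = 0;
    for (std::size_t i = 0; i < count; ++i)
    {
        values[i] = static_cast<int>(i);
        sum += values[i];
    }
    // delete values;    // WRONG: scalar delete on memory from new[]
    delete[] values;     // correct: array delete matches new[]
    return sum;
}

// Same work, but the array is owned by a smart pointer that always
// calls delete[] for us, so the mismatch cannot be written at all.
int SumWithOwnedArray(std::size_t count)
{
    std::unique_ptr<int[]> values(new int[count]);
    int sum = 0;
    for (std::size_t i = 0; i < count; ++i)
    {
        values[i] = static_cast<int>(i);
        sum += values[i];
    }
    return sum;
}
```

The second form is the one I would reach for in new code, because there is no delete left to get wrong.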

Check Pointers for NULL

By far the most common fix I have seen for known crashes in 3dsmax is to check a pointer for NULL. This is the most persistent problem that I have ever seen in C/C++ code. Get into the habit now of checking every pointer for NULL before using it. A corollary to this is to initialize all pointers to NULL before and after they are used.

Case in Point:

The visual studio static analysis tool emits various warnings for possible dereferencing of null pointers. Consequently I have rarely seen this problem in code that compiles at level 4 with static analysis.
For the Rampage release, the 4th highest crash in 64 bit max was a crash using Ray Traced Shadows. The shadow code contained a buffer of floating point values that was uninitialized. It was extremely difficult to track down, as it only manifested when the debugger was NOT attached.
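Here is a minimal sketch of both habits: checking before use, and resetting the pointer once the object is gone. The Item type and the function names are hypothetical, and I use nullptr, which is the modern C++ spelling of NULL.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical type standing in for any object reached through a pointer.
struct Item
{
    std::string name;
};

// Checks the pointer before using it and fails safely if it is null.
std::size_t NameLength(const Item* item)
{
    if (item == nullptr)
        return 0;
    return item->name.size();
}

// Deletes the object and resets the caller's pointer, so that a later
// null check catches the stale pointer instead of dereferencing it.
void DestroyItem(Item*& item)
{
    delete item;      // deleting a null pointer is a harmless no-op
    item = nullptr;
}
```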

Check before casting

If you lie to the compiler, your application will come back and bite you. C is a language that is seemingly built on casts, where anything can seemingly be cast to anything else. This ability to so easily lie to the compiler and misrepresent your types to the compiler is dangerous and risky. Therefore prefer to use C++ style casts. By turning on RTTI and using C++ style casts, the results of the cast can be checked for validity.

Case in Point:

In the SDK header file imtl.h is a class called MtlBase which has 4 derived classes. One of those classes is class Mtl. I found functions in MtlBase that were blindly assuming the instance (i.e. this) was an instance of class Mtl. However this ignored the fact that there were 3 other derived classes from MtlBase. Thus it was casting the ‘this’ pointer to class Mtl, and then doing work on that miscast pointer.
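A checked cast for that situation could look roughly like this. The class names below are simplified stand-ins that I made up for illustration; they are not the SDK classes, and RTTI must be enabled for dynamic_cast to work.

```cpp
#include <cassert>

// Hypothetical hierarchy: one base class with more than one derived class.
struct MaterialBase
{
    virtual ~MaterialBase() = default;   // polymorphic base, required for dynamic_cast
};

struct Material : MaterialBase
{
    int NumSubMaterials() const { return 2; }
};

struct Texture : MaterialBase
{
};

// Asks for sub-materials only if the object really is a Material.
// A C-style cast here would "succeed" for a Texture too and then
// operate on a miscast pointer.
int SubMaterialCount(const MaterialBase* base)
{
    const Material* mtl = dynamic_cast<const Material*>(base);
    if (mtl == nullptr)
        return 0;                        // not a Material (or null): nothing to count
    return mtl->NumSubMaterials();
}
```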

Avoid stack based strings

A very common way to crash the application is over-reliance on stack based C strings. This code for instance is very dangerous:

void foo() {
    TCHAR buf[SIZE];   // fixed-size buffer that lives on the stack
    // ... code that writes into buf ...
}

One of the problems with stack based strings is operating on a string that is bigger than the buffer. This of course can corrupt the callstack. This is almost impossible to debug afterwards and usually makes reading minidump crash files an exercise in frustration. The danger can be minimized by using the newer safe string functions. For instance instead of using strcat, which can easily run over the end of a string, you can use strcat_s which is a safer version.

When possible use TSTR or MSTR instead, where the actual string buffer is stored on the heap, and not the stack. Then if anything does go wrong, it will not corrupt the callstack.
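As a portable illustration of the same idea, the sketch below uses std::string as a stand-in for TSTR/MSTR (that substitution is my assumption, made so the example builds without the SDK). The function name is hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Concatenates any number of names into one string.
// The string object tracks its own length and grows as needed, so the
// loop cannot run past the end of a fixed-size buffer no matter how
// many names are passed in.
std::string JoinNames(const std::vector<std::string>& names)
{
    std::string result;
    for (const std::string& name : names)
    {
        if (!result.empty())
            result += ", ";
        result += name;
    }
    return result;
}
```

Compare this with concatenating into a TCHAR buf[SIZE] inside a loop: there the correctness of the code depends on the loop never iterating more times than the buffer can hold.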

Now a disclaimer: Max has a lot of stack based strings all over the place (it has been around a long time of course). But their usage is getting reduced as we now favor TSTR or MSTR.

Case in Point:

The code for the customization dialog contained a for loop that concatenated a string into a stack based buffer of limited size. The for loop iterated too many times and the buffer overflowed, corrupting other items on the stack. That stack based buffer was several frames up the stack. When that stack frame was cleaned up, it crashed. Diagnosing the problem was difficult since the symptom was several function calls away from the source of the problem.

Avoid using catch(…)

If at all possible avoid using catch(…). Prefer to handle more specific exceptions, such as catching an out of memory exception (std::bad_alloc). While using catch(…) may prevent the application from crashing, it can also hide bugs and make it more difficult to solve crashes. It is useful for debugging to actually remove a catch(…) and let the program crash exactly where the cause of the crash is located. You should generally catch only those errors that you can handle, and let the ones that you cannot handle pass through so that the larger system can handle them if possible, or crash in the “correct” place rather than delay it.
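A minimal sketch of catching only what you can handle. LoadTexture and its failure modes are hypothetical and exist only to simulate the two kinds of error.

```cpp
#include <cassert>
#include <new>
#include <stdexcept>

enum class LoadResult { Ok, OutOfMemory };

// Hypothetical worker that can fail in two different ways.
void LoadTexture(int simulatedFailure)
{
    if (simulatedFailure == 1)
        throw std::bad_alloc();                     // recoverable: out of memory
    if (simulatedFailure == 2)
        throw std::logic_error("bug in caller");    // a real bug
}

// Handles the one error it knows how to recover from. There is
// deliberately no catch(...) here, so a bug like the logic_error above
// propagates to the caller instead of being silently swallowed.
LoadResult SafeLoadTexture(int simulatedFailure)
{
    try
    {
        LoadTexture(simulatedFailure);
        return LoadResult::Ok;
    }
    catch (const std::bad_alloc&)
    {
        return LoadResult::OutOfMemory;
    }
}
```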

Now catch(…) can be used when it does something to correct the program state. This should be done only after careful consideration, usually with multiple developers. Side effects need to be considered as well. If a catch is used to wrap a call to thousands of 3ds Max functions, then it probably shouldn’t be used. However wrapping a call to a 3rd party library is acceptable. Everything needs to be balanced of course.

Certain regular expressions can easily be written to help search for empty catch statements. The static analyzer PVS-Studio can also help identify these.

Case in Point:

I regularly review the usage of catch(…) in the source code, and have over the years taken out a few catch(…). As a result, the clarity of crashes from customers in the field has increased.

Use Debug Builds

When debug builds are available, they should be used for testing and development. 3ds Max is the only M&E product that provides debug builds to ADN partners, although they may be slow in delivery. However despite the delays a debug build provides a great resource in validating your plugins.

[Note: It turns out to be very ironic that I put this here, since the 3dsmax team does not use debug builds. Sure the devs do, but in all my years there I could never get management to have the QA testers use debug builds. Nevertheless I believe in debug builds and that they are far superior for testing than release builds.]

Watch log file, asserts and the debug output

Log File

3dsmax has a log file that writes to <max install>\network\max.log. This file is mainly used for debugging network rendering, which was its original purpose. However, it has grown to become a popular logging mechanism for max. This log can provide useful information, but it still is under-utilized and cannot be expected to report program state consistently across the product.

Asserts

Do not ignore asserts (remember debug builds?). Use asserts liberally in your own code and don’t suppress asserts unless they are logged and checked afterwards (for example, using automated testing). The assert system will automatically log all asserts (whether they are suppressed or not) to the file: <max install>\3dsmax.assert.log.
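As a small sketch of what using asserts liberally can look like, the example below uses the standard assert macro as a stand-in for whatever assert facility your build uses (that substitution is my assumption). The function name is hypothetical.

```cpp
#include <cassert>
#include <vector>

// Returns the average of a list of weights.
float AverageWeight(const std::vector<float>& weights)
{
    // States the precondition and stops a debug build right at the
    // offending call if it is violated.
    assert(!weights.empty() && "AverageWeight needs at least one weight");

    // Release builds compile the assert out, so still fail safely.
    if (weights.empty())
        return 0.0f;

    float sum = 0.0f;
    for (float w : weights)
        sum += w;
    return sum / static_cast<float>(weights.size());
}
```

The assert documents what the function expects, and the runtime check keeps a release build from dividing by zero when the expectation is broken anyway.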

Debug output window

The Visual Studio debug output window (debugging window) provides significant output and can be useful to watch during debugging sessions. Be sure to turn on display for all types of events including exceptions (very important) and regular program messages. If you want to check debug output without attaching a debugger, then you can use a Microsoft tool from sysinternals called DbgView. See the following website for details: http://technet.microsoft.com/en-US/sysinternals

Disclaimer: The MetaSL system parses a lot of files when 3ds Max starts up. This will generate a lot of exceptions that are benign, so not to worry. The reason is that the MetaSL system, from Mental Images, uses a 3rd party library (antler) to parse files, which in turn uses exceptions for program flow.

Enable Break On Exception:

Visual Studio has debugging options that allow it to break execution when an exception is thrown. This should be used as often as possible. This is the corollary to the “No catch(…)” above.  There are a few places where max actually does use catch(…), for example in the maxscript kernel.  By enabling this feature, exceptions are immediately brought to the attention of the developer.

Max Specific Problems

Do not hold naked simple pointers to ReferenceTarget’s

A class that is not a ReferenceMaker should not hold a plain old pointer to a ReferenceTarget, or a class that derives from a ReferenceTarget, without some mechanism to ensure validity of the pointer before use (i.e. AnimHandles). Instead replace the simple pointer with a SingleRefMaker class instance, and have that observe the ReferenceTarget.

Good:

class Good {
    …
    SingleRefMaker mObserve;
};

Bad:

class Risky {
    …
    ReferenceTarget* mObserve;
};

Do not write dynamic arrays of ReferenceTarget’s.

Do not write a class that holds an array of ReferenceTarget’s: especially when that array grows and shrinks at runtime.

A class like this usually has a container that holds pointers to ReferenceTargets. It usually overrides ReferenceMaker::NumRefs like this:

int NumRefs() { return myArray.Count(); }

Instead of a fixed number of items:

int NumRefs() { return 3; }

This cannot be done correctly without considering undo and redo (Subclassing class RestoreObj). The fundamental weakness of the reference system is that it expects references to be in a fixed position. That reference index is an internal implementation of the ReferenceMaker that should be invisible to clients. However clients routinely use the reference index to get a certain Target. And one of those clients is the undo system. One of the complications of such an implementation is that the Undo System usually expects that internal array to never shrink in size. If a ReferenceTarget is removed from the internal array, a RestoreObj usually should or could point to its old reference slot. The Reference System of course has no idea that the internal array shrunk in size, so if an undo action occurs it may stick that Reference back into the wrong slot. To avoid that, a common practice is to make dynamic reference arrays grow but never shrink. This wastes memory.
For example: Undo and Redo can change the size of the internal array via SetReference. So if you have an array with 10 ReferenceTargets and your undo/redo object happens to ‘redo’ and stick a reference back in at slot 5, well, all your other pointers from index 5 to 10 have now had their indexes bumped up by one. So now anything depending on or holding on to those moved ReferenceTarget pointers is left dangling.
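The index shifting is easy to see in a self-contained model. The sketch below uses no SDK types at all: a std::vector of strings stands in for the reference slots, and SlotRecord is a hypothetical stand-in for anything (such as an undo record) that remembers a target by its index.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for a client that remembers "which slot" held its target.
struct SlotRecord
{
    std::size_t index;
};

// Looks up whatever currently sits in the remembered slot.
std::string TargetAt(const std::vector<std::string>& refs, const SlotRecord& record)
{
    return record.index < refs.size() ? refs[record.index] : std::string();
}

// Simulates a redo that sticks a target back into the middle of a
// dynamic array, shifting everything after it up by one.
void ReinsertAt(std::vector<std::string>& refs, std::size_t index, const std::string& target)
{
    refs.insert(refs.begin() + static_cast<std::ptrdiff_t>(index), target);
}
```

With slots A, B, C, D and a record that remembers index 2, the record finds C. After re-inserting X at index 1 the array is A, X, B, C, D, and the very same record now finds B. Nothing crashed and nothing reported an error, which is what makes this class of bug so unpleasant.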

There are a few alternatives to this:

  • Use class IRefTargContainer.
  • Use an array of AnimHandle’s.
  • Use a ParameterBlock

Do not access the Reference System after NOTIFY_SYSTEM_SHUTDOWN

The notification message NOTIFY_SYSTEM_SHUTDOWN (See notify.h) is broadcast before plugins are unloaded. It is critically important to drop all references to plugins in response to this message. There are many plugin modules that define ReferenceTargets that will then get unloaded shortly afterwards. Once the plugin module is unloaded, trying to access a ReferenceTarget defined in that module can result in a crash.

Do minimal work in DllMain

The MSDN docs state that minimal work should be done in DllMain. Specifically it warns against loader lock, among other things. The DllMain function can be called as a result of LoadLibrary. When LoadLibrary is executed a critical section is locked while your DllMain is active. If you try to do work that for example needs another DLL to get loaded, it could lock up (deadlock) the application. Instead of doing work in DllMain, there are a few other ways to do plugin initialization and uninitialization. For example:

  • You can do uninitialization work in response to NOTIFY_SYSTEM_SHUTDOWN. (see notify.h)
  • You can and should use the LibInitialize and LibShutdown functions.

A similar warning is not to do heavy work in static variable constructors, because a static variable will get constructed close in time to when DllMain is called. Then, when the static variable is constructed, the DLL may not be fully loaded and types needed by the constructor may not be available yet.
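One common way to keep heavy constructors away from module load time is to construct the object on first use. The sketch below shows that technique with a function-local static; the ShaderCache class and the counter are hypothetical and exist only to make the timing visible.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Counts constructions so the timing can be observed. A plain int with
// a constant initializer does no work at load time.
int gCacheConstructions = 0;

// Hypothetical object whose constructor does "heavy" work.
class ShaderCache
{
public:
    ShaderCache()
    {
        ++gCacheConstructions;
        names_.push_back("default");
    }
    std::size_t Count() const { return names_.size(); }

private:
    std::vector<std::string> names_;
};

// Instead of a namespace-scope 'static ShaderCache gCache;' that is
// constructed while the module is loading, the instance is created the
// first time somebody asks for it.
ShaderCache& GetShaderCache()
{
    static ShaderCache cache;
    return cache;
}
```

This only moves the construction. The object is still destroyed late, during module teardown, so any cleanup that touches 3ds Max data belongs in your shutdown handling and not in the destructor.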

Do not violate party etiquette

Uninvited guests should not crash the 3ds Max party. When the party is over: go home.

Uninvited guests

Every plugin has an appropriate time in which it should be initialized, do its work and shutdown. For example:

  • A plugin for a color picker should not instantiate one when 3ds max starts up.
  • A plugin for a scene browser should be active ONLY when its UI is active.

It is entirely possible and probable that users can start max and NEVER use your plugin. Therefore do not waste memory and resources for a feature that may not get used. Do the work when users actually invoke your feature. In other words when 3ds Max starts up, the plugin should not invite itself to the 3ds Max party, it should wait for an invitation.
This rule is violated on startup by loading 3rd party libraries, instantiating plugin classes, holding pointers to the node scene graph and registering callbacks to common scene events (my favorite pet peeve: “Hey max crashed in this function even though I never used this feature?”). When max loads a plugin, the major things 3ds Max requires from a plugin are:

  • The number of class descriptors
  • A way to get those class descriptors.
  • Some pointers to LibInitialize and LibShutdown functions.

Therefore class descriptors really are the only things that should be instantiated on module load or startup. There should be no static instances of the actual plugin class, whether it is a material plugin, shadow, utility, or renderer. Of course there are exceptions such as function published interfaces and parameter block descriptors that often are statically defined: But I’m not talking about those.

No loitering

When 3ds Max shuts down, it sends out the most important broadcast notification in all of 3ds Max (found in notify.h): NOTIFY_SYSTEM_SHUTDOWN. This means the 3ds Max party is over. The plugin should completely shut itself down or disassociate itself completely from all max data. For example: All References should be dropped. All arrays holding pointers to INode’s should be cleared out etc… And most common and most dangerous: All callback functions that are registered should be unregistered.
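The shape of a well behaved plugin can be sketched with a tiny mock. Everything below is hypothetical: Broadcaster is a stand-in I wrote for a host notification system and is NOT the notify.h API, and ScenePlugin is an imaginary plugin. The point is only the pairing of every registration with an unregistration at shutdown.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Callback = void (*)(void* param);

// Hypothetical stand-in for the host's notification system.
class Broadcaster
{
    struct Entry { Callback cb; void* param; };
    std::vector<Entry> entries_;

public:
    void Register(Callback cb, void* param) { entries_.push_back(Entry{cb, param}); }

    void Unregister(Callback cb, void* param)
    {
        entries_.erase(
            std::remove_if(entries_.begin(), entries_.end(),
                           [cb, param](const Entry& e) { return e.cb == cb && e.param == param; }),
            entries_.end());
    }

    void Broadcast() const
    {
        const std::vector<Entry> snapshot = entries_;  // callbacks may unregister themselves
        for (const Entry& e : snapshot)
            e.cb(e.param);
    }

    std::size_t Count() const { return entries_.size(); }
};

// Hypothetical plugin that listens to scene events while it is active.
class ScenePlugin
{
public:
    explicit ScenePlugin(Broadcaster& host) : host_(host) {}

    void Start()
    {
        if (!registered_) { host_.Register(&ScenePlugin::OnSceneEvent, this); registered_ = true; }
    }

    // Call this in response to the shutdown notification.
    void Shutdown()
    {
        if (registered_) { host_.Unregister(&ScenePlugin::OnSceneEvent, this); registered_ = false; }
        cachedNodes_.clear();   // drop every pointer into host-owned data
    }

    int EventsSeen() const { return eventsSeen_; }

private:
    static void OnSceneEvent(void* param) { ++static_cast<ScenePlugin*>(param)->eventsSeen_; }

    Broadcaster& host_;
    bool registered_ = false;
    int eventsSeen_ = 0;
    std::vector<const void*> cachedNodes_;
};
```

After Shutdown the host no longer knows about the plugin, so later broadcasts cannot call into code or data that is about to go away.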

When NOTIFY_SYSTEM_SHUTDOWN is broadcast, the entire max scene is completely intact and still in a completely valid state. During any callbacks or notifications after that, 3ds Max will contain less and less of a valid state to work with. In other words as 3ds Max progresses in its shutdown sequence less and less of the max scene will be valid. So for instance the other shutdown notification NOTIFY_SYSTEM_SHUTDOWN2 is called merely when the main 3dsmax window (think HWND) is destroyed. No plugin should be responding to that message to (for example) iterate through the scene graph. Likewise the LibShutdown functions should not be iterating the scene graph.

Case In Point

Say that a plugin depends on another library like this:
plugin.dll -> library.dll
When the plugin is loaded by max, the tertiary library will also (automatically) get loaded. But when the plugin is unloaded the tertiary library will not get unloaded. That is unless the reference count on the library is decremented to zero. This will not happen unless FreeLibrary is specifically called on library.dll (Which is not a common nor recommended practice). Thus instead, the library will get freed or shutdown long after WinMain exits and max has uninitialized and is gone. Therefore the tertiary library should not contain any dependencies on anything in the 3ds Max SDK. Thus for example GetCOREInterface() should never be called in a DllMain of a dependent module to a plugin (i.e. library.dll ).

Quality Testing

Developers can implement the following practices in their software development processes:

Automated regression testing

All good production pipelines should have regression testing that occurs automatically after a build. This is critical to help catch bugs before they get to customers in the field. Also the developers should have access to these automated tests so that they also can run these tests before submitting their code.

Dynamic Memory Analysis

This means using 3rd party tools to profile, analyze, check and verify memory during runtime of the application.

The following list of tools is a partial example of what is available:

  • MicroFocus BoundsChecker: Checks for memory leaks, or memory allocation mismatches among a host of other things.
  • Microsoft’s Application Verifier also checks for various memory problems during runtime such as accessing an array out of bounds.
  • Visual Leak Detector (Open source on codeplex.com) checks for memory leaks. It is fast, efficient and stable.

Code coverage

This is using a tool to measure how much of your application or plugin was actually tested during execution. This helps a developer to know when they have tested the product enough. It also can help a developer find areas they have not tested. Simply put, untested code is buggy code, and a code coverage tool helps in this regard. The best tool I have ever seen for this is Bullseye (bullseye.com). It works for native C++ and is easy to use, and very fast. It requires instrumentation of the code during the build which can double the build time, but runtime performance is excellent.
