I recently spent about two days on a bug. It surfaced after a major architectural change I had to make to an application, a change forced by a requirements oversight in a previous developer's implementation of a feature. Read on to find out about an interesting feature (ahem, attribute) of assemblies.
Tags: Assembly, .NET, AppDomain, Unloading, CreateInstance
The Original Problem
A previous developer had created an assembly that would reflect on arbitrary other assemblies and return essential information so that a user interface could allow users to select functions from those assemblies. Unfortunately, the developer hadn't taken into account that you cannot unload assemblies from an appdomain once they're loaded. This means that the program is stuck locking any assemblies that are loaded to be viewed in the UI.
AppDomains
This was inconvenient and confusing: users couldn't move or recompile assemblies without shutting down our application, and attempts to load a different version of an assembly (at a different file location, for example) would produce inconsistent, unexpected behavior. The answer according to Microsoft is to use a separate AppDomain.
AppDomains are to .NET what processes are to unmanaged code: a safe, contained area that must use inter-domain (or inter-process) communication techniques to communicate outside of itself. .NET allows you to have multiple AppDomains in the same unmanaged process, and you can create and unload domains at will (aside from the default, main application domain). You can therefore remove any references a process has to an assembly by unloading the AppDomain that references the assembly.
The Solution
My solution was to load an instance of the class that does the reflecting into a separate AppDomain, which meant locating my assembly and the class inside it and calling AppDomain.CreateInstance with that information. Here is a simplified diagram of the solution: Figure 1
I knew this could work because I had already implemented it for the runtime aspect of the application. This design time aspect should in theory be easier because I don't need to marshal the actual call parameters with various arbitrary types from one appdomain to another, just the descriptions of the classes, methods, and parameters in the classes I reflect on.
Reviewing the reflection classes that had already been implemented, I found they could be made to work across AppDomains with only a couple of minor changes. They needed to be derived from MarshalByRefObject rather than from Object so that the .NET infrastructure would not try to marshal them by value across the AppDomain boundary. This was important because the reflection class loads the other classes and assemblies it is reflecting on: if it does its work in our AppDomain, we are stuck with those assemblies in our AppDomain. By keeping the reflector class in a separate AppDomain, exchanging only object types that are defined by reflector.dll or mscorlib.dll, and using marshal-by-ref, we can be sure we never directly reference any of the assemblies under reflection.
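The arrangement described above can be sketched roughly as follows. This is not the original code from the application, only a minimal illustration under some assumptions: it targets the .NET Framework (AppDomain.CreateDomain is not supported on .NET Core and later), the DescribePublicMethods method and the "InspectionDomain" name are made up for the example, and the inspector returns plain strings so that only mscorlib types cross the boundary.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Stand-in for the reflector class. Deriving from MarshalByRefObject means
// callers in another AppDomain receive a proxy instead of a by-value copy.
public class AssemblyInspector : MarshalByRefObject
{
    // Hypothetical method: loads the target assembly into the AppDomain this
    // object lives in and returns only strings, so no Type or Assembly object
    // from the target is ever marshaled back to the caller.
    public string[] DescribePublicMethods(string assemblyPath)
    {
        Assembly target = Assembly.LoadFrom(assemblyPath);
        List<string> descriptions = new List<string>();
        foreach (Type type in target.GetExportedTypes())
        {
            foreach (MethodInfo method in type.GetMethods(
                BindingFlags.Public | BindingFlags.Instance |
                BindingFlags.Static | BindingFlags.DeclaredOnly))
            {
                descriptions.Add(type.FullName + "." + method.Name);
            }
        }
        return descriptions.ToArray();
    }
}

public static class Program
{
    public static void Main(string[] args)
    {
        // args[0] is assumed to be the path of the assembly to inspect.
        AppDomain sandbox = AppDomain.CreateDomain("InspectionDomain");
        try
        {
            // Create the inspector inside the second AppDomain, loading it
            // from the file that contains the AssemblyInspector class.
            string inspectorFile = typeof(AssemblyInspector).Assembly.Location;
            AssemblyInspector inspector = (AssemblyInspector)
                sandbox.CreateInstanceFromAndUnwrap(
                    inspectorFile, typeof(AssemblyInspector).FullName);

            foreach (string line in inspector.DescribePublicMethods(args[0]))
            {
                Console.WriteLine(line);
            }
        }
        finally
        {
            // Unloading the domain releases the inspected assembly.
            AppDomain.Unload(sandbox);
        }
    }
}
```

Because the inspected assembly is only ever loaded in the second AppDomain, unloading that domain should release the file lock, which is the whole point of the design.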
A Two-Day Bug
Unfortunately this solution generated an exception when I tried to create my own reflector class in a separate AppDomain. Here's the code: Figure 2
The error occurs on the cast from Object to AssemblyInspector. The error says that it cannot locate the assembly:
An unhandled exception of type 'System.IO.FileNotFoundException' occurred in Unknown Module.
Additional information: File or assembly name Reflection, or one of its dependencies, was not found.
Looking into the call stack I saw that it was trying to find the assembly:
"Reflection, Version=1.0.3.5, Culture=en-US, PublicKeyToken=ba91c806523fc1cf" (I changed the version and the public key token, but everything had some non-blank value.)
If I used CreateInstance with the assembly's Assembly.FullName instead of CreateInstanceFrom, I would get the same error, this time on the CreateInstance call itself. So what was going on?
Damned Culture
It turns out the important part of this was the "Culture=en-US". The culture is specified by placing a value in AssemblyInfo.cs like this:
[assembly: AssemblyCulture("en-US")]
The previous developer had put that in there. Unfortunately, a non-blank culture tells the .NET loader that the assembly is a resource (satellite) assembly that must be found via the resource loading rules. This applies on CreateInstance and also when .NET loads the assembly to resolve other types, as apparently happens during the type cast. The loader will look in various subdirectories of the main assembly's directory, such as .\en-US, .\reflection, and .\reflection\en-US, but it will ignore the assembly in the main directory.
This would even happen if you tried to directly reference an assembly with a non-blank culture. The reason this didn't happen with reflection.dll when the client created an instance of the reflector class factory class (and previously, the reflector class itself) is that it was always created and loaded via COM interop, which apparently doesn't use the resource loading rules when finding the assembly (it uses CreateInstanceFrom, I guess).
Once I removed the culture attribute from the assembly everything worked great.
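In hindsight, a quick check of the culture recorded in the assembly's identity would have pointed at the problem much sooner. The following is a small sketch of such a check, not something from the original project: the CultureCheck class and its GetCultureName method are hypothetical names, and the assembly path is assumed to arrive as the first command-line argument.

```csharp
using System;
using System.Reflection;

public static class CultureCheck
{
    // Reads the assembly's identity from the file without loading the
    // assembly, and returns its culture name. An empty string means the
    // assembly is culture-neutral, which is what a code assembly should be.
    public static string GetCultureName(string assemblyPath)
    {
        AssemblyName name = AssemblyName.GetAssemblyName(assemblyPath);
        return name.CultureInfo == null ? "" : name.CultureInfo.Name;
    }

    public static void Main(string[] args)
    {
        string culture = GetCultureName(args[0]);
        Console.WriteLine(culture.Length == 0
            ? "culture-neutral"
            : "non-blank culture: " + culture);
    }
}
```

If the check reports a non-blank culture for a code assembly, removing the AssemblyCulture attribute or setting it to an empty string in AssemblyInfo.cs makes the assembly culture-neutral again.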
Lessons Learned
I spent too much time on this stupid bug. I fell into the trap of making up too many theories and eliminating them rather than eliminating each line of code that could be the cause of the problem. Keith Hill recommended I start from scratch with a test project, and I found the problem in about half an hour as I built the project up to resemble the DLL that was giving me hell.
In my defense, there is a ton of opaque technology in this assembly: assembly signing, reflection, cross-app-domain interop, security, etc. That said, in my experience 95% of bugs are much simpler than they appear and simple test & elimination will do once the problem area has been identified. Next time I'll try to get to basics and not start diving into .NET cross-app-domain security implementation to find my solution!
http://gladfelter.net/weblog/article.php?story=20060421135002565