07/06/2024

C# Bites - Execution

CIL, C# virtual machine, multi-phase translation

Execution of C# programs uses a virtual machine called the Common Language Runtime (CLR). It loads bytecode called Common Intermediate Language (CIL), generated by the C# compiler, and translates that to native code for execution. The purpose of this Bite is to explain how that works and why it was designed that way. All of the demonstration code was built with Visual Studio, Community Edition, version 17.11.0, preview 2.1. At the time this was written, that was the latest version of the free Community Edition.
Fig 1. C# Program Code

1.0 C# Program Code

Figure 1 is a screenshot of a C# program, visible in a Visual Studio (VS) edit window. The code was designed to take a first look at how the CLR works. Also shown is a terminal window, opened by building and running the program within Visual Studio. Its contents are what you would expect from the code.

All of the program's functionality resides in the Main method. It sends a hello message to the console, does a little bit of arithmetic, and sends the results of that to the console in a formatted string.

The program starts by importing system namespaces with using directives. Only the first, System, is needed by this program. The others were supplied by a VS project template. They provide a glimpse of the many useful libraries supplied with VS. The program continues by using System.Console.WriteLine to send the opening message to standard output, which, by default, is a terminal window. It then defines two local integer variables, a and b, initialized from literal numbers. Those are added and the sum is stored in a third local variable, c. All three values are then inserted into a formatted string and sent to the output terminal.

Next, we build the program and disassemble its binary into CIL bytecode, using ildasm, a tool supplied with the .NET Framework tools that ship with VS.
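Since Figure 1 is a screenshot, here is a minimal sketch of what the program described above might look like. The namespace ConsoleApp2 is taken from the assembly name mentioned later, and the hello string and the value of a come from the CIL discussed in Section 3.0. The value of b, the exact format string, and the list of template-supplied using directives are assumptions for illustration.

```csharp
using System;
// The remaining directives are assumed to be those a VS console template supplies.
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace ConsoleApp2
{
    class Program
    {
        static void Main(string[] args)
        {
            // Opening message sent to standard output.
            Console.WriteLine("hello C# world");

            // Two locals initialized from literals; the value of b is an assumption.
            int a = 1;
            int b = 2;

            // Sum stored in a third local.
            int c = a + b;

            // All three values inserted into a format string (assumed wording).
            Console.WriteLine("{0} + {1} = {2}", a, b, c);
        }
    }
}
```

Under these assumptions the program prints the hello message followed by the line 1 + 2 = 3.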
Fig 2. Disassembly

2.0 C# Program Code Disassembly

The disassembly process is simple, as illustrated in Figure 2. In the background you see ildasm being run from a Developer Command Prompt window. That results in a form that contains a tree representation of the program structure. Clicking expands the project node to show a Program node that, when clicked, shows a node for the Main method. Finally, clicking on that pops up a window with the CIL representation of Main. With the help of a list of CIL opcodes we can start to figure out how the translation process works.

Tools like the ildasm disassembler and the JetBrains dotPeek decompiler work by reading the metadata and CIL stored in an assembly, in this example ConsoleApp2.exe. That is the same information the .NET reflection Application Programming Interface (API) exposes to running programs, starting with calls like System.Object.GetType(). We won't get into those details here, but will look at the CIL disclosed in Figure 3 and see what that can tell us about the process of running C# in the CLR virtual machine.
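To give a feel for what the reflection API exposes, here is a small sketch that inspects one of its own methods. The class ReflectionSketch and the methods Sum and BodyOf are hypothetical names introduced for illustration; the calls themselves (GetType, GetMethods, GetMethodBody, MaxStackSize, LocalVariables, GetILAsByteArray) are standard System.Reflection members. This is not how ildasm itself is implemented, only a way to see similar information from inside a running program.

```csharp
using System;
using System.Reflection;

public static class ReflectionSketch
{
    // Hypothetical method whose compiled body we inspect.
    public static int Sum(int a, int b)
    {
        int c = a + b;
        return c;
    }

    // Looks up a static method by name and returns its compiled body.
    public static MethodBody BodyOf(Type type, string methodName)
    {
        MethodInfo method = type.GetMethod(methodName,
            BindingFlags.Public | BindingFlags.NonPublic | BindingFlags.Static);
        return method.GetMethodBody();
    }

    public static void Main()
    {
        // Every object can report its runtime type.
        object boxed = 42;
        Console.WriteLine(boxed.GetType().FullName);

        // List the static methods declared on this class.
        Type type = typeof(ReflectionSketch);
        foreach (MethodInfo m in type.GetMethods(
            BindingFlags.Public | BindingFlags.Static | BindingFlags.DeclaredOnly))
        {
            Console.WriteLine(m);
        }

        // The method body carries the items discussed for Main:
        // the maximum stack depth, the locals, and the raw CIL bytes.
        MethodBody body = BodyOf(type, "Sum");
        Console.WriteLine("maxstack: " + body.MaxStackSize);
        Console.WriteLine("locals:   " + body.LocalVariables.Count);
        Console.WriteLine("CIL:      " + BitConverter.ToString(body.GetILAsByteArray()));
    }
}
```

The exact numbers and bytes printed depend on the compiler version and on whether the build is Debug or Release, so no expected output is shown.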
Fig 3. Common Intermediate Language
Fig 4. CIL Processing

3.0 CIL Processing

CIL is object oriented. Main(string[] args) is a static method of the Program class defined in Figure 1. The CIL window shows us the contents of Main as represented in bytecode. Here are its contents:
  • It starts by declaring maxstack, the maximum depth of the CIL evaluation stack for Main, which the compiler computes by analyzing the method body.
  • It then declares locals, an array of local storage for the variables a, b, and c, which are accessed through indexes attached to CIL opcodes.
  • A nop follows, presumably a debugger hook, which probably would not be there if the program were compiled in "Release" mode.
  • ldstr "hello C# world" pushes a reference to the literal string onto the CIL stack.
  • call void ...Console::WriteLine(string) pops the literal string reference from the stack and passes it to the named method.
  • ldc.i4.1 pushes an int32 with the literal value 1 onto the CIL stack.
  • stloc.0 pops that value and stores it in locals storage at index 0, the slot for a.
  • The same process is repeated for b, which is stored at index 1.
  • At this point the stack is empty and locals holds values for a and b.
  • The next statements, ldloc.0 and ldloc.1, push the local values for a and b onto the stack. Then add pops the two values, adds them, and pushes the result back onto the stack.
  • stloc.2 pops the result and stores it into local storage at index 2.
  • A reference to the literal output format string is pushed onto the stack.
  • The next statements push local variable a onto the stack and box it: the value is copied into an object on the managed heap, and the value on the stack is replaced with a reference to that boxed object.
  • That process is repeated for b and c.
  • At this point, the CIL stack holds the format string reference and references to the three boxed values on the heap, i.e., four items, hence maxstack has the value 4.
  • Finally, System.Console::WriteLine(string, object, object, object) is called. It uses the stack contents as arguments and displays the result on the terminal.
  • Then Main returns, ending the program.
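The stack discipline in the list above can be imitated with a few lines of C#. The following is only a sketch of the idea, not how the CLR is implemented: the class EvalStackSketch and its method names are hypothetical, the value of b and the format string are assumptions, and the hello message step is omitted. Boxing happens implicitly here, because the simulated stack stores every value as object.

```csharp
using System;
using System.Collections.Generic;

public class EvalStackSketch
{
    private readonly Stack<object> stack = new Stack<object>();
    private readonly object[] locals = new object[3];   // slots for a, b, c

    // Deepest the simulated stack has been, analogous to maxstack.
    public int MaxDepth { get; private set; }

    private void Push(object value)
    {
        stack.Push(value);
        MaxDepth = Math.Max(MaxDepth, stack.Count);
    }

    public void LdcI4(int value) { Push(value); }               // like ldc.i4.N
    public void Ldstr(string text) { Push(text); }              // like ldstr
    public void Stloc(int index) { locals[index] = stack.Pop(); } // like stloc.N
    public void Ldloc(int index) { Push(locals[index]); }       // like ldloc.N

    public void Add()                                           // like add
    {
        int right = (int)stack.Pop();
        int left = (int)stack.Pop();
        Push(left + right);
    }

    // Stands in for a call taking (string, object, object, object):
    // pops the three arguments and the format string, returns the text.
    public string CallFormat3()
    {
        object arg2 = stack.Pop();
        object arg1 = stack.Pop();
        object arg0 = stack.Pop();
        string format = (string)stack.Pop();
        return string.Format(format, arg0, arg1, arg2);
    }

    // Replays the arithmetic part of Main, step by step.
    public static string RunMain(out int maxDepth)
    {
        var vm = new EvalStackSketch();
        vm.LdcI4(1); vm.Stloc(0);                 // a = 1
        vm.LdcI4(2); vm.Stloc(1);                 // b = 2 (assumed value)
        vm.Ldloc(0); vm.Ldloc(1); vm.Add();       // a + b
        vm.Stloc(2);                              // c = a + b
        vm.Ldstr("{0} + {1} = {2}");              // assumed format string
        vm.Ldloc(0); vm.Ldloc(1); vm.Ldloc(2);    // three arguments
        string text = vm.CallFormat3();
        maxDepth = vm.MaxDepth;
        return text;
    }

    public static void Main()
    {
        int maxDepth;
        Console.WriteLine(RunMain(out maxDepth));               // 1 + 2 = 3
        Console.WriteLine("max stack depth: " + maxDepth);      // max stack depth: 4
    }
}
```

The depth peaks at four just before the final call, when the format string and the three arguments are all on the stack, which matches the maxstack value seen in the disassembly.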

4.0 Conclusions

Everything we've looked at here was generated by the C# compiler. It doesn't use any knowledge of the underlying machine architecture: registers, fast cache memory, general store, or CPU instructions. It is the responsibility of the CLR to map the virtual machine processing onto the program's platform, using those details to produce native code. That native code generation happens at run time, after the program is loaded: the CLR's just-in-time (JIT) compiler translates a method's CIL to native code, typically the first time the method is called. So the C# compiler is independent of target machine details. Native code generation is handled by the CLR.
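One way to observe this division of labor from inside a program is sketched below. The class JitSketch and its methods are hypothetical names for illustration. RuntimeHelpers.PrepareMethod asks the runtime to compile a method ahead of its first call, and RuntimeMethodHandle.GetFunctionPointer returns a native address for it. Depending on the runtime, that address may refer to a small stub rather than the final compiled code, so treat the output as indicative only.

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;

public static class JitSketch
{
    // Hypothetical method; the C# compiler emits only CIL for it.
    public static int Add(int a, int b)
    {
        return a + b;
    }

    // Asks the CLR to prepare native code for Add and returns its native address.
    public static IntPtr NativeEntryPoint()
    {
        MethodInfo method = typeof(JitSketch).GetMethod("Add");
        RuntimeHelpers.PrepareMethod(method.MethodHandle);
        return method.MethodHandle.GetFunctionPointer();
    }

    public static void Main()
    {
        // Platform-independent part: the CIL bytes stored in the assembly.
        byte[] cil = typeof(JitSketch).GetMethod("Add").GetMethodBody().GetILAsByteArray();
        Console.WriteLine("CIL bytes in Add: " + cil.Length);

        // Platform-dependent part: decided by the CLR on the machine that runs the program.
        Console.WriteLine("64-bit process: " + Environment.Is64BitProcess);
        Console.WriteLine("native entry point: 0x" + NativeEntryPoint().ToString("X"));
    }
}
```

The CIL byte count is fixed when the program is built, while the process bitness and native address are only known on the machine where the CLR runs it.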

5.0 References:

  • dotPeek (JetBrains decompiler): tool for translating between assemblies <--> CIL <--> C# source
  • C# Tutorial (w3schools): simple tutorial with guided exercises
  • Understanding CIL (Code Project): clear introduction to the Common Intermediate Language
  • Common Intermediate Language (Wikipedia): quick summary
  • List of CIL Instructions (Wikipedia): extensive list, useful for reference