C# Mutability Hacks
I’ve recently been working on a little roguelike framework/library/package for Unity to make creating roguelikes much simpler than starting from scratch every time. I am following a very good Rust tutorial by Herbert Wolverson. If you’re interested in reading more about him, feel free to have a look at his blog.
The process of porting tutorial code that’s in Rust using a library very similar to Unity’s Entities package has been a very interesting one. I’ve come to develop a great deal of respect for the Rust language, but I’ve also fallen in love with some of the capabilities that C# has. Unity has done some cool things to leverage C# features to make a cool API.
Unity being clever
My main focus for this post will be the `ref` and `in` keywords in C#. When writing a system for some process in Unity you can use these keywords to indicate how you intend to use the data the system operates on. I’ll use a simple example below:
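As a rough sketch of the idea in plain C#, without depending on Unity: the `MoveAction` delegate, the `X`/`Y` fields and the array-based `Update` method below are hypothetical stand-ins for the entity query, not Unity's API. What matters is the lambda signature, where `ref` marks the component that is written and `in` marks the one that is only read.

```csharp
using System;

public struct Position { public float X; public float Y; }
public struct Move { public float X; public float Y; }

// Hypothetical callback shape: Position is written (ref), Move is only read (in).
public delegate void MoveAction(ref Position position, in Move move);

public static class MoveSystem
{
    // Stand-in for "every entity that has both a Position and a Move".
    public static void Update(Position[] positions, Move[] moves)
    {
        MoveAction apply = (ref Position position, in Move move) =>
        {
            position.X += move.X;
            position.Y += move.Y;
            // move.X = 0; // would not compile: 'move' is a readonly variable
        };

        for (int i = 0; i < positions.Length; i++)
        {
            apply(ref positions[i], in moves[i]);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var positions = new[] { new Position { X = 1f, Y = 2f } };
        var moves = new[] { new Move { X = 3f, Y = 4f } };
        MoveSystem.Update(positions, moves);
        Console.WriteLine($"{positions[0].X}, {positions[0].Y}");
    }
}
// Output:
// 4, 6
```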
I’ll quickly unpack what this system does in case it’s not obvious. My game world has a bunch of entities, each of which can have various components attached. This `MoveSystem` is expressing that it wants all entities that have a `Position` component and a `Move` component. It then takes the value of the move and adds it to the position’s value. The cool thing here is that the use of `ref` and `in` expresses whether I’m writing to the component value or only reading it.
Unity’s Entities package is using this information to cleverly schedule a multi-threaded job that can split this work across multiple cores. If I have a few thousand entities that are moved using this system that can become quite a workload, but we know it’ll be safe to split the work across cores without having race conditions. The important part is knowing if subsequent or prior systems can operate on the same data. This is why expressing read/write intent is so important.
C# Compiler niceties
This post is focused specifically on the semantics of working with value types and how they are passed into methods. Let’s first take an example that shows how value types are copied when passed as a parameter:
Running this code the output is as follows:
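A minimal sketch of such an example and its output, assuming a `Test` struct with a single public `Value` field (the field name and the printed labels are assumptions):

```csharp
using System;

public struct Test
{
    public int Value;
}

public static class Program
{
    // 'test' is passed by value, so this method works on its own copy.
    public static void Increment(Test test)
    {
        test.Value++;
        Console.WriteLine($"Inside Increment: {test.Value}");
    }

    public static void Main()
    {
        var test = new Test();
        Increment(test);
        Console.WriteLine($"After Increment: {test.Value}");
    }
}
// Output:
// Inside Increment: 1
// After Increment: 0
```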
The output is due to the `Increment` method getting a separate copy of the `test` variable. It correctly increments the integer stored in the structure, but the variable in the `Main` method isn’t affected.
Enter ref
The `ref` keyword is clever in that it’s more or less syntactic sugar for a pointer without the dangers of using a pointer. When a method declares that it accepts a parameter by reference, it is very likely going to mutate the value. Updating the above example to use `ref` we get the following:
The output is as follows:
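Sketched under the same assumptions as before (a `Test` struct with a hypothetical public `Value` field), the `ref` version and its output might look like this:

```csharp
using System;

public struct Test
{
    public int Value;
}

public static class Program
{
    // 'test' is an alias for the caller's variable, not a copy.
    public static void Increment(ref Test test)
    {
        test.Value++;
        Console.WriteLine($"Inside Increment: {test.Value}");
    }

    public static void Main()
    {
        var test = new Test();
        Increment(ref test); // 'ref' is also required at the call site
        Console.WriteLine($"After Increment: {test.Value}");
    }
}
// Output:
// Inside Increment: 1
// After Increment: 1
```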
It shows clearly that the original variable has been affected, but what’s nice is that `ref` makes it fairly clear this could happen.
Enter in
The `in` keyword can be of great help if you want to mark a parameter as “read only” while potentially getting the same pointer-like behaviour as `ref`. I’ll explore why I say there’s only the potential of getting pointer-like behaviour, but let’s first focus on what the `in` keyword can give us from a compilation standpoint. If I try to mutate a simple structure passed to a method using the `in` keyword there will be compilation errors. This is nicely demonstrated below:
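A minimal sketch of this, again assuming a `Test` struct with a hypothetical public `Value` field. The offending line is left commented out so the sketch itself compiles; uncommenting it produces an error of the kind quoted here.

```csharp
using System;

public struct Test
{
    public int Value;
}

public static class Program
{
    public static int Increment(in Test test)
    {
        // test.Value++; // does not compile: 'test' is a readonly variable
        return test.Value; // reading through 'in' is fine
    }

    public static void Main()
    {
        var test = new Test();
        Console.WriteLine(Increment(in test));
    }
}
// Output:
// 0
```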
When trying to compile this I get the following compilation error: `Compilation error (line 20, col 3): Cannot assign to a member of variable 'in Test' because it is a readonly variable`. It’s cool that the compiler can protect me from myself here, but after playing around with this a little I realised there are some inconsistencies in how the compiler does these checks.
`ref` and `in` goals
Reading the documentation on writing safe and efficient code, it became clear to me that these keywords exist more as a way for the compiler to figure out how to do fewer copy operations when calling methods. There are some caveats to how `in` behaves, but they are mentioned in the documentation.
Looking back to the Unity example above you’ll notice all of the structure manipulations are defined within the system. The compiler can protect us here, but if we want to have some reusable functions you might bump into some strange behaviour.
Encapsulated manipulation
There might be some cases where you’d go and write a method within your struct so that it’s easier to share across your codebase. Take the following example:
This compiles successfully, but the output is now not quite what one would expect:
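A sketch of this situation, assuming the struct gains a mutating `Increment` instance method and keeps a hypothetical public `Value` field. Because `Test` is not a `readonly` struct, the compiler calls the method on a hidden defensive copy of the `in` parameter:

```csharp
using System;

public struct Test
{
    public int Value;

    // Mutating instance method, not marked readonly.
    public void Increment()
    {
        Value++;
    }
}

public static class Program
{
    public static void Increment(in Test test)
    {
        // Compiles without error, but runs on a defensive copy of 'test'.
        test.Increment();
        Console.WriteLine($"Inside Increment: {test.Value}");
    }

    public static void Main()
    {
        var test = new Test();
        Increment(in test);
        Console.WriteLine($"After Increment: {test.Value}");
    }
}
// Output:
// Inside Increment: 0
// After Increment: 0
```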
The `in` keyword is suggested to work best with `readonly` structures, but that’s out of scope for this post. If you are interested in knowing more, the documentation linked above explores the concept. What’s important to be aware of is that the `in` keyword could still result in a copy, while still enforcing that you don’t mutate the value that was passed. This compiler check is only done on properties, fields and indexers. Methods on the type itself could make it look like mutations will occur, but these methods end up operating on a copy of the structure instead of the one from the calling method.
Let’s change the above example to use `ref` in the static `Increment` method instead:
This yields the following output:
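Under the same assumptions (a `Test` struct with a hypothetical public `Value` field and a mutating `Increment` method), the `ref` variant can be sketched as:

```csharp
using System;

public struct Test
{
    public int Value;

    public void Increment()
    {
        Value++;
    }
}

public static class Program
{
    public static void Increment(ref Test test)
    {
        // 'test' aliases the caller's variable, so no copy is made here.
        test.Increment();
        Console.WriteLine($"Inside Increment: {test.Value}");
    }

    public static void Main()
    {
        var test = new Test();
        Increment(ref test);
        Console.WriteLine($"After Increment: {test.Value}");
    }
}
// Output:
// Inside Increment: 1
// After Increment: 1
```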
Again we have something different happening to what one would expect. The `ref` keyword is ensuring the variable is passed by reference, but now the `Increment` method on the `Test` type is affecting the original value. I expected that it would, but I then realised it wasn’t quite functioning in the way we’re used to value types functioning.
My main motivation for looking into this is to find a way to express this intent and still have the compiler show me where I screw up. I need a way to have encapsulated code that can be re-used and that immediately warns me if I need to tell a system to expect a value by `ref` instead of just by `in`.
Using extension methods
One way I’ve found that does enable the compiler to help identify possible issues is with extension methods:
This outputs the following:
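A sketch of the extension-method approach, assuming the mutation moves out of the struct into a `ref this` extension method (available since C# 7.2). The `TestExtensions` class name and the public `Value` field are hypothetical:

```csharp
using System;

public struct Test
{
    public int Value; // must be accessible to the extension method
}

public static class TestExtensions
{
    // 'ref this' states explicitly that the receiver will be mutated.
    public static void Increment(ref this Test test)
    {
        test.Value++;
    }
}

public static class Program
{
    public static void Increment(ref Test test)
    {
        test.Increment(); // resolves to TestExtensions.Increment
        Console.WriteLine($"Inside Increment: {test.Value}");
    }

    public static void Main()
    {
        var test = new Test();
        Increment(ref test);
        Console.WriteLine($"After Increment: {test.Value}");
    }
}
// Output:
// Inside Increment: 1
// After Increment: 1
```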
Now I can change the static `Increment` method to use `in`:
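Sketched with the same hypothetical `TestExtensions` class and `Value` field, the `in` version looks like this. The call to the `ref this` extension is commented out so the sketch compiles; uncommenting it is what triggers the compiler error.

```csharp
using System;

public struct Test
{
    public int Value;
}

public static class TestExtensions
{
    public static void Increment(ref this Test test)
    {
        test.Value++;
    }
}

public static class Program
{
    public static int Increment(in Test test)
    {
        // test.Increment(); // does not compile: an 'in' parameter cannot be passed as ref
        return test.Value;
    }

    public static void Main()
    {
        var test = new Test();
        Console.WriteLine(Increment(in test));
    }
}
// Output:
// 0
```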
This yields a compilation error: `Compilation error (line 20, col 3): Cannot use variable 'in Test' as a ref or out value because it is a readonly variable`. This is great news! I can now have my cake and eat it. It does mean my structures need to have public properties, but for my purposes that’s not the end of the world.
C# 8 to the rescue (sort of)
C# 8 has introduced the ability to mark property getters and methods on a structure as `readonly`, as follows (this is also explored in the documentation linked earlier):
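A minimal sketch of C# 8 `readonly` members, assuming a hypothetical `Doubled` property and `Describe` method alongside the `Value` field. The readonly members can be called through `in` without a defensive copy, while the unmarked `Increment` still compiles and silently works on a copy:

```csharp
using System;

public struct Test
{
    public int Value;

    // C# 8: readonly members promise not to modify the struct's state.
    public readonly int Doubled => Value * 2;

    public readonly override string ToString() => $"Test({Value})";

    // Not readonly: still callable through 'in', but on a defensive copy.
    public void Increment()
    {
        Value++;
    }
}

public static class Program
{
    public static string Describe(in Test test)
    {
        test.Increment(); // no compile error; the caller's value is untouched
        return $"{test.ToString()} doubled is {test.Doubled}";
    }

    public static void Main()
    {
        var test = new Test { Value = 1 };
        Console.WriteLine(Describe(in test));
    }
}
// Output:
// Test(1) doubled is 2
```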
This is a good first step, but it still doesn’t fix the problem that an `in` parameter could have a mutating method available that won’t cause an error when compiling. It’s going to be interesting to see if the language further develops these concepts to include mutability checks on methods that aren’t marked as `readonly`. The other problem is that Unity doesn’t support C# 8 yet, so it’s not useful for my current use case.
Conclusion
Now you might think I’m mad for worrying about these things so much, but when building games, having your tools help you as much as possible is crucial. It’s even more important if you’re doing dumb things like screwing around with pointers in structures. That isn’t even necessarily a bad thing, because sometimes some manual memory management is just a more performant solution to a problem.
I am very happy with having the extension method approach as an option, but it does come with its own set of constraints that might cause some issues. The Rust language has some superb rules that help you as a developer to reason about data ownership. The upside is that when ownership is easier to reason about, multi-threaded programming becomes easier too. Here’s to hoping more tools become available that can do this in other ecosystems!