Saturday, December 12, 2020

A C# Puzzler: Records

C# version 9.0 introduces records.  A record is a data type whose declaration names a set of key members, and those keys define its equality operation.  Many existing languages have them (e.g. Kotlin data class, Scala case class, Java's upcoming records).  I stopped programming in C# before records were available, so I never got to use them there, but now that I'm programming in Kotlin I have the pleasure of being able to use records.

Here is a little sample of some Kotlin code that exercises the record (data class) feature. It demonstrates that you can add declarations to the body of a data class without interfering with its useful semantics:

import kotlin.math.sqrt
import kotlin.collections.HashSet

public data class Cartesian(val x: Float, val y: Float) {

    private var cachedR: Float = 0F
    public val r: Float
        get() {
            var localR = cachedR
            if (localR == 0F) {
                localR = sqrt(x*x + y*y)
                cachedR = localR
            }
            return localR
        }
}

fun main(args: Array<String>) {
    val set = HashSet<Cartesian>()
    val c1 = Cartesian(5F, 12F)
    set.add(c1)
    println(c1.r) // prints 13.0
    println(set.contains(c1)) // prints true

    val c2 = Cartesian(5F, 12F)
    println(set.contains(c2)) // prints true

    val c3 = c1.copy(x = 9F)
    println(c3.r) // prints 15.0
}

I tried records in C# 9.0, and I was disappointed by some of the behavior.  Here is an "equivalent" fragment of C#.  Can you guess what it does?

using System;
using System.Collections.Generic;

public record Cartesian(float X, float Y)
{
    private float cachedR = 0F;
    public float R
    {
        get
        {
            float r = cachedR;
            if (r == 0F)
                r = cachedR = (float)Math.Sqrt(X * X + Y * Y);
            return r;
        }
    }
}

class Program
{
    static void Main()
    {
        var set = new HashSet<Cartesian>();
        var c1 = new Cartesian(5, 12);
        set.Add(c1);
        Console.WriteLine(c1.R);              // 1
        Console.WriteLine(set.Contains(c1));  // 2

        var c2 = new Cartesian(5, 12);
        Console.WriteLine(set.Contains(c2));  // 3

        // with expression not specified
        var c3 = c1 with { X = 9 };           // 4
        Console.WriteLine(c3.R);              // 5
    }
}

I had hoped that it would behave approximately the same as the Kotlin program (and the same as equivalent programs in Java or Scala).  But in C# all instance fields declared in a record (even private ones) are considered key members of the record.  Consequently,

  1. (in the line marked "// 1") Reading the property c1.R stores a value in cachedR, which changes the logical value of c1.
  2. Since c1 no longer has the value it had when it was added to the set (and so, in general, no longer has the same hash code), the modified record is no longer seen as a member of the set.
  3. Since the record contained in the set has been modified (records are reference types), it is no longer seen as equal to a fresh record which has not had its R property sampled.
  4. C#'s with expression produces a fresh instance with all of the fields copied, bypassing the primary constructor.
  5. Therefore c3's cachedR field starts out holding the value cached for c1, and c3.R returns 13 rather than the correct 15.
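
To make the five points above concrete, here is a rough sketch of what the record behaves like once the generated members are written out by hand. It is an approximation under stated assumptions, not the compiler's actual output: the names CartesianLowered, WithX, and Demo are hypothetical, and the real record also has an EqualityContract and a hidden clone method that are omitted here.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical hand-written class approximating the members generated for the
// record. CartesianLowered and WithX are illustrative names only.
public sealed class CartesianLowered : IEquatable<CartesianLowered>
{
    public float X { get; init; }
    public float Y { get; init; }
    private float cachedR = 0F;

    public CartesianLowered(float x, float y) { X = x; Y = y; }

    // Copy constructor: copies every instance field, including the cache.
    private CartesianLowered(CartesianLowered original)
    {
        X = original.X;
        Y = original.Y;
        cachedR = original.cachedR;
    }

    // Stand-in for `c with { X = x }`: clone first, then overwrite X.
    public CartesianLowered WithX(float x) => new CartesianLowered(this) { X = x };

    public float R
    {
        get
        {
            float r = cachedR;
            if (r == 0F)
                r = cachedR = (float)Math.Sqrt(X * X + Y * Y);
            return r;
        }
    }

    // Equality and hashing range over all instance fields, private ones included.
    public bool Equals(CartesianLowered? other) =>
        other is not null
        && X.Equals(other.X)
        && Y.Equals(other.Y)
        && cachedR.Equals(other.cachedR);

    public override bool Equals(object? obj) => Equals(obj as CartesianLowered);

    public override int GetHashCode() => HashCode.Combine(X, Y, cachedR);
}

public static class Demo
{
    public static void Main()
    {
        var set = new HashSet<CartesianLowered>();
        var c1 = new CartesianLowered(5, 12);
        set.Add(c1);
        Console.WriteLine(c1.R);                                       // 13; also fills c1's cache
        Console.WriteLine(set.Contains(new CartesianLowered(5, 12)));  // False: cachedR differs (13 vs 0)
        Console.WriteLine(c1.WithX(9).R);                              // 13: the stale cache was copied
    }
}
```

Seen this way, the cache field is simply one more key: it takes part in Equals and GetHashCode, and it is carried along whenever the object is cloned.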

Due to these issues, programmers would be wise to use C# records in only the simplest scenarios.

2 comments:

Barry Kelly said...

It looks like you can get around some of those limitations by providing your own implementations of methods that would otherwise be generated; e.g. equality and construction.
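
A minimal sketch of that workaround, assuming hand-written equality members and a hand-written copy constructor are acceptable. The record body is the one from the post; the three added members and the Program driver are illustrative, not a recommended pattern.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

public record Cartesian(float X, float Y)
{
    private float cachedR = 0F;

    public float R
    {
        get
        {
            float r = cachedR;
            if (r == 0F)
                r = cachedR = (float)Math.Sqrt(X * X + Y * Y);
            return r;
        }
    }

    // User-written copy constructor (used by `with`): copies only X and Y,
    // so the clone starts with an empty cache.
    protected Cartesian(Cartesian original)
    {
        X = original.X;
        Y = original.Y;
    }

    // User-written equality over X and Y only; cachedR is deliberately left out.
    public virtual bool Equals(Cartesian? other) =>
        other is not null
        && EqualityContract == other.EqualityContract
        && X.Equals(other.X)
        && Y.Equals(other.Y);

    // Must agree with Equals, so it also ignores cachedR.
    public override int GetHashCode() => HashCode.Combine(EqualityContract, X, Y);
}

class Program
{
    static void Main()
    {
        var set = new HashSet<Cartesian>();
        var c1 = new Cartesian(5, 12);
        set.Add(c1);
        Console.WriteLine(c1.R);                              // 13
        Console.WriteLine(set.Contains(c1));                  // True
        Console.WriteLine(set.Contains(new Cartesian(5, 12))); // True
        Console.WriteLine((c1 with { X = 9 }).R);             // 15
    }
}
```

Every field added to the record later has to be handled by hand in all three members, which is exactly the bookkeeping records were supposed to remove.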

Of course needing to do that yourself reduces the usefulness of the feature, and introduces the possibility of error that the feature is designed to avoid.

Seems like a future modifier or attribute could improve the feature? A bit like `transient` in Java serialization, some kind of modifier which can be applied to a field to indicate that it doesn't participate in object identity.

IMO mutable records are a bit of an edge case though. Once you've gone down the road to mutability, hidden state etc. you need to cope with data races in concurrent scenarios. I'm not surprised that the feature doesn't accommodate that; there's more to it than just nuanced object identity.

Neal Gafter said...

You make good points, Barry. My point is that I would have hoped the feature would not infect the semantics of members declared in the body (e.g. by requiring them to be declared with some new modifier). The implementation in this example already correctly deals with data races and presents an immutable public surface.