On Value and Reference Type Variables

Published: 2021-10-08
Discussion on: Acmion (on this website), or Reddit.

Most programming languages support value and reference types, but the specifics of their implementations differ.

The difference between these two kinds of variables is a source of confusion, especially for beginners, but it can also trip up seasoned programmers. For example:

# Value types
v0 = 0
v1 = v0

v0 += 1

# Note: different values.
print(v0) # => 1
print(v1) # => 0


# Reference types
r0 = Person("Jack")
r1 = r0

r0.name = "John"

# Note: same values.
print(r0) # => Person { name = "John" }
print(r1) # => Person { name = "John" }

Why does the behavior differ when we seemingly modified a variable in both cases? Well, in one case the variable was a value type and in the other a reference type: assigning a value type copies the value, whereas assigning a reference type copies only a reference to the same object. However, this is not a particularly good explanation for beginners. Additionally, how do we determine what is a reference type and what is not?
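In Python, for instance, the two cases can be reproduced as follows. This is a minimal sketch; the Person class is a hypothetical stand-in for the one used above. Note that Python has no separate value types: the integer case only behaves like one because int is immutable, so v0 += 1 rebinds the name v0 to a new object instead of changing the object that v1 still refers to.

```python
from dataclasses import dataclass


@dataclass
class Person:
    """Hypothetical stand-in for the Person type in the example above."""
    name: str


# "Value type" case: ints are immutable, so += rebinds v0 to a new object.
v0 = 0
v1 = v0
v0 += 1
print(v0, v1)  # 1 0

# "Reference type" case: r0 and r1 are two names for the same object.
r0 = Person("Jack")
r1 = r0
r0.name = "John"
print(r0.name, r1.name)  # John John
print(r0 is r1)  # True
```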

C# uses struct to declare value types and class to declare reference types. Primitives (int, float, char, etc.) are value types, but they can also be passed by reference to functions with the ref keyword. This approach is somewhat problematic, since both kinds are still initialized in the same manner:

var v = new ValueType(); // Was declared with struct
var r = new ReferenceType(); // Was declared with class
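A similar ambiguity shows up outside C#. As a rough analogy in Python (which has no struct, so this illustrates the idea rather than C# semantics), identical-looking statements behave differently depending on whether the type is immutable (tuple) or mutable (list):

```python
# Identical syntax at the use site; behavior depends on how the type is defined.
t0 = (1, 2)
t1 = t0
t0 += (3,)  # tuple is immutable: builds a new tuple and rebinds t0

l0 = [1, 2]
l1 = l0
l0 += [3]  # list is mutable: extends the shared list in place

print(t1)  # (1, 2)
print(l1)  # [1, 2, 3]
```

Nothing at the point of assignment or mutation reveals which behavior applies; one has to know how the type was defined.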

Some languages default to everything being a value type and use * to dereference a pointer and & to pass a reference (the same applies to keywords that accomplish the same thing, for example ref and deref). However, in my opinion, this is noisy and difficult to teach, and one must always remember to use the correct operator:

let v0 = new ValueType() // A value type
let v1 = new ValueType() // Still a value type

some_function0(v0) // A value type passed by value
some_function1(&v1) // A value type passed by reference

let v2 = &v1 // A reference to v1
let v3 = *v2 // A copy of v1 (dereference of v2, which was a reference to v1)

I am not satisfied with either approach, so I am searching for alternatives. Some thoughts that I had:

C# Style, but Without struct and class Semantics and With Different Initialization Keywords

var v = init ValueType() 
var r = new ReferenceType() 

void SomeFunction(ValueType v, ReferenceType r) 

Problems: The function declaration syntax is ambiguous with regard to value and reference semantics. One would probably have to rely on an IDE. And how would one pass a ValueType by reference? With Ref<ValueType>? Or with an extra ref keyword?
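One way to picture the Ref<ValueType> option is an explicit wrapper type that makes reference semantics visible in the function signature. The sketch below uses Python only for illustration; Ref, Counter, bump_copy and bump_shared are hypothetical names, and value semantics are emulated by copying, since Python itself always passes object references.

```python
import copy
from dataclasses import dataclass
from typing import Generic, TypeVar

T = TypeVar("T")


@dataclass
class Counter:
    """Hypothetical type standing in for ValueType."""
    count: int = 0


class Ref(Generic[T]):
    """Hypothetical explicit reference wrapper, analogous to Ref<ValueType>."""

    def __init__(self, target: T) -> None:
        self.target = target


def bump_copy(c: Counter) -> Counter:
    # Emulated pass-by-value: work on a private copy, leave the caller's object alone.
    local = copy.copy(c)
    local.count += 1
    return local


def bump_shared(c: Ref[Counter]) -> None:
    # The Ref[...] annotation signals that the caller's object is modified.
    c.target.count += 1


v = Counter()
bumped = bump_copy(v)
print(v.count, bumped.count)  # 0 1

bump_shared(Ref(v))
print(v.count)  # 1
```

With such a wrapper the signature alone tells the reader which semantics apply, at the cost of extra noise at every call site.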

Always default to a reference type

This would be similar to mathematics, where x = y means that if x changes so does y.

Problems: One would again need an extra keyword for the inevitable dereferencing.
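Python is close to this model: assignment never copies, so a copy has to be requested explicitly. In the sketch below, copy.deepcopy plays the role of the extra keyword (the variable names are illustrative only).

```python
import copy

grid = [[0, 0], [0, 0]]

alias = grid                # another name for the same object
shallow = copy.copy(grid)   # new outer list, but the rows are still shared
deep = copy.deepcopy(grid)  # fully independent copy

grid[0][0] = 1

print(alias[0][0])    # 1
print(shallow[0][0])  # 1
print(deep[0][0])     # 0
```

The shallow copy shows that even "make a copy" is not a single operation once references are nested, which is one reason a lone dereferencing keyword may not be enough.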

Different introducing keywords

# Note the difference between val (= value type) and var (= reference type)

val v0 = new SomeType(0) 
var r0 = new SomeType(0) 

val v1 = v0 # A copy of the value of v0 at this point
val v2 = r0 # A copy of the value of r0 at this point
var r1 = v0 # A reference to v0
var r2 = r0 # A reference to r0

print(v0) # => 0
print(v1) # => 0
print(v2) # => 0
print(r0) # => 0
print(r1) # => 0
print(r2) # => 0

v0.inc()

print(v0) # => 1
print(v1) # => 0
print(v2) # => 0
print(r0) # => 0
print(r1) # => 1
print(r2) # => 0

r0.inc()

print(v0) # => 1
print(v1) # => 0
print(v2) # => 0
print(r0) # => 1
print(r1) # => 1
print(r2) # => 1

# Function declaration would not care about the semantics and by utilizing
# multiple dispatch, one would compile overloads with the correct value and
# reference type semantics.
def some_function(a0: SomeType, a1: SomeType) 

# Optionally one can define the value and reference type semantics.
def some_function(var a0: SomeType, val a1: SomeType)

Problems: What about immutables? One could define keywords like vari and vali. Performance could suffer, although the compiler could probably fix most issues. One could accidentally copy large constructs, for example big arrays. How should references to references work? Just a reference to the base variable or an actual reference to the referenced variable? What about the highly likely case of choosing the wrong variable type? One would probably once again need to define explicit ref and deref keywords, but they would most likely not have to be used too often.
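The expected outputs in the val/var example above can be checked with a small emulation. This is only a sketch in Python, not the proposed language: the hypothetical helpers val and var stand in for the keywords (val makes a copy, var shares the object), and SomeType is assumed to be a simple counter with an inc method.

```python
import copy


class SomeType:
    """Assumed shape of SomeType: a counter with an inc() method."""

    def __init__(self, value: int) -> None:
        self.value = value

    def inc(self) -> None:
        self.value += 1

    def __repr__(self) -> str:
        return str(self.value)


def val(source: SomeType) -> SomeType:
    """Emulates `val x = source`: a copy of the value at this point."""
    return copy.deepcopy(source)


def var(source: SomeType) -> SomeType:
    """Emulates `var x = source`: a reference to the same object."""
    return source


v0 = val(SomeType(0))
r0 = var(SomeType(0))
v1, v2 = val(v0), val(r0)
r1, r2 = var(v0), var(r0)

v0.inc()
print(v0, v1, v2, r0, r1, r2)  # 1 0 0 0 1 0

r0.inc()
print(v0, v1, v2, r0, r1, r2)  # 1 0 0 1 1 1
```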
