Avoid returning null in your code
If you have read the book “Clean Code: A Handbook of Agile Software Craftsmanship” by Robert C. Martin, you know how things can go wrong when you return null from a method. This time I am going to look into some possible ways to avoid returning null in your code.
In general, people end up with a solution like the following:
public class Modelo
{
    public Persona GetPersonaByName(string nombre)
    {
        Persona persona = new Persona();
        if (nombre == "pepe")
        {
            persona = new Persona { Nombre = nombre, Edad = 14 };
        }
        return persona;
    }
}

public class Persona
{
    public string Nombre { get; set; }
    public int Edad { get; set; }

    public Persona()
    {
        Nombre = string.Empty;
        Edad = 0;
    }
}
I do not like it for several reasons. The main one is that it “hides” the cause, returning a person “without data” when the person is not there. It establishes a convention that nobody else knows: that a person with an empty name “does not exist” in reality. The code will end up filled with ifs that check whether the name is empty before doing anything. With a null, forgetting the if produces a runtime error, which is at least visible. Here, forgetting the if makes your code silently misbehave, and that can be much harder to detect than a null reference, which at least comes with a stack trace. Another problem is that not every type has a spare value in its range that can serve as an “indicator that there is no data”.
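To make the risk concrete, here is a minimal sketch of a caller that forgets the empty-name check. The `SentinelDemo` class and its `Greeting` method are hypothetical names invented for this illustration; only `Persona` and the lookup logic mirror the code above.

```csharp
using System;

public class Persona
{
    public string Nombre { get; set; }
    public int Edad { get; set; }

    public Persona()
    {
        Nombre = string.Empty;
        Edad = 0;
    }
}

// Hypothetical demo class, not part of the original post.
public static class SentinelDemo
{
    // Same convention as above: unknown names yield an "empty" Persona.
    public static Persona GetPersonaByName(string nombre)
    {
        return nombre == "pepe"
            ? new Persona { Nombre = nombre, Edad = 14 }
            : new Persona();
    }

    // The caller forgot to check for the empty name. Nothing throws:
    // the method simply builds a greeting for nobody.
    public static string Greeting(string nombre)
    {
        Persona persona = GetPersonaByName(nombre);
        return "Hello, " + persona.Nombre + "!";
    }
}
```

Calling `SentinelDemo.Greeting("juan")` produces a greeting with no name in it and no exception, which is exactly the kind of quiet misbehavior described above.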
There are several patterns to deal with those cases, the best known being NullObject. In fact, a “badly done” NullObject is what Juan proposes in his post. I am not very fond of NullObject, although I have used it sometimes. The last time was recently, in a refactoring where a whole feature had to be “deactivated”: we did it by creating a NullObject for the object that was being used, so that the impact on the rest of the code (about 90 VS projects) was nil.
I do not want to talk about NullObject here, though, but about another solution: a type that contains the value plus an indicator of whether the value exists or not. It is the equivalent of Nullable<T> but for any type (yes, even those that can be null). At first it seems that we gain nothing, but allow me to explain and you will see the advantages it brings.
The initial version of our class would be:
public struct Maybe<T>
{
    private readonly T _value;
    private readonly bool _isEmpty;
    private readonly bool _initialized;

    public T Value
    {
        get { return _value; }
    }

    public bool IsEmpty
    {
        get { return (!_initialized) || _isEmpty; }
    }

    public Maybe(T value)
    {
        _value = value;
        _isEmpty = ((object)value) == null;
        _initialized = true;
    }

    public static Maybe<T> Empty()
    {
        return new Maybe<T>();
    }
}
An important point is that it is not a class but a struct. Structs cannot be null, so a Maybe<T> itself can never be a null reference (remember that we want to avoid nulls).
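As a small sketch of why the struct choice matters: a Maybe<T> that was never explicitly constructed is still a usable, empty value. The `StructDemo` class is a hypothetical name added for this illustration; the struct is the one shown above.

```csharp
using System;

public struct Maybe<T>
{
    private readonly T _value;
    private readonly bool _isEmpty;
    private readonly bool _initialized;

    public T Value { get { return _value; } }

    // A default-constructed struct has _initialized == false, so it reports empty.
    public bool IsEmpty { get { return (!_initialized) || _isEmpty; } }

    public Maybe(T value)
    {
        _value = value;
        _isEmpty = ((object)value) == null;
        _initialized = true;
    }

    public static Maybe<T> Empty() { return new Maybe<T>(); }
}

// Hypothetical demo class, not part of the original post.
public static class StructDemo
{
    // Array elements of a struct type start out as default(Maybe<string>),
    // never as null references.
    public static bool AllSlotsEmpty()
    {
        var slots = new Maybe<string>[3];
        foreach (var slot in slots)
        {
            if (!slot.IsEmpty) return false;
        }
        return true;
    }

    public static bool DefaultIsEmpty()
    {
        return default(Maybe<string>).IsEmpty;
    }
}
```

This is also the reason for the `_initialized` flag: without it, a default instance would have `_isEmpty == false` and would wrongly claim to hold a value.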
OK, this structure, as it is, does not give us anything useful yet. The model method presented by Juan would now look like this:
public Maybe<Persona> GetPersonaByName(string nombre)
{
    if (nombre == "pepe")
    {
        Persona persona = new Persona { Nombre = nombre, Edad = 14 };
        return new Maybe<Persona>(persona);
    }
    return Maybe<Persona>.Empty();
}
Either we return a Maybe filled with the person or we return an empty Maybe. Here is a test:
[TestMethod]
public void GetPersonaByName_con_null_devuelve_string_empty()
{
    var modelo = new Modelo();
    var persona = modelo.GetPersonaByName(null);
    Assert.AreEqual(string.Empty, persona.Value.Nombre);
}
This test fails and the reason is obvious: persona.Value is null, so persona.Value.Nombre throws a NullReferenceException. I could add an if in the code to check whether persona.IsEmpty is true and, in that case, do nothing. Personally, I prefer an if (persona.IsEmpty) a thousand times over an if (persona.Nombre == "") because the first one makes the intent very clear. But clearly we have not gained much. As I said, this structure hardly contributes anything so far.
The real benefit comes from preparing this structure so that it can be used like a monad. Sorry, I am unable to find simple words to define what a monad is because the concept is very deep, so I leave you with the Wikipedia link: http://en.wikipedia.org/wiki/Monad_(functional_programming)
Now we are going to prepare our structure so that it can be used as a monad:
public struct Maybe<T>
{
    private readonly T _value;
    private readonly bool _isEmpty;
    private readonly bool _initialized;

    public T Value
    {
        get { return _value; }
    }

    public bool IsEmpty
    {
        get { return (!_initialized) || _isEmpty; }
    }

    public Maybe(T value)
    {
        _value = value;
        _isEmpty = ((object)value) == null;
        _initialized = true;
    }

    public static Maybe<T> Empty()
    {
        return new Maybe<T>();
    }

    // Runs the action only when there is a value.
    public void Do(Action<T> action)
    {
        if (!IsEmpty) action(Value);
    }

    // Runs the action when there is a value, elseAction otherwise.
    public void Do(Action<T> action, Action elseAction)
    {
        if (!IsEmpty)
        {
            action(Value);
        }
        else
        {
            elseAction();
        }
    }

    public TR Do<TR>(Func<T, TR> action)
    {
        return Do(action, default(TR));
    }

    // Returns the result of the function, or defaultValue when empty.
    public TR Do<TR>(Func<T, TR> action, TR defaultValue)
    {
        return IsEmpty ? defaultValue : action(Value);
    }

    // Transforms the value and wraps the result in a new Maybe.
    public Maybe<TR> Apply<TR>(Func<T, TR> action)
    {
        return IsEmpty ? Maybe<TR>.Empty() : new Maybe<TR>(action(Value));
    }
}
I have added two groups of methods:
- The Do methods, to do something only if the Maybe has a value.
- The Apply method, to chain Maybes. This is the most powerful one, and we will see it later.
Let’s start with the Do methods. They basically allow us to avoid the if (); they are little more than a convenience provided by the structure. My test would now be as follows:
[TestMethod]
public void GetPersonaByName_con_null_devuelve_string_empty()
{
    var modelo = new Modelo();
    var persona = modelo.GetPersonaByName(null);
    var name = string.Empty;
    persona.Do(p => name = p.Nombre);
    Assert.AreEqual(string.Empty, name);
}
The code passed to Do is executed only if there is a value, that is, if a person has been returned.
We could rewrite the test using another of the Do variants:
[TestMethod]
public void GetPersonaByName_con_null_devuelve_string_empty()
{
    var modelo = new Modelo();
    var persona = modelo.GetPersonaByName(null);
    var name = persona.Do(p => p.Nombre, "no_name");
    Assert.AreEqual("no_name", name);
}
There is not much more to say about the Do methods … because the most interesting method is Apply 😉
The Apply method allows me to chain Maybes. To see its potential, I will change the Modelo method:
public Maybe<Persona> GetPersonaByName(string nombre)
{
    if (nombre == "pepe")
    {
        Persona persona = new Persona { Nombre = null, Edad = 42 };
        return new Maybe<Persona>(persona);
    }
    return Maybe<Persona>.Empty();
}
Now if I pass “pepe” it returns a Persona, but with Nombre set to null. Handling those cases with ifs becomes complex and verbose. Apply comes to our aid:
[TestMethod]
public void Acceder_a_nombre_null_no_da_probleamas()
{
    var modelo = new Modelo();
    var persona = modelo.GetPersonaByName("pepe");
    // At this point we have a filled Maybe, but Value.Nombre is null
    var nombreToUpper = string.Empty;
    nombreToUpper = persona.Apply(p => p.Nombre).Do(s => s.ToUpper(), "NO_NAME");
    Assert.AreEqual("NO_NAME", nombreToUpper);
}
The persona variable is a Maybe<Persona> with a value. The Apply method basically performs a transformation on the value (the Persona object) and returns a Maybe with the result. In this case we transform the persona object into p.Nombre, so the value returned by Apply is a Maybe<string>. And since p.Nombre was null, that Maybe is empty.
The combination of Apply and Do allows dealing with null values in a very simple and elegant way.
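The benefit grows with the depth of the chain. The following is a minimal sketch, assuming a hypothetical `Direccion` class with a `Ciudad` property that is not part of the original model, and a trimmed-down Maybe<T> that keeps only the members needed here.

```csharp
using System;

// Trimmed-down version of the struct above, for illustration only.
public struct Maybe<T>
{
    private readonly T _value;
    private readonly bool _hasValue;

    public Maybe(T value)
    {
        _value = value;
        _hasValue = ((object)value) != null;
    }

    public bool IsEmpty { get { return !_hasValue; } }

    public static Maybe<T> Empty() { return new Maybe<T>(); }

    public TR Do<TR>(Func<T, TR> action, TR defaultValue)
    {
        return IsEmpty ? defaultValue : action(_value);
    }

    public Maybe<TR> Apply<TR>(Func<T, TR> action)
    {
        return IsEmpty ? Maybe<TR>.Empty() : new Maybe<TR>(action(_value));
    }
}

// Hypothetical types: Direccion and its Ciudad property are assumptions.
public class Direccion
{
    public string Ciudad { get; set; }
}

public class Persona
{
    public string Nombre { get; set; }
    public Direccion Direccion { get; set; }
}

public static class ApplyDemo
{
    // Any null along the way (persona, Direccion or Ciudad) empties the chain,
    // and the final Do falls back to the default value.
    public static string CiudadEnMayusculas(Maybe<Persona> persona)
    {
        return persona
            .Apply(p => p.Direccion)
            .Apply(d => d.Ciudad)
            .Do(c => c.ToUpperInvariant(), "NO_CITY");
    }
}
```

The equivalent code with plain references would need one null check per level; here each Apply step is skipped automatically once the chain is empty.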
If I ask you what functional capabilities C# has, many will surely answer LINQ. So why not let our Maybe<T> take part in the LINQ game? Luckily that is very simple. It is enough for Maybe<T> to implement IEnumerable<T> by adding these two methods:
// Requires: using System.Collections; using System.Collections.Generic;
// and the declaration: public struct Maybe<T> : IEnumerable<T>
public IEnumerator<T> GetEnumerator()
{
    if (IsEmpty) yield break;
    yield return _value;
}

IEnumerator IEnumerable.GetEnumerator()
{
    return GetEnumerator();
}
Basically, a filled Maybe<T> behaves like a collection with a single element of type T, while an empty Maybe<T> behaves like an empty collection. From here we have all the power of LINQ to perform transformations, queries, unions, etc. on our Maybe<T>, together with other Maybe<T> values or any other collection. For example, we could have the following code:
[TestMethod]
public void Comprobar_Que_Nombre_es_Null()
{
    var modelo = new Modelo();
    var persona = modelo.GetPersonaByName("pepe");
    var tiene_nombre = persona.Apply(p => p.Nombre).Any();
    Assert.IsFalse(tiene_nombre);
}
And of course, we can iterate with foreach over the elements of a Maybe<T> 🙂
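As a sketch of what this enables, the following flattens a sequence of Maybe<string> values with SelectMany and counts elements with foreach. The `LinqDemo` class and its method names are hypothetical, and the struct is reduced to the members needed for enumeration.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Reduced version of the struct above, keeping only what enumeration needs.
public struct Maybe<T> : IEnumerable<T>
{
    private readonly T _value;
    private readonly bool _hasValue;

    public Maybe(T value)
    {
        _value = value;
        _hasValue = ((object)value) != null;
    }

    public bool IsEmpty { get { return !_hasValue; } }

    public static Maybe<T> Empty() { return new Maybe<T>(); }

    // Yields the single value, or nothing at all when empty.
    public IEnumerator<T> GetEnumerator()
    {
        if (IsEmpty) yield break;
        yield return _value;
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

// Hypothetical demo class, not part of the original post.
public static class LinqDemo
{
    // SelectMany treats each Maybe as a 0-or-1 element sequence,
    // so empty ones simply disappear from the result.
    public static List<string> NombresPresentes(IEnumerable<Maybe<string>> candidatos)
    {
        return candidatos.SelectMany(m => m).ToList();
    }

    // foreach runs its body once for a filled Maybe and never for an empty one.
    public static int ContarElementos(Maybe<string> maybe)
    {
        int total = 0;
        foreach (var elemento in maybe)
        {
            total++;
        }
        return total;
    }
}
```

One design consequence worth noting: because the struct is used through an interface here, LINQ calls box it, which is usually irrelevant but may matter in hot paths.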
And finally, we are going to add a little more infrastructure to the Maybe<T> struct, in particular support for comparison:
public static bool operator ==(Maybe<T> one, Maybe<T> two)
{
    // An empty Maybe only equals another empty Maybe. Without this check,
    // Maybe<int>.Empty() would compare equal to new Maybe<int>(0).
    if (one.IsEmpty || two.IsEmpty) return one.IsEmpty && two.IsEmpty;
    return typeof(T).IsValueType
        ? EqualityComparer<T>.Default.Equals(one._value, two._value)
        : object.ReferenceEquals(one._value, two._value);
}

public static bool operator !=(Maybe<T> one, Maybe<T> two)
{
    return !(one == two);
}

public bool Equals(Maybe<T> other)
{
    // Compare IsEmpty (not _isEmpty) so that a default instance
    // and new Maybe<T>(null) are treated as the same empty value.
    return IsEmpty == other.IsEmpty
        && EqualityComparer<T>.Default.Equals(_value, other._value);
}

public override bool Equals(object obj)
{
    if (ReferenceEquals(null, obj)) return false;
    return obj is Maybe<T> && Equals((Maybe<T>)obj);
}

public override int GetHashCode()
{
    unchecked
    {
        return (IsEmpty.GetHashCode() * 397)
            ^ EqualityComparer<T>.Default.GetHashCode(_value);
    }
}
Maybe<T> tries to replicate the comparison behavior of T. That is:
- Two Maybe<T> are “Equals” if the Ts of both Maybes are “Equals”
- One Maybe<T> == another Maybe<T> if:
  - Both Ts are the same object (in the case of reference types)
  - Both Ts are “Equals” (in the case of value types)
E.g. the following test validates the behavior of ==:
var i = 10;
var i2 = 10;
var p = new Persona();
var p2 = new Persona();
Assert.IsTrue(new Maybe<int>(i) == new Maybe<int>(i2));
Assert.IsFalse(new Maybe<Persona>(p) == new Maybe<Persona>(p2));
And finally we add support for the implicit conversion from T to Maybe<T>:
public static implicit operator Maybe<T>(T from)
{
    return new Maybe<T>(from);
}
This conversion allows us to simplify the functions that should return a Maybe<T>. So now the function of the model can be:
public Maybe<Persona> GetPersonaByName(string nombre)
{
    return nombre == "pepe" ? new Persona { Nombre = null, Edad = 42 } : null;
}
Notice that the GetPersonaByName function still returns a Maybe<Persona>, but the code reads as if it were returning a Persona. The null is implicitly converted into an empty Maybe<Persona>.
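The same conversion also applies at call sites, not only in return statements. Below is a minimal sketch under that assumption; `ConversionDemo`, `FindNickname` and `Describe` are hypothetical names, and the struct is reduced to the members needed here.

```csharp
using System;

// Reduced version of the struct above, plus the implicit conversion.
public struct Maybe<T>
{
    private readonly T _value;
    private readonly bool _hasValue;

    public Maybe(T value)
    {
        _value = value;
        _hasValue = ((object)value) != null;
    }

    public T Value { get { return _value; } }

    public bool IsEmpty { get { return !_hasValue; } }

    // Lets any T (including null for reference types) be used where a Maybe<T> is expected.
    public static implicit operator Maybe<T>(T from)
    {
        return new Maybe<T>(from);
    }
}

// Hypothetical demo class, not part of the original post.
public static class ConversionDemo
{
    // The conditional expression has type string; the conversion to
    // Maybe<string> happens on return, so null becomes an empty Maybe.
    public static Maybe<string> FindNickname(string nombre)
    {
        return nombre == "jose" ? "pepe" : null;
    }

    // Accepts a Maybe<string>; callers may pass a plain string instead.
    public static string Describe(Maybe<string> nickname)
    {
        return nickname.IsEmpty ? "no nickname" : "nickname: " + nickname.Value;
    }
}
```

A call such as `ConversionDemo.Describe("pepe")` compiles because the string argument is wrapped on the fly. The trade-off is that the wrapping becomes invisible at the call site, so readers need to know the parameter type to see that a Maybe is involved.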
Well … with that I finish the post. I hope you found it interesting and that you have seen other possible ways to deal with those pesky null references.
Regards!