adams.co.tt

Consider overriding Equals and Hashcode as a general rule

15th June 2009

In both C# and Java, objects have methods for checking equality and producing hash codes. For the purposes of this post I’ll mostly refer to the Java methods equals and hashCode, but C# has the equivalent methods Equals and GetHashCode respectively.

If you have two objects and you wish to test their equality, the equals method is the obvious choice. The default implementation in Java, inherited from Object, simply checks whether the two references are the same, i.e. with the default implementation, this:

objA.equals(objB)

is the same as this:

objA == objB

However, value objects such as String have a different implementation of equals:

    public boolean equals(Object anObject) {
        if (this == anObject) {
            return true;
        }
        if (anObject instanceof String) {
            String anotherString = (String)anObject;
            int n = count;
            if (n == anotherString.count) {
                char v1[] = value;
                char v2[] = anotherString.value;
                int i = offset;
                int j = anotherString.offset;
                while (n-- != 0) {
                    if (v1[i++] != v2[j++])
                        return false;
                }
                return true;
            }
        }
        return false;
    }


This implementation actually compares the contents of the string that is passed to it, and returns true only if both the type and the value match.

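Seeing this implementation, I postulate the following: all objects that represent a value should override equals and, by extension, hashCode. (The contract of hashCode requires that two objects which are equal according to equals return the same hash code, so the two methods have to be overridden together.)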
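As a rough sketch of what that might look like for a small value class: the Money class and its fields below are hypothetical, invented purely for illustration, and the java.util.Objects helper it leans on only arrived with Java 7, so treat this as one reasonable shape and not the only way to write it.

```java
import java.util.Objects;

// Hypothetical value class, used only for illustration.
final class Money {
    private final String currency;
    private final long amountInCents;

    Money(String currency, long amountInCents) {
        this.currency = currency;
        this.amountInCents = amountInCents;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;                  // same instance
        }
        if (!(other instanceof Money)) {
            return false;                 // null, or a different type
        }
        Money that = (Money) other;
        return amountInCents == that.amountInCents
                && Objects.equals(currency, that.currency);
    }

    @Override
    public int hashCode() {
        // Built from the same fields as equals, so equal objects hash alike.
        return Objects.hash(currency, amountInCents);
    }
}
```

With this in place, two separately constructed Money instances with the same currency and amount are equal and share a hash code, even though == still reports them as different references.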

You might argue that this is pointless if you aren’t performing comparisons between objects of this type. What is the point of writing and maintaining code that never gets executed anywhere in your application? Well, there are a few reasons.

Firstly, utilities such as Java’s Collections classes use equals and hashCode to manage their contents and to spot duplicates. Since the default behaviour is to assume that different instances are not equal, it is possible, for example, to have multiple instances of the same value object in the key set of a HashMap. Bugs like this are subtle and can prove difficult to spot. They are often missed during unit testing, because the scenario where multiple instances carry the same value may not have been considered.
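To make the duplicate problem concrete, here is a minimal sketch using a HashSet. PlainPoint and DuplicateKeyDemo are hypothetical names made up for this example; the important detail is that PlainPoint deliberately does not override equals or hashCode.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical value class that does NOT override equals or hashCode.
final class PlainPoint {
    final int x;
    final int y;

    PlainPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

final class DuplicateKeyDemo {
    // Adds two separate instances carrying the same value and reports the set size.
    static int sizeAfterAddingSameValueTwice() {
        Set<PlainPoint> points = new HashSet<>();
        points.add(new PlainPoint(1, 2));
        points.add(new PlainPoint(1, 2)); // same value, different instance
        return points.size();
    }
}
```

Because the inherited equals only compares references, the set treats the two points as unrelated and ends up holding both of them. The same thing happens to the keys of a HashMap.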

Another reason to override equals properly as a matter of routine is that it aids testability. If you are using matchers for testing, the following line will be a common one in your tests:

assertThat(objA, equalTo(objB));

If you are using the default implementation of equals, then what that line does is check that objA and objB are the same instance. In other words, if you have not overridden equals, the above assertion is equivalent to the following:

assertThat(objA, same(objB));

This is clearly not what was meant. The intent expressed by equalTo and same is quite definitely different.

A third reason to implement equals and hashCode on your classes is that if you are producing code to be used by another group of individuals, or any sort of API, you have no idea how those individuals intend to use your objects. Even if you aren’t putting them in collections, they quite easily might.

However, as I alluded to earlier, there are reasons not to implement equals and hashCode on classes. The first is that this is a lot of code to implement, test and maintain across many classes. Having said this, I would argue that implementing them is still the right thing to do. There are a few cases where implementing these methods really doesn’t make sense. One is static utility classes, which never have objects instantiated. Another might be objects where the instance itself can be considered the value, for example a node in a representation of a tree.
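The tree case might look something like the sketch below. TreeNode and its label field are hypothetical, chosen only to illustrate the point: two nodes can carry the same label and still be different positions in the tree, so the reference equality inherited from Object is arguably the behaviour you want.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tree node; equals and hashCode are deliberately inherited from Object.
final class TreeNode {
    final String label;
    final List<TreeNode> children = new ArrayList<>();

    TreeNode(String label) {
        this.label = label;
    }

    // Creates a new child with the given label, attaches it and returns it.
    TreeNode addChild(String childLabel) {
        TreeNode child = new TreeNode(childLabel);
        children.add(child);
        return child;
    }
}
```

If a root node is given two children that are both labelled "leaf", the children are not equal to each other, and a lookup such as root.children.indexOf(secondChild) finds the second child and not the first one with the matching label.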

tags: java, post