Thursday, November 23, 2006

Why no Equals() and GetHashCode() in System.Collections?

Today I had another brain freeze while trying to understand why two array lists generate two different hash codes even when they contain the same data. When I started investigating in MSDN, I saw that ArrayList overrides neither GetHashCode() nor Equals(). Ehhhhh?

So my question is: how can I check if two lists are equal? NO, I don't want to go through all the items myself. This should be handled by the framework classes, and I don't want to write any unnecessary implementation.

In Java, ArrayList implements equals() and hashCode(), as all good objects should. There is also a good explanation of how it is done here.

If I look through the other classes in System.Collections, I see that none of them overrides GetHashCode() or Equals(). If none of them do, then it must be a design choice. But what design choice would that be? If anyone has an explanation, please leave a comment on this blog. I really want to know.

There is an interface in System.Collections.Generic named IEqualityComparer<T>, which has two methods, Equals(x, y) and GetHashCode(obj). Why make an interface when every class already inherits Equals() and GetHashCode() from Object? And why don't the default list/collection classes implement this interface?
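For what it is worth, here is a minimal sketch of how that interface seems meant to be used: the comparison lives in a separate comparer object rather than in the collection itself. ListComparer<T> is a hypothetical name made up for this example, not a framework class, and the hash formula is just one common choice.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the framework: compares two lists item by item.
public sealed class ListComparer<T> : IEqualityComparer<IList<T>>
{
    public bool Equals(IList<T> x, IList<T> y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null || x.Count != y.Count) return false;

        // Items are compared with their own Equals(), so strings compare by value.
        EqualityComparer<T> items = EqualityComparer<T>.Default;
        for (int i = 0; i < x.Count; i++)
        {
            if (!items.Equals(x[i], y[i])) return false;
        }
        return true;
    }

    public int GetHashCode(IList<T> list)
    {
        if (list == null) return 0;

        // Order-sensitive combination of the item hash codes.
        int hash = 17;
        foreach (T item in list)
        {
            hash = unchecked(hash * 31 + (item == null ? 0 : item.GetHashCode()));
        }
        return hash;
    }
}

public static class ComparerDemo
{
    public static void Main()
    {
        var list1 = new List<string> { "A", "B" };
        var list2 = new List<string> { "A", "B" };
        var comparer = new ListComparer<string>();

        Console.WriteLine(list1.Equals(list2));            // False: reference equality
        Console.WriteLine(comparer.Equals(list1, list2));  // True: same items, same order

        // A dictionary given the comparer treats lists with equal content as the same key.
        var lookup = new Dictionary<IList<string>, int>(comparer);
        lookup[list1] = 1;
        Console.WriteLine(lookup.ContainsKey(list2));      // True
    }
}
```

So the interface lets the caller choose what "equal" means for a given dictionary or lookup, while the list classes keep plain reference equality.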

I wrote a little test application to try this out, because I cannot believe it. Full listing here.
Pseudo code:

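string a = "A";
string b = "B";
ArrayList list1 = new ArrayList();
ArrayList list2 = new ArrayList();
// Fill both lists with the same items, in the same order.
list1.Add(a); list1.Add(b);
list2.Add(a); list2.Add(b);

// Compare two separate lists, a list with its clone, and the two hash codes.
Console.WriteLine("list1.Equals(list2) ? " + list1.Equals(list2));
Console.WriteLine("list1.Equals(list1.Clone()) ? " + list1.Equals(list1.Clone()));
Console.WriteLine("list1.hash == list2.hash ? " + (list1.GetHashCode() == list2.GetHashCode()));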

This program prints:
list1.Equals(list2) ? False
list1.Equals(list1.Clone()) ? False
list1.hash == list2.hash ? False

If I create two array lists that contain the same items, the Equals() method returns FALSE. If I clone an array list and check whether the original and the clone are equal, the Equals() method returns FALSE. How can a CLONE not be an exact copy of the object, or how can two exact copies not be equal to each other?

The best part is that Hashtable doesn't override GetHashCode() either. That class requires every object used as a key to implement the method, but it refuses to do so itself. "Do what I tell you, not what I do."

Is it because they know people will not implement those methods, since developers are lazy? But why put them into the Object class if they can't be trusted? Or was it a copy-paste mistake from Java?
To me this is so wrong.


Tom Hawtin said...

The .Net way seems reasonable to me. Attempting to define value equality on objects with non-value semantics is dodgy.

redsolo said...

I agree that not all classes should implement Equals(). But lists can (and most of the time do) contain value objects, so why shouldn't it be possible to compare a sequence of value objects?

jeremiah said...

Perhaps for some reason they WANT you to do basic object equality when you use Equals(). I can't fathom WHY though.

Maybe I've been dipped in Java too many times, but to me, when you want to check if the values of any instances of the same class are equal, you need to override Equals() and GetHashCode(). Not overriding those means you merely check for object reference equality.

I can't understand right away why that would be the default behavior for Collections in C#.

Anonymous said...

Well, I too think it is very strange indeed, but I can't offer any explanation.

All I can add is that IMO .Net didn't do such a good job with serialization either. Last time I looked there was no equivalent to writeReplace or readResolve - maybe that's changed now though.

Cheers, Rob.

RichB said...

Better late than never...

Enumerable.SequenceEqual() is your friend.

Pavel said...

Just in case someone is looking for a solution:

Enumerable.SequenceEqual(collection1, collection2)
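A minimal sketch of that call applied to the ArrayList case from the post, assuming .NET 3.5 or later (where System.Linq is available). ArrayList is non-generic, so Cast<object>() is used to adapt it; SameItems is a hypothetical helper name made up for this example.

```csharp
using System;
using System.Collections;
using System.Linq;

public static class SequenceEqualDemo
{
    // Hypothetical helper: true when both lists hold equal items in the same order.
    public static bool SameItems(ArrayList left, ArrayList right)
    {
        // Cast<object>() turns the non-generic ArrayList into IEnumerable<object>.
        return left.Cast<object>().SequenceEqual(right.Cast<object>());
    }

    public static void Main()
    {
        var list1 = new ArrayList { "A", "B" };
        var list2 = new ArrayList { "A", "B" };
        var reversed = new ArrayList { "B", "A" };

        Console.WriteLine(list1.Equals(list2));                         // False
        Console.WriteLine(SameItems(list1, list2));                     // True
        Console.WriteLine(SameItems(list1, (ArrayList)list1.Clone()));  // True
        Console.WriteLine(SameItems(list1, reversed));                  // False
    }
}
```

The items are compared with their own Equals(), so this only gives value equality when the elements themselves define it, as strings do.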

