I was writing a simple class today that needed to check to see if a value was in a list.
As the list was about 2,000 strings, I wondered what the most performance efficient approach was. Normally I create a List<String> and call the .Contains() method to find out if the value I’m checking is in there – but this list is going to be hit millions of times. So getting it wrong could have performance implications.
I did a quick Google search and HashSet<T> seems like a good candidate. So I wrote a very simple console app that would create a list of strings, and then store the list in either a List<String> or a HashSet<String>.
Initially I set my list size as 2000 to model my application, and generated 10,000 calls to the list with a random element. I set up the lists so that about 10% of the random numbers should hit the list, and 90% not be in.
The first run was so quick that I was not sure about the results, so I reran with 20,000 numbers in the lists and 100,000 test hits:
|Method||Time to run:||Hits/misses|
|List<String>||24.68 seconds||9921 / 90079|
|HashSet<String>||0.0257 seconds||9719 / 90281|
I hadn’t expected such a clear cut difference: a HashSet<T> will be about a thousand times faster than List<T>