Codementor Events

.NET: Do not return `.ToList()`!

Published Aug 02, 2019Last updated Sep 17, 2019
.NET: Do not return `.ToList()`!

Check out the original version of this article on my Developer Blog.

This is a pattern I see emerging from many developers, especially after hearing about the principle of Accept the weakest, return the strongest.

Returning something.ToList();

public class SomeClassHoldingData 
{
    private readonly List<SomeDataObject> _rawData;
    // imagine ctor that fills _rawData
    
    public IList<SomeDataObject> GetFilteredDataBad() => _rawData.Where(DataFilter).ToList();

    public static bool DataFilter(SomeDataObject obj){ /* Some filtering */ }
}

The reasoning I hear is that why return only IEnumerable when they can return I(ReadOnly)List, and that is a lot more useful for callers. Which seems true, doesn't it?

But, if you take a closer look, doing this provides no further utility for the caller compared to returning IEnumerable. The caller can call .ToList() on your returned object as well if they need to, but here you are forcing them to do so even when completely unnecessary.

This can cause considerable performance bottlenecks and problems with the semantics. Let me show you why.

Let's add a fixed method: public IList<SomeDataObject> GetFilteredData() => _rawData.Where(DataFilter); and let's run a thought experiment with a few usage scenarios with an assumed raw data size of 1000:

Usage 1 - The caller needs the full capability of a list (let's say it needs to modify it).

Old method:

IList<SomeDataObject> filteredList = myHolder.GetFilteredDataBad();
filteredList.Add(...);
  • Allocates a new List (takes time both initially then later for GC)?: Yes
  • Number of DataFilter invocations: 1000

New method:

List<SomeDataObject> filteredList = myHolder.GetFilteredData().ToList();
filteredList.Add(...);
  • Allocates a new List (takes time both initially then later for GC)?: Yes
  • Number of DataFilter invocations: 1000

Performance is the same, but we still have two major advantages:

  • The caller has the full power of List available (with AddRange, Sort, etc.). You could "fix" this by returning List instead of IList but you really should not do that.
  • The question of data ownership is a lot clearer. Using the old method it kind of looks like myHolder owns the data. I have to question myself: "Can I really add to that collection? Will that change the behavior of myHolder, or any other objects using it?" This can (and should) be documented, but relying on documentation for something that can be expressed using the language is bad practice. Using the fixed version clarifies everything: of course the list is ours (we created it), we can do whatever we want with it, it is not going to affect anyone else.

Usage 2 - The caller needs to iterate once

Old method:

foreach(var dataItem in myHolder.GetFilteredDataBad()){ /*do something with dataItem*/ }
  • Allocates a new List (takes time both initially then later for GC)?: Yes
  • Number of DataFilter invocations: 1000

New method:

foreach(var dataItem in myHolder.GetFilteredData()){ /*do something with dataItem*/ }
  • Allocates a new List (takes time both initially then later for GC)?: No
  • Number of DataFilter invocations: 1000

Here we do not even have to change the caller at all to fix the problem. Along with the list of advantages mentioned above we also gain the performance of not copying every filtered item into a new List for no reason. One less allocation is always a good thing.

Usage 2 - The caller needs one item

Old method:

var dataItem = myHolder.GetFilteredDataBad().FirstOrDefault(SomeCondition)
  • Allocates a new List (takes time both initially then later for GC)?: Yes
  • Number of DataFilter invocations: 1000

New method:

var dataItem = myHolder.GetFilteredData().FirstOrDefault(SomeCondition)
  • Allocates a new List (takes time both initially then later for GC)?: No
  • Number of DataFilter invocations: Best case: 1 - Avg: 500 - Worst case: 1000

In these usecases we have the potential for enormous speedups. (Again, without touching the consumer.) Adding to the list of advantages gained in the preceding usecases we also spare tons of DataFilter calls!

The Old Method example executes the filter on every element in order to build the list, then FirstOrDefault(SomeCondition) is run on the filtered list.

In contrast the New Method will run DataFilter until it finds a matching element, then immediately go on to execute SomeCondition. If the condition is not met it resumes filtering. But if it is, it stops immediately, skipping the DataFilter call on the rest of the elements.

Non-performance considerations

Intent clarity

As I mentioned in Usecase 1, using the fixed method clears up possible confusion about the ownership of the list. I have seen code consuming a .ToList-y method that made a defensive copy just to make sure it owns the list! That is now 2 (potentially expensive) completely unnecessary copies.

Properties

The wonders of working on enterprise software include that You can (will) always find worse code™. Imagine someone turned GetFilteredDataBad into a property.

What you have now is a property that 1. causes serious side effects (1000 DataFilter calls & a 1000-item array allocation) 2. has really weird behavior:

var originalCount = myHolder.FilteredDataBad.Count;
myHolder.FilteredDataBad.Add(someNewItem);
Assert.Equal(originalCount + 1, myHolder.FilteredDataBad.Count);

This fails. Or even simpler:

Assert.Equal(myHolder.FilteredDataBad, myHolder.FilteredDataBad);

This fails as well. Don't go on trying to convince anyone this is sane behavior.

Changing the fixed GetFilteredData() into a property is a no-problem.

There are other smaller advantages as well (e.g. versioning), but that is outside the scope of this article, here I wanted to keep the focus on the most acute issues. Of course as with most things in software design, do not take this as a hard rule. Sometimes you just have to do the copy, but try to avoid it if you are not forced to do so.

Feedback

If you enjoyed this article, feel free to check out my Developer Blog where I share more stories like this.

If you have any feedback or wish to learn more about a topic with me in a 1:1 mentoring session, feel free to contact me anytime!

Discover and read more posts from Marcell Toth
get started
post comments1Reply
Austin Bryan
4 years ago

Wow, really interesting read, never considered this.