Linq performance for in-memory collection

I have a list: Collection users which has around 100K+ records of users (all user objects fully loaded from the database with fields like Bio, First name, last name etc). This collection is fetched on application start from the database and is kept in memory.

Then I have code like:

User cachedUser = users.FirstOrDefault(x => string.Equals(x.UserName, username,
StringComparison.CurrentCultureIgnoreCase));

Which I use to fetch users from this collection. But somehow I noticed that this operation is incredibly slow. Is there a performance issue while using Linq to query in memory collection of large objects? Should I instead call the DB each time I want to get a user?

Answers:

Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.

Method 1

I think you might need to re-think your architecture based on the information you have given us. Take advantage of the database and let it do the search work for you. Observe, measure, and make changes accordingly after that. You might realize that you prematurely optimized the whole thing.

Method 2

If you want to optimize your response time and you could create a Dictionary<T,U> and search the user within:

    Dictionary<string, User> usersDictionary = new <Dictionary<string, User>(StringComparer.CurrentCultureIgnoreCase);

    // After querying the users from the DB add them to the dictionary             
    usersDictionary.Add(user.UserName, user);

    // Then when you need to retrieve a user
    User retrieveUser = null; 
    usersDictionary.TryGetValue(username, out retrieveUser);

Hope that helps !

Method 3

Your LINQ query like any other iteration technique (loop, search in array) will access every single record until the requested record is found. In the worst case that means 100k comparisons. To make this faster, you have the following options:

  1. use a sorted list or a dictionary: a binary search is a lot faster. Sort
    the data when fetching it from the database by using ORDER BY
  2. use a DataSet. It’s like an In-Memory database which provides faster search
  3. Leave the data in the database and set appropriate indexes for faster access

I suggest to use the database due to the following reasons:

  • It’s a waste of memory to store 100k records, which you probably never use
  • As soon as you change your data, you will have to refresh your cache, which might be rather complex
  • web applications are multithreaded (every request runs in its own thread). In case you change your data, you will have to synchronize with locks.
  • a database can cache frequently called data
  • you have to write less code
  • you have a stateless web application which scales better (web farms)
  • your application probably has other data, you cannot store everything in memory

Method 4

The different in the search performance that you notice is because the database is use indexing to locate the string in the database, but you in the memory you simple search all records until you find the one. Also the database keep a hash number for the string and search for this number hash that is a lot faster, and not make actually string compare.

The Dictionary<> make also an indexing, but have a delay to add data, when the data start grow because when its add some data, every time is search where to place it in the correct index point.

Also the database cache the results, many database cache also the indexing and create extra statistics that help to locate fast what you looking for.

Is better to let the database make the search, except if you can make something faster for extra custom cases.


All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

0 0 votes
Article Rating
Subscribe
Notify of
guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
0
Would love your thoughts, please comment.x
()
x