-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Query: Avoid adding unnecessary sort columns in query with projection and optional navigation properties #6861
Comments
We are adding order by columns since the navigation is optional which requires left outer join. We achieve left outer join by rewriting navigation in group join. Since our group join operator is optimized for streaming, we requires the ordering. Assigning to @maumar since it is around the area of navigation rewriting. |
…xtra sort columns in ORDER BY for foreign key columns When creating LEFT OUTER JOIN calls we add ORDER BY calls based on the key of the outer entity. This is because out GroupJoin method is streaming and we need to have results in order to create the correct groups in a performant way. However we can there are some optimizations we can do: - we don't need ORDER BY if the join is from dependent to principal, e.g. orders.Select(o => o.Customer.Name) because there will always be at most one customer associated with a given order. - in case when we fully translate GroupJoin into SQL we also don't need the extra ORDER BYs because we no longer perform streamed grouping on the client.
…xtra sort columns in ORDER BY for foreign key columns When creating LEFT OUTER JOIN we add ORDER BY clause(s) based on the key of the outer entity. This is because out GroupJoin method is streaming and we need to have results in order to create the correct groups in a performant way. However we can there are some optimizations we can do: 1. we don't need ORDER BY if the join is from dependent to principal because there will always be at most one customer associated with a given order. e.g. orders.Select(o => o.Customer.Name) which translates to: from o in Orders join c in Customers on o.CustomerId equal c.Id into grouping from c in grouping.DefaultIfEmpty() select c.Name . 2. in case when we fully translate LOJ into SQL (which is now very often the case) we also don't need the extra ORDER BYs because we no longer perform streamed grouping on the client.
…xtra sort columns in ORDER BY for foreign key columns When creating LEFT OUTER JOIN we add ORDER BY clause(s) based on the key of the outer entity. This is because out GroupJoin method is streaming and we need to have results in order to create the correct groups in a performant way. However we can there are some optimizations we can do: 1. we don't need ORDER BY if the join is from dependent to principal because there will always be at most one customer associated with a given order. e.g. orders.Select(o => o.Customer.Name) which translates to: from o in Orders join c in Customers on o.CustomerId equal c.Id into grouping from c in grouping.DefaultIfEmpty() select c.Name . 2. in case when we fully translate LOJ into SQL (which is now very often the case) we also don't need the extra ORDER BYs because we no longer perform streamed grouping on the client.
…xtra sort columns in ORDER BY for foreign key columns When creating LEFT OUTER JOIN we add ORDER BY clause(s) based on the key of the outer entity. This is because out GroupJoin method is streaming and we need to have results in order to create the correct groups in a performant way. However we can there are some optimizations we can do: 1. we don't need ORDER BY if the join is from dependent to principal because there will always be at most one customer associated with a given order. e.g. orders.Select(o => o.Customer.Name) which translates to: from o in Orders join c in Customers on o.CustomerId equal c.Id into grouping from c in grouping.DefaultIfEmpty() select c.Name . 2. in case when we fully translate LOJ into SQL (which is now very often the case) we also don't need the extra ORDER BYs because we no longer perform streamed grouping on the client.
…xtra sort columns in ORDER BY for foreign key columns When creating LEFT OUTER JOIN we add ORDER BY clause(s) based on the key of the outer entity. This is because out GroupJoin method is streaming and we need to have results in order to create the correct groups in a performant way. However we can there are some optimizations we can do: 1. we don't need ORDER BY if the join is from dependent to principal because there will always be at most one customer associated with a given order. e.g. orders.Select(o => o.Customer.Name) which translates to: from o in Orders join c in Customers on o.CustomerId equal c.Id into grouping from c in grouping.DefaultIfEmpty() select c.Name . 2. in case when we fully translate LOJ into SQL (which is now very often the case) we also don't need the extra ORDER BYs because we no longer perform streamed grouping on the client.
…xtra sort columns in ORDER BY for foreign key columns When creating LEFT OUTER JOIN we add ORDER BY clause(s) based on the key of the outer entity. This is because out GroupJoin method is streaming and we need to have results in order to create the correct groups in a performant way. However we can there are some optimizations we can do: 1. we don't need ORDER BY if the join is from dependent to principal because there will always be at most one customer associated with a given order. e.g. orders.Select(o => o.Customer.Name) which translates to: from o in Orders join c in Customers on o.CustomerId equal c.Id into grouping from c in grouping.DefaultIfEmpty() select c.Name 2. in case when we fully translate LOJ into SQL (which is now very often the case) we also don't need the extra ORDER BYs because we no longer perform streamed grouping on the client.
…xtra sort columns in ORDER BY for foreign key columns When creating LEFT OUTER JOIN we add ORDER BY clause(s) based on the key of the outer entity. This is because out GroupJoin method is streaming and we need to have results in order to create the correct groups in a performant way. However we can there are some optimizations we can do: 1. we don't need ORDER BY if the join is from dependent to principal because there will always be at most one customer associated with a given order. e.g. orders.Select(o => o.Customer.Name) which translates to: from o in Orders join c in Customers on o.CustomerId equal c.Id into grouping from c in grouping.DefaultIfEmpty() select c.Name 2. in case when we fully translate LOJ into SQL (which is now very often the case) we also don't need the extra ORDER BYs because we no longer perform streamed grouping on the client.
…xtra sort columns in ORDER BY for foreign key columns When creating LEFT OUTER JOIN we add ORDER BY clause(s) based on the key of the outer entity. This is because out GroupJoin method is streaming and we need to have results in order to create the correct groups in a performant way. However we can there are some optimizations we can do: 1. we don't need ORDER BY if the join is from dependent to principal because there will always be at most one customer associated with a given order. e.g. orders.Select(o => o.Customer.Name) which translates to: from o in Orders join c in Customers on o.CustomerId equal c.Id into grouping from c in grouping.DefaultIfEmpty() select c.Name In case of composite keys we need to check that entire key is being used as a join condition. Moreover, if there are additional elements in the join condition (on top of the FK->PK) we don't need to add order by either, because the grouping will be even more restrictive. 2. in case when we fully translate LOJ into SQL (which is now very often the case) we also don't need the extra ORDER BYs because we no longer perform streamed grouping on the client.
…xtra sort columns in ORDER BY for foreign key columns When creating LEFT OUTER JOIN we add ORDER BY clause(s) based on the key of the outer entity. This is because out GroupJoin method is streaming and we need to have results in order to create the correct groups in a performant way. However we can there are some optimizations we can do: 1. we don't need ORDER BY if the join is from dependent to principal because there will always be at most one customer associated with a given order. e.g. orders.Select(o => o.Customer.Name) which translates to: from o in Orders join c in Customers on o.CustomerId equal c.Id into grouping from c in grouping.DefaultIfEmpty() select c.Name In case of composite keys we need to check that entire key is being used as a join condition. Moreover, if there are additional elements in the join condition (on top of the FK->PK) we don't need to add order by either, because the grouping will be even more restrictive. 2. in case when we fully translate LOJ into SQL (which is now very often the case) we also don't need the extra ORDER BYs because we no longer perform streamed grouping on the client.
Fixed in 398dec4 |
Thanks. I greatly appreciate it. |
I want to test this out. On the main page, it mentions, nightly builds and Myget. I've never used this before. What I'm wondering if I can add Myget to Visual Studio so that I can get the nightly builds off Nuget? Are there instructions somewhere for doing this? |
You can use the following myget feed: https://dotnet.myget.org/gallery/aspnetcore-ci-dev |
@jemiller0 yeah you should be able to add it directly to Visual Studio, rather than modify the nuget.config |
Thanks maumar. That's great that the dev builds are available like that. I just tried it out and got it working. I ran into a versioning issue though since I had the Npgsql provider included in my project. However, I commented that code out for the time being and have it working. However, now, I'm running into a different problem. .StartsWith() filters are adding an extra CHARINDEX() call. EF 6 didn't do this. I'm assuming there's a reason for it. But, it looks like it's slowing down SQL Server's performance on large tables. I have a table with 7 million rows in it. With the CHARINDEX() call, the query takes 11 seconds to execute. Without it, it takes 1 second. I'm thinking it must be ignoring the index when CHARINDEX() is used. WHERE [b].[updated_by] LIKE N'jemiller' + N'%' AND (CHARINDEX(N'jemiller', [b].[updated_by]) = 1) |
Steps to reproduce
Run a query that does a projection such as the following.
The following zip file contains the definitions for the POCO classes that are used by the above query.
QueryOrderByBug.zip
The issue
EF Core 1.1.0-preview1-final generates the following SQL. There are a few problems with the generated SQL. The main one is that you can see that it is sorting by FormatId and LanguageId columns in addition to the LastWriteTime column which is the only one that was specified in the LINQ query. The Bibs table used below is a table containing 7.7 million rows. The query takes 12 seconds to execute. If I run the SQL using SQL Management Studio, but, comment out the extra order by columns, the query executes in 0 seconds. It looks like what it's doing is adding sort columns for the foreign key columns that are not required. The project above is also bringing in the BibStatus navigation property. That one is required and there isn't a sort for it.
A second problem is that the query is doing a projection, but, it looks like EF Core is bringing in all columns from the tables that are used. In this particular query, there is a column of type VARCHAR(MAX) that contains a lot of data. So, I want to avoid bringing that in.
Another thing that I noticed is that it appears that it's using an INNER JOIN on BibStatus, which makes sense because it's a required field. It's doing LEFT JOINs on the the other fields which are not required, which also makes sense. However, EF 6 used a LEFT JOIN for all of the JOINs. I want to say LEFT JOIN is faster than INNER JOIN. For this particular query, it doesn't appear to make a difference. The main problem that seems to be slowing everything down is the extra ORDER BY columns on the foreign key fields.
The following is what the SQL looks like for EF 6.1.3.
The query appears to work correct, just much slower than it should due to the extra sort columns.
Further technical details
EF Core version: 1.1.0-preview1-final
Operating system: Windows 10 Anniversary Update
Visual Studio version: 2015
The text was updated successfully, but these errors were encountered: