Blog coding and discussion of coding about JavaScript, PHP, CGI, general web building etc.

Saturday, September 3, 2016

Surprising SQL speed increase


I've just found out that the execution plan performance of the following two select statements is massively different:

    select * from your_large_table where LEFT(some_string_field, 4) = '2505'

    select * from your_large_table where some_string_field like '2505%'

The execution plans show relative costs of 98% and 2% respectively. Bit of a difference in speed then. I was actually shocked when I saw it.

I've always written LEFT(xxx) = 'yyy' because it reads well. I actually found this out by checking the LINQ-generated SQL against my hand-crafted SQL. I assumed the LIKE would be slower, but it is in fact much, much faster.

My question is: why is LEFT() slower than LIKE '..%'? They are, after all, identical.

Also, is there a CPU hit by using LEFT()?

Answer by Dan Sydner for Surprising SQL speed increase


There's a huge cost to using function calls in WHERE clauses, as SQL Server must calculate the result for each row. LIKE, on the other hand, is a built-in language feature which is highly optimized.

Answer by hamishmcn for Surprising SQL speed increase


If you use a function on a column with an index, then the db no longer uses the index (at least with Oracle).
So I am guessing that your example field 'some_string_field' has an index on it which doesn't get used for the query with 'LEFT'.
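One way to check that guess is to ask an engine for its plan. The sketch below uses SQLite only because it ships with Python; it is a stand-in, not SQL Server or Oracle, and other engines may plan differently. `substr` plays the role of LEFT, the explicit range is the form a front-anchored LIKE can be reduced to, and the `payload` column and the index name are made up for the illustration.

```python
import sqlite3

def plan(conn, sql):
    """Return the human-readable steps of EXPLAIN QUERY PLAN for one statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the description

conn = sqlite3.connect(":memory:")
# payload is a made-up extra column so that the index does not cover the whole row
conn.execute(
    "CREATE TABLE your_large_table ("
    "id INTEGER PRIMARY KEY, some_string_field TEXT, payload TEXT)"
)
conn.execute(
    "CREATE INDEX ix_some_string_field ON your_large_table (some_string_field)"
)
conn.executemany(
    "INSERT INTO your_large_table (some_string_field, payload) VALUES (?, 'x')",
    [("25049999",), ("2505",), ("25050000",), ("25051234",), ("25060000",)],
)

# Function wrapped around the column: the index on the raw column cannot be searched.
function_sql = (
    "SELECT * FROM your_large_table "
    "WHERE substr(some_string_field, 1, 4) = '2505'"
)
# Bare column compared against a range: the index can be searched directly.
range_sql = (
    "SELECT * FROM your_large_table "
    "WHERE some_string_field >= '2505' AND some_string_field < '2506'"
)

function_plan = plan(conn, function_sql)
range_plan = plan(conn, range_sql)
function_rows = sorted(conn.execute(function_sql).fetchall())
range_rows = sorted(conn.execute(range_sql).fetchall())

print("function:", function_plan)
print("range:   ", range_plan)
print("same rows:", function_rows == range_rows)
```

Both queries return the same rows; only the access method reported in the plan differs.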

Answer by mfx for Surprising SQL speed increase


It looks like the expression LEFT(some_string_field, 4) is evaluated for every row of a full table scan, while the "like" expression will use the index.

Optimizing "like" to use an index if it is a front-anchored pattern is a much easier optimization than analyzing arbitrary expressions involving string functions.
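To see why the front-anchored case is the easy one, here is a minimal sketch that models an index as a sorted Python list. A prefix pattern maps onto one contiguous key range, which two binary searches can locate; the function-based test has to visit every key. The helper names are hypothetical, and the upper-bound trick is simplified (it assumes a non-empty prefix whose last character is not the highest code point).

```python
from bisect import bisect_left

def prefix_upper_bound(prefix):
    """Smallest string that sorts after every string starting with prefix.

    Simplified: bumps the last character by one code point.
    """
    return prefix[:-1] + chr(ord(prefix[-1]) + 1)

def index_prefix_search(sorted_keys, prefix):
    """LIKE 'prefix%' as a range seek: two binary searches, then a slice."""
    lo = bisect_left(sorted_keys, prefix)
    hi = bisect_left(sorted_keys, prefix_upper_bound(prefix))
    return sorted_keys[lo:hi]

def scan_left_equals(keys, prefix):
    """LEFT(key, n) = prefix: the expression is evaluated for every key."""
    return [k for k in keys if k[:len(prefix)] == prefix]

keys = sorted(["9", "2506", "2505zzz", "25050001", "2505", "2504999"])
seek_result = index_prefix_search(keys, "2505")
scan_result = scan_left_equals(keys, "2505")
print(seek_result)
print(seek_result == scan_result)
```

The two functions agree on the result; the difference is that the seek touches only the matching slice of the sorted keys.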

Answer by FredV for Surprising SQL speed increase


Why do you say they are identical? They might solve the same problem, but their approach is different. At least it seems like that...

The query using LEFT optimizes the test, since it already knows the length of the prefix, so in a C/C++/... program, or without an index, an algorithm using LEFT to implement a certain LIKE behavior would be the fastest. But in contrast to most non-declarative languages, on a SQL database a lot of optimizations are done for you. For example, LIKE is probably implemented by first looking for the % sign; if the % turns out to be the last char in the string, the query can be optimized in much the same way as you did using LEFT, but directly using an index.

So, indeed I think you were right after all, they probably are identical in their approach. The only difference being that the db server can use an index in the query using LIKE because there is not a function transforming the column value to something unknown in the WHERE clause.

Answer by BradC for Surprising SQL speed increase


More generally speaking, you should never apply a function to the column on the left side of a comparison in a WHERE clause. If you do, SQL Server won't use an index; it has to evaluate the function for every row of the table. The goal is to make sure that your WHERE clause is "sargable" (Search ARGument-able: the predicate can be answered by searching an index).

Some other examples:

    Bad:   Select ... WHERE isNull(FullName,'Ed Jones') = 'Ed Jones'
    Fixed: Select ... WHERE ((FullName = 'Ed Jones') OR (FullName IS NULL))

    Bad:   Select ... WHERE SUBSTRING(DealerName,1,4) = 'Ford'
    Fixed: Select ... WHERE DealerName Like 'Ford%'

    Bad:   Select ... WHERE DateDiff(mm,OrderDate,GetDate()) >= 30
    Fixed: Select ... WHERE OrderDate < DateAdd(mm,-30,GetDate())

    Bad:   Select ... WHERE Year(OrderDate) = 2003
    Fixed: Select ... WHERE OrderDate >= '2003-01-01' AND OrderDate < '2004-01-01'

(The DateDiff rewrite is close but not exact: DateDiff counts month boundaries crossed, so rows dated in the boundary month can differ.)
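The Year() example is worth a closer look, because the fixed form uses a half-open range rather than BETWEEN. The sketch below, with a sorted list standing in for an index on OrderDate and made-up sample dates, suggests that the range returns exactly the rows the function-based test does, including timestamps late on 31 December. The function names are hypothetical.

```python
from bisect import bisect_left
from datetime import datetime

def year_scan(order_dates, year):
    """Year(OrderDate) = year: the function is evaluated for every row."""
    return [d for d in order_dates if d.year == year]

def year_range_seek(sorted_dates, year):
    """OrderDate >= 'year-01-01' AND OrderDate < 'year+1-01-01' as a range seek."""
    lo = bisect_left(sorted_dates, datetime(year, 1, 1))
    hi = bisect_left(sorted_dates, datetime(year + 1, 1, 1))
    return sorted_dates[lo:hi]

order_dates = sorted([
    datetime(2002, 12, 31, 23, 59, 59),
    datetime(2003, 1, 1, 0, 0, 0),
    datetime(2003, 6, 15, 12, 30, 0),
    datetime(2003, 12, 31, 23, 59, 59),
    datetime(2004, 1, 1, 0, 0, 0),
])

seek_2003 = year_range_seek(order_dates, 2003)
scan_2003 = year_scan(order_dates, 2003)
print(len(seek_2003), seek_2003 == scan_2003)
```

An upper bound written as "<= '2003-12-31'" would have dropped the 23:59:59 row, which is why the exclusive bound on the next year is the safer form.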

Answer by David Aldridge for Surprising SQL speed increase


What happened here is either that the RDBMS is not capable of using an index on the LEFT() predicate and is capable of using it on the LIKE, or it simply made the wrong call as to which would be the more appropriate access method.

Firstly, it may be true for some RDBMSs that applying a function to a column prevents an index-based access method from being used, but that is not a universal truth, nor is there any logical reason why it needs to be. An index-based access method (such as Oracle's full index scan or fast full index scan) might be beneficial but in some cases the RDBMS is not capable of the operation in the context of a function-based predicate.

Secondly, the optimiser may simply get the arithmetic wrong in estimating the benefits of the different available access methods. Assuming that the system can perform an index-based access method, it first has to make an estimate of the number of rows that will match the predicate, either from statistics on the table, statistics on the column, by sampling the data at parse time, or by using a heuristic rule (e.g. "assume 5% of rows will match"). Then it has to assess the relative costs of a full table scan and the available index-based methods. Sometimes it will get the arithmetic wrong, sometimes the statistics will be misleading or inaccurate, and sometimes the heuristic rules will not be appropriate for the data set.
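As a rough illustration of that arithmetic, here is a toy cost model. It is not how any particular optimiser works: the cost constants, the page size and the function name are all invented for the example. It only shows how the same query can flip between access methods depending on the row estimate that is fed in.

```python
def choose_access_method(table_rows, rows_per_page, est_selectivity,
                         random_io_cost=4.0, sequential_io_cost=1.0):
    """Pick the cheaper of a full scan and an index lookup (toy model).

    All cost constants are invented for illustration.
    """
    # Full scan: read every page of the table sequentially.
    full_scan_cost = (table_rows / rows_per_page) * sequential_io_cost
    # Index access: assume one random page read per matching row.
    index_cost = (table_rows * est_selectivity) * random_io_cost
    if index_cost < full_scan_cost:
        return "index", index_cost
    return "full scan", full_scan_cost

# Suppose the predicate really matches about 1 row in 10,000.
with_good_stats = choose_access_method(1_000_000, 100, est_selectivity=0.0001)
# With no usable statistics, the heuristic "assume 5% of rows will match" is used.
with_heuristic = choose_access_method(1_000_000, 100, est_selectivity=0.05)

print(with_good_stats[0])
print(with_heuristic[0])
```

With the accurate estimate the model picks the index; with the 5% heuristic it picks the full scan for the very same query, which is the kind of wrong call described above.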

The key point is to be aware of a number of issues:

  1. What operations can your RDBMS support?
  2. What would be the most appropriate operation in the case you are working with?
  3. Is the system's choice correct?
  4. What can be done to allow the system to perform a more efficient operation (e.g. add a missing not null constraint, update the statistics, etc.)?

In my experience this is not a trivial task, and is often best left to experts. Or, on the other hand, just post the problem to Stack Overflow; some of us find this stuff fascinating, dog help us.

Answer by WorkRelated for Surprising SQL speed increase


As @BradC mentioned, you shouldn't use functions in a WHERE clause if you have indexes and want to take advantage of them.

If you read the section entitled "Use LIKE instead of LEFT() or SUBSTRING() in WHERE clauses when Indexes are present" from these SQL Performance Tips, there are more examples.

It also hints at questions you'll encounter on the MCSE SQL Server 2012 exams if you're interested in taking those too. :-)

