SELECT MAX vs ORDER BY LIMIT 1 question has been answered here several times, but if I add a WHERE clause, things change dramatically
Here is my table:
Note, that both columns are indexed.
And here are the requests:
SELECT id from table WHERE open_time > 0 ORDER BY id DESC LIMIT 1 SELECT MAX(id) from BTCUSDT1mHist WHERE open_time > 0
EXPLAIN ANALYZE shows the following:
-> Limit: 1 row(s) (cost=0.10 rows=1) (actual time=0.038..0.038 rows=1 loops=1) -> Filter: (table.open_time > 0) (cost=0.10 rows=1) (actual time=0.037..0.037 rows=1 loops=1) -> Index scan on table using PRIMARY (reverse) (cost=0.10 rows=2) (actual time=0.036..0.036 rows=1 loops=1)
-> Aggregate: max(table.id) (cost=325890.06 rows=1081033) (actual time=1025.181..1025.181 rows=1 loops=1) -> Filter: (table.open_time > 0) (cost=217786.76 rows=1081033) (actual time=0.032..866.890 rows=2180645 loops=1) -> Index range scan on table using open_time (cost=217786.76 rows=1081033) (actual time=0.031..705.926 rows=2180645 loops=1)
ORDER BY finishes in 0.0012 seconds, while MAX() does in 1.026 seconds
I have read this question also, but it doesn’t seem to cover my situation
The question is: why does MAX() takes so much longer than ORDER BY LIMIT?
Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.
Compare in the analyze:
-> Index scan on table using PRIMARY (reverse) (cost=0.10 rows=2) (actual time=0.036..0.036 rows=1 loops=1)
-> Index range scan on table using open_time (cost=217786.76 rows=1081033) (actual time=0.031..705.926 rows=2180645 loops=1)
Examining 2 rows must be a lot quicker than examining 2,180,645 rows.
In the query with
ORDER BY id DESC LIMIT 1, it uses the primary key index. It starts at the end because it’s a reverse order. Then it just iterates down the leaf nodes of the index (in descending order), until it examines the first row that also matches
open_time > 0. Then the LIMIT optimization allows the query execution to finish. Based on its statistics, it estimates this will happen after examining 2 rows.
In the query with
MAX(id), it uses the index on
open_time. But because it’s a range condition
open_time > 0, it can’t assume the maximum id is found at the start or end of that range. So it must examine every matching entry in the
open_time index, searching for the greatest value of
id (primary keys are implicitly part of a secondary index). There’s no chance of early-termination as there is in the query with
All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0