GROUP BY making the query extremely slow

I’m running the following query which fails to show results after 10 mins.

SELECT
    t1.broker_name,
    t1.agency_name,
    t2.type,
    cmb.Centris_No,
    cmb.Price AS sell_price,
    cmb.Rent_Price AS rent_price
FROM
    brokers_global2 t1
        INNER JOIN
    brokers_to_listings t2 ON t1.broker_id = t2.broker_id
        INNER JOIN
    (SELECT * FROM all_mls_1_i UNION 
    SELECT * FROM all_mls_2_i UNION 
    SELECT * FROM all_mls_3_i UNION 
    ) cmb 
    ON 
    t2.mls_id = cmb.Centris_No
WHERE
    t1.agency_name LIKE '%String%'
        AND cmb.target_date > 20210101
        GROUP BY cmb.Centris_No
LIMIT 0 , 50000

The query works fine, but when I add GROUP BY cmb.Centris_No it comes to complete a halt..

As per another SO solution: I added an index onto the Centris_No column, I also modified the target_date column to INT instead of varchar and added another index on it.

But the result is the same.

Is there anything else I can try?

This is the create statement for all_mls_ tbls

CREATE TABLE `all_mls_1` (
  `Centris_No` varchar(25) DEFAULT NULL,
  `ST` varchar(250) DEFAULT NULL,
  `Mun_Bor` varchar(250) DEFAULT NULL,
  `Address` varchar(250) DEFAULT NULL,
  `Price` varchar(250) DEFAULT NULL,
  `Price_raw` varchar(250) DEFAULT NULL,
  `Rent_Price` varchar(250) DEFAULT NULL,
  `Rent_Price_raw` varchar(250) DEFAULT NULL,
  `CP` varchar(250) DEFAULT NULL,
  `BT` varchar(250) DEFAULT NULL,
  `PT` varchar(250) DEFAULT NULL,
  `Lot_Area_Imperial` varchar(250) DEFAULT NULL,
  `Lot_Area_metric` varchar(250) DEFAULT NULL,
  `region` varchar(250) DEFAULT NULL,
  `latitude` varchar(45) DEFAULT NULL,
  `longitude` varchar(45) DEFAULT NULL,
  `city` varchar(250) DEFAULT NULL,
  `district` varchar(250) DEFAULT NULL,
  `target_date` varchar(250) DEFAULT NULL,
  `date_added` datetime DEFAULT NULL,
  UNIQUE KEY `MLS_INDEX` (`Centris_No`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

The all_mls_# tables have about a 100k rows in total

Answers:

Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.

Method 1

put it here since it doeasn’t fit in comments.

two small things you can do to improve it a little bit:

  1. use union all instead of union
  2. in your subquery only select columns that you need:
INNER JOIN
    (SELECT Centris_No,Price,Rent_Price,target_date FROM all_mls_1_i UNION ALL
    SELECT Centris_No,Price,Rent_Price,target_date FROM all_mls_2_i UNION ALL
    SELECT Centris_No,Price,Rent_Price,target_date FROM all_mls_3_i 
    ) cmb
  1. also look at your execution plan to figure our where the bottleneck is

Method 2

SELECT  t1.broker_name, t1.agency_name, t2.type, cmb.Centris_No,
        cmb.Price AS sell_price, cmb.Rent_Price AS rent_price
    FROM  brokers_global2 t1
    INNER JOIN  brokers_to_listings t2  ON t1.broker_id = t2.broker_id
    INNER JOIN  
    (
        SELECT  *
            FROM  all_mls_1_i
            UNION 
         SELECT  *
            FROM  all_mls_2_i
            UNION 
         SELECT  *
            FROM  all_mls_3_i
            UNION  
    ) cmb  ON t2.mls_id = cmb.Centris_No
    WHERE  t1.agency_name LIKE '%String%'
      AND  cmb.target_date > 20210101
    GROUP BY  cmb.Centris_No
    LIMIT  0 , 50000
  • Have 1 table, not many UNION’d together.
  • Instead of SELECT *, use SELECT Centris_No, target_date, Rent_Price, Price
  • Use DATE or DATETIME or TIMESTAMP, not varchar(250), for all date-related columns.
  • Move and add onWHERE target_date > ‘2021-01-01’` into the derived table so that there is less to fetch.
  • A LIMIT without an ORDER BY leads to random rows being delivered.
  • A LIMIT of 50000 seems like a waste?
  • Can Centris_No really be NULL? I hope not. If not, then promote it to be the PRIMARY KEY.
  • Add INDEX(target_date)
  • brokers_global2 needs an index starting with broker_id.

(There may be more; fix these, the I will look at it some more.)


All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

0 0 votes
Article Rating
Subscribe
Notify of
guest

0 Comments
Inline Feedbacks
View all comments
0
Would love your thoughts, please comment.x
()
x