simon

simon

github

MySQL Slow Query Investigation Record: A Performance Crisis Triggered by an Optimizer Misjudgment

1. Problem Background#

A certain online interface has a response time of over 50 seconds, frequently timing out, severely affecting user experience. The testing environment runs smoothly, prompting an investigation.

2. Preliminary Investigation Process#

✅ 1. Link tracing to identify slow interfaces

Using a link tracing system (such as SkyWalking), we quickly pinpointed that a certain database query was slowing down the interface response.

✅ 2. Logs + SQL interceptor to print complete SQL

Combined with MyBatis's SQL interceptor, we obtained the complete SQL after parameter concatenation:

SELECT MAX(t.id)
FROM user_log t
WHERE t.deleted_at = '1970-01-01 00:00:00'
  AND LENGTH(t.platform_id) > 0
  AND t.platform_id != '0'
  AND t.platform_type = 'MOBILE'
  AND t.nickname LIKE CONCAT('%', 'abc', '%')
GROUP BY t.platform_id
LIMIT 20;

✅ 3. Execution plan analysis (EXPLAIN)

It was found that the production environment incorrectly hit an index starting with deleted_at:

idx_deletedat_memberaccountid_updatedat
→ Scanned rows: 2,126,736
→ filesort + temporary table
→ Total time: 50 seconds

In contrast, the testing environment used a more appropriate index idx_platformtype_deletedat_platformid_nickname, executing in just a few hundred milliseconds.

3. Root Cause of the Problem#

The optimizer in the production environment misjudged the optimal index, leading to a full table scan + temporary table sorting.

The reasons are as follows:

Condition FieldTesting Environment Match ScaleProduction Environment Match Scale
deleted_atThousands (about 5,000)Millions (about 2,100,000)
platform_typeHundreds of thousands (about 700,000)Millions (about 1,250,000)

💡 Inference: The excessive matching data for deleted_at in the production environment led the optimizer to incorrectly judge it as a highly selective field, resulting in the selection of a sub-optimal index.

4. Final Optimization Plan: Combine Two Strategies#

✅ Plan One: Conditionally Use USE INDEX

Dynamically inject in MyBatis XML:

<if test="ro.platformType != null and ro.partialNickname != null and ro.talentPlatformIdList == null">
    USE INDEX (idx_platformtype_deletedat_platformid_nickname)
</if>

Effect:
• Avoids the optimizer choosing incorrectly when querying the nickname list and when there is no precise range for platform_id;
• Restrictive injection ensures that the index is enforced only under specific combinations, avoiding a one-size-fits-all approach.

✅ Plan Two: Range Optimization to Replace Function Filtering

Replace the original redundant judgment logic:

-- Original

AND LENGTH(t.platform_id) > 0 AND t.platform_id != '0'

-- Optimized

AND t.platform_id > '0'

Advantages:

  • Length and != will not utilize the index, resulting in a full table scan; changing to > allows the index to be used for querying.
  • Avoids interference from functions on fields, increasing the probability of index usage;
  • Concise and efficient, semantically equivalent.

5. Optimization Effect#

EnvironmentBefore OptimizationAfter OptimizationImprovement Factor
Production Environment~50 seconds~0.9 seconds55x

6. Subsequent Optimizations#

The following are further optional optimization directions summarized based on the current plan, suitable for future evolution considerations. Currently, no further optimizations will be made due to low ROI below.

  1. ✅ Use full-text indexing to optimize fuzzy searches
  • Current LIKE '%xxx%' queries cannot use B-Tree indexes.
  • FULLTEXT indexes can be used (suitable for VARCHAR or TEXT fields in InnoDB) to improve fuzzy search performance:
ALTER TABLE some_table ADD FULLTEXT INDEX idx_nickname_ft (nickname);
  • Rewrite the query as:
SELECT ... FROM some_table
WHERE MATCH(nickname) AGAINST('+abc' IN BOOLEAN MODE);
  1. ✅ Split SQL to handle OR conditions
  • The original query contains:
WHERE platform_id IN (...) OR home_link IN (...)
  • MySQL typically cannot utilize two indexes simultaneously when encountering OR, easily triggering a full table scan.
  • It is recommended to split the SQL into two subqueries followed by UNION ALL:
(SELECT ... FROM ... WHERE platform_id IN (...))
UNION ALL
(SELECT ... FROM ... WHERE home_link IN (...))
  1. ✅ Design multiple composite indexes to handle different parameter combinations
  • Design dedicated composite indexes for different query scenarios (e.g., whether it includes platform_type, whether it is a fuzzy search, whether there is a notification_id);
  • Use MyBatis conditions + USE INDEX to dynamically control the path, ensuring the optimal execution plan is always followed.
  1. ✅ Improve the optimizer's understanding of data distribution through histograms (MySQL 8+)
  • If ANALYZE TABLE is still insufficient to accurately reflect the cardinality distribution of a certain field, consider using:
ANALYZE TABLE some_table UPDATE HISTOGRAM ON platform_type WITH 256 BUCKETS;
  • Enhance the accuracy of the optimizer's index cost assessment.

7. Summary and Insights#

🎯 Optimization Keywords:
• USE INDEX conditional control
• Function filtering replaced by range judgment
• Analyze data distribution differences
• Pay attention to optimizer behavior
• Strong guidance under precise conditions to avoid global strong binding

🧠 Experience Summary:

RecommendationExplanation
Always EXPLAIN comparison before optimizationUnderstand why there is a huge difference between production and testing
Use caution with function-wrapping index fieldsFunction filtering often leads to index invalidation
Conditional USE INDEX instead of global fixedIncreases compatibility and robustness
Field distribution determines optimizer behaviorDifferent environments need separate tuning

A wrong index choice can magnify query time by 50 times; a fine-tuned combination can rejuvenate the system.

When facing the optimizer, don't "fully trust" or "stubbornly confront"; use more detailed strategies to collaborate with it, which is the advanced way of SQL optimization.

Loading...
Ownership of this post data is guaranteed by blockchain and smart contracts to the creator alone.