Version: 2.0 MySQL Version: 8.0+ Last Updated: January 2026
Query optimization in MySQL is critical for application performance. Two key concepts drive effective optimization:
This guide demonstrates how to analyze and optimize queries using these principles.
Understanding these fundamental terms will help you follow the guide more easily.
| Term | Definition |
|---|---|
| Index | A data structure that helps MySQL find rows faster. Think of it like a book’s index—you can quickly find topics without reading every page. |
| Cardinality | The number of unique values in a column. High cardinality = many unique values (e.g., email addresses). Low cardinality = few unique values (e.g., true/false flags). |
| Selectivity | The percentage of unique values = cardinality ÷ total rows. Higher selectivity (closer to 100%) means better for indexing. |
| Composite Index | An index on multiple columns. Order matters! Like organizing books by genre, then author, then title. |
| Covering Index | An index that includes ALL columns needed by a query, avoiding lookups to the main table. Like having complete information in the book index without opening the book. |
| Query Optimizer | MySQL’s decision-making engine that chooses how to execute your query (which indexes to use, join order, etc.). |
| Buffer Pool | MySQL’s memory cache for data and indexes. Larger = faster queries because data stays in RAM instead of reading from disk. |
| Column | What It Means | Interpretation |
|---|---|---|
| type | How MySQL accesses data | Best to worst: const → eq_ref → ref → range → index → ALL |
| key | Which index MySQL is using | NULL = no index used (problem!) |
| key_len | Bytes of index being used | Larger = more columns from composite index utilized |
| rows | Estimated rows to examine | Lower is better |
| Extra | Additional information | See details below |
| Value | Meaning | Performance |
|---|---|---|
Using index |
Covering index (no table lookup) | Excellent ✓ |
Using where |
Filtering after reading rows | Less efficient |
Using filesort |
Sorting requires extra work | May need optimization |
Using temporary |
Temporary table created | Expensive × |
Using index condition |
Index Condition Pushdown active | Good ✓ |
| Term | Definition |
|---|---|
| ICP (Index Condition Pushdown) | Filtering happens at the storage engine level (faster) instead of the MySQL server level. |
| MRR (Multi-Range Read) | Optimization that sorts row lookups before reading from disk to reduce random I/O. |
| InnoDB | MySQL’s default storage engine. Handles transactions, row-level locking, and crash recovery. |
| DDL (Data Definition Language) | Schema changes: CREATE INDEX, ALTER TABLE, DROP TABLE |
| DML (Data Manipulation Language) | Data operations: SELECT, INSERT, UPDATE, DELETE |
| OLTP (Online Transaction Processing) | Systems with many small read/write operations (e.g., e-commerce, banking). |
| OLAP (Online Analytical Processing) | Systems focused on complex queries for reporting/analytics (e.g., data warehouses). |
| Replication Lag | Time delay before changes on primary database appear on replica databases. |
| Table Scan | Reading every row in a table (slow for large tables). Happens when no suitable index exists. |
Cardinality refers to the number of distinct values in a column relative to the total number of rows.
| Type | Description | Examples |
|---|---|---|
| High Cardinality | Many unique values | user_id, email, UUID, order_number |
| Low Cardinality | Few unique values | gender, status, boolean flags, country_code |
MySQL’s query optimizer uses cardinality statistics to:
-- View cardinality for all indexes on a table
SHOW INDEX FROM table_name;
-- Calculate selectivity for a column
SELECT
COUNT(DISTINCT column_name) AS cardinality,
COUNT(*) AS total_rows,
ROUND(COUNT(DISTINCT column_name) / COUNT(*) * 100, 2) AS selectivity_percent
FROM table_name;
Formula: Selectivity = Cardinality ÷ Total Rows
| Selectivity | Index Effectiveness | Action |
|---|---|---|
| > 90% | Excellent | Ideal for indexing |
| 50-90% | Good | Good candidate |
| 10-50% | Moderate | Consider composite index |
| < 10% | Poor | Avoid standalone index |
Rule of Thumb: Index columns with selectivity > 10%
| Index Type | Characteristics | Use Case |
|---|---|---|
| Primary Key | Unique, non-null, clustered | Table’s main identifier |
| Unique | Ensures uniqueness | Email, username, SKU |
| Regular (Non-unique) | Most common, allows duplicates | Foreign keys, search columns |
| Composite (Multi-column) | Multiple columns (order matters!) | Complex WHERE conditions |
| Full-Text | Text searching | Blog posts, descriptions |
| Spatial | Geographic data | Locations, maps, GIS |
| Column Cardinality | Index Effectiveness | Example Columns | Action |
|---|---|---|---|
| Very High (>90%) | Excellent ⭐⭐⭐ | user_id, email, order_number |
Always index |
| High (50-90%) | Good ⭐⭐ | product_sku, customer_name |
Good candidate |
| Medium (10-50%) | Moderate ⭐ | category_id, city |
Use in composite index |
| Low (<10%) | Poor ⚠️ | gender, is_active, status |
Avoid standalone index |
Key Principle: Index columns with selectivity > 10% (0.1)
-- Basic EXPLAIN
EXPLAIN SELECT columns FROM table WHERE conditions;
-- Extended EXPLAIN (MySQL 8.0+)
EXPLAIN FORMAT=TREE SELECT ...;
-- With actual execution metrics
EXPLAIN ANALYZE SELECT ...;
| Column | Description | Good Values | Bad Values |
|---|---|---|---|
| type | Join/access type | const, eq_ref, ref, range |
ALL, index |
| possible_keys | Indexes considered | Multiple options | NULL |
| key | Index actually used | Any index name | NULL |
| key_len | Bytes of index used | Higher (more columns used) | Low (partial index use) |
| rows | Estimated rows to examine | Low numbers (< 1000) | High numbers (> 10000) |
| filtered | % of rows filtered by condition | High % (> 50%) | Low % (< 10%) |
| Extra | Additional information | Using index |
Using filesort, Using temporary |
| Type | Description | Performance | When It Occurs |
|---|---|---|---|
| const | Single row match | ⭐⭐⭐⭐⭐ Excellent | WHERE id = 1 on PRIMARY KEY |
| eq_ref | One row per join | ⭐⭐⭐⭐ Excellent | JOIN on PRIMARY KEY or UNIQUE |
| ref | Multiple matching rows | ⭐⭐⭐ Good | Non-unique index lookup |
| range | Index range scan | ⭐⭐ Moderate | BETWEEN, >, <, IN |
| index | Full index scan | ⭐ Poor | Scanning entire index |
| ALL | Full table scan | ⚠️ Very Poor | No suitable index (avoid!) |
Warning:
ALLtype scans every row in the table. For tables with >10,000 rows, this is usually a problem.
-- Create sample tables
CREATE TABLE customers (
customer_id INT PRIMARY KEY AUTO_INCREMENT,
email VARCHAR(255) UNIQUE NOT NULL,
first_name VARCHAR(100),
last_name VARCHAR(100),
country VARCHAR(50),
registration_date DATE,
status ENUM('active', 'inactive', 'suspended') DEFAULT 'active',
INDEX idx_country (country),
INDEX idx_status (status),
INDEX idx_registration_date (registration_date)
) ENGINE=InnoDB;
CREATE TABLE orders (
order_id INT PRIMARY KEY AUTO_INCREMENT,
customer_id INT NOT NULL,
order_date DATETIME NOT NULL,
total_amount DECIMAL(10,2),
payment_status ENUM('pending', 'paid', 'failed', 'refunded'),
shipping_country VARCHAR(50),
INDEX idx_customer_id (customer_id),
INDEX idx_order_date (order_date),
INDEX idx_payment_status (payment_status),
FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
) ENGINE=InnoDB;
CREATE TABLE order_items (
item_id INT PRIMARY KEY AUTO_INCREMENT,
order_id INT NOT NULL,
product_id INT NOT NULL,
quantity INT,
unit_price DECIMAL(10,2),
INDEX idx_order_id (order_id),
INDEX idx_product_id (product_id),
FOREIGN KEY (order_id) REFERENCES orders(order_id)
) ENGINE=InnoDB;
CREATE TABLE products (
product_id INT PRIMARY KEY AUTO_INCREMENT,
product_name VARCHAR(255),
category VARCHAR(100),
price DECIMAL(10,2),
stock_quantity INT,
INDEX idx_category (category)
) ENGINE=InnoDB;
SELECT
c.customer_id,
c.first_name,
c.last_name,
c.email,
o.order_id,
o.order_date,
o.total_amount,
p.product_name,
oi.quantity,
oi.unit_price
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id
WHERE c.country = 'United States'
AND c.status = 'active'
AND o.order_date >= '2024-01-01'
AND o.payment_status = 'paid'
AND p.category = 'Electronics'
ORDER BY o.order_date DESC
LIMIT 100;
EXPLAIN SELECT
c.customer_id,
c.first_name,
c.last_name,
c.email,
o.order_id,
o.order_date,
o.total_amount,
p.product_name,
oi.quantity,
oi.unit_price
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id
WHERE c.country = 'United States'
AND c.status = 'active'
AND o.order_date >= '2024-01-01'
AND o.payment_status = 'paid'
AND p.category = 'Electronics'
ORDER BY o.order_date DESC
LIMIT 100;
Expected Problems:
-- Check cardinality for customers table
SELECT
'country' as column_name,
COUNT(DISTINCT country) AS distinct_values,
COUNT(*) AS total_rows,
ROUND(COUNT(DISTINCT country) / COUNT(*) * 100, 2) AS selectivity_percent
FROM customers
UNION ALL
SELECT
'status',
COUNT(DISTINCT status),
COUNT(*),
ROUND(COUNT(DISTINCT status) / COUNT(*) * 100, 2)
FROM customers;
-- Check cardinality for orders table
SELECT
'payment_status' as column_name,
COUNT(DISTINCT payment_status) AS distinct_values,
COUNT(*) AS total_rows,
ROUND(COUNT(DISTINCT payment_status) / COUNT(*) * 100, 2) AS selectivity_percent
FROM orders
UNION ALL
SELECT
'order_date',
COUNT(DISTINCT DATE(order_date)),
COUNT(*),
ROUND(COUNT(DISTINCT DATE(order_date)) / COUNT(*) * 100, 2)
FROM orders;
-- Check cardinality for products table
SELECT
'category' as column_name,
COUNT(DISTINCT category) AS distinct_values,
COUNT(*) AS total_rows,
ROUND(COUNT(DISTINCT category) / COUNT(*) * 100, 2) AS selectivity_percent
FROM products;
Example Results (from actual test with 10K customers, 50K orders):
| Table | Column | Distinct Values | Total Rows | Selectivity |
|---|---|---|---|---|
| customers | country | 5 | 10,000 | 0.05% |
| customers | status | 3 | 10,000 | 0.03% |
| orders | payment_status | 4 | 50,000 | 0.01% |
| orders | order_date | 730 | 50,000 | 1.46% |
| products | category | 5 | 15 | 33.33% |
Key Observations:
order_date has the highest selectivity (1.46%), making it the best leading column for composite indexesBased on cardinality analysis, create composite indexes with high-to-low cardinality order:
-- Composite index for customers table
-- Put higher cardinality column first
DROP INDEX idx_country ON customers;
DROP INDEX idx_status ON customers;
CREATE INDEX idx_country_status_regdate ON customers(country, status, registration_date);
-- Composite index for orders table
-- order_date has higher cardinality than payment_status
DROP INDEX idx_order_date ON orders;
DROP INDEX idx_payment_status ON orders;
CREATE INDEX idx_customer_orderdate_payment ON orders(customer_id, order_date, payment_status);
-- Add covering index for better performance
CREATE INDEX idx_orderdate_payment_customer ON orders(order_date, payment_status, customer_id);
-- Products: category has low cardinality, but still useful with other filters
-- Keep simple index as is, or create composite if other filters exist
EXPLAIN SELECT
c.customer_id,
c.first_name,
c.last_name,
c.email,
o.order_id,
o.order_date,
o.total_amount,
p.product_name,
oi.quantity,
oi.unit_price
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id
WHERE c.country = 'United States'
AND c.status = 'active'
AND o.order_date >= '2024-01-01'
AND o.payment_status = 'paid'
AND p.category = 'Electronics'
ORDER BY o.order_date DESC
LIMIT 100;
Expected Improvements:
type: Should show ref or range instead of ALLkey: Should show composite indexes being usedrows: Should show significantly fewer rows examinedExtra: Should avoid “Using filesort” and “Using temporary”Actual Test Results (BEFORE optimization):
Table: o (orders)
type: range
key: idx_order_date
key_len: 5 (only order_date used)
rows: 25,121
Extra: Using index condition; Using where
Actual Test Results (AFTER optimization):
Table: o (orders)
type: range
key: idx_orderdate_payment_customer
key_len: 7 (order_date + payment_status used)
rows: 24,450
Extra: Using index condition
✅ Key Improvements:
payment_status moved from “Using where” (post-filtering) to index condition (index-level filtering)If the above query still has issues, consider rewriting:
-- Option 1: Filter early with subquery
SELECT
c.customer_id,
c.first_name,
c.last_name,
c.email,
o.order_id,
o.order_date,
o.total_amount,
p.product_name,
oi.quantity,
oi.unit_price
FROM (
SELECT customer_id, first_name, last_name, email
FROM customers
WHERE country = 'United States' AND status = 'active'
) c
JOIN orders o ON c.customer_id = o.customer_id
AND o.order_date >= '2024-01-01'
AND o.payment_status = 'paid'
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id
AND p.category = 'Electronics'
ORDER BY o.order_date DESC
LIMIT 100;
-- Option 2: Start from the most selective table
SELECT
c.customer_id,
c.first_name,
c.last_name,
c.email,
o.order_id,
o.order_date,
o.total_amount,
p.product_name,
oi.quantity,
oi.unit_price
FROM orders o
JOIN customers c ON c.customer_id = o.customer_id
AND c.country = 'United States'
AND c.status = 'active'
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id
AND p.category = 'Electronics'
WHERE o.order_date >= '2024-01-01'
AND o.payment_status = 'paid'
ORDER BY o.order_date DESC
LIMIT 100;
-- Enable profiling
SET profiling = 1;
-- Run original query
SELECT /* original */ ... ;
-- Run optimized query
SELECT /* optimized */ ... ;
-- View results
SHOW PROFILES;
-- Detailed analysis of last query
SHOW PROFILE FOR QUERY 2;
| Practice | Reason |
|---|---|
| Index foreign key columns | Improves JOIN performance |
| Create composite indexes high→low cardinality | Optimizer uses most selective column first |
| Put selective columns first | Filters more rows earlier in execution |
| Use covering indexes for hot queries | Eliminates table lookups completely |
| Index columns in WHERE, JOIN, ORDER BY | Direct index usage for these operations |
| Update statistics regularly | Keeps optimizer decisions accurate |
| Practice | Reason |
|---|---|
| Over-index tables (>15 indexes) | Each index adds write cost and maintenance overhead |
| Index low cardinality columns alone | Poor selectivity, full scans may be faster |
| Create redundant indexes | (a,b) already covers queries on (a) |
| Index tiny tables (<1000 rows) | Full scan is fast enough |
| Index columns with many NULLs | High NULL ratios reduce selectivity; prefer composite or specific queries |
Golden Rule: High Cardinality → Low Cardinality
-- ✅ GOOD: email (high cardinality) before status (low cardinality)
CREATE INDEX idx_customer_filter ON customers(email, status);
-- ❌ BAD: status (low) before email (high)
CREATE INDEX idx_customer_filter_bad ON customers(status, email);
A composite index on (a, b, c) can be used for:
| Query Pattern | Index Used? |
|---|---|
WHERE a = ? |
✓ Yes |
WHERE a = ? AND b = ? |
✓ Yes |
WHERE a = ? AND b = ? AND c = ? |
✓ Yes (fully) |
WHERE b = ? |
× No (skips leftmost) |
WHERE c = ? |
× No (skips leftmost) |
WHERE b = ? AND c = ? |
× No (skips leftmost) |
Remember: The first column in a composite index determines whether it can be used at all.
-- Find unused indexes
SELECT
t.TABLE_SCHEMA,
t.TABLE_NAME,
s.INDEX_NAME,
s.CARDINALITY
FROM information_schema.STATISTICS s
LEFT JOIN performance_schema.table_io_waits_summary_by_index_usage i
ON s.TABLE_SCHEMA = i.OBJECT_SCHEMA
AND s.TABLE_NAME = i.OBJECT_NAME
AND s.INDEX_NAME = i.INDEX_NAME
JOIN information_schema.TABLES t
ON s.TABLE_SCHEMA = t.TABLE_SCHEMA
AND s.TABLE_NAME = t.TABLE_NAME
WHERE s.INDEX_NAME != 'PRIMARY'
AND i.INDEX_NAME IS NULL
AND t.TABLE_SCHEMA NOT IN ('mysql', 'performance_schema', 'information_schema')
GROUP BY t.TABLE_SCHEMA, t.TABLE_NAME, s.INDEX_NAME;
-- Check index statistics
SELECT
TABLE_NAME,
INDEX_NAME,
SEQ_IN_INDEX,
COLUMN_NAME,
CARDINALITY,
SUB_PART,
INDEX_TYPE
FROM information_schema.STATISTICS
WHERE TABLE_SCHEMA = 'your_database'
AND TABLE_NAME = 'your_table'
ORDER BY TABLE_NAME, INDEX_NAME, SEQ_IN_INDEX;
Use this checklist for every slow query investigation:
| Step | Check | What to Look For |
|---|---|---|
| □ | Run EXPLAIN |
Get execution plan details |
| □ | Check type column |
Avoid ALL (full table scan) |
| □ | Verify key column |
Should NOT be NULL |
| □ | Examine rows column |
Should be low (< 10% of table) |
| □ | Review Extra field |
Avoid Using filesort, Using temporary |
| □ | Check column cardinality | Ensure indexed columns are selective |
| □ | Match composite indexes | Order should match query filters |
| □ | Consider covering indexes | Include all SELECT columns if possible |
| □ | Test with production data | Use realistic row counts |
| □ | Benchmark changes | Measure before/after performance |
Avoid these mistakes that prevent MySQL from using indexes effectively.
Problem: Applying functions to indexed columns prevents index usage.
| ❌ Avoid | ✅ Use Instead |
|---|---|
WHERE YEAR(order_date) = 2024 |
WHERE order_date >= '2024-01-01' AND order_date < '2025-01-01' |
WHERE UPPER(email) = 'USER@EMAIL.COM' |
WHERE email = 'user@email.com' (use case-insensitive collation) |
WHERE DATE(created_at) = '2024-01-15' |
WHERE created_at >= '2024-01-15' AND created_at < '2024-01-16' |
Why: MySQL can’t use the index when you transform the column value.
Problem: Comparing different data types forces MySQL to convert values, preventing index usage.
| Column Type | ❌ Wrong Type | ✅ Correct Type |
|---|---|---|
INT |
WHERE id = '123' |
WHERE id = 123 |
VARCHAR |
WHERE code = 123 |
WHERE code = '123' |
DECIMAL |
WHERE price = '19.99' |
WHERE price = 19.99 |
Why: Type conversion happens on every row, making indexes useless.
Problem: OR across different columns prevents efficient index usage.
-- ❌ BAD: Can't efficiently use indexes on both columns
WHERE country = 'US' OR status = 'active'
-- ✅ GOOD: Use UNION to let MySQL use both indexes
SELECT * FROM customers WHERE country = 'US'
UNION
SELECT * FROM customers WHERE status = 'active'
Exception:
ORon the same column works fine:WHERE status IN ('active', 'pending')
Problem: Leading wildcards prevent index range scans.
| ❌ Can’t Use Index | ✅ Can Use Index |
|---|---|
WHERE email LIKE '%gmail.com' |
WHERE email LIKE 'john%' |
WHERE name LIKE '%smith%' |
WHERE name LIKE 'Smith%' |
Alternative: For suffix searches, consider Full-Text Index or reverse the string.
When to use: Rarely, only when MySQL consistently chooses a poor index.
-- Suggest an index (MySQL may ignore)
SELECT * FROM orders USE INDEX (idx_orderdate_payment_customer)
WHERE order_date >= '2024-01-01' AND payment_status = 'paid';
-- Force an index (MySQL must use it)
SELECT * FROM orders FORCE INDEX (idx_orderdate_payment_customer)
WHERE order_date >= '2024-01-01';
-- Ignore a specific index
SELECT * FROM orders IGNORE INDEX (idx_customer_id)
WHERE order_date >= '2024-01-01';
Warning: Hints are brittle and may become incorrect as data changes. Prefer updating statistics or adding better indexes.
EXPLAIN ANALYZE
SELECT ... FROM ... WHERE ...;
This shows:
Example from our test query:
-> Limit: 100 row(s) (cost=18252 rows=100) (actual time=4.03..5.04 rows=100 loops=1)
-> Index range scan on o using idx_orderdate_payment_customer
(cost=4946 rows=24450) (actual time=3.99..4 rows=94 loops=1)
Key metrics:
cost: Optimizer’s estimated cost (18252)rows: Estimated rows (24450) vs actual rows (94)actual time: Real execution time in milliseconds (3.99ms to start, 4ms total)SET optimizer_trace='enabled=on';
SELECT ... FROM ... WHERE ...;
SELECT * FROM information_schema.OPTIMIZER_TRACE\G
SET optimizer_trace='enabled=off';
This reveals MySQL’s decision-making process for query execution.
This section covers advanced optimization techniques and production considerations.
The Cost: Each index adds:
-- Check how much space indexes use
SELECT
TABLE_NAME,
ROUND(DATA_LENGTH / 1024 / 1024, 2) AS 'Data (MB)',
ROUND(INDEX_LENGTH / 1024 / 1024, 2) AS 'Index (MB)',
ROUND(INDEX_LENGTH / DATA_LENGTH * 100, 2) AS 'Index Overhead %'
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
ORDER BY INDEX_LENGTH DESC;
Guidelines:
What it is: An index that contains ALL columns your query needs, so MySQL never has to look at the actual table.
-- Normal index (must look up customer_id and total_amount from table)
CREATE INDEX idx_orderdate_payment ON orders(order_date, payment_status);
-- Covering index (everything needed is in the index!)
CREATE INDEX idx_covering
ON orders(order_date, payment_status, customer_id, total_amount);
How to verify: EXPLAIN shows Extra: Using index
Trade-offs:
What it is: MySQL 5.6+ optimization that filters data at the storage engine level (deeper in the database) instead of at the MySQL server level.
Why it matters: Fewer rows need to be passed up to the server = faster queries.
-- Example with composite index (order_date, payment_status)
EXPLAIN
SELECT * FROM orders
WHERE order_date >= '2024-01-01'
AND payment_status = 'paid'; -- ICP filters this at storage engine level
-- Look for: Extra: Using index condition
Best for: Composite indexes with multiple WHERE conditions on indexed columns.
What it is: Tells MySQL about the distribution of your data. By default, MySQL assumes data is evenly spread out, which is often wrong.
Example Problem: If 90% of your orders are from 2024 but MySQL thinks they’re spread evenly across all years, it will choose a poor query plan.
Solution:
-- Create histogram (tells MySQL about data distribution)
ANALYZE TABLE orders UPDATE HISTOGRAM ON order_date WITH 256 BUCKETS;
-- Remove histogram
ANALYZE TABLE orders DROP HISTOGRAM ON order_date;
When to use:
What it is: Make an index invisible to the optimizer without dropping it. This is like turning off an index temporarily to test if it’s actually needed.
-- Hide the index (MySQL won't use it)
ALTER TABLE orders ALTER INDEX idx_payment_status INVISIBLE;
-- Monitor performance for a few days
-- If no slowdown, it's safe to drop
-- Delete the index
DROP INDEX idx_payment_status ON orders;
-- Or make it visible again
ALTER TABLE orders ALTER INDEX idx_payment_status VISIBLE;
Use for: Testing if an index is actually helping before permanently removing it.
What it is: MySQL keeps statistics about your data to make good decisions. If these statistics get outdated, queries slow down.
-- Check how old your statistics are
SELECT
TABLE_NAME,
UPDATE_TIME,
TIMESTAMPDIFF(DAY, UPDATE_TIME, NOW()) AS days_old
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
ORDER BY UPDATE_TIME;
-- Update statistics
ANALYZE TABLE orders;
-- Enable automatic updates when data changes
ALTER TABLE orders STATS_AUTO_RECALC=1;
Update statistics after:
SHOW INDEXWhat it is: Like a filing cabinet that gets messy over time with many updates/deletes, indexes get fragmented (wasted space between data).
-- Rebuild all indexes (use during maintenance window)
OPTIMIZE TABLE orders;
-- Or rebuild specific index without locking table
ALTER TABLE orders
DROP INDEX idx_name,
ADD INDEX idx_name(columns),
ALGORITHM=INPLACE, LOCK=NONE;
Rebuild when:
What it is: Creating indexes without locking the table (users can still read/write during creation).
-- Production-safe way to create indexes (no blocking)
CREATE INDEX idx_new ON orders(column_name)
ALGORITHM=INPLACE, LOCK=NONE;
-- Check progress (MySQL 8.0+)
SELECT
WORK_COMPLETED,
WORK_ESTIMATED,
ROUND(WORK_COMPLETED / WORK_ESTIMATED * 100, 2) AS 'Progress %'
FROM performance_schema.events_stages_current;
Important: Use ALGORITHM=INPLACE, LOCK=NONE when supported. If not supported, plan a controlled maintenance window or use online schema change tooling.
What it is: Indexes can be created in descending order to match your query’s ORDER BY direction.
-- Query: ORDER BY order_date DESC, customer_id ASC
-- Create index matching the sort order
CREATE INDEX idx_order_desc_customer_asc
ON orders(order_date DESC, customer_id ASC);
Use for: Queries with mixed sort directions (some DESC, some ASC) or pagination showing newest items first.
What it is: MySQL can use multiple indexes for one query (usually with OR conditions).
-- Query with OR - MySQL may use both indexes
SELECT * FROM orders
WHERE order_date >= '2024-01-01'
OR customer_id = 12345;
-- EXPLAIN shows: type: index_merge
-- Extra: Using union(idx_order_date,idx_customer_id)
Note: Usually helpful, but if the same OR query runs frequently, create a single composite index instead.
Enable and configure:
-- Enable slow query log
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL slow_query_log_file = '/var/log/mysql/slow-query.log';
SET GLOBAL long_query_time = 1; -- Log queries > 1 second
SET GLOBAL log_queries_not_using_indexes = 'ON';
-- Check current settings
SHOW VARIABLES LIKE 'slow_query%';
SHOW VARIABLES LIKE 'long_query_time';
Analyze with pt-query-digest (Percona Toolkit):
pt-query-digest /var/log/mysql/slow-query.log \
--limit 95%:20 \
--output json > slow-query-analysis.json
# Or for human-readable output
pt-query-digest /var/log/mysql/slow-query.log \
--limit 95%:20 \
--order-by Query_time:sum
Key metrics to monitor:
Track index usage in production:
-- Enable Performance Schema (if not already)
-- In my.cnf: performance_schema = ON
-- Find most used indexes
SELECT
OBJECT_SCHEMA,
OBJECT_NAME,
INDEX_NAME,
COUNT_STAR AS 'Uses',
COUNT_READ,
COUNT_WRITE,
COUNT_FETCH,
COUNT_INSERT,
COUNT_UPDATE,
COUNT_DELETE
FROM performance_schema.table_io_waits_summary_by_index_usage
WHERE OBJECT_SCHEMA = DATABASE()
AND INDEX_NAME IS NOT NULL
ORDER BY COUNT_STAR DESC;
-- Find unused indexes (candidates for removal)
SELECT
t.TABLE_SCHEMA,
t.TABLE_NAME,
s.INDEX_NAME,
s.CARDINALITY
FROM information_schema.STATISTICS s
JOIN information_schema.TABLES t USING (TABLE_SCHEMA, TABLE_NAME)
LEFT JOIN performance_schema.table_io_waits_summary_by_index_usage i
ON s.TABLE_SCHEMA = i.OBJECT_SCHEMA
AND s.TABLE_NAME = i.OBJECT_NAME
AND s.INDEX_NAME = i.INDEX_NAME
WHERE s.TABLE_SCHEMA = DATABASE()
AND s.INDEX_NAME != 'PRIMARY'
AND (i.INDEX_NAME IS NULL OR i.COUNT_STAR = 0)
ORDER BY s.TABLE_NAME, s.INDEX_NAME;
-- Table scan operations (may need indexes)
SELECT
OBJECT_SCHEMA,
OBJECT_NAME,
COUNT_READ,
COUNT_WRITE,
COUNT_FETCH,
SUM_TIMER_WAIT / 1000000000000 AS 'Total Wait (sec)'
FROM performance_schema.table_io_waits_summary_by_table
WHERE OBJECT_SCHEMA = DATABASE()
ORDER BY SUM_TIMER_WAIT DESC;
-- Reset statistics for fresh analysis
TRUNCATE TABLE performance_schema.table_io_waits_summary_by_index_usage;
InnoDB Buffer Pool tuning:
-- Check buffer pool hit ratio (should be > 99%)
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';
-- Calculate hit ratio
SELECT
CONCAT(
ROUND(
(1 - (Innodb_buffer_pool_reads / Innodb_buffer_pool_read_requests)) * 100,
2
),
'%'
) AS 'Buffer Pool Hit Ratio'
FROM (
SELECT
VARIABLE_VALUE AS Innodb_buffer_pool_reads
FROM performance_schema.global_status
WHERE VARIABLE_NAME = 'Innodb_buffer_pool_reads'
) AS reads,
(
SELECT
VARIABLE_VALUE AS Innodb_buffer_pool_read_requests
FROM performance_schema.global_status
WHERE VARIABLE_NAME = 'Innodb_buffer_pool_read_requests'
) AS requests;
-- Check buffer pool usage
SELECT
CONCAT(
ROUND(
(SELECT SUM(DATA_LENGTH + INDEX_LENGTH)
FROM information_schema.TABLES
WHERE ENGINE = 'InnoDB') / 1024 / 1024 / 1024,
2
),
' GB'
) AS 'Total InnoDB Data',
@@innodb_buffer_pool_size / 1024 / 1024 / 1024 AS 'Buffer Pool (GB)';
-- View what's in the buffer pool
SELECT
TABLE_NAME,
INDEX_NAME,
COUNT(*) AS 'Pages in Buffer',
ROUND(SUM(IF(IS_OLD = 'YES', 1, 0)) / COUNT(*) * 100, 2) AS 'Old Pages %'
FROM information_schema.INNODB_BUFFER_PAGE
WHERE TABLE_NAME IS NOT NULL
GROUP BY TABLE_NAME, INDEX_NAME
ORDER BY COUNT(*) DESC
LIMIT 20;
Optimal buffer pool size:
For MySQL 5.7 and earlier:
-- Note: Query cache removed in MySQL 8.0
-- Use Redis/Memcached or application-level caching instead
-- Check query cache status (MySQL 5.7)
SHOW VARIABLES LIKE 'query_cache%';
SHOW STATUS LIKE 'Qcache%';
Before creating an index:
(avg_row_length * cardinality) * 1.2ALGORITHM=INPLACE, LOCK=NONE when supportedAfter creating an index:
Backup index definitions:
# Export table structures (includes indexes)
mysqldump -u root -p --no-data --databases your_db > schema_backup.sql
# Export specific table indexes
mysql -u root -p -e "SHOW CREATE TABLE your_table\G" your_db > table_indexes.sql
Document critical indexes:
-- Generate index documentation
SELECT
TABLE_NAME,
INDEX_NAME,
GROUP_CONCAT(COLUMN_NAME ORDER BY SEQ_IN_INDEX) AS Columns,
INDEX_TYPE,
CONCAT(
'CREATE INDEX ', INDEX_NAME,
' ON ', TABLE_NAME,
' (', GROUP_CONCAT(COLUMN_NAME ORDER BY SEQ_IN_INDEX), ')'
) AS DDL
FROM information_schema.STATISTICS
WHERE TABLE_SCHEMA = DATABASE()
AND INDEX_NAME != 'PRIMARY'
GROUP BY TABLE_NAME, INDEX_NAME;
What it is: Instead of indexing an entire long text column, index only the first N characters.
Problem: Indexing long VARCHAR/TEXT columns is expensive.
Solution: Index only the first N characters.
-- Full index (large and slow)
CREATE INDEX idx_email_full ON customers(email);
-- Prefix index (smaller and faster)
CREATE INDEX idx_email_prefix ON customers(email(20));
-- Find optimal prefix length
SELECT
COUNT(DISTINCT LEFT(email, 5)) AS len_5,
COUNT(DISTINCT LEFT(email, 10)) AS len_10,
COUNT(DISTINCT LEFT(email, 15)) AS len_15,
COUNT(DISTINCT LEFT(email, 20)) AS len_20,
COUNT(DISTINCT email) AS full_length
FROM customers;
-- Use prefix that achieves ~95% of full cardinality
Limitations:
What it is: Create an index on a calculated expression or function result, not just a column value.
-- Index a function result
CREATE INDEX idx_year_month ON orders((YEAR(order_date)), (MONTH(order_date)));
-- Now this query uses the index
SELECT * FROM orders
WHERE YEAR(order_date) = 2024 AND MONTH(order_date) = 1;
-- Index JSON fields
ALTER TABLE products ADD COLUMN attrs JSON;
CREATE INDEX idx_json_price ON products((CAST(attrs->>'$.price' AS DECIMAL(10,2))));
-- Verify usage
EXPLAIN SELECT * FROM products
WHERE CAST(attrs->>'$.price' AS DECIMAL(10,2)) > 100;
What it is: MySQL 8.0.13+ can use a composite index even when you skip the first column in your WHERE clause.
-- Index: (country, city, status)
CREATE INDEX idx_location_status ON customers(country, city, status);
-- Old behavior: Cannot use index (skips first column)
SELECT * FROM customers WHERE city = 'New York' AND status = 'active';
-- MySQL 8.0.13+: Skip scan can use the index!
-- Set optimizer_switch='skip_scan=on'; (default in 8.0.13+)
-- Verify with EXPLAIN
EXPLAIN SELECT * FROM customers WHERE city = 'New York' AND status = 'active';
-- Look for: Extra: Using index for skip scan
When skip scan helps:
What it is: MySQL sorts row IDs before reading data from disk, reducing random I/O and making disk access more efficient.
-- Enable MRR
SET optimizer_switch='mrr=on,mrr_cost_based=on';
-- MRR is beneficial for range queries
SELECT * FROM orders
WHERE order_date BETWEEN '2024-01-01' AND '2024-12-31'
ORDER BY order_date;
-- EXPLAIN shows: Extra: Using MRR
-- Benefits:
-- - Reduces random I/O
-- - Better disk access patterns
-- - More efficient buffer pool usage
What it is: Override MySQL’s optimizer decisions when it chooses the wrong index or query plan.
-- Force specific index
SELECT * FROM orders FORCE INDEX (idx_orderdate_payment_customer)
WHERE order_date >= '2024-01-01';
-- Ignore specific index
SELECT * FROM orders IGNORE INDEX (idx_customer_id)
WHERE customer_id > 1000;
-- Suggest multiple indexes (MySQL chooses best)
SELECT * FROM orders USE INDEX (idx_orderdate_payment_customer, idx_customer_id)
WHERE order_date >= '2024-01-01' AND customer_id > 1000;
-- Modern approach: Optimizer hints (MySQL 8.0+)
SELECT /*+ INDEX(orders idx_orderdate_payment_customer) */
*
FROM orders
WHERE order_date >= '2024-01-01';
-- Join order hint
SELECT /*+ JOIN_ORDER(o, c, oi) */
*
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
JOIN order_items oi ON o.order_id = oi.order_id;
What it is: Split huge tables into smaller physical pieces (partitions), so queries only scan relevant partitions instead of the entire table.
-- Partition by date range
CREATE TABLE orders_partitioned (
order_id INT PRIMARY KEY,
customer_id INT,
order_date DATE,
total_amount DECIMAL(10,2)
)
PARTITION BY RANGE (YEAR(order_date)) (
PARTITION p_2022 VALUES LESS THAN (2023),
PARTITION p_2023 VALUES LESS THAN (2024),
PARTITION p_2024 VALUES LESS THAN (2025),
PARTITION p_future VALUES LESS THAN MAXVALUE
);
-- Query automatically prunes partitions
EXPLAIN PARTITIONS
SELECT * FROM orders_partitioned
WHERE order_date >= '2024-01-01';
-- Shows: partitions: p_2024,p_future (others pruned!)
-- View partition information
SELECT
PARTITION_NAME,
TABLE_ROWS,
AVG_ROW_LENGTH,
DATA_LENGTH / 1024 / 1024 AS 'Data (MB)'
FROM information_schema.PARTITIONS
WHERE TABLE_NAME = 'orders_partitioned';
Partitioning benefits:
What it is: Adjust MySQL’s internal cost calculations for how “expensive” different operations are. This is advanced and risky - only change if you know what you’re doing!
-- View current cost configuration
SELECT * FROM mysql.server_cost;
SELECT * FROM mysql.engine_cost;
-- Adjust costs (use with extreme caution!)
-- Example: Make disk seeks appear more expensive
UPDATE mysql.server_cost
SET cost_value = 2.0
WHERE cost_name = 'disk_temptable_row_cost';
-- Flush to apply changes
FLUSH OPTIMIZER_COSTS;
-- Reset to defaults
UPDATE mysql.server_cost SET cost_value = DEFAULT;
FLUSH OPTIMIZER_COSTS;
Warning: Only tune after extensive testing. Incorrect values can severely degrade performance.
What it is: Control how many data pages MySQL samples when calculating index statistics. More samples = more accurate but slower; fewer samples = less accurate but faster.
-- Global setting
SET GLOBAL innodb_stats_persistent_sample_pages = 50; -- Default: 20
-- Per-table setting
ALTER TABLE orders STATS_SAMPLE_PAGES = 100;
-- More samples = better accuracy but slower ANALYZE
-- Fewer samples = faster ANALYZE but less accurate
-- For large tables with skewed data, increase sampling
Problem:
-- Dashboard query takes 30+ seconds
SELECT
DATE(order_date) AS day,
COUNT(*) AS orders,
SUM(total_amount) AS revenue
FROM orders
WHERE order_date >= DATE_SUB(NOW(), INTERVAL 30 DAY)
GROUP BY DATE(order_date)
ORDER BY day DESC;
Analysis:
DATE(order_date)Solution:
-- 1. Create functional index (MySQL 8.0.13+)
CREATE INDEX idx_order_date_func ON orders((DATE(order_date)));
-- 2. Or rewrite query to avoid function
SELECT
DATE(order_date) AS day,
COUNT(*) AS orders,
SUM(total_amount) AS revenue
FROM orders
WHERE order_date >= DATE_SUB(CURDATE(), INTERVAL 30 DAY)
AND order_date < DATE_ADD(CURDATE(), INTERVAL 1 DAY)
GROUP BY DATE(order_date)
ORDER BY day DESC;
-- 3. For recurring queries, use materialized view/summary table
CREATE TABLE daily_order_summary (
day DATE PRIMARY KEY,
order_count INT,
total_revenue DECIMAL(12,2),
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
);
-- Update via scheduled job or trigger
INSERT INTO daily_order_summary (day, order_count, total_revenue)
SELECT DATE(order_date), COUNT(*), SUM(total_amount)
FROM orders
WHERE DATE(order_date) = CURDATE()
GROUP BY DATE(order_date)
ON DUPLICATE KEY UPDATE
order_count = VALUES(order_count),
total_revenue = VALUES(total_revenue);
Result: 30s → 50ms
Problem:
Analysis:
-- Check index count
SELECT
TABLE_NAME,
COUNT(*) AS index_count,
SUM(CARDINALITY) AS total_cardinality
FROM information_schema.STATISTICS
WHERE TABLE_SCHEMA = DATABASE()
AND TABLE_NAME = 'orders'
GROUP BY TABLE_NAME;
-- Result: 12 indexes (too many!)
Solution:
-- 1. Identify unused indexes
SELECT
s.INDEX_NAME,
s.CARDINALITY,
IFNULL(i.COUNT_STAR, 0) AS uses
FROM information_schema.STATISTICS s
LEFT JOIN performance_schema.table_io_waits_summary_by_index_usage i
ON s.TABLE_SCHEMA = i.OBJECT_SCHEMA
AND s.TABLE_NAME = i.OBJECT_NAME
AND s.INDEX_NAME = i.INDEX_NAME
WHERE s.TABLE_SCHEMA = DATABASE()
AND s.TABLE_NAME = 'orders'
AND s.INDEX_NAME != 'PRIMARY'
GROUP BY s.INDEX_NAME
HAVING uses = 0;
-- 2. Make candidates invisible first (test impact)
ALTER TABLE orders ALTER INDEX idx_rarely_used INVISIBLE;
-- 3. Monitor for 7 days, then drop if no issues
DROP INDEX idx_rarely_used ON orders;
-- 4. Consolidate overlapping indexes
-- Instead of: idx_customer(customer_id) and idx_customer_date(customer_id, order_date)
-- Keep only: idx_customer_date (covers both use cases via leftmost prefix)
Result: Write latency back to 8ms (acceptable trade-off)
Problem:
Root Cause: Stale statistics
Solution:
-- 1. Check statistics age
SELECT
TABLE_NAME,
UPDATE_TIME,
TABLE_ROWS,
AVG_ROW_LENGTH
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
ORDER BY UPDATE_TIME;
-- 2. Update statistics
ANALYZE TABLE orders, customers, order_items;
-- 3. Enable automatic statistics
ALTER TABLE orders STATS_AUTO_RECALC = 1;
ALTER TABLE orders STATS_SAMPLE_PAGES = 50;
-- 4. Create histograms for skewed columns
ANALYZE TABLE orders UPDATE HISTOGRAM ON order_date, payment_status WITH 256 BUCKETS;
-- 5. Schedule regular statistics updates
-- Add to cron: mysqlanalyze --auto-repair --optimize --all-databases
Result: Query time back to baseline, automated maintenance prevents regression
Problem:
-- Page 1000 takes 10+ seconds
SELECT * FROM orders
ORDER BY order_date DESC
LIMIT 10000, 10;
Why it’s slow: MySQL must scan 10,010 rows to skip 10,000
Solution:
-- 1. Keyset pagination (best for large offsets)
SELECT * FROM orders
WHERE order_date < '2024-01-15 10:30:00' -- Last value from previous page
ORDER BY order_date DESC
LIMIT 10;
-- 2. Deferred join (covering index + late row lookup)
SELECT o.*
FROM orders o
INNER JOIN (
SELECT order_id
FROM orders
ORDER BY order_date DESC
LIMIT 10000, 10
) AS page USING (order_id);
-- 3. Add limit on date range
SELECT * FROM orders
WHERE order_date >= '2024-01-01' -- Reasonable range
ORDER BY order_date DESC
LIMIT 10000, 10;
Result: 10s → 100ms
These checklists help maintain optimal database performance over time.
SELECT * FROM sys.session WHERE command = 'Query' AND time > 10;
ANALYZE TABLE affected_tables;
SELECT /*+ INDEX(table idx_name) */ ...
-- For queries like: "Show me recent active events for user X"
CREATE INDEX idx_timeseries ON events(
created_at DESC, -- Recent first (DESC)
event_type, -- Filter column
user_id -- Join/filter column
);
-- For queries filtering by tenant (customer isolation)
CREATE INDEX idx_tenant_composite ON records(
tenant_id, -- ALWAYS first column for data isolation
created_at DESC, -- Time ordering
status -- Status filter
);
-- Cover common filter combinations
CREATE INDEX idx_search_filters ON products(
category_id, -- Primary filter
price, -- Range filter
stock_quantity, -- Availability filter
brand_id -- Secondary filter
);
-- Covering index for fast aggregations (SUM, COUNT, AVG)
CREATE INDEX idx_reporting_cover ON orders(
order_date, -- Time range
status, -- Filter
total_amount, -- What you're aggregating
customer_id -- GROUP BY column
);
-- Efficiently query only active (non-deleted) records
CREATE INDEX idx_active_records ON users(
deleted_at, -- NULL = active
status,
created_at
);
-- Note: MySQL doesn't support partial indexes, but this pattern still helps
Effective MySQL query optimization is built on these foundations.
| # | Principle | Action |
|---|---|---|
| 1 | Measure first | Use EXPLAIN and slow query log before optimizing |
| 2 | High cardinality first | Order composite indexes by uniqueness (most unique → least unique) |
| 3 | Balance reads vs writes | Indexes can speed reads but add write overhead; measure impact |
| 4 | Keep statistics current | Run ANALYZE TABLE after significant data changes |
| 5 | Monitor in production | Use Performance Schema to track actual index usage |
| Category | Recommendation |
|---|---|
| Good indexes | Selectivity >10%, used in WHERE/JOIN/ORDER BY |
| Bad indexes | Cardinality <10% alone, rarely used, redundant |
| Production DDL | Use ALGORITHM=INPLACE, LOCK=NONE when supported |
| Buffer pool | Target >99% hit ratio |
| Index count | OLTP: 3-7 per table, OLAP: 10-15 per table |
| Statistics | Update weekly or after 10%+ data change |
“Premature optimization is the root of all evil.” — Donald Knuth
Always profile with real data, test thoroughly, and monitor after deployment. The best index strategy:
| Resource | URL |
|---|---|
| MySQL Optimization Guide | https://dev.mysql.com/doc/refman/8.0/en/optimization.html |
| Understanding EXPLAIN | https://dev.mysql.com/doc/refman/8.0/en/explain-output.html |
| Index Statistics | https://dev.mysql.com/doc/refman/8.0/en/index-statistics.html |
| Online DDL Operations | https://dev.mysql.com/doc/refman/8.0/en/innodb-online-ddl.html |
| Optimizer Hints | https://dev.mysql.com/doc/refman/8.0/en/optimizer-hints.html |
| Tool | Purpose |
|---|---|
| pt-query-digest | Analyze slow query logs (Percona Toolkit) |
| pt-online-schema-change | Zero-downtime schema changes for large tables |
| MySQLTuner | Automated configuration recommendations |
| Percona Monitoring (PMM) | Comprehensive database monitoring |
| sys schema | Built-in performance analysis views (MySQL 8.0+) |
“High Performance MySQL” by Baron Schwartz et al. (O’Reilly) The definitive guide to MySQL optimization
“MySQL 8 Query Performance Tuning” by Jesper Wisborg Krogh (Apress) Detailed query optimization techniques for MySQL 8.0+
“Effective MySQL: Optimizing SQL Statements” by Ronald Bradford Practical SQL optimization patterns
| Resource | URL |
|---|---|
| MySQL Server Blog | https://mysqlserverteam.com/ |
| Planet MySQL | https://planet.mysql.com/ |
| Percona Blog | https://www.percona.com/blog/ |
| Version | 2.0 (Accessible Edition) |
| Last Updated | January 2026 |
| MySQL Version | 8.0+ (tested on 8.0.41) |
| Target Audience | Database Administrators, Developers, DevOps Engineers, Performance Engineers |
| License | Educational Use |
This guide is based on production experience and real-world testing. All examples have been validated against MySQL 8.0.41 with actual data (10K customers, 50K orders, 100K order items).