Database Meets AI: A Survey

Learning-based Database Configuration

Problem: find the set of configuration (knob) values that gives a DBMS its best performance.

Finding the best configurations by branch and bound.

Finding the best configurations using traditional ML-based methods.

SIGMOD'17 - Automatic Database Management System Tuning Through Large-scale Machine Learning
- Alias: OtterTune
- Read Note
EDBT'19 - SparkTune: tuning Spark SQL through query cost modeling
Cons
- The optimal solution obtained in the current stage is not guaranteed to be optimal in other stages.
- Requires a large number of high quality samples for training.
- Cannot support too many knobs.

Uses a Reinforcement Learning (RL) agent to find the best configurations for a DBMS.

SIGMOD'19 - An End-to-End Automatic Cloud Database Tuning System Using Deep Reinforcement Learning
- Alias: CDBTune
- Method
  - The RL Modeling:
    - Environment: a cloud DBMS
    - State: the internal metrics of the DBMS (similar to OtterTune)
    - Action: the amounts by which to increase or decrease the configuration values (knobs)
    - Reward: the change in the DBMS's performance after the action is applied
    - Agent Model: Deep Deterministic Policy Gradient (DDPG)
- Pros
  - Does not need high-quality training data
- Cons
  - Does not consider workload features
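
To make the state/action/reward mapping above concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: `ToyDBMSEnv`, its two normalized knobs, the quadratic performance surface, and the `tune` helper are hypothetical and do not come from the CDBTune paper. The policy is a simple keep-if-better random perturbation, not DDPG; it only exercises the loop that an RL agent would drive.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ToyDBMSEnv:
    """Hypothetical stand-in for a cloud DBMS with two knobs normalized to [0, 1]."""
    knobs: list = field(default_factory=lambda: [0.2, 0.8])
    # Assumed optimum of the toy performance surface (not from any real system).
    optimum: tuple = (0.7, 0.4)

    def performance(self) -> float:
        # Toy performance score: 1.0 when every knob sits at its assumed optimum.
        return 1.0 - sum((k - o) ** 2 for k, o in zip(self.knobs, self.optimum))

    def state(self) -> list:
        # State: internal metrics, faked here as simple functions of the knobs.
        return [self.performance(), sum(self.knobs)]

    def step(self, action: list):
        # Action: one signed delta per knob (increase or decrease), clipped to [0, 1].
        before = self.performance()
        self.knobs = [min(1.0, max(0.0, k + a)) for k, a in zip(self.knobs, action)]
        # Reward: the change in performance caused by the action.
        reward = self.performance() - before
        return self.state(), reward


def tune(env: ToyDBMSEnv, steps: int = 200, seed: int = 0) -> float:
    """Placeholder policy (NOT DDPG): try random deltas, keep only non-harmful ones."""
    rng = random.Random(seed)
    for _ in range(steps):
        saved = list(env.knobs)
        action = [rng.uniform(-0.1, 0.1) for _ in env.knobs]
        _, reward = env.step(action)
        if reward < 0:
            env.knobs = saved  # roll back a change that hurt performance
    return env.performance()
```

In a real system the placeholder policy would be replaced by an actor-critic agent that maps the state vector to knob deltas, and the reward would come from measured DBMS performance rather than a closed-form function.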
VLDB'19 - QTune: A Query-Aware Database Tuning System with Deep Reinforcement Learning
- Alias: QTune
- Method
  - Essentially the same as CDBTune, but also considers workload (query) features.
  - Uses a double-state deep reinforcement learning model, Double-State Deep Deterministic Policy Gradient (DS-DDPG)

(Reading...)