How to create a partition table in MySQL (Aurora)
DB | Version |
---|---|
MySQL | 5.7.24 |
Oracle has long had a partitioning feature, and MySQL also has a partitioning feature.
Generally, time-series transaction data is partitioned (in this case, the RANGE COLUMNS partition) to maintain performance.
As a rule of thumb, MySQL’s response seems to deteriorate significantly when the number of transactions exceeds 1,000,000, so using the partitioning feature can greatly improve performance.
Suppose you have a table named “person table” that contains a surrogate key, name, and date of birth.
CREATE TABLE person( id INT NOT NULL, NAME VARCHAR(30), borned DATE) PARTITION BY RANGE COLUMNS(borned) ( PARTITION p201712 VALUES LESS THAN ('2018-01-01 00:00:00') COMMENT = '2017-12-data', PARTITION p201801 VALUES LESS THAN ('2018-02-01 00:00:00') COMMENT = '2018-01-data', PARTITION p201802 VALUES LESS THAN ('2018-03-01 00:00:00') COMMENT = '2018-02-data' );
The above creates three partitions, partitioning the December 2017 data, January 2018 data, and February 2018 data by the column value borned (date of birth).
Partitioning makes it possible to narrow down the data in advance.
It is also very useful to delete only the partitioned data.
Partitions are included in the conditions.
There is a PARTITION clause, which allows you to separate the partitions by PARTITION name. In the above example, p201712, p201801, and p201802 are the partition names.
SELECT * FROM person PARTITION (p201801)
You can issue SQL like this. If you want to narrow down the search even further, you can specify a WHERE clause after the PARTITION clause.
By including partitions in the condition in this way, the data in p201712 and p201802 are not scanned, which speeds up performance. Omission of scanning unnecessary partitions is sometimes referred to as partition pruning. Partition pruning can be applied with the WHERE clause alone, without using the PARTION clause.
To see specifically which partitions are being used, prefix the SQL with “EXPLAIN PARTITIONS”.
EXPLAIN PARTITIONS SELECT * FROM person PARTITION (p201801)
To create a future date partition
Eventually, partitions will have to be created from time to time for future dates. In the above table, there is no partition for the March 2018 data.
If we insert the March 2018 data in this state, “Table has no partition for value from column_list” will occur.
In the case of MySQL, one way to cover this is to use operations. (add partitions one by one at the time of maintenance with ALTER statement)
The other is to use MAXVALUE, which I personally find troublesome and do not recommend, but if you insert data after March 2018 with this method, the data will be stored in the partition named pmirai forever, so the error can be avoided. However, if you insert data after March 2018, the data will be in the partition named pmirai forever.
However, I think this is not a realistic feature, because you will find that the longer the operation period, the more data will be in the pmirai partition.
CREATE TABLE person( id INT NOT NULL, NAME VARCHAR(30), borned DATE) PARTITION BY RANGE COLUMNS(borned) ( PARTITION p201712 VALUES LESS THAN ('2018-01-01 00:00:00') COMMENT = '2017-12-data', PARTITION p201801 VALUES LESS THAN ('2018-02-01 00:00:00') COMMENT = '2018-01-data', PARTITION p201802 VALUES LESS THAN ('2018-03-01 00:00:00') COMMENT = '2018-02-data', PARTITION pmirai VALUES LESS THAN (MAXVALUE) // this );
Oracle 11g has interval partitioning, which automatically creates a partition at the same time as an insert, so Oracle is still a great choice for RDB.
Disadvantages of putting up INDEX
There is a disadvantage that putting an INDEX on everything slows down the registration update process. Therefore, it is necessary to investigate the impact of the delay in the registration update process when putting up an INDEX.
Find out the number of cases per partition
The following SQL can be used to find out how many data items are contained for a partition.
SELECT TABLE_SCHEMA, TABLE_NAME, PARTITION_NAME, PARTITION_ORDINAL_POSITION, TABLE_ROWS FROM INFORMATION_SCHEMA.PARTITIONS WHERE TABLE_NAME = 'table_name';
primary key must include all columns in the table’s partitioning function
In the case of RANGE COLUMNS partitioning, the above error occurs, so the PK must be a composite primary key that includes both id and borned.
Setting up partitioning in liquibase
How to set up partitioning with liquibase is described in “How to use liquibase with Spring Boot (MySQL)“.
コメント