Pre-creating Interval Partitions
09/02/2016 6 Comments
One of the major problems with interval-based partitioning is the statistics. Partitions appear dynamically as-needed based upon data being inserted or udpated, and any partition magically appearing isn’t going to have any statistics.
This is generally a stability issue as you will, at best, be using dynamic statistics for your optimizations. So how do we deal with it? My preferred method is to pre-create the partitions and copy statistics from a good partition into the pre-created partition. But how do we get the partitions to appear? Here’s 2 options:
- Insert data into the row with the partition key for the partition we wish to create, and rollback. This can be tricky, especially with tables containing many NOT NULL columns, and is subject to failure based upon table evolution.
- Lock the relevant partition in shared mode using the commandLOCK TABLE .x. PARTITION FOR <partition-key> IN SHARE MODE;
This will place a shared lock on the non-existant partition, which Oracle will create so it can lock it. A much less messy solution, and not one I had thought of until shown the light by Dan Jankowski.
So does option 2 work? Here’s a quick example:
10:51:55 NEIL @ ORCL01 > CREATE TABLE interval_table (id NUMBER, created_date DATE) 10:51:55 2 PARTITION BY RANGE (created_date) INTERVAL (NUMTOYMINTERVAL(1,'MONTH')) 10:51:55 3 ( PARTITION part_01 values LESS THAN (TO_DATE('01-JAN-2015','DD-MON-YYYY'))) 10:51:55 4 / Table created. 10:51:55 NEIL @ ORCL01 > select table_name, partition_name,high_value from user_tab_partitions order by 1,2; TABLE_NAME PARTITION_NAME HIGH_VALUE ------------------------------ ------------------------------ -------------------------------------------------------------------------------- INTERVAL_TABLE PART_01 TO_DATE(' 2015-01-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA use a shared lock to generate a new partition 10:51:55 NEIL @ ORCL01 > lock table interval_table partition for (to_date('01-JAN-2016','DD-MON-YYYY')) in share mode; Table(s) Locked. 10:51:55 NEIL @ ORCL01 > select table_name, partition_name,high_value from user_tab_partitions order by 1,2; TABLE_NAME PARTITION_NAME HIGH_VALUE ------------------------------ ------------------------------ -------------------------------------------------------------------------------- INTERVAL_TABLE PART_01 TO_DATE(' 2015-01-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA INTERVAL_TABLE SYS_P647 TO_DATE(' 2016-02-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA 10:51:55 NEIL @ ORCL01 > lock table interval_table partition for (to_date('01-FEB-2016','DD-MON-YYYY')) in share mode; Table(s) Locked. 10:51:55 NEIL @ ORCL01 > lock table interval_table partition for (to_date('01-MAR-2016','DD-MON-YYYY')) in share mode; Table(s) Locked. 10:51:55 NEIL @ ORCL01 > lock table interval_table partition for (to_date('01-APR-2016','DD-MON-YYYY')) in share mode; Table(s) Locked. 10:51:55 NEIL @ ORCL01 > select table_name, partition_name,high_value from user_tab_partitions order by 1,2; TABLE_NAME PARTITION_NAME HIGH_VALUE ------------------------------ ------------------------------ -------------------------------------------------------------------------------- INTERVAL_TABLE PART_01 TO_DATE(' 2015-01-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA INTERVAL_TABLE SYS_P647 TO_DATE(' 2016-02-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA INTERVAL_TABLE SYS_P648 TO_DATE(' 2016-03-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA INTERVAL_TABLE SYS_P649 TO_DATE(' 2016-04-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA INTERVAL_TABLE SYS_P650 TO_DATE(' 2016-05-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA 10:51:55 NEIL @ ORCL01 > -- and release the locks... the rollback doesn't rollback the creation of the new partitions. 10:51:55 NEIL @ ORCL01 > rollback; Rollback complete.
So now what? To get the stats right I follow the following rule set:
Firstly, lock the table stats when you create the table and have a dedicated, focused stats job. This stop the general stats job from getting in the way of this this important partitioned table.
- Create a new partition
- Copy stats using DBMS_STATS.COPY_TABLE_STATS from your fullest partition (with FORCE=TRUE to override the locked stats option). Always try to pretend you have more data than you really have if faking it – it’s safer as an optimized plan with larger expected data sets processing a smaller data set tends to work much better than the stats assuming a small data set and being surprised by lots of data. Consider using SCALE_FACTOR if you have a growing dataset. Don’t reply upon Optimizer magic, such as Adaptive Execution Plans to dig out of a hole.
- Once a partition becomes “old” (like a no-longer used date-range partition), gather actual stats to get all your stats for that partition accurate. That may even become your new baseline stats copy partition. You will possibly never need to gather stats again for that partition.
- Be careful if you are copying stats when you have an object-level change. For example, if you put a new index on, you need to re-copy stats post change to any pre-created partitions. Empty pre-created partitions will have accurate (i.e. empty) stats and that’s really not what you want.
- Repeat as often as you pre-create a new partition
It looks to me like a nice test only. What is the benefits of implementation of such meaningless partition names you are ending up with? Any table, partition, constraint should always have strong meaningful name. For Oracle all management still performed on Object ID level. This name should always simplify management of the system. Otherwise even navigation on the system getting so complicated that mistakes are unavoidable.
Any object name on a system always should carry a good sense to a human mind. This requirement imposed by Oracle not by occasion and any DBA should follow it.
LikeLike
This is the name that Oracle gives them. You cannot name interval partitions.
The name of the partition is not relevant when it comes to how the optimizer interacts with the partitions or how we can perform partition-wise actions such as truncate.
The advantages of interval partitions are hugely outweighted by a system-generated naming standard. The main drawback is the lack of statistics, but Oracle will be rectifying this in a future release.
LikeLike
Neil, Oracle always provide you “default option” – read – the worst option. Default essentially is a rope to hung up yourself. After such “fast prebuild” good DBA have to come and rename each partition in a “human” way.
After all – Partitions has been invented as a way to simplify and speed up data management (partition archiving, partition archiving exchange and so on). Any “System Named” object is a nightmare for all human maintenance operations in the future. “Optimizer” argument actually out of scope – as far as statistics get collected at the time we processing data in the database it is almost irrelevant at which partition that statistics get listed – at partition A or B. It get collected – this is end of the story.
In the past I’ve been working with hourly/daily table partitions (size about 100 mill recs each).
Can you imagine headache of renaming and other maintenance such “prebuild” brings to life?
DBA of your statue should not even try to dispute obvious things with such childish arguments.
I hope now you will agree.
Thank you.
Norman
LikeLike
Norman,
Partitioning is not just implemented for data management.
Some sites use it to improve performance and don’t need to actively manage partitions.
Some sites don’t care what the partitions are called and use the high_value to identify and manipulate the partitions in case someone named a partition incorrectly. It’s only a name and does not affect function.
Some sites would rather have dynamically appearing partitions via INTERVAL partitioning to ensure all data is appropriately captured in case unexpected values appear. In Oracle 12.2, this is extended to LIST partitioning, so there’s obviously a market for dynamic partition creation, regardless of partition name.
The fundamental problem with interval partitions is that it is not possible to capture the point of creation, so you will inevitably end up with bad execution plans if you have data in the partition but no statistics. My blog post provides an option to help with this. It comes from the real world. It’s not a test case.
I would expect that, given your use-case, that you manually pre-create each partition and sub partition in advance and spend a lot of time understanding and managing your database. I suspect that you don’t use INTERVAL partitioning, but you seem to be against it because the system generated partition names aren’t nice without considering the appropriate use cases.
It is increasing the case that DBA’s don’t spend time on a single database and need to manage dozens. I believe Gartner said that 40 is average. Given this, the average DBA has 1 hour per database per week. Any form of automation is welcome and INTERVAL partitioning is appropriate in many cases. When Oracle add the statistical templates to INTERVAL partitioning in a future release (probably 12.2.3), this blog post will be a little obsolete – but only once everyone has upgraded.
Oracle is a very flexible and configurable product. I hope you will think a little more about use-cases which do no match your own challenging environment.
regards
Neil
LikeLike
I am two years late with a comment, but here it is anyway:
With the new “FOR …” syntax using the partition keys content instead of the names they are needed in far fewer cases. If meaningful names are required, then they can be set immediately after creation, for example
alter table interval_table rename partition for(date ‘2015-01-01’) to P_201501;
LikeLiked by 1 person
Identifying the table partition by its value using the new syntax arrived in 12.1 and lowers the chance of a mistake due mistyping the generated names.
LikeLike