Thursday 5 December 2013

Exchange Partition and data validation


One of the thorny issues that keeps recurring in partition maintenance operations is ensuring that rows end up in valid partitions based on each row's partition key. The issue does not apply to hash partitioned tables, but does rear its head for list, range and composite partitioned tables.

A scenario will be shown that simulates a typical load and data related problem using the partition exchange technique, followed by a way of identifying mismatched rows and a few data cleansing suggestions. The same sample tables used in a previous post will be used.

A few rows will first be inserted into both the partitioned and non-partitioned tables. The non-partitioned table will then be loaded into a partition using the partition exchange technique, with one row whose partition key is mismatched, as follows:

SQL> create table Trade_Account
  2  (
  3     Acc_ID           number not null
  4    ,Date_Created     date   not null
  5    ,is_Joint_Account varchar2(1) default 'N' not null
  6    ,Acc_Type_ID      number
  7  )
  8  partition by range (date_created)
  9  (
 10  partition p131109 values less than(to_date('2013-11-10', 'YYYY-MM-DD')),
 11  partition p131110 values less than(to_date('2013-11-11', 'YYYY-MM-DD')),
 12  partition p131111 values less than(to_date('2013-11-12', 'YYYY-MM-DD')),
 13  partition p131112 values less than(to_date('2013-11-13', 'YYYY-MM-DD')),
 14  partition p131113 values less than(to_date('2013-11-14', 'YYYY-MM-DD')),
 15  partition p131114 values less than(to_date('2013-11-15', 'YYYY-MM-DD')),
 16  partition pmax values less than (maxvalue));

Table created.

SQL> create table trade_account_load as
  2  (select * from trade_account where rownum < 1);

Table created.

SQL> insert into Trade_Account values
  2  (1,to_date('20131112','yyyymmdd'),'Y',1);

1 row created.

SQL> insert into Trade_Account values
  2  (2,to_date('20131113','yyyymmdd'),'N',1);

1 row created.


SQL> commit;

Commit complete.

Three rows will be inserted into the non-partitioned table, which will be used as the loading table. Two of the rows map to the P131114 partition in an exchange partition. The row with a date of 20131109, however, does not map to the P131114 partition.

SQL> insert into Trade_Account_load values
  2  (5,to_date('20131114','yyyymmdd'),'Y',1);

1 row created.

SQL> insert into Trade_Account_load values
  2  (6,to_date('20131114','yyyymmdd'),'N',1);

1 row created.

SQL> insert into Trade_Account_load values
  2  (7,to_date('20131109','yyyymmdd'),'N',1); --date mismatch with partition key

1 row created.

SQL> commit;


Commit complete.

Gather stats and verify number of rows per partition in the partitioned table:
 
SQL> begin
  2    dbms_stats.gather_table_stats(sys_context('userenv','current_schema')
  3                                , 'TRADE_ACCOUNT');
  4  end;
  5  /

PL/SQL procedure successfully completed.

SQL> select partition_name,num_rows
  2  from user_tab_partitions
  3  where table_name = 'TRADE_ACCOUNT';

PARTITION_NAME           NUM_ROWS
---------------------- ----------
P131109                         0
P131110                         0
P131111                         0
P131112                         1
P131113                         1
P131114                         0
PMAX                            0

7 rows selected.


Ensure that the primary keys match, or else an exchange partition including indexes will raise an exception:

SQL> alter table trade_account
  2  add constraint trade_account_pk primary key(acc_id, date_created)
  3  using index
  4  (create unique index trade_account_pk
  5   on trade_account(acc_id,date_created) local);

Table altered.

SQL> alter table trade_account_load
  2  add constraint trade_account_load_pk primary key (acc_id,date_created);

Table altered.


Perform the exchange with validation (default behaviour):

SQL> alter table trade_account
  2  exchange partition P131114 with table trade_account_load
  3  including indexes;
exchange partition P131114 with table trade_account_load
                                      *
ERROR at line 2:
ORA-14099: all rows in table do not qualify for specified partition
 

An exception is raised because one row does not match the partition key. The exchange is all or nothing, so no rows are loaded, as can be seen from gathering the stats again, which shows no change from the previous stats gather above:

SQL> begin
  2    dbms_stats.gather_table_stats(sys_context('userenv','current_schema')
  3                                , 'TRADE_ACCOUNT');
  4  end;
  5  /

PL/SQL procedure successfully completed.

SQL> select partition_name,num_rows
  2  from user_tab_partitions
  3  where table_name = 'TRADE_ACCOUNT';

PARTITION_NAME                   NUM_ROWS
------------------------------ ----------
P131109                                 0
P131110                                 0
P131111                                 0
P131112                                 1
P131113                                 1
P131114                                 0
PMAX                                    0

7 rows selected.
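
Before attempting the exchange, the loading table can be checked for rows that fall outside the bounds of the target partition. The following is a minimal sketch rather than a definitive check, assuming the P131114 bounds from the table definition above (on or after 14 November 2013 and before 15 November 2013):

```sql
-- List rows in the loading table that do not qualify for partition P131114.
-- The two date literals are the lower and upper bounds of P131114 as defined
-- in the CREATE TABLE statement; adjust them for any other target partition.
select acc_id
      ,date_created
from   trade_account_load
where  date_created <  to_date('2013-11-14', 'YYYY-MM-DD')
or     date_created >= to_date('2013-11-15', 'YYYY-MM-DD');
```

With the three sample rows inserted earlier, only the row with ACC_ID 7 satisfies the predicate. A full scan of a very large loading table has its own cost, so whether the pre-check is worthwhile depends on the volumes involved.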


If loading large volumes of data using an exchange partition, then failing the whole load because of a single mismatched row is generally inefficient. A better approach, in cases where the majority of rows match, is to load all rows and then identify those that ended up in a mismatched partition. The load can be done using the WITHOUT VALIDATION clause, which skips the check that each row qualifies for the partition, as follows:

SQL> alter table trade_account
  2  exchange partition P131114 with table trade_account_load
  3  including indexes without validation;

Table altered.


All rows were loaded into the P131114 partition as can be seen from the updated stats:

SQL> begin
  2    dbms_stats.gather_table_stats(sys_context('userenv','current_schema')
  3                                , 'TRADE_ACCOUNT');
  4  end;
  5  /

PL/SQL procedure successfully completed.

SQL> select partition_name,num_rows
  2  from user_tab_partitions
  3  where table_name = 'TRADE_ACCOUNT';

PARTITION_NAME                   NUM_ROWS
------------------------------ ----------
P131109                                 0
P131110                                 0
P131111                                 0
P131112                                 1
P131113                                 1
P131114                                 3
PMAX                                    0

7 rows selected.


Identifying the single mismatched row can be done using the ANALYZE statement with the VALIDATE STRUCTURE clause. The statement records the rowid of every mismatched row in a table, INVALID_ROWS by default (the table can be created using $ORACLE_HOME/rdbms/admin/utlvalid.sql). Ensure the table exists prior to issuing the statement:

SQL> create table invalid_rows (
  2  owner_name         varchar2(30),
  3  table_name         varchar2(30),
  4  partition_name     varchar2(30),
  5  subpartition_name  varchar2(30),
  6  head_rowid         rowid,
  7  analyze_timestamp  date);

Table created.


SQL> analyze table trade_account partition (P131114) validate structure;

Table analyzed.

SQL> select  head_rowid
  2  from
  3          invalid_rows
  4  where
  5          owner_name    = sys_context('userenv','current_schema')
  6  and     table_name    = 'TRADE_ACCOUNT'
  7  and     partition_name= 'P131114';

HEAD_ROWID
------------------
AAAd63AAIAAADKmAAC

SQL> select * from trade_account
  2  where rowid =
  3  (select head_rowid
  4   from
  5          invalid_rows
  6   where
  7          owner_name    = sys_context('userenv','current_schema')
  8   and    table_name    = 'TRADE_ACCOUNT'
  9   and    partition_name= 'P131114');

    ACC_ID DATE_CREATED        I ACC_TYPE_ID
---------- ------------------- - -----------
         7 09/11/2013 00:00:00 N           1


The single mismatched row is easily identifiable using the ROWID. The row can be deleted, and mismatched rows can then be loaded by a separate exchange partition using a different non-partitioned table.

SQL> delete from trade_account where rowid =
  2  (select head_rowid
  3   from
  4          invalid_rows
  5   where
  6          owner_name    = sys_context('userenv','current_schema')
  7   and    table_name    = 'TRADE_ACCOUNT'
  8   and    partition_name= 'P131114');

1 row deleted.

SQL> commit;

Commit complete.
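
To avoid losing the mismatched rows, they could be copied into a holding table before the delete is issued. The following is a sketch only: TRADE_ACCOUNT_REJECT is a hypothetical table name that does not appear in the original example, and the first statement must be run before the delete shown above.

```sql
-- TRADE_ACCOUNT_REJECT is a hypothetical holding table used for illustration.
-- Run this BEFORE deleting the mismatched rows from TRADE_ACCOUNT.
create table trade_account_reject as
select ta.*
from   trade_account ta
where  ta.rowid in
       (select ir.head_rowid
        from   invalid_rows ir
        where  ir.owner_name     = sys_context('userenv','current_schema')
        and    ir.table_name     = 'TRADE_ACCOUNT'
        and    ir.partition_name = 'P131114');

-- AFTER the delete, a conventional insert routes each row to the partition
-- that matches its partition key.
insert into trade_account
select * from trade_account_reject;

commit;
```

Using IN rather than = in the subquery also copes with more than one mismatched row. It is probably wise to clear INVALID_ROWS before each ANALYZE run, since entries from earlier runs may otherwise remain in the table.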


Gathering stats shows the number of rows for the P131114 partition has reduced by one.

SQL> begin
  2    dbms_stats.gather_table_stats(sys_context('userenv','current_schema')
  3                                , 'TRADE_ACCOUNT');
  4  end;
  5  /

PL/SQL procedure successfully completed.

SQL> select partition_name,num_rows
  2  from user_tab_partitions
  3  where table_name = 'TRADE_ACCOUNT';

PARTITION_NAME                   NUM_ROWS
------------------------------ ----------
P131109                                 0
P131110                                 0
P131111                                 0
P131112                                 1
P131113                                 1
P131114                                 2
PMAX                                    0

7 rows selected.

Wednesday 4 December 2013

Exchange partition and ensuring indexes are usable


In my previous post I highlighted a few of the issues one can come across when dealing with indexes and the Partition Exchange Technique (PET). The index was marked as unusable, even though the index columns and constraint type (primary) matched.

For example:

SQL> alter table trade_account
  2  exchange partition P131113 with table trade_account_load;

Table altered.

SQL> select uix1.index_name
  2       ,null as partition_name
  3       ,uix1.status
  4  from   user_indexes uix1
  5  where  uix1.table_name  = 'TRADE_ACCOUNT'
  6  and    uix1.partitioned = 'NO'
  7  union all
  8  select uip.index_name
  9       ,uip.partition_name
 10       ,uip.status
 11  from   user_ind_partitions uip
 12  inner join user_indexes uix2 on uip.index_name = uix2.index_name
 13                             and uix2.table_name = 'TRADE_ACCOUNT'
 14  order by 2;

INDEX_NAME                     PARTITION_NAME                 STATUS
------------------------------ ------------------------------ --------
TRADE_ACCOUNT_PK               P131109                        USABLE
TRADE_ACCOUNT_PK               P131110                        USABLE
TRADE_ACCOUNT_PK               P131111                        USABLE
TRADE_ACCOUNT_PK               P131112                        USABLE
TRADE_ACCOUNT_PK               P131113                        UNUSABLE
TRADE_ACCOUNT_PK               P131114                        USABLE
TRADE_ACCOUNT_PK               PMAX                           USABLE

7 rows selected.


As can be seen, the index partition for P131113 is marked as unusable. The local index partition therefore has to be rebuilt, as follows:

SQL> alter index TRADE_ACCOUNT_PK rebuild partition P131113;

Index altered.

SQL> select uix1.index_name
  2       ,null as partition_name
  3       ,uix1.status
  4  from   user_indexes uix1
  5  where  uix1.table_name  = 'TRADE_ACCOUNT'
  6  and    uix1.partitioned = 'NO'
  7  union all
  8  select uip.index_name
  9       ,uip.partition_name
 10       ,uip.status
 11  from   user_ind_partitions uip
 12  inner join user_indexes uix2 on uip.index_name = uix2.index_name
 13                             and uix2.table_name = 'TRADE_ACCOUNT'
 14  order by 2;

INDEX_NAME                     PARTITION_NAME                 STATUS
------------------------------ ------------------------------ -------
TRADE_ACCOUNT_PK               P131109                        USABLE
TRADE_ACCOUNT_PK               P131110                        USABLE
TRADE_ACCOUNT_PK               P131111                        USABLE
TRADE_ACCOUNT_PK               P131112                        USABLE
TRADE_ACCOUNT_PK               P131113                        USABLE
TRADE_ACCOUNT_PK               P131114                        USABLE
TRADE_ACCOUNT_PK               PMAX                           USABLE

7 rows selected.
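
When several index partitions are left unusable, the rebuild statements can be generated from the data dictionary rather than typed by hand. A minimal sketch, assuming all affected indexes belong to the current schema and are not subpartitioned:

```sql
-- Generate one REBUILD statement per unusable local index partition
-- on TRADE_ACCOUNT. The output is text to be reviewed and then executed.
select 'alter index ' || uip.index_name ||
       ' rebuild partition ' || uip.partition_name || ';' as rebuild_stmt
from   user_ind_partitions uip
inner join user_indexes uix on uip.index_name = uix.index_name
where  uix.table_name = 'TRADE_ACCOUNT'
and    uip.status     = 'UNUSABLE'
order by uip.index_name, uip.partition_name;
```

The query only produces the statements; it does not run them, which leaves room to check the list before any rebuild starts.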



The local index partition is now shown as usable. Assuming the indexes/constraints match, one can skip the rebuild by specifying INCLUDING INDEXES in the exchange statement, for example:

SQL> select uix1.index_name
  2       ,null as partition_name
  3       ,uix1.status
  4  from   user_indexes uix1
  5  where  uix1.table_name  = 'TRADE_ACCOUNT'
  6  and    uix1.partitioned = 'NO'
  7  union all
  8  select uip.index_name
  9       ,uip.partition_name
 10       ,uip.status
 11  from   user_ind_partitions uip
 12  inner join user_indexes uix2 on uip.index_name = uix2.index_name
 13                             and uix2.table_name = 'TRADE_ACCOUNT'
 14  order by 2;

INDEX_NAME                     PARTITION_NAME                 STATUS
------------------------------ ------------------------------ --------
TRADE_ACCOUNT_PK               P131109                        USABLE
TRADE_ACCOUNT_PK               P131110                        USABLE
TRADE_ACCOUNT_PK               P131111                        USABLE
TRADE_ACCOUNT_PK               P131112                        USABLE
TRADE_ACCOUNT_PK               P131113                        USABLE
TRADE_ACCOUNT_PK               P131114                        USABLE
TRADE_ACCOUNT_PK               PMAX                           USABLE

7 rows selected.

SQL> alter table trade_account
  2  exchange partition P131113 with table trade_account_load
  3  including indexes;

Table altered.

SQL> select uix1.index_name
  2       ,null as partition_name
  3       ,uix1.status
  4  from   user_indexes uix1
  5  where  uix1.table_name  = 'TRADE_ACCOUNT'
  6  and    uix1.partitioned = 'NO'
  7  union all
  8  select uip.index_name
  9       ,uip.partition_name
 10       ,uip.status
 11  from   user_ind_partitions uip
 12  inner join user_indexes uix2 on uip.index_name = uix2.index_name
 13                             and uix2.table_name = 'TRADE_ACCOUNT'
 14  order by 2;

INDEX_NAME                     PARTITION_NAME                 STATUS
------------------------------ ------------------------------ --------
TRADE_ACCOUNT_PK               P131109                        USABLE
TRADE_ACCOUNT_PK               P131110                        USABLE
TRADE_ACCOUNT_PK               P131111                        USABLE
TRADE_ACCOUNT_PK               P131112                        USABLE
TRADE_ACCOUNT_PK               P131113                        USABLE
TRADE_ACCOUNT_PK               P131114                        USABLE
TRADE_ACCOUNT_PK               PMAX                           USABLE

7 rows selected.


The good news is that the local partitioned index remains usable. The same however does not apply to global indexes. For global indexes one can specify the UPDATE GLOBAL INDEXES clause, which ensures they remain usable as well, as follows:

SQL> alter table trade_account
  2  exchange partition P131113 with table trade_account_load
  3  including indexes
  4  update global indexes;

Table altered.
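
The sample table has no global index, so the effect is easier to observe after adding one. The following is a sketch under that assumption: TRADE_ACCOUNT_TYPE_GIX is a hypothetical index name that is not part of the original example.

```sql
-- TRADE_ACCOUNT_TYPE_GIX is a hypothetical index used for illustration only.
-- An index created on a partitioned table without the LOCAL keyword
-- is a global (here non-partitioned) index.
create index trade_account_type_gix
on trade_account (acc_type_id);

-- After an exchange, check whether non-partitioned indexes are still usable.
select index_name
      ,status
from   user_indexes
where  table_name  = 'TRADE_ACCOUNT'
and    partitioned = 'NO';
```

Running the status query after an exchange with and without UPDATE GLOBAL INDEXES should show the difference between the two forms of the statement.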



Bear in mind that updating global indexes as part of the exchange adds a significant workload for large tables and might take a considerable amount of time. Alternatively, global indexes can be rebuilt after the exchange, and several parallel techniques are available for doing so. The only downside is that execution plans that rely on the global index cannot be used whilst the index remains in an unusable state.
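
One such technique is a parallel rebuild of the global index once the exchange has completed. A minimal sketch, assuming a hypothetical non-partitioned global index named TRADE_ACCOUNT_TYPE_GIX and an arbitrarily chosen degree of parallelism of 4:

```sql
-- TRADE_ACCOUNT_TYPE_GIX is a hypothetical global index name.
-- The degree of 4 is an arbitrary example value; size it to the system.
alter index trade_account_type_gix rebuild parallel 4;

-- A parallel rebuild leaves the parallel degree set on the index,
-- so reset it to avoid unintended parallel query against the index.
alter index trade_account_type_gix noparallel;
```

Rebuilding afterwards lets the exchange itself complete quickly and moves the index maintenance to a time of the operator's choosing.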

By default the optimizer ignores unusable indexes and picks alternative execution plans. It is however possible to stop Oracle from ignoring them: the optimizer then continues to choose plans that reference the unusable index, and those statements fail with ORA-01502 until the index is rebuilt. The initialisation parameter that controls this behaviour is SKIP_UNUSABLE_INDEXES, which can be set at either the system or session level as follows:

SQL> alter system set SKIP_UNUSABLE_INDEXES = false;

System altered.


SQL> alter session set SKIP_UNUSABLE_INDEXES = false;

Session altered.


It is advisable to keep to the default behaviour, which is SKIP_UNUSABLE_INDEXES set to TRUE, as changing it can have an unpredictable impact on execution and application behaviour.
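
The current setting can be checked, and the session returned to the default, along the following lines. This is a sketch that assumes the user has been granted access to the V$PARAMETER view, which is not the case for every account:

```sql
-- Show the current value and whether it is still the default.
-- Requires select privilege on V$PARAMETER.
select name
      ,value
      ,isdefault
from   v$parameter
where  name = 'skip_unusable_indexes';

-- Return the current session to the default behaviour.
alter session set skip_unusable_indexes = true;
```

In SQL*Plus, show parameter skip_unusable_indexes gives the same information in a shorter form.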

Another option is to specify PARALLEL along with UPDATE GLOBAL INDEXES, which to a degree mitigates the expected workload. It should be used with caution, as one cannot guarantee the overall Oracle workload and resource usage at the time of issuing the statement, and it may ultimately cause disruption to other users. Use of the PARALLEL clause looks as follows:

SQL> alter table trade_account
  2  exchange partition P131113 with table trade_account_load
  3  including indexes
  4  update global indexes parallel;

Table altered.