Thursday, 5 December 2013

Exchange Partition and data validation


One of the thorny issues that keeps reoccurring in any related partition maintenance operation is ensuring rows end up in valid partitions based on the partition key associated with each row. The issue does not apply to hash partitioned tables, but does rear its head for list, range and composite partitioned tables.

A scenario will be shown that simulates a typical load and data related problem using the partition exchange technique. Identifying  mismatched rows and a few data cleansing suggestions will follow. The same sample tables that occur in a previous post will be used.

A few data rows will be inserted first, to both partitioned and non-partitioned tables tables. The non-partitioned table will then be loaded into a partition using the partition exchange technique where the partition keys are mismatched, as follows:

SQL> create table Trade_Account
  2  (
  3     Acc_ID           number not null
  4    ,Date_Created     date   not null
  5    ,is_Joint_Account varchar2(1) default 'N' not null
  6    ,Acc_Type_ID      number
  7  )
  8  partition by range (date_created)
  9  (
 10  partition p131109 values less than(to_date('2013-11-10', 'YYYY-MM-DD')),
 11  partition p131110 values less than(to_date('2013-11-11', 'YYYY-MM-DD')),
 12  partition p131111 values less than(to_date('2013-11-12', 'YYYY-MM-DD')),
 13  partition p131112 values less than(to_date('2013-11-13', 'YYYY-MM-DD')),
 14  partition p131113 values less than(to_date('2013-11-14', 'YYYY-MM-DD')),
 15  partition p131114 values less than(to_date('2013-11-15', 'YYYY-MM-DD')),
 16  partition pmax values less than (maxvalue));

Table created.

SQL> create table trade_account_load as
  2  (select * from trade_account where rownum < 1);

Table created.

SQL> insert into Trade_Account values
  2  (1,to_date('20131112','yyyymmdd'),'Y',1);

1 row created.

SQL> insert into Trade_Account values
  2  (2,to_date('20131113','yyyymmdd'),'N',1);

1 row created.


SQL> commit;

Commit complete.

Three rows will be inserted into the non-partitioned table, which will be used as the loading table. Two of the rows would map to the P131114 partition in an exchange partition. The row with a date of 20131109 however would not map to the P131114 partition.

SQL> insert into Trade_Account_load values
  2  (5,to_date('20131114','yyyymmdd'),'Y',1);

1 row created.

SQL> insert into Trade_Account_load values
  2  (6,to_date('20131114','yyyymmdd'),'N',1);

1 row created.

SQL> insert into Trade_Account_load values
  2  (7,to_date('20131109','yyyymmdd'),'N',1); --date mismatch with partition key

1 row created.

SQL> commit;


Commit complete.

Gather stats and verify number of rows per partition in the partitioned table:
 
SQL> begin
  2    dbms_stats.gather_table_stats(sys_context('userenv','current_schema')
  3                                , 'TRADE_ACCOUNT');
  4  end;
  5  /

PL/SQL procedure successfully completed.

SQL> select partition_name,num_rows
  2  from user_tab_partitions
  3  where table_name = 'TRADE_ACCOUNT';

PARTITION_NAME           NUM_ROWS
---------------------- ----------
P131109                         0
P131110                         0
P131111                         0
P131112                         1
P131113                         1
P131114                         0
PMAX                            0

7 rows selected.


Ensure that the primary keys match or else an exchange partition will raise an exception.
SQL> alter table trade_account
  2  add constraint trade_account_pk primary key(acc_id, date_created)
  3  using index
  4  (create unique index trade_account_pk
  5   on trade_account(acc_id,date_created) local);

Table altered.

SQL> alter table trade_account_load
  2  add constraint trade_account_load_pk primary key (acc_id,date_created);

Table altered.


Perform the exchange with validation (default behaviour):

SQL> alter table trade_account
  2  exchange partition P131114 with table trade_account_load
  3  including indexes;
exchange partition P131114 with table trade_account_load
                                      *
ERROR at line 2:
ORA-14099: all rows in table do not qualify for specified partition
 

An exception is raised since the row which does not match the partition key fails to load. All rows will therefore fail to load as can be seen from gathering the stats again, which shows no change from the previous stats gather above:

SQL> begin
  2    dbms_stats.gather_table_stats(sys_context('userenv','current_schema')
  3                                , 'TRADE_ACCOUNT');
  4  end;
  5  /

PL/SQL procedure successfully completed.

SQL> select partition_name,num_rows
  2  from user_tab_partitions
  3  where table_name = 'TRADE_ACCOUNT';

PARTITION_NAME                   NUM_ROWS
------------------------------ ----------
P131109                                 0
P131110                                 0
P131111                                 0
P131112                                 1
P131113                                 1
P131114                                 0
PMAX                                    0

7 rows selected.


If loading large volumes of data using an exchange partition, then not loading all rows due to a single row mismatch is generally an inefficient approach. A better approach, in cases where the majority of rows match, would be to load all and then attempt to identify rows that occur in a mismatched partition. This can be done using the WITHOUT VALIDATION clause as follows:

SQL> alter table trade_account
  2  exchange partition P131114 with table trade_account_load
  3  including indexes without validation;

Table altered.


All rows were loaded into the P131114 partition as can be seen from the updated stats:

SQL> begin
  2    dbms_stats.gather_table_stats(sys_context('userenv','current_schema')
  3                                , 'TRADE_ACCOUNT');
  4  end;
  5  /

PL/SQL procedure successfully completed.

SQL> select partition_name,num_rows
  2  from user_tab_partitions
  3  where table_name = 'TRADE_ACCOUNT';

PARTITION_NAME                   NUM_ROWS
------------------------------ ----------
P131109                                 0
P131110                                 0
P131111                                 0
P131112                                 1
P131113                                 1
P131114                                 3
PMAX                                    0

7 rows selected.


Identifying the single mismatched row can be done by using the ANALYZE with the associated VALIDATE STRUCTURE statement. The statement loads all mismatched rows into a table, INVALID_ROWS (created using $ORACLE_HOME/rdbms/admin/utlvalid.sql). Ensure the table exists prior to issuing the statement:

SQL> create table invalid_rows (
  2  owner_name         varchar2(30),
  3  table_name         varchar2(30),
  4  partition_name     varchar2(30),
  5  subpartition_name  varchar2(30),
  6  head_rowid         rowid,
  7  analyze_timestamp  date);

Table created.


SQL> analyze table trade_account partition (P131114) validate structure;

Table analyzed.

SQL> select  head_rowid
  2  from
  3          invalid_rows
  4  where
  5          owner_name    = sys_context('userenv','current_schema')
  6  and     table_name    = 'TRADE_ACCOUNT'
  7  and     partition_name= 'P131114';

HEAD_ROWID
------------------
AAAd63AAIAAADKmAAC

SQL> select * from trade_account
  2  where rowid =
  3  (select head_rowid
  4   from
  5          invalid_rows
  6   where
  7          owner_name    = sys_context('userenv','current_schema')
  8   and    table_name    = 'TRADE_ACCOUNT'
  9   and    partition_name= 'P131114');

    ACC_ID DATE_CREATED        I ACC_TYPE_ID
---------- ------------------- - -----------
         7 01/11/2013 00:00:00 N           1


The single mismatched row is easily identifiable using the ROWID. The row can be deleted and an alternative exchange partition using a different non-partition table could be performed to load mismatched rows.

SQL> delete from trade_account where rowid =
  2  (select head_rowid
  3   from
  4          invalid_rows
  5   where
  6          owner_name    = sys_context('userenv','current_schema')
  7   and    table_name    = 'TRADE_ACCOUNT'
  8   and    partition_name= 'P131114');

1 row deleted.

SQL> commit;

Commit complete.


Gathering stats shows the number of rows for the P131114 partition has reduced by one.

SQL> begin
  2    dbms_stats.gather_table_stats(sys_context('userenv','current_schema')
  3                                , 'TRADE_ACCOUNT');
  4  end;
  5  /

PL/SQL procedure successfully completed.

SQL> select partition_name,num_rows
  2  from user_tab_partitions
  3  where table_name = 'TRADE_ACCOUNT';

PARTITION_NAME                   NUM_ROWS
------------------------------ ----------
P131109                                 0
P131110                                 0
P131111                                 0
P131112                                 1
P131113                                 1
P131114                                 2
PMAX                                    0

7 rows selected.

No comments:

Post a Comment