[Net 2000 Ltd. Home][DataBee Home][DataBee Manual][DataBee FAQ]

The Extraction Set Temporary Tables

When the Set Extractor extracts a referentially correct set of rows it does not actually extract the data. All it does is identify (by ROWID) the collection of rows in the source database which will need to be copied to the target schema in order to form a referentially correct subset there. In other words, the output of the Set Extractor is nothing more than a list of ROWID values. These ROWID values are stored until the Set Loader application uses them to perform the load.

A table named DTBTAB is used in the source schema (or proxy schema) to collect the extracted ROWIDs. There are also two other associated tables, DTB_RSTATS and DTB_TSTATS, which are used to provide supplimental information regarding the collection of ROWID's in the DTBTAB table. In summary, the purpose of each table is as follows:

The storage for the extracted rowids. There will be one entry in this table for every extracted row. This table can get very large and may well need to have its storage clause adjusted so that it consumes disk space efficiently. The ROWID value is stored in the RID column and this column is a primary key. There will never be any duplicate ROWID's in this table.
This table is used to identify the names of each extracted table. Each row in this table describes one table name and contains a TID column which is the Table ID. Each row in the DTBTAB table also contains a TID column and this is how the DataBee software knows which ROWID belongs to which table. This table will usually be quite small and will only ever have one row for each distinct table extracted.
This table is used to contain statistics on each rule involved in the extraction set. Each row describes one extraction rule and contains columns referencing the source and target table id. This table is only used during the extraction process (the Set Loader does not need it) and will only ever contain one row for every extraction rule which executes.

Important note: The three temporary tables are always truncated at the start of each new extraction run. They are not truncated if the extraction set is restarted. Once the load has completed, the temporary tables can be truncated or dropped unless another load operation is required. It is possible to perform multiple loads into different target databases without a re-extract - the Set Loader application can load the same extracted data multiple times.

If a proxy schema is configured in the Rule Controller the temporary tables should be created in the proxy login schema not in the real source schema. The real source schema can be entirely read only as far as the database privileges of the proxy account are concerned.

The three temporary tables can be created automatically using the tools on the Extraction Options tab of the Edit Rule Controller form. The Edit Rule Controller form can be activated by double clicking on the appropriate Rule Controller in the Set Designer application. The definitions of the three temporary tables are shown below. It may well be desirable to add a storage clause to size these tables appropriately.

   RID urowid,
   TID number,
   RPN number,
   DELTAMARKER number   
CREATE index tid_ix on DTBTAB (tid, rpn);

DROP table DTB_TSTATS; CREATE table DTB_TSTATS ( tid number, tname varchar2(65), ridstoretid number, rowcount number(38), RPN number ); ALTER TABLE DTB_TSTATS ADD CONSTRAINT pk_DTB_TSTATS PRIMARY KEY (tid) USING INDEX;

DROP table DTB_RSTATS; CREATE table DTB_RSTATS ( ruleid number(6), srctid number, tgttid number, passcount number(38), rowcount number(38), RPN number ); ALTER TABLE DTB_RSTATS ADD CONSTRAINT pk_DTB_RSTATS PRIMARY KEY (ruleid) USING INDEX;

[Net 2000 Ltd. Home][DataBee Home][DataBee Manual][DataBee FAQ]