Thursday, April 16, 2009

Delta Detection Techniques in ETL Architecture

I'm reading: Delta Detection Techniques in ETL ArchitectureTweet this !
Log of transactions: For this approach, you need a log of all the changes to the source tables. This approach is most commonly used for transaction records, and for changes to a master table when the table is very large.

Snapshot of source system: This approach is to copy the entire content of tables to staging and then determine the changed/added records. This is a time consuming process, particularly if the source tables are huge. This is a preferred method for small tables, such as some master tables. This approach does not capture intermediate changes.

Snapshot of changed records: This approach is to capture the changed/added records directly from master or transactional tables. You do not need a separate log of changes. But this process does require the modification date and time of the changes. This approach does not capture intermediate changes.

There are various delta detection techniques to choose from based on the capabilities of the source system and business requirements. If the source system creates change logs, then detecting extracting the delta is relatively straightforward. If the source system does not help with delta detection and you cannot make changes to the source, then the extraction process has to bring in all the transactional records to staging and determine the changed records. This is time consuming and for huge OLTP data volumes it is not operationally feasible. Hence choosing appropriate delta detection technique becomes important for efficient and effective extraction process.

Reference: BI Architecture Design Whitepaper

No comments:

Related Posts with Thumbnails