ETL
Extraction
Where is the Source Information?
What is the Extraction Logic?
When can you extract? (When will the data be available?)
How often should you extract?
What is the volume of changes per extract?
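One common way to answer the "how often" and "volume of changes" questions is to extract only the rows changed since the previous run. The sketch below is a minimal illustration, not a prescribed method: the table source_orders, its updated_at column, and the function names are hypothetical, and an in-memory SQLite database stands in for the real source.

```python
import sqlite3


def extract_changes(conn, last_watermark):
    """Return rows modified after last_watermark, plus the new watermark."""
    rows = conn.execute(
        "SELECT id, name, updated_at FROM source_orders "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_watermark,),
    ).fetchall()
    # If nothing changed, keep the old watermark so the next run starts there.
    new_watermark = rows[-1][2] if rows else last_watermark
    return rows, new_watermark


def demo_source():
    """Build a small in-memory source table (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE source_orders "
        "(id INTEGER PRIMARY KEY, name TEXT, updated_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO source_orders VALUES (?, ?, ?)",
        [
            (1, "a", "2024-01-01T00:00:00"),
            (2, "b", "2024-01-02T00:00:00"),
            (3, "c", "2024-01-03T00:00:00"),
        ],
    )
    return conn
```

The number of rows returned per call is the volume of changes per extract. This approach assumes the source keeps a reliable last-modified timestamp; where it does not, a full extract or a change-capture mechanism would be needed instead.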
Transformation
Cleanse the data (e.g., remove duplicates)
Checks to perform (e.g., referential integrity)
Any calculations or summarizations?
Lookups: is any other information required?
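The transformation points above can be sketched in a few small functions. This is only an illustration under stated assumptions: rows are plain dictionaries, and the names dedupe, split_by_reference, and enrich are hypothetical helpers, not part of any library.

```python
def dedupe(rows, key):
    """Keep the last occurrence of each key, ordered by first appearance."""
    latest = {}
    for row in rows:
        latest[row[key]] = row
    return list(latest.values())


def split_by_reference(rows, fk, valid_keys):
    """Referential integrity check: separate valid rows from orphans."""
    valid, orphans = [], []
    for row in rows:
        (valid if row[fk] in valid_keys else orphans).append(row)
    return valid, orphans


def enrich(rows, fk, lookup, field):
    """Lookup: add a field taken from a reference mapping."""
    return [{**row, field: lookup[row[fk]]} for row in rows]


def total(rows, field):
    """Summarization: sum one numeric field across all rows."""
    return sum(row[field] for row in rows)
```

Orphaned rows are returned rather than silently dropped, so they can be reported or routed to a reject table.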
Loading
Type of load: full or incremental
Time to Load and Scheduling
Data availability during the Loads
Dependencies for the Loads
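The difference between a full load and an incremental load can be shown with a small sketch. It assumes a hypothetical target table with an integer primary key and uses SQLite's INSERT OR REPLACE as the upsert; other databases would use their own MERGE or upsert syntax.

```python
import sqlite3


def make_target():
    """Create an empty in-memory target table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, name TEXT)")
    return conn


def full_load(conn, rows):
    """Replace the entire table contents in a single transaction."""
    with conn:  # commits on success, rolls back on error
        conn.execute("DELETE FROM target")
        conn.executemany("INSERT INTO target (id, name) VALUES (?, ?)", rows)


def incremental_load(conn, rows):
    """Insert new rows and overwrite changed ones, matched by primary key."""
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO target (id, name) VALUES (?, ?)", rows
        )
```

Running the delete and the insert of a full load inside one transaction means a failure part-way through rolls back and leaves the previous contents in place, which bears on data availability during the load.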
ETL Process
Should be able to track the loads
Load times, tables loaded, records loaded
Restartability of the Load process
Notification system for failure and success
Maintenance Jobs like cleaning staging tables, rebuilding indexes
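Tracking, restartability, and notification can be combined in one small runner. The following is a sketch under assumptions, not a definitive design: the load_log table, the run_tracked function, and the notify callback are all hypothetical, and the log is keyed by table name only, whereas a real process would likely also key it by batch or run date.

```python
import sqlite3
import time


def run_tracked(conn, steps, notify=print):
    """Run (table_name, load_fn) steps, skipping tables already loaded."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS load_log ("
        "table_name TEXT PRIMARY KEY, status TEXT, "
        "records INTEGER, seconds REAL)"
    )
    for table_name, load_fn in steps:
        done = conn.execute(
            "SELECT 1 FROM load_log "
            "WHERE table_name = ? AND status = 'success'",
            (table_name,),
        ).fetchone()
        if done:
            continue  # restart case: this table succeeded in an earlier run
        start = time.perf_counter()
        try:
            records = load_fn()  # load_fn returns the number of records loaded
            status = "success"
        except Exception as exc:
            records, status = 0, "failed"
            notify(f"load of {table_name} failed: {exc}")
        seconds = time.perf_counter() - start
        with conn:
            conn.execute(
                "INSERT OR REPLACE INTO load_log VALUES (?, ?, ?, ?)",
                (table_name, status, records, seconds),
            )
        if status == "failed":
            return False  # stop so dependent loads do not run
        notify(f"loaded {records} records into {table_name}")
    return True
```

Calling run_tracked again with the same steps after a failure resumes at the failed table, because earlier successes are already recorded in load_log.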