In this article, I will describe some BI architectures that I have encountered on various projects and that use the ODS (operational data store) and STG (staging) phases:
Source -> STG -> ODS -> DWH
This is the architecture I use most often. The STG contains the source data without any formatting or filtering: it is an exact copy of the source data. The ODS contains the formatted data.
Because formatting happens only in a second step, you can check the STG to see whether data is missing from the source itself.
Disadvantage: for databases with a high volume of data to integrate and a FULL load mode, this architecture is not really recommended, because the data goes through three integration phases, with formatting and filtering along the way, which lengthens the time needed to feed the DWH.
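As an illustration, here is a minimal sketch of the STG and ODS phases using an in-memory SQLite database. The table names (stg_customer, ods_customer), the columns, and the sample rows are hypothetical and chosen only for the example. It shows how keeping a raw copy in the STG lets you list the rows that were dropped during formatting.

```python
import sqlite3

# Hypothetical source rows: (customer_id, name, signup_date), all raw text.
SOURCE_ROWS = [
    ("1", " alice ", "2023-01-15"),
    ("2", "BOB", "not a date"),  # cannot be formatted as a date
    ("3", "carol", "2023-03-02"),
]


def load_stg(conn, rows):
    """STG: exact copy of the source, no formatting, no filter."""
    conn.execute(
        "CREATE TABLE stg_customer (customer_id TEXT, name TEXT, signup_date TEXT)"
    )
    conn.executemany("INSERT INTO stg_customer VALUES (?, ?, ?)", rows)


def load_ods(conn):
    """ODS: formatted data; rows whose date cannot be parsed are left out."""
    conn.execute(
        "CREATE TABLE ods_customer (customer_id INTEGER, name TEXT, signup_date TEXT)"
    )
    conn.execute(
        """
        INSERT INTO ods_customer
        SELECT CAST(customer_id AS INTEGER), UPPER(TRIM(name)), date(signup_date)
        FROM stg_customer
        WHERE date(signup_date) IS NOT NULL
        """
    )


def rejected_keys(conn):
    """Keys present in STG but absent from ODS, i.e. lost during formatting."""
    rows = conn.execute(
        """
        SELECT s.customer_id
        FROM stg_customer AS s
        LEFT JOIN ods_customer AS o
               ON o.customer_id = CAST(s.customer_id AS INTEGER)
        WHERE o.customer_id IS NULL
        ORDER BY s.customer_id
        """
    ).fetchall()
    return [r[0] for r in rows]


conn = sqlite3.connect(":memory:")
load_stg(conn, SOURCE_ROWS)
load_ods(conn)
print(rejected_keys(conn))
```

With the sample rows above, the only key reported is the one whose date could not be parsed. In a real project the formatting rules and the rejection criteria would of course be richer than this single date check.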
Source -> STG -> DWH
Without going through the ODS, you can integrate the source data and format it in the staging area, then feed the DWH from several STG tables: it is a leaner mode.
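A rough sketch of this leaner mode, again with an in-memory SQLite database: the formatting is applied while loading the staging tables, and the DWH table is then fed from a join of two STG tables. All table names, columns, and sample values (stg_customer, stg_order, dwh_sales) are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE stg_customer (customer_id INTEGER, name TEXT);
    CREATE TABLE stg_order (order_id INTEGER, customer_id INTEGER, amount REAL);
    CREATE TABLE dwh_sales (order_id INTEGER, customer_name TEXT, amount REAL);
    """
)


def stage_customers(conn, rows):
    """Load customers into staging, formatting on the way in (no ODS step)."""
    formatted = [(int(cid), name.strip().upper()) for cid, name in rows]
    conn.executemany("INSERT INTO stg_customer VALUES (?, ?)", formatted)


def stage_orders(conn, rows):
    """Load orders into staging, normalizing decimal commas to dots."""
    formatted = [
        (int(oid), int(cid), float(amount.replace(",", ".")))
        for oid, cid, amount in rows
    ]
    conn.executemany("INSERT INTO stg_order VALUES (?, ?, ?)", formatted)


def load_dwh(conn):
    """Feed the DWH table from several STG tables at once."""
    conn.execute(
        """
        INSERT INTO dwh_sales
        SELECT o.order_id, c.name, o.amount
        FROM stg_order AS o
        JOIN stg_customer AS c ON c.customer_id = o.customer_id
        """
    )


stage_customers(conn, [("1", " alice "), ("2", "Bob")])
stage_orders(conn, [("10", "1", "19,90"), ("11", "2", "5.00")])
load_dwh(conn)
print(conn.execute("SELECT * FROM dwh_sales ORDER BY order_id").fetchall())
```

The trade-off is visible in the sketch: there is one load step fewer, but the staging tables no longer hold an untouched copy of the source.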
Source -> DWH
Yes, it exists!
This architecture is not recommended in terms of operations, nor in terms of organization and structure, but it is the easiest one to implement.
Without a copy of the source data, it is impossible to know where data was lost during integration (not provided by the source, rejected during the formatting of the ODS, not integrated into the DWH ...).
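For contrast, here is a small sketch of the diagnosis that the intermediate copies make possible and that Source -> DWH cannot offer. It assumes you can extract the set of business keys present at each phase; the function name locate_loss and the sample keys are hypothetical.

```python
def locate_loss(key, stg_keys, ods_keys, dwh_keys):
    """Tell at which phase a given business key disappeared.

    stg_keys, ods_keys and dwh_keys are the sets of keys found in the
    STG, ODS and DWH tables respectively (hypothetical inputs).
    """
    if key in dwh_keys:
        return "loaded"
    if key not in stg_keys:
        # The STG is an exact copy of the source, so the source never sent it.
        return "not provided by the source"
    if key not in ods_keys:
        return "rejected during ODS formatting"
    return "not integrated in the DWH"


# Hypothetical key sets extracted from each phase.
stg = {1, 2, 3, 4}
ods = {1, 2, 4}
dwh = {1, 4}

for key in (1, 2, 3, 5):
    print(key, "->", locate_loss(key, stg, ods, dwh))
```

With only the source and the DWH available, the last three outcomes collapse into a single "missing from the DWH" with no way to tell them apart.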