User:Sireesha.kalluru/sandbox

Data Integration Tools: ETL: The Heart and Soul of BI
Abstract:

Research paper to cover about latest and greatest data integration tool by mega vendor Oracle Data Integration (ODI) ELT tool heart and soul of BI. There was much excitement around the changing nature of ETL described as ELT. In traditional ETL, data is first extracted from source tables, and moved to file system / staging tables (usually in an intermediate tier) where transformations occur before loading into target warehouse tables. The data is then loaded into the target data warehouse. In ELT, the data is extracted, loaded into the target data warehouse staging, and then transformed. This can eliminate the need for staging tables and speed the entire process.

Introduction:

Data integration (DI) is a process of collecting data from heterogeneous data sources, combining and presenting a unified view of these data to the business user.

ETL is the heart and soul of business intelligence (BI). ETL processes bring together and combine data from multiple source systems into a centralized data warehouse, enabling all users to work off a single, integrated set of data which is key process enabler and competitive weapon for an organization.

BI systems are no longer nice to have, but have become essential to business continuity of success. These systems are no longer stand-alone and separate from operational processing—they are very well integrated with overall business processes. As a result of this evolution, an effective BI environment is based on integrated data which enables users to create strategic, tactical, and operational decisions that initiates’ the business very effectively on a day-to-day basis.

In recent days typical Data integration platforms are designed to meet a higher segment of an enterprise BI requirement which supports different modes of data structure integrations.

Oracle’s Data Integration Strategy have the new requirements for Data Integration includes Any Data, Any Source, and On-Premise, On Cloud, Zero Downtime Maximum Performance, Real-Time Analytics and Data Quality.

While the tools provide substantial benefits in implementation and ongoing support, they do not directly address the most challenging part of a data integration problem — determining the rules by which data will be matched, merged, transformed and cleansed.

Oracle Data Integrator :

ODI benefits in reducing costs and quickens time to value i.e. make available right data to the business faster. Reduces risk associated with stale or bad quality data. Attain operational excellence based up on data analysis and better decisions based on real-time information.

ODI provides:

•	In-database E-LT Transformation vs. E-T-L.

•	Database horse power is leveraged to do the data transformations.

•	Hence eliminates the network hop, non-database transformation delays.

•	Best in class performance, scalability, availability.

•	Lightweight deployment for lower TCO (total cost of ownership).

•	Quicker time-to-value over declarative design.

•	Improved Developer Productivity and Lower Maintenance Costs.

With an ELT approach, the tool scales up linearly with the database:

•	 Any investments on new processing power for the database are immediately reflected in performance of the ETL processes.

•	 No need to re-analyze performance in the middle tier.

•	 No need to invest into more hardware for the middle tier.

•	 Because SQL code offers SET BASED processing, ODI benefits from the unprecedented parallel processing capabilities of the databases.

•	ODI offers Knowledge module code templates for broad category of source database technologies.

ETL (Extract Transform and Load) vs ELT (Extract Load Transform) Approach:

Operations of ELT (Extract Load Transform) tools are:

1. Extract the data from the source tables.

2. Load the tables into the destination or target server, and then

3. Transform the data on the target RDBMS using native SQL operators.

4. E-LT there is no need for a middle-tier engine or server.

5. Process of extract and load can be isolated from the transformation process.

6. Declarative Set-based Design shortens the number of steps that automatically generates the Data flow whatsoever the sources and target DB.

ETL tools operate by the following:

1. Extracting the data from various source systems.

2. Transform the data on a proprietary, middle-tier ETL engine, and then

3. Loading the transformed data the target data warehouse or integration server.

4. ETL incorporates the best aspects of manual coding and automated code-generation approaches.

5. ETL specifies Data Flow Graph so developer must define every step of complex ETL flow Logic which will increase significant development and maintenance efforts.

Conclusion: Organizations need to move more data from more data from more sources more quickly into a variety of distributed BI applications. ETL/ELT process is heart and soul of business intelligence applications. They control the flow of data between numerous source systems and BI applications.

As BI environments expand and grows more complex, ETL/ELT tools are essential to change to retain the pace. The various latest tools in market makes easy to implement ELT very efficiently, and gain the clear benefits from using ELT over ETL.

References:

http://www.oracle.com/us/products/middleware/data-integration/odi-ee-ds-2030747.pdf

http://my.gartner.com/portal/server.pt?open=512&objID=260&mode=2&PageID=3460702&resId=512097&ref=

http://www.appsassociates.com/downloads/Live-Webinar-BI-Applications-with-ODI-GoldenGate-22-July-2014.pdf

http://trianzblog.com/wordpress/?p=203