ETL Process Enhancement
2016 to present | Oracle, Survivorship Matrix, Kalido, Dedicated Team
Identifying and eliminating potential issues in the Extract, Transform, Load (ETL) process for reference data that comes from several operational systems.
The customer is the US national leader in customized insurance, claims, and patient safety and risk solutions for physicians, surgeons, dentists, and other healthcare professionals, as well as hospitals, senior care, and other healthcare facilities.
The ETL process receives data from multiple sources, some of which provide more detailed and complete information than others. In certain cases, attribute values received from a more detailed source could be overwritten by empty values for the same attributes received from a less detailed source. As a result, incomplete data might be loaded into the data warehouse.
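The overwrite problem described above can be illustrated with a minimal Python sketch. The record contents, the field names, and the `naive_merge` helper are hypothetical and serve only to show how a "last write wins" load loses data; they are not taken from the customer's system.

```python
# Hypothetical records describing the same entity, received from two sources.
detailed_source = {
    "id": 1,
    "name": "Dr. A. Smith",
    "specialty": "Cardiology",
    "phone": "555-0100",
}
sparse_source = {
    "id": 1,
    "name": "Dr. A. Smith",
    "specialty": None,  # the less detailed source does not know this value
    "phone": None,
}


def naive_merge(current, incoming):
    """Last write wins: every incoming attribute replaces the stored one."""
    merged = dict(current)
    merged.update(incoming)
    return merged


# The sparse record arrives after the detailed one and wipes out known values.
warehouse_row = naive_merge(detailed_source, sparse_source)
print(warehouse_row)
```

After the merge, `specialty` and `phone` are empty even though a complete value had already been received, which is exactly the kind of incomplete record that could end up in the warehouse.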
The primary task was to protect the ETL process from the issues described above, so that it always delivers the most complete, accurate, and consistent information available at the time.
We addressed the issue by adding a transformation step to the ETL process that explicitly defines whether to keep or overwrite an attribute value.
First, we analyzed all the attributes from every source and assigned a priority to each of them. Then we developed a Survivorship Matrix, which contains the attributes and their priority relationships. Its main purpose is to define whether to keep or overwrite an attribute value based on the priority of the source. The Survivorship Matrix logic is implemented as a single piece of SQL logic, which removes the need for any pre-processing.
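A minimal sketch of how such a matrix might be applied in one SQL statement is shown below. It uses Python's built-in `sqlite3` module in place of Oracle so that it runs anywhere. The table names (`staging`, `survivorship_matrix`), the source names (`CRM`, `BILLING`), the sample rows, and the rule that empty values never win are all assumptions made for illustration, not the actual production schema or logic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical matrix: one priority per (attribute, source); 1 is the highest.
    CREATE TABLE survivorship_matrix (attribute TEXT, source TEXT, priority INTEGER);
    INSERT INTO survivorship_matrix VALUES
        ('phone',     'CRM',     1), ('phone',     'BILLING', 2),
        ('specialty', 'CRM',     1), ('specialty', 'BILLING', 2),
        ('address',   'BILLING', 1), ('address',   'CRM',     2);

    -- Hypothetical staged attribute values, one row per entity/source/attribute.
    CREATE TABLE staging (entity_id INTEGER, source TEXT, attribute TEXT, value TEXT);
    INSERT INTO staging VALUES
        (1, 'CRM',     'phone',     '555-0100'),
        (1, 'BILLING', 'phone',     ''),
        (1, 'CRM',     'specialty', 'Cardiology'),
        (1, 'BILLING', 'specialty', NULL),
        (1, 'CRM',     'address',   '1 Old St'),
        (1, 'BILLING', 'address',   '9 New Ave');
    """
)

# Single statement: for each entity and attribute, keep the non-empty value
# that comes from the source with the best (lowest) priority number.
SURVIVORSHIP_SQL = """
SELECT s.entity_id, s.attribute, s.value
FROM staging s
JOIN survivorship_matrix m
  ON m.attribute = s.attribute AND m.source = s.source
WHERE s.value IS NOT NULL AND s.value <> ''
  AND m.priority = (
      SELECT MIN(m2.priority)
      FROM staging s2
      JOIN survivorship_matrix m2
        ON m2.attribute = s2.attribute AND m2.source = s2.source
      WHERE s2.entity_id = s.entity_id
        AND s2.attribute = s.attribute
        AND s2.value IS NOT NULL AND s2.value <> ''
  )
ORDER BY s.entity_id, s.attribute
"""

survivors = {(entity, attr): value for entity, attr, value in conn.execute(SURVIVORSHIP_SQL)}
print(survivors)
```

In this sketch the priorities live in a table rather than in the query, so changing which source wins for an attribute is a data update in `survivorship_matrix` and requires no change to the SQL itself.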
By implementing the Survivorship Matrix, we achieved data deduplication, data completeness, and data reliability. This approach eliminates redundant overwriting and ensures consistency over any period of time. Additionally, we increased the flexibility of the system, since the priority values can be easily changed if needed. The solution is also easily scalable, as new attribute priorities are added outside the ETL process. Moreover, the performance requirement was met with a wide margin: the total running time had to be less than 30 minutes, and it was reduced to less than 5 minutes.
Oracle, Oracle SQL Trace