Healthcare client ETL pipeline
This independent healthcare navigator works with employees and employers to simplify the healthcare experience. Not providing insurance themselves, they work with providers to field requests from customers and point them toward the best solution. They operate as the leaders in the healthcare navigation industry, which they were instrumental in building.
Overview
THE CHALLENGE
This healthcare client had been working on a local SQL server to start the data centralization process, but they had issues with scaling the operation up as the scope of the project expanded.
THE SOLUTION
Ollion proposed setting up a full ETL pipeline to take the data through multiple transformation steps. We streamlined the ingestion/transformation process through the use of tools set up to run entirely on automation.
THE OUTCOME
Our solution contributed to time- and cost-savings. Plus, stakeholders can more easily access and analyze data. The BI team even discovered gaps in their data collection from this new system.
The challenge
Working with a constantly growing number of customers and insurance clients, the company has maintained an enormous inflow of data from various sources. This includes external data, such as customer and provider information, as well as internal data, including call logs and employee information. This data comes from several different sources, all with individual resources for viewing and extracting it. Many of these sources also have unifying information, such as ID fields, which are shared across different platforms. This led to a desire to start centralizing all this incoming data into one location.
An internal team started to work on bringing this data out of the source systems and into a centralized server with API calls and other similar methods. They knew that they wanted to eventually build out dashboards and metrics they could use internally to analyze their business efficiency but were unsure how best to proceed with building that out. They had been working on a local SQL server to start this development process but were having issues with scaling the operation up as the scope of the project expanded. This led them to explore external solutions, which brought us into the picture.
The solution
The solution we proposed involved setting up a full ETL pipeline that would take the data through multiple transformation steps. The first step in this process was utilizing AWS to trigger API calls to all their individual data sources, funneling it all into one cloud storage system (s3). Bringing data through AWS would increase the ease of access for the data and allow for more versatility with the additional steps of the ETL process. The data would then be brought into Snowflake, a cloud data warehouse, to take the raw files and start building tables for that data. Snowflake would handle the ingestion from AWS s3 by utilizing triggers that send notifications when new files are ingested into s3.
After the data would move through s3 and Snowflake, an instance of dbt would be added on top of that. Dbt is an ETL tool that allows for SQL querying, taking raw data and transforming it into more user-accessible tables that can be accessed by BI tools like Tableau and ThoughtSpot. Scheduled jobs that create and re-populate these tables are created within dbt Cloud and output the results to Snowflake. From here, the internal BI team at the client can take these finished tables and utilize them to create dashboards and other business metrics that help them determine internal metrics. The end product of this solution essentially streamlines the ingestion/transformation process through the use of three tools set up to eventually run entirely on automation.
The outcome
The goal for this project was to create a reporting suite that would give our client business insights into their consumers and their own internal proceedings. Our client's data, after going through our new pipeline, takes several disparate sources and links them together with keys and joins that the client can use to more seamlessly create visuals. Rather than having to spend time building out every metric and view within a BI tool, the work is already done and being refreshed regularly within Snowflake. This solution is saving time in the long run, as data is now easier for stakeholders to access and analyze compared to the earlier raw state. Not to mention the benefits of cloud storage allowing more versatility than their previous local server.
The reports being built off this data have also proved to be very effective within the company. This healthcare client's BI team has been reporting on new developments within the company that were discovered using our new system. Certain tables have helped point them toward gaps in their data collection, pain points with their regular users and places to improve efficiency with their call center staff. These insights will continue to grow as more data is ingested and tables are developed. Our client is now learning more about how to improve the efficiency of their business model, as well as simplifying their data handling coming from collection sources.
We were able to create a better atmosphere for client learning within the team we worked with. We gave them a better understanding of how to format and organize their data into a more user-accessible system. This will continue to have benefits in cost savings, as a lot of these practices apply to the best ways to run a data warehouse from a finance perspective, as well as from a data analyst perspective. While they are working more on the physical deliverables of this endeavor, we have been focused on creating a knowledge base within the company that allows them to continue growing even after we are finished with this project.