Improving Data Quality at a Regional Bank | A Case Study

A regional US bank used OvalEdge to tackle data quality issues that were leading to inaccurate liquidity risk and credit risk predictions, and potential compliance violations.

Client Context

The customer was a mid-sized, US-based bank with nearly five billion dollars in assets. From a services perspective, the bank focused on lending and also supported deposit accounts. Consequently, much of the bank's IT infrastructure was centered on risk assessment.

From a business technology perspective, the bank had a series of systems in place to aid the day-to-day running of operations. Beyond these technologies were the critical financial and compliance-related analytics tasks that the bank's analytics team carried out using data funneled through those applications. These tasks included:

  • Credit risk assessment
  • Liquidity risk assessment
  • HMDA compliance reporting
  • Customer 360 view

The bank had a robust technology stack with dedicated tools to support its core analysis and risk mitigation strategy. However, data issues impacted its ability to manage credit and liquidity risk and adhere to the various financial compliance laws that governed it.

Key Pain Points

The bank's data issues affected both business users and analytics teams. For business users, poor data quality was the primary pain point. They could not retrieve high-quality data to carry out accurate credit and liquidity risk assessments, and these data quality issues manifested at various levels.

Poor data quality had a profound impact on credit risk assessment, and the problems were most acute in the way income was recorded.

The most critical credit risk assessment issues stemmed from failures to capture gross income. Many applicants had secondary income sources, such as second jobs or rental properties, yet this income was often left out of the recorded total.

Beyond this, the bank's loan origination system didn't specify whether income should be calculated weekly, bi-weekly, monthly, or annually. This lack of standardization led to operational errors later in the pipeline and to inconsistent income figures.
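
To illustrate the kind of standardization that was missing, here is a minimal sketch of how income figures could be annualized and aggregated across sources. The field names, period codes, and sample figures are hypothetical, not the bank's actual loan origination schema.

    # Illustrative sketch only: field and rule names are hypothetical,
    # not the bank's or OvalEdge's actual schema.

    PERIODS_PER_YEAR = {"weekly": 52, "bi-weekly": 26, "monthly": 12, "annually": 1}

    def annualize(amount: float, period: str) -> float:
        """Convert an income figure to an annual amount using an agreed period code."""
        if period not in PERIODS_PER_YEAR:
            raise ValueError(f"Unrecognized income period: {period!r}")
        return amount * PERIODS_PER_YEAR[period]

    def gross_annual_income(income_records: list[dict]) -> float:
        """Sum primary and secondary income sources (e.g. second jobs, rentals)."""
        return sum(annualize(r["amount"], r["period"]) for r in income_records)

    # Example: a borrower paid bi-weekly with monthly rental income.
    records = [
        {"source": "primary_job", "amount": 2500.0, "period": "bi-weekly"},
        {"source": "rental_property", "amount": 900.0, "period": "monthly"},
    ]
    print(gross_annual_income(records))  # 2500*26 + 900*12 = 75800.0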

Finally, the income reported on the bank's Loan/Application Register (LAR) system sometimes differed from the amount used for work papers. That's because income was being pulled from different places, sometimes from the LAR system and other times from the work papers.
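
A simple reconciliation rule can flag applications where the two figures disagree. The sketch below is illustrative only; the field names and the tolerance are assumptions, not the bank's actual controls.

    # Hypothetical reconciliation check; field names and the tolerance are assumptions.

    def reconcile_income(lar_income: float, workpaper_income: float,
                         tolerance: float = 0.01) -> bool:
        """Return True if the two reported incomes agree within a relative tolerance."""
        largest = max(abs(lar_income), abs(workpaper_income))
        if largest == 0:
            return True
        return abs(lar_income - workpaper_income) / largest <= tolerance

    applications = [
        {"app_id": "A-1001", "lar_income": 72000.0, "workpaper_income": 72000.0},
        {"app_id": "A-1002", "lar_income": 64000.0, "workpaper_income": 58500.0},
    ]

    mismatches = [a["app_id"] for a in applications
                  if not reconcile_income(a["lar_income"], a["workpaper_income"])]
    print(mismatches)  # ['A-1002']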

For liquidity risk assessment, a large volume of low-quality data and inconsistent KPIs made it difficult to predict cash flow and liquidity accurately. In particular, data on deposit holders wasn't always accurate, which significantly undermined the bank's cash flow projections.

Data quality issues also impacted the bank's HMDA compliance reporting. This was a major pain point for the bank, which cited reporting in compliance with the Home Mortgage Disclosure Act of 1975 (HMDA) as one of its most ineffective yet critical business tasks. One of the most prominent issues the bank experienced concerned the recording of Government Monitoring Information (GMI) data. The terms used for this reporting requirement were inconsistent, and many didn't adhere to the policies laid out in the HMDA.
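
A controlled-vocabulary check is one way such inconsistencies can be caught. The sketch below is a hedged illustration; the allowed values shown are placeholders, not the official HMDA code set.

    # Sketch of a controlled-vocabulary check for GMI fields. The allowed values
    # below are illustrative placeholders, not the official HMDA code set.

    ALLOWED_GMI_VALUES = {
        "ethnicity": {"Hispanic or Latino", "Not Hispanic or Latino",
                      "Information not provided", "Not applicable"},
        "sex": {"Male", "Female", "Information not provided", "Not applicable"},
    }

    def invalid_gmi_fields(record: dict) -> list[str]:
        """Return the GMI fields whose values fall outside the agreed vocabulary."""
        return [field for field, allowed in ALLOWED_GMI_VALUES.items()
                if record.get(field) not in allowed]

    print(invalid_gmi_fields({"ethnicity": "Latino", "sex": "Female"}))  # ['ethnicity']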

Related Post: Implementing Data Quality for Fair Lending Compliance in Banking

The bank's Customer 360 analysis was fundamentally flawed because there was no single pane of glass from which to collect and analyze the data. With no company-wide collaboration process in place, users didn't know who to ask to find the data they needed, and the problem affected every domain in the business.

Beyond business users, some data issues particularly impacted specialized teams in the bank. From a data privacy compliance perspective, there was no official governance structure or designated lead, such as a Chief Compliance Officer, to ensure the bank adhered to the fundamentals of data privacy: classifying and securing PII and upholding customers' rights to know what data the bank holds about them and to have it deleted.
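
As a rough illustration of the classification work such a structure would oversee, the sketch below tags likely PII columns by name. The patterns and tags are assumptions made for illustration, not OvalEdge's classification engine.

    # Minimal sketch of rule-based PII classification, assuming column names only;
    # the patterns and tags are illustrative.
    import re

    PII_PATTERNS = {
        "SSN": re.compile(r"ssn|social_security", re.I),
        "EMAIL": re.compile(r"email", re.I),
        "PHONE": re.compile(r"phone|mobile", re.I),
        "NAME": re.compile(r"first_name|last_name|full_name", re.I),
    }

    def classify_columns(columns: list[str]) -> dict[str, str]:
        """Tag columns that look like PII so they can be secured and made deletable."""
        tags = {}
        for col in columns:
            for tag, pattern in PII_PATTERNS.items():
                if pattern.search(col):
                    tags[col] = tag
                    break
        return tags

    print(classify_columns(["customer_ssn", "email_address", "loan_amount"]))
    # {'customer_ssn': 'SSN', 'email_address': 'EMAIL'}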

Teams working with BI tools also encountered difficulties because financial data was spread across various systems and siloed within domains. This made it extremely difficult for BI teams to find and utilize the data they needed to inform existing models and expand into new analytics technologies.

The bank's data engineers particularly struggled to create a unified, streamlined process for developing new risk assessment models that kept pace with the evolution of the bank's customer base and service offerings.

OvalEdge Solution

The bank chose OvalEdge as its data governance platform. After the bank purchased the tool, the first step was to organize an internal team to begin implementing the processes required to address its existing data issues. The implementation took place over two core stages.

Stage One

The company's data processes were a black box for business users because they didn't understand how the data collected and processed in the systems they used fed the analyses the bank relied on. The first step was to crawl all of the transactions and create a visualization explaining how data flowed through, and was calculated by, the bank's various systems.

This stage was about helping the bank's business users understand how the technology systems they used daily affected analytics downstream, making them more data literate in the process.
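
Conceptually, the output of such a crawl is a lineage graph that can be walked from any source system to the analytics it feeds. The sketch below uses invented system and table names purely for illustration.

    # Conceptual sketch of the kind of lineage graph a crawl can produce; the
    # system and table names are invented for illustration.
    from collections import defaultdict

    # Edges discovered by crawling: (source dataset -> downstream dataset)
    crawled_edges = [
        ("loan_origination.applications", "warehouse.loan_facts"),
        ("core_banking.deposits", "warehouse.deposit_facts"),
        ("warehouse.loan_facts", "analytics.credit_risk_model"),
        ("warehouse.deposit_facts", "analytics.liquidity_forecast"),
    ]

    downstream = defaultdict(list)
    for src, dst in crawled_edges:
        downstream[src].append(dst)

    def trace(dataset: str, depth: int = 0) -> None:
        """Print everything that depends on a dataset, so business users can see
        how the systems they touch feed the bank's analytics."""
        print("  " * depth + dataset)
        for child in downstream[dataset]:
            trace(child, depth + 1)

    trace("loan_origination.applications")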

Related Post: Data Literacy: What it is and How it Benefits Your Business

Step two involved establishing roles and responsibilities for data governance. In particular, this was about formalizing the data team at the bank and creating domains. In each domain, ownership was established, and data custodians and stewards were assigned. The result, from a business perspective, was a structure that included defined roles and responsibilities and a clear mechanism for extracting knowledge from all stakeholders.
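
A domain register of this kind can be represented very simply. The sketch below uses placeholder domains and role names to illustrate the idea, not the bank's actual structure.

    # Illustrative domain-and-role register; domains and names are placeholders.

    governance_domains = {
        "Lending": {"owner": "Head of Credit", "steward": "credit_data_steward",
                    "custodian": "loan_systems_dba"},
        "Deposits": {"owner": "Head of Treasury", "steward": "deposit_data_steward",
                     "custodian": "core_banking_dba"},
    }

    def responsible_steward(domain: str) -> str:
        """Answer the question business users previously couldn't: who do I ask?"""
        return governance_domains[domain]["steward"]

    print(responsible_steward("Lending"))  # credit_data_steward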

Stage Two

The bank's data stewards then set about collecting all of the human knowledge about the critical data elements (CDEs) used by the bank. This knowledge was collected, simplified, and narrowed down. For example, for liquidity risk assessment, the core CDEs were identified as the liquidity coverage ratio and the net stable funding ratio.
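
Both CDEs are standard Basel III ratios, expressed below as a minimal sketch; the sample inputs are illustrative, and in practice the figures would come from the curated source data.

    # The two CDEs as simple ratios, following their standard Basel III definitions;
    # the sample inputs are illustrative only.

    def liquidity_coverage_ratio(hqla: float, net_cash_outflows_30d: float) -> float:
        """High-quality liquid assets over projected net cash outflows for 30 days."""
        return hqla / net_cash_outflows_30d

    def net_stable_funding_ratio(available_stable_funding: float,
                                 required_stable_funding: float) -> float:
        """Available stable funding over required stable funding."""
        return available_stable_funding / required_stable_funding

    # Both ratios are generally expected to sit at or above 1.0 (100%).
    print(liquidity_coverage_ratio(1_200.0, 1_000.0))  # 1.2
    print(net_stable_funding_ratio(950.0, 900.0))      # ~1.06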

Once these CDEs were defined, the bank mapped their lineage to understand where the data was coming from and where it was going. A key part of this was the simplification process, which included removing duplicates, retiring technical debt, and performing other cleaning processes.

During this work, the OvalEdge team discovered that some of the bank's definitions were inconsistent. Consistency was achieved by creating a backlog and reconciling the underlying formulas. Where the formula was the same, the data was tagged accordingly; where it wasn't, new terms were created, for example, net liquidity coverage ratio instead of liquidity coverage ratio.

The goal was to define KPIs clearly so all bank stakeholders could easily understand them. All the terms were curated in a business glossary, which to date contains upwards of 1,000 entries.
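
A glossary entry can be thought of as a small structured record. The fields below are assumptions about what such an entry might hold, not OvalEdge's internal schema.

    # Sketch of a glossary entry as structured data; fields are assumptions
    # about what such an entry might hold.

    glossary_entry = {
        "term": "Net Liquidity Coverage Ratio",
        "definition": "Variant of the liquidity coverage ratio used where the "
                      "underlying formula differs from the standard term.",
        "domain": "Treasury",
        "steward": "treasury_data_steward",
        "status": "approved",
        "related_terms": ["Liquidity Coverage Ratio"],
    }
    print(glossary_entry["term"], "-", glossary_entry["status"])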

Finally, although the bank's metadata was clearly defined, some bad data was still falling through the cracks. To address this, the OvalEdge team set up a data cleansing process.

A workflow was put together to identify data quality issues and then fix them as they arose. This evolved into an ongoing data quality improvement lifecycle, in which the team would trace a data quality issue back to its root cause and fix the problem there. This meant that only good-quality data could flow from the source system, negating any detrimental impact on analytics.
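
Conceptually, the workflow pairs a set of quality rules with a remediation step that pushes fixes back to the source system. The sketch below is a hedged illustration; the rule names, fields, and remediation hook are assumptions, not the actual OvalEdge configuration.

    # Hedged sketch of an ongoing quality check loop; rule and field names are
    # illustrative, not the actual workflow configuration.
    from typing import Callable

    RULES: dict[str, Callable[[dict], bool]] = {
        "income_present": lambda r: r.get("gross_income") is not None,
        "income_positive": lambda r: (r.get("gross_income") or 0) > 0,
        "period_standardized": lambda r: r.get("income_period") in
            {"weekly", "bi-weekly", "monthly", "annually"},
    }

    def run_quality_checks(record: dict) -> list[str]:
        """Return the names of the rules this source record violates."""
        return [name for name, rule in RULES.items() if not rule(record)]

    def route_to_remediation(record: dict, failures: list[str]) -> None:
        """Stand-in for raising an issue so the root cause is fixed in the source system."""
        print(f"Remediation ticket for {record['app_id']}: {failures}")

    record = {"app_id": "A-1003", "gross_income": None, "income_period": "fortnightly"}
    failures = run_quality_checks(record)
    if failures:
        route_to_remediation(record, failures)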