Cutting Through ‘Noisy’ Credit Data – and Helping Banks in the Process

Roger M. Stein

Accurate forecasts of loan performance are becoming a top priority, especially when banks merge or make acquisitions and must combine their loan-performance data sets.

But “noisy” data, particularly missing or mislabeled records, makes it difficult to develop and validate default models with any hope of accuracy.

Roger M. Stein, a senior lecturer in finance at MIT, has developed an analytical approach for adjusting for the corruption of validation data. His paper, titled “Validating Default Models when the Validation Data are Corrupted: Analytic Results and Bias Corrections,” was recently published by the MIT Laboratory for Financial Engineering. If analysts have even a rough estimate of how noisy the data are, the statistical results can be adjusted to account for the corruption, almost as if the data weren’t corrupted, or at least to estimate how far off the results might be.

In the draft of the paper, he includes a very down-to-earth quote from Yogi Berra: “In theory, theory and practice are the same. In practice they are different.”

Stein’s paper draws from research in the fields of engineering and medicine. “In the early days of radar detection people tried to determine whether what they saw on the screen was real or just noise,” Stein said. “The radar would blip, and they would question whether it was really a plane.”

Radar-detection researchers developed techniques to compare the performance of different radar operators and to optimize the balance between false positives and false negatives. The same science applies to disease-screening diagnostics, in which a generally reliable test may still produce both false positives and false negatives.

By plotting the percentage of each kind of error on a graph, researchers can determine the most effective way to interpret test results.
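
To make the idea concrete, the short sketch below (an illustration for this article, not code from Stein’s paper; the error_rates helper and the toy data are invented, and higher scores are assumed to mean higher risk) tabulates the two error rates for a simple credit score at a few different cutoffs:

import numpy as np

def error_rates(scores, defaulted, thresholds):
    """For each cutoff, treat scores at or above the cutoff as predicted defaults."""
    scores = np.asarray(scores, dtype=float)
    defaulted = np.asarray(defaulted, dtype=bool)
    rates = []
    for t in thresholds:
        flagged = scores >= t
        false_pos = np.mean(flagged[~defaulted])   # non-defaulters flagged as likely defaults
        false_neg = np.mean(~flagged[defaulted])   # actual defaulters the model misses
        rates.append((t, false_pos, false_neg))
    return rates

# Toy data: higher scores indicate higher assessed default risk.
rng = np.random.default_rng(0)
defaulted = rng.random(1000) < 0.05
scores = rng.normal(loc=defaulted.astype(float), scale=1.0)  # crude, imperfect signal
for t, fp, fn in error_rates(scores, defaulted, [0.0, 0.5, 1.0]):
    print(f"cutoff={t:.1f}  false positive rate={fp:.2f}  false negative rate={fn:.2f}")

Raising the cutoff flags fewer borrowers, so false positives fall while false negatives rise; plotting the resulting pairs of error rates traces out the same kind of curve used in radar and diagnostic work.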

Over the past 15 years, these same techniques have been applied to evaluating the credit default models that banks use, according to Stein. The challenge: how to balance false positives against false negatives, and how to evaluate a model’s performance when the validation data are known to be corrupted.

“For a bank that is in the business of lending, there’s clearly a cost to making a loan to someone who then defaults,” Stein said. “But there’s also a cost to not lending money to someone who doesn’t default. The goal is to figure out how you can make such tradeoffs efficiently.”
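
One simplified way to picture that tradeoff (again an invented sketch rather than the paper’s method; the best_cutoff helper and the dollar figures are assumptions) is to attach a cost to each kind of mistake and pick the score cutoff that would have minimized the total cost on historical data:

import numpy as np

def best_cutoff(scores, defaulted, cost_missed_default, cost_lost_good_loan, cutoffs):
    """Pick the score cutoff (loans scoring at or above it are declined) that
    minimizes the combined cost of the two kinds of mistakes."""
    scores = np.asarray(scores, dtype=float)
    defaulted = np.asarray(defaulted, dtype=bool)
    best = None
    for c in cutoffs:
        declined = scores >= c
        missed_defaults = np.sum(defaulted & ~declined)   # lent to a borrower who defaulted
        lost_good_loans = np.sum(~defaulted & declined)   # declined a borrower who would have repaid
        total = cost_missed_default * missed_defaults + cost_lost_good_loan * lost_good_loans
        if best is None or total < best[1]:
            best = (c, total)
    return best

rng = np.random.default_rng(1)
defaulted = rng.random(1000) < 0.05
scores = rng.normal(loc=defaulted.astype(float), scale=1.0)
print(best_cutoff(scores, defaulted, cost_missed_default=50_000,
                  cost_lost_good_loan=5_000, cutoffs=np.linspace(-1.0, 3.0, 41)))

In practice the costs and the decision rules are far more nuanced, but the sketch shows why the two error rates cannot be judged in isolation.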

 

Correcting Corruption

Stein’s paper addresses two types of data corruption: missing records and mislabeled records. Missing data can arise, for example, when non-performing loans are moved off the balance sheet because they have been referred to a specialized division that handles only problem loans and focuses on asset recovery.

Mislabeled default records occur when there is a mismatch between a borrower’s financial statements and its loan delinquency status. Financial statements stored in one database are typically joined to performance records stored in another, and because merging such databases is difficult, unmatched and mismatched records can occur, particularly in a merger or acquisition scenario. That is when default records can go missing, producing what bankers typically call a ‘hidden default,’ Stein notes.
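
As a hypothetical illustration of how such mismatches arise (the table names and borrower IDs are invented), consider joining a small financial-statement table to a separate loan-performance table:

import pandas as pd

# Invented example: one table holds financial statements, another holds loan outcomes.
financials = pd.DataFrame({"borrower_id": [101, 102, 103],
                           "leverage":    [0.4, 0.7, 0.9]})
performance = pd.DataFrame({"borrower_id": [101, 102, 104],
                            "defaulted":   [False, True, True]})

merged = financials.merge(performance, on="borrower_id", how="outer", indicator=True)
print(merged)
# Borrower 104 defaulted but has no financial statements to match against, and
# borrower 103 has statements but no outcome record; if such rows are dropped or
# assumed healthy during the join, the merged data can end up with hidden defaults.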

Then there are discrepancies between default indicators when banks merge. The same firm may be tagged as a default in Merging Bank A’s database but not in Merging Bank B’s. For example, Bank A’s definition may count a single missed payment as a default, whereas Bank B’s may flag only loans that are 120 to 180 days past due.

Stein notes that if the mechanism causing the mislabeling is known with high specificity, record by record, correcting for it is essentially trivial. But if the mechanism is known only in the aggregate, for example, that 1 in 105 records in a database is mislabeled but not which ones, the correction is no longer trivial. Stein’s paper focuses on the latter case.

His research revealed that different models perform very differently when given corrupted data. Better models are affected more severely by flawed data, he found. “For many years the common wisdom among some analysts was that corrupted or missing data ‘cancelled out’ across different models since they were all at the same disadvantage,” Stein said. “But this research shows that this is not true. … as data get noisier, it becomes harder to tell the difference between a nearly random model and a good one that is being compared using noisy benchmarks.”
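
A small simulation can illustrate the point. The sketch below, which simply assumes that a random share of true defaults gets relabeled as non-defaults, measures a strong model and a nearly random one against both the true and the corrupted labels; the stronger model’s measured accuracy typically falls further, narrowing the apparent gap between the two:

import numpy as np

def auc(scores, labels):
    """Probability that a randomly chosen default outscores a randomly chosen non-default."""
    pos, neg = scores[labels], scores[~labels]
    return np.mean(pos[:, None] > neg[None, :])

rng = np.random.default_rng(0)
n = 6000
defaulted = rng.random(n) < 0.3            # true outcomes
strong = rng.normal(loc=1.5 * defaulted)   # good model: clear separation by outcome
weak = rng.normal(loc=0.4 * defaulted)     # nearly random model: faint separation

# Corrupt the validation labels: roughly two-thirds of true defaults become hidden defaults.
observed = defaulted & (rng.random(n) < 1 / 3)

for name, s in (("strong model", strong), ("weak model", weak)):
    print(f"{name}: AUC on true labels {auc(s, defaulted):.3f}, "
          f"on corrupted labels {auc(s, observed):.3f}")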

One beneficial result of the analysis is that most of the corrections can be implemented as slight adjustments to the output of the statistical software already in use at most institutions, Stein says. This can be done with a few (two or three) lines of scripting, or in an Excel spreadsheet. Although the derivations are more involved, the final results are typically single, simple equations; one of the main results, for example, can be implemented in a few cells of a spreadsheet.
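
As a flavor of what such an adjustment can look like, here is a minimal sketch that assumes hidden defaults are scattered randomly through the records labeled as non-defaults. It illustrates the general mixture idea rather than reproducing the specific equations in Stein’s paper, and the corrected_auc name and the 10 percent figure are invented for the example:

def corrected_auc(observed_auc, hidden_default_share):
    """Back out an estimate of the uncorrupted AUC (a standard accuracy measure).

    hidden_default_share is the estimated fraction of the labeled non-defaults
    that are really unrecorded defaults, assumed scattered at random.  Because a
    hidden default scores, on average, like any other default, the measured AUC
    is a mixture, observed = 0.5 * share + (1 - share) * true, which inverts to:
    """
    return (observed_auc - 0.5 * hidden_default_share) / (1.0 - hidden_default_share)

# e.g., an observed AUC of 0.82 with an estimated 10 percent of hidden defaults
print(round(corrected_auc(0.82, 0.10), 3))   # roughly 0.856

The same one-line inversion could sit in a single spreadsheet cell, which is consistent with the kind of small adjustment Stein describes.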

The white paper can be accessed here.

 

Christina P. O’Neill is editor of custom publications for The Warren Group.