The Credit Score War
File Segmentation
This information gives you some idea of how they separate and rank risk levels. It is not yet clear how you can battle this process. I will, however, continue to use my resources and keep you informed on what steps you can take to remain at the highest possible level.
Overview
The objective of segmentation is to define a set of sub-populations that, when modeled individually and then combined, rank risk more effectively than a single model tested on the overall population. Typically, the performance improvement is measured across the entire population, but the litmus test is the improvement in performance as measured at the individual creditor portfolio level. The credit-based delinquency model development process offers perhaps the ultimate flexibility in segmentation options, resulting from large sample sizes and the variety of individuals across the entire risk spectrum. This paper outlines the role of segmentation in the improvement of credit-based delinquency models and presents innovative methodologies for the segmentation process.
The Role of Segmentation
How does segmentation improve scoring system performance? The premise of segmentation is that the credit characteristics (independent variables) have a different relationship with risk (dependent variable) for different sub-populations. By identifying the appropriate sub-populations, the characteristics that are most predictive in isolating risk are optimized for that group.
If we consider two perspectives—a sub-prime lender and prime lender—that target individuals with vastly different risk profiles, the value of segmentation becomes clear. Consider the characteristic “number of accounts with a worst repayment performance of 90 days past due or more (severely delinquent/derogatory),” which is an important characteristic in discriminating between high and low risk individuals.
From the perspective of a sub-prime lender, the number of severely delinquent/derogatory accounts is a pervasive and defining element of the target population, but not necessarily a good predictor of who will be lower risk. Further, this characteristic is not likely to have a significant impact on the accept/decline decision.
For individuals with tarnished credit, there may be little difference in risk with increasing numbers of severely delinquent/derogatory accounts. From the perspective of a prime lender, the difference in the number of severely delinquent/derogatory accounts will be a significant factor in ranking risk. For prime populations that may be fairly homogeneous, the difference between having zero and one severely delinquent/derogatory account may be the basis for the accept/decline decision or the difference between the lowest and the highest interest rate.
Suppose a single model solution is developed that uses the number of severely delinquent/derogatory accounts as one of the components to rank order risk. The resulting solution may be effective for determining whether an account is directed to a sub-prime lender or a prime lender, and effective for the prime lender in rank ordering risk. It may, however, have little or no value to the sub-prime lender in rank ordering risk. One potential solution would be to segment the population using the number of severely delinquent/derogatory accounts and build models on each population separately, optimizing the relationship of that characteristic to the risk prediction of the respective segment.
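The idea can be sketched on a toy data set. In the Python example below, every record is invented for illustration, and the second characteristic (`high_util`), the cut of one derogatory account, and all function names are assumptions rather than anything taken from the paper. It splits the population on the number of severely delinquent/derogatory accounts and then compares bad rates for the second characteristic within each segment.

```python
from collections import defaultdict

def make_records(derog, high_util, n_good, n_bad):
    """Build toy records; every value here is invented for illustration."""
    rows = [{"derog": derog, "high_util": high_util, "bad": 0} for _ in range(n_good)]
    rows += [{"derog": derog, "high_util": high_util, "bad": 1} for _ in range(n_bad)]
    return rows

def segment_of(record, cut=1):
    """Hypothetical segmentation rule: split on the derogatory-account count."""
    return "derog" if record["derog"] >= cut else "clean"

def bad_rates(records, characteristic):
    """Bad rate for each (segment, characteristic value) cell."""
    totals = defaultdict(int)
    bads = defaultdict(int)
    for r in records:
        key = (segment_of(r), r[characteristic])
        totals[key] += 1
        bads[key] += r["bad"]
    return {key: bads[key] / totals[key] for key in totals}

records = (
    make_records(0, False, 4, 0) + make_records(0, True, 2, 2)
    + make_records(2, False, 2, 2) + make_records(3, True, 2, 2)
)
rates = bad_rates(records, "high_util")
for key in sorted(rates):
    print(key, rates[key])
```

In this made-up sample, `high_util` separates good from bad accounts in the clean segment but not in the derogatory segment, which is the kind of difference that would justify fitting a separate model to each segment.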
Traditional Segmentation Strategies
Many techniques exist for segmenting a population. The previous example relies on using a single characteristic to define a segment. The use of characteristics is common in segmentation schemes and typically involves multiple characteristics to define a sub-population, as demonstrated below.
Segmentation using individual characteristics has been the traditional methodology used for credit-based delinquency models. The segmentation scheme is typically derived using regression tree analysis, such as Classification and Regression Trees (CART) or Chi-squared Automatic Interaction Detector (CHAID); these techniques segment the population based on the relationship of independent and dependent variables.
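To make the tree-based idea concrete, here is a minimal sketch of a single CART-style split: it searches one characteristic for the threshold that minimizes weighted Gini impurity of the good/bad outcome. It is not the vendors' actual procedure. The data are invented, the function names are hypothetical, and CHAID would use a chi-squared test rather than Gini impurity.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 outcomes (1 = bad)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best rule `value <= threshold`.

    Returns (None, parent_gini) when no split improves on the unsplit node.
    """
    n = len(values)
    best = (None, gini(labels))
    for t in sorted(set(values))[:-1]:  # the largest value cannot split the data
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (t, score)
    return best

# Invented sample: derogatory-account counts and whether the account went bad.
derog_counts = [0, 0, 0, 0, 1, 1, 2, 3]
went_bad = [0, 0, 0, 0, 1, 0, 1, 1]
threshold, impurity = best_split(derog_counts, went_bad)
print(threshold, impurity)
```

A full tree repeats this search inside each resulting node and across all candidate characteristics, which is why many nodes are needed before the endpoints approximate a lender's target population.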
Ultimately, using the characteristic-centric, tree-based approach creates a rank ordering system that results from a number of nodes (tree endpoints) with differing bad rates. However, scoring vendors have long promoted the value of using statistically derived scores instead of decision trees or manual processes to rank order individuals.
When viewed from the perspective of the sub-prime and prime lender, the characteristic-based approach is still sub-optimal because there is no one characteristic that when split one or more times approximates a lender’s target population. Since there are many factors that influence whether an individual is sub-prime or prime, evaluating overall risk one characteristic at a time is inefficient because it requires numerous nodes to assess all of the relevant factors.
Why do traditional credit-based delinquency model vendors use characteristic-based methodologies to rank order risk for segmentation while they criticize their use for credit decisions? The answer lies in complacency and lack of innovation. With a historical lack of competition in the tri-CRC, credit-based delinquency model arena, the incumbent vendors have had no compelling reason to use advanced methodologies to extract additional predictive power from the rich databases of consumer credit behavior.
The New Era of Segmentation
Is characteristic-based segmentation dead? No. There is still value in using characteristic-based segmentation, but it must be used in conjunction with other approaches to produce a robust, powerful credit-based delinquency model. The new era of segmentation uses scores that group individuals with similar behaviors along a number of dimensions.
The use of scores to segment the population is consistent with consumer credit markets as higher risk individuals are directly or indirectly cascaded to non-prime or sub-prime lenders based on decisions derived from some type of credit score—not by individual credit characteristics.
By using a risk-based segmentation score, individuals are compared to individuals with similar risk profiles; segmenting the credit population into risk tiers approximates creditors’ target markets and enables effective risk assessment across the entire credit risk spectrum.
The scheme leverages characteristic-based segmentation in conjunction with a general risk score and a profile model. The profile model identifies whether an individual has the profile of someone who will file for bankruptcy or someone who will default (90+ days past due/charge off).
Previous bankruptcy was defined by the presence of a bankruptcy public record or a bankrupt status indication on an account. Thin file was defined by an individual having one or two accounts and no previous bankruptcy. Individuals with full files had no previous bankruptcy and three or more accounts.
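These three definitions translate directly into a small rule. The sketch below is one way to write it in Python. The function and argument names are hypothetical, and the handling of an individual with zero accounts is an assumption, since the definitions above do not cover that case.

```python
def file_type(num_accounts, has_bankruptcy_record=False, has_bankrupt_account=False):
    """Classify a credit file using the three definitions in the text."""
    # Previous bankruptcy: a bankruptcy public record or a bankrupt account status.
    if has_bankruptcy_record or has_bankrupt_account:
        return "previous_bankruptcy"
    # Full file: no previous bankruptcy and three or more accounts.
    if num_accounts >= 3:
        return "full_file"
    # Thin file: no previous bankruptcy and one or two accounts.
    if num_accounts >= 1:
        return "thin_file"
    # Assumption: zero accounts falls outside the stated definitions.
    return None

print(file_type(2))
print(file_type(7))
print(file_type(7, has_bankrupt_account=True))
```

The bankruptcy check comes first because both the thin file and full file definitions require that there be no previous bankruptcy.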
Previous bankruptcy and thin file splits were defined heuristically. The previous bankruptcy, thin file, and full file sub-populations were then divided into risk tiers using a risk-based segmentation score; CART analysis was used to determine the segmentation score cuts for the different risk tiers. The bankrupt/default profile model was then used to divide the risk tiers into bankrupt and default profile sub-populations; these splits were also defined using CART.
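The cascade described above (file type, then risk tier, then bankrupt/default profile) can be sketched as a lookup. All cut points below are invented placeholders, because the paper does not publish the CART-derived values. The score scale, the representation of the profile model output as a probability, and the names used are likewise assumptions.

```python
from bisect import bisect_right

# Hypothetical score cuts per file type; the real cuts come from CART analysis.
TIER_CUTS = {
    "previous_bankruptcy": [600],
    "thin_file": [580, 680],
    "full_file": [560, 640, 720],
}
# Hypothetical threshold on the profile model's output.
PROFILE_CUT = 0.5

def assign_segment(file_type, segmentation_score, bankruptcy_profile_prob):
    """Return (file_type, tier_index, profile) for one individual.

    tier_index counts how many cuts the score meets or exceeds, so tier 0
    is the lowest score band for that file type.
    """
    tier = bisect_right(TIER_CUTS[file_type], segmentation_score)
    profile = "bankrupt" if bankruptcy_profile_prob >= PROFILE_CUT else "default"
    return (file_type, tier, profile)

print(assign_segment("full_file", 650, 0.2))
print(assign_segment("thin_file", 580, 0.7))
```

Each resulting tuple identifies one sub-population, and under the approach described a separate model would be built for each.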