# Estimating Claim Settlement Values Using GLM


by

Roosevelt C. Mosley, Jr., FCAS, MAAA

Abstract: The goal of this paper is to demonstrate how generalized linear modeling (GLM) can be applied in non-traditional ways in property and casualty insurance. Specifically, we will use a property and casualty closed claims database to aid in estimating ultimate claim settlement amounts, evaluating claim trends, and assisting in improving claims handling procedures. This specific example will be used to demonstrate the potential of the application of GLM to different areas of an insurance company.

A GLM will be developed with data from the Insurance Research Council (IRC) closed claims study. The model will be populated with characteristics of closed automobile claims along with final settlement amounts. Using this data, the paper will examine how GLM can be used to identify:

1) Trends in claims severities over time,

2) Differences in severities that exist between current ratemaking characteristics (e.g. state, territory), characteristics of the claims and the injured parties, and other factors (e.g. time from reporting to settlement, attorney involvement, use of arbitration), and

3) Interactions between these factors.

Diagnostics that can be used to test the validity and robustness of the GLM models that are developed will also be discussed, and several applications of the results of this type of analysis will be presented.

Over the last several years, Generalized Linear Modeling (GLM) has seen increased usage among actuaries, primarily in traditional ratemaking applications. The benefits of GLM are that it allows a flexible model structure to be fit to insurance ratemaking data, and that it allows a multivariate model to be generated that simultaneously incorporates a set of independent variables to determine their impact on a dependent variable. This is an improvement over traditional one-way types of analysis (both loss ratio and pure premium) because it adjusts for the impact of distributional biases that are present in all insurance data sets. The result is a set of indications for whatever you are modeling (class plan relativities, tiering relativities, etc.) that reflect the true impact of each variable being analyzed.

GLM has had immediate appeal in the traditional areas of actuarial practice. Most significantly, insurers have used GLM to refine class plan relativities, establish tiering and underwriting plans, and incorporate commercially available insurance scores into rating and underwriting plans, just to name a few applications. These applications have been addressed quickly as insurers move to this type of analysis for a number of reasons: these areas fall within the actuary's normal area of responsibility, the data for these types of analyses is usually readily available, and this type of analysis can provide the most immediate benefit for an insurer.

However, understanding the general statistical nature of GLM, one realizes that a GLM analysis can be applied to other areas within insurance companies, areas that have not necessarily been within the actuary's traditional realm of responsibility. Specifically, we have used GLMs for a number of non-traditional applications, including developing custom insurance scores, generating vehicle classification systems, evaluating claims and agency personnel and external service providers, and estimating claim settlement values. These types of analyses can provide benefit to many areas of the company, and can display the actuary's skills to a wider audience.

We will demonstrate the concept of applying GLM to non-traditional areas in this paper using the 1994 Insurance Research Council (IRC) Closed Claim Study database. In this example, we use the characteristics of the closed claims as provided in the IRC database to estimate the ultimate settlement value of a claim; however, we will describe this process in general terms such that it might be applied to a variety of different areas. The goal of this paper is not to provide you with a complete analysis of the IRC database, but to use this database as an example of how this general statistical procedure can be applied to other areas.

**The Basics of GLM**

GLM is a statistical process by which a model is developed in which a specific dependent, or response, variable is predicted by a number of independent, or explanatory, variables. For example, as applied to the insurance ratemaking process, the process of setting class premiums for groups of risks can be thought of graphically as shown in Figure 1.

The goal of the classification ratemaking process is to set premiums by class of risk that reflect the risk of each group. This requires estimating the relative loss potential of each insured characteristic in the classification plan to determine how the factor contributes to the overall risk premium. An insured is then charged a premium based on his or her characteristics, and how these characteristics relate to the risk of loss. The traditional approach to analyzing the variables in the class plan was to analyze each of the variables separately, using a one-way loss ratio or pure premium approach. The inherent assumption in the one-way analysis is that, for each level of the factor being analyzed, the distribution of all the other factors in the class plan is constant. This means, for example, that if one were analyzing auto symbol, model year, and age using a series of one-way analyses, one would be assuming that the same proportion of teenagers drive 10-year-old Ford Escorts and brand new Cadillac Escalades. While this is simply one example, there are a number of other violations of this assumption that can be thought of in an auto or homeowners insurance class plan.

Figure 1: Description of Classification Ratemaking Process

Figure 2 gives an example of how this type of analysis can lead to erroneous results. The first table in Figure 2 gives the results of two separate one-way homeowners insurance analyses, one for territory and one for protection class. In this particular example, when analyzing the two territories, one assumes that territory A has the same ratio of protection class 1 risks as territory B. The result of the loss ratio analysis shows that the rates for territory A should be increased relative to the territory B rates. Similarly for protection class, the analysis shows that the change in protection class 2 rates should be higher relative to the change in protection class 1 rates. However, when these results are viewed in a two-way table, the true picture becomes clear. The territory loss ratios are identical for both protection classes. The true problem is in the protection class relativities. If one had simply looked at the one-way analysis, the erroneous decision would have been to increase both the territory A rates and the protection class 2 rates, resulting in an overcorrection. The reason the one-way loss ratios appear this way is the difference in protection class distribution over the two territories. Again, while this is a simple example, one can easily think of a number of different potential scenarios where this can occur in a rating plan.
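The distributional bias described above can be reproduced with a few lines of arithmetic. The sketch below uses hypothetical premium and loss figures (they are not the values from Figure 2), chosen so that both territories have identical loss ratios within each protection class while territory A holds more protection class 2 business. The function and variable names are illustrative only.

```python
# Hypothetical data, NOT taken from Figure 2:
# (territory, protection class) -> (premium, loss)
cells = {
    ("A", 1): (200.0, 120.0),  # loss ratio 0.60
    ("A", 2): (800.0, 720.0),  # loss ratio 0.90
    ("B", 1): (800.0, 480.0),  # loss ratio 0.60
    ("B", 2): (200.0, 180.0),  # loss ratio 0.90
}


def one_way_loss_ratio(cells, position, level):
    """Loss ratio for one level of a single factor, ignoring the other factor."""
    premium = sum(p for key, (p, _) in cells.items() if key[position] == level)
    loss = sum(l for key, (_, l) in cells.items() if key[position] == level)
    return loss / premium


def two_way_loss_ratios(cells):
    """Loss ratio for every territory / protection class combination."""
    return {key: loss / premium for key, (premium, loss) in cells.items()}


territory_lr = {t: one_way_loss_ratio(cells, 0, t) for t in ("A", "B")}
protection_lr = {c: one_way_loss_ratio(cells, 1, c) for c in (1, 2)}
cell_lr = two_way_loss_ratios(cells)

print("one-way by territory:", territory_lr)
print("one-way by protection class:", protection_lr)
print("two-way:", cell_lr)
```

With these figures the one-way territory loss ratios differ (840/1000 for A against 660/1000 for B) even though, within each protection class, the two territories are identical. The apparent territory difference comes entirely from the mix of protection classes.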

GLM corrects for these distributional biases, and also provides a flexible model structure such that it better fits insurance data. One can best think of GLM in terms of one of its simplest forms, classical linear regression. The formula for a simple one-factor linear regression is:

$$y = a + bx + \text{error}$$

This describes the fitting of a line through a series of points, attempting to model a response variable (y) using an explanatory variable (x). The b represents the relationship of the independent variable x to y. There is also an error term which accounts for the fact that the model will not predict the observations perfectly. Under linear regression, the error is assumed to be normally distributed with a mean of zero and a constant variance.

A graphical description of this simple regression model can be seen in Figure 3. In this example, the bodily injury severity is being modeled as a function of the time period.
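As a minimal sketch of the one-factor regression, the following estimates a and b by ordinary least squares. The severities by time period are hypothetical and are not the data behind Figure 3; the function name `fit_line` is likewise an illustration.

```python
def fit_line(x, y):
    """Ordinary least squares estimates (a, b) for y = a + b*x + error."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope: co-variation of x and y divided by the variation of x.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    a = y_bar - b * x_bar
    return a, b


# Hypothetical average bodily injury severities for five consecutive periods.
periods = [1, 2, 3, 4, 5]
severities = [8000.0, 8400.0, 8500.0, 9100.0, 9300.0]

a, b = fit_line(periods, severities)

# The residuals are the observed "error" terms of the fitted line.
residuals = [y - (a + b * x) for x, y in zip(periods, severities)]
print(f"a = {a:.1f}, b = {b:.1f}")
```

Here b is the fitted change in severity per period, and the residuals sum to zero, as they always do when an intercept is included.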

To extend this to GLM, the more general formula for multiple regression is:

$$y = X\beta + \text{error}$$

In this notation, Xβ is written in matrix form, where X represents a series of independent variables and β represents the relationship of these independent variables to the dependent variable. The error term is more general in that it is not restricted to the assumption of normally distributed error terms (as in simple and multiple linear regression). More general error structures, such as Gamma, Poisson, and Negative Binomial, can be used, which are more representative of insurance data.
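To make the idea of a non-normal error structure concrete, the sketch below fits a GLM with a Gamma error structure by iteratively reweighted least squares. Several things here are assumptions that do not come from the paper: the log link, the fitting algorithm, the single attorney indicator used as the explanatory variable, and the settlement amounts, which are hypothetical.

```python
import math


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [mr - factor * mc for mr, mc in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_gamma_log_glm(X, y, iterations=25):
    """Iteratively reweighted least squares for a Gamma GLM with a log link.

    For this link and variance function the working weights are all equal,
    so each step is an unweighted least squares fit to the working response.
    Assumes column 0 of X is the intercept.
    """
    k = len(X[0])
    beta = [math.log(sum(y) / len(y))] + [0.0] * (k - 1)  # intercept-only start
    for _ in range(iterations):
        eta = [sum(xj * bj for xj, bj in zip(row, beta)) for row in X]
        mu = [math.exp(e) for e in eta]
        # Working response: linear predictor plus the relative residual.
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
        Xtz = [sum(row[i] * zi for row, zi in zip(X, z)) for i in range(k)]
        beta = solve(XtX, Xtz)
    return beta


# Hypothetical settlements: three claims without an attorney, three with.
X = [[1.0, 0.0]] * 3 + [[1.0, 1.0]] * 3
y = [2000.0, 3000.0, 4000.0, 6000.0, 9000.0, 12000.0]

beta = fit_gamma_log_glm(X, y)
base_severity = math.exp(beta[0])        # fitted mean without an attorney
attorney_relativity = math.exp(beta[1])  # multiplicative effect of an attorney
```

Because this design has one parameter per group, the fitted means reproduce the group averages (3,000 and 9,000 here), so the attorney relativity converges to 3.0. With several explanatory variables the same routine would return relativities that are adjusted for one another, which is the property the text attributes to GLM.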

**Non-Traditional Applications**

Given the general structure of GLM described above, one can begin to expand the use of GLM beyond the traditional actuarial realm. The general structure of GLM can be described as shown below in Figure 4:

Because GLM is a general statistical process, it is not limited to estimating class plan relativities. The general structure of the model can be used to describe many different responses by a series of explanatory variables. Depending on what problem GLM is applied to, the explanatory variables and the model error structure will change, but the process of generating and applying the model will remain the same.

**CLAIM SETTLEMENT VALUE ESTIMATION**

One potential area for the application of GLM in an insurance company is in the estimation of ultimate claim settlement values. The ultimate value of a settled claim can be described as the response variable, and the characteristics of the claim represent the explanatory variables. When a claim is reported to an insurer, the insurer is presented with the facts of the claim. Based on the facts of the claim, an estimate is made of what the final value of that claim will be. This value may be determined based on a claim value estimation software package, guidelines established by the company, the claims person's expert opinion, or a combination of the three. As the case matures, as payments are made on the case, and as more information regarding the case becomes available, future refinements of that estimate can be made. It is these estimates, made before the final disposition of a claim, that are reflected in an insurer's financial results from year to year.

This GLM example will develop a model to estimate the final amount of the claim settlement, which can then be used as part of the overall information that the claims handler uses to determine the expected final value of a claim. The goal of this analysis is not to replace the claims person, any more than the goal of analyzing traditional class plan relativities using GLM is to replace the actuary. The goal of this process would be to provide the claims person with additional information on which to base decisions.

This type of model could be used to help estimate the ultimate settlement value of claims based on the information known. It could also be used to assist claims departments in determining the effectiveness of certain claims handling techniques. It can also provide information on areas of focus such that claim handlers might more efficiently handle claims.

**Data**

To perform this type of analysis, an insurer would need a database of final closed claim settlement amounts, as well as the characteristics of the claims that have been closed. The characteristics available will likely vary between insurers, but examples of the information that could be used are:

• Insured rating and underwriting characteristics

• Type of injuries involved

The list of characteristics to be analyzed could continue, and the goal should be to include all the information that is available that might be useful to the analysis. This could be one potential difficulty for an insurer employing this type of analysis technique. For some insurers, this type of closed claim database might simply not exist, or the information might exist only in paper form in the claim files.
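Once such a database exists, categorical claim characteristics have to be turned into numeric explanatory variables before a model can be fit. The sketch below shows one common way to do this, with a 0/1 indicator for each non-base level. The field names, levels, and amounts are hypothetical and do not come from the IRC database.

```python
def design_row(claim, levels):
    """Intercept plus one 0/1 indicator per non-base level of each field."""
    row = [1.0]
    for field, field_levels in levels.items():
        for level in field_levels[1:]:  # the first level listed is the base
            row.append(1.0 if claim[field] == level else 0.0)
    return row


# Hypothetical characteristics and their levels (base level listed first).
levels = {
    "injury_type": ["sprain", "fracture", "other"],
    "attorney": [False, True],
}

# Hypothetical closed claims with their final settlement amounts.
claims = [
    {"injury_type": "sprain", "attorney": False, "settlement": 2500.0},
    {"injury_type": "fracture", "attorney": True, "settlement": 18000.0},
    {"injury_type": "other", "attorney": False, "settlement": 4000.0},
]

X = [design_row(claim, levels) for claim in claims]  # explanatory variables
y = [claim["settlement"] for claim in claims]        # response variable
```

Each row of X then lines up with one settlement amount in y, which is the form the regression formulas earlier in the paper expect.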

For this paper, we have analyzed the IRC 1994 Bodily Injury closed claim database. This database was compiled by the IRC as a sample of claims closed during a specific period during 1992 from a number of insurance companies. The database consists of the ultimate settlement value of these claims, a breakdown of these settlement amounts by type of payment (medical, wage loss, etc.), and a number of characteristics of the claim.

The variables analyzed from this database reflect many of the items listed above. A complete list of the factors can be obtained from the IRC.

While not a specific issue with the IRC database, an insurer or claims organization that undertakes this type of analysis will need to be aware of claims that are closed without payment. While these claims do not generate any loss dollars, there are at least two other issues that these claims raise. First, they will generate loss adjustment expense dollars, because a claim file will be opened on these claims and a claims person will be assigned to handle the claim. Second, because these claims can generate a series of points with no settlement value or a very small settlement value, they can create some difficulty in determining a model error structure. One approach to handling this issue would be to use an analysis similar to a claim frequency analysis, but instead analyze the likelihood of a claim closing without payment. This analysis could then be combined with a settlement value analysis to determine the ultimate expected settlement value.
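One way to read the combination described above is as the product of the probability that a claim is paid and the expected settlement given that it is paid. The sketch below states that arithmetic. The function name and the figures are hypothetical, and in practice each input would come from its own fitted model.

```python
def expected_settlement(prob_closed_without_payment, expected_paid_severity):
    """Expected settlement = P(claim is paid) * E[settlement | claim is paid]."""
    if not 0.0 <= prob_closed_without_payment <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return (1.0 - prob_closed_without_payment) * expected_paid_severity


# Hypothetical claim: 20% chance of closing without payment,
# and an expected settlement of 10,000 if a payment is made.
estimate = expected_settlement(0.20, 10000.0)
print(f"expected settlement: {estimate:,.0f}")
```

Separating the two pieces keeps the zero-value claims out of the severity model, so the error structure chosen for settlement amounts only has to describe claims that actually closed with payment.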