The exponential mixture distribution is a semi-parametric distribution used in Actuarial Loss
Modeling to fit collections of similiar losses. The exponential mixture has a decreasing hazard rate,
increasing Mean Residual Life and has a thicker tail than the usual single parameter exponential
1. According to Keatinge 2, the exponential mixture is considered semi-parametric
because no parametric assumption is made about the form of the mixing distribution.

The R library Renext exposes the `MixExp2`

function, which makes available the standard `d`

, `p`

, `q`

and `r`

-prefixed statistical functions to
determine the density, distribution, quantile or generate a random draw from a 2-component exponential
mixture model, but for exponential mixtures comprised of more than two terms, there’s suprisingly little
available.

In this post, we’ll demonstrate a method for calculating quantiles from exponential mixture models with any
number of terms using inverse transform sampling. We’ll leverage the Newton-Raphson method to obtain quantiles,
assuming the mean and weight for each component have already been determined.
In a future post, we’ll demonstrate how to use maximum likelihood and MCMC techniques to arrive at the optimal
weight distribution across terms based on the dataset of interest.

## Background

The distribution function of an exponential mixture model follows the form:

Where \(\sum_{i}^{n}w_{i} = 1\), and \(\lambda_{i}\) is the inverse of the expected value for
the \(i^{th}\) component in the mixture.

In order to use the Newton-Raphson method to find roots for an exponential mixture model
with an arbitrary number of components, we need to arrange the CDF such that all terms on
one side are set identically to zero. Rearranging yields:

The Newton-Raphson update expression can be represented as:

With successive iterations providing better approximations of the root until a sufficiently accurate value is obtained.

For our purposes, the rearranged CDF expression will serve as \(f(x)\). \(f^{\prime}(x)\) will be the derivative of the rearranged CDF, specifically:

We can commence iteration at \(x_{0} = 0\) in all cases, and don’t need to concern ourselves
with searching for a optimal initialization point since the mixture is monotonic.

## Implementation

What follows is an implementation of the Newton-Raphson algorithm written in R. It takes as arguments
a starting point (which defaults to 0), a function `f`

and the function’s derivative `f_prime`

. `epsilon`

and `maxiters`

represent the threshold below which a root is assumed to have been located and signals
for iteration to cease, and the maximum number of iterations to carry out as a safeguard against
divergence (respectively):

```
newtonRaphson = function(
x_init=0, f, f_prime, epsilon=.000001,
maxiters=1000, returniters=FALSE) {
root = NULL
x_i = x_init
tolerance = 1
itervals = vector()
itercntr = 0 # ensure `maxiters` is not exceeded
while (TRUE) {
itercntr = itercntr + 1
itervals[[itercntr]] = x_i
if ((itercntr>=maxiters) || (tolerance<=epsilon)) {
root = itervals[[length(itervals)]]
break
} else {
x_n = (x_i - (f(x_i)/f_prime(x_i)))
tolerance = abs(x_n - x_i)
x_i = x_n
}
}
resList = list(iterations=itercntr, root=root, approx_seq=itervals)
if (returniters==FALSE) resList$approx_seq = NULL
return(resList)
}
```

`newtonRaphson`

returns a list with three components:

`iterations`

is the number of Newton-Raphson approximations

`root`

is the value along the x-axis at which the minimum of the function is obtained

`approx_seq`

is a vector of length equal to`iterations`

of successive root approximations

Next we’ll walkthrough an example demonstrating the use of `newtonRaphson`

.

### Numerical Example

Assume we’re interested in determining quantiles for an exponential mixture model with
weights = (.60,.30,.10) and means = (10,50,100).

The corresponding distribution function is therefore:

Where the hazard rate for each term is the inverse of the term’s mean.

As a sanity check, let’s evaluate \(F(x)\) at a few points to determine the CDF, which will serve as
a stand-in for the uniform random variates we’ll be sampling when the algorithm is used in practice.

Let’s find the associated percentiles for inputs 10, 25, 75 and 200. This is straightforward in R:

```
mixtureCDF = function(x) {
return(1 - .60 * exp(-x/10) - .30 * exp(-x/50) - .10 * exp(-x/100))
}
mixtureCDF(10) # 0.4431693676
mixtureCDF(25) # 0.6909097246
mixtureCDF(75) # 0.8854924461
mixtureCDF(200) # 0.9809717788
```

These percentiles, 0.4431693676, etc… represent the uniform random numbers that will be passed
to the simulation framework in practice. The goal is upon being passed a random number such as
0.6909097246 (corresponding to the mixture model evaluated at 25), the function will return
**25**.

We need one additional piece of functionality to complete the framework: A function that wraps `newtonRaphson`

and takes for arguments a vector of means, a vector of weights and returns another function which can then
be passed a uniform random variate, which ultimately returns the corresponding quantile. This logic is
encapsulated in `invertCDF`

:

```
invertCDF = function(weights, means) {
rates = 1.0/means
function(urand) {
# initialize numerator and denominator functional
# components for Newton-Raphson iteration =>
f = function(x) {
func = 0
for (i in 1:length(rates)) {
iterrate = rates[i]
iterweight = weights[i]
iter_func = iterweight * exp(-iterrate * x)
func = func + iter_func
}
return(func - (1 - urand))
}
f_prime = function(x) {
func_prime = 0
for (i in 1:length(rates)) {
iterrate = rates[i]
iterweight = weights[i]
iter_func_prime = -((iterweight * iterrate) * exp(-iterrate * x))
func_prime = func_prime + iter_func_prime
}
return(func_prime)
}
# call newtonRaphson to back-out quantile from mixed exponential
# distribution corresponding to CDF value of urand =>
result = newtonRaphson(x_init=0, f=f, f_prime=f_prime)
return(result$root)
}
}
```

Finally, we demonstrate using `invertCDF`

to find quantiles from the sample exponential mixture CDF,
\(F(x) = 1 - .60e^{-x/10} - .30e^{-x/50} - .10e^{-x/100}\). Let’s assume our sample uniform random
draws are 0.4431693676, 0.6909097246, 0.8854924461 and 0.9809717788, which correspond to values 10, 25, 75 and 200.
Put another way, if we pass 0.4431693676, 0.6909097246, 0.8854924461 and 0.9809717788 to our simulation framework,
we’d expect the values 10, 25, 75 and 200 to be returned. This serves as verification that the framework is
functioning as expected:

```
weights = c(.60, .30, .10)
means = c(10, 50, 100)
random_draws = c(0.4431693676, 0.6909097246, 0.8854924461, 0.9809717788)
# call invertCDF, which returns a function =>
inverter = invertCDF(weights=weights, means=means)
#pass random_draws to inverter, and obtain the estimated percentiles =>
result = sapply(random_draws, inverter, simplify=TRUE)
# print values of `result` =>
print(result)
[1] 10.00000000 25.00000000 75.00000002 200.00000011
```

Below are plots of the exponential mixture model, along with the shaded area under the curve that corresponds to the quantiles found above:

These plots were created with `ggplot2`

and the following specification:

```
# Logic to create highlighted AUC plots with ggplot2 =>
library(ggplot2)
f = function(x) {
return(.06*exp(-x/10.)+.006*exp(-x/50.)+.001*exp(-x/100.))
}
x = seq(from=0,to=100,by=.005)
y = sapply(X=x,FUN=f,simplify=TRUE,USE.NAMES=FALSE)
df = data.frame(X=x,Y=y)
ggplot(data=df, mapping=aes(x=X, y=Y)) +
geom_line()+
geom_area(mapping=aes(x=ifelse(x>0 & x<10, x, 0)), fill="#9898fb", alpha=1.) +
xlim(0,100) +
ylim(0, .1) +
ylab("Exponential Mixture Density") +
ggtitle("AUC for Exponential Mixture: ~44th Percentile") +
geom_vline(xintercept=0)+
geom_hline(yintercept=0)
```

For each plot, we simply replaced the title and passed the quantile of interest to the generic `f`

function,
which encapsulates our sample exponential mixture model.

## Conclusion

Exponential mixture models have many desirable properties and offer a degree of flexibility not characteristic of the parametric approach. In a future post, we’ll demonstrate how to obtain the optimal weighting scheme and mean for each mixture term using maximum likelihood estimation and also using Markov Chain Monte Carlo methods. Until then, happy coding!