The exponential mixture distribution is a semi-parametric distribution used in Actuarial Loss Modeling to fit collections of similiar losses. The exponential mixture has a decreasing hazard rate, increasing Mean Residual Life and has a thicker tail than the usual single parameter exponential 1. According to Keatinge 2, the exponential mixture is considered semi-parametric because no parametric assumption is made about the form of the mixing distribution.

The R library Renext exposes the MixExp2 function, which makes available the standard d, p, q and r-prefixed statistical functions to determine the density, distribution, quantile or generate a random draw from a 2-component exponential mixture model, but for exponential mixtures comprised of more than two terms, there’s suprisingly little available.
In this post, we’ll demonstrate a method for calculating quantiles from exponential mixture models with any number of terms using inverse transform sampling. We’ll leverage the Newton-Raphson method to obtain quantiles, assuming the mean and weight for each component have already been determined. In a future post, we’ll demonstrate how to use maximum likelihood and MCMC techniques to arrive at the optimal weight distribution across terms based on the dataset of interest.

## Background

The distribution function of an exponential mixture model follows the form:

$$F(x) = 1 - w_{1}e^{-\lambda_{1}x} - w_{2}e^{-\lambda_{2}x} - \cdots - w_{n}e^{-\lambda_{n}x}$$

Where $$\sum_{i}^{n}w_{i} = 1$$, and $$\lambda_{i}$$ is the inverse of the expected value for the $$i^{th}$$ component in the mixture.
In order to use the Newton-Raphson method to find roots for an exponential mixture model with an arbitrary number of components, we need to arrange the CDF such that all terms on one side are set identically to zero. Rearranging yields:

$$w_{1}e^{-\lambda_{1}x} + w_{2}e^{-\lambda_{2}x} + \cdots + w_{n}e^{-\lambda_{n}x} - (1 - F(x)) = 0$$

The Newton-Raphson update expression can be represented as:

$$x_{n+1} = x_{n} - \frac{f(x)}{f^{\prime}(x)}$$

With successive iterations providing better approximations of the root until a sufficiently accurate value is obtained.

For our purposes, the rearranged CDF expression will serve as $$f(x)$$. $$f^{\prime}(x)$$ will be the derivative of the rearranged CDF, specifically:

$$f^{\prime}(x) = -\lambda_{1}w_{1}e^{-\lambda_{1}x} -\lambda_{2}w_{2}e^{-\lambda_{2}x} - \cdots -\lambda_{n}w_{n}e^{-\lambda_{n}x}$$

We can commence iteration at $$x_{0} = 0$$ in all cases, and don’t need to concern ourselves with searching for a optimal initialization point since the mixture is monotonic.

## Implementation

What follows is an implementation of the Newton-Raphson algorithm written in R. It takes as arguments a starting point (which defaults to 0), a function f and the function’s derivative f_prime. epsilon and maxiters represent the threshold below which a root is assumed to have been located and signals for iteration to cease, and the maximum number of iterations to carry out as a safeguard against divergence (respectively):

newtonRaphson = function(
x_init=0, f, f_prime, epsilon=.000001,
maxiters=1000, returniters=FALSE) {

root = NULL
x_i = x_init
tolerance = 1
itervals = vector()
itercntr = 0    # ensure maxiters is not exceeded

while (TRUE) {

itercntr = itercntr + 1
itervals[[itercntr]] = x_i

if ((itercntr>=maxiters) || (tolerance<=epsilon))  {
root = itervals[[length(itervals)]]
break

} else {
x_n = (x_i - (f(x_i)/f_prime(x_i)))
tolerance = abs(x_n - x_i)
x_i = x_n
}
}

resList = list(iterations=itercntr, root=root, approx_seq=itervals)
if (returniters==FALSE) resList$approx_seq = NULL return(resList) }  newtonRaphson returns a list with three components: • iterations is the number of Newton-Raphson approximations • root is the value along the x-axis at which the minimum of the function is obtained • approx_seq is a vector of length equal to iterations of successive root approximations Next we’ll walkthrough an example demonstrating the use of newtonRaphson. ### Numerical Example Assume we’re interested in determining quantiles for an exponential mixture model with weights = (.60,.30,.10) and means = (10,50,100). The corresponding distribution function is therefore: $$F(x) = 1 - .60e^{-x/10} - .30e^{-x/50} - .10e^{-x/100},$$ Where the hazard rate for each term is the inverse of the term’s mean. As a sanity check, let’s evaluate $$F(x)$$ at a few points to determine the CDF, which will serve as a stand-in for the uniform random variates we’ll be sampling when the algorithm is used in practice. Let’s find the associated percentiles for inputs 10, 25, 75 and 200. This is straightforward in R: mixtureCDF = function(x) { return(1 - .60 * exp(-x/10) - .30 * exp(-x/50) - .10 * exp(-x/100)) } mixtureCDF(10) # 0.4431693676 mixtureCDF(25) # 0.6909097246 mixtureCDF(75) # 0.8854924461 mixtureCDF(200) # 0.9809717788  These percentiles, 0.4431693676, etc… represent the uniform random numbers that will be passed to the simulation framework in practice. The goal is upon being passed a random number such as 0.6909097246 (corresponding to the mixture model evaluated at 25), the function will return 25. We need one additional piece of functionality to complete the framework: A function that wraps newtonRaphson and takes for arguments a vector of means, a vector of weights and returns another function which can then be passed a uniform random variate, which ultimately returns the corresponding quantile. This logic is encapsulated in invertCDF: invertCDF = function(weights, means) { rates = 1.0/means function(urand) { # initialize numerator and denominator functional # components for Newton-Raphson iteration => f = function(x) { func = 0 for (i in 1:length(rates)) { iterrate = rates[i] iterweight = weights[i] iter_func = iterweight * exp(-iterrate * x) func = func + iter_func } return(func - (1 - urand)) } f_prime = function(x) { func_prime = 0 for (i in 1:length(rates)) { iterrate = rates[i] iterweight = weights[i] iter_func_prime = -((iterweight * iterrate) * exp(-iterrate * x)) func_prime = func_prime + iter_func_prime } return(func_prime) } # call newtonRaphson to back-out quantile from mixed exponential # distribution corresponding to CDF value of urand => result = newtonRaphson(x_init=0, f=f, f_prime=f_prime) return(result$root)
}
}


Finally, we demonstrate using invertCDF to find quantiles from the sample exponential mixture CDF, $$F(x) = 1 - .60e^{-x/10} - .30e^{-x/50} - .10e^{-x/100}$$. Let’s assume our sample uniform random draws are 0.4431693676, 0.6909097246, 0.8854924461 and 0.9809717788, which correspond to values 10, 25, 75 and 200. Put another way, if we pass 0.4431693676, 0.6909097246, 0.8854924461 and 0.9809717788 to our simulation framework, we’d expect the values 10, 25, 75 and 200 to be returned. This serves as verification that the framework is functioning as expected:

weights = c(.60, .30, .10)
means = c(10, 50, 100)
random_draws = c(0.4431693676, 0.6909097246, 0.8854924461, 0.9809717788)

# call invertCDF, which returns a function =>
inverter = invertCDF(weights=weights, means=means)

#pass random_draws to inverter, and obtain the estimated percentiles =>
result = sapply(random_draws, inverter, simplify=TRUE)

# print values of result =>
print(result)

[1]  10.00000000  25.00000000  75.00000002 200.00000011


Below are plots of the exponential mixture model, along with the shaded area under the curve that corresponds to the quantiles found above:

These plots were created with ggplot2 and the following specification:

# Logic to create highlighted AUC plots with ggplot2 =>
library(ggplot2)

f = function(x) {
return(.06*exp(-x/10.)+.006*exp(-x/50.)+.001*exp(-x/100.))
}

x = seq(from=0,to=100,by=.005)
y = sapply(X=x,FUN=f,simplify=TRUE,USE.NAMES=FALSE)
df = data.frame(X=x,Y=y)

ggplot(data=df, mapping=aes(x=X, y=Y)) +
geom_line()+
geom_area(mapping=aes(x=ifelse(x>0 & x<10, x, 0)), fill="#9898fb", alpha=1.) +
xlim(0,100) +
ylim(0, .1) +
ylab("Exponential Mixture Density") +
ggtitle("AUC for Exponential Mixture: ~44th Percentile") +
geom_vline(xintercept=0)+
geom_hline(yintercept=0)


For each plot, we simply replaced the title and passed the quantile of interest to the generic f function, which encapsulates our sample exponential mixture model.

## Conclusion

Exponential mixture models have many desirable properties and offer a degree of flexibility not characteristic of the parametric approach. In a future post, we’ll demonstrate how to obtain the optimal weighting scheme and mean for each mixture term using maximum likelihood estimation and also using Markov Chain Monte Carlo methods. Until then, happy coding!