LOESS, also referred to as LOWESS, for locally-weighted scatterplot smoothing, is a non-parametric regression method that combines multiple regression models in a k-nearest-neighbor-based meta-model 1. Although LOESS and LOWESS can sometimes have slightly different meanings, they are in many contexts treated as synonyms. For the remainder of this post, we will refer to the fitting of localized subsets of data to build up a function that describes the deterministic part of the variation in a dataset using weighted linear least squares as the LOESS method. Although libraries such as scikit-learn and statsmodels have highly optimized locally weighted regression routines, it’s helpful to see how the estimates are actually generated, and that’s what we’ll be doing in this post.

## LOESS Fitting Procedure

Given a dataset consisting of a collection of $$n$$ $$(x, y)$$ pairs, one of the purposes of LOESS is to provide a general idea of the dataset’s distributional form prior to committing to a specific parametric model. One of the benefits of LOESS is that there is no requirement to specify a global function to fit to the data. Instead, locally weighted regression is performed on a pre-defined number of nearest neighbors around each data point, then the composite of each locally weighted regression is plotted along with a scatterplot of the original $$(x, y)$$ data.
In what follows, we’ll outline the steps of the LOESS algorithm, and along the way identify inputs to the fitting routine that can be modified to alter the nature of the resulting composite fit. In addition, we’ll cover an optional iterative robust estimation procedure, which attempts to mitigate the effect that outliers or influential data points may have on each set of locally weighted regression parameters.

### Step I: Specify LOESS parameters

In order to carry out the LOESS fitting procedure, we need to specify the number of data points to use for each locally weighted regression, and the degree of the polynomial used for estimation. The number of data points used for each regression is obtained from $$\alpha$$, the smoothing paramter. For example, if a dataset contains 20 observations and $$\alpha=.50$$, then the $$q = 20 * \alpha = 10$$ most proximate data points will be used for each locally weighted regression. The degree of polynomial is represented by $$\lambda$$, and within the context of LOESS is commonly set to $$\lambda=1$$ (linear regression) or $$\lambda=2$$ (second-degree polynomial regression) for each subset consiting of $$q$$ data points.

The first step is to define $$m$$ equally-spaced locations across the range of $$x$$ values, identified as $$v_{j}$$, where $$j$$ ranges from 1 to $$m$$. Ultimately, the LOESS curve will be evaluated at each of the $$v_{j}$$s, and the fitted value at each $$v_{j}$$ will be given by $$\hat{g}(v_{j})$$ 2. Generally, if the dataset contains $$n$$ data points, then $$m = n + 1$$, i.e., the number of equally-spaced locations will be one more than the total number of data points.

### Step II: Calculate proximate weights for each observation

For each $$v_{j}$$, it is necessary to determine the $$q$$ closest points, $$d_{i}(v_{j}) = |x_{i}-v_{j}|$$. Once the $$q$$-nearest observations have been determined, each distance is rescaled as a proportion of the maximum of the $$q$$-nearest distances, $$d^{*}_{i}(v_{j})=d_{i}(v_{j})/d_{q}(v_{j})$$, where $$d^{*}_{1}(v_{j})$$ represents the nearest data point, and $$d^{*}_{q}(v_{j})$$ the furthest. We calculate neighborhood weights for all observations, from $$i=1$$ to $$n$$. Once the re-scaled distances have been calculated, the weight for observation $$i$$ is defined using the tricube weight function, given by:

$$w_{i}(v_{j}) = \begin{cases} (1-|d^{*}_{i}(v_{j})^{3}|)^{3} & d^{*}_{i}(v_{j}) \lt 1 \\ 0 & otherwise \end{cases}$$

Because distances are rescaled in terms of the most distant of the $$q$$-nearest observations, observations beyond the $$q^{th}$$-nearest will be given 0 weight when fitting the locally weighted regression coefficients.

### Step III: Determine local regression estimate at each location

For each location $$v_{j}$$, determine the local weighted regression coefficients using the normal equations adapted to attribute greater weight to more proximate observations. In matrix form, the regression parameters are given by

$$\hat{\beta} = (X^{T}WX)^{-1}X^{T}Wy,$$

where

* $$X$$ is the $$n$$ x $$(\lambda+1)$$ design matrix, with $$\lambda$$ representing the degree of the regression polynomial

* $$W$$ is the $$n$$ x $$n$$ diagonal weight matrix

* $$y$$ is the $$n$$ x $$1$$ response vector

* $$\hat{\beta}$$ is the $$(\lambda+1)$$ x $$1$$ parameter vector estimated for each location $$v_{j}$$

The first column of $$X$$ will contain all 1’s, representing the intercept term. When $$\lambda=1$$:

\begin{align*} X &= \begin{bmatrix} 1&x_{1} \\ 1&x_{2} \\ \vdots & \vdots\\1&x_{n} \\ \end{bmatrix} \\\\ \hat{\beta} &= \begin{bmatrix} \beta_{0}&\beta_{1} \\ \end{bmatrix}, \\ \end{align*}

and when $$\lambda=2$$:

\begin{align*} X &= \begin{bmatrix} 1&x_{1}&x_{1}^{2} \\ 1&x_{2}&x_{2}^{2} \\ \vdots & \vdots\\1&x_{n}&x_{n}^{2} \\ \end{bmatrix} \\\\ \hat{\beta} &= \begin{bmatrix} \beta_{0}&\beta_{1}&\beta_{2} \\ \end{bmatrix}. \\ \end{align*}

### Implementation

The code that follows encapsulates steps I, II and III of the LOESS fitting procedure described thus far. We’ll discuss the optional robust estimation procedure after introducing the base LOESS algorithm:

"""
Local Regression (LOESS) estimation routine.
"""
import numpy as np
import pandas as pd
import scipy

def loc_eval(x, b):
"""
Evaluate x using locally-weighted regression parameters.
Degree of polynomial used in loess is inferred from b. x
is assumed to be a scalar.
"""
loc_est = 0
for i in enumerate(b): loc_est+=i[1]*(x**i[0])
return(loc_est)

def loess(xvals, yvals, alpha, poly_degree=1):
"""
Perform locally-weighted regression on xvals & yvals.
Variables used inside loess function:

n         => number of data points in xvals
m         => nbr of LOESS evaluation points
q         => number of data points used for each
locally-weighted regression
v         => x-value locations for evaluating LOESS
locsDF    => contains local regression details for each
location v
evalDF    => contains actual LOESS output for each v
X         => n-by-(poly_degree+1) design matrix
W         => n-by-n diagonal weight matrix for each
local regression
y         => yvals
b         => local regression coefficient estimates.
b = (X^T*W*X)^-1*X^T*W*y. Note that @
replaces np.dot in recent numpy versions.
local_est => response for local regression
"""
# Sort dataset by xvals.
all_data = sorted(zip(xvals, yvals), key=lambda x: x[0])
xvals, yvals = zip(*all_data)

locsDF = pd.DataFrame(
columns=[
'loc','x','weights','v','y','raw_dists',
'scale_factor','scaled_dists'
])
evalDF = pd.DataFrame(
columns=[
'loc','est','b','v','g'
])

n = len(xvals)
m = n + 1
q = int(np.floor(n * alpha) if alpha <= 1.0 else n)
avg_interval = ((max(xvals)-min(xvals))/len(xvals))
v_lb = max(0,min(xvals)-(.5*avg_interval))
v_ub = (max(xvals)+(.5*avg_interval))
v = enumerate(np.linspace(start=v_lb, stop=v_ub, num=m), start=1)

# Generate design matrix based on poly_degree.
xcols = [np.ones_like(xvals)]
for j in range(1, (poly_degree + 1)):
xcols.append([i ** j for i in xvals])
X = np.vstack(xcols).T

for i in v:

iterpos = i[0]
iterval = i[1]

# Determine q-nearest xvals to iterval.
iterdists = sorted([(j, np.abs(j-iterval)) \
for j in xvals], key=lambda x: x[1])

_, raw_dists = zip(*iterdists)

# Scale local observations by qth-nearest raw_dist.
scale_fact = raw_dists[q-1]
scaled_dists = [(j[0],(j[1]/scale_fact)) for j in iterdists]
weights = [(j[0],((1-np.abs(j[1]**3))**3 \
if j[1]<=1 else 0)) for j in scaled_dists]

# Remove xvals from each tuple:
_, weights      = zip(*sorted(weights,     key=lambda x: x[0]))
_, raw_dists    = zip(*sorted(iterdists,   key=lambda x: x[0]))
_, scaled_dists = zip(*sorted(scaled_dists,key=lambda x: x[0]))

iterDF1 = pd.DataFrame({
'loc'         :iterpos,
'x'           :xvals,
'v'           :iterval,
'weights'     :weights,
'y'           :yvals,
'raw_dists'   :raw_dists,
'scale_fact'  :scale_fact,
'scaled_dists':scaled_dists
})

locsDF    = pd.concat([locsDF, iterDF1])
W         = np.diag(weights)
y         = yvals
b         = np.linalg.inv(X.T @ W @ X) @ (X.T @ W @ y)
local_est = loc_eval(iterval, b)
iterDF2   = pd.DataFrame({
'loc':[iterpos],
'b'  :[b],
'v'  :[iterval],
'g'  :[local_est]
})

evalDF = pd.concat([evalDF, iterDF2])

# Reset indicies for returned DataFrames.
locsDF.reset_index(inplace=True)
locsDF.drop('index', axis=1, inplace=True)
locsDF['est'] = 0; evalDF['est'] = 0
locsDF = locsDF[['loc','est','v','x','y','raw_dists',
'scale_fact','scaled_dists','weights']]

# Reset index for evalDF.
evalDF.reset_index(inplace=True)
evalDF.drop('index', axis=1, inplace=True)
evalDF = evalDF[['loc','est', 'v', 'b', 'g']]

return(locsDF, evalDF)


The first function, loc_eval, calculates the local regression estimate using the specified vector of regression coefficients. loess takes 4 arguments: xvals and yvals are length $$n$$ arrays that serve as the target for the estimation procedure. alpha is the smoothing paramater, $$0 \leq \alpha \leq 1$$, which determines how many observations to use for each local regression. poly_degree is the degree of each local regression polynomial, typically set to 1 or 2.
loess returns a tuple consisting of two DataFrames: The first contains details related to each local regression, including the nearest observations, tricube weights and the scale factor used for setting the weight matrix. The second DataFrame contains all regression coefficients and actual LOESS estimates for each $$v_{j}$$ (the LOESS estimates are in column g).
To demonstrate howloess works, it is passed a dataset that follows a known functional form but with added random perturbations:

"""
Calling loess on a dataset that follows a known functional
"""
import numpy as np
import pandas as pd
import scipy
import matplotlib.pyplot as plt
import seaborn as sns

np.random.seed(16)

xvals = [.39,.75,1.1,1.45,1.95,2.46,3.07,3.44,4.57,5.05,5.68,
6.01,6.63,7.11,7.62,8.01,8.54,9.08,9.48, 9.91]

yvals = [1.25*np.sqrt(i)+np.random.normal(0, .425) for i in xvals]

# loess returns a tuple of DataFrames, named here as regsDF and
# evalDF for "Regression DataFrame" and "Evaluation DataFrame":
regsDF, evalDF = loess(xvals, yvals, alpha=.6, poly_degree=1)


Here’s the contents of evalDF:

   loc  est        v                                 b         g
0    1    0   0.1520   [0.971309456784, 0.36775619248]  1.027208
1    2    0   0.6518  [0.971037894146, 0.367892814498]  1.210830
2    3    0   1.1516  [0.970870414448, 0.368032309445]  1.394696
3    4    0   1.6514  [0.969933474046, 0.368445061391]  1.578384
4    5    0   2.1512  [0.966646869447, 0.369833381406]  1.762232
5    6    0   2.6510   [0.95975778471, 0.373072897623]  1.948774
6    7    0   3.1508   [0.951294824032, 0.37704265022]  2.139281
7    8    0   3.6506  [0.934302528565, 0.384487641362]  2.337913
8    9    0   4.1504  [0.943742212522, 0.379817597861]  2.520137
9   10    0   4.6502    [1.05795640166, 0.34805636124]  2.676488
10  11    0   5.1500   [1.31080362855, 0.295077400117]  2.830452
11  12    0   5.6498    [1.7907245679, 0.210302149182]  2.978890
12  13    0   6.1496   [2.04555732953, 0.171963219408]  3.103062
13  14    0   6.6494   [1.79314133537, 0.214725343498]  3.220936
14  15    0   7.1492   [1.62970787356, 0.238031763905]  3.331445
15  16    0   7.6490   [1.48509663441, 0.257554113797]  3.455128
16  17    0   8.1488    [1.5109094503, 0.253071435328]  3.573138
17  18    0   8.6486   [1.50473780555, 0.253257497761]  3.695061
18  19    0   9.1484   [1.49131228675, 0.254851108695]  3.822792
19  20    0   9.6482     [1.484099205, 0.255821994149]  3.952321
20  21    0  10.1480   [1.48577211467, 0.255705004244]  4.080666


For each of the 21 locations in v, the 1st-degree polynomial LOESS estimate can be found in g, and b contains the regression coefficients for each of the 21 regressions carried out within the function. To illustrate, for $$v=.1520$$, the model specification is given by $$\beta_{0} + \beta_{1}v = 0.9713 + 0.367756*0.1520 = 1.027208$$, which is exactly the value in g.
To visualize, we can produce a scatterplot of xvals and yvals with the LOESS estimate overlaid:

import numpy as np
import pandas as pd
import scipy
import matplotlib.pyplot as plt
import seaborn as sns

sns.set(context='notebook', style='darkgrid', font_scale=1)

np.random.seed(16)

xvals = [.39,.75,1.1,1.45,1.95,2.46,3.07,3.44,4.57,5.05,5.68,
6.01,6.63,7.11,7.62,8.01,8.54,9.08,9.48, 9.91]

yvals = [1.25*np.sqrt(i)+np.random.normal(0, .425) for i in xvals]

regsDF, evalDF = loess(xvals, yvals, alpha=.6, poly_degree=1)

# Obtain reference to LOESS x & y values (v & g).
l_x  = evalDF['v'].values
l_y  = evalDF['g'].values

# Generate x-y scatterplot with loess estimate overlaid.
fig = plt.figure()
ax1.grid(True)
ax1.minorticks_on()
ax1.scatter(xvals, yvals, label="Original Data")
ax1.plot(l_x, l_y, color='#FF0000', label="1st-degree Polynomial LOESS")
ax1.set_title("Linear LOESS Estimator", loc="left", fontsize=14)
ax1.legend(loc="upper left",
scatterpoints=1,
fontsize=11,
frameon=True,
fancybox=True,
facecolor="#FFFFFF",
edgecolor="#000000")
plt.tight_layout()
plt.show()


Running the code above produces:

We can specify poly_degree=2 when calling loess, which will fit each local subset of data to a quadratic polynomial. Next, we generate 1st and 2nd-degree polynomial LOESS estimates based on the same dataset and compare the resulting estimates. Referring to the same xvals and yvals as before:

regsDF1, evalDF1 = loess1(xvals, yvals, alpha=.6, poly_degree=1)
regsDF2, evalDF2 = loess1(xvals, yvals, alpha=.6, poly_degree=2)

l_x1  = evalDF['v'].values
l_y1  = evalDF['g'].values

l_x2 = evalDF2['v'].values
l_y2 = evalDF2['g'].values

# poly_degree=1 plot:
fig = plt.figure()
ax1.grid(True)
ax1.minorticks_on()
ax1.scatter(xvals, yvals, label="Original Data")
ax1.plot(l_x1, l_y1, color='#FF0000', label="1st-degree Polynomial LOESS")
ax1.set_title("Linear LOESS Estimator", loc="left", fontsize=14)
ax1.legend(loc="upper left",
scatterpoints=1,
fontsize=11,
frameon=True,
fancybox=True,
facecolor="#FFFFFF",
edgecolor="#000000")

# poly_degree=2 plot:
ax2.grid(True)
ax2.minorticks_on()
ax2.scatter(xvals, yvals, label="Original Data")
ax2.plot(l_x2, l_y2, color='#FF0000', label="2nd-degree Polynomial LOESS")
ax2.legend(loc="upper left",
scatterpoints=1,
fontsize=11,
frameon=True,
fancybox=True,
facecolor="#FFFFFF",
edgecolor="#000000")

plt.tight_layout()
plt.show()


This produces a side-by-side comparison of the 1st and 2nd-degree polynomial estimates:

It may be necessary to adjust the smoothing parameter in order to compensate for areas of overfitting when specifying a LOESS estimate with a higher degree polynomial.

### Step IV (optional): Iterative robust estimation procedure

After the initial LOESS estimate, each local regression model at each location can be replaced by another model using an iterative robust estimation procedure 2. This involves four steps, which are repeated until some convergence criteria are met. The four steps are carried out for each of the $$n+1$$ locations $$v_j$$.

#### 1. Calculate local regression model residuals

Calculate the difference between the local regression line and response. If the residuals for location $$j$$ are given by $$e$$, then

$$e = y-X\hat{\beta_{j}}$$

#### 2. Normalize residuals

We use the median of $$e$$ in 1) to calculate the normalized residuals, $$e^{*}$$:

$$e^{*} = \frac{e}{6\,Median\, |e|}$$

#### 3. Define robustness weight for each observation

For the robustness weight, we use the bisquare weight function, given by:

$$r = \begin{cases} (1-|e^{*}|^{2})^{2} & |e^{*}| \lt 1 \\ 0 & otherwise \end{cases}$$

This new weight, $$r$$, will have the same number of elements as the original tricube weights, which were used to create an $$n$$ x $$n$$ diagonal weight matrix. For the robustness update, we proceed just as before, the only difference being the new weight matrix, $$W^{*}$$, will be the product of the original tricube weights and $$r$$ placed along the diagonal.

#### 4. Use updated weights to estimate new fitted value

If the updated parameter estimates are given by \hat{\beta^{*}}, then

$$\hat{\beta^{*}} = (X^{T}W^{*}X)^{-1}X^{T}W^{*}y,$$

with $$X$$ and $$y$$ the same as before. Repeat until estimated coefficients converge.

We extend the original loess function by adding a robustify parameter (which defaults to False) that indicates whether or not to perform the robust estimation procedure (“Step IV”). Iteration ceases when either 1) each new estimate, when compared with estimates from the previous iteration, change by less than $$.5%$$, or 2) the number of iterations exceeds 50:

"""
Local Regression (LOESS) estimation routine with optional
iterative robust estimation procedure. Setting robustify=True
indicates that the robust estimation procedure should be
performed.
"""
import numpy as np
import pandas as pd
import scipy

def loc_eval(x, b):
"""
Evaluate x using locally-weighted regression parameters.
Degree of polynomial used in loess is inferred from b. x
is assumed to be a scalar.
"""
loc_est = 0
for i in enumerate(b): loc_est+=i[1]*(x**i[0])
return(loc_est)

def loess(xvals, yvals, alpha, poly_degree=1, robustify=False):
"""
Perform locally-weighted regression via xvals & yvals.
Variables used within loess function:

n         => number of data points in xvals
m         => nbr of LOESS evaluation points
q         => number of data points used for each
locally-weighted regression
v         => x-value locations for evaluating LOESS
locsDF    => contains local regression details for each
location v
evalDF    => contains actual LOESS output for each v
X         => n-by-(poly_degree+1) design matrix
W         => n-by-n diagonal weight matrix for each
local regression
y         => yvals
b         => local regression coefficient estimates.
b = (X^T*W*X)^-1*X^T*W*y. Note that @
replaces np.dot in recent numpy versions.
local_est => response for local regression
"""
# sort dataset by xvals:
all_data = sorted(zip(xvals, yvals), key=lambda x: x[0])
xvals, yvals = zip(*all_data)

locsDF = pd.DataFrame(
columns=[
'loc','x','weights','v','y','raw_dists',
'scale_factor','scaled_dists'
])
evalDF = pd.DataFrame(
columns=[
'loc','est','b','v','g'
])

n = len(xvals)
m = n + 1
q = int(np.floor(n * alpha) if alpha <= 1.0 else n)
avg_interval = ((max(xvals)-min(xvals))/len(xvals))
v_lb = max(0,min(xvals)-(.5*avg_interval))
v_ub = (max(xvals)+(.5*avg_interval))
v = enumerate(np.linspace(start=v_lb, stop=v_ub, num=m), start=1)

# Generate design matrix based on poly_degree.
xcols = [np.ones_like(xvals)]
for j in range(1, (poly_degree + 1)):
xcols.append([i ** j for i in xvals])
X = np.vstack(xcols).T

for i in v:

iterpos = i[0]
iterval = i[1]

# Determine q-nearest xvals to iterval.
iterdists = sorted([(j, np.abs(j-iterval)) \
for j in xvals], key=lambda x: x[1])

_, raw_dists = zip(*iterdists)

# Scale local observations by qth-nearest raw_dist.
scale_fact = raw_dists[q-1]
scaled_dists = [(j[0],(j[1]/scale_fact)) for j in iterdists]
weights = [(j[0],((1-np.abs(j[1]**3))**3 \
if j[1]<=1 else 0)) for j in scaled_dists]

# Remove xvals from each tuple:
_, weights      = zip(*sorted(weights,     key=lambda x: x[0]))
_, raw_dists    = zip(*sorted(iterdists,   key=lambda x: x[0]))
_, scaled_dists = zip(*sorted(scaled_dists,key=lambda x: x[0]))

iterDF1 = pd.DataFrame({
'loc'         :iterpos,
'x'           :xvals,
'v'           :iterval,
'weights'     :weights,
'y'           :yvals,
'raw_dists'   :raw_dists,
'scale_fact'  :scale_fact,
'scaled_dists':scaled_dists
})

locsDF = pd.concat([locsDF, iterDF1])
W = np.diag(weights)
y = yvals
b = np.linalg.inv(X.T @ W @ X) @ (X.T @ W @ y)
local_est = loc_eval(iterval, b)

iterDF2 = pd.DataFrame({
'loc':[iterpos],
'b'  :[b],
'v'  :[iterval],
'g'  :[local_est]
})

evalDF = pd.concat([evalDF, iterDF2])

# Reset indicies for returned DataFrames.
locsDF.reset_index(inplace=True)
locsDF.drop('index', axis=1, inplace=True)
locsDF['est'] = 0; evalDF['est'] = 0
locsDF = locsDF[['loc','est','v','x','y','raw_dists',
'scale_fact','scaled_dists','weights']]

if robustify==True:

cycle_nbr = 1
robust_est = [evalDF]

while True:
# Perform iterative robustness procedure for each local regression.
# Evaluate local regression for each item in xvals.
#
# e1_i => raw residuals
# e2_i => scaled residuals
# r_i  => robustness weight
revalDF = pd.DataFrame(
columns=['loc','est','v','b','g']
)

for i in robust_est[-1]['loc']:

prevDF = robust_est[-1]
locDF = locsDF[locsDF['loc']==i]
b_i = prevDF.loc[prevDF['loc']==i,'b'].item()
w_i = locDF['weights']
v_i = prevDF.loc[prevDF['loc']==i, 'v'].item()
g_i = prevDF.loc[prevDF['loc']==i, 'g'].item()
e1_i = [k-loc_eval(j,b_i) for (j,k) in zip(xvals,yvals)]
e2_i = [j/(6*np.median(np.abs(e1_i))) for j in e1_i]
r_i = [(1-np.abs(j**2))**2 if np.abs(j)<1 else 0 for j in e2_i]
w_f = [j*k for (j,k) in zip(w_i, r_i)]    # new weights
W_r = np.diag(w_f)
b_r = np.linalg.inv(X.T @ W_r @ X) @ (X.T @ W_r @ y)
riter_est = loc_eval(v_i, b_r)

riterDF = pd.DataFrame({
'loc':[i],
'b'  :[b_r],
'v'  :[v_i],
'g'  :[riter_est],
'est':[cycle_nbr]
})

revalDF = pd.concat([revalDF, riterDF])
robust_est.append(revalDF)

# Compare g vals from two latest revalDF's in robust_est.
idiffs = \
np.abs((robust_est[-2]["g"]-robust_est[-1]["g"])/robust_est[-2]["g"])

if ((np.all(idiffs<.005)) or cycle_nbr>50): break

cycle_nbr+=1

# Vertically bind all DataFrames from robust_est.
evalDF = pd.concat(robust_est)

evalDF.reset_index(inplace=True)
evalDF.drop('index', axis=1, inplace=True)
evalDF = evalDF[['loc','est', 'v', 'b', 'g']]

return(locsDF, evalDF)


Originally, evalDF contained $$n+1$$ records, one estimate per location $$v_{j}$$, and in the earlier example, evalDFs est field contained all 0’s, indicating that the associated estimates were from the initial estimation. Now, when robustify=True, est can be used to determine from which iteration a particular set of estimates originate. In the next example, we call loess with robustify=True, and compare how the LOESS curve changes with each updated set of model coefficients. For the remaining demonstrations, poly_degree=1:

"""
Graphical comparison of LOESS model estimates
at each robustness iteration.
"""
locsDF, evalDF = loess(xvals, yvals, alpha=.6, poly_degree=1, robustify=True)

resDF0 = evalDF[evalDF['est']==0]
l_x0   = resDF0['v'].values
l_y0   = resDF0['g'].values

resDF1 =  evalDF[evalDF['est']==1]
l_x1   = resDF1['v'].values
l_y1   = resDF1['g'].values

resDF3 =  evalDF[evalDF['est']==3]
l_x3   = resDF3['v'].values
l_y3   = resDF3['g'].values

resDF4 =  evalDF[evalDF['est']==evalDF['est'].max()]
l_x4   = resDF4['v'].values
l_y4   = resDF4['g'].values

fig = plt.figure()

ax0.grid(True)
ax0.minorticks_on()
ax0.scatter(xvals, yvals, label="Original Data")
ax0.plot(l_x0, l_y0, color='#FF0000', label="1-D Polynomial LOESS Estimate")
ax0.set_title("LOESS Estimate 1", loc='left')
ax0.legend(loc="upper left",
scatterpoints=1,
fontsize=11,
frameon=True,
fancybox=True,
facecolor="#FFFFFF",
edgecolor="#000000")

ax1.grid(True)
ax1.minorticks_on()
ax1.scatter(xvals, yvals, label="Original Data")
ax1.plot(l_x1, l_y1, color='#FF0000', label="1-D Polynomial LOESS Estimate")
ax1.set_title("LOESS Estimate 2", loc='left')
ax1.legend(loc="upper left",
scatterpoints=1,
fontsize=11,
frameon=True,
fancybox=True,
facecolor="#FFFFFF",
edgecolor="#000000")

ax2.grid(True)
ax2.minorticks_on()
ax2.scatter(xvals, yvals, label="Original Data")
ax2.plot(l_x3, l_y3, color='#FF0000', label="1-D Polynomial LOESS Estimate")
ax2.set_title("LOESS Estimate 3", loc='left')
ax2.legend(loc="upper left",
scatterpoints=1,
fontsize=11,
frameon=True,
fancybox=True,
facecolor="#FFFFFF",
edgecolor="#000000")

ax3.grid(True)
ax3.minorticks_on()
ax3.scatter(xvals, yvals, label="Original Data")
ax3.plot(l_x4, l_y4, color='#FF0000', label="1-D Polynomial LOESS Estimate")
ax3.set_title("LOESS Estimate 4", loc='left')
ax3.legend(loc="upper left",
scatterpoints=1,
fontsize=11,
frameon=True,
fancybox=True,
facecolor="#FFFFFF",
edgecolor="#000000")

plt.tight_layout()
plt.show()


Running the code above generates a comparison of the LOESS curves at each iteration of the robust estimation procedure:

The change between iteration 1 (upper left) and iteration 4 (lower right) is subtle, primarily due to the fact that we’re modeling a fairly well-behaved dataset, but notice how the influence of the very last observation on the LOESS curve is tempered as we move from the initial to the final plot. At first, for x-value=10, the LOESS curve estimated a value in excess of 4, between 4.10-4.20. By the final iteration, the y-value associated with the same x-value was almost exactly 4.0. The robust estimation procedure will have a greater affect on the LOESS approximation when modeling datasets that exhibit a greater degree of variation.

## Conclusion

The LOESS method can be especially valuable as an exploratory data analysis tool. If there exists little prior information on the distribution of a particular dataset, using a locally-weighted regression routine can help lend insight into its general distributional form.

### Footnotes:

1. Jacoby, William G. - Loess: A Nonparametric, Graphical Tool for Depicting Relationships Between Variables
2. Cleveland,William S. - Robust Locally Weighted Regression and Smoothing Scatterplots