Consider a GAMs model as follows:

\[Y=\alpha + \sum_{j=1}^{p}f_{j}(X_{j}) + \epsilon\]

We can estimate the functions, \(f_{j}\), by minimizing the penalized residual sum of squares as follows:

\[PRSS=\sum_{i=1}^{N}\Big(y_{i}-\alpha-\sum_{j=1}^{p}f_{j}(x_{ij})\Big)^2+\sum_{j=1}^{p}\lambda_{j}\int f_{j}''(t_{j})^2\,dt_{j}\]

The first term in the above equation measures how close our functions are to the data, and the second term penalizes curvature in the estimated functions. The smoothing parameter \(\lambda_{j}\) sets this tradeoff for each function \(f_{j}\). If \(\lambda_{j}\) is infinite, we are left with a simple linear fit, because no curvature (no non-zero second derivative) is tolerated!
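To see why, note that a linear function has zero second derivative everywhere, so it contributes nothing to the penalty, whereas any curvature gets penalized without bound as \(\lambda_{j}\rightarrow\infty\):

\[f_{j}(t_{j})=a+bt_{j} \;\Rightarrow\; f_{j}''(t_{j})=0 \;\Rightarrow\; \lambda_{j}\int f_{j}''(t_{j})^2\,dt_{j}=0\]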

It turns out that each function \(f_{j}\) minimizing this criterion is a cubic smoothing spline in its predictor. How do we estimate these splines? Let's see.

The constant \(\alpha\): This constant is shared across all the functions \(f_{j}\), and without an extra restriction it is not identifiable (we could shift a constant between \(\alpha\) and any \(f_{j}\)). The standard convention is to set \(\hat{\alpha} = \frac{1}{N}\sum_{i=1}^{N}y_{i}\), the mean of the response. The functions \(f_{j}\) then average zero over the data, \(\sum_{i=1}^{N}f_{j}(x_{ij})=0\).

Iterative procedure: Given the above, we can run an iterative procedure to estimate the functions, \(f_{j}\), as follows:

The Backfitting Algorithm for Additive Models

(Hastie, Tibshirani, Friedman, 2009)

\[1. \text{ Initialize: } \hat{\alpha}= \frac{1}{N}\sum_{i=1}^{N}y_{i},\quad \hat{f}_{j}\equiv 0 \;\;\forall j\]
\[2. \text{ Cycle: } j=1,2,\dots,p,\,1,2,\dots,p,\,\dots\]
\[\hat{f}_{j}\leftarrow \text{Spline}\Big[\big\{y_{i}-\hat{\alpha}-\sum_{k\ne j}\hat{f}_{k}(x_{ik})\big\}_{1}^{N}\Big]\]
\[\hat{f}_{j}\leftarrow \hat{f}_{j}-\frac{1}{N}\sum_{i=1}^{N}\hat{f}_{j}(x_{ij})\]

We repeat step 2 until the functions stabilize (i.e., change less than a prespecified threshold). I should note that, in the above algorithm, we can use other fitting methods instead of splines, different smoothers for different variables, and even functions that include interactions between variables. The overall point is this: we can capture non-linearity while keeping each effect interpretable. Isn’t that wonderful?
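To make the procedure concrete, here is a minimal Python sketch of the backfitting loop. The function name `backfit` and its parameters are my own, and I use scipy's `UnivariateSpline` as the per-variable smoother; a real GAM library would also handle tied x-values, weights, and smoothing-parameter selection.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

def backfit(X, y, n_iter=50, tol=1e-4, smooth=None):
    """Minimal backfitting sketch: one smoothing spline per column of X."""
    N, p = X.shape
    alpha = y.mean()                      # step 1: alpha = mean of the response
    f_hat = np.zeros((N, p))              # fitted function values, initialized at 0
    for _ in range(n_iter):
        f_old = f_hat.copy()
        for j in range(p):                # step 2: cycle over the predictors
            # partial residual: remove alpha and all the other functions
            partial = y - alpha - (f_hat.sum(axis=1) - f_hat[:, j])
            order = np.argsort(X[:, j])   # the spline fit expects increasing x
            spline = UnivariateSpline(X[order, j], partial[order], s=smooth)
            f_hat[:, j] = spline(X[:, j])
            f_hat[:, j] -= f_hat[:, j].mean()  # re-center so each f_j averages zero
        if np.abs(f_hat - f_old).max() < tol:  # stop once the functions stabilize
            break
    return alpha, f_hat

# Example usage on synthetic data:
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.3, size=200)
alpha_hat, f_hat = backfit(X, y)
```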

Next, I will run the GAMs model in Python on a simple example, and soon I will check its performance on predicting assets’ excess returns. But before that, let me give you an overview of the evolution from linear models to GAMs.

Let’s say we have a dependent variable \(y\) and two independent variables, \(X_{1}\) and \(X_{2}\). In the simplest form, we have the standard linear model, in which the independent variables are related to the mean of the dependent variable linearly, as follows:

  • Linear model
\[y\sim\mathcal{N}(\mu,\sigma^{2})\\ \mu = \beta_{0} + \beta_{1}X_{1} + \beta_{2}X_{2}\]

Next, to add non-linearity to our linear models, we can use polynomial terms. We can estimate the coefficients with the same procedure as for the linear model, because the model remains linear in the parameters. As an example, we have:

  • Polynomial
\[y\sim\mathcal{N}(\mu,\sigma^{2})\\ \mu = \beta_{0} + \beta_{1}X_{1} + \beta_{2}X_{2}^{2}\]
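Concretely, such a polynomial term is still linear in the coefficients, so ordinary least squares applies directly. A minimal sketch on placeholder data (the arrays `X1`, `X2`, `y` are synthetic stand-ins):

```python
import numpy as np

# placeholder data; in practice X1, X2, y come from your dataset
rng = np.random.default_rng(0)
X1, X2 = rng.normal(size=100), rng.normal(size=100)
y = 1.0 + 2.0 * X1 + 0.5 * X2 ** 2 + rng.normal(scale=0.1, size=100)

# design matrix with an intercept, X1, and X2^2 -- still linear in the betas
design = np.column_stack([np.ones_like(X1), X1, X2 ** 2])
beta, *_ = np.linalg.lstsq(design, y, rcond=None)
print(beta)  # should recover roughly [1.0, 2.0, 0.5]
```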

Next, we have the Generalized Linear Model (GLM), in which we use a link function (\(g\)) to relate the predictors to the response variable. It is modeled as follows:

  • GLM
\[y\sim\mathcal{N}(\mu,\sigma^{2})\\ \mu = g^{-1}(\beta_{0} + \beta_{1}X_{1} + \beta_{2}X_{2}^{2})\]

And finally, we have Generalized Additive Models (GAMs), in which the predictors are related to the response variable through a link function \(g\), and each independent variable enters through its own smooth function \(f_{j}\) rather than a fixed parametric form.

  • GAMs
\[y\sim\mathcal{N}(\mu,\sigma^{2})\\ \mu = g^{-1}(\alpha_{0} + f_{1}(X_{1}) + f_{2}(X_{2}))\]
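This structure maps directly onto, for example, the pygam library in Python, where each `s(...)` term is one smooth \(f_{j}\). A minimal sketch on placeholder data (the full worked example comes in the next section):

```python
import numpy as np
from pygam import LinearGAM, s  # pip install pygam

# placeholder data: one sinusoidal and one quadratic effect
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(500, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.3, size=500)

# one smooth term per predictor: mu = g^{-1}(alpha_0 + f_1(X_1) + f_2(X_2))
gam = LinearGAM(s(0) + s(1)).fit(X, y)
gam.summary()  # per-term effective degrees of freedom and significance
```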

Next, let us go through a simple example in Python and see how a GAM does its job.