### Hierarchical Clustering

In a previous post I discussed k-means clustering, which is an unsupervised learning method. Today I want to add another tool to our modeling kit by discussing hierarchical clustering methods and their implementation in R. As in the k-means clustering post, I will look at the problem of clustering countries based on macro data. …

### Regularized Neural Network in R

In my last post I discussed the importance of using a regularized classifier to avoid overfitting the data. I used contour plots to show the distortions in the decision boundaries of a logistic regression classifier on artificial data. Recently I have been going through Nigel Lewis’ “Build Your Own Neural Network Today!” and decided to …

### Shaving a Classifier with Occam’s Razor

I recently went through a Coursera course on Classification taught by Carlos Guestrin from the University of Washington and thought it was excellent. There was an interesting discussion on model overfitting that I thought I would share. In previous posts I discussed linear models with shrinkage parameters such as ridge and lasso regression models. Similar approach …

### Minimum Spanning Trees

My posts have been sparse as I adjust to being back home. Today’s post will be about a somewhat gimmicky approach to visualizing correlation in the markets. A while back I came across an interesting article by Resovsky et al. on minimum spanning trees. I do not have a background in graph theory but I have …

### Classification Trees

In today’s post I want to describe classification trees. I will concentrate on the Classification and Regression Tree (CART) algorithm. As I discuss the main features of this algorithm, I will contrast it with other widely used methods for constructing classification trees. What A Classification Tree Looks Like: Before diving into the algorithm let’s have a …

### Lasso model example (LME’s Aluminium Futures Price)

In my previous post I showed a coordinate descent algorithm for solving for the Lasso coefficients. The Lasso model is part of a family of penalized regression models that are popular in machine learning and predictive modeling. In today’s post I want to show you how this model can be used to estimate the monthly average price of …

### On US PMI

I have spent a month running around Toronto and Tokyo, so I did not get a chance to post earlier on the nasty PMI print coming out of the US. I decided to comment now so I can tie it to my earlier post on ridge regression modeling. There is not much positive to say about …

### K-Means Clustering in Excel

In this post I want to present a very popular clustering algorithm used in machine learning. The k-means algorithm is an unsupervised algorithm that allocates unlabeled data into a preselected number, K, of clusters. A stylized example is presented below to help with the exposition. Let’s say we have 256 observations which are plotted below. …
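
For readers who want a feel for the mechanics before opening the post, here is a minimal sketch of the standard k-means loop (often called Lloyd's algorithm) in plain Python. It is not the Excel implementation from the post. The function name `kmeans`, its parameters, and the two-dimensional tuple representation of observations are all assumptions made for illustration.

```python
import random


def kmeans(points, k, iters=100, seed=0):
    """Illustrative k-means on a list of (x, y) tuples.

    Returns (centroids, labels). Names and defaults are hypothetical.
    """
    rng = random.Random(seed)
    # Start from k observations picked at random as the initial centroids.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)

    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid
        # (squared Euclidean distance).
        new_labels = [
            min(
                range(k),
                key=lambda j: (p[0] - centroids[j][0]) ** 2
                + (p[1] - centroids[j][1]) ** 2,
            )
            for p in points
        ]

        # Update step: move each centroid to the mean of its members.
        # An empty cluster keeps its previous centroid.
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:
                centroids[j] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )

        # Stop once the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels

    return centroids, labels


if __name__ == "__main__":
    # Made-up data: two small, well-separated groups.
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centers, assignment = kmeans(data, k=2)
    print(centers)
    print(assignment)
```

The result depends on the random starting centroids, so in practice the loop is usually run from several starting points and the solution with the lowest within-cluster sum of squares is kept.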