Mixtures of multivariate Gaussians

Suppose we have a small dataset containing two clusters.
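
The original dataset is not shown here, so as an illustration we can simulate a toy stand-in: two well-separated bivariate Gaussian clusters (the means, standard deviations, and sample size below are arbitrary choices):

set.seed(1)

# Simulate a toy two-cluster dataset (hypothetical stand-in for the
# dataset used in this vignette): two well-separated bivariate Gaussians
data <- rbind(
    matrix(rnorm(100, mean = -2, sd = 0.5), ncol = 2),
    matrix(rnorm(100, mean =  2, sd = 0.5), ncol = 2)
)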

Then we can use the function vimix to fit a mixture of multivariate Gaussian distributions:
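
A minimal sketch of the call, assuming the function ships in a package of the same name; K = 10 is an arbitrary upper bound on the number of components, since the variational algorithm can leave superfluous clusters empty:

library(vimix)

# Fit a variational mixture of multivariate Gaussians;
# K is an upper bound on the number of components
fit <- vimix(data, K = 10)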

Let’s see what our clusters look like:
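
One way to visualise the result, assuming the fitted object exposes the inferred cluster assignments in a field such as fit$labels (a hypothetical name; the actual field in vimix may differ):

library(ggplot2)

# Colour the data points by their inferred cluster assignment
# (fit$labels is a hypothetical field name for the assignments)
plot_df <- data.frame(x = data[, 1], y = data[, 2],
                      cluster = factor(fit$labels))
ggplot(plot_df, aes(x = x, y = y, colour = cluster)) + geom_point()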

The lower bound increases at each iteration of the algorithm and can be used to check whether it has converged. Another useful quantity to monitor at each iteration is the number of non-empty clusters.
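
The ELBO trace is stored in fit$L (the same field used in the code below), so the monotone increase can be checked directly; the per-iteration count of non-empty clusters would be inspected the same way, if the fitted object exposes it:

# Plot the variational lower bound (ELBO) at each iteration;
# it should increase monotonically until convergence
plot(fit$L, type = "b", xlab = "Iteration", ylab = "Lower bound")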

Finally, it is important to note that the algorithm doesn’t always converge to the same solution: depending on the initialization, it converges to different local optima. However, by running the algorithm multiple times, we can see that in this simple case the optimal solution is almost always reached.

maxK <- 10
n_random_starts <- 30

# Final ELBO for each value of K and each random initialization
ELBO <- matrix(0, maxK - 1, n_random_starts)
for (k in 2:maxK) {
    for (j in 1:n_random_starts) {
        output <- vimix(data, K = k)
        ELBO[k - 1, j] <- output$L[length(output$L)]
    }
}

library(reshape)
library(ggplot2)

# Reshape to long format: one row per (random start, K) pair
ELBO <- melt(t(ELBO))
names(ELBO) <- c('start_n', 'K', 'ELBO')
ELBO$K <- ELBO$K + 1  # row k-1 of the matrix corresponds to K = k

ggplot(ELBO, aes(x = K, y = ELBO)) + geom_jitter(width = 0.1, height = 0)

Now, let’s use the same mixture model as above, but with the additional assumption that all the variables are independent, i.e. that the covariance matrix of each component is diagonal.
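
This corresponds to the indep argument used in the code further below; a minimal sketch:

# Same model, but with diagonal covariance matrices (independent variables)
fit_indep <- vimix(data, K = 10, indep = TRUE)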

We can again check the clustering solution:
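
As before, a sketch assuming a labels-style field on the fitted object (a hypothetical name):

# Colour the data points by the assignments from the independent-variables fit
plot_df <- data.frame(x = data[, 1], y = data[, 2],
                      cluster = factor(fit_indep$labels))  # hypothetical field
ggplot(plot_df, aes(x = x, y = y, colour = cluster)) + geom_point()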

and the lower bound and number of non-empty clusters at each iteration:
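
For the lower bound, for example (the non-empty cluster count would be read from the fitted object in the same way, if exposed):

# ELBO trace for the independent-variables fit
plot(fit_indep$L, type = "b", xlab = "Iteration", ylab = "Lower bound")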

As before, there is no guarantee that the algorithm converges to the global optimum; in practice, however, we observe that it usually does.

maxK <- 10
n_random_starts <- 30

# Final ELBO for each value of K and each random initialization,
# now under the independence assumption
ELBO <- matrix(0, maxK - 1, n_random_starts)
for (k in 2:maxK) {
    for (j in 1:n_random_starts) {
        output <- vimix(data, K = k, indep = TRUE)
        ELBO[k - 1, j] <- output$L[length(output$L)]
    }
}

# Reshape to long format, as above
ELBO <- melt(t(ELBO))
names(ELBO) <- c('start_n', 'K', 'ELBO')
ELBO$K <- ELBO$K + 1

ggplot(ELBO, aes(x = K, y = ELBO)) + geom_jitter(width = 0.1, height = 0)