EigenAnime

Quantitative Engineering Analysis 1: A facial recognition algorithm to match people with similar anime faces.

Picture of portfolio project.
November 2022-December 2022

Quantitative Engineering Analysis 1 is a class focused on teaching students the engineering applications of different linear algebra concepts.

I worked on this project alongside Dokyun Kim and Ellen Sun. For the final, each group created a program or an explanation of concepts taught during the class. Our group dove deeper into eigenvectors and Principal Component Analysis (PCA), specifically as applied to facial recognition, and combined these ideas with our enjoyment of anime. Our goal was to develop an algorithm that, given a photo of a person, matches them with an anime look-alike.

Example training data images and visualizations of training data dimensions.

The data set was carefully curated, with all images facing forward and facial features located in roughly the same area. Visual representations of the original dataset dimensions are shown on the left and the reshaped dimensions on the right.

To do this, we first created a data set of two pictures each of 42 characters, 84 pictures total. The pictures formed a 64 x 64 x 84 (image height x image width x number of images) three-dimensional matrix. In MATLAB, we reshaped this into an 84 x 4096 (number of images x number of pixels) two-dimensional matrix so that we could perform PCA on the dataset.
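Our code was written in MATLAB, but the reshape step can be sketched in NumPy as follows (the array here is random stand-in data, not the actual dataset):

```python
import numpy as np

# Hypothetical stand-in for the real image stack:
# 64 x 64 x 84 (image height x image width x number of images).
images = np.random.rand(64, 64, 84)

# Move the image index to the first axis, then flatten each 64 x 64
# image into a 4096-element row, giving an 84 x 4096 matrix for PCA.
data = images.transpose(2, 0, 1).reshape(84, 64 * 64)
```

Each row of `data` is now one flattened image, which is the layout PCA routines expect (observations as rows, pixels as columns).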

After finding the eigenvectors of the training data, we projected images onto the eigenvectors to compare them against each other. Images were compared by calculating the Euclidean distance between their projections. Intuitively, the smallest distance indicates the most similar images and therefore the same character.
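A minimal NumPy sketch of this pipeline (random placeholder data; `closest_match` is an illustrative helper, not a function from our project): compute the eigenvectors of the centered data via SVD, project every image into eigenspace, and match a query image to the training image with the smallest Euclidean distance.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical flattened data set: 84 images x 4096 pixels.
data = rng.random((84, 4096))

# Center the data; the right singular vectors of the centered matrix
# are the eigenvectors of its covariance matrix.
mean_face = data.mean(axis=0)
centered = data - mean_face
_, _, vt = np.linalg.svd(centered, full_matrices=False)
eigvecs = vt[:32]                # keep the top 32 eigenvectors

# Project every training image into the 32-dimensional eigenspace.
weights = centered @ eigvecs.T   # shape (84, 32)

def closest_match(query, weights, mean_face, eigvecs):
    """Return the index and distance of the nearest training image."""
    w = (query - mean_face) @ eigvecs.T
    dists = np.linalg.norm(weights - w, axis=1)
    return int(np.argmin(dists)), float(dists.min())

idx, dist = closest_match(data[5], weights, mean_face, eigvecs)
```

Querying with an image that is already in the training set returns that image itself at distance zero, which is a quick sanity check on the projection.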

Graph of accuracy versus number of eigenvectors used.

To optimize our facial recognition algorithm, we graphed the model's accuracy against the number of eigenvectors used. Using 32 eigenvectors gave our model 72% accuracy in identifying the same anime character, measured with 2-fold cross validation on a supervised data set. This accuracy was high enough to conclude that the model works well for anime-to-anime recognition.
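The 2-fold evaluation can be sketched as follows. Everything here is illustrative stand-in data: with two pictures per character, each fold holds one picture of every character; we train PCA on one fold, match the other fold's images by nearest Euclidean distance in eigenspace, and average the accuracy over both directions.

```python
import numpy as np

rng = np.random.default_rng(1)
n_chars = 42
# Hypothetical data: each character's two pictures are drawn near a
# shared random "face" so that matching is possible at all.
base = rng.random((n_chars, 4096))
fold_a = base + 0.05 * rng.random((n_chars, 4096))
fold_b = base + 0.05 * rng.random((n_chars, 4096))

def fold_accuracy(train, test, k):
    """Match each test image to its nearest train image in k-dim eigenspace."""
    mean = train.mean(axis=0)
    _, _, vt = np.linalg.svd(train - mean, full_matrices=False)
    basis = vt[:k]
    train_w = (train - mean) @ basis.T
    test_w = (test - mean) @ basis.T
    # Pairwise distances between all test and train projections.
    dists = np.linalg.norm(test_w[:, None, :] - train_w[None, :, :], axis=2)
    preds = np.argmin(dists, axis=1)
    return np.mean(preds == np.arange(len(test)))

# 2-fold cross validation: train on one fold, test on the other, average.
for k in (8, 16, 32):
    acc = (fold_accuracy(fold_a, fold_b, k) + fold_accuracy(fold_b, fold_a, k)) / 2
    print(f"{k} eigenvectors: accuracy {acc:.2f}")
```

Sweeping `k` this way produces the accuracy-versus-eigenvector-count curve described above.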

Graph of distance between anime to anime comparison against human to anime comparison.

Afterwards, we gave our model human faces. However, graphing the smallest Euclidean distances of anime-to-anime comparisons against human-to-anime comparisons made it clear that there was little resemblance between the human and anime pictures: distances were significantly higher in the vast majority of human-to-anime comparisons.

We learned that curating a solid data set and building a model from a small data set are both difficult. Ultimately, we found that matching human faces to anime faces is hard because of the vast difference in facial structure.

GitHub Repository