# Classifying Artwork

Determining the style of a particular piece of artwork can be quite challenging, especially when no other information about the piece is available. A group of friends and I were curious whether we could use a probabilistic approach to train a model that learns and classifies particular styles of art. For example, impressionist pieces might have more vibrant colors and more intense brush strokes, whereas contemporary pieces might contain more geometric patterns.

## Data Collection

The first step was to obtain a large dataset of images. The next step was to format these images into feature vectors that we could train a model on.

### Scraping

Data was scraped using BeautifulSoup4 from the J. Paul Getty Museum (Getty) collection and the Museum of Modern Art (MoMA) collection. Obtaining images from the Getty collection was a bit cumbersome because some pieces of art did not have the details we needed. We had to scrape the web page of each art piece before we could tell whether the image came with the necessary information, so this process was quite slow.
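The per-page check described above could look roughly like the sketch below. The markup, the class names, and the list of required fields are all hypothetical stand-ins, since the actual structure of the Getty pages and the exact details we required are not reproduced here.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the real Getty pages are structured differently.
SAMPLE_PAGE = """
<html><body>
  <img class="artwork" src="https://example.org/artwork.jpg">
  <dl>
    <dt>Artist</dt><dd>Example Artist</dd>
    <dt>Style</dt><dd>Example Style</dd>
  </dl>
</body></html>
"""

REQUIRED_FIELDS = ("Artist", "Style")  # assumed set of required details


def parse_artwork_page(html):
    """Return a metadata dict, or None if the image or a required detail is missing."""
    soup = BeautifulSoup(html, "html.parser")
    image = soup.find("img", class_="artwork")
    if image is None or not image.get("src"):
        return None
    # Pair each <dt> label with the <dd> value that follows it.
    details = {}
    for dt in soup.find_all("dt"):
        dd = dt.find_next_sibling("dd")
        if dd is not None:
            details[dt.get_text(strip=True)] = dd.get_text(strip=True)
    if any(field not in details for field in REQUIRED_FIELDS):
        return None
    details["image_url"] = image["src"]
    return details


print(parse_artwork_page(SAMPLE_PAGE))
```

Because a page has to be downloaded and parsed before `parse_artwork_page` can reject it, every unusable piece still costs a full request, which is where the slowness comes from.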

The MoMA dataset was a lot easier to deal with: the MoMA conveniently provides a GitHub repository with a JSON file containing the metadata of all of the artwork in its collection. Because of this, we could painlessly skip the images that we already knew lacked the information needed for training.
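As a rough illustration of that filtering step, the sketch below keeps only the records in which every required field is present and non-empty. The key names (`Title`, `Classification`, `ThumbnailURL`) and the inline sample records are assumptions made for the example, not a statement of which fields we actually required.

```python
import json

REQUIRED_KEYS = ("Title", "Classification", "ThumbnailURL")  # assumed field names


def usable_records(records, required_keys=REQUIRED_KEYS):
    """Keep only records in which every required key has a non-empty value."""
    return [r for r in records if all(r.get(k) for k in required_keys)]


# Tiny inline stand-in for the metadata file; real use would json.load the file.
sample_json = """[
  {"Title": "Untitled A", "Classification": "Painting", "ThumbnailURL": "https://example.org/a.jpg"},
  {"Title": "Untitled B", "Classification": "Painting", "ThumbnailURL": null},
  {"Title": "Untitled C", "ThumbnailURL": "https://example.org/c.jpg"}
]"""

records = json.loads(sample_json)
kept = usable_records(records)
print(len(kept), "of", len(records), "records kept")  # 1 of 3 records kept
```

Filtering on metadata first means that only the images behind the surviving records ever need to be downloaded.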

### Image Processing

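Each piece of art was center-cropped into a square and processed as a matrix of its pixels, with each element containing some RGB$(r, g, b)$ value. For example, the $i$th image, cropped to $m \times m$ pixels, could be represented as follows.

$$
\begin{bmatrix}
\text{RGB}(r_{11}, g_{11}, b_{11}) & \cdots & \text{RGB}(r_{1m}, g_{1m}, b_{1m}) \\
\vdots & \ddots & \vdots \\
\text{RGB}(r_{m1}, g_{m1}, b_{m1}) & \cdots & \text{RGB}(r_{mm}, g_{mm}, b_{mm})
\end{bmatrix}
$$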

The RGB matrix corresponding to this $i$th image would then be unraveled to construct that image’s feature vector, $\mathbf{x}_i$.

Thus, we can represent the feature vectors of our training set as the matrix denoted as $\mathbf{X}$, and the true labels of our training set as the row vector denoted as $\mathbf{y}$.
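A minimal NumPy sketch of this pipeline is shown below: center-crop, unravel, and stack. It works on synthetic arrays rather than real images, the function names and labels are placeholders, and it assumes every crop ends up with the same side length (real images would need resizing to a common size first, a step not covered here).

```python
import numpy as np


def center_crop_square(image):
    """Crop an (H, W, 3) array to its central (S, S, 3) square, S = min(H, W)."""
    height, width = image.shape[:2]
    side = min(height, width)
    top = (height - side) // 2
    left = (width - side) // 2
    return image[top:top + side, left:left + side]


def to_feature_vector(image):
    """Unravel an (S, S, 3) RGB array into a 1-D vector of length 3 * S * S."""
    return image.reshape(-1)


# Two synthetic "images" of equal size, so both crops have the same side length.
rng = np.random.default_rng(0)
images = [rng.integers(0, 256, size=(6, 4, 3), dtype=np.uint8) for _ in range(2)]
labels = ["impressionist", "contemporary"]  # placeholder labels

# One row of X per image; y holds the matching labels.
X = np.stack([to_feature_vector(center_crop_square(img)) for img in images])
y = np.array(labels)
print(X.shape, y.shape)  # (2, 48) (2,)
```

Each 6 x 4 image is cropped to 4 x 4 pixels, so every feature vector has 4 * 4 * 3 = 48 entries.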

## Training

For the training portion of the task, we tried two types of models: support vector machines (SVMs) and artificial neural networks (ANNs). scikit-learn was used for the implementation of the SVMs and TensorFlow was used for the implementation of the ANNs.

### Support Vector Machines

We can use the formula for a kernel SVM as follows.

$$
\hat{y} = \operatorname{sign}\left( w_0 + \sum_{i=1}^{n} w_i y_i K(\mathbf{x}_i, \hat{\mathbf{x}}) \right)
$$

where $\hat{y}$ is the predicted label for feature vector $\hat{\mathbf{x}}$, $w_0$ is the bias term, and $w_i$, $y_i$, and $\mathbf{x}_i$ are the weight, true label, and feature vector of the $i$th of the $n$ data points in the training data. $K$ is simply the kernel function. We experimented with linear, polynomial, and radial basis function (RBF) kernels on our dataset.

| Linear | Polynomial | RBF |
| --- | --- | --- |
| ${\mathbf{x}_i}^\top \hat{\mathbf{x}}$ | $\left({\mathbf{x}_i}^\top \hat{\mathbf{x}}\right)^c$ | $\exp\left( -\frac{\lVert \mathbf{x}_i - \hat{\mathbf{x}} \rVert^2}{2\sigma^2} \right)$ |

Because the feature vectors used to represent particular pieces of artwork are unlikely to be linearly separable, and most likely not separable by a degree-$k$ polynomial until $k$ is large, we anticipated that the linear and polynomial kernels would not perform particularly well.
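To make the three kernels concrete, here is a small NumPy sketch of each one, together with the standard binary kernel SVM decision rule, $\operatorname{sign}\left(w_0 + \sum_i w_i y_i K(\mathbf{x}_i, \hat{\mathbf{x}})\right)$. The weights in the example are made up by hand rather than learned, the default values of `c` and `sigma` are arbitrary, and the RBF kernel uses the usual squared distance. This is only an illustration; the actual models were fit with scikit-learn.

```python
import numpy as np


def linear_kernel(x_i, x_hat):
    return float(x_i @ x_hat)


def polynomial_kernel(x_i, x_hat, c=2):
    return float(x_i @ x_hat) ** c


def rbf_kernel(x_i, x_hat, sigma=1.0):
    diff = x_i - x_hat
    # diff @ diff is the squared Euclidean distance between the two vectors.
    return float(np.exp(-(diff @ diff) / (2.0 * sigma ** 2)))


def predict(x_hat, X, y, w, w0, kernel):
    """Binary prediction: sign(w0 + sum_i w_i * y_i * K(x_i, x_hat))."""
    score = w0 + sum(w_i * y_i * kernel(x_i, x_hat) for w_i, y_i, x_i in zip(w, y, X))
    return 1 if score >= 0 else -1


# Two training points with labels +1 and -1, and hand-picked (not learned) weights.
X = np.array([[1.0, 0.0], [0.0, 1.0]])
y = np.array([1, -1])
w = np.array([1.0, 1.0])

x_new = np.array([1.0, 0.2])
print(predict(x_new, X, y, w, 0.0, linear_kernel))  # 1
print(predict(x_new, X, y, w, 0.0, rbf_kernel))     # 1
```

In both cases `x_new` is more similar to the first training point than to the second, so the positive term dominates the sum and the prediction is `1`.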

| Kernel SVM | Accuracy | F1 Score |
| --- | --- | --- |
| Linear | 24.8% | 24.6% |
| Polynomial | 30.3% | 29.2% |
| RBF | 35.9% | 35.2% |
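A comparison like this can be set up in scikit-learn along the following lines. This sketch runs on a small synthetic dataset from `make_classification`, not on our artwork features, so its numbers have nothing to do with the table above. The hyperparameters are library defaults, and macro-averaged F1 is an assumption about how the F1 score was aggregated across classes.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic three-class data standing in for the image feature vectors.
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=10, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

results = {}
for kernel in ("linear", "poly", "rbf"):
    model = SVC(kernel=kernel).fit(X_train, y_train)
    predictions = model.predict(X_test)
    results[kernel] = (
        accuracy_score(y_test, predictions),
        f1_score(y_test, predictions, average="macro"),  # assumed averaging
    )

for kernel, (accuracy, f1) in results.items():
    print(f"{kernel:>6}: accuracy={accuracy:.3f}, macro-F1={f1:.3f}")
```

Holding out a stratified test split keeps the class proportions the same in both sets, which matters when comparing accuracy and F1 across kernels.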

### Artificial Neural Networks

Next, we tried building neural networks to train using our data.

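To compute the $j$th node at the $l$th hidden layer of our neural network, represented as $h_j^{(l)}$, we can simply take a weighted sum over the nodes of the previous layer.

$$
h_j^{(l)} = \sum_{i=1}^{d} w_{ij}^{(l-1)} h_i^{(l-1)}
$$

Here, $d$ represents the number of features of each data point, and $w_{ij}^{(l-1)}$ represents the weight of the edge connecting the $i$th node of the $(l-1)$th layer to the $j$th node of the $l$th layer. For the first hidden layer, $h_i^{(0)}$ is simply the $i$th feature of the input.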

Then, we can compute the activation at that node using the rectified linear unit (ReLU) function, represented below.

$$
\text{ReLU}\left(h_j^{(l)}\right) = \max\left(0, h_j^{(l)}\right)
$$
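As a small worked example, the NumPy sketch below computes one layer's nodes and their ReLU activations. It assumes the pre-activation is a plain weighted sum of the previous layer's outputs with no bias term, and the input and weight values are made up for illustration. It is not the TensorFlow code we trained with.

```python
import numpy as np


def relu(h):
    """Element-wise max(0, h)."""
    return np.maximum(0.0, h)


def dense_forward(inputs, weights):
    """Forward pass for one layer.

    inputs:  shape (d,), the outputs of the previous layer
    weights: shape (d, m), where weights[i, j] is the weight on the edge
             from node i of the previous layer to node j of this layer
    Returns the pre-activations h and the activations relu(h).
    """
    h = inputs @ weights
    return h, relu(h)


x = np.array([1.0, -2.0, 0.5])
W = np.array([[0.5, -1.0],
              [0.25, 0.5],
              [1.0, 2.0]])

h, a = dense_forward(x, W)
# h = [0.5, -1.0] and a = [0.5, 0.0]: the negative node is clipped to zero.
print(h, a)
```

Stacking several such layers, each feeding its activations into the next, gives the forward pass of an ordinary fully connected network.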

As our ordinary neural network did not perform too well, we’re currently experimenting with convolutional neural networks (CNNs). Stay tuned…