Laptop Brand Classifier

Sep 24, 2019

Full code can be accessed at the Github repository

What brand is this laptop?

Based on Lesson 2 of fast.ai’s Deep Learning course, it is possible to scrape images of the internet (particularly Google Images) to build our own classifier, which is actually extremely useful and can be applied to any number of applications.

Here, I chose a really simple problem, to classify laptops based on their brands using images of them. Although it may not seem so simple, since all laptops look similar to a certain extent, the highly efficient Deep Learning models will beg to differ.

This model gets around 83% accuracy, which is a very good result considering how similar laptops from different brands look.

This is the code used to carry out this task:

from fastai.vision import *

After going on Google Images, and searching for whatever images we want (e.g Macbooks), we can insert a simple Javascript command into the browser:

urls = Array.from(document.querySelectorAll(".rg_di .rg_meta")).map(
  (el) => JSON.parse(el.textContent).ou
);
window.open("data:text/csv;charset=utf-8," + escape(urls.join("\n")));

Next, we create the necessary folder and file name for the data to be imported into.

I am using Google’s Colab so all the images will be stored in Google Drive, from which the images are easily accesible.

from google.colab import drive
drive.mount('/content/gdrive', force_remount=True)
root_dir = "/content/gdrive/My Drive/"
base_dir = root_dir + 'fastai-v3'

Go to this URL in a browser: https://accounts.google.com/o/oauth2/auth?client_id=947318989803-6bn6qk8qdgf4n4g3pfee6491hc0brc4i.apps.googleusercontent.com&redirect_uri=urn%3Aietf%3Awg%3Aoauth%3A2.0%3Aoob&scope=email%20https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdocs.test%20https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdrive%20https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdrive.photos.readonly%20https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fpeopleapi.readonly&response_type=code

Enter your authorization code:
··········
Mounted at /content/gdrive

folder = 'macbook'
file = 'macbook.txt'

folder = 'hp'
file = 'hp.txt'

folder = 'lenovo'
file = 'lenovo.txt'

Code has to be run once for every category.

path = Path(base_dir+'/data/images')
dest = path/folder
dest.mkdir(parents=True, exist_ok=True)

path.ls()

[PosixPath('/content/gdrive/My Drive/fastai-v3/data/images/macbook.txt'),
 PosixPath('/content/gdrive/My Drive/fastai-v3/data/images/macbook'),
 PosixPath('/content/gdrive/My Drive/fastai-v3/data/images/lenovo.txt'),
 PosixPath('/content/gdrive/My Drive/fastai-v3/data/images/hp'),
 PosixPath('/content/gdrive/My Drive/fastai-v3/data/images/hp.txt'),
 PosixPath('/content/gdrive/My Drive/fastai-v3/data/images/lenovo'),
 PosixPath('/content/gdrive/My Drive/fastai-v3/data/images/models'),
 PosixPath('/content/gdrive/My Drive/fastai-v3/data/images/cleaned.csv'),
 PosixPath('/content/gdrive/My Drive/fastai-v3/data/images/export.pkl'),
 PosixPath('/content/gdrive/My Drive/fastai-v3/data/images/mactest.jpg')]

Next, the files (txt files with urls of images) has to be uploaded into Drive.

Once that is done, the images can be downloaded into Drive, into the specified folders, from the urls using the download_images function.

download_images(path/file, dest, max_pics=200)

classes = ['macbook','hp','lenovo']

We can remove any images that cannot be opened:

for c in classes:
    print(c)
    verify_images(path/c, delete=True, max_size=500)

Next, we can extract the images from the folders, and seperate them into training and validation sets, using the ImageDataBunch function.

np.random.seed(42)
data = ImageDataBunch.from_folder(path, train=".", valid_pct=0.2,
        ds_tfms=get_transforms(), size=224, num_workers=4).normalize(imagenet_stats)

/usr/local/lib/python3.6/dist-packages/fastai/data_block.py:534: UserWarning: You are labelling your items with CategoryList.
Your valid set contained the following unknown labels, the corresponding items have been discarded.
images
  if getattr(ds, 'warn', False): warn(ds.warn)

Looking at some of the pictures:

data.show_batch(rows=3, figsize=(7,8))

png

data.classes, data.c, len(data.train_ds), len(data.valid_ds)

(['hp', 'lenovo', 'macbook'], 3, 306, 75)

Training the model, using the cnn_learner function:

learn = cnn_learner(data, models.resnet34, metrics=error_rate)

Downloading: "https://download.pytorch.org/models/resnet34-333f7ec4.pth" to /root/.cache/torch/checkpoints/resnet34-333f7ec4.pth
100%|██████████| 87306240/87306240 [00:00<00:00, 162957184.69it/s]

learn.fit_one_cycle(5)

epoch	train_loss	valid_loss	error_rate	time
0	1.305020	0.848843	0.346667	00:54
1	1.121091	0.731948	0.293333	00:06
2	0.956481	0.663035	0.293333	00:05
3	0.809013	0.651194	0.266667	00:05
4	0.718085	0.661706	0.240000	00:05

learn.lr_find(start_lr=1e-5, end_lr=1e-1)

LR Finder is complete, type {learner_name}.recorder.plot() to see the graph.

Interpreting the results:

learn.recorder.plot()

png

learn.fit_one_cycle(2,max_lr=slice(1e-03,1e-02))

epoch	train_loss	valid_loss	error_rate	time
0	0.152411	0.790561	0.173333	00:05
1	0.102961	0.861176	0.186667	00:05

interp = ClassificationInterpretation.from_learner(learn)

interp.plot_confusion_matrix()

png

interp.most_confused(min_val=2)

[('lenovo', 'hp', 4),
 ('hp', 'macbook', 3),
 ('lenovo', 'macbook', 3),
 ('macbook', 'hp', 3)]

Lenovo’s are being mistaken for HP’s 4 times, but the reverse doesn’t seem to happen. Macbooks are the ones that are creating most of the error.

Using an unused picture, and checking if our model can predict what laptop brand it is:

learn.export()

defaults.device = torch.device('cpu')

img = open_image(path/'mactest.jpg')
img

png

learn = load_learner(path)

pred_class,pred_idx,outputs = learn.predict(img)
pred_class

Category macbook

img1  = open_image(path/'hptest.jpg')
img1

png

pred_class,pred_idx,outputs = learn.predict(img1)
pred_class

Category hp

The model is able to predict these new images perfectly as well.

A very simple application to do something pretty complex.

Deep Learning

Vishnu Bharadwaj

Data Scientist

Budding Data Scientist who loves to learn and write.

Laptop Brand Classifier

What brand is this laptop?

Vishnu Bharadwaj

Data Scientist

Related