Pretrained Models for Machine Learning You Should Know

Aman Kharwal
5 min read · Jul 31, 2024

Pretrained models are Machine Learning models that have been previously trained on a large dataset to solve a specific task. These models are then used or fine-tuned for other related tasks without the need to train them from scratch. So, if you want to know about the pretrained models used in the industry, this article is for you. In this article, I’ll take you through the pretrained models you should know for a career in Machine Learning.

In Machine Learning, pretrained models are primarily used for transfer learning. Transfer learning is a concept where the knowledge gained while solving one problem is applied to a different but related problem. This approach leverages the features learned by the pretrained model to improve performance on the new task.
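
Here's a minimal sketch of this pattern with Keras (the base model, head size, and number of classes are just placeholders): a pretrained VGG16, covered later in this article, is frozen and reused as a feature extractor, and only a small new classification head is trained on your own data.

from tensorflow.keras.applications import VGG16
from tensorflow.keras import layers, models

# load the pretrained convolutional base without its ImageNet classifier
base_model = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
base_model.trainable = False  # freeze the learned features

# add a small task-specific head on top of the frozen base
model = models.Sequential([
    base_model,
    layers.GlobalAveragePooling2D(),
    layers.Dense(128, activation='relu'),
    layers.Dense(10, activation='softmax')  # placeholder number of classes
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
# model.fit(train_images, train_labels, epochs=5)  # train the head on your own dataset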

Here are some pretrained models you should know for a career in Machine Learning:

  1. BERT (Bidirectional Encoder Representations from Transformers)
  2. GPT-3 (Generative Pre-trained Transformer 3)
  3. VGG16 (Visual Geometry Group)
  4. ResNet (Residual Networks)
  5. Word2Vec
  6. YOLO (You Only Look Once)

Let’s go through all these pretrained models in detail.

BERT (Bidirectional Encoder Representations from Transformers)

BERT is a transformer-based model developed by Google. It is designed to understand the context of words in a sentence by looking at both the preceding and following words. This makes it highly effective for various NLP tasks like question answering, sentiment analysis, and named entity recognition.

You can use BERT for any NLP task that requires understanding the context and semantics of text. It is especially useful for tasks where the meaning of a word depends on its context within the sentence. Here’s an example of using BERT with Python for encoding a piece of text:

from transformers import BertTokenizer, BertModel
import torch

# load pre-trained model and tokenizer
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')

# tokenize input text
text = "Here is some text to encode"
encoded_input = tokenizer(text, return_tensors='pt')

# perform forward pass to get hidden states
with torch.no_grad():
    output = model(**encoded_input)

print(output.last_hidden_state)
tensor([[[-0.0549,  0.1053, -0.1065,  ..., -0.3551,  0.0686,  0.6506],
         [-0.5759, -0.3650, -0.1383,  ..., -0.6782,  0.2092, -0.1639],
         [-0.1641, -0.5597,  0.0150,  ..., -0.1603, -0.1345,  0.6216],
         ...,
         [ 0.2448,  0.1254,  0.1587,  ..., -0.2749, -0.1163,  0.8809],
         [ 0.0481,  0.4950, -0.2827,  ..., -0.6097, -0.1212,  0.2527],
         [ 0.9046,  0.2137, -0.5897,  ...,  0.3040, -0.6172, -0.1950]]])

You can find an example of text classification with BERT from here.
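
If you just want to see the shape of such a pipeline, here's a minimal sketch using Hugging Face's BertForSequenceClassification. Note that the classification head is randomly initialized, so the probabilities are meaningless until you fine-tune the model on labelled data:

from transformers import BertTokenizer, BertForSequenceClassification
import torch

# load BERT with a (randomly initialized) classification head on top
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)

# tokenize a sentence and run it through the model
inputs = tokenizer("I really enjoyed this movie!", return_tensors='pt')
with torch.no_grad():
    logits = model(**inputs).logits

# convert logits to class probabilities
probs = torch.softmax(logits, dim=-1)
print(probs)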

GPT-3 (Generative Pre-trained Transformer 3) and its Advancements

GPT-3 is an autoregressive language model developed by OpenAI. It can generate human-like text based on a given prompt. It is capable of performing tasks such as translation, question answering, and text completion without task-specific training.

You can use GPT models for generating text, completing sentences, creating conversational agents, and any task where generating human-like text is required.

To use GPT models, you need access to the OpenAI API. You can sign up for the API from here. You can find examples of using GPT-3 and its advancements here.
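
As a rough sketch of what such a call looks like with the openai Python package (version 1.x), assuming your API key is set in the OPENAI_API_KEY environment variable (the model name below is just an example):

from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

# ask a GPT model to complete a prompt (model name is just an example)
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Explain transfer learning in one sentence."}],
)
print(response.choices[0].message.content)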

VGG16 (Visual Geometry Group)

VGG16 is a convolutional neural network model known for its depth and simplicity. It has 16 layers and is widely used for image classification tasks. It achieved top results on the ImageNet dataset.

You can use VGG16 for image classification, object detection, and image feature extraction tasks. Here’s an example of using this model to predict the probability of a particular object existing in the image (I am using this image as input here):

from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.vgg16 import preprocess_input, decode_predictions
import numpy as np

# Load pre-trained VGG16 model + higher level layers
model = VGG16(weights='imagenet')

# Load and preprocess the image
img_path = 'elephant.jpg'
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

# Predict the probabilities
preds = model.predict(x)
print('Predicted:', decode_predictions(preds, top=3)[0])
Predicted: [('n02504013', 'Indian_elephant', 0.6579379), ('n01695060', 'Komodo_dragon', 0.19563328), ('n02504458', 'African_elephant', 0.08337321)]
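
The same model also works as a feature extractor. Here's a minimal sketch, assuming the same elephant.jpg image, that drops the classification layers (include_top=False) and returns a compact embedding you can feed into your own classifier:

from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
from tensorflow.keras.preprocessing import image
import numpy as np

# load only the convolutional base, with global average pooling over the feature maps
feature_extractor = VGG16(weights='imagenet', include_top=False, pooling='avg')

img = image.load_img('elephant.jpg', target_size=(224, 224))
x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))

features = feature_extractor.predict(x)
print(features.shape)  # (1, 512): one 512-dimensional embedding per image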

ResNet (Residual Networks)

ResNet is a convolutional neural network with a residual learning framework that allows the training of very deep networks. It addresses the vanishing gradient problem by using shortcut (skip) connections.

You can use ResNet for image classification, object detection, and other computer vision tasks that require deep neural networks. Here’s an example of using this model to predict the probability of a particular object in the image (I am using this image as input here):

from tensorflow.keras.applications.resnet50 import ResNet50
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.resnet50 import preprocess_input, decode_predictions
import numpy as np

# Load pre-trained ResNet50 model + higher level layers
model = ResNet50(weights='imagenet')

# Load and preprocess the image
img_path = 'elephant.jpg'
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

# Predict the probabilities
preds = model.predict(x)
print('Predicted:', decode_predictions(preds, top=3)[0])
Predicted: [('n02504013', 'Indian_elephant', 0.93069756), ('n02504458', 'African_elephant', 0.049199447), ('n01871265', 'tusker', 0.019575601)]

Word2Vec

Word2Vec is a group of models used to generate word embeddings, which capture the context of words in a vector space. Developed by Google, it uses shallow neural networks to create dense vector representations of words.

You can use Word2Vec for tasks requiring word embeddings such as semantic similarity, clustering, and classification. Here’s an example of using this model for generating word embeddings and finding similar words:

from gensim.models import Word2Vec
from gensim.test.utils import common_texts

# train a Word2Vec model
model = Word2Vec(sentences=common_texts, vector_size=100, window=5, min_count=1, workers=4)

# access vector for a word
vector = model.wv['computer']

# find similar words
similar_words = model.wv.most_similar('computer')
print(similar_words)
[('system', 0.21617139875888824), ('survey', 0.04468922317028046), ('interface', 0.015203381888568401), ('time', 0.0019510635174810886), ('trees', -0.03284316882491112), ('human', -0.07424270361661911), ('response', -0.09317591041326523), ('graph', -0.09575342386960983), ('eps', -0.10513808578252792), ('user', -0.16911619901657104)]
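
The example above trains a tiny Word2Vec model from scratch on gensim's toy corpus. To use actual pretrained embeddings, you can load the Google News vectors through gensim's downloader; here's a sketch (this downloads roughly 1.6 GB of vectors on first use):

import gensim.downloader as api

# download and load the pretrained Google News Word2Vec vectors (300 dimensions)
wv = api.load('word2vec-google-news-300')

print(wv.most_similar('computer', topn=3))
print(wv.similarity('king', 'queen'))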

You can find a detailed example of Word2Vec from here.

YOLO (You Only Look Once)

YOLO is a state-of-the-art, real-time object detection system. It frames object detection as a single regression problem, straight from image pixels to bounding box coordinates and class probabilities.

You can use YOLO for real-time object detection tasks, such as autonomous driving, surveillance systems, and any application requiring fast and accurate object detection.

You can find an example of object detection with YOLO from here.
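
Here's a minimal sketch of running a pretrained YOLO model, assuming the ultralytics package (pip install ultralytics), one popular YOLO implementation, and its pretrained YOLOv8 weights:

from ultralytics import YOLO

# load pretrained YOLOv8 weights (downloaded automatically on first use)
model = YOLO('yolov8n.pt')

# run detection on an image and print each detected box
results = model('elephant.jpg')
for box in results[0].boxes:
    label = model.names[int(box.cls)]
    print(label, float(box.conf), box.xyxy.tolist())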

Summary

So, here are some pretrained models you should know for a career in Machine Learning:

  1. BERT (Bidirectional Encoder Representations from Transformers)
  2. GPT-3 (Generative Pre-trained Transformer 3)
  3. VGG16 (Visual Geometry Group)
  4. ResNet (Residual Networks)
  5. Word2Vec
  6. YOLO (You Only Look Once)

I hope you liked this article on pretrained models you should know for Machine Learning. Feel free to ask valuable questions in the comments section below. You can follow me on Instagram for many more resources.
