Training & Testing Text Classification Models with Google Cloud Vertex AI

By leveraging Google’s AutoML feature, classification models can be created with little to no technical effort.

5 min read · Mar 28, 2023

For starters, here are a few general observations:

Many elements of Vertex AI in general, and of AutoML in particular, remind me of HuggingFace🤗 AutoTrain.
AutoML allows for quick prototyping and exploration of datasets by way of a no-code studio approach.
Vertex AI has two text classification options: single-label and multi-label classification. Creating class hierarchies or taxonomies is not possible.
Model training time is relatively long. Traditional NLU models have made great strides in terms of incremental training, which is the notion of appending new data to an existing model rather than retraining from scratch. Added to this, the training time of traditional NLU models has shortened dramatically.

Even though AutoML is designed to streamline the process of creating an ML model, two elements remain highly configurable: data split and annotation sets.

Data split: By default, AutoML randomly assigns each item in your dataset to the training, validation, and test sets in an 80/10/10 ratio respectively. You can change that ratio or even manually assign each data item to a set.
Annotation sets: Annotation sets store annotations so that you can use the same dataset for other models and objectives. For example, you could use this same “happiness” dataset to train a multi-label classification model instead of a single-label one.
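To make the default data split concrete, here is a minimal sketch of a random 80/10/10 assignment in plain Python. It is not how AutoML implements the split internally; the function name split_dataset, the seed, and the placeholder items are all assumptions made for illustration.

```python
import random


def split_dataset(items, ratios=(0.8, 0.1, 0.1), seed=42):
    """Randomly assign items to training, validation and test sets."""
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")
    shuffled = list(items)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * ratios[0])
    n_val = round(len(shuffled) * ratios[1])
    return {
        "training": shuffled[:n_train],
        "validation": shuffled[n_train:n_train + n_val],
        # Whatever remains after the first two slices becomes the test set.
        "test": shuffled[n_train + n_val:],
    }


# Hypothetical stand-in for a labelled text dataset.
items = [f"happy moment {i}" for i in range(100)]
splits = split_dataset(items)
print({name: len(rows) for name, rows in splits.items()})
```

Passing different ratios, for example (0.7, 0.15, 0.15), mirrors changing the ratio in the studio, while manual assignment would amount to building the three lists by hand.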

But I hasten to mention, Vertex AI lacks a bottom-up, data-centric approach to curating and structuring training data.

⭐️ Please follow me on LinkedIn for updates on Conversational AI ⭐️

In a previous post I stepped through the process of creating a dataset and training an ML model.

In the image below, the Vertex AI dashboard is visible. Under recent models, the new model is listed together with its average precision.

Once the model is opened, there is a progression bar at the top of the page. From here the newly created model can be evaluated, deployed, tested and more.

In the image below you can see that the model can be tested with longer input. On the right of the image, the labels are visible, with the enjoy_the_moment label identified.
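Outside the console, the same reading of a test result comes down to picking the label with the highest confidence. The sketch below assumes a prediction shaped as a dictionary with parallel displayNames and confidences lists; that shape, the helper name top_label, and the scores are assumptions for illustration, not output captured from the model in this article.

```python
def top_label(prediction):
    """Return the (label, confidence) pair with the highest confidence."""
    pairs = zip(prediction["displayNames"], prediction["confidences"])
    return max(pairs, key=lambda pair: pair[1])


# Made-up prediction for illustration only; the field names are an
# assumption about the response shape, and the scores are invented.
example_prediction = {
    "displayNames": ["achievement", "affection", "enjoy_the_moment"],
    "confidences": [0.07, 0.11, 0.82],
}

label, confidence = top_label(example_prediction)
print(label, confidence)
```

For a multi-label model the equivalent step would keep every label whose confidence clears a chosen threshold, rather than only the single best one.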

Jumping back to the evaluate tab, a few quick-view graphic indicators are available per model.

Below you see the confusion matrix:
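The console renders this matrix as an image. As a rough illustration of what it contains, the following sketch counts how often each true label was predicted as each label. The function name confusion_matrix and the five example predictions are hypothetical and are not taken from the model in this article.

```python
from collections import Counter


def confusion_matrix(true_labels, predicted_labels):
    """Build a matrix where rows are true labels and columns are predictions."""
    labels = sorted(set(true_labels) | set(predicted_labels))
    counts = Counter(zip(true_labels, predicted_labels))
    matrix = [[counts[(t, p)] for p in labels] for t in labels]
    return labels, matrix


# Invented test items; one "affection" item is misread as "achievement".
true_labels = ["affection", "affection", "achievement",
               "enjoy_the_moment", "achievement"]
predicted_labels = ["affection", "achievement", "achievement",
                    "enjoy_the_moment", "achievement"]

labels, matrix = confusion_matrix(true_labels, predicted_labels)
for name, row in zip(labels, matrix):
    print(f"{name:>18} {row}")
```

Values on the diagonal are correct predictions; anything off the diagonal shows which pairs of labels the model confuses.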

Below that, the trade-off between precision and recall at different confidence thresholds is visible.

➡️ A lower threshold results in higher recall but typically lower precision.

➡️ A higher threshold results in lower recall but typically with higher precision.
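The two statements above can be checked with a small worked example. This sketch computes precision and recall for a single label at three thresholds; the scored items are invented for illustration, and treating precision as 1.0 when nothing is predicted is a convention chosen here, not something the article specifies.

```python
def precision_recall(scored, threshold):
    """Compute (precision, recall) for one label at a confidence threshold.

    scored is a list of (confidence, is_relevant) pairs.
    """
    tp = sum(1 for conf, rel in scored if conf >= threshold and rel)
    fp = sum(1 for conf, rel in scored if conf >= threshold and not rel)
    fn = sum(1 for conf, rel in scored if conf < threshold and rel)
    # Convention assumed here: no predictions at all counts as precision 1.0.
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Invented confidences for one label; True means the label really applies.
scored = [(0.95, True), (0.85, True), (0.70, False),
          (0.60, True), (0.40, False), (0.30, True)]

for threshold in (0.2, 0.5, 0.8):
    precision, recall = precision_recall(scored, threshold)
    print(f"threshold={threshold:.1f} precision={precision:.2f} recall={recall:.2f}")
```

With this data, raising the threshold from 0.2 to 0.8 moves precision from about 0.67 to 1.00 while recall drops from 1.00 to 0.50.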

Read more on threshold, precision and recall here.

In Closing

In an upcoming article I want to consider production deployment of Vertex models.

I’m currently the Chief Evangelist @ HumanFirst. I explore and write about all things at the intersection of AI and language, ranging from LLMs, Chatbots, Voicebots and Development Frameworks to Data-Centric latent spaces and more.

https://www.linkedin.com/in/cobusgreyling

The Cobus Quadrant™ Of NLU Design

NLU design is vital to planning and continuously improving Conversational AI experiences.

cobusgreyling.medium.com

The Cobus Quadrant™ Of Conversation Design Capabilities

∗ This is part one of a two-part series; please also take a look at part two, the Cobus Quadrant of NLU Design.

cobusgreyling.medium.com

Large Language Models, Generative AI & Google Cloud Vertex AI

Google launched Vertex AI on 18 May 2021 at Google I/O and it seems like the product has fared well considering all the…

cobusgreyling.medium.com

Creating Training Data For Text Classification In Google Cloud Vertex AI

In the coming posts I will be doing a few deep dives on Google Vertex AI. This post focusses on data engineering and…

cobusgreyling.medium.com

Foundation Conversational AI Technologies Landscape

And the rapid expansion of Large Language Model (LLM) enablement.

cobusgreyling.medium.com

The Foundation Large Language Model (LLM) & Tooling Landscape

There is an ever growing list of Generative AI Applications, which can be broken down into eight broad categories.

cobusgreyling.medium.com

Large Language Models Are Forcing Conversational AI Frameworks To Look Outward

With fragmentation being forced on frameworks it will become increasingly hard to be self-contained. I also consider…

cobusgreyling.medium.com

Vertex AI | Google Cloud

Send feedback Build, deploy, and scale machine learning (ML) models faster, with fully managed ML tools for any use…

cloud.google.com

HappyDB

A Corpus of 100,000 Crowdsourced Happy Moments

www.kaggle.com

Written by Cobus Greyling