Custom Vision

Multi-Label Classification AI Vision

The Importance of Background Tags

As mentioned earlier, tagging the background can greatly affect how well your Custom Vision classifier trains. Because we tagged the basket in every image, the model could reliably identify an empty basket as containing no fruit. To demonstrate the improvement gained by training on background-only images (such as the empty basket), we trained two iterations of the model: one without the basket tag and one with it.
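To make the two-iteration setup concrete, here is a minimal sketch of how the tag lists for each iteration could be assembled before upload. It is plain Python and does not call the Custom Vision service. The file names, the fruit names, and the `build_labels` helper are hypothetical; only the `basket` tag comes from the experiment above. The sketch also assumes that an image left with no tags is simply dropped from the iteration without the background tag.

```python
from typing import Dict, List

BACKGROUND_TAG = "basket"  # the background tag used in this article


def build_labels(image_fruits: Dict[str, List[str]],
                 include_background: bool) -> Dict[str, List[str]]:
    """Return the tag list per image for one training iteration.

    image_fruits maps a file name to the fruit tags visible in that image.
    An empty list means a background-only image (the empty basket).
    """
    labels: Dict[str, List[str]] = {}
    for filename, fruits in image_fruits.items():
        tags = list(fruits)  # copy so the input is not modified
        if include_background:
            tags.append(BACKGROUND_TAG)
        if tags:
            labels[filename] = tags
        # Assumption of this sketch: an image with no tags is skipped.
    return labels


# Hypothetical file names and fruit tags, for illustration only.
images = {
    "apple_banana.jpg": ["apple", "banana"],
    "empty_basket.jpg": [],
}

with_background = build_labels(images, include_background=True)
without_background = build_labels(images, include_background=False)

print(with_background)
print(without_background)
```

With the background tag included, the empty basket becomes a usable training example carrying the single tag `basket`. Without it, that image contributes nothing to training in this sketch.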

Empty basket prediction showing high confidence in basket detection with low fruit probabilities

For the iteration trained without the basket tag, we get the following prediction for a fruit basket full of coloured paper:

Prediction results without background tag showing incorrect fruit classifications on paper

With 99.9% confidence, the model predicted a basket in the image and assigned between 0% and 2% confidence to the presence of fruit, in keeping with what is actually in the image.

Conversely, with the model trained on the basket tag, testing the same image produced a very different result:

Prediction results with background tag showing more accurate classifications

The following figure shows model performance side by side, before and after including a background ("basket") tag in the training set:

Heatmap comparison showing effect of background tags on model predictions

Left (Without Background Tag): Misclassifications occur where the model incorrectly predicts fruit in background-only images.
Right (With Background Tag): The model correctly identifies the empty basket and no longer confuses background elements with fruit types.
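The comparison above can also be checked programmatically. The sketch below assumes each prediction has been reduced to a dictionary mapping tag name to probability, and it lists every non-background tag that meets a chosen threshold on a background-only image. The `spurious_tags` helper, the 0.5 threshold, and the fruit names are assumptions made for illustration. The probabilities are placeholders, not measured results: only the 99.9% basket confidence and the 0-2% fruit range echo the figures reported earlier.

```python
from typing import Dict, List


def spurious_tags(predictions: Dict[str, float],
                  background_tag: str = "basket",
                  threshold: float = 0.5) -> List[str]:
    """List non-background tags predicted at or above the threshold.

    For a background-only image, any tag returned here is a false positive.
    """
    return sorted(
        tag for tag, probability in predictions.items()
        if tag != background_tag and probability >= threshold
    )


# Placeholder probabilities for one background-only image (not real results).
without_tag = {"apple": 0.80, "banana": 0.10}
with_tag = {"basket": 0.999, "apple": 0.02, "banana": 0.00}

print(spurious_tags(without_tag))  # ['apple']
print(spurious_tags(with_tag))     # []
```

An empty list for a background-only image means the model did not mistake the background for fruit at that threshold.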

The Take-Home Message
Including background-only examples (like the empty basket) and explicitly tagging them improves the model's ability to distinguish true fruit features from background noise. Background tagging grounds the model: it provides a reference point, helps define the model's decision boundaries more clearly, and keeps it from over-interpreting background elements as fruit.