Custom Vision

Object Detection AI Vision

Now that you have uploaded and annotated your training images with bounding boxes, it's time to train your object detection model.

In this section, we will walk through how to train an object detection model using Microsoft Custom Vision. Unlike multi-class and multi-label classification, where the model assigns tags to entire images, object detection goes a step further: the model learns to identify both the presence and the location of multiple objects within the same image. Training such a model requires manually drawing a bounding box around each object of interest and tagging it accordingly. During training, the model learns not only to recognize what each object looks like, but also to localize it within the image by predicting the coordinates of its bounding box. As a result, each prediction returns one or more bounding boxes, each with an associated label and confidence score.
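To make the shape of a prediction concrete, here is a minimal sketch of how the returned boxes, labels, and confidence scores might be handled. It assumes that each box is reported as fractions of the image width and height (left, top, width, height, each between 0 and 1), which is how Custom Vision predictions are generally documented. The `Detection` class and both helper functions are hypothetical names introduced for illustration, not part of any SDK.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One predicted object (hypothetical container, not an SDK type)."""
    tag: str            # label assigned to the object, e.g. "apple"
    probability: float  # confidence score between 0 and 1
    left: float         # box position and size, assumed to be fractions
    top: float          # of the image width and height
    width: float
    height: float


def to_pixel_box(det, image_width, image_height):
    """Convert a normalized box to pixel (x_min, y_min, x_max, y_max)."""
    x_min = round(det.left * image_width)
    y_min = round(det.top * image_height)
    x_max = round((det.left + det.width) * image_width)
    y_max = round((det.top + det.height) * image_height)
    return x_min, y_min, x_max, y_max


def confident_detections(detections, threshold=0.5):
    """Keep only detections whose confidence meets the threshold."""
    return [d for d in detections if d.probability >= threshold]


# Example with made-up values for an 800x600 image.
detections = [
    Detection("apple", 0.9, left=0.25, top=0.5, width=0.5, height=0.25),
    Detection("banana", 0.2, left=0.0, top=0.0, width=0.1, height=0.1),
]
for det in confident_detections(detections):
    print(det.tag, det.probability, to_pixel_box(det, 800, 600))
```

The 0.5 threshold is an arbitrary starting point; in practice it is tuned against the precision and recall you need.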

Start Training Process

Once all images are uploaded and annotated, click Train to begin the training process.

Select Quick Training for this initial run. Training then begins, teaching your model to recognize and locate the different fruit types in images.

Custom Vision training interface - select Quick Training to begin the model training process
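The same step can also be scripted instead of clicked. The sketch below assumes the Python SDK package `azure-cognitiveservices-vision-customvision`, whose `CustomVisionTrainingClient` provides `train_project` and `get_iteration`. Calling `train_project` with default arguments is assumed here to correspond to Quick Training in the portal. The helper name `train_and_wait`, the timeout handling, and the treatment of a `"Failed"` status are illustrative assumptions rather than part of the SDK.

```python
import time


def train_and_wait(trainer, project_id, poll_seconds=10,
                   timeout_seconds=3600, sleep=time.sleep):
    """Start a training run and poll until the iteration finishes.

    `trainer` is expected to behave like CustomVisionTrainingClient:
    it must offer train_project(project_id) and
    get_iteration(project_id, iteration_id), both returning an object
    with `id` and `status` attributes.
    """
    iteration = trainer.train_project(project_id)
    waited = 0
    # "Completed" is the status the SDK quickstart waits for; stopping on
    # "Failed" as well is an assumption made to avoid polling forever.
    while iteration.status not in ("Completed", "Failed"):
        if waited >= timeout_seconds:
            raise TimeoutError("training did not finish in time")
        sleep(poll_seconds)
        waited += poll_seconds
        iteration = trainer.get_iteration(project_id, iteration.id)
    return iteration


# Usage sketch (requires the SDK, an endpoint, and a training key;
# ENDPOINT, TRAINING_KEY and PROJECT_ID are placeholders):
#
# from azure.cognitiveservices.vision.customvision.training import (
#     CustomVisionTrainingClient,
# )
# from msrest.authentication import ApiKeyCredentials
#
# credentials = ApiKeyCredentials(in_headers={"Training-key": TRAINING_KEY})
# trainer = CustomVisionTrainingClient(ENDPOINT, credentials)
# iteration = train_and_wait(trainer, PROJECT_ID)
# print("Training status:", iteration.status)
```

Passing the client in as a parameter keeps the polling logic independent of credentials, so it can be exercised with a stand-in object before running against a real project.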