Machine Learning Studio

Hands-on Machine Learning: Introduction to ML Studio

This section focuses on practical, code-based machine learning within Azure ML Studio. Unlike automated approaches, it walks through how to use ML Studio to write, store, and run your own machine learning pipelines, giving you complete control over your model development process.

You'll learn to leverage three key components: Jupyter notebooks for interactive development, PyTorch for deep learning, and Azure jobs for scalable model training. Each section includes video demonstrations and downloadable examples using publicly available datasets.

Key Components Overview

Jupyter Notebooks

Jupyter notebooks serve as your primary development environment in Azure ML Studio. These web-based documents combine live code, visualisations, and explanatory text in a shareable format that's ideal for research and experimentation.

Azure ML Studio provides pre-configured notebooks with popular machine learning libraries, automatic compute scaling, and integration with Azure data stores. The next section demonstrates creating and running notebooks, including dataset access and compute configuration.

PyTorch Framework

PyTorch is a leading machine learning framework favoured by researchers for its dynamic computational graphs and intuitive Python design. Azure ML Studio provides optimised PyTorch environments with distributed training capabilities across multiple GPUs and compute nodes.

The framework is widely used for deep learning applications such as computer vision, natural language processing, and reinforcement learning.
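To make the idea of a dynamic computational graph concrete, here is a minimal sketch. The function name `forward`, the tensor shapes, and the choice of activations are illustrative assumptions, not part of the course material. It shows ordinary Python control flow deciding which operations are recorded on each call, with autograd still producing gradients.

```python
import torch


def forward(x: torch.Tensor, w: torch.Tensor) -> torch.Tensor:
    """Hypothetical toy model: the branch taken depends on the data."""
    h = x @ w
    # Plain Python `if`: the graph is built as the code runs, so each call
    # may record a different sequence of operations.
    if h.sum() > 0:
        h = torch.relu(h)
    else:
        h = torch.tanh(h)
    return h.sum()


torch.manual_seed(0)
x = torch.randn(4, 3)                      # 4 samples, 3 features (arbitrary)
w = torch.randn(3, 2, requires_grad=True)  # trainable weights

loss = forward(x, w)
loss.backward()                            # gradients flow through whichever branch ran
print(w.grad.shape)
```

Because the graph is rebuilt on every call, you can step through `forward` with a normal Python debugger or print intermediate tensors inside a notebook cell.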

Azure ML Jobs

Azure ML jobs enable cloud-scale training beyond your local machine's limitations. Submit training tasks to powerful compute clusters, specify exactly the resources needed, and leverage distributed training for large models and datasets.

Jobs integrate with Azure's experiment tracking and model versioning, so you can easily track your logs, metrics, parameters, and outputs.
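As a rough sketch of what submitting a job can look like, the example below uses the Azure ML Python SDK v2 (`azure-ai-ml`) together with `azure-identity`. The helper names `build_train_command` and `submit_training_job`, the `./src` folder, the `train.py` script, its `--epochs` and `--learning-rate` flags, and the experiment and display names are all hypothetical placeholders. The workspace details, compute cluster, and environment must come from your own Azure setup.

```python
def build_train_command(script: str, **params) -> str:
    """Turn keyword arguments into a command line, e.g. epochs=10 -> '--epochs 10'."""
    args = " ".join(f"--{key.replace('_', '-')} {value}" for key, value in params.items())
    return f"python {script} {args}".strip()


def submit_training_job(subscription_id: str, resource_group: str,
                        workspace_name: str, compute: str, environment: str):
    """Submit a command job; every argument is a value from your own workspace."""
    # Imported here so the helper above can be used without the Azure SDK installed.
    from azure.ai.ml import MLClient, command
    from azure.identity import DefaultAzureCredential

    ml_client = MLClient(
        credential=DefaultAzureCredential(),
        subscription_id=subscription_id,
        resource_group_name=resource_group,
        workspace_name=workspace_name,
    )
    job = command(
        code="./src",                      # assumed folder containing train.py
        command=build_train_command("train.py", epochs=10, learning_rate=0.001),
        environment=environment,           # name of an environment in your workspace
        compute=compute,                   # name of an existing compute cluster
        experiment_name="wine-quality",    # placeholder experiment name
        display_name="wine-quality-train", # placeholder run name
    )
    return ml_client.jobs.create_or_update(job)


print(build_train_command("train.py", epochs=10, learning_rate=0.001))
```

Keeping the command-line construction in a small pure function makes it easy to check locally before anything is sent to a cluster.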

Dataset Information

Throughout this section, we'll demonstrate all three tools using a single dataset for consistency and clarity. We'll be working with the Wine Quality dataset, a popular tabular dataset available on Kaggle that contains various chemical properties of wines along with quality ratings.

Getting the Data: If you're coding along with the video demonstrations, you can download the dataset from Kaggle or from our GitHub repository (accessible via the sidebar under Resources). All accompanying Jupyter notebooks used throughout the tutorials are stored in the same repository for easy access.
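Once the file is downloaded, a first step in a notebook is usually to load it and separate the chemical properties from the quality rating. The sketch below assumes the target column is named `quality`. The helper `split_features_target` and the three in-memory rows are made-up stand-ins for illustration, not values taken from the real file.

```python
import io

import pandas as pd


def split_features_target(df: pd.DataFrame, target: str = "quality"):
    """Return (features, target) from a tabular dataset."""
    features = df.drop(columns=[target])
    labels = df[target]
    return features, labels


# Made-up stand-in for the downloaded CSV; with the real file you would
# instead call pd.read_csv("<path-to-your-download>.csv").
sample_csv = io.StringIO(
    "alcohol,pH,quality\n"
    "9.4,3.51,5\n"
    "9.8,3.20,5\n"
    "10.5,3.30,6\n"
)
df = pd.read_csv(sample_csv)

X, y = split_features_target(df)
print(X.shape, y.tolist())
```

If your copy of the dataset uses semicolons rather than commas as separators, pass `sep=";"` to `pd.read_csv`.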