Data Science Tools and Other Announcements from Ignite

In this episode, Microsoft's Corporate Vice President for Cloud Artificial Intelligence, Joseph Sirosh, joins host Kyle Polich to share some of Microsoft's latest and most exciting innovations in AI development platforms. Last month, Microsoft launched a set of three powerful new capabilities in Azure Machine Learning that let advanced developers exploit big data, GPUs, data wrangling and container-based model deployment.

The first new feature, the AML Workbench, is a cross-platform client with AI-powered data wrangling built in, which lets the user transform data with the help of AI. The second new feature is the AML Experimentation service, a cloud service that enables data scientists to track and manage their big data experiments on Spark, GPUs and other cloud compute. The third announcement is the AML Model Management service, which is used to host, version, manage and monitor machine learning models.

Joseph discusses the three features in some detail with Kyle, but for a deep dive into each one, you can visit his blog here.

The AML Workbench is a client application that runs on Windows and Mac and serves as a control panel for your development lifecycle. Building on the latest research in program synthesis (PROSE) and data cleaning, Microsoft created a data wrangling technology that can drastically reduce the time data scientists spend writing code to transform data for machine learning. Program synthesis works by example: you supply an input, the kind of data you want to wrangle, and the output you want it transformed into, so you control the before and after. The program synthesis technique then automatically generates a program for that data transformation, and the program can be run on all of the data to verify that it works. If there are cases where it doesn't work, the user can give more examples, and it will generate a new program that fits them better.
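To make the by-example idea concrete, here is a minimal sketch in Python. It is not PROSE and not the Workbench's implementation: the tiny set of string operations, and the names `PRIMITIVES`, `run` and `synthesize`, are hypothetical and chosen only for illustration. The sketch simply enumerates short chains of operations and returns the first chain that reproduces every input/output example.

```python
from itertools import product

# Hypothetical mini-language of string operations (an assumption for
# illustration; the real PROSE DSL is far richer than this).
PRIMITIVES = {
    "strip": str.strip,
    "lower": str.lower,
    "upper": str.upper,
    "title": str.title,
    "first_word": lambda s: s.split()[0] if s.split() else "",
    "last_word": lambda s: s.split()[-1] if s.split() else "",
}


def run(program, value):
    """Apply each named operation in the program to the value, in order."""
    for name in program:
        value = PRIMITIVES[name](value)
    return value


def synthesize(examples, max_steps=3):
    """Return the first chain of operations consistent with all examples.

    `examples` is a list of (input, expected_output) string pairs.
    Returns a tuple of operation names, or None if no chain of at most
    `max_steps` operations fits every example.
    """
    for length in range(1, max_steps + 1):
        for program in product(PRIMITIVES, repeat=length):
            if all(run(program, inp) == out for inp, out in examples):
                return program
    return None


if __name__ == "__main__":
    examples = [("  joseph SIROSH ", "Sirosh")]
    program = synthesize(examples)
    print("synthesized program:", program)
    # If the program misbehaves on other rows, append more examples
    # to the list and call synthesize() again.
    print(run(program, "kyle POLICH"))
```

Adding another (input, output) pair narrows the set of consistent programs, which mirrors the "give more examples" loop described above.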

The AML Experimentation Service allows machine learning experimentation at any scale, leveraging the cloud. ML experiments can run on a local machine, inside a local Docker container, or on Azure. Through the service, experiments can even scale out on top of Apache Spark on Azure HDInsight clusters. The Experimentation Service supports a variety of open source deep learning frameworks, such as the Microsoft Cognitive Toolkit, TensorFlow, Caffe2, PyTorch and Chainer. Leveraging the Azure Batch AI Training service, each deep learning experiment can utilize hundreds of GPU virtual machines. The new service can track, store and manage models, configurations, parameters and data using Git repositories. These features let users record the run history of every experiment, so they can compare and contrast model runs and review performance under different parameters.
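As a rough illustration of what recording and comparing run history involves, the following Python sketch keeps an in-memory log of runs. It does not use the Azure Machine Learning API: `Run`, `RunHistory`, and their methods are hypothetical names invented for this example, and the parameter and metric values in the usage section are made up.

```python
from dataclasses import dataclass, field


@dataclass
class Run:
    """One experiment run: the parameters used and the metrics observed."""
    run_id: int
    params: dict
    metrics: dict


@dataclass
class RunHistory:
    """Hypothetical in-memory run log (not the AML service itself)."""
    runs: list = field(default_factory=list)

    def log(self, params, metrics):
        """Record a run and return it; ids are assigned sequentially from 1."""
        run = Run(len(self.runs) + 1, dict(params), dict(metrics))
        self.runs.append(run)
        return run

    def best(self, metric, higher_is_better=True):
        """Return the run with the best value for `metric`, or None."""
        candidates = [r for r in self.runs if metric in r.metrics]
        if not candidates:
            return None
        pick = max if higher_is_better else min
        return pick(candidates, key=lambda r: r.metrics[metric])

    def compare(self, metric):
        """List (run_id, params, value) rows sorted by `metric`, highest first."""
        rows = [
            (r.run_id, r.params, r.metrics[metric])
            for r in self.runs
            if metric in r.metrics
        ]
        return sorted(rows, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    history = RunHistory()
    # Illustrative values only.
    history.log({"learning_rate": 0.1}, {"accuracy": 0.81})
    history.log({"learning_rate": 0.01}, {"accuracy": 0.88})
    for row in history.compare("accuracy"):
        print(row)
```

A real service would persist this log, for example alongside the Git-tracked configuration mentioned above, rather than holding it in memory.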

The AML Model Management Service allows data scientists and developers to deploy and manage their trained models locally or on large-scale cluster deployments in the cloud. Models can be containerized with Docker and deployed to network edge devices, allowing them to score closer to the event and in real time. Once in production, models can be monitored for performance using Azure Application Insights and then proactively retrained if data drift or other circumstances begin to degrade performance.
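One simple way to think about a data drift check is sketched below in Python. This is an assumption for illustration, not how the Model Management Service or Application Insights detects drift: it compares the mean of a feature in recent scoring requests against the training baseline and flags retraining when the shift exceeds a threshold. The function names and the default threshold of 3.0 are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def drift_score(baseline, recent):
    """Absolute z-score of the recent mean relative to the baseline.

    `baseline` needs at least two values (for the sample standard
    deviation) and `recent` needs at least one.
    """
    base_mean = mean(baseline)
    base_sd = stdev(baseline)
    recent_mean = mean(recent)
    if base_sd == 0:
        # A constant baseline: any change at all counts as unbounded drift.
        return 0.0 if recent_mean == base_mean else float("inf")
    standard_error = base_sd / sqrt(len(recent))
    return abs(recent_mean - base_mean) / standard_error


def needs_retraining(baseline, recent, threshold=3.0):
    """Flag retraining when the drift score exceeds the (assumed) threshold."""
    return drift_score(baseline, recent) > threshold


if __name__ == "__main__":
    # Illustrative feature values only.
    training_values = [10.0, 11.0, 9.0, 10.5, 9.5, 10.0]
    print(needs_retraining(training_values, [10.2, 9.8, 10.1, 9.9]))
    print(needs_retraining(training_values, [14.0, 15.0, 14.5, 15.5]))
```

A mean-shift test like this only catches one kind of drift; production monitoring would typically also track model accuracy and the full distribution of inputs.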

These three new features together offer end-to-end tools for AI development. A recording of Joseph's session at Microsoft Ignite 2017 is available here.

Other announcements from Ignite 2017 mentioned in the show:

Links and Resources:

Joseph’s blog on AI announcements at Ignite

Microsoft AI Platform

Introduction to Azure Machine Learning

Quick Start Tutorials for Data Preparation, Build a model, Deploy a model, Advanced Data Preparation

Sponsored by Springboard

Check out Springboard's Data Science Bootcamp




Joseph Sirosh

Joseph Sirosh is the Corporate Vice President for the Data Group at Microsoft, where he leads the company’s database, big data, and machine learning products. Joseph holds a PhD in Computer Science from the University of Texas at Austin and a B.Tech. in Computer Science & Engineering from the Indian Institute of Technology Chennai. He is very passionate about machine learning and its applications. One of his missions at Microsoft is to democratize ML technology and make it accessible to everybody.