Neel Bhatt

Tech blogger(https://neelbhatt.com), Top 4% overall on StackOverflow, Senior Software Engineer

First look at Azure Machine Learning: Azure Machine Learning part II

Published Nov 20, 2017. Last updated May 18, 2018.

In my last post, I explained the basics of Machine Learning and the development life cycle of a Machine Learning project.

In this post, I will explain some frequent issues in Machine Learning development and how you can overcome them using Azure Machine Learning. I will also walk through some basic data cleansing tasks in Azure Machine Learning.

One of the biggest problems with Machine learning development:

In the Machine Learning workflow, there is sometimes friction in the handover between the data scientist and operations.

The models that data scientists develop are often recoded by operations before deployment, which introduces translation errors:

ml8

As a result, the data scientist loses visibility into how the model performs.

How can Azure Machine Learning solve this big problem?

With Azure Machine Learning, the workflow improves considerably because the operations engineer can encapsulate the model instead of recoding it, which reduces the noise in the system:

ml9

Additionally, Azure Machine Learning makes experiments more efficient by reducing the time needed to prepare the data as well as by simplifying the evaluation of experiments.

What are the components of Azure Machine Learning?

It is an Azure service that consists of libraries such as the Microsoft ML Spark libraries and tools such as Azure Machine Learning Workbench. These work together with IDEs like Visual Studio Code, PyCharm and Jupyter, and with third-party libraries like TensorFlow, TLC and CNTK.

You can train as well as deploy using Docker on Azure compute such as HDInsight, VMs, GPUs and Azure Container Service, as well as on IoT devices.

Below is the complete picture of the things I explained above:

ml10

Apart from this, Azure Machine Learning helps data scientists in the following ways:

You can reuse your existing Python and R scripts
Easy configuration for modeling and deployment
Easy-to-use graphical interface
No need to set up anything: it is ready to use, and you are not limited by local computing resources
Azure Marketplace to utilize existing models or publish/monetize your new models
Built-in Algorithms:

Azure-Machine-Learning-Models

Can it help developers as well?

Yes, it can:

Very helpful existing ML APIs that you can use
ML models can easily be used in day-to-day applications
It brings prediction capabilities to the masses and makes them available to non-experts
Predictive models can be used to interpret the huge amount of data that results from the Internet of Things (IoT)

How to get into Azure Machine Learning?

To get started with Studio, go to https://studio.azureml.net. If you’ve signed into Machine Learning Studio before, click Sign In. Otherwise, click Sign up here and choose between free and paid options.

Sign in to Machine Learning Studio
Now let us take a quick example of data cleansing on a large data set in Azure Machine Learning.

For example, our task is to find out whether an image contains a snow leopard or not. We have a bunch of images, and we need to find the ones that have a snow leopard in them.

One of those images is shown below:

ml11

To load the data:

Create a new experiment by clicking +NEW at the bottom of the Machine Learning Studio window, select EXPERIMENT, and then select Blank Experiment:

ml21

Now, we have the metadata for a bunch of images, which contains the long image path along with a couple of timestamps. We will load it into Azure Machine Learning:

ml12

As you can see in the above image, each image has a unique image number. So our first task is to extract those unique numbers from the long paths.

With Azure Machine Learning, this can be done in just a few clicks.

We will use the Derive Column by Example feature for this.

ml13

Now just give the image number for only a couple of images, and the system will learn the pattern and perform the task for the rest of the paths on its own. It is a kind of supervised learning:

ml14

Once the process is done:

ml15

As you can see, it learned to extract the image number for the rest of the images, even though some of the file names contain parentheses.

Now let us see another example of a data cleansing task.

In the above metadata, the image paths fall mainly into two folders, one named otherImages and another named snowLeopardImages:
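To make it clearer what kind of rule the tool ends up learning, here is a minimal Python sketch of the same extraction done by hand. The sample paths and the `IMG_<number>` naming scheme are my assumptions for illustration only (the real metadata is only visible in the screenshots), and this is not the code that Azure Machine Learning generates:

```python
import re
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical paths; the second one has a "(2)" copy suffix in parentheses.
sample_paths = [
    "data/snowLeopardImages/IMG_0042.jpg",
    "data/otherImages/IMG_0107 (2).jpg",
]

# Assumed naming scheme: the image number is the run of digits after "IMG_".
IMAGE_NUMBER = re.compile(r"IMG_(\d+)")


def image_number(path: str) -> Optional[str]:
    """Return the image number found in the file name, or None if there is none."""
    file_name = PurePosixPath(path).name
    match = IMAGE_NUMBER.search(file_name)
    return match.group(1) if match else None


numbers = [image_number(p) for p in sample_paths]
print(numbers)  # ['0042', '0107']
```

The number is kept as a string so that leading zeros survive. With Derive Column by Example you get a similar result without writing the pattern yourself.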

ml16

So we will assign 0 to otherImages and 1 to snowLeopardImages, using Derive Column by Example again:

ml17

ml18

You may need to supply the 0 and 1 examples more than once (around 5 times at most), because with more examples the machine learns that it should put 0 against all otherImages and 1 against all snowLeopardImages.

Once the process is over, we can see the count of 0s and 1s by clicking on Value count as shown below:
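For comparison, the labelling rule itself is simple to write down in Python. This is only a sketch under the assumption that the folder name appears as a component of each path; the function name `snow_leopard_label` and the sample paths are hypothetical:

```python
from pathlib import PurePosixPath


def snow_leopard_label(path: str) -> int:
    """Return 1 for paths under snowLeopardImages and 0 for paths under otherImages."""
    parts = PurePosixPath(path).parts
    if "snowLeopardImages" in parts:
        return 1
    if "otherImages" in parts:
        return 0
    # Fail loudly instead of guessing a label for an unexpected folder.
    raise ValueError(f"unrecognised folder in path: {path}")


# Hypothetical paths, for illustration only.
print(snow_leopard_label("data/snowLeopardImages/IMG_0042.jpg"))  # 1
print(snow_leopard_label("data/otherImages/IMG_0107.jpg"))        # 0
```

Raising an error for unknown folders is a design choice of this sketch: a silently mislabelled image would be much harder to spot later in training.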

ml19

The window below shows that we have 2864 images without a snow leopard and around 800 images with one:

ml20
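If you want to double-check such a count outside the tool, the same idea can be sketched with Python's `collections.Counter`. The `labels` list below is made-up toy data, not the real snow leopard data set:

```python
from collections import Counter

# Toy labels: 0 = no snow leopard, 1 = snow leopard (invented for illustration).
labels = [0, 0, 1, 0, 1, 0]

counts = Counter(labels)
print(counts[0], counts[1])  # 4 2
```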

With very few clicks, we can do data cleansing tasks.

Microsoft Research has many pre-existing libraries, but we can also use other open-source and third-party libraries.

In the next post, we will see how we can integrate Python code into Azure Machine Learning to improve the accuracy and the deployment of the same snow leopard model.

Have a look here for more similar posts.

Hope it helps.

