Machine Learning – Page 22

So, let’s take that out and we are back to old, classical vector algebra. It’s like a person with a bunch of sticks to figure out which one to lay where in a 2-D plane to separate one class of objects from another, provided class definitions are already known.
The problem is which particular shape and length must be chosen to show maximum contrast between classes.
We need to arrive at a function definition, in such a way that the value a given function takes changes drastically (e.g. from a large positive value to a large negative value).

SVM is often great for two-class classification problems, and different variants also work well for multi-class problems.

Comments closed

Fraud Detection with Flink

Published 2020-01-22 by Kevin Feasel

Alexander Fedulov gives us a case study of using Apache Flink for fraud detection:

In this blog post, we have discussed the motivation behind supporting dynamic, runtime changes to a Flink application by looking at a sample use case – a Fraud Detection engine. We have described the overall architecture and interactions between its components as well as provided references for building and running a demo Fraud Detection application in a dockerized setup. We then showed the details of implementing a dynamic data partitioning pattern as the first underlying building block to enable flexible runtime configurations.
To remain focused on describing the core mechanics of the pattern, we kept the complexity of the DSL and the underlying rules engine to a minimum. Going forward, it is easy to imagine adding extensions such as allowing more sophisticated rule definitions, including filtering of certain events, logical rules chaining, and other more advanced functionality.

It was an interesting discussion and you can grab the code as well.

Comments closed

Combining SAS + R for ML

Published 2019-12-10 by Kevin Feasel

Sophia Rowland has a demo of working with SAS Cloud Analytic Service from R:

The Scripting Wrapper for Analytics Transfer, also known as SWAT, is a package that allows R users to access the power of the SAS Cloud Analytic Service (CAS) from a familiar R interface. The SWAT package is available to SAS Visual Analytics (VA), SAS Visual Statistics (VS), and SAS Visual Data Mining and Machine Learning (VDMML) users. To begin working with SWAT, download and install the package from the SAS Software GitHub page.

Read on for the demo.

Comments closed

Time Series Anomaly Detection with Power BI

Published 2019-12-09 by Kevin Feasel

Leila Etaati takes us through time series anomaly detection with Cognitive Services and Power Query:

I am excited about this blog post, this is based on the New service in Cognitive Service name “Anomaly Detection” which is now in Preview.
I recorded a video about how it works in cognitive service https://youtu.be/7ZOtZDbn6gM.
However, I am going to talk about how to use it in Power BI. In this post first, a brief introduction to the anomaly detection will be presented, then how it can be used inside Power BI will be discussed.

It sounds like there are still some rough edges, but they already have the makings of an interesting service.

Comments closed

The A* Search Algorithm

Published 2019-12-05 by Kevin Feasel

Akash Kumar takes us through the A* algorithm and how it works for search:

Path Finding has been one of the oldest and most popular applications in computer programming. You could virtually find the most optimal path from a source to a destination by adding costs which would represent time, money etc. A* is one of the most popular algorithms for all the right reasons. In this article, let’s find out just why.

Click through for an explanation of what the algorithm does and pseudocode to implement it.

Comments closed

Automated ML Pipelines with SAS

Published 2019-12-04 by Kevin Feasel

Sophia Rowland shows off SAS’s auto-ML action:

The dsAutoMl action does it all. It will explore your data, generate features, select features, create models, and autotune the hyper-parameters of those models. This action includes the four policies we have seen in my first two blogs: explorationPolicy, screenPolicy, transformationPolicy, and selectionPolicy. Please review my previous blogs if you need a refresher on the data exploration and cleaning process or feature generation and selection process. The dsAutoMl action builds on our prior discussions through model generation and autotuning. A data scientist can choose to build several models such as decision trees, random forests, gradient boosting models, and neural networks. In addition, the data scientist can control which objective function to optimize for and the number of K-folds to use. The output of the dsAutoMl action includes information about the features generated, information on the model pipelines generated, and an analytic store file for generating the features with new data.

This is an area where several companies are investing a lot of money, trying to simplify the process of training models.

Comments closed

Bot Framework 101 Notes

Published 2019-11-26 by Kevin Feasel

Annie Xu has some notes from an introductory course on the Microsoft Bot framework:

Not long ago, I got a chance to learn a Bot 101 lesson from my teammate Wayne Smith. It was a great class because it helped me who is an new learner to understand a lot of key concepts of Microsoft bot. Because it is in an internal meeting and there is no public video released, I wrote some notes below to share with you.

Click through for Annie’s notes and a bunch of links to additional resources.

Comments closed

Preventing Overfitting in ML Models

Published 2019-11-07 by Kevin Feasel

Tom Jordan gives us four techniques to reduce the likelihood of overfitting in our models:

Dropout
This technique is exclusively used within the training of neural networks, so isn’t applicable to all machine learning models, however can be used in the production of extremely effective neural network models. During the start of each step in the training process, each sub unit of the model, the neuron, has a probability of being included in that step or not. If it doesn’t make the cut, it is effectively deleted from the network for that step, and then reintroduced on the next step.

There are some good techniques here.

Comments closed

Dealing with NULLs in Java with SQL Server 2019

Published 2019-10-30 by Kevin Feasel

Niels Berglund covers changes in SQL Server Machine Learning Services around Java code execution:

In the null values post mentioned above, I mentioned that there are differences between SQL Server and Java in how they handle null. So, when we call into Java from SQL Server, we may want to treat null values the same way as we do in SQL Server.
I wrote about this in the SQL Server 2019 Extensibility Framework & Java – Null Values post mentioned above. However, that post was written before SQL Server 2019 CTP 2.5. In CTP 2.5 Microsoft introduced the Java SDK, and certain things changed. Amongst the things that changed is the way we handle nulls when we receive datasets from SQL Server in our Java code.

Read on to learn how it works today.

Comments closed

MLFlow on Databricks Community Edition

Published 2019-10-18 by Kevin Feasel

Jules Damji and Siddharth Murching have an interesting announcement:

Today, we are excited to extend Databricks Community Edition with hosted MLflow for free, as part of our ongoing commitment to help developers learn about machine learning lifecycle. With the Community Edition, you can try tutorials that demonstrate how to track results and experiments as you build machine learning models—a crucial stage in the machine learning model’s development lifecycle.
MLflow is an open-source platform for the machine learning lifecycle with four components: MLflow Tracking, MLflow Projects, MLflow Models, and MLflow Registry. MLflow is now included in Databricks Community Edition, meaning that you can utilize its Tracking and Model APIs within a notebook or from your laptop just as easily as you would with managed MLflow in Databricks Enterprise Edition.

I like showing off Databricks Community Edition, and I’m glad to see them extend it a bit.

Comments closed

M	T	W	T	F	S	S
	1	2	3	4	5	6
7	8	9	10	11	12	13
14	15	16	17	18	19	20
21	22	23	24	25	26	27
28	29	30	31

Category: Machine Learning

Concepts in Support Vector Machines

Fraud Detection with Flink

Combining SAS + R for ML

Time Series Anomaly Detection with Power BI

The A* Search Algorithm

Automated ML Pipelines with SAS

Bot Framework 101 Notes

Preventing Overfitting in ML Models

Dealing with NULLs in Java with SQL Server 2019

MLFlow on Databricks Community Edition