In this video, I answer a viewer question about how to perform batch processing from the Azure Machine Learning Designer.
This video wraps up my work on Azure ML for now. I’m going to kick off a brand new series starting next week.
Matt Eland tries out the TextAnalytics client:
We’ll talk about each one of these capabilities briefly as we cover the results, but at a high level what we want to do is:
- Perform sentiment analysis to determine if the text is positive, negative, neutral, or mixed.
- Summarize the text using abstractive summarization, which produces new summary text generated by a large language model (LLM).
- Summarize the text using extractive summarization, which pulls out key sentences or parts of sentences to convey the overall meaning.
- Extract key phrases of interest from the text document.
- Perform entity recognition and linked entity recognition to determine the major objects, places, people, and concepts the document discusses.
- Recognize any personally identifiable information (PII) present in the document for potential redaction.
- Analyze the text for healthcare-specific topics such as treatment plans or medications.
Read on to see how a certain passage of text fares.
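Matt’s post works with the .NET client, but for comparison, a minimal sketch of a few of the same operations with the Python azure-ai-textanalytics package might look like the following. The endpoint and key are placeholders, and the summarization and healthcare calls are omitted here.

```python
from azure.core.credentials import AzureKeyCredential
from azure.ai.textanalytics import TextAnalyticsClient

# Placeholders: point these at your own Language resource
endpoint = "https://<your-language-resource>.cognitiveservices.azure.com/"
key = "<your-key>"

client = TextAnalyticsClient(endpoint=endpoint, credential=AzureKeyCredential(key))
documents = ["The conference talk was fantastic, though the Wi-Fi kept dropping."]

# Sentiment analysis: positive / negative / neutral / mixed, with confidence scores
for doc in client.analyze_sentiment(documents):
    if not doc.is_error:
        print(doc.sentiment, doc.confidence_scores)

# Key phrase extraction
for doc in client.extract_key_phrases(documents):
    if not doc.is_error:
        print(doc.key_phrases)

# PII recognition, for potential redaction
for doc in client.recognize_pii_entities(documents):
    if not doc.is_error:
        print([(entity.text, entity.category) for entity in doc.entities])
```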
In this video, I take us through the process of creating a local deployment of an Azure ML managed endpoint. We will cover requirements, why you might want to do this, and common problems you may run into along the way.
This was a fun video to make, especially in anticipating the sorts of problems that come up along the way. I won’t pretend that it’s comprehensive but it does hit several of the most common problems I see (or cause).
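If you want the SDK shape of what the video walks through, the Python azure-ai-ml (v2) pattern is to pass local=True when creating the endpoint and deployment, which runs everything in a local Docker container (so Docker is a prerequisite). This is only a sketch: the model folder, scoring script, conda file, and sample request path are assumed placeholders.

```python
from azure.identity import DefaultAzureCredential
from azure.ai.ml import MLClient
from azure.ai.ml.entities import (
    ManagedOnlineEndpoint, ManagedOnlineDeployment,
    Model, Environment, CodeConfiguration,
)

# Placeholders for your subscription, resource group, and workspace
ml_client = MLClient(DefaultAzureCredential(), "<subscription-id>", "<resource-group>", "<workspace>")

# local=True creates the endpoint in a local Docker container rather than in Azure
endpoint = ManagedOnlineEndpoint(name="local-demo", auth_mode="key")
ml_client.online_endpoints.begin_create_or_update(endpoint, local=True)

deployment = ManagedOnlineDeployment(
    name="blue",
    endpoint_name="local-demo",
    model=Model(path="./model"),                                      # assumed local model folder
    code_configuration=CodeConfiguration(code="./src", scoring_script="score.py"),
    environment=Environment(
        conda_file="./env/conda.yml",                                 # assumed conda spec
        image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest",
    ),
    instance_type="Standard_DS3_v2",
    instance_count=1,
)
ml_client.online_deployments.begin_create_or_update(deployment, local=True)

# Test the local endpoint with a sample payload
print(ml_client.online_endpoints.invoke(
    endpoint_name="local-demo", request_file="./sample-request.json", local=True))
```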
In this video, I show off how easy it is to integrate Azure ML and Power BI, at least once you get past all of the trouble trying to integrate them.
I expected this to be easy. It turns out that making it look easy depends on having several things in place already and using the correct (by which I mean “old”) deployment type.
Matt Eland tries out Semantic Kernel:
Generative AI systems use large language models (LLMs) like OpenAI’s GPT-3.5 Turbo (ChatGPT) or GPT-4 to respond to text prompts from the user. But these systems have a serious limitation: they only include information baked into the model at the time of training. Technologies like retrieval-augmented generation (RAG) help overcome this by pulling in additional information.
AI orchestration frameworks make this possible by tying together LLMs and additional sources of information via RAG. Additionally, AI orchestration systems can provide capabilities to generative AI systems, such as inserting records in a database, sending emails, or calling out to external systems.
In this article we’ll look at the high-level capabilities of building AI orchestration systems in C# with Semantic Kernel, a rapidly maturing open-source AI orchestration framework.
Click through to see how things work.
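The article itself is about Semantic Kernel in C#, but the retrieval step it describes is framework-agnostic. Here is a hand-rolled, illustrative sketch of the RAG idea rather than Semantic Kernel’s API: the embed and call_llm functions are stand-ins for whatever embedding model and chat client an orchestrator would actually use.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Stand-in for a real embedding model; returns fake vectors for illustration."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.normal(size=128)

def call_llm(prompt: str) -> str:
    """Stand-in for a chat-completion call; a real orchestrator would hit an LLM here."""
    return f"[LLM response to a {len(prompt)}-character prompt]"

def retrieve(question: str, passages: list[str], top_k: int = 3) -> list[str]:
    # The "retrieval" in RAG: rank stored passages by cosine similarity to the question
    q = embed(question)

    def score(passage: str) -> float:
        v = embed(passage)
        return float(q @ v / (np.linalg.norm(q) * np.linalg.norm(v)))

    return sorted(passages, key=score, reverse=True)[:top_k]

def answer(question: str, passages: list[str]) -> str:
    # The "augmentation": ground the prompt in retrieved context before generation
    context = "\n".join(retrieve(question, passages))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return call_llm(prompt)
```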
Brendan Tierney continues a series on image classification:
In a previous post, I gave examples of how to label data using OCI Data Labeling. It was a simple approach to labeling images for input to AI Vision: we just gave each image a label to indicate whether it contained a Cat or a Dog. Yes, that’s a very simple approach, but we can build image classification models with it and use the resulting model to predict a label for new images, classifying them as a Cat or a Dog with a degree of certainty. Although this simple approach can give OK-ish results, we typically want a more detailed model and more detailed predictions. For that, we can use Object Detection, which requires us to prepare our dataset in a slightly different way. Yes, it does take a bit more time to prepare the data, or perhaps a lot more time, but this extra effort should (in theory) give us a more accurate model.
This post will focus on creating a new labeled dataset using bounding boxes, and in a later post, we’ll examine the resulting model to see if it gives better or more accurate results.
Read on for the process.
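To give a sense of how the labels change shape, a whole-image classification label is just a class per image, while an object-detection label adds one bounding box per object. The records below are a generic, COCO-style illustration, not OCI Data Labeling’s exact export format.

```python
# Whole-image classification: one label per image
classification_record = {"image": "img_0042.jpg", "label": "Dog"}

# Object detection: one entry per object, each with a bounding box
# (bbox here is [x, y, width, height] in pixels, COCO-style; OCI's export schema differs)
detection_record = {
    "image": "img_0042.jpg",
    "annotations": [
        {"label": "Dog", "bbox": [34, 120, 210, 180]},
        {"label": "Cat", "bbox": [300, 95, 150, 140]},
    ],
}
```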
Pete Warden describes a phenomenon:
The GGML framework is just over a year old, but it has already changed the whole landscape of machine learning. Before GGML, an engineer wanting to run an existing ML model would start with a general-purpose framework like PyTorch, find a data file containing the model architecture and weights, and then figure out the right sequence of calls to load and execute it. Today it’s much more likely that they will pick a model-specific code library like whisper.cpp or llama.cpp, based on GGML.
This isn’t the whole story though, because there are also popular model-specific libraries like llama2.cpp or llama.c that don’t use GGML, so this movement clearly isn’t based on the qualities of just one framework. The best term I’ve been able to come up with to describe these libraries is “disposable”. I know that might sound derogatory, but I don’t mean it like that; I actually think it’s the key to all their virtues! They limit their scope to just a few models, focus on inference or fine-tuning rather than training from scratch, and overall try to do a few things very well. They’re not designed to last forever; as models change, they’re likely to be replaced by newer versions. But they’re very good at what they do.
Pete calls them disposable ML frameworks, though I’d call them single-purpose frameworks to contrast with general-purpose ML frameworks like PyTorch and TensorFlow.
Brendan Tierney separates the cats and the dogs:
In this post, I’ll build on the previous work on preparing data, using that dataset as input to build a Custom AI Vision model. In the previous post, the dataset was labelled into images containing Cats and Dogs. The following steps take you through creating the Custom AI Vision model and testing it using some different images of Cats.
This post is part four of a series (first part, second part, third part) on custom image classification in Oracle.
Paul Brebner gives us a streaming scenario for model training:
One of the goals of incremental learning is to train a model continuously from streaming data. Incremental learning from streaming data means you don’t need all the data in memory at once, and the model is as up-to-date as possible, which can matter for real-time use cases. The third driver for incremental learning that I mentioned in the previous blog is when there is concept drift in the data itself—but we’ll ignore this aspect for the time being.
In the last blog we demonstrated batch training with TensorFlow, and mentioned that TensorFlow, being a neural network framework, has the potential for incremental learning—just like animals and people do. In this blog, we will set ourselves the task of using TensorFlow to demonstrate incremental learning from the same static drone delivery data set of busy/not busy shops that we used in the last blog.
Read on to see the code, results, and warnings.
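As a rough illustration of how this differs from batch training, the Keras train_on_batch method takes one gradient step per arriving mini-batch, so the model updates as data streams in without holding the full dataset in memory. The stream below is simulated with random data rather than the post’s drone delivery dataset.

```python
import numpy as np
import tensorflow as tf

# A small binary classifier (busy / not busy), compiled once up front
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

def simulated_stream(n_batches=200, batch_size=32):
    # Stand-in for a real stream (e.g. a Kafka consumer) yielding small labelled batches
    for _ in range(n_batches):
        x = np.random.rand(batch_size, 4).astype("float32")
        y = (x.sum(axis=1) > 2.0).astype("float32")
        yield x, y

# Incremental learning: one gradient update per mini-batch as each one arrives
for x_batch, y_batch in simulated_stream():
    loss, accuracy = model.train_on_batch(x_batch, y_batch)
```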
Phil Booth takes a look at vector search systems:
Recently I built a system that uses vector search to logically truncate long documents and retain the most significant parts according to some search term. I’m a dummy, with no background in machine learning or mathematics, so there were new concepts for me to understand and implementation details to figure out. This post summarises what I learned.
Vector search and vector databases are becoming a fairly hot topic, so this post at least grounds you in what they are.
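The core trick Phil describes, keeping only the chunks of a document most similar to a search term, can be sketched in a few lines. The embed function below is a placeholder for a real embedding model (for example, a sentence-transformers model or an embeddings API), and the sentence splitting is deliberately naive.

```python
import numpy as np

def embed(texts):
    """Placeholder: a real implementation would call an embedding model."""
    rng = np.random.default_rng(0)
    return rng.normal(size=(len(texts), 384))

def truncate_by_relevance(document: str, query: str, keep: int = 5) -> str:
    # Naive sentence split, then rank sentences by cosine similarity to the query
    sentences = [s.strip() for s in document.split(".") if s.strip()]
    vectors = embed(sentences)
    q = embed([query])[0]
    sims = vectors @ q / (np.linalg.norm(vectors, axis=1) * np.linalg.norm(q))
    # Keep the top-scoring sentences, but return them in their original order
    top = sorted(np.argsort(sims)[-keep:])
    return ". ".join(sentences[i] for i in top) + "."
```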