Press "Enter" to skip to content

Day: November 1, 2021

Voronoi Diagrams with R and x11()

Tomaz Kastrun creates a Voronoi diagram:

Yes. Finally, the Voronoi diagrams with the use of x11() function. This diagram is presentation of a plane that is partitioned every time, a user clicks on the canvas of x11. This plane is partitioned into smaller regions that are close to given set of points.

Partitioning into smaller regions or convex polygons happens in such manner that each polygon contains only one generating point and every point in a given polygon is closer to its generating point than to any other.

I had to take a look out of curiosity, and yes, the x11() function does work on Windows as well.

Comments closed

GPU-Accelerated Analysis on Databricks using PyTorch + Huggingface

Srijith Rajamohan walks us through an example of sentiment analysis using the PyTorch and Huggingface libraries on Databricks:

Sentiment analysis is commonly used to analyze the sentiment present within a body of text, which could range from a review, an email or a tweet. Deep learning-based techniques are one of the most popular ways to perform such an analysis. However, these techniques tend to be very computationally intensive and often require the use of GPUs, depending on the architecture and the embeddings used. Huggingface (https://huggingface.co) has put together a framework with the transformers package that makes accessing these embeddings seamless and reproducible. In this work, I illustrate how to perform scalable sentiment analysis by using the Huggingface package within PyTorch and leveraging the ML runtimes and infrastructure on Databricks.

Click through for a description of the process, as well as a link to a notebook you can walk through yourself.

Comments closed

Document Classification in Python

Brendan Tierney performs a bit of document classification with scikit-learn and nltk:

Text mining is a popular topic for exploring what text you have in documents etc. Text mining and NLP can help you discover different patterns in the text like uncovering certain words or phases which are commonly used, to identifying certain patterns and linkages between different texts/documents. Combining this work on Text mining you can use Word Clouds, time-series analysis, etc to discover other aspects and patterns in the text. Check out my previous blog posts (post 1post 2) on performing Text Mining on documents (manifestos from some of the political parties from the last two national government elections in Ireland). These two posts gives you a simple indication of what is possible.

We can build upon these Text Mining examples to include other machine learning algorithms like those for Classification. With Classification we want to predict or label a record or document to have a particular value. With Classification this could involve labeling a document as being positive or negative (movie or book reviews), or determining if a document is for a particular domain such as Technology, Sports, Entertainment, etc

Click through for a walkthrough of this process.

Comments closed

Showing Count of Selected Items in a Slicer

Prathy Kamasani wants to track slicer counts:

In one of the projects, I was working on, I received feedback saying it is hard to understand how many items they have selected in a slicer, and it is not the first time I came across this. It is a valid point, especially when you have quite a few items in a slicer, you use a search bar to look for items, you select a couple, but you were not sure how many were selected.

Read on for a rather clever solution to the problem.

Comments closed

Data Source Name Not Found with Postgres Driver

Rayis Imayev troubleshoots a problem:

A very short blog post, just a reminder to myself, but if you have ever tried to connect to a PostgreSQL database using ODBC interface (I know, it already sounds like a very interesting challenge :- ), then you might have experienced this error message: “ERROR [IM002] [Microsoft][ODBC Driver Manager] Data source name not found and no default driver specified.”

Read on to see the cause of and solution to this problem.

Comments closed