
Month: November 2020

Finding Unused Columns in Power BI Data Models

Matt Allington wants to trim the fat:

I have a saying in Power BI. Load every column you need, and nothing that you don’t need. The reason for this advice is that columns can make your data model bigger and less performant. You will of course need some columns in your data model for different purposes. Some are used for defining measures and some are used for slicing, dicing and summarising your data in the various visuals. But it is very common for people to load everything from the source, meaning that some of the columns are likely to be loaded but not used. Once the data model is ready and the reporting is done, it can be beneficial to remove the columns that are not being used and are not likely to be used for ad hoc reporting in the near future. The question is – how do you find the columns not being used? This is where Imke’s Power BI Cleaner tool comes in; I will show you how to use it below.

Read on for Seven Minute Abs for your Power BI data model.


The DevOps Learning Curve

Grant Fritchey gives us the low-down on learning about DevOps:

If you’re attempting to implement automation in and around your deployments, you’re going to find there is quite a steep learning curve for DevOps and DevOps-style implementations. Since adopting a DevOps-style release cycle does, at least in theory, speed your ability to deliver better code safely, why would it be hard?

Click through for an idea of how to get started, including tools to use and some first steps.


Creating Jupyter Books in Azure Data Studio

Drew Skwiers-Koballa takes us through creating and deploying Jupyter Books:

The notebook experience in Azure Data Studio allows users to create and share documents containing live code, execution results, and narrative text. Potential usage includes data cleaning and transformation, statistical modeling, troubleshooting guides, data visualization, and machine learning. Jupyter books compile a collection of notebooks into a richer experience with more structure and a table of contents.  In Azure Data Studio we are able not only to use Jupyter books but also create and share them. Learn the basics of notebooks in Azure Data Studio from the documentation and read on to learn how to leverage a GitHub Action to publish and share remote Jupyter books.

Click through for the process of creating, opening, and distributing Jupyter Books.


Visualizing Analysis Services Tasks with the Job Graph

Chris Webb is excited:

More details about it, and how it can be used, are in the samples here:

https://github.com/microsoft/Analysis-Services/tree/master/ASJobGraphEvents

The data returned by the Job Graph event isn't intelligible if you look at the text it returns in Profiler. However, if you save a .trc file with Job Graph event data to XML, you can use the Python scripts in the GitHub repo to generate DGML diagrams that can be viewed in Visual Studio, plus Gantt charts embedded in HTML. Of course, to do this you'll need to have Python installed; you'll also need to have Visual Studio and its DGML editor installed (see here for details).

Read on to see how it looks and Chris’s thoughts on the matter.


Error Handling in R

Adi Sarid compares a few methods for error handling in R:

Error catching can be hard to catch at times (no pun intended). If you’re not used to error handling, this short post might help you do it elegantly.

There are many posts about error handling in R (and in fact the examples in the purrr package documentation are not bad either). In this sense, this post is not original.

However, I do demonstrate two approaches: the base-R approach (tryCatch) and the purrr approach (safely and possibly). The post contains a concise summary of the two methods, with a very simple example.

Read the whole thing. H/T R-Bloggers.


Import Files From SharePoint Into Power Query

Imke Feldmann solves a problem:

When you use the UI to import files from SharePoint, you'll end up with the SharePoint.Files function. This function can become fairly or super slow when you use it on large SharePoint sites. This is due to the fact that it will retrieve metadata for ALL files that lie on the site, meaning the root site whose URL you have to enter as the function argument. So I've developed a better way for file import from SharePoint.

Click through for the solution and how to use it. Imke reports 2X query performance when reading SharePoint data, so it's worth checking out.


Using the READPAST Query Hint

Rajendra Gupta looks at a lesser-used query hint:

If we specify the READPAST hint in the SQL queries, the database engine ignores the rows locked by other transactions while reading data. Suppose you have a transaction that has locked a few rows in a table in order to update the information in those rows. Now, if another user starts a transaction and specifies the READPAST query hint, the query engine ignores these locked rows and returns the remaining rows satisfying the data requirement of the query. It might return incorrect data as well.

There are some very limited uses for this hint, though they are out there.
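
To make the behavior concrete, here is a minimal two-session sketch. The dbo.Orders table and its columns are made up for illustration rather than taken from the article.

-- Session 1: lock a row by updating it inside an open transaction.
BEGIN TRANSACTION;
UPDATE dbo.Orders
SET OrderStatus = 'Processing'
WHERE OrderID = 1;
-- (leave the transaction open so the row stays locked)

-- Session 2: READPAST skips the locked row instead of waiting on it,
-- so OrderID = 1 is simply absent from the result set.
SELECT OrderID, OrderStatus
FROM dbo.Orders WITH (READPAST);

-- Session 1, afterwards: release the lock.
COMMIT TRANSACTION;

That silent omission is why the hint only fits scenarios where skipping locked rows is acceptable, such as queue-processing tables.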


Linked Server Data Does Not Match Expected Data Length

Jack Vamvas looks at a linked server error:

I've found an issue we have with BI Connector over MongoDB. I'm extracting data from MongoDB to SQL Server – using the MongoDB ODBC Data Source Configuration and Linked Server. Basically, the BI Connector fails when the string in the column gets too long. When running the job that brings the production workflow data into a data warehouse application, we get the following error.

Msg 7347, Level 16, State 1, Line 1

OLE DB provider ‘MSDASQL’ for linked server ‘MongoDB_PROD_mydb’ returned data that does not match expected data length for column ‘[MSDASQL].mywork.mytasks.mytaskInfo’. The (maximum) expected data length is 5332, while the returned data length is 5970.

Completion time: 2020-11-05T17:14:10.2206504+00:00

Read on for one solution.


Target Groups in Elastic Jobs

Reitse Eskens shares some more information about elastic jobs in Azure:

In one of my previous blogs, I wrote about how to create an elastic job agent when you need the SQL Agent functionality on Azure. You can read that one here.

This morning, I needed a job to update the stats on a database, but on just one database within the “instance” on Azure. But my first group contained all the databases, the Ola Hallengren script isn’t available on all of them, and the credential I’m using to execute the jobs doesn’t have access to every database.

Read on to learn how Reitse solved the problem.
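
As a rough sketch of what a narrower target group can look like, here is some hedged T-SQL using the jobs.sp_add_target_group and jobs.sp_add_target_group_member procedures in the job database. The group, server, and database names are placeholders, and this is not necessarily the exact fix Reitse describes.

-- Create a target group scoped to a single database instead of the whole server.
EXEC jobs.sp_add_target_group
     @target_group_name = N'StatsUpdateGroup';

-- Add only the database that has the maintenance procedures
-- and that the job credential can reach.
EXEC jobs.sp_add_target_group_member
     @target_group_name = N'StatsUpdateGroup',
     @target_type = N'SqlDatabase',
     @server_name = N'myserver.database.windows.net',
     @database_name = N'MyTargetDatabase';

A job step pointed at this target group then runs against that single database only.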


Query Folding in Direct Query Plus Import Scenarios

Marc Lelijveld violates Betteridge’s Law of Headlines:

Recently, I ran into a case where we wanted to combine Direct Query sources with imported data. Well, this functionality is actually called composite models nowadays.

Having said that, let me start by managing some expectations for this post. The title says composite models, but I have to admit, it is not about the soon-to-be-released composite models feature where you can combine multiple Power BI datasets. Instead, it is about combining Direct Query and Import mode in one single dataset.

In this blog post I will describe my use case, why I considered using some sources in Direct Query mode, and how this combines with Row-Level Security and Query Folding.

Read on for the test.
