Today is Thanksgiving in the United States. To celebrate, Curated SQL will take today and tomorrow off. We’ll be back on Monday with more links to interesting blog posts from across the data platform space.
The role of a transformation in Spark is to create a new dataset from an existing one. Transformations are lazy, meaning they are computed only when an action requires a result to be returned to the driver program.
Transformations are not carried out right away; they execute only when we call an action. Two of the most common transformations are map() and filter().
The resulting RDD is always distinct from the parent RDD after the transformation. It can be smaller (e.g. filter(), distinct(), sample()), bigger (e.g. flatMap(), union(), cartesian()), or the same size (e.g. map()).
Read on to learn more about transformations, including examples of how each works. Even if you’re using the DataFrames API for Spark, it’s still important to understand that transformations are lazy.
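To see that laziness in action, here is a minimal PySpark sketch (the input path is a placeholder, not something from the original post); nothing actually runs until the count() action at the end:

```python
# Minimal sketch of lazy transformations in PySpark; the input path is a placeholder.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("lazy-demo").getOrCreate()
rdd = spark.sparkContext.textFile("data/events.txt")  # placeholder path

# map() and filter() are transformations: they only build up the lineage graph.
lengths = rdd.map(lambda line: len(line))
long_lines = lengths.filter(lambda n: n > 80)

# Nothing has executed yet. count() is an action, so it triggers the whole
# pipeline and returns a result to the driver program.
print(long_lines.count())
```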
Abid Nazir Guroo looks at some endpoints:
Azure Synapse Analytics Representational State Transfer (REST) APIs are secure HTTP service endpoints that support creating and managing Azure Synapse resources using Azure Resource Manager and Azure Synapse web endpoints. This article provides instructions on how to set up and use Synapse REST endpoints and describes the Apache Spark pool operations supported by the REST APIs.
Read on to see which Spark pool management options are available to you via the REST API.
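As a hedged sketch of the kind of call involved (the subscription, resource group, and workspace names are placeholders), listing a workspace's Apache Spark pools through Azure Resource Manager looks roughly like this:

```python
# Hedged sketch: list Apache Spark pools in a Synapse workspace via the ARM REST API.
# Requires the azure-identity and requests packages; all resource names are placeholders.
import requests
from azure.identity import DefaultAzureCredential

token = DefaultAzureCredential().get_token("https://management.azure.com/.default").token
url = (
    "https://management.azure.com/subscriptions/<subscription-id>"
    "/resourceGroups/<resource-group>/providers/Microsoft.Synapse"
    "/workspaces/<workspace-name>/bigDataPools?api-version=2021-06-01"
)
resp = requests.get(url, headers={"Authorization": f"Bearer {token}"})
resp.raise_for_status()

for pool in resp.json().get("value", []):
    print(pool["name"], pool["properties"].get("sparkVersion"))
```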
Robert Cain continues a series on learning KQL:
In the previous article, Fun With KQL – Make_Set and Make_List, we saw how to get a list of items and return them in a JSON array. In this article we’ll see how to break that JSON array into individual rows of data using the mv-expand operator.
Read on to learn more about mv-expand.
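As a hedged illustration of what the operator does (the inline datatable and the public samples cluster below are placeholders, not examples from Robert's article), you can run an mv-expand query from Python with the azure-kusto-data package:

```python
# Hedged sketch: run an mv-expand query with the azure-kusto-data package
# (pip install azure-kusto-data); authentication here goes through the Azure CLI.
from azure.kusto.data import KustoClient, KustoConnectionStringBuilder

# mv-expand breaks each element of a dynamic (JSON) array out into its own row.
query = """
datatable(name: string, tags: dynamic) [
    "server01", dynamic(["prod", "web"]),
    "server02", dynamic(["dev"])
]
| mv-expand tags
"""

kcsb = KustoConnectionStringBuilder.with_az_cli_authentication("https://help.kusto.windows.net")
client = KustoClient(kcsb)
response = client.execute("Samples", query)

for row in response.primary_results[0]:
    print(row["name"], row["tags"])  # three rows: server01/prod, server01/web, server02/dev
```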
Bill Fellows runs into an issue:
Perfect, now I have a row for each second from midnight to approximately 5.5 hours later. What if my duration needs to vary because I’m going to compute these ranges for a number of different scenarios? I should make that 19565 into a variable and let’s overengineer this by making it a bigint.
Things don’t work out quite the way you might have expected there. Read on and see what Bill found and how you can circumvent the problem.
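Without giving away Bill's specific findings: if, as the excerpt suggests, this involves SQL Server 2022's GENERATE_SERIES, one detail worth knowing is that its start and stop arguments must share a data type, so a bigint variable needs a bigint on the other side as well. A hedged pyodbc sketch (the connection string is a placeholder):

```python
# Hedged sketch with pyodbc against SQL Server 2022; the DSN is a placeholder.
# GENERATE_SERIES requires start and stop to have the same data type, so the
# literal 0 is cast to bigint to match the bigint variable.
import pyodbc

sql = """
SET NOCOUNT ON;
DECLARE @duration bigint = 19565;
SELECT DATEADD(SECOND, s.value, CAST('00:00:00' AS time(0))) AS tick
FROM GENERATE_SERIES(CAST(0 AS bigint), @duration) AS s;
"""

with pyodbc.connect("DSN=LocalSqlServer") as conn:  # placeholder connection
    rows = conn.execute(sql).fetchall()
    print(rows[0].tick, "...", rows[-1].tick)  # 00:00:00 ... 05:26:05
```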
Olivier Van Steenlandt continues a series on database projects:
In a previous blog post (Database Projects – Merging changes), we successfully merged our feature branch into our development branch. Now, as a final step in our development process, we want to get our changes deployed to our development environment.
In this blog post, we will go through the process step by step to execute a manual deployment. We will take a look at what happens behind the scenes and how deployment works, and we will also take a look at Publishing Profiles.
Check out that process.
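As a related aside, the same kind of dacpac deployment can also be driven from the command line with SqlPackage; a hedged sketch (the file names are placeholders, and this is not Olivier's exact process):

```python
# Hedged sketch: a dacpac deployment driven from the command line with SqlPackage,
# wrapped in Python's subprocess; all file names are placeholders.
import subprocess

subprocess.run(
    [
        "SqlPackage",
        "/Action:Publish",
        "/SourceFile:bin/Release/MyDatabase.dacpac",  # output of building the database project
        "/Profile:Dev.publish.xml",                   # publishing profile: target server, database, options
    ],
    check=True,  # raise if the deployment fails
)
```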
I am often asked about all kinds of various errors, of course with absolutely no context. I also get asked what error X is or means or says… I don’t remember that stuff off the top of my head. The thing is, you kind of need SQL Server to go look it up and there have been a plethora of times when this wasn’t possible. I’ve also noticed that people tend to give you just the error number and not anything else.
Read on to learn more about what Sean has created, akin to the SQLskills wait stats compendium.
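For reference, when you do have a SQL Server available, that lookup is a query against sys.messages; a hedged pyodbc sketch (the connection string is a placeholder):

```python
# Hedged sketch: look up an error number in sys.messages when a server is handy.
import pyodbc

sql = """
SELECT message_id, severity, text
FROM sys.messages
WHERE message_id = ? AND language_id = 1033;  -- 1033 = English
"""

with pyodbc.connect("DSN=LocalSqlServer") as conn:  # placeholder connection
    row = conn.execute(sql, 8134).fetchone()
    print(row.message_id, row.severity, row.text)  # 8134: divide by zero error encountered
```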
Gilbert Quevauvilliers tries it:
Did you know that there is an easy way to run and extract Power BI REST API data?
The good news is that you can do this directly in your web browser. You don’t have to install or configure anything!
The method below works well if you want to test the API to see what it returns, or if you want to run it to extract some data.
Read on for the process.
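And if you later want the same kind of call scripted rather than run in the browser, a hedged Python sketch using the Get Groups endpoint as the example:

```python
# Hedged sketch: the scripted equivalent of a browser call to the Power BI REST API,
# using azure-identity for an interactive login and the Get Groups endpoint.
import requests
from azure.identity import InteractiveBrowserCredential

token = InteractiveBrowserCredential().get_token(
    "https://analysis.windows.net/powerbi/api/.default"
).token

resp = requests.get(
    "https://api.powerbi.com/v1.0/myorg/groups",  # lists the workspaces you can access
    headers={"Authorization": f"Bearer {token}"},
)
resp.raise_for_status()

for workspace in resp.json()["value"]:
    print(workspace["id"], workspace["name"])
```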
Marc Lelijveld wants to see what’s out there in the wild:
In some scenarios, you may not even have a Power BI Desktop data model. For example, you may have migrated from Analysis Services to Power BI Premium, or you may be dealing with large datasets where the model is developed directly in Visual Studio, Tabular Editor, or any other tool of your preference and deployed over the XMLA endpoint. A similar setup could be that you once enriched your data model using Tabular Editor or ALM Toolkit, with the result that your Power BI Desktop file is no longer the golden version of your data model.
Another scenario could be gaining an overview of partitioning when using incremental refresh. The partitions of Incremental Refresh are only generated in the Power BI Service. So, including this information in your generated documentation is only possible when you connect directly to the Power BI Service.
But what if you still want to show a complete view of your Power BI data model and extract insights using the Power BI Model Documenter? I can tell you: it is possible!
Read on to see what you can do in that case.
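As a hedged sketch of the sort of direct connection involved (this is not the Model Documenter's own code, and the workspace and dataset names are placeholders), you can read the service-generated partitions over the XMLA endpoint with the pyadomd package, which requires the ADOMD.NET client libraries on Windows:

```python
# Hedged sketch: read partition metadata over the Premium XMLA endpoint with pyadomd
# (requires the ADOMD.NET client libraries on Windows); names are placeholders.
from pyadomd import Pyadomd

conn_str = (
    "Provider=MSOLAP;"
    "Data Source=powerbi://api.powerbi.com/v1.0/myorg/My Workspace;"
    "Initial Catalog=My Dataset;"
)

# TMSCHEMA_PARTITIONS is the DMV exposing the partitions that incremental
# refresh generates in the Power BI Service.
with Pyadomd(conn_str) as conn:
    with conn.cursor().execute("SELECT [Name], [TableID] FROM $SYSTEM.TMSCHEMA_PARTITIONS") as cur:
        for name, table_id in cur.fetchall():
            print(table_id, name)
```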
David Wilson needs to open a module:
When you’re testing PowerShell modules, importing, referencing, and removing modules can be a little bit tricky. These are some things that I’ve found to make tests more reliable.
Click through for several tips on how to do this.