Image you have a string like so: “10.ROADBIKES.423/01” and would like to extract only “ROADBIKES”.
Power Query actually has a function for this purpose:
Click through to learn what that function is and how it works.
Comments closedA Fine Slice Of SQL Server
Image you have a string like so: “10.ROADBIKES.423/01” and would like to extract only “ROADBIKES”.
Power Query actually has a function for this purpose:
Click through to learn what that function is and how it works.
Comments closedElizabeth Ricks completes a series on designing visuals. Question 4 involves the audience:
When communicating with data, always consider what broader purpose it serves. Too often, we don’t pause and think about the context (understandable in the real-world maelstrom of deadlines, constraints and organizational politics!). Rather, we force the visuals that we used to analyze the data upon our audience, hoping they’ll come to the same conclusion.
Question 5 looks at the text on a visual:
In data visualization, words can be more powerful than we realize. When designing graphs, the non-negotiable text includes explicit axes labels, descriptive chart titles, supporting annotations, and footnotes with relevant methodology and assumptions. Never assume it’s apparent what’s being shown—make it your default to include this important context so your audience doesn’t have to guess what they’re seeing.
Be sure to read both.
Comments closedEd Elliott shows how to get data and convert it into a Spark DataFrame using .NET:
When I first started working with Apache Spark, one of the things I struggled with was that I would have some variable or data in my code that I wanted to work on with Apache Spark. To get the data in a state that Apache Spark can process it involves putting the data into a DataFrame. How do you take some data and get it into a DataFrame?
This post will cover all the ways to get data into a DataFrame in .NET for Apache Spark.
Click through for several methods.
Comments closedKamil Nowinski contrasts two methods for deploying Azure Data Factory pipelines:
Turned out that two-third of people use Microsoft’s deployment way, according to their answers on that poll (including few people who publish the code manually). In 1/3 cases people prefer to deploy directly from code. You may ask: what’re the differences? What characterizes both methods? Which one is better?
Before I start answering these questions, let me present both methods of publishing.
Read on to learn more about these approaches.
Comments closedPeter Schott hits on one of my favorite concepts:
Maybe you’ve worked with data warehouses before, in which case the concept of a “Date Dimension” is going to be familiar. If not, the general idea behind a Calendar or Date table is that you have a table of Dates and metadata about those dates. This can include business-specific flags, alternate Quarter structures, alternate Week Start data, or whatever fits your needs
By pre-populating all of the data about a date in a table, it makes querying for specific date-based criteria a lot easier, especially when your fiscal year isn’t aligned with the calendar year or you need to deal with multiple fiscal years. It also helps with those holidays which are aligned with lunar calendars and thus “change date” every year.
Comments closedAs Synapse engineer or Synapse Support Engineer you may need to start and test some Pools, and you want this to be the most cost efficient possible. Leaving some Synapse with a lot of DWU left turned on during the weekend because you forget to pause the DW after you shutdown your computers is not a good approach and we can quickly resolve this by using Powershell + Automation accounts.
This is also a good introduction to Azure Automation if you aren’t familiar with it.
Comments closedEd Hansberry knows what time it is:
You may need to write a Power BI report that works with Unix Epoch Time, converting either to or from it. Epoch Time is simply the number of seconds that have passed since January 1, 1970, at 12:00 am UTC. So at 1 am of that date, Epoch Time would be 3600, since 3,600 seconds have transpired. Today Epoch Time is around 1610644465, or 1,610,644,465 to make it easier to read, though you will never see the commas in a database.
Converting to and from this time format is deceptively simple, as long as you understand a few functions in Power Query and the concept of Duration.
Read on for a demo.
Comments closedChris Webb lets us know about some new functionality in Power Query:
It’s a new option on the Excel.Workbook function called InferSheetDimensions. Here’s what the docs for the Excel.Workbook function say:
Can be null or a logical (true/false) value indicating whether the area of a worksheet that contains data should be inferred by reading the worksheet itself, rather than by reading the dimensions metadata from the file. This can be useful in cases where the dimensions metadata is incorrect. Note that this option is only supported for Open XML Excel files, not for legacy Excel files. Default: false.
Read on to understand when that might happen.
Comments closedElizabeth Ricks continues a series on data visualization techniques:
Data can be visualized many different ways, creating an often daunting task to select the best chart to use. We also sometimes complicate our audience’s experience, packing too much information into a single graph, in the hopes that it will answer every possible question. Instead of making the Swiss Army Knife of charts, we might instead create different views of the same data, examine what each iteration allows a viewer to see, and then decide which version(s) to share with our audience.
Read on for more.
Comments closedAaron Bertrand has a change of heart:
In my previous tip about SQL Server Extended Events, I discussed the trade-offs involved with choosing to store the database name instead of looking up the name via database_id later. At the time, we decided to use the latter approach since we are generally not concerned about the edge case where DDL is performed against a database that is then dropped before collection occurs.
Something else that has since come up: we want to filter out activity against a set of databases matching a specific name pattern since the DDL activity there is system-generated and does not need to be audited. We quickly learned that you have to be careful with how you use the database name in an Extended Events predicate.
Click through to understand the problem as well as the solution.
Comments closed