Author: Kevin Feasel

What Makes for Good Coding Style

Published 2019-05-03 by Kevin Feasel

Brent Yorgey spends some time thinking about good coding style:

What is good code style? You probably have some opinions about this. In fact, I’m willing to bet you might even have some very strong opinions about this; I know I do. Whether consciously or not, we tend to frame good coding practices as a moral issue. Following good coding practices makes us feel virtuous; ignoring them makes us feel guilty. I can guess that this is why Yom said “I don’t think I could bring myself to be satisfied with partial functions” [emphasis added]. And this is why we say “good code style”, not “optimal” or “rational” or “best practice” code style.

It’s an interesting post. There are some bits on competitive programming which don’t apply in general, but there’s a lot to unpack there.

Comments closed

Singular or Plural Table Names

Published 2019-05-03 by Kevin Feasel

Ed Elliott kicks a hornet’s nest:

There is a lot of confusion when it comes to designing tables in SQL Server around whether to pluralize names or not. How do you choose whether to pluralize or not?
If we want to store a list of people and their details do we use “Person”, “Persons”, “People” or “Peoples”? Some people will use “People” and some will use “Person”, other persons or people would go for “Peoples” or “Persons”.

My preference is singular. In the event that I do pluralize a table, I use a grammatically correct pluralization. None of this “childs” nonsense.

Comments closed

Troubleshooting Database Compatibility Levels

Published 2019-05-03 by Kevin Feasel

Randolph West tells a tale about checking compatibility levels:

In that demo, the AdventureWorks sample database was initially set to compatibility level of 140 (SQL Server 2017 default compatibility) to execute a scalar UDF. At this point, the estimated execution plan showed that the UDF was given a cost of 0%, and performance was terrible (the expected behaviour). Then the database compatibility level was switched to 150 (which is all that’s required to enable this new optimization feature), the query was executed again, the UDF was inlined, and performance improved dramatically.
This is where it got interesting. As a test, the compatibility level of the database was set back to 140, but the query plan continued to inline the UDF. Curious. Flushing the plan cache didn’t change the outcome (even though we knew it wasn’t necessary). Had we discovered a bug in a preview version of SQL Server 2019? It was CTP 2.2 after all, and since then (at the time of this writing) CTP 2.5 is already available.

Read on for the answer.

Comments closed

Data Cleansing Options with Azure

Published 2019-05-03 by Kevin Feasel

James Serra tries to give you an answer of when you should use different Azure services for data cleansing:

Clean the data and optionally aggregate it as it sits in source system. The tool used for this would depend on the source system that stores the data (i.e. if SQL Server, you would use stored procedures). The only benefit with this option is if you aggregate the data, you will move less data from the source system to Azure, which can be helpful if you have a small pipe to Azure and don’t need the row-level details. The disadvantages are: the raw source data is not available in the data lake, so you would always need to go back to source system if you needed to get it again, and it may not even still exist in the source system; you would put extra stress on the source system when doing the cleaning which could affect end users using the system; it could take a long time to clean the data as the source system may not have fast performance; and you would not be able to use other tools (i.e. Hadoop, Databricks) to clean it. Strongly advise against this option

Read on for additional options and James’s recommendations.

Comments closed

UpSet Plots for Set Analysis

Published 2019-05-02 by Kevin Feasel

Laura Ellis digs into the UpSetR package:

UpSet plots have a very cool parameter called queries. Queries can be used to define a subset of the data that you would like to highlight in your graph. The queries property takes in a list of query lists which means that you can pass multiple queries into the same graph. Each query list allows you to set a number of properties about how the query should function.
In this example we are viewing the Cycle and Walk set intersection (query and params). We want the query to be highlighted in a nice pink (color). We want to display the query as a highlighted overlap (active) and we will give it a name that we add to the chart legend (query.name)

I’ve not seen an UpSet plot before but it dumps a lot of information into a relatively small space. I’ll have to spend some time learning more about these plots.

Comments closed

Improving Diagrams

Published 2019-05-02 by Kevin Feasel

Stephanie Evergreen gives us some tips for improving diagrams:

As ubiquitous as diagrams are, they have some significant shortcomings.
Diagrams tend to be most useful as a mental organizing activity for the people who make the diagram. Outsiders coming in fresh have a much harder time seeing how the elements of the diagram fit together and make sense. I think that is generally because diagrams lack enough narrative to explain what is going on. We often use diagrams without explaining them. Or, more precisely, we use diagrams without connecting the diagram pieces to their associated parts of our narrative.

Read on for an example.

Comments closed

The CosmosDB Emulator

Published 2019-05-02 by Kevin Feasel

Hasan Savran has a way to let you play with CosmosDB without dropping any cash on it:

CosmosDB Emulator is a must have tool if you develop applications for Azure CosmosDB. Also, it’s a great tool to have if you like to learn about Azure CosmosDB but you have limited access to Azure for any reason. Azure CosmosDB team constantly works to make all available tools better including the emulator. Currently, emulator supports SQL, Cassandra, MongoDB, Gremlin and Table API. Data Explorer feature supports only SQL API for now. Emulator’simplementation is different than the service and you should not use it for stress testing, you can not test global replication or latency for read and writes.

It’s also useful for testing scenarios where you want some level of integration testing but don’t want to rely on an external service.

Comments closed

Minimal Logging When Inserting into Heaps

Published 2019-05-02 by Kevin Feasel

Paul White gives us the lowdown on minimal logging when performing INSERT..SELECT operations into heap tables:

When inserting rows using INSERT...SELECT into a heap with no nonclustered indexes, the documentation universally states that such inserts will be minimally logged as long as a TABLOCK hint is present. This is reflected in the summary tables included in the Data Loading Performance Guide and the Tiger Team post. The summary rows for heap tables without indexes are the same in both documents (no changes for SQL Server 2016):

But it’s not quite that straightforward, as Paul shows. Read the whole thing.

Comments closed

Reverse Engineering the Key Influencers Visual in Power BI

Published 2019-05-02 by Kevin Feasel

Chris Webb learns how the Key Influencers Power BI visual works, including some interesting undocumented functions:

A fascinating insight into how Power BI works, but is this any practical use to us? Let me be clear: I don’t think you should be using any of these functions yourself in a real-world report. I’m sure all this would be documented and publicised if Microsoft did want us to use it ourselves! Another consideration is that these new functions return tables and that makes them awkward to use in regular .pbix Power BI reports – I guess we could create calculated tables although that’s not as flexible as returning a table from a query as shown above. That said, even though we can’t write our own DAX queries in regular Power BI reports, we can write our own DAX queries in Paginated Reports and we can now create Paginated Reports that use a Power BI dataset as a data source. I tested putting one of the queries generated by the Key Influencers visual into a Paginated Report connected to the same dataset and it worked ok (even after publishing). You can also embed DAX queries connected to a published dataset in Excel too, as I show here. Hmm, plenty to think about then…

Chalk this up as “fun to know but not recommended to use yourself.”

Comments closed

Sort Descending in SSRS

Published 2019-05-02 by Kevin Feasel

Nate Johnson fixes an annoying problem with SSRS:

But you know what’s kinda annoying? You can’t dictate a “first sort direction” — it just assumes that the first time you click the sort-arrows, you want ‘Ascending’ (lowest first). Then you can switch to ‘DEscending’ (highest first). This makes perfect sense for alpha values (strings), but not always for numeric values — at least not when you’re dealing with money, when generally the highest dollar amount is the most important!
So let’s make it real simple for the end-user to get the ‘best’ behavior by default. Let’s try to make it sort by our ‘Revenue’ column in DEscending order first. Ready?

It’s an unintuitive and easy solution to the problem.

Comments closed

M	T	W	T	F	S	S
1	2	3	4	5	6	7
8	9	10	11	12	13	14
15	16	17	18	19	20	21
22	23	24	25	26	27	28
29	30