Day: October 11, 2019

Financial Time Series Analysis in Databricks

Ricardo Portilla shares a demo of financial time series analysis in Databricks:

We’ve shown a merging technique above, so now let’s focus on a standard aggregation, namely Volume-Weighted Average Price (VWAP), which is the average price weighted by volume. This metric is an indicator of the trend and value of the security throughout the day.  The vwap function within our wrapper class (in the attached notebook) shows where the VWAP falls above or below the trading price of the security. In particular, we can now identify the window during which the VWAP (in orange) falls below the trade price, showing that the stock is overbought.

Click through for the article, as well as a notebook you can try out.
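
As a quick illustration of the metric itself, here is a minimal sketch of VWAP in plain Python. It is not the `vwap` function from the notebook's wrapper class (that one runs on Spark), and the `Trade` type and sample trades are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Trade:
    """Hypothetical trade record: execution price and number of shares."""
    price: float
    volume: int


def vwap(trades):
    """Volume-weighted average price: sum(price * volume) / sum(volume)."""
    total_volume = sum(t.volume for t in trades)
    if total_volume == 0:
        raise ValueError("VWAP is undefined when total volume is zero")
    return sum(t.price * t.volume for t in trades) / total_volume


# Made-up trades for one window of the trading day.
trades = [Trade(10.0, 100), Trade(11.0, 300), Trade(12.0, 100)]
print(vwap(trades))  # (1000 + 3300 + 1200) / 500 = 11.0
```

The heavier trade at 11.0 pulls the result toward that price, which is the whole point of weighting by volume rather than taking a simple mean.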

Limiting Index Sizes in Cosmos DB

Hasan Savran explains why you might want to exclude columns from Cosmos DB indexes:

If everything is already indexed, why would we want to exclude some indexes? Indexes are saved on disk, and you pay for that storage in Azure. If you keep indexing everything, your index file gets larger and you pay more for storage.

Also, write operations to the index file take longer if the index file is larger. Keeping only what you need in the index file will improve the latency of write operations. And if you need to change your indexing policies, rebuilding indexes will take less time.

This behavior is quite different from the way SQL Server behaves, where indexing is more of an opt-in philosophy.
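
To make the opt-out model concrete, here is a rough sketch of what an indexing policy with exclusions can look like, written as a Python dictionary that serializes to the policy's JSON shape. The excluded property names (`payload`, `notes`) are hypothetical, and this is not taken from Hasan's post.

```python
import json

# Sketch of a Cosmos DB indexing policy: index everything ("/*"),
# then opt specific paths out. "/payload/*" and "/notes/?" are
# hypothetical property paths chosen for illustration.
indexing_policy = {
    "indexingMode": "consistent",
    "automatic": True,
    "includedPaths": [{"path": "/*"}],
    "excludedPaths": [
        {"path": "/payload/*"},  # the whole subtree under "payload"
        {"path": "/notes/?"},    # just the scalar value at "notes"
    ],
}


def excluded_paths(policy):
    """Return the list of path strings the policy leaves out of the index."""
    return [entry["path"] for entry in policy.get("excludedPaths", [])]


print(json.dumps(indexing_policy, indent=2))
print(excluded_paths(indexing_policy))
```

Anything not listed under the excluded paths still gets indexed, so the policy stays short even when the documents are wide.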

Data Retrieval Bug Fixed for Columnstore Indexes

Dmitri Korotkevitch takes us through an important bugfix in SQL Server:

The typical columnstore table is usually large and contains hundreds of millions or even billions of rows. Think about large fact tables in data warehouses or huge transactional tables in OLTP systems. Those tables are usually partitioned. Besides the usual reasons (availability, maintainability, etc.), partitioning helps with the data load: it is easier to perform ETL in a staging table and import data through a partition switch.

And here comes the problem. If you run an OLTP query against a partitioned clustered columnstore table and end up with an execution plan that uses index intersection of nonclustered B-Tree indexes, you may get incorrect results.

Getting the correct results in a query is pretty important (he says, with understatement), so this is an important bugfix; keep those SQL Server instances patched accordingly.

Column Masking with SSAS and Power BI

Dustin Ryan digs deep into the bag of tricks:

Last week I was asked to tackle a requirement by a customer adopting Analysis Services to enable data exploration and ad hoc analysis by their users. One of their requirements was to secure columns based on a grant related to a cost center. For example, a grant has several attributes, with some attributes being considered “sensitive” and other attributes considered “non-sensitive”. Non-sensitive grant attributes would be accessible to all users, while the subset of attributes in the grant table considered “sensitive” would be accessible only to users related to the corresponding cost center. The challenge here is that while Analysis Services supports column level security, dynamic column level security is not supported. So my colleague and friend, the great Steve Pontello, and I put our heads together to address the requirement.

Read on to see how they did it and the resulting solution’s limitations.

Automatic Power BI Page Rotation

Kasper de Jonge shows how you can make Power BI report pages automatically rotate:

This question comes up pretty regularly: I have a big screen in my hallway and I want to show some Power BI reports that rotate. Now there is some built-in functionality for this in the Windows 10 app for Power BI that you can check out here. But there have been some cases where this doesn’t work, for example if you want your report to run outside of your domain using B2B, or when using Power BI Embedded, neither of which can load the report in the Win 10 app.

For those cases, Kasper has a solution involving an external Chrome add-on.

Overlooked T-SQL Functions

Itzik Ben-Gan covers some underutilized functions and function overloads in T-SQL:

TRIM is more than LTRIM(RTRIM())
SQL Server 2017 introduced support for the function TRIM. Many people, myself included, initially just assume that it’s no more than a simple shortcut to LTRIM(RTRIM(input)). However, if you check the documentation, you realize that it’s actually more powerful than that.

This article is an excellent argument in favor of reading the documentation, as all of it is in there but it’s easy to miss.
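
The extra power in question is the optional character list: the documented syntax is `TRIM([characters FROM] string)`, so `TRIM('.,! ' FROM col)` strips any of those characters from both ends. As a loose analogy only (Python, not T-SQL, and the `trim` helper is a made-up name), `str.strip` with an argument behaves the same way:

```python
def trim(value, characters=" "):
    """Remove any of `characters` from both ends of `value`.

    Mirrors the idea behind T-SQL's TRIM([characters FROM] string);
    with no character list, only spaces are removed.
    """
    return value.strip(characters)


# Same as LTRIM(RTRIM(...)): only leading and trailing spaces go away.
print(repr(trim("   padded   ")))

# With a character list, anything in the set is stripped from both ends,
# while the interior of the string is left alone.
print(repr(trim("     #     test    .", ".,! ")))
```

The second call leaves `'#     test'`: the trailing period and the outer spaces are removed, but the spaces between `#` and `test` survive.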

Date and Time Aggregation in T-SQL

Daniel Hutmacher shows how you can aggregate date and time data types in SQL Server:

Because the “Duration” column is a “time(3)” datatype, it doesn’t really aggregate into an average that easily.

Msg 8117, Level 16, State 1, Line 20
Operand data type time is invalid for avg operator.

This may seem a little odd, as time is really a continuum just like any integer or floating-point value, right?

Read on to see how to do this and a warning from Daniel about overflowing. This sort of aggregation is a lot easier to do in R, but you can still do it in T-SQL.
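
The general idea of treating time as a number can be sketched outside of T-SQL as well. The Python below is a minimal illustration, not Daniel's solution: it converts each time of day to whole milliseconds since midnight, averages the integers, and converts back. The helper names and sample durations are made up.

```python
from datetime import time


def to_ms(t):
    """Milliseconds since midnight, truncated to millisecond precision."""
    return ((t.hour * 60 + t.minute) * 60 + t.second) * 1000 + t.microsecond // 1000


def from_ms(ms):
    """Inverse of to_ms for values in the range [0, 86_400_000)."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return time(hours, minutes, seconds, millis * 1000)


def average_time(values):
    """Average a list of datetime.time values, rounding down to the millisecond."""
    if not values:
        raise ValueError("cannot average an empty list of times")
    total = sum(to_ms(v) for v in values)
    return from_ms(total // len(values))


# Made-up durations: 1m30s, 2m00s, 2m30s.
durations = [time(0, 1, 30), time(0, 2, 0), time(0, 2, 30)]
print(average_time(durations))  # 00:02:00
```

Python integers are arbitrary precision, so the running total here cannot overflow; the overflow warning applies to the T-SQL version, where the intermediate sum has a fixed-size type.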
