Word Counts In DAX – Curated SQL

Philip Seamark shows us a way of splitting strings into words in DAX:

Here is a technique you might consider if you need to split text down to individual words. This could be used to help count, rank or otherwise aggregate the words in some longer text. The approach detailed here uses spaces as a delimiter and will not be tripped up if multiple spaces are used between words.

There is no SPLIT function in DAX, so this approach uses the MID function to help find words.

The PBIX file used for the blog can be downloaded here.

[Updated 14th Oct, 2018]
A slightly updated version that uses UNICHAR/UNICODE to preserve the case (“A” versus “a”) of each letter can be downloaded here. The reason for this is DAX stores a dictionary of unique values for every column. It is the first instance of any value that is added to the dictionary and assigned a new ID. Subsequent values that are considered the same “A” and “a” are considered the same are assigned the same ID. Using the UNICHAR/UNICODE version helps preserve the original case of each letter.

It’s an interesting approach and reminded me a bit of using a tally table to split strings in T-SQL.

M	T	W	T	F	S	S
1	2	3	4	5	6	7
8	9	10	11	12	13	14
15	16	17	18	19	20	21
22	23	24	25	26	27	28
29	30	31