Subset And Apply Problems

Tom Martens explains a class of generic data processing problems:

Subset and Apply means that I have a dataset of some rows where due to some conditions all the rows have to be put into a bucket and then a function has to be applied to each bucket.

The simple problem can be solved by a GROUP BY using T-SQL, the not so simple problem requires that all columns and rows of the dataset have to be retained for further processing, even if these columns are not used to subset or bucket the rows in your dataset.

One quick example of this is running totals of orders for each customer, which Tom answers using T-SQL, R, and Power BI.  Click through for those three solutions.

Related Posts

Modifying XML in T-SQL

Max Vernon takes us through the .modify function: Determining the property syntax when modifying XML values in SQL Server can be time consuming if you don’t work with XML regularly. SQL Server includes a very flexible XML subsystem, called XML_DML, or XML Data Manipulation Language. XML_DML can be used to easily and effectively update XML values in […]

Read More

Considerations when Deleting Lots of Data

Ed Elliott takes us through things to think about before deleting a few million rows from a table: FragmentationFragmentation occurs when we delete from pages, and there is still data surrounding our data. If we have 100 rows and delete every odd row, we would have perfect fragmentation in that we have doubled the size […]

Read More

Categories

January 2017
MTWTFSS
« Dec Feb »
 1
2345678
9101112131415
16171819202122
23242526272829
3031