Trying out the Databricks For-Each Task

Chen Hirsh goes in a loop:

Databricks recently added a for-each task to their workflow capability. Workflows are Databricks jobs, like Data factory pipelines, or SQL server jobs, a pipeline that you can schedule, that include a number of tasks that together complete some business logic.

Theoretically, the long-awaited for-each task should make the run of multiple processes easier. for example, one of the things I often do is run a list of notebooks, each processing a different table, with no dependencies between them. At the moment I use parallel notebooks – https://docs.databricks.com/en/notebooks/notebook-workflows.html#run-multiple-notebooks-concurrently

As you will see later, this use case is not supported yet. But before that, let’s see what we can do with the for-each task.

Read on to see what it currently can do, and what it cannot.

M	T	W	T	F	S	S
	1	2	3	4	5	6
7	8	9	10	11	12	13
14	15	16	17	18	19	20
21	22	23	24	25	26	27
28	29	30	31