How to send the notebook run status to ADF from an Azure Databricks notebook?

Are you looking for a way to pass a message from an Azure Databricks notebook run back to Azure Data Factory? Then you have reached the right place. In this article I will explain how you can pass different types of output from an Azure Databricks Spark notebook run using Python or Scala. You may want to send the output from ADB to ADF as a plain string, a list, a dictionary, or a JSON object, and you can send it in all of these formats. Let's get into it, and I'll explain step by step how you can send the notebook execution output from ADB to Azure Data Factory.

Use the dbutils.notebook.exit() function to pass the output from the Azure Databricks notebook to Azure Data Factory.
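
Note that dbutils.notebook.exit() accepts a single string argument and ends the notebook run at that point; nothing after the call is executed. As a minimal Python sketch (the status variable is just illustrative):

status = 'SUCCESS'  # a value computed earlier in your notebook
dbutils.notebook.exit(status)  # the run stops here and status is returned to ADF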

How to pass the Azure Databricks notebook execution output to Azure Data Factory as a string message?

  • Create the pipeline in your Azure Data Factory account. If you don't have an ADF account yet, you can follow this link to create your first Azure Data Factory account. If you already have an active Azure Data Factory account, go to it and click the plus sign to create your pipeline. I am creating a pipeline named ADB-Demo; you can choose any name you like.
  • Create the Azure Databricks linked service, which connects Azure Data Factory to Azure Databricks. If you need help with this step, follow this link to create the Azure Databricks linked service.
  • Go to the pipeline, type "notebook" in the search box, and drag the Notebook activity into the pipeline.
  • Select the Notebook activity. At the bottom you will see a couple of tabs; select the Azure Databricks tab. In this tab, provide the Azure Databricks linked service which you created in step 2.

  • Now go to the Settings tab. In the Settings tab, select the path in the Databricks workspace where your notebook is available. For example, in my case I have a notebook named Pass_Status_Demo which I am going to execute.

  • Our pipeline is ready to run, but first we have to make a change in the notebook so that it sends the output to Azure Data Factory. Go to the notebook and, at the end of all execution, use the following command to send a plain-text message to Azure Data Factory.
dbutils.notebook.exit('Thank you. Message from Azure Databricks Notebook run')

Whatever message you pass to this exit function is returned to Azure Data Factory as the activity output.
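
Once the run completes, this message appears in the Notebook activity's output under the runOutput property. A downstream activity in the pipeline can read it with an expression like @activity('Notebook1').output.runOutput, where Notebook1 is an illustrative activity name; substitute the name of your own Notebook activity.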

  • Now let's run the pipeline, for example with Debug. Once the pipeline executes successfully, expand the output of the notebook run. There you can see the output JSON which contains the message we passed from our Azure Databricks notebook.

How to pass the JSON output from the Azure Databricks notebook run to Azure Data Factory

  • Set up the pipeline, the Azure Databricks linked service, and the Notebook activity (including the notebook path in the Settings tab) exactly as described in steps 1–5 of the string example above.
  • Our pipeline is now ready to run. Go to the notebook and, at the end of all execution, use the following command to send a JSON message to Azure Data Factory.
dbutils.notebook.exit('{"empName": "John", "empCity": "New York"}')

Whatever message you pass to this exit function is returned to Azure Data Factory as the activity output.
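
Since dbutils.notebook.exit() only accepts a string, it is safer to build the payload as a Python dictionary and serialize it with json.dumps rather than hand-writing the JSON string. A minimal sketch (the field names are illustrative):

import json

result = {"empName": "John", "empCity": "New York"}  # illustrative payload
dbutils.notebook.exit(json.dumps(result))  # the serialized dict is returned to ADF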

  • Run the pipeline as before. Once it executes successfully, expand the output of the notebook run and you will see the output JSON containing the message passed from the notebook.
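
Note that runOutput arrives in ADF as a string. To access an individual field in a downstream activity, parse it with the json() expression function, for example @json(activity('Notebook1').output.runOutput).empName (again, Notebook1 is an illustrative activity name).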

How to pass a list as an output from the Azure Databricks notebook run to Azure Data Factory

  • Set up the pipeline, the Azure Databricks linked service, and the Notebook activity (including the notebook path in the Settings tab) exactly as described in steps 1–5 of the string example above.
  • Our pipeline is now ready to run. Go to the notebook and, at the end of all execution, use the following command to send a list to Azure Data Factory.
dbutils.notebook.exit('["Employee", "Customer", "Order"]')

Whatever message you pass to this exit function is returned to Azure Data Factory as the activity output.
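
As with the JSON object, you can build the list in Python and serialize it with json.dumps instead of typing the JSON array by hand. A minimal sketch (the table names are illustrative):

import json

tables = ["Employee", "Customer", "Order"]  # illustrative list of table names
dbutils.notebook.exit(json.dumps(tables))  # the serialized list is returned to ADF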

  • Run the pipeline as before. Once it executes successfully, expand the output of the notebook run and you will see the output JSON containing the message passed from the notebook.

Microsoft Official Azure Databricks Documentation Link

Final Thoughts

Azure Databricks is one of the most widely used platforms in the data engineering world, along with Azure Data Factory. We have seen how you can pass an output message from Azure Databricks to Azure Data Factory, whether it is a string, JSON, or a list. I hope you learned something new today.

Thank you, and keep learning.

Deepak Goyal

Deepak Goyal is a certified Azure Cloud Solution Architect. He has around a decade and a half of experience in designing, developing, and managing enterprise cloud solutions. He is also a Big Data certified professional and a passionate cloud advocate.