What is a dataset in Azure Data Factory?

In Azure Data Factory, when we create data pipelines for ETL, shift-and-load, or analytics purposes, we need to create datasets. A dataset connects to the data source through a linked service. It is created based on the type of data and the data source you want to connect to. In other words, a dataset represents the type and shape of the data held by the data source.

For example, if we want to pull a CSV file from Azure Blob Storage in a copy activity, we need both a linked service and a dataset for it. The linked service is used to make the connection to Azure Blob Storage, and the dataset describes the CSV data stored there.
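
Behind the authoring UI, a dataset is stored as a JSON definition that names its linked service and describes the data. The following is a minimal sketch of what such a definition could look like for a CSV file in Blob Storage, built as a Python dictionary. The dataset name, linked service name, container, folder, and file name are all hypothetical placeholders, and the delimiter and header settings are assumptions about the file.

```python
import json


def build_csv_blob_dataset(name, linked_service, container, folder_path, file_name):
    """Return a dataset definition for a CSV file in Azure Blob Storage."""
    return {
        "name": name,
        "properties": {
            # "DelimitedText" is the dataset type used for CSV-style files.
            "type": "DelimitedText",
            # The dataset does not hold credentials; it points at a linked service.
            "linkedServiceName": {
                "referenceName": linked_service,
                "type": "LinkedServiceReference",
            },
            "typeProperties": {
                "location": {
                    "type": "AzureBlobStorageLocation",
                    "container": container,
                    "folderPath": folder_path,
                    "fileName": file_name,
                },
                # Assumed file layout: comma separated, first row is a header.
                "columnDelimiter": ",",
                "firstRowAsHeader": True,
            },
        },
    }


# All names below are hypothetical and only illustrate the structure.
dataset = build_csv_blob_dataset(
    name="CsvFromBlob",
    linked_service="AzureBlobStorageLS",
    container="input",
    folder_path="sales",
    file_name="orders.csv",
)
print(json.dumps(dataset, indent=2))
```

Note that the connection details live in the linked service, so the same linked service can be reused by many datasets that point at different folders or files.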

Let's create a dataset in Azure Data Factory for a CSV file in Azure Blob Storage

Go to your Azure Data Factory account (if you do not have one yet, please refer to https://azurelib.com/azure-data-factory/ ).

Click on the Author tab.

Click on the + sign.

Select Dataset.

Select the data store type. In our case it is Azure Blob Storage. Then click Continue.

Based on the type of the data, select the file format. In our case it is CSV.

Now provide the dataset name and choose the linked service from the drop-down (Azure Data Factory automatically filters the linked services based on the data store you selected).

Finally, select the folder path, and you are done. Once the dataset has been created, you can use it in a Lookup or Copy activity to retrieve the data.
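
To illustrate how activities refer to a dataset, here is a rough sketch of Lookup and Copy activity definitions built as Python dictionaries. Activities point at a dataset by name through a dataset reference. The activity names and both dataset names are hypothetical, and the sink dataset is assumed to be another delimited-text dataset that already exists.

```python
import json


def dataset_reference(name):
    """Build the reference object an activity uses to point at a dataset."""
    return {"referenceName": name, "type": "DatasetReference"}


def build_lookup_activity(activity_name, dataset_name, first_row_only=True):
    """Lookup activity that reads rows from a delimited-text dataset."""
    return {
        "name": activity_name,
        "type": "Lookup",
        "typeProperties": {
            "source": {"type": "DelimitedTextSource"},
            "dataset": dataset_reference(dataset_name),
            "firstRowOnly": first_row_only,
        },
    }


def build_copy_activity(activity_name, source_dataset, sink_dataset):
    """Copy activity that moves data from one dataset to another."""
    return {
        "name": activity_name,
        "type": "Copy",
        "inputs": [dataset_reference(source_dataset)],
        "outputs": [dataset_reference(sink_dataset)],
        "typeProperties": {
            "source": {"type": "DelimitedTextSource"},
            # Assumption: the sink is also a delimited-text dataset.
            "sink": {"type": "DelimitedTextSink"},
        },
    }


# Hypothetical names, used only to show the structure.
lookup = build_lookup_activity("LookupOrders", "CsvFromBlob")
copy = build_copy_activity("CopyOrders", "CsvFromBlob", "CsvCopyTarget")
print(json.dumps({"activities": [lookup, copy]}, indent=2))
```

Because activities only hold a reference, the same dataset can be shared by several activities and pipelines without repeating the path or format settings.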

Deepak Goyal

Deepak Goyal is a certified Azure Cloud Solution Architect. He has around a decade and a half of experience in designing, developing, and managing enterprise cloud solutions. He is also a certified big data professional and a passionate cloud advocate.