File Format in Snowflake

Are you looking to find what is the file format in the snowflake cloud data warehouse or maybe you are looking for a solution to the scenario where you can use the file format in snowflake, how you can create the file format, alter the file format or maybe drop the file format in the snowflake. You may also want to learn what are the different types of file formats we create like CSV, JSON, and others. In this article, I will take you through a step-by-step process of creating the file format in the desired format. After going through this article you will be able to understand and operate fully around the file format in snowflake. I will also share my expertise and experience as well about file format and various challenges which you can encounter while using it. Let’s don’t waste the time and I will take you quickly to the code directly where I will show you the operational details of the Snowflake.

File format defines the template format which can be utilized in loading and unloading the data in the specified file format. For example, you could define CSV, JSON, ORC, PARQUET, AVRO, and XML

Azure Databricks tutorial for Beginners

Azure Data Factory tutorial for Beginners

How to create the file format in the Snowflake?

Before creating the file format you need to understand, what kind of file format you want to create. For example, you may need the following information:

  1. Type of file. For example (CSV, JSON, parquet etc)
  2. RECORD_DELIMITER (how you want to end the line may be ‘\n’
  3. FIELD_DELIMITER (how your fields are separated)
  4. FILE_EXTENSION (File extension)
  5. SKIP_HEADER (Want to skip the first line because of header)
  6. COMPRESSION Type ( how you want to compress the file like gzip, bz2 etc)

These are the few basic pieces of information that you may need. Besides these, there are much other information which you can pass on in the file format as formatTypeOptions

List of Possible File format Type Options

Format Type OptionDescription
COMPRESSION  AUTO | GZIP | BZ2 | BROTLI | ZSTD | DEFLATE | RAW_DEFLATE | NONE
RECORD_DELIMITER  ‘<character>’ | NONE
FIELD_DELIMITER  ‘<character>’ | NONE
FILE_EXTENSION  ‘<string>’
SKIP_HEADER  <integer>
SKIP_BLANK_LINES  TRUE | FALSE
DATE_FORMAT  ‘<string>’ | AUTO
TIME_FORMAT  ‘<string>’ | AUTO
TIMESTAMP_FORMAT  ‘<string>’ | AUTO
BINARY_FORMAT  HEX | BASE64 | UTF8
ESCAPE  ‘<character>’ | NONE
ESCAPE_UNENCLOSED_FIELD  ‘<character>’ | NONE
TRIM_SPACE  TRUE | FALSE
FIELD_OPTIONALLY_ENCLOSED_BY  ‘<character>’ | NONE
NULL_IF  ( ‘<string>’ [ , ‘<string>’ … ] )
ERROR_ON_COLUMN_COUNT_MISMATCH  TRUE | FALSE
REPLACE_INVALID_CHARACTERS  TRUE | FALSE
VALIDATE_UTF8  TRUE | FALSE
EMPTY_FIELD_AS_NULL  TRUE | FALSE
SKIP_BYTE_ORDER_MARK  TRUE | FALSE
ENCODING  ‘<string>’ | UTF8

Syntax Query for creating file format in Snowflake

CREATE  FILE FORMAT  <name>
                      TYPE = { CSV | JSON | AVRO | ORC | PARQUET | XML } [ formatTypeOptions ]
                      [ COMMENT = '<string_literal>' ]

-- formatTypeOptions  list you can find from the format Type Options list mentioned above

Example of creating the file format in Snowflake

--  Creating the file format of CSV type

create or replace file format csv_format_example
  type = csv
  field_delimiter = '|'
  skip_header = 1
  null_if = ('NULL', 'null')
  empty_field_as_null = true
  compression = gzip;

-- This will create the CSV type file format with name  'csv_format_example'

Can AVRO file format in Snowflake be used for unloading the data?

No, Avro file format can only be used with the data loading. In Snowflake we can’t not unload the data in the AVRO format.

Can ORC file format in Snowflake be used for unloading the data?

No. Orc file format can only be used with the data loading. In Snowflake we can’t not unload the data in the Orc format.

Can XML file format in Snowflake be used for unloading the data?

No. XML file format can only be used with the data loading. In Snowflake we can’t not unload the data in the XML format.

How to create CSV File Format in the Snowflake?

Syntax for creating the CSV file format as follows:

CREATE  FILE FORMAT  <name>
                      TYPE = CSV
                      [ formatTypeOptions ]
                      [ COMMENT = '<string_literal>' ]

Example of CSV File format in Snowflake

create or replace file format csv_format_example2
  type = csv
  field_delimiter = ','
  skip_header = 1
  null_if = ('NULL', 'null')
  empty_field_as_null = true
  compression = gzip;

How to create JSON File Format in the Snowflake?

Syntax for creating the JSON file format as follows:

CREATE  FILE FORMAT  <name>
                      TYPE = JSON
                      [ formatTypeOptions ]
                      [ COMMENT = '<string_literal>' ]

Example of JSON File format in Snowflake

create or replace file format json_format_example2
  type = JSON
  TRIM_SPACE = true
  FILE_EXTENSION = json
  compression = gzip;

How to alter or rename the file format in Snowflake

You can use ALTER command to rename the file format in Snowflake. SQL syntax would be as follows:

ALTER FILE FORMAT [ IF EXISTS ] <name> RENAME TO <new_name>

Example of File format rename in Snowflake:

ALTER FILE FORMAT old_name RENAME TO new_name

-- This will change the file format 'od_name' to new : 'new_name' in Snowflake

How to alter or change the file format option type in Snowflake

You can use ALTER command to change the file format option type in Snowflake. SQL syntax would be like as follows:

ALTER FILE FORMAT [ IF EXISTS ] <name> SET { [ formatTypeOptions ] }

-- Here formatTypeOptions could be teh list of all different option which you want to set.

Example of File format option change in Snowflake:

ALTER FILE FORMAT file_format_name  SET 
   FIELD_DELIMITER = ','
   SKIP_HEADER  = 1

-- This will change the file format field delimiter to ',' and skip header to 1 row.

How to check file format in Snowflake?

You can check or show list of all the file format available or created in the Snowflake using the SHOW command as follows:

-- Show list of all the file format in current database and schema
SHOW FILE FORMATS

How to check file format in specific database or schema

-- Show list of all the file format in specific database and schema
SHOW FILE FORMATS in database_name.schema_name

How to get DDL of file format in Snowflake?

Syntax to get ddl for the file format in Snowflake is as follows:

select get_ddl('file_format','db_name.schema_name.file_format_name')

Above command will give you ddl for the file format. You can change the database name, schema name and the file format name as per your project requirement.

How to clone the file format in Snowflake?

Syntax for cloning or copying the exiting the file format in Snowflake is as follows:

CREATE [ OR REPLACE ] { STAGE | FILE FORMAT | SEQUENCE | TASK } [ IF NOT EXISTS ] <object_name>
  CLONE <source_object_name>

Example of clone file format in Snowflake as follows:

Create File format cloned_file_format clone existing_file_format

How to describe a file format in the Snowflake?

Describe file format is used to describe or get the list of all the values for file format properties along with their default values.

You can use the DESC command to describe the file format as follows:

desc file format file_format_name;

Describe file format in Snowflake

How to drop a file format in the Snowflake?

If you want to delete the existing the file format in the Snowflake then you can use the drop command along with the file format.

Example could be as follows:

DROP File Format File_format_name

Use Case Scenario for Using File Format

  • Assume that you are getting some CSV files from the upstream system. Now you want to perform some transformation on it. In this case you can load this CSV file into the Snowflake as table. While loading the file you need to have the file format which defines the format in which file should be loaded.
  • Another situation could be like, assume you have done with the transformation on the data and now you wanted to save this data or export this data as file. Then may use the unload functionality of the Snowflake. For unloading or exporting the data, again you need the file format in which data will be exported and saved. Hence this could also be the scenario where the file format is needed.

Snowflake official Documentation Link

Final Thoughts

By this, we have reached the last section of the blog. In this blog, we have learned what is the File format in the Snowflake. How we can create the different types of file format for example CSV file format, JSON file format. I have also explained how we can rename the existing file format along with how you can alter the file format to set the format type options. We have also discussed how we can describe the file format, clone file format, and lastly how we have seen how we can delete or drop the existing file format in the Snowflake.

Please share your comments and suggestions in the comment section below and I will try to answer all your queries as time permits.

DeepakGoyal

Deepak Goyal is certified Azure Cloud Solution Architect. He is having around decade and half experience in designing, developing and managing enterprise cloud solutions. He is also Big data certified professional and passionate cloud advocate.