Free Snowflake DEA-C02 Practice Test & Real Exam Questions
You are loading JSON data into a Snowflake table with a 'VARIANT' column. The JSON data contains nested arrays with varying depths. You need to extract specific values from the nested arrays and load them into separate columns in your Snowflake table. Which approach would provide the BEST performance and flexibility?
Correct Answer: A
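As a rough illustration of what extracting values from nested arrays of varying depth involves, the sketch below walks a parsed JSON value in plain Python and emits one (path, value) pair per scalar. It is only a model of the idea: in Snowflake this kind of traversal is usually done with LATERAL FLATTEN (optionally with RECURSIVE => TRUE), which also returns intermediate elements. The function name 'flatten' and the sample record are assumptions made for this example, not part of the question.

```python
from typing import Any, Iterator, Tuple


def flatten(value: Any, path: str = "") -> Iterator[Tuple[str, Any]]:
    """Yield (path, leaf_value) pairs for every scalar in a nested JSON value."""
    if isinstance(value, dict):
        for key, child in value.items():
            # Object members extend the path with a dot, e.g. items[0].sku
            yield from flatten(child, f"{path}.{key}" if path else key)
    elif isinstance(value, list):
        for index, child in enumerate(value):
            # Array elements extend the path with an index, e.g. tags[1][0]
            yield from flatten(child, f"{path}[{index}]")
    else:
        yield path, value


# Hypothetical sample record with arrays nested to different depths.
record = {
    "id": 7,
    "tags": [["a", "b"], ["c"]],
    "items": [{"sku": "x1", "qty": [1, 2]}],
}
rows = dict(flatten(record))
```

Each key in 'rows' identifies one leaf value, so specific paths can then be mapped to separate target columns.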
You are tasked with building a data pipeline that ingests data from various sources into Snowflake, processes it, and then writes the final results back to a data lake in AWS S3, partitioned by date. The data in S3 should be queryable by other applications outside of Snowflake. You choose to use Snowflake Iceberg tables for this purpose. Which of the following is the correct SQL statement to create an Iceberg table 'analytics.public.daily_summary' in Snowflake, backed by an S3 bucket 's3://your-bucket/data/daily_summary/', partitioned by the column, and specifying 'parquet' as the file format?
Correct Answer: B
You are responsible for optimizing query performance on a Snowflake table called 'WEB_EVENTS', which contains clickstream data. The table has the following structure: CREATE TABLE WEB_EVENTS ( event_id VARCHAR(36), user_id INT, event_time TIMESTAMP_NTZ, event_type VARCHAR(50), page_url VARCHAR(255), device_type VARCHAR(50) ); Users frequently run queries that filter the 'WEB_EVENTS' table on a combination of 'event_type' and a date range derived from 'event_time'. You observe that these queries are consistently slow. Which of the following strategies would be MOST effective in improving the performance of these frequently executed queries?
Correct Answer: B,C
You are tasked with designing a solution to load semi-structured data (JSON) from an AWS S3 bucket into a Snowflake table using Snowpipe and the REST API. The data in S3 is constantly being updated, and you need to ensure that only new or modified files are loaded into Snowflake. Which of the following steps are essential for implementing an efficient and cost-effective solution?
Correct Answer: A,C
You are working with a very large Snowflake table named 'CUSTOMER_TRANSACTIONS', which is clustered on 'CUSTOMER_ID' and 'TRANSACTION_DATE'. After noticing performance degradation on queries that filter by 'TRANSACTION_AMOUNT' and 'REGION', you decide to explore alternative clustering strategies. Which of the following actions, when performed individually, is LEAST likely to improve query performance specifically for queries filtering by 'TRANSACTION_AMOUNT' and 'REGION', assuming you can only have one clustering key?
Correct Answer: E
You are developing a data pipeline in Snowflake that processes sensitive customer data. You need to implement robust data governance controls, including column-level security and data masking. Which of the following combinations of Snowflake features, when used together, provides the MOST comprehensive solution for achieving this?
Correct Answer: C,D
You are troubleshooting a slow-running query in Snowflake that aggregates data from a large ORDERS table (10 billion rows) partitioned by ORDER_DATE. The query execution plan shows significant 'Remote Spill to Disk'. Which of the following actions would be MOST effective in reducing the spill and improving query performance? Assume all statistics are up-to-date and the data is properly clustered by ORDER_DATE.
Correct Answer: D
You are designing a data protection strategy for a Snowflake database. You need to implement dynamic data masking on the 'CREDIT_CARD' column in the 'TRANSACTIONS' table. The requirement is that users with the 'FINANCE_ADMIN' role should see the full credit card number, while all other users should see only the last four digits. You have the following masking policy:

What is the next step to apply this masking policy to the 'CREDIT_CARD' column?
Correct Answer: C
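The masking policy itself is not shown above, so the following is only a sketch of the rule the question describes, written as an ordinary Python function so the behaviour can be checked in isolation. The function name 'mask_credit_card', the '*' fill character, and the role spelling 'FINANCE_ADMIN' are assumptions. In Snowflake the rule would live in a masking policy, and a policy is generally attached to a column with a statement of the form ALTER TABLE TRANSACTIONS MODIFY COLUMN CREDIT_CARD SET MASKING POLICY <policy_name>.

```python
def mask_credit_card(value: str, current_role: str) -> str:
    """Return the full value for FINANCE_ADMIN, otherwise only the last four characters."""
    if current_role == "FINANCE_ADMIN":
        return value
    visible = value[-4:]
    # Replace everything before the last four characters with a fill character.
    return "*" * (len(value) - len(visible)) + visible


full_view = mask_credit_card("4111111111111234", "FINANCE_ADMIN")
masked_view = mask_credit_card("4111111111111234", "ANALYST")
```

Keeping the output the same length as the input is a design choice in this sketch; a real policy could equally return a fixed-width mask.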
Consider the following Snowflake SQL API call to execute a stored procedure:
Correct Answer: A,C,D
A data engineering team is implementing Row Access Policies (RAP) on a table 'employee_data' containing sensitive salary information. They need to ensure that only managers can see the salary information of their direct reports. A user-defined function (UDF) 'GET returns a comma-separated string of manager usernames for a given username. Which of the following SQL statements correctly creates and applies a RAP to achieve this?


Correct Answer: D
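The candidate SQL statements are not reproduced above, so the sketch below only models the row-level predicate the question describes: a row is visible when the current user appears in the comma-separated manager list returned for that row's employee. The helper 'get_managers' is a hypothetical stand-in for the UDF (whose name is truncated in the question), and the reporting data is invented for the example.

```python
def get_managers(username: str) -> str:
    """Hypothetical stand-in for the UDF: comma-separated managers of a user."""
    reports_to = {"alice": "bob,carol", "dave": "carol"}
    return reports_to.get(username, "")


def can_see_row(current_user: str, employee_username: str) -> bool:
    """Model of the policy body: True when current_user manages the employee."""
    managers = [
        name.strip()
        for name in get_managers(employee_username).split(",")
        if name.strip()
    ]
    # Compare whole names; a substring test would wrongly match "bob" in "bobby".
    return current_user in managers


visible_to_bob = [u for u in ("alice", "dave") if can_see_row("bob", u)]
```

Splitting the list before comparing is the point of the sketch: a plain substring check on the comma-separated string can grant access to users whose names merely overlap.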
You've created a JavaScript UDF in Snowflake to perform complex string manipulation. You need to ensure this UDF can handle a large volume of data efficiently. The UDF is defined as follows:

When testing with a large dataset, you observe poor performance. Which of the following strategies, when applied independently or in combination, would MOST likely improve the performance of this UDF?
Correct Answer: A,C,D
You have a Snowflake table 'ORDERS' with columns 'ORDER_ID', 'CUSTOMER_ID', 'ORDER_DATE', and 'TOTAL_AMOUNT'. You notice that many queries filtering by 'ORDER_DATE' are slow, even after enabling query acceleration. You decide to implement a caching strategy to improve performance. Which of the following approaches will be most effective in leveraging Snowflake's caching capabilities and improving the performance of date-filtered queries, especially when the data volume for each date is large and varied? Assume the virtual warehouse is size Medium.
Correct Answer: D
You are monitoring a Snowpipe pipeline that loads data from an external stage into a Snowflake table. You observe the following error messages in the PIPE ERRORS view: 'Invalid UTF-8 detected in string'. The data files on the stage are encoded in UTF-8. Which of the following actions, taken individually or in combination, are MOST likely to resolve this issue? (Select TWO)
Correct Answer: C,E
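A file can be labelled UTF-8 and still contain byte sequences that are not valid UTF-8, for instance when part of it was written in Latin-1. The sketch below shows one way to locate the first offending byte before staging a file, and what character replacement looks like. The function name and sample bytes are assumptions for illustration. On the Snowflake side, the file format options ENCODING and REPLACE_INVALID_CHARACTERS address the same class of problem.

```python
from typing import Optional


def first_invalid_utf8_offset(data: bytes) -> Optional[int]:
    """Return the byte offset of the first invalid UTF-8 sequence, or None if valid."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start
    return None


good = "café".encode("utf-8")
bad = b"caf\xe9 latte"  # 0xE9 is "é" in Latin-1 but not a valid UTF-8 sequence here

bad_offset = first_invalid_utf8_offset(bad)
# errors="replace" substitutes U+FFFD for each undecodable sequence.
repaired = bad.decode("utf-8", errors="replace")
```

Locating the offset helps decide whether to fix the producer's encoding or to accept replacement characters at load time.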
You are building a data pipeline that extracts data from a REST API, transforms it using Pandas DataFrames, and loads it into Snowflake. You need to implement error handling to gracefully handle network issues and API rate limits. Which of the following code snippets demonstrates the most robust approach to handle potential errors during data loading into Snowflake using the Python connector?


Correct Answer: A
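The answer snippets are not reproduced above. As a generic sketch of the error-handling pattern the question is about, the helper below retries an operation with exponential backoff and re-raises once the attempts are exhausted. The names 'retry_with_backoff' and 'flaky_load' are hypothetical, and the load step is simulated; in a real pipeline the operation would wrap the connector call and 'retryable' would list the transient exception types you expect, for example the connector's OperationalError.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    retryable: Tuple[Type[BaseException], ...],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying on the given exceptions with doubling delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retryable:
            if attempt == max_attempts:
                raise  # out of attempts: surface the original error
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")


# Simulated load step that fails twice before succeeding.
calls = {"count": 0}


def flaky_load() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient network failure")
    return "loaded"


delays = []  # record requested delays instead of actually sleeping
result = retry_with_backoff(flaky_load, (ConnectionError,), sleep=delays.append)
```

Only the listed exception types are retried, so programming errors and permanent failures still surface immediately rather than being masked by the loop.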
