Free Snowflake DEA-C02 Practice Test & Real Exam Questions

Exam Code/Number: DEA-C02
Exam Name/Title: SnowPro Advanced: Data Engineer (DEA-C02)
Certification Provider: Snowflake
Corresponding Certification: SnowPro Advanced

Exam Questions: 354
Updated On: Jun 27, 2026

Page: 3 / 26
Total 354 questions

Question #29

You are loading JSON data into a Snowflake table with a 'VARIANT' column. The JSON data contains nested arrays with varying depths. You need to extract specific values from the nested arrays and load them into separate columns in your Snowflake table. Which approach would provide the BEST performance and flexibility?

A. Use a 'COPY' command with a 'TRANSFORM' clause that uses JavaScript UDFs to parse the JSON and extract the values during the load process. Load the extracted values directly into the target columns.

B. Load the entire JSON into a 'VARIANT' column and then use SQL with nested 'FLATTEN' functions to extract the desired values during query time.

C. Use a stored procedure to parse the JSON data and insert values into the table row by row.

D. Use Snowpipe with auto-ingest, loading directly into the table with the 'VARIANT' column. Define data quality checks with pre-load data transformation.

E. Create a view with nested 'FLATTEN' functions to extract the values from the 'VARIANT' column. The view serves as the source for further transformations.

Correct Answer: A
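Options B and E above rely on nested 'FLATTEN' calls. As a rough illustration of what flattening nested arrays of varying depth means, here is a minimal pure-Python sketch. It is only an analogue, not Snowflake's implementation; the `flatten` helper and the sample record are made up for this example.

```python
from typing import Any, Iterator, Tuple


def flatten(value: Any, path: str = "") -> Iterator[Tuple[str, Any]]:
    """Yield (path, leaf_value) pairs for every scalar in a nested JSON value."""
    if isinstance(value, dict):
        for key, child in value.items():
            child_path = f"{path}.{key}" if path else key
            yield from flatten(child, child_path)
    elif isinstance(value, list):
        # Arrays can nest to any depth; recursion handles each level the same way.
        for index, child in enumerate(value):
            yield from flatten(child, f"{path}[{index}]")
    else:
        yield path, value


# Hypothetical record with arrays nested to different depths.
record = {
    "order_id": 7,
    "items": [
        {"sku": "A1", "tags": ["new", "sale"]},
        {"sku": "B2", "tags": []},
    ],
}

rows = list(flatten(record))
for leaf_path, leaf_value in rows:
    print(leaf_path, "=", leaf_value)
```

Each yielded path identifies one extractable value, which is roughly the role a target column plays once the values are pulled out of the 'VARIANT' data. Empty arrays produce no rows in this sketch.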

Question #30

You are tasked with building a data pipeline that ingests data from various sources into Snowflake, processes it, and then writes the final results back to a data lake in AWS S3, partitioned by date. The data in S3 should be queryable by other applications outside of Snowflake. You choose to use Snowflake Iceberg tables for this purpose. Which of the following is the correct SQL statement to create an Iceberg table 'analytics.public.daily_summary' in Snowflake, backed by an S3 bucket 's3://your-bucket/data/daily_summary/', partitioned by the column, and specifying 'parquet' as the file format?

A. Option C

B. Option E

C. Option B

D. Option D

E. Option A

Correct Answer: B

Question #31

You are responsible for optimizing query performance on a Snowflake table called 'WEB_EVENTS', which contains clickstream data. The table has the following structure:

CREATE TABLE WEB_EVENTS (
    event_id VARCHAR(36),
    user_id INT,
    event_time TIMESTAMP_NTZ,
    event_type VARCHAR(50),
    page_url VARCHAR(255),
    device_type VARCHAR(50)
);

Users frequently run queries that filter the 'WEB_EVENTS' table based on a combination of 'event_type', and a date range derived from 'event_time'. You observe that these queries are consistently slow. Which of the following strategies would be MOST effective in improving the performance of these frequently executed queries?

A. Create a clustering key on 'event_time'.

B. Create a materialized view that pre-aggregates data by 'event_type', 'device_type', and day (derived from 'event_time').

C. Create a clustering key with the following order: 'event_type', 'device_type', 'event_time'.

D. Create a search optimization service on the 'page_url' column.

E. Add a column to the 'WEB_EVENTS' table for the date part of 'event_time' and create a clustering key using the new date column along with and 'device_type'.

Correct Answer: B,C

Question #32

You are tasked with designing a solution to load semi-structured data (JSON) from an AWS S3 bucket into a Snowflake table using Snowpipe and the REST API. The data in S3 is constantly being updated, and you need to ensure that only new or modified files are loaded into Snowflake. Which of the following steps are essential for implementing an efficient and cost-effective solution?

A. Configure auto-ingest using SQS queue and SNOWPIPE object. No need to manually call the REST API endpoint for data loading.

B. Use the 'VALIDATION_MODE' copy option with 'RETURN_ALL_RESULTS = TRUE' to validate all data being loaded into the Snowflake table.

C. Configure an S3 event notification to trigger a REST API call to the Snowpipe endpoint whenever a new or modified file is added to the S3 bucket. The API call should include the file name in the request.

D. Configure Snowpipe to automatically detect new files in the S3 bucket using event notifications, but manually refresh the pipe using SYSTEM$PIPE_STATUS periodically to ensure that all files are processed.

E. Create a Snowflake external function that polls the S3 bucket every minute, checks for new files using the LIST command, and then calls the Snowpipe REST API endpoint for each new file.

Correct Answer: A,C
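Option C describes passing file names from an S3 event notification to the Snowpipe REST API. The sketch below shows one way the request body for the `insertFiles` endpoint could be assembled from such an event. It makes no network call and leaves out authentication; the function name, the `stage_prefix` parameter, and the sample event are assumptions made for illustration.

```python
from typing import Dict, List
from urllib.parse import unquote_plus


def build_insert_files_payload(
    s3_event: Dict, stage_prefix: str = ""
) -> Dict[str, List[Dict[str, str]]]:
    """Turn an S3 event notification into a Snowpipe insertFiles request body."""
    files = []
    for record in s3_event.get("Records", []):
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        # Assumption: the stage points at `stage_prefix`, so paths are sent
        # relative to that location.
        if stage_prefix and key.startswith(stage_prefix):
            key = key[len(stage_prefix):]
        files.append({"path": key})
    return {"files": files}


# Hypothetical event carrying a single new object.
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "your-bucket"},
                "object": {"key": "landing/2026/events+01.json"},
            }
        }
    ]
}

payload = build_insert_files_payload(sample_event, stage_prefix="landing/")
print(payload)
```

With auto-ingest (option A) none of this is needed, because Snowflake consumes the event notifications itself.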

Question #33

You are working with a very large Snowflake table named 'CUSTOMER_TRANSACTIONS', which is clustered on 'CUSTOMER_ID' and 'TRANSACTION_DATE'. After noticing performance degradation on queries that filter by 'TRANSACTION_AMOUNT' and 'REGION', you decide to explore alternative clustering strategies. Which of the following actions, when performed individually, is LEAST likely to improve query performance specifically for queries filtering by 'TRANSACTION_AMOUNT' and 'REGION', assuming you can only have one clustering key?

A. Dropping the existing clustering key and clustering on 'TRANSACTION_AMOUNT' and 'REGION'.

B. Creating a new table clustered on 'TRANSACTION_AMOUNT' and 'REGION', and migrating the data.

C. Creating a materialized view that pre-aggregates data by 'TRANSACTION_AMOUNT' and 'REGION'.

D. Creating a search optimization on 'TRANSACTION_AMOUNT' and 'REGION' columns.

E. Adding 'TRANSACTION_AMOUNT' and 'REGION' to the existing clustering key while retaining 'CUSTOMER_ID' and 'TRANSACTION_DATE'.

Correct Answer: E

Question #34

You are developing a data pipeline in Snowflake that processes sensitive customer data. You need to implement robust data governance controls, including column-level security and data masking. Which of the following combinations of Snowflake features, when used together, provides the MOST comprehensive solution for achieving this?

A. Dynamic tables and masking policies.

B. Data masking policies and network policies.

C. Row access policies and data masking policies on base tables, supplemented with object tagging and column-level security policies on views that grant limited access to specific user roles.

D. Object tagging, column-level security policies (using views), and masking policies.

E. Row-level security policies and data masking policies.

Correct Answer: C,D

Question #35

You are troubleshooting a slow query in Snowflake that aggregates data from a large ORDERS table (10 billion rows) partitioned by ORDER_DATE. The query execution plan shows significant 'Remote Spill to Disk'. Which of the following actions would be MOST effective in reducing the spill and improving query performance? Assume all statistics are up-to-date and the data is properly clustered by ORDER_DATE.

A. Rewrite the query to use window functions instead of aggregate functions.

B. Increase the virtual warehouse size. This will provide more memory for the query to execute.

C. Reduce the number of columns selected in the query, only selecting those that are essential for the aggregation.

D. Optimize the query to leverage data pruning based on ORDER_DATE by ensuring the query filters on a specific or limited range of ORDER_DATE values.

E. Increase the value of the parameter. This allows the warehouse to scale up further if needed.

Correct Answer: D
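Option D depends on pruning: partitions whose ORDER_DATE range cannot match the filter are skipped entirely. The toy simulation below illustrates the idea with min/max metadata per partition. It is a simplified model with made-up data (day numbers standing in for dates), not a description of Snowflake internals.

```python
from typing import List

Partition = List[int]  # each value stands in for an ORDER_DATE (a day number)


def partitions_scanned(partitions: List[Partition], low: int, high: int) -> int:
    """Count partitions whose [min, max] range overlaps the filter [low, high]."""
    scanned = 0
    for part in partitions:
        if min(part) <= high and max(part) >= low:
            scanned += 1
    return scanned


days = list(range(100))

# Clustered layout: each partition holds a contiguous run of 10 days.
clustered = [days[i:i + 10] for i in range(0, 100, 10)]

# Scattered layout: days are dealt round-robin, so every partition
# spans almost the whole date range.
scattered = [days[i::10] for i in range(10)]

print(partitions_scanned(clustered, 20, 29))  # 1
print(partitions_scanned(scattered, 20, 29))  # 10
```

A narrow date filter touches one partition in the clustered layout and all ten in the scattered one, which is why a query that filters on a limited ORDER_DATE range reads far less data.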

Question #36

You are designing a data protection strategy for a Snowflake database. You need to implement dynamic data masking on the 'CREDIT_CARD' column in the 'TRANSACTIONS' table. The requirement is that users with the 'FINANCE_ADMIN' role should see the full credit card number, while all other users should see only the last four digits. You have the following masking policy:

What is the next step to apply this masking policy to the 'CREDIT_CARD' column?

Correct Answer: C
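The masking policy itself is not reproduced on this page. The following sketch only mirrors the rule stated in the question (full value for 'FINANCE_ADMIN', last four digits for everyone else) as a plain Python function. The function name and the use of `*` as the mask character are assumptions.

```python
def mask_credit_card(card_number: str, current_role: str) -> str:
    """Return the full number for FINANCE_ADMIN, otherwise only the last four digits."""
    if current_role == "FINANCE_ADMIN":
        return card_number
    visible = card_number[-4:]
    # Replace every hidden character with '*' so the length is preserved.
    return "*" * (len(card_number) - len(visible)) + visible


print(mask_credit_card("4111111111111111", "FINANCE_ADMIN"))
print(mask_credit_card("4111111111111111", "ANALYST"))
```

In Snowflake the equivalent branch on the current role lives inside the masking policy body, and the policy takes effect only once it is attached to the column.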

Question #37

Consider the following Snowflake SQL API call to execute a stored procedure:

A. Set the parameter to and retrieve the result set directly from the API response.

B. Use the parameter to specify which external functions are allowed to be called by the procedure.

C. Set the 'warehouse' parameter in the SQL API request to ensure the stored procedure uses a specific warehouse size.

D. Include the stored procedure's fully qualified name (database.schema.procedure_name) in the 'statement' parameter.

E. The stored procedure should handle the error handling for network disruptions and automatically retry.

Correct Answer: A,C,D

Question #38

A data engineering team is implementing Row Access Policies (RAP) on a table 'employee_data' containing sensitive salary information. They need to ensure that only managers can see the salary information of their direct reports. A user-defined function (UDF) 'GET returns a comma-separated string of manager usernames for a given username. Which of the following SQL statements correctly creates and applies a RAP to achieve this?

A. Option C

B. Option E

C. Option B

D. Option D

E. Option A

Correct Answer: D
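The answer options for this question are not reproduced on this page. As a sketch of the predicate such a row access policy has to express, the Python below checks whether the current user appears in the comma-separated manager list for a row's owner. The `get_managers` lookup is a hypothetical stand-in for the UDF, and the sample data is invented.

```python
from typing import Dict, List

# Hypothetical stand-in for the UDF: username -> comma-separated manager usernames.
MANAGERS_OF: Dict[str, str] = {
    "alice": "carol,dave",
    "bob": "dave",
    "carol": "",
}


def get_managers(username: str) -> str:
    return MANAGERS_OF.get(username, "")


def row_visible(current_user: str, row_username: str) -> bool:
    """True when current_user is listed as a manager of the row's employee."""
    managers = [m.strip() for m in get_managers(row_username).split(",") if m.strip()]
    # Compare whole names; a substring test would wrongly match "dav" against "dave".
    return current_user in managers


def visible_rows(current_user: str, rows: List[Dict]) -> List[Dict]:
    return [row for row in rows if row_visible(current_user, row["username"])]


employee_data = [
    {"username": "alice", "salary": 120},
    {"username": "bob", "salary": 100},
    {"username": "carol", "salary": 150},
]

print(visible_rows("dave", employee_data))
```

The point worth noticing is the exact-match comparison after splitting the string, since a naive pattern match on the comma-separated value can expose rows to users with similar names.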

Question #39

You've created a JavaScript UDF in Snowflake to perform complex string manipulation. You need to ensure this UDF can handle a large volume of data efficiently. The UDF is defined as follows:

When testing with a large dataset, you observe poor performance. Which of the following strategies, when applied independently or in combination, would MOST likely improve the performance of this UDF?

A. Convert the JavaScript UDF to a Java UDF, utilizing Java's more efficient string manipulation libraries and leveraging Snowflake's Java UDF execution environment.

B. Increase the warehouse size to the largest available size (e.g., X-Large) to provide more resources for the UDF execution.

C. Replace the JavaScript UDF with a SQL UDF that uses built-in Snowflake string functions like 'REGEXP_REPLACE' and 'REPLACE'. SQL UDFs are generally more optimized within Snowflake's execution engine.

D. Pre-compile the regular expressions used within the JavaScript UDF outside of the function and pass them as constants into the function. JavaScript regex compilation is expensive, and pre-compilation can reduce overhead.

E. Ensure the input 'STRING' is defined with the maximum possible length to provide sufficient memory allocation for the JavaScript engine to manipulate the string.

Correct Answer: A,C,D

Question #40

You have a Snowflake table 'ORDERS' with columns 'ORDER_ID', 'CUSTOMER_ID', 'ORDER_DATE', and 'TOTAL_AMOUNT'. You notice that many queries filtering by 'ORDER_DATE' are slow, even after enabling query acceleration. You decide to implement a caching strategy to improve performance. Which of the following approaches will be most effective in leveraging Snowflake's caching capabilities and improving the performance of date-filtered queries, especially when the data volume for each date is large and varied? Assume the virtual warehouse is medium size.

A. Create a materialized view that pre-aggregates the data by 'ORDER_DATE', such as calculating the sum of 'TOTAL_AMOUNT' for each date. This will allow Snowflake to serve the results directly from the materialized view for queries that require aggregation.

B. Apply a WHERE clause with a date range in all the SELECT statements. This forces the metadata caching.

C. Increase the data retention period for the 'ORDERS' table. A longer retention period will ensure that more data is available in the Snowflake cache.

D. Create a clustered table on 'ORDER_DATE'. This will physically organize the data on disk, allowing Snowflake to quickly retrieve the relevant data for date-filtered queries.

E. Use after running a query filtered by 'ORDER_DATE'. This will cache the result of the query in the current session for subsequent queries with the same filter.

Correct Answer: D

Question #41

You are monitoring a Snowpipe pipeline that loads data from an external stage into a Snowflake table. You observe the following error messages in the PIPE ERRORS view: 'Invalid UTF-8 detected in string'. The data files on the stage are encoded in UTF-8. Which of the following actions, taken individually or in combination, are MOST likely to resolve this issue? (Select TWO)

A. Convert the problematic files to UTF-16 encoding before loading them into the stage.

B. Drop and recreate the external stage with 'TYPE = INTERNAL'.

C. Verify the data files on the stage are actually valid UTF-8 and contain no corrupted characters.

D. Modify the COPY INTO statement to include the 'ON_ERROR = 'SKIP_FILE'' option.

E. Ensure the file format definition explicitly specifies 'ENCODING = 'UTF8''.

Correct Answer: C,E
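Option C asks you to verify that the staged files really are valid UTF-8. One way to check a file's bytes before staging it is sketched below; the function name is an assumption, and the byte strings are invented samples.

```python
from typing import Optional


def first_invalid_utf8_offset(data: bytes) -> Optional[int]:
    """Return the byte offset of the first invalid UTF-8 sequence, or None if valid."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start
    return None


valid_sample = "café".encode("utf-8")  # b'caf\xc3\xa9'
broken_sample = b"caf\xe9 latte"       # 0xE9 is Latin-1 'é', not valid UTF-8 here

print(first_invalid_utf8_offset(valid_sample))
print(first_invalid_utf8_offset(broken_sample))
```

To check a file on disk, read it in binary mode (`open(path, "rb").read()`) and pass the bytes to the function. A file labelled UTF-8 but written in another encoding such as Latin-1 is a common cause of this error.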

Question #42

You are building a data pipeline that extracts data from a REST API, transforms it using Pandas DataFrames, and loads it into Snowflake. You need to implement error handling to gracefully handle network issues and API rate limits. Which of the following code snippets demonstrates the most robust approach to handle potential errors during data loading into Snowflake using the Python connector?

A. Option C

B. Option E

C. Option B

D. Option D

E. Option A

Correct Answer: A
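The code snippets for the answer options are not reproduced on this page. As a generic illustration of the kind of error handling the question describes, here is a minimal retry helper with exponential backoff. It does not call the Snowflake Python connector; `with_retries`, `flaky_load`, and the injected `sleep` parameter are assumptions made for this sketch.

```python
import time
from typing import Callable, List, Tuple, Type, TypeVar

T = TypeVar("T")


def with_retries(
    operation: Callable[[], T],
    retryable: Tuple[Type[BaseException], ...],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying on the given exceptions with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retryable:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("unreachable")


# Hypothetical load step that fails twice with a network error, then succeeds.
calls = {"count": 0}


def flaky_load() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary network issue")
    return "loaded"


delays: List[float] = []
result = with_retries(flaky_load, (ConnectionError,), base_delay=0.5, sleep=delays.append)
print(result, delays)
```

Only exceptions listed in `retryable` are retried, so programming errors such as bad SQL fail immediately instead of being repeated. In a real pipeline the operation would wrap the connector's load call and the tuple would hold the connector's own transient error classes.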
