Latest Cloudera CDP Data Engineer - Certification - CDP-3002 Free Exam Questions
Question 1
How does Apache NiFi support schema inference during data flow management?
Correct answer: A
Question 2
Which tool or API is primarily used for monitoring and inspecting the performance of Spark applications in real time?
A Spark History Server
Correct answer: C
Question 3
When creating a data pipeline in the Cloudera Data Engineering service, what is the primary file format used to define the pipeline steps and configuration?
Correct answer: A
Question 4
You're given a DataFrame containing information about flights, including the columns "origin", "destination", and "delay_minutes". How can you find the top 5 origin airports with the highest average flight delay?
Correct answer: B
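The aggregation this question describes (group by origin, average the delay, sort descending, keep five) can be sketched in plain Python so that it runs without a Spark cluster. The function name `top_delayed_origins` and the sample rows are hypothetical; the equivalent PySpark chain is shown in a comment and is one reasonable way to write it, not necessarily the wording of the intended answer option.

```python
from collections import defaultdict

# Equivalent PySpark DataFrame chain (assuming `from pyspark.sql import functions as F`):
#   df.groupBy("origin")
#     .agg(F.avg("delay_minutes").alias("avg_delay"))
#     .orderBy(F.desc("avg_delay"))
#     .limit(5)


def top_delayed_origins(flights, n=5):
    """Return the n (origin, mean delay) pairs with the highest mean delay."""
    totals = defaultdict(lambda: [0.0, 0])  # origin -> [sum of delays, flight count]
    for row in flights:
        acc = totals[row["origin"]]
        acc[0] += row["delay_minutes"]
        acc[1] += 1
    averages = {origin: s / c for origin, (s, c) in totals.items()}
    # Sort by average delay, largest first, and keep the first n entries.
    return sorted(averages.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Hypothetical sample data, for illustration only.
sample_flights = [
    {"origin": "JFK", "destination": "LAX", "delay_minutes": 10},
    {"origin": "JFK", "destination": "SFO", "delay_minutes": 30},
    {"origin": "ORD", "destination": "JFK", "delay_minutes": 40},
    {"origin": "SFO", "destination": "ORD", "delay_minutes": 5},
]

top_two = top_delayed_origins(sample_flights, n=2)
```

Note that the grouping has to happen before the sort: ordering the raw rows by delay and taking five would return individual flights, not airports.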
Question 5
In a PySpark application running on Kubernetes, you want to enable dynamic allocation of executors. Which configuration setting is essential to turn on this feature?
Correct answer: B
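As a rough illustration of the settings involved, the sketch below collects dynamic-allocation properties into a dictionary and renders them as `--conf` arguments. `spark.dynamicAllocation.enabled` is the switch itself; on Kubernetes there is typically no external shuffle service, so open-source Spark 3.x also expects `spark.dynamicAllocation.shuffleTracking.enabled`. The helper names and the executor bounds are assumptions made for this example.

```python
def dynamic_allocation_conf(min_executors=1, max_executors=10):
    """Build Spark properties that turn on dynamic executor allocation."""
    return {
        # The property that enables the feature.
        "spark.dynamicAllocation.enabled": "true",
        # Needed on Kubernetes in Spark 3.x, where no external shuffle service runs.
        "spark.dynamicAllocation.shuffleTracking.enabled": "true",
        # Illustrative bounds; the values here are assumptions, not recommendations.
        "spark.dynamicAllocation.minExecutors": str(min_executors),
        "spark.dynamicAllocation.maxExecutors": str(max_executors),
    }


def to_submit_args(conf):
    """Render a properties dict as a flat list of spark-submit --conf arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


conf = dynamic_allocation_conf(min_executors=2, max_executors=8)
submit_args = to_submit_args(conf)
```

The same key/value pairs could instead be passed to `SparkSession.builder.config(key, value)` when the session is created in application code.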
Question 6
When configuring a Spark job to run on Kubernetes within the Cloudera Data Engineering (CDE) service, which property is crucial for specifying the Docker image to be used by the executors?
Correct answer: B
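For orientation, open-source Spark on Kubernetes selects the container image through `spark.kubernetes.container.image`, with `spark.kubernetes.executor.container.image` available as an executor-specific override. The sketch below assembles a `spark-submit` command line using those properties. The API server URL, image names, and application path are hypothetical placeholders, and CDE may set these properties on the user's behalf rather than through a hand-written command.

```python
def build_submit_command(master_url, image, app_path, executor_image=None):
    """Assemble a spark-submit argument list for a Kubernetes cluster-mode run."""
    cmd = [
        "spark-submit",
        "--master", master_url,
        "--deploy-mode", "cluster",
        # Image used for both driver and executors unless overridden below.
        "--conf", f"spark.kubernetes.container.image={image}",
    ]
    if executor_image is not None:
        # Executor-specific image; takes precedence over the shared image for executors.
        cmd += ["--conf", f"spark.kubernetes.executor.container.image={executor_image}"]
    cmd.append(app_path)
    return cmd


# All values below are hypothetical placeholders.
command = build_submit_command(
    master_url="k8s://https://k8s.example.com:6443",
    image="registry.example.com/spark-app:1.0",
    app_path="local:///opt/spark/app/main.py",
    executor_image="registry.example.com/spark-executor:1.0",
)
```

Returning a list rather than a single string keeps each argument separate, so it can be handed to `subprocess.run` without shell quoting concerns.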
Question 7
You encounter an error during the execution of your Airflow DAG. How can you identify the root cause of the issue and debug it effectively?
Correct answer: A
Question 8
You're working with an Airflow DAG that performs data quality checks on sensitive data. How can you ensure data security during the checks?
Correct answers: A, D
Question 9
You are deploying a Spark application in a Kubernetes environment. Your application is designed to process large datasets using Spark's DataFrame API. You have created a Docker image for your Spark application. Which of the following 'kubectl' commands should you use to deploy your Spark application onto the Kubernetes cluster?
Correct answer: C
Question 10
Your Airflow DAG encounters an error during the data transformation stage. What information can you access in the Airflow UI to troubleshoot the issue?
Correct answer: C
Question 11
When analyzing an Explain Plan, what does the presence of a large number of "Nested Loop Joins" indicate about a query's potential performance?
Correct answer: C
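To make the cost difference concrete, the sketch below contrasts a nested loop join with a hash join over small in-memory row lists. It is a simplified model, not how any particular engine implements its operators: the nested loop compares every left row with every right row, while the hash join builds an index on one side and probes it once per row on the other. The function names and sample rows are made up for illustration.

```python
from collections import defaultdict


def nested_loop_join(left, right, key):
    """Join by comparing every pair of rows: len(left) * len(right) comparisons."""
    out = []
    for l_row in left:
        for r_row in right:
            if l_row[key] == r_row[key]:
                out.append({**l_row, **r_row})
    return out


def hash_join(left, right, key):
    """Join by indexing one side first: roughly len(left) + len(right) steps."""
    index = defaultdict(list)
    for r_row in right:
        index[r_row[key]].append(r_row)
    out = []
    for l_row in left:
        # Probe the index instead of scanning the whole right side.
        for r_row in index.get(l_row[key], []):
            out.append({**l_row, **r_row})
    return out


# Hypothetical sample rows.
orders = [{"cust_id": 1, "order": "A"}, {"cust_id": 2, "order": "B"}, {"cust_id": 1, "order": "C"}]
customers = [{"cust_id": 1, "name": "Ann"}, {"cust_id": 2, "name": "Bob"}]

nl_result = nested_loop_join(orders, customers, "cust_id")
hj_result = hash_join(orders, customers, "cust_id")
```

Both functions return the same rows; only the amount of work differs, which is why a plan dominated by nested loop joins over large inputs is usually worth a closer look.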
Question 12
You notice a significant performance overhead when persisting a large RDD to disk. What potential factors could contribute to this, and how can you mitigate them?
Correct answer: C
Question 13
Which operator or feature in Apache Airflow can be used to dynamically adjust the schedule of data quality checks based on the volume of incoming data?
Correct answer: C
Question 14
How can you leverage Spark Streaming for real-time data processing and analytics?
Correct answer: A
Question 15
What command and parameters should be used to update an existing Spark job in the Cloudera Data Engineering (CDE) service to increase its executor memory using the CDE CLI?
Correct answer: B

