Google GCP-PDE Practice Questions

Latest Update: Jun 08 2026

Question 131

As a data engineer, you need to prepare a resource hierarchy for your company. Your company has two applications, each with a development and a production environment. Following Google's best practices, what should you do?

You should create all applications in one project.

You should create two projects, each for one environment.

You should create four projects, one for each combination of application and environment. This isolates the environments from each other, so changes to a development project don't accidentally impact the production environment. It also gives you finer-grained access control.

You should create two projects, each for one application.
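The one-project-per-application-and-environment layout described in the options above can be sketched as a simple enumeration. This is only an illustration: the `acme` prefix, the application names, and the `project_ids` helper are hypothetical and not part of the question, and the sketch only builds candidate project IDs without calling any Google Cloud API.

```python
from itertools import product


def project_ids(apps, envs, prefix="acme"):
    """Return one hypothetical project ID per (application, environment) pair."""
    return [f"{prefix}-{app}-{env}" for app, env in product(apps, envs)]


# Two applications and two environments yield four separate projects.
projects = project_ids(["app1", "app2"], ["dev", "prod"])
for project in projects:
    print(project)
```

Keeping each pair in its own project means IAM bindings and quotas can be set per project, which is the isolation the four-project option refers to.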

Question 132

Your organization has a number of relational databases hosted on Google Cloud SQL. Your task is to design a pipeline that migrates this data to BigQuery for analysis. The data in the databases changes frequently, and you must ensure that changes are reflected in BigQuery within 5 minutes of being committed in the source databases. Which of the following options should you choose?

Use Cloud Dataflow to create a streaming pipeline that reads data from Cloud SQL and writes it to BigQuery.

Use Cloud SQL's Change Data Capture feature to track changes and push these changes to BigQuery.

Use Data Fusion to read data from Cloud SQL and write it to BigQuery.

Set up Cloud SQL export to Google Cloud Storage, then use Cloud Dataflow to read the data from Google Cloud Storage and write it to BigQuery.

Question 133

You are a data engineer for a large company that generates a significant amount of log data daily. For compliance reasons, this data must be retained for two years. The data is accessed frequently for the first 30 days, less frequently for the next 60 days, and rarely thereafter. Which of the following lifecycle management strategies would be the most cost-effective solution for this scenario?

Store all data in Standard storage class for 2 years.

Store data in Standard storage class for 30 days, transition to Nearline for 60 days, then move to Coldline for the remainder.

Store data in Standard storage class for 30 days, transition to Coldline for the remainder.

Store data in Standard storage class for 90 days, transition to Coldline for the remainder.
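A tiered policy like the Standard, Nearline, Coldline option above could be expressed as a Cloud Storage lifecycle configuration. The sketch below builds such a configuration as JSON. The `tiered_lifecycle` helper is hypothetical, the 730-day figure assumes two years of 365 days, and the final `Delete` rule is an assumption: the question only requires retention for two years and does not say the data must be removed afterwards.

```python
import json


def tiered_lifecycle(nearline_after_days, coldline_after_days, delete_after_days):
    """Build a lifecycle config; 'age' is measured in days since object creation."""
    return {
        "rule": [
            {
                "action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
                "condition": {"age": nearline_after_days,
                              "matchesStorageClass": ["STANDARD"]},
            },
            {
                "action": {"type": "SetStorageClass", "storageClass": "COLDLINE"},
                "condition": {"age": coldline_after_days,
                              "matchesStorageClass": ["NEARLINE"]},
            },
            # Assumption: objects are deleted once the retention period ends.
            {"action": {"type": "Delete"}, "condition": {"age": delete_after_days}},
        ]
    }


# 30 days in Standard, the next 60 days in Nearline (until day 90), then Coldline.
config = tiered_lifecycle(30, 90, 730)
print(json.dumps(config, indent=2))
```

The printed JSON can be saved to a file and applied to a bucket, for example with `gsutil lifecycle set`.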

Question 134

You have been tasked with designing a data processing pipeline in Google Cloud Platform that ingests, processes, and stores clickstream data from a website. The clickstream data has a high volume, with an incoming rate of millions of events per second, and needs to be processed in real-time for near-instantaneous analysis and visualization. What is the most appropriate architecture for this pipeline?

Use Cloud Dataflow to ingest, process, and store the data in BigQuery.

Use Apache Kafka running on Compute Engine to ingest the data, and use Apache Spark running on Compute Engine to process the data and store the results in BigQuery.

Use Pub/Sub to ingest the data, and use Apache Flink running on Compute Engine to process the data and store the results in Cloud Storage.

Use Cloud Functions to process the data in real-time, and store the processed data in BigQuery.

Question 135

A medical imaging company is developing a deep learning model to assist with the diagnosis of diseases based on CT scans. The model is currently running on a standard CPU server, but the processing time is too slow to be useful in a clinical setting. What hardware accelerator should the company use to speed up the processing time of the machine learning model and make it more useful in a clinical setting?

ASIC (Application-Specific Integrated Circuit)

FPGA (Field-Programmable Gate Array)

GPU (Graphics Processing Unit)

TPU (Tensor Processing Unit)

Question 136

Your company has collected a significant amount of IoT sensor data over the past year and plans to train an ML model to predict equipment failures. The dataset is very large (~500 TB) and is stored in Google Cloud Storage (GCS). You are tasked with choosing the appropriate training infrastructure. Considering the data size and the need for cost-effectiveness, which of the following should you use to train this model?

Use Cloud AutoML.

Use Dataproc with Hadoop MapReduce.

Use Dataflow with TensorFlow transformations (tf.Transform).

Use Google Cloud AI Platform Training with distributed training.

Question 137

As a data engineer, you need to configure access to the Cloud SQL MySQL database. You want to be sure that traffic is encrypted while minimizing administrative tasks, such as managing SQL certificates. What should you do?

You should use Cloud SQL Proxy.

You should use a private IP address.

You should use a public IP address.

You should use the TLS protocol.

Question 138

As a data engineer, you have built a multi-class classification model on Google Cloud's AI Platform. After deploying the model, you observe that the model's performance, as measured by macro-averaged F1 score, has started to decline over time. Which approach could potentially help you diagnose the cause for this performance degradation?

Use AI Platform Explanations to understand the feature importance in your model's predictions.

Use AI Platform Prediction to increase the prediction speed.

Use Cloud Functions to trigger functions based on cloud events.

Use Google Data Studio to create a dashboard and analyze the metrics.
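For reference, the macro-averaged F1 score mentioned in the question is the unweighted mean of the per-class F1 scores. The sketch below computes it in plain Python on a small made-up label set; the `macro_f1` helper and the sample labels are illustrative assumptions, not data from the question.

```python
def macro_f1(y_true, y_pred):
    """Return (macro-averaged F1, per-class F1 dict) for two equal-length label lists."""
    labels = sorted(set(y_true) | set(y_pred))
    per_class = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the class never appears.
        per_class[label] = 2 * tp / denom if denom else 0.0
    macro = sum(per_class.values()) / len(per_class)
    return macro, per_class


# Made-up example with three classes.
y_true = ["a", "a", "b", "b", "c", "c"]
y_pred = ["a", "a", "b", "c", "c", "c"]
macro, per_class = macro_f1(y_true, y_pred)
print(round(macro, 4), per_class)
```

Because every class counts equally, a drop in a single class lowers the macro average noticeably, so tracking the per-class breakdown over time helps show where the degradation is concentrated.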

Question 139

A media company wants to store and analyze large amounts of video content and metadata, including video titles, descriptions, and viewership statistics. The solution should scale to large data volumes and support fast search and retrieval of video content. Which of the following is the most appropriate solution for this scenario?

Use Cloud Storage for storing the video content and metadata, Cloud Pub/Sub for real-time data processing, and Cloud Bigtable for data analysis and fast search and retrieval.

Use Cloud Storage for storing the video content and metadata, and BigQuery for real-time data processing, data analysis, and fast search and retrieval.

Use Cloud Bigtable for storing the video content and metadata, Cloud Pub/Sub for real-time data processing, and for data analysis and fast search and retrieval.

Use Cloud Storage for storing the video content and BigQuery for storing metadata, Cloud Pub/Sub for real-time data processing, and Cloud Bigtable for data analysis and fast search and retrieval.

Question 140

You want to set up a streaming data insert into a Redis cluster running on Compute Engine instances. Because the data includes PII, you need to encrypt it at rest with encryption keys that you can create, rotate, and destroy as needed. What should you do?

You should create encryption keys locally. Then, upload your encryption keys to Cloud KMS and use those keys to encrypt your data in all of the Compute Engine cluster instances.

You should create encryption keys in Cloud KMS. Then, reference those keys in your API service calls when accessing the data in your Compute Engine cluster instances.

You should create a dedicated service account, and use encryption at rest to reference your data stored in your Compute Engine cluster instances as part of your API service calls.

You should create encryption keys in Cloud KMS. Then, use those keys to encrypt your data in all of the Compute Engine cluster instances.

Update History
