Google GCP-PDE Practice Questions

The latest changes and updates from the administration for this exam.

Latest Update: Jun 08 2026

Question 21

Your company wants to analyze sales data, which is growing at a rapid pace. The sales data is currently stored in Google Cloud Storage. The business wants the ability to perform ad-hoc analysis on the sales data, including querying specific data ranges, filtering by various dimensions, and performing aggregate functions. They also want to visualize their sales data with interactive dashboards and reports. Which combination of Google Cloud products would you recommend?

Use Cloud Dataproc to process the data from Cloud Storage and write it to BigQuery. Use Looker for data visualization.

Use Cloud Dataflow to process the data from Cloud Storage and write it to BigQuery. Use Data Studio for data visualization.

Use Cloud Dataproc to process the data from Cloud Storage and write it to Cloud SQL. Use Data Studio for data visualization.

Use Cloud Dataflow to process the data from Cloud Storage and write it to Bigtable. Use Looker for data visualization.

Question 22

You are a data engineer working for a media company that wants to implement a machine learning solution to automatically categorize news articles based on their content. You want to use Google Cloud's pre-built machine learning models because they are easy to use and you lack the resources to train and maintain a custom model. Which service should you primarily use to achieve this task?

AutoML Tables

Natural Language

AI Platform

Vision AI

Question 23

Your company needs to store and process large amounts of scientific data generated from simulations and experiments. The data needs to be stored in a scalable, secure, and cost-effective manner. What is the recommended approach for storing and processing this scientific data using Google Cloud services?

You should store the scientific data in Bigtable and use Dataflow for processing.

You should store the scientific data in Cloud Pub/Sub and use Cloud Dataflow for processing.

You should store the scientific data in BigQuery and use Cloud Dataproc for processing.

You should store the scientific data in Cloud Storage and use Cloud Functions for processing.

Correct Answer: C

You should store the scientific data in BigQuery and use Cloud Dataproc for processing. -> Correct. BigQuery is designed for very large datasets and is optimized for fast SQL queries and data analysis. It also provides real-time analytics and is highly scalable. It supports a variety of data formats, and you can use SQL queries to manipulate the data, which makes it a good fit for scientific data that may require complex analytical queries. Cloud Dataproc is a fast, easy-to-use, fully managed cloud service for running Apache Spark and Apache Hadoop clusters. It is designed to handle batch processing, streaming, and machine learning tasks, all of which are relevant in a scientific data context.

You should store the scientific data in Cloud Storage and use Cloud Functions for processing. -> Incorrect. Cloud Storage is scalable and secure but is more suited for storing raw files rather than structured data optimized for queries. Cloud Functions are more suited for lightweight, single-purpose functions triggered by events. They may not be well-suited for heavy computational scientific data processing tasks.

You should store the scientific data in Bigtable and use Dataflow for processing. -> Incorrect. Bigtable is geared towards operational workloads with high read and write throughput rather than analytical processing. Dataflow could be used for processing, but it is oriented toward real-time and batch data processing jobs, not specifically toward scientific data analytics, which might require more computational power.

You should store the scientific data in Cloud Pub/Sub and use Cloud Dataflow for processing. -> Incorrect. Cloud Pub/Sub is a messaging service used for event-driven systems and real-time analytics pipelines. It is not designed for storing large amounts of data. Dataflow, as mentioned earlier, is not specifically optimized for scientific data analytics.
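The reasoning in the explanations above can be condensed into a small lookup table. This is only a minimal sketch for review purposes: `SERVICE_FIT`, `services_with_role`, and `describe` are hypothetical names introduced here, and the one-line summaries paraphrase the explanations above rather than the product documentation.

```python
# Hypothetical summary of the explanations above; all names are illustrative.
SERVICE_FIT = {
    "BigQuery": {"role": "storage", "fit": "analytical SQL over very large datasets"},
    "Cloud Storage": {"role": "storage", "fit": "raw files, not query-optimized structured data"},
    "Bigtable": {"role": "storage", "fit": "operational workloads with high read/write throughput"},
    "Cloud Pub/Sub": {"role": "messaging", "fit": "event-driven messaging, not bulk storage"},
    "Cloud Dataproc": {"role": "processing", "fit": "managed Spark and Hadoop clusters"},
    "Cloud Dataflow": {"role": "processing", "fit": "real-time and batch pipelines"},
    "Cloud Functions": {"role": "processing", "fit": "lightweight event-triggered functions"},
}


def services_with_role(role):
    """Return the names of all services recorded under the given role, sorted."""
    return sorted(name for name, info in SERVICE_FIT.items() if info["role"] == role)


def describe(option):
    """Summarize an answer option given as a tuple of service names."""
    return "; ".join(f"{name}: {SERVICE_FIT[name]['fit']}" for name in option)


if __name__ == "__main__":
    # Print the summary for each of the four answer options in Question 23.
    for option in [
        ("Bigtable", "Cloud Dataflow"),
        ("Cloud Pub/Sub", "Cloud Dataflow"),
        ("BigQuery", "Cloud Dataproc"),
        ("Cloud Storage", "Cloud Functions"),
    ]:
        print(describe(option))
```

Printing the four options side by side makes it easier to see why the explanation treats Cloud Pub/Sub as a messaging layer rather than a storage choice.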

https://cloud.google.com/dataproc/docs

https://cloud.google.com/bigquery/docs

Question 24

You need to connect to your Linux virtual machine in Google Cloud Platform (GCP). In which cases do you not have to generate SSH keys yourself? Select all that apply.

Connection via third-party tool.

Connection via OpenSSH.

Connection via Google Cloud CLI.

Connection via Google Cloud Console.

Question 25

You are working as a data engineer and you have been asked to develop a real-time predictive Machine Learning pipeline in Google Cloud. Your company's data is streaming continuously and stored in Cloud Storage. The Machine Learning models must be retrained frequently with the most recent data and then deployed to Cloud AI Platform. Which of the following approaches would be most appropriate to achieve this?

Use Cloud Scheduler to periodically trigger model training with the most recent data and then deploy the trained model to AI Platform.

Use Cloud Functions to trigger model retraining whenever new data arrives in Cloud Storage, then deploy the trained model to AI Platform.

Use Cloud Pub/Sub to publish new data, Cloud Dataflow to preprocess it and update BigQuery, then use Cloud Composer to orchestrate model training in AutoML Tables and deployment to AI Platform.

Use Google Kubernetes Engine to run a continuous delivery pipeline that reads the streaming data, trains the model on AutoML Tables, and deploys it to AI Platform.

Question 26

A global media company has hired you to design a data pipeline that can process both batch and real-time data streams from different regions worldwide. The pipeline should be able to scale automatically based on the volume of data, support complex transformations, and make the data available in near real-time for a machine learning model hosted on AI Platform. The system should also be fault-tolerant to ensure data reliability. What would be your recommendation for such a system?

Use Cloud Pub/Sub for data ingestion, Cloud Functions for processing, and BigQuery for storage. Feed data to AI Platform from BigQuery.

Use Cloud Storage for data ingestion, Cloud Dataproc running Apache Beam jobs for processing, and Cloud Bigtable for storage. Feed data to AI Platform from Bigtable.

Use Cloud Pub/Sub for data ingestion, Cloud Dataflow for processing, and BigQuery for storage. Feed data to AI Platform from BigQuery.

Use Cloud Pub/Sub for data ingestion, Cloud Dataflow for processing, and Firestore for storage. Feed data to AI Platform from Firestore.

Question 27

A marketing company wants to build a solution for processing large amounts of customer data to predict customer behavior. They want to be able to run complex machine learning algorithms on this data and scale their processing power as needed. What solution would you recommend for this use case?

Use Apache Spark ML on Google Cloud Dataproc to process the customer data and run the machine learning algorithms.

Use Compute Engine instances to process the customer data and run the machine learning algorithms.

Use Cloud Dataflow to process the customer data and run the machine learning algorithms.

Use BigQuery to process the customer data and run the machine learning algorithms.

Question 28

As a data engineer, you have been asked to integrate data from a third-party vendor's RESTful API. The data from this API needs to be ingested in real time, transformed, and then stored in BigQuery for real-time analysis. Which approach would you use to implement this?

Use Cloud Data Fusion to connect to the RESTful API, transform the data, and load it into BigQuery.

Use Cloud Dataflow to continuously pull data from the API, process it, and store it in BigQuery.

Use Cloud Scheduler to trigger a Cloud Function every minute to pull data from the API, process it, and store it in BigQuery.

Use Cloud Pub/Sub to pull data from the API in real-time, and Cloud Dataflow to process and store the data in BigQuery.

Question 29

You are working on a data engineering project where you need to ingest streaming data and perform real-time analysis. The data comes in high volumes, and the processing needs to scale based on the data volume. You have chosen to use Google Cloud Platform for this project. What should you do to meet these requirements?

Use BigQuery alone for both data ingestion and real-time processing.

Use Cloud Storage for data ingestion and Dataproc for real-time processing.

Use Cloud Pub/Sub for data ingestion and Cloud Dataflow for real-time processing.

Use Cloud SQL for data ingestion and Dataflow for real-time processing.

Question 30

You're a data engineer in a financial organization. The company has built a machine learning model for fraud detection, deployed on Google AI Platform. The model needs continuous evaluation since fraudulent patterns can evolve over time. The prediction input and output are saved in BigQuery. Which approach should you use for continuous evaluation?

Use Data Studio to create a report that compares the model's predictions with actual outcomes.

Use Cloud Composer to schedule a workflow that compares the model's predictions with actual outcomes daily.

Use Cloud Scheduler to trigger BigQuery ML to evaluate the model's performance daily.

Use Cloud Functions to evaluate the model's performance every time a prediction is made.

Update History