As a data engineer, you need to prepare a resource hierarchy for your company. Suppose your company has two different applications, each with development and production environments. With Google's best practices in mind, what should you do?
Latest Update: Jun 08 2026
Your organization has a number of relational databases hosted on Google Cloud SQL. Your task is to design a pipeline to migrate this data to BigQuery for analysis. The data in the databases changes frequently, and you must ensure that changes are reflected in BigQuery within 5 minutes of being committed in the source databases. Which of the following options should you choose?
You are a data engineer for a large company that generates a significant amount of log data daily. For compliance reasons, this data must be retained for two years. The data is accessed frequently for the first 30 days, less frequently for the next 60 days, and rarely thereafter. Which of the following lifecycle management strategies would be the most cost-effective solution for this scenario?
You have been tasked with designing a data processing pipeline in Google Cloud Platform that ingests, processes, and stores clickstream data from a website. The clickstream data arrives at high volume, with an incoming rate of millions of events per second, and needs to be processed in real time for near-instantaneous analysis and visualization. What is the most appropriate architecture for this pipeline?
A medical imaging company is developing a deep learning model to assist with the diagnosis of diseases based on CT scans. The model is currently running on a standard CPU server, but processing is too slow to be useful in a clinical setting. Which hardware accelerator should the company use to reduce the model's processing time enough for clinical use?
Your company has collected a significant amount of IoT sensor data over the past year and plans to train an ML model to predict equipment failures. The dataset is very large (~500 TB) and is stored in Google Cloud Storage (GCS). You are tasked with choosing the appropriate training infrastructure. Considering the data size and the need for cost-effectiveness, which of the following should you use to train this model?
As a data engineer, you need to configure access to the Cloud SQL MySQL database. You want to be sure that traffic is encrypted while minimizing administrative tasks, such as managing SQL certificates. What should you do?
As a data engineer, you have built a multi-class classification model on Google Cloud's AI Platform. After deploying the model, you observe that the model's performance, as measured by macro-averaged F1 score, has started to decline over time. Which approach could potentially help you diagnose the cause for this performance degradation?
A media company wants to store and analyze large amounts of video content and metadata, including video titles, descriptions, and viewership statistics. The solution should be scalable, able to handle large amounts of data, and support fast search and retrieval of video content. Which of the following is the most appropriate solution for this scenario?
You want to set up streaming data inserts into a Redis cluster running on Compute Engine instances. Because the data contains PII, you need to encrypt it at rest with encryption keys that you can create, rotate, and destroy as needed. What should you do?