Download Google.Professional-Data-Engineer.Dump4Pass.2024-12-16.227q.vcex


File Info

Exam: Professional Data Engineer on Google Cloud Platform
Number: Professional-Data-Engineer
File Name: Google.Professional-Data-Engineer.Dump4Pass.2024-12-16.227q.vcex
Size: 877 KB
Posted: Dec 16, 2024


How to open VCEX & EXAM Files?

Files with VCEX & EXAM extensions can be opened by ProfExam Simulator.

Purchase

Coupon: MASTEREXAM
With discount: 20%






Demo Questions

Question 1

You are building a model to make clothing recommendations. You know a user’s fashion preference is likely to change over time, so you build a data pipeline to stream new data back to the model as it becomes available. 
How should you use this data to train the model? 
 


  A. Continuously retrain the model on just the new data.
  B. Continuously retrain the model on a combination of existing data and the new data.
  C. Train on the existing data while using the new data as your test set.
  D. Train on the new data while using the existing data as your test set.
Correct answer: B



Question 2

You designed a database for patient records as a pilot project to cover a few hundred patients in three clinics. 
Your design used a single database table to represent all patients and their visits, and you used self-joins to generate reports. The server resource utilization was at 50%. Since then, the scope of the project has expanded. The database must now store 100 times more patient records. You can no longer run the reports, because they either take too long or they encounter errors with insufficient compute resources. How should you adjust the database design? 
 


  A. Add capacity (memory and disk space) to the database server by the order of 200.
  B. Shard the tables into smaller ones based on date ranges, and only generate reports with prespecified date ranges.
  C. Normalize the master patient-record table into the patient table and the visits table, and create other necessary tables to avoid self-join.
  D. Partition the table into smaller tables, with one for each clinic. Run queries against the smaller table pairs, and use unions for consolidated reports.
Correct answer: C
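
The normalization described in option C can be sketched as follows. This is only an illustration using Python's built-in sqlite3 module; the table names, column names, and sample rows are assumptions, not part of the question. The report joins the patient table to the visits table instead of joining one wide table to itself.

```python
import sqlite3

# In-memory database; the schema below is a hypothetical example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patients (
        patient_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL
    );
    CREATE TABLE visits (
        visit_id   INTEGER PRIMARY KEY,
        patient_id INTEGER NOT NULL REFERENCES patients(patient_id),
        clinic     TEXT NOT NULL,
        visit_date TEXT NOT NULL
    );
    -- Index the foreign key so the join does not scan every visit.
    CREATE INDEX idx_visits_patient ON visits(patient_id);
""")
conn.executemany("INSERT INTO patients VALUES (?, ?)", [(1, "Ada"), (2, "Grace")])
conn.executemany(
    "INSERT INTO visits (patient_id, clinic, visit_date) VALUES (?, ?, ?)",
    [(1, "North", "2024-01-05"), (1, "North", "2024-02-11"), (2, "South", "2024-01-20")],
)


def visits_per_patient(connection):
    """Report visit counts with a plain join between the two tables."""
    rows = connection.execute("""
        SELECT p.name, COUNT(v.visit_id)
        FROM patients AS p
        LEFT JOIN visits AS v ON v.patient_id = p.patient_id
        GROUP BY p.patient_id
        ORDER BY p.patient_id
    """)
    return rows.fetchall()
```

Each patient is stored once, and each visit is one short row that points back to its patient, so reports no longer need a self-join.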



Question 3

You create an important report for your large team in Google Data Studio 360. The report uses Google BigQuery as its data source. You notice that visualizations are not showing data that is less than 1 hour old. 
What should you do? 
 


  A. Disable caching by editing the report settings.
  B. Disable caching in BigQuery by editing table details.
  C. Refresh your browser tab showing the visualizations.
  D. Clear your browser history for the past hour then reload the tab showing the visualizations.
Correct answer: A
Explanation:
Reference: https://support.google.com/datastudio/answer/7020039?hl=en  
 



Question 4

Your weather app queries a database every 15 minutes to get the current temperature. The frontend is powered by Google App Engine and serves millions of users. How should you design the frontend to respond to a database failure?
 


  A. Issue a command to restart the database servers.
  B. Retry the query with exponential backoff, up to a cap of 15 minutes.
  C. Retry the query every second until it comes back online to minimize staleness of data.
  D. Reduce the query frequency to once every hour until the database comes back online.
Correct answer: B
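
A minimal sketch of capped exponential backoff, as named in option B. The `query` callable, the 1-second base delay, the 12-attempt budget, and the jitter factor are assumptions made for illustration; only the 15-minute cap comes from the question.

```python
import random
import time

MAX_DELAY_SECONDS = 15 * 60  # cap from the question: 15 minutes


def backoff_delays(base=1.0, cap=MAX_DELAY_SECONDS, attempts=12):
    """Yield delays that double on each attempt and never exceed the cap."""
    for attempt in range(attempts):
        yield min(cap, base * (2 ** attempt))


def query_with_backoff(query, sleep=time.sleep, jitter=random.random):
    """Call `query` until it succeeds or the retry budget is spent.

    `query` is a hypothetical zero-argument callable that raises
    ConnectionError while the database is down.
    """
    last_error = None
    for delay in backoff_delays():
        try:
            return query()
        except ConnectionError as err:
            last_error = err
            # Jitter scales the wait into [0.5 * delay, delay] so that
            # many frontends do not all retry at the same instant.
            sleep(delay * (0.5 + 0.5 * jitter()))
    raise last_error
```

Doubling the wait keeps load off a recovering database, while the cap means a frontend never waits longer than the app's normal 15-minute refresh interval.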



Question 5

You are creating a model to predict housing prices. Due to budget constraints, you must run it on a single resource-constrained virtual machine. Which learning algorithm should you use? 


  A. Linear regression
  B. Logistic classification
  C. Recurrent neural network
  D. Feedforward neural network
Correct answer: A
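
To illustrate why linear regression suits a resource-constrained machine, here is a sketch of a one-feature least-squares fit in plain Python. The feature (floor area) and the sample numbers are invented for illustration; a real housing model would use more features.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross deviations from the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training data: floor area in square metres, price in dollars.
areas = [50, 80, 100, 120]
prices = [110000, 170000, 210000, 250000]
model = fit_line(areas, prices)
```

The fit needs a single pass over the data and stores only two numbers, whereas a neural network needs iterative training and far more memory.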



Question 6

Your company is using WILDCARD tables to query data across multiple tables with similar names. The SQL statement is currently failing with the following error: 
 
# Syntax error: Expected end of statement but got "-" at [4:11]
SELECT age
FROM
bigquery-public-data.noaa_gsod.gsod
WHERE
age != 99
AND _TABLE_SUFFIX = '1929'
ORDER BY
age DESC
 
Which table name will make the SQL statement work correctly? 
 


  A. 'bigquery-public-data.noaa_gsod.gsod'
  B. bigquery-public-data.noaa_gsod.gsod*
  C. 'bigquery-public-data.noaa_gsod.gsod'*
  D. `bigquery-public-data.noaa_gsod.gsod*`
Correct answer: D
Explanation:
Reference: https://cloud.google.com/bigquery/docs/wildcard-tables  
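
The corrected statement can be sketched as a small Python helper that builds the query text. The helper name and its input checks are assumptions; the column and table names are copied from the question as-is. The key points are the backticks around the table name, which is required because the project ID contains hyphens, and the trailing `*` inside them.

```python
def wildcard_query(table_prefix, suffix):
    """Build a BigQuery Standard SQL query over a wildcard table.

    Hypothetical helper: it only assembles text and does not contact BigQuery.
    """
    if "`" in table_prefix or not suffix.isdigit():
        raise ValueError("unexpected characters in table prefix or suffix")
    return (
        "SELECT age\n"
        # Backticks quote the whole path, including the wildcard character.
        f"FROM `{table_prefix}*`\n"
        "WHERE age != 99\n"
        # _TABLE_SUFFIX matches whatever the * stands for in each table name.
        f"  AND _TABLE_SUFFIX = '{suffix}'\n"
        "ORDER BY age DESC"
    )


QUERY = wildcard_query("bigquery-public-data.noaa_gsod.gsod", "1929")
```

Printing `QUERY` shows the statement from the question with option D substituted for the table name.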
 



Question 7

You are designing a basket abandonment system for an ecommerce company. The system will send a message to a user based on these rules: 
  • No interaction by the user on the site for 1 hour
  • Has added more than $30 worth of products to the basket
  • Has not completed a transaction
You use Google Cloud Dataflow to process the data and decide if a message should be sent. How should you design the pipeline? 
 


  A. Use a fixed-time window with a duration of 60 minutes.
  B. Use a sliding time window with a duration of 60 minutes.
  C. Use a session window with a gap time duration of 60 minutes.
  D. Use a global window with a time-based trigger with a delay of 60 minutes.
Correct answer: C
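
The session-window logic in option C can be sketched in plain Python. This is not Dataflow or Apache Beam code, and the `Event` fields are assumptions for illustration: events from one user are grouped into sessions that close after a 60-minute gap, and each closed session is then checked against the three rules.

```python
from dataclasses import dataclass

GAP_SECONDS = 60 * 60     # 1 hour without interaction closes a session
MIN_BASKET_VALUE = 30.0   # basket must be worth more than $30


@dataclass
class Event:
    timestamp: int           # seconds since epoch
    basket_value: float      # basket total after this event
    purchased: bool = False  # True if this event completes a transaction


def split_sessions(events, gap=GAP_SECONDS):
    """Group one user's events into sessions separated by at least `gap`."""
    sessions = []
    for event in sorted(events, key=lambda e: e.timestamp):
        if sessions and event.timestamp - sessions[-1][-1].timestamp < gap:
            sessions[-1].append(event)   # still inside the current session
        else:
            sessions.append([event])     # gap reached: start a new session
    return sessions


def should_message(session):
    """Apply the basket rules to a session that has already closed."""
    return (session[-1].basket_value > MIN_BASKET_VALUE
            and not any(e.purchased for e in session))
```

In a streaming pipeline the runner decides when a session has closed. Fixed and sliding windows cut at clock boundaries instead, so they cannot tell "idle for an hour" from "active across a boundary".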



Question 8

Your company handles data processing for a number of different clients. Each client prefers to use their own suite of analytics tools, with some allowing direct query access via Google BigQuery. You need to secure the data so that clients cannot see each other’s data. You want to ensure appropriate access to the data. Which three steps should you take? (Choose three.) 
 


  A. Load data into different partitions.
  B. Load data into a different dataset for each client.
  C. Put each client’s BigQuery dataset into a different table.
  D. Restrict a client’s dataset to approved users.
  E. Only allow a service account to access the datasets.
  F. Use the appropriate identity and access management (IAM) roles for each client’s users.
Correct answer: BDF



Question 9

You want to process payment transactions in a point-of-sale application that will run on Google Cloud Platform. Your user base could grow exponentially, but you do not want to manage infrastructure scaling. 
Which Google database service should you use? 
 


  A. Cloud SQL
  B. BigQuery
  C. Cloud Bigtable
  D. Cloud Datastore
Correct answer: D



Question 10

You need to store and analyze social media postings in Google BigQuery at a rate of 10,000 messages per minute in near real-time. You initially design the application to use streaming inserts for individual postings. Your application also performs data aggregations right after the streaming inserts. You discover that the queries after streaming inserts do not exhibit strong consistency, and reports from the queries might miss in-flight data. How can you adjust your application design?
 


  A. Re-write the application to load accumulated data every 2 minutes.
  B. Convert the streaming insert code to batch load for individual messages.
  C. Load the original message to Google Cloud SQL, and export the table every hour to BigQuery via streaming inserts.
  D. Estimate the average latency for data availability after streaming inserts, and always run queries after waiting twice as long.
Correct answer: D








