What's Different About the New Professional Data Engineer Exam?
At the end of 2023, Google Cloud made a significant update to the Professional Data Engineer certification exam. This was not a minor refresh; the exam structure and content were completely replaced to better align with the skills and technologies that data engineers are expected to use in modern cloud environments.
The previous version of the exam had started to become outdated. It covered many topics that, while still relevant in some cases, no longer reflected the primary responsibilities of a data engineer working with Google Cloud. As a result, Google introduced an entirely new exam guide and shifted the focus of the certification to match the evolving landscape of data engineering.
This article summarizes what changed: which topics were reduced or removed, and which areas are now emphasized.
The Exam Guide Was Completely Replaced
The new exam guide rethinks the structure, focus areas, and assumptions about the data engineer role. The emphasis has shifted to align more closely with what data engineers are expected to design, build, and operationalize in today's cloud environments.
If you are studying for the certification, use the new guide. Materials based on the previous version will not be sufficient: candidates who rely on outdated resources will miss new topics and services that now make up a substantial part of the exam.
What You Will See Less Of
Several services and topics that were heavily featured on the old exam have been significantly reduced or removed. Key examples include:
Firestore (Datastore): Still relevant, but much less prominent. Expect to encounter Firestore only in specific scenarios rather than as a central focus.
Cloud SQL and Cloud Spanner: These relational database services are now covered only at a high level. Deep understanding of schema design and optimization is not emphasized.
Vertex AI and AutoML: Machine learning topics such as overfitting, hyperparameters, model types, and feature engineering are no longer central. Familiarity is helpful but no longer critical.
Cloud Natural Language API and Cloud Vision API: These specialized ML APIs are not significant parts of the exam anymore.
Compute Engine and Kubernetes Engine (GKE): While basic knowledge remains important, detailed configuration and management are not tested in depth.
Cloud IAM: Understanding roles and permissions remains important, but the exam no longer focuses heavily on detailed IAM policy writing.
Command Line and Code Usage: Candidates should still be able to use the gcloud CLI and basic commands, but memorizing syntax or writing scripts from scratch is not the primary focus.
Overall, the exam now places less emphasis on infrastructure management and machine learning engineering.
What You Will See More Of
The updated exam focuses more on real-world, business-driven data engineering solutions. Candidates are expected to understand how to architect, operationalize, and govern modern data systems. Key areas of increased focus include:
Organizational Data Sharing
Understanding how to securely and efficiently share data across teams, organizations, and clouds is now a core skill. Key services include:
BigLake: A storage engine that bridges data lakes and data warehouses.
BigQuery Omni: Lets you run BigQuery queries against data stored in AWS and Azure without first moving it into Google Cloud.
Analytics Hub: Facilitates secure data sharing within and between organizations.
Data Mesh concepts: Focused on decentralized ownership and management of data domains.
These services reflect the industry's move toward distributed, multi-cloud data ecosystems.
Data Integration (Low-Code Tools)
Building data pipelines with low-code solutions is increasingly important. Candidates should be familiar with:
Dataform: A tool for SQL-based workflow orchestration and transformation.
Cloud Workflows: A service for coordinating cloud services and APIs.
Cloud Data Fusion: A fully managed data integration (ETL/ELT) service with a visual pipeline builder.
Datastream: A serverless change data capture and replication service.
These tools enable faster development cycles and integration without extensive custom coding.
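To make the idea of change data capture concrete, here is a minimal Python sketch of what a replication target does with a stream of change events. The event shape (`op`, `key`, `row`) and the function names are hypothetical and chosen only for illustration; this is not Datastream's actual event format or API.

```python
from typing import Dict, Iterable, Optional

Row = Dict[str, object]
Table = Dict[int, Row]


def apply_change(table: Table, event: dict) -> None:
    """Apply one change event to an in-memory replica (hypothetical event shape)."""
    op = event["op"]
    key = event["key"]
    if op in ("insert", "update"):
        # Inserts and updates both upsert the latest version of the row.
        table[key] = event["row"]
    elif op == "delete":
        # Deleting a missing key is ignored so replays stay idempotent.
        table.pop(key, None)
    else:
        raise ValueError(f"unknown operation: {op!r}")


def replicate(events: Iterable[dict], table: Optional[Table] = None) -> Table:
    """Replay change events in order and return the resulting replica."""
    replica: Table = {} if table is None else table
    for event in events:
        apply_change(replica, event)
    return replica


events = [
    {"op": "insert", "key": 1, "row": {"name": "Ada"}},
    {"op": "insert", "key": 2, "row": {"name": "Linus"}},
    {"op": "update", "key": 1, "row": {"name": "Ada L."}},
    {"op": "delete", "key": 2},
]
replica = replicate(events)
```

The point of the sketch is that the target only needs the ordered stream of changes, not repeated full copies of the source table, which is the reason CDC suits low-latency replication.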
Operationalizing Solutions
Beyond building solutions, candidates must understand how to operationalize them in production environments. Topics include:
Cloud Build: Automating builds and deployments.
CI/CD Best Practices: Structuring pipelines for data applications.
Monitoring Solutions: Observability practices using Cloud Monitoring and related tools.
The focus is on ensuring that data pipelines and applications are reliable, maintainable, and scalable.
Security and Networking
Candidates must demonstrate an understanding of fundamental cloud networking and security concepts, including:
Cloud VPC: Virtual private cloud configurations for secure resource management.
Cloud NAT: Providing outbound internet access to private resources.
Cloud Firewall: Implementing firewall rules for traffic control.
Key Management Service (KMS): Managing encryption keys securely.
Security is treated as a first-class concern, integrated into architecture rather than added later.
Governance and Data Management
Proper governance of data resources is emphasized more heavily. Key services include:
Dataplex: Unified data management for organizing, securing, and governing lakes and warehouses.
Data Catalog: Metadata management for data discovery.
Org Policy Service: Enforcing organization-wide governance and compliance policies.
Understanding these services is critical for designing scalable and compliant data architectures.
Availability and Resilience
High availability and resilience design are major priorities. Candidates must understand:
Recovery Point Objectives (RPO): How much data loss is acceptable in a disaster.
Failover Protection: Ensuring system continuity during failures.
Backup Strategies: Implementing proper backups for services such as Cloud SQL, Memorystore, Cloud Storage, and BigQuery.
These considerations are central to building fault-tolerant, production-ready systems.
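As a rough illustration of how an RPO constrains a backup strategy, the following Python sketch checks whether a periodic backup schedule can satisfy a given RPO. It assumes a simplified model that is not taken from any Google Cloud documentation: each backup captures the state at the moment it starts, and a backup is only usable once it has finished. The function names are hypothetical.

```python
from datetime import timedelta


def worst_case_data_loss(
    backup_interval: timedelta,
    backup_duration: timedelta = timedelta(0),
) -> timedelta:
    """Upper bound on lost data under the simplified model above.

    If a failure strikes just before the next backup finishes, the newest
    usable backup reflects the state from one full interval plus one backup
    duration ago.
    """
    return backup_interval + backup_duration


def meets_rpo(
    backup_interval: timedelta,
    rpo: timedelta,
    backup_duration: timedelta = timedelta(0),
) -> bool:
    """Return True if the schedule's worst-case loss stays within the RPO."""
    return worst_case_data_loss(backup_interval, backup_duration) <= rpo


# A nightly backup cannot satisfy a one-hour RPO; a 15-minute schedule can.
nightly_ok = meets_rpo(timedelta(hours=24), rpo=timedelta(hours=1))
frequent_ok = meets_rpo(timedelta(minutes=15), rpo=timedelta(hours=1))
```

Under this model, tightening the RPO means backing up more often or switching to continuous replication, which is the kind of trade-off the exam scenarios ask you to reason about.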
The Big Trend: More Breadth, Less Depth
One major change in the updated exam is a move toward greater breadth with less technical depth in individual services.
Technical knowledge remains important, but the exam now tests a candidate's ability to design complete solutions across multiple services rather than asking in-depth configuration questions about any single service.
For example, whereas the old exam might have required detailed knowledge of Kubernetes cluster settings or machine learning hyperparameter tuning, the new exam focuses more on how you would integrate services like BigLake, Dataform, and Dataplex into a cohesive solution to meet specific business requirements.
This change reflects the way that modern data engineering is practiced: building architectures that span services, manage risk, ensure resilience, and maintain governance, rather than deep specialization in one tool.
The expectation is that candidates can think architecturally and understand trade-offs across different Google Cloud services.
Final Thoughts
The 2023 overhaul of the Professional Data Engineer exam marks a fundamental shift in how Google Cloud defines the role of a data engineer.
Candidates who are preparing for the exam today should adapt their study plans to focus on architecture patterns, governance, security, operationalization, and modern data sharing and integration strategies.
Detailed infrastructure management and machine learning workflows, while still valuable skills, are no longer the centerpiece of the exam.
Focusing on the new topics and services introduced in the exam guide, and practicing the design of real-world solutions that meet business needs, will position candidates for success.
Understanding these changes matters not just for passing the exam, but for being effective in data engineering roles that increasingly demand broad architectural thinking, cross-team collaboration, and integrated cloud-native solutions.