Responsibilities:
- GCP Solution Architecture & Implementation: Architect and implement data solutions on Google Cloud Platform (GCP), leveraging its various components.
- End-to-End Data Pipeline Development: Design and create end-to-end data pipelines using technologies such as Apache Beam, Google Dataflow, or Apache Spark (see the sketch after this list).
- Data Ingestion & Transformation: Implement data pipelines that automate the ingestion, transformation, and augmentation of data sources, and apply best practices for pipeline operations.
- Data Technologies Proficiency: Work with Python, Hadoop, Spark, SQL, BigQuery, BigTable, Cloud Storage, Datastore, Spanner, Cloud SQL, and Machine Learning services.
- Database Expertise: Demonstrate expertise in at least two of the following: relational databases, analytical databases, or NoSQL databases.
- SQL Development & Data Mining: Possess expert knowledge of SQL development and experience in data mining (SQL, ETL, data warehousing, etc.) on complex datasets in a business environment.
- Data Integration & Preparation: Build data integration and preparation tools using cloud technologies (such as SnapLogic, Google Dataflow, Cloud Dataprep, Python, etc.).
- Data Quality & Regulatory Compliance: Identify the downstream implications of data loads and migrations, considering aspects such as data quality and regulatory compliance.
- Scalable Data Solutions: Develop scalable data solutions that simplify user access to massive datasets and can adapt to a rapidly changing business environment.
- Programming: Be proficient in programming languages such as Java and Python.
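To make the pipeline-development responsibility concrete, here is a minimal sketch of an end-to-end pipeline using the Apache Beam Python SDK. The bucket, project, table, and schema names are hypothetical placeholders; the same pipeline runs locally with the DirectRunner or on Google Dataflow by passing the appropriate runner option.

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

def parse_order(line):
    # Hypothetical CSV schema: order_id,amount
    order_id, amount = line.split(",")
    return {"order_id": order_id, "amount": float(amount)}

# Pass --runner=DataflowRunner --project=... --region=... to run on Dataflow.
options = PipelineOptions()

with beam.Pipeline(options=options) as p:
    (
        p
        | "ReadFromGCS" >> beam.io.ReadFromText("gs://example-bucket/orders.csv")
        | "ParseRows" >> beam.Map(parse_order)
        | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
            "example-project:analytics.orders",
            schema="order_id:STRING,amount:FLOAT",
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        )
    )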
Required Skills:
- GCP Data Engineering Expertise: Strong experience with GCP data engineering, including BigQuery, SQL, Cloud Composer/Python, Cloud Functions, Dataproc + PySpark, Python-based data ingestion, and Dataflow + Pub/Sub.
- Expert knowledge of Google Cloud Platform; experience with other cloud platforms is a plus.
- Expert knowledge of SQL development.
- Expertise in building data integration and preparation tools using cloud technologies (such as SnapLogic, Google Dataflow, Cloud Dataprep, Python, etc.).
- Proficiency with Apache Beam, Google Dataflow, or Apache Spark in creating end-to-end data pipelines (a PySpark sketch follows this list).
- Experience in some of the following: Python, Hadoop, Spark, SQL, BigQuery, BigTable, Cloud Storage, Datastore, Spanner, Cloud SQL, and Machine Learning.
- Proficiency in programming in Java, Python, etc.
- Expertise in at least two of these technologies: relational databases, analytical databases, or NoSQL databases.
- Strong analytical and problem-solving skills.
- Ability to work in a rapidly changing business environment.
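As an illustration of the Dataproc + PySpark skills above, here is a minimal sketch that ingests raw files from Cloud Storage, applies a simple transformation, and loads the result into BigQuery. It assumes the spark-bigquery connector is available on the cluster; all bucket, project, dataset, and column names are hypothetical.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("orders-transform").getOrCreate()

# Ingest raw CSV files from Cloud Storage (hypothetical path).
orders = (spark.read.option("header", True)
          .option("inferSchema", True)
          .csv("gs://example-bucket/raw/orders/"))

# Simple transformation: total order amount per customer.
totals = orders.groupBy("customer_id").sum("amount")

# Load the result into BigQuery via the spark-bigquery connector (hypothetical table).
(totals.write.format("bigquery")
    .option("table", "example-project.analytics.customer_totals")
    .option("temporaryGcsBucket", "example-tmp-bucket")
    .mode("overwrite")
    .save())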
Certifications (Major Advantage):
- Certified as a Google Professional Data Engineer or Professional Cloud Architect.

Skills Required:
Google Cloud Platform, SQL Development, Python, Hadoop, Spark, Relational Databases