Description :
Location : Pune (Kharadi), India
Experience : 8+ Years
About the Role :
We are seeking a highly skilled and experienced Senior Data Engineer to join our team in Pune. In this role, you will design, build, and optimize our large-scale data processing systems and platforms. As a senior member of the team, you will apply your deep expertise in Python, PySpark, Databricks, and the AWS data ecosystem to solve complex data challenges and drive the next generation of our data infrastructure.
Key Responsibilities :
- Data Pipeline Development : Lead the design, construction, and maintenance of robust, scalable, and high-performance Big Data pipelines using Python and PySpark.
- Big Data Engineering : Apply 8+ years of experience as a Big Data Engineer to architect and manage solutions using core technologies like Hadoop and Apache Spark.
- Cloud & Platform Management : Serve as a subject matter expert for the Databricks platform, effectively leveraging its capabilities for data processing, governance, and model deployment.
- Data Structuring & Optimization : Implement and optimize data storage solutions, making extensive use of Delta Tables and working proficiently with JSON and Parquet file formats.
- Problem Solving : Tackle and resolve complex data processing, transformation, and integration problems, ensuring data quality, reliability, and low latency across all systems.
- Database Integration : Utilize a solid working knowledge of NoSQL and RDBMS databases for efficient data ingestion, storage, and retrieval strategies.
- Collaboration & Communication : Collaborate effectively with cross-functional teams, including Data Scientists, Analysts, and other engineering groups, to understand requirements and deliver technical solutions.
Required Qualifications :
- Experience : 8+ years of professional experience as a Big Data Engineer or in a closely related data engineering role.
- Programming Proficiency : Must be proficient with Python and PySpark.
- Core Big Data : In-depth, practical knowledge of Hadoop and Apache Spark.
- Data Lake / Warehouse Technologies : Extensive hands-on experience with Databricks and expert-level usage of Delta Tables.
- File Formats : Must have extensive experience working with and optimizing data stored in JSON and Parquet file formats.
- Cloud & Ecosystem : In-depth knowledge of AWS data analytics services.
- Database Knowledge : Solid working knowledge of NoSQL and RDBMS databases.
- Problem Solving : Proven ability to analyze and solve complex data processing, transformation, and optimization problems.
Good-to-Have Qualifications :
- AWS Data Services : Hands-on experience with specific AWS data analytics services like Athena, Glue, Redshift, and EMR.
- Data Warehousing : Familiarity with Data Warehousing concepts and best practices.