Although the role category specified in the GPP is Remote, this position requires a Hybrid work arrangement.
Key Responsibilities :
- Design and Automation : Deploy distributed systems for ingesting and transforming data from various sources (relational, event-based, unstructured).
- Data Quality and Integrity : Implement frameworks to monitor and troubleshoot data quality and integrity issues.
- Data Governance : Establish processes for managing metadata, access, and retention for internal and external users.
- Data Pipelines : Build reliable, efficient, scalable, high-quality data pipelines with monitoring and alert mechanisms, using ETL / ELT tools or scripting languages (an illustrative sketch follows this list).
- Database Structure : Design and implement physical data models to optimize database performance through efficient indexing and table relationships.
- Optimization and Troubleshooting : Optimize, test, and troubleshoot data pipelines.
- Large Scale Solutions : Develop and operate large-scale data storage and processing solutions using distributed and cloud-based platforms (e.g., Data Lakes, Hadoop, HBase, Cassandra, MongoDB, Accumulo, DynamoDB).
- Automation : Use modern tools and techniques to automate common, repeatable, and tedious data preparation and integration tasks.
- Infrastructure Renovation : Renovate data management infrastructure to drive automation in data integration and management.
- Agile Development : Ensure the success of critical analytics initiatives using agile development practices such as DevOps, Scrum, and Kanban.
- Team Development : Coach and develop less experienced team members.
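A minimal, illustrative sketch of the kind of pipeline work described in the Data Pipelines responsibility above: ingest raw data, apply transformations, run a basic data-quality check, and write partitioned output with PySpark. The paths, columns, and the "orders" dataset are hypothetical placeholders, not details of this role.

```python
# Illustrative PySpark pipeline: ingest -> transform -> quality check -> write.
# Paths, columns, and the "orders" dataset are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_pipeline").getOrCreate()

# Ingest: read raw event data (e.g., JSON landed by an upstream source).
raw = spark.read.json("/data/raw/orders/")

# Transform: normalize types, derive a load date, and de-duplicate on the key.
orders = (
    raw.withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("load_date", F.current_date())
       .dropDuplicates(["order_id"])
)

# Data-quality check: fail fast (and alert) if required keys are missing.
null_keys = orders.filter(F.col("order_id").isNull()).count()
if null_keys > 0:
    raise ValueError(f"{null_keys} rows missing order_id; aborting load")

# Write: partitioned columnar output for downstream consumers.
orders.write.mode("append").partitionBy("load_date").parquet("/data/curated/orders/")
```

In practice such a job would run under an orchestrator with monitoring and alerting wrapped around the quality check; the snippet only shows the core shape.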
External Qualifications and Competencies
Qualifications :
College, university, or equivalent degree in a relevant technical discipline, or equivalent experience required. Licensing may be required for compliance with export controls or sanctions regulations.
Competencies :
- System Requirements Engineering : Translate stakeholder needs into verifiable requirements; establish acceptance criteria; track requirements status; assess impact of changes.
- Collaboration : Build partnerships and work collaboratively to meet shared objectives.
- Communication : Develop and deliver communications that convey a clear understanding of the unique needs of different audiences.
- Customer Focus : Build strong customer relationships and deliver customer-centric solutions.
- Decision Quality : Make good and timely decisions to keep the organization moving forward.
- Data Extraction : Perform ETL activities from various sources using appropriate tools and technologies.
- Programming : Create, write, and test computer code, test scripts, and build scripts to meet business, technical, security, governance, and compliance requirements.
- Quality Assurance Metrics : Apply measurement science to assess solution outcomes using ITOM, SDLC standards, tools, metrics, and KPIs.
- Solution Documentation : Document information and solutions to enable improved productivity and effective knowledge transfer.
- Solution Validation Testing : Validate configuration item changes or solutions using SDLC standards, tools, and metrics.
- Data Quality : Identify, understand, and correct data flaws to support effective information governance.
- Problem Solving : Solve problems using systematic analysis processes; implement robust, data-based solutions; prevent problem recurrence.
- Values Differences : Recognize the value of different perspectives and cultures.
Additional Responsibilities Unique to this Position
Skills :
- ETL / Data Engineering Solution Design and Architecture : Expert level.
- SQL and Data Modeling : Expert level (ER Modeling and Dimensional Modeling); see the sketch after this list.
- Team Leadership : Ability to lead a team of data engineers.
- MSBI (SSIS, SSAS) : Experience required.
- Databricks (PySpark) and Python : Experience required.
- Additional Skills : Snowflake, Power BI, Neo4j (good to have).
- Communication : Good communication skills.
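As a point of reference for the SQL and dimensional-modeling skill above, here is a brief, illustrative Spark SQL sketch of a star-schema query (a fact table joined to conformed dimensions, then aggregated). The fact_sales, dim_product, and dim_date tables and their columns are hypothetical examples, not part of this posting.

```python
# Illustrative star-schema (dimensional model) query with Spark SQL.
# Table and column names are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("star_schema_demo").getOrCreate()

# Register hypothetical fact and dimension tables as temporary views.
spark.read.parquet("/data/curated/fact_sales/").createOrReplaceTempView("fact_sales")
spark.read.parquet("/data/curated/dim_product/").createOrReplaceTempView("dim_product")
spark.read.parquet("/data/curated/dim_date/").createOrReplaceTempView("dim_date")

# Typical dimensional query: fact rows joined to dimensions, then aggregated
# along dimension attributes (year, month, product category).
monthly_revenue = spark.sql("""
    SELECT d.year, d.month, p.category, SUM(f.net_amount) AS revenue
    FROM fact_sales f
    JOIN dim_product p ON f.product_key = p.product_key
    JOIN dim_date d ON f.date_key = d.date_key
    GROUP BY d.year, d.month, p.category
""")
monthly_revenue.show()
```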
Preferred Experience :
- 8+ years of overall experience.
- 5+ years of relevant experience in data engineering.
- Knowledge of the latest technologies and trends in data engineering.
- Technologies : Familiarity with analyzing complex business systems, industry requirements, and data regulations.
- Big Data Platform : Design and development using open source and third-party tools.
- Tools : Spark, Scala / Java, MapReduce, Hive, HBase, Kafka.
- SQL : Proficiency in the SQL query language.
- Cloud-Based Implementation : Experience with clustered compute cloud-based implementations.
- Large File Movement : Experience developing applications requiring large file movement in cloud environments.
- Analytical Solutions : Experience building analytical solutions.
- IoT Technology : Intermediate experience preferred.
- Agile Software Development : Intermediate experience preferred.
Role : Data Engineer
Industry Type : Industrial Equipment / Machinery
Department : Engineering - Software & QA
Employment Type : Full Time, Permanent
Role Category : Software Development
Education
UG : B.Tech / B.E. in Any Specialization
PG : Any Postgraduate
Skills Required
PySpark, Snowflake, Databricks, Python, SQL