About Us
About DATAECONOMY: We are a fast-growing data & analytics company headquartered in Dublin, OH, with an office in Providence, RI, and an advanced technology center in Hyderabad, India. We are clearly differentiated in the data & analytics space through our suite of solutions, accelerators, frameworks, and thought leadership.
Job Description
We are seeking a highly experienced, hands-on Lead / Senior Data Engineer to architect, develop, and optimize data solutions in a cloud-native environment. The ideal candidate will have 7–12 years of experience and strong technical expertise in AWS Glue, PySpark, and Python, along with experience designing robust data pipelines and frameworks for large-scale enterprise systems. Prior exposure to the financial domain or other regulated environments is a strong advantage.
Key Responsibilities
- Solution Architecture: Design scalable and secure data pipelines using AWS Glue, PySpark, and related AWS services (EMR, S3, Lambda, etc.).
- Leadership & Mentorship: Guide junior engineers, conduct code reviews, and enforce best practices in development and deployment.
- ETL Development: Lead the design and implementation of end-to-end ETL processes for structured and semi-structured data (a minimal sketch follows this list).
- Framework Building: Develop and evolve data frameworks, reusable components, and automation tools to improve engineering productivity.
- Performance Optimization: Optimize large-scale data workflows for performance, cost, and reliability.
- Data Governance: Implement data quality, lineage, and governance strategies in compliance with enterprise standards.
- Collaboration: Work closely with product, analytics, compliance, and DevOps teams to deliver high-quality solutions aligned with business goals.
- CI/CD Automation: Set up and manage continuous integration and deployment pipelines using AWS CodePipeline, Jenkins, or GitLab.
- Documentation & Presentations: Prepare technical documentation and present architectural solutions to stakeholders across levels.
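For illustration, here is a minimal sketch of the kind of Glue/PySpark ETL job this role would own. The bucket paths, column names, and job structure are hypothetical; a production pipeline would add schema validation, error handling, and job bookmarks.

```python
# Minimal AWS Glue ETL sketch (hypothetical bucket paths and column names).
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from pyspark.sql import functions as F

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read semi-structured JSON from a raw S3 zone into a DynamicFrame.
raw = glue_context.create_dynamic_frame.from_options(
    connection_type="s3",
    connection_options={"paths": ["s3://example-raw-zone/trades/"]},
    format="json",
)

# Clean with plain PySpark: drop records missing a key, derive a partition date.
trades = (
    raw.toDF()
    .dropna(subset=["trade_id"])
    .withColumn("trade_date", F.to_date("trade_ts"))
)

# Write partitioned Parquet to the curated zone.
trades.write.mode("overwrite").partitionBy("trade_date").parquet(
    "s3://example-curated-zone/trades/"
)

job.commit()
```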
Requirements
Required Qualifications:
- 7–12 years of experience in data engineering or related fields.
- Strong expertise in Python programming with a focus on data processing.
- Extensive experience with AWS Glue (both Glue jobs and Glue Studio/notebooks).
- Deep hands-on experience with PySpark for distributed data processing.
- Solid AWS knowledge: EMR, S3, Lambda, IAM, Athena, CloudWatch, Redshift, etc.
- Proven experience architecting and managing complex ETL workflows.
- Proficiency with Apache Airflow or similar orchestration tools (see the DAG sketch after this list).
- Hands-on experience with CI/CD pipelines and DevOps best practices.
- Familiarity with data quality, data lineage, and metadata management.
- Strong experience working in agile/scrum teams.
- Excellent communication and stakeholder engagement skills.
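As a concrete example of the orchestration work mentioned above, a minimal Airflow DAG that schedules the Glue job sketched earlier might look like the following. The DAG id, job name, and schedule are illustrative, and the GlueJobOperator import assumes Airflow 2.4+ with the apache-airflow-providers-amazon package.

```python
# Minimal Airflow DAG sketch that triggers a Glue job
# (hypothetical DAG id and Glue job name).
from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.operators.glue import GlueJobOperator

with DAG(
    dag_id="trades_daily_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    GlueJobOperator(
        task_id="run_trades_curation",
        job_name="trades-curation-job",  # hypothetical Glue job name
        wait_for_completion=True,
    )
```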
Preferred / Good To Have:
- Experience in financial services, capital markets, or compliance systems.
- Knowledge of data modeling, data lakes, and data warehouse architecture.
- Familiarity with SQL (Athena/Presto/Redshift Spectrum).
- Exposure to ML pipeline integration or event-driven architecture is a plus.
Benefits
- Flexible work culture and remote options
- Opportunity to lead cutting-edge cloud data engineering projects
- Skill-building in large-scale, regulated environments
Skills Required
S3, PySpark, AWS Glue, EMR, Redshift, Apache Airflow, Jenkins, Lambda, CloudWatch, IAM, GitLab, Python