(Immediate Joiners only)
Job Description – Data Engineer (3–5 Years Experience)
Location: Onsite (Pune)
Employment Type: Full-time
Experience: 3 to 5 Years
Department: Data Engineering
About The Machine Learning Company (TMLC)
At TMLC, we build enterprise-grade AI and Data solutions that drive measurable business impact. From predictive analytics to intelligent automation, our teams deliver end-to-end data platforms and AI products for clients across healthcare, BFSI, retail, and manufacturing domains.
We are expanding our Data Engineering practice and are looking for hands-on engineers who can design, build, and optimize robust data pipelines and warehousing solutions for large-scale enterprise systems.
Role Overview
As a Data Engineer, you will be responsible for developing and maintaining data pipelines, integrating multiple data sources, and ensuring efficient data flow into analytical and reporting systems. You will work closely with data scientists, analysts, and business teams to deliver high-quality, scalable data infrastructure that supports advanced analytics and AI use cases.
Key Responsibilities
- Design, develop, and maintain ETL/ELT pipelines using Python and SQL.
- Build and manage data warehouse models (Star/Snowflake schemas, fact/dimension design).
- Ensure data accuracy, consistency, and performance across multiple systems.
- Collaborate with AI/ML and BI teams to integrate pipelines with analytical and visualization layers.
- Optimize SQL queries and data workflows for performance and scalability.
- Monitor and troubleshoot data pipelines, ensuring reliability and minimal downtime.
- Contribute to automation and CI/CD for data workflows using version control and deployment tools.
- Document data flows, schemas, and best practices for ongoing reference.
Required Skills & Experience
- 3–5 years of hands-on experience in Data Engineering or ETL Development.
- Strong expertise in SQL (complex joins, stored procedures, performance tuning).
- Proficiency in Python for data manipulation and scripting (Pandas, NumPy, etc.).
- Good understanding of Data Warehousing concepts: schema design, dimensional modeling, and query optimization.
- Experience with ETL tools or frameworks (e.g., Airflow, Apache Beam, Pentaho, Talend).
- Exposure to version control systems (Git) and basic CI/CD practices.
- Strong analytical and problem-solving skills with attention to detail.
Preferred / Add-on Skills
- Experience in Google Cloud Platform (GCP): BigQuery, Cloud Storage, Dataflow, Pub/Sub, or Composer.
- Familiarity with data orchestration frameworks and cloud-native pipeline deployment.
- Exposure to other cloud ecosystems (AWS, Azure) is a plus.
- Knowledge of data lake architectures and API-based data ingestion.
Education
- Bachelor’s or Master’s degree in Computer Science, Data Engineering, Information Systems, or a related field.
Why Join TMLC
- Work on cutting-edge AI and data platforms impacting large enterprises.
- Collaborate with a team of AI engineers, architects, and data scientists.
- Opportunity to grow into cloud or ML engineering roles.
- Competitive compensation and exposure to enterprise-grade, global projects.