About the Organization:
Impetus Technologies is a digital engineering company focused on delivering expert services and products to help enterprises achieve their transformation goals. We solve the analytics, AI, and cloud puzzle, enabling businesses to drive unmatched innovation and growth.
Founded in 1991, we are cloud and data engineering leaders providing solutions to Fortune 100 enterprises. Headquartered in Los Gatos, California, we have development centers in Noida, Indore, Gurugram, Bengaluru, Pune, and Hyderabad, and more than 3,000 global team members. We also have offices in Canada and Australia and collaborate with a number of established companies, including American Express, Bank of America, Capital One, Toyota, United Airlines, and Verizon.
Locations: Noida / Indore / Pune / Gurugram / Bengaluru
Job Overview:
We are seeking talented and experienced professionals in PySpark, Big Data technologies, and cloud solutions to join our team. The position spans three levels: Engineer, Module Lead, and Lead Engineer. In these roles, you will be responsible for developing, optimizing, and managing ETL pipelines in cloud and on-premises environments using Big Data tools and AWS services. You will collaborate with cross-functional teams to ensure that business requirements are met through efficient and scalable data solutions.
Key Responsibilities:
- ETL Pipeline Development: Design and develop efficient ETL pipelines as per business requirements while adhering to development standards and best practices.
- AWS Integration & Testing: Perform integration testing in AWS environments and ensure seamless data operations across platforms.
- Estimation & Planning: Provide estimates for development, testing, and deployments across various environments.
- Peer Reviews & Best Practices: Participate in peer code reviews, ensuring code quality and adherence to best practices, and promote continuous improvement within the team.
- Cost-Effective Solutions: Build and maintain cost-effective pipelines using AWS services such as S3, IAM, Glue, EMR, and Redshift.
- Cloud Migrations: Support and manage cloud migrations from on-premises environments to the cloud or between cloud environments.
- Orchestration & Scheduling: Manage job orchestration with tools such as Airflow or other relevant job schedulers.
Required Skills & Qualifications:
Experience:
- Engineer Level: 2-5 years of experience with PySpark, Hadoop, Hive, and related Big Data technologies.
- Module Lead Level: 4-6 years of experience with PySpark, Hadoop, Hive, and related Big Data technologies.
- Lead Engineer Level: 5-7 years of experience with PySpark, Hadoop, Hive, and related Big Data technologies.
Technical Skills:
- Hands-on experience with PySpark (DataFrame and SparkSQL), Hadoop, and Hive.
- Proficiency in Python and Bash scripting.
- Solid understanding of SQL and data warehouse concepts.
- Experience with AWS Big Data services (IAM, Glue, EMR, Redshift, S3, Kinesis) is a plus.
- Experience with orchestration tools (e.g., Airflow, job schedulers) is beneficial.
Analytical & Problem-Solving Skills:
- Strong analytical, problem-solving, and data analysis skills.
- Ability to think creatively and implement innovative solutions beyond readily available tools.
Communication & Interpersonal Skills:
- Excellent communication, presentation, and interpersonal skills to collaborate effectively with internal and external teams.
Desired Skills & Experience:
- Experience migrating workloads between on-premises systems and cloud environments.
- Familiarity with cloud-native technologies and platforms.
- Knowledge of performance optimization techniques for distributed data processing.
For a quick response, interested candidates can share their resume directly to anubhav.pathania@impetus.com, along with their notice period, current CTC, and expected CTC.