
Data Engineer - Python / Spark

Varite • Hyderabad

30+ days ago

Job description

About The Job :

Develops technical tools and programming to cleanse, organize and transform data and to maintain, protect and update data structures and integrity on an automated basis.
Applies data extraction, transformation, and loading techniques in order to tie together large data sets from a variety of sources.
Partners with both internal & external sources to design, build and oversee the deployment and operation of technology architecture, solutions and software.
Designs, develops and programs methods, processes and systems to capture, manage, store and utilize structured and unstructured data to generate actionable insights and solutions.
Responsible for the maintenance, improvement, cleaning, and manipulation of data in the business client's operational and analytics databases.
Proactively analyzes and evaluates the business client's databases in order to identify and recommend improvements and optimization.

Essential Job Functions :

Uses knowledge of existing and emerging data science engineering principles, theories, and techniques to inform business decisions and produce accurate business insights.

Completes projects and assignments of moderate scope and complexity under normal supervision to ensure customer and business needs are met.

Applies discretion and independent judgement to interpret data trends and summarize data insights.

Assists in preliminary data exploration and data preparation for accurate model development.

Establishes working relationships with others outside the area of Data Science Engineering expertise.

Prepares presentations of project outputs for external customers with assistance.

Design, develop, and maintain scalable data pipelines and systems for data processing.

Utilize Data Lakehouse, Spark on Kubernetes and related technologies to manage large-scale data processing.

Perform data ingestion from various sources such as APIs, RDBMS, NoSQL databases, Kafka, middleware, and files using Spark, and process the data into the Lakehouse platform.

Develop and maintain PySpark scripts for automation of data processing tasks.

Implement full and incremental data loading strategies to ensure data consistency and availability.

Orchestrate and monitor workflows using Apache Airflow.

Ensure code quality and version control using Git.

Troubleshoot and resolve data-related issues in a timely manner.

Stay up-to-date with the latest industry trends and technologies to continuously improve our data infrastructure.

Requirements :

Proven experience as a Data Engineer (ETL, data warehousing, data lakehouse).

Strong knowledge of Spark on Kubernetes, S3, and Docker images.

Proficiency in data engineering techniques with PySpark.

Strong experience in data warehousing techniques such as data mining, data analysis, and data profiling.

Experience with Python scripting for automation.

Expertise in full and incremental data loading techniques.

Excellent problem-solving skills and attention to detail.

Ability to work collaboratively in a team environment and communicate effectively with stakeholders.

Good to have :

Understanding of streaming data applications.

Hands-on experience with Apache Airflow for workflow orchestration.

Proficiency with Git for version control.

Understanding of data engineering integration with LLM or GenAI applications and vector databases.

Knowledge of shell scripting and of PostgreSQL, SQL Server, or MSBI.

(ref : hirist.tech)
