Role - Data Engineer
Experience - 3+ yrs
Location - Bangalore
Roles and Responsibilities:
Mandatory Qualifications:
Proficiency in either Scala or Python, expertise in Spark, and experience with performance tuning and optimization. dbt is also mandatory.
We are looking for a candidate with 2+ years of experience in a Data Engineer role.
Experience with big data tools: HDFS/S3, Spark/Flink, Hive, HBase, Kafka/Kinesis, etc.
Experience with relational SQL and NoSQL databases, including Elasticsearch, Cassandra, and MongoDB.
Experience with data pipeline and workflow management tools: Azkaban, Luigi, Airflow, etc.
Experience with AWS/GCP cloud services.
Experience with stream-processing systems: Spark Streaming, Flink, etc.
Experience with object-oriented and functional programming languages: Java, Scala, etc.
Experience building and optimizing ‘big data’ pipelines, architectures, and data sets.
Strong analytic skills related to working with structured and unstructured datasets.
Build processes supporting data transformation, data structures, dimensional modeling, metadata, dependency, schema registration/evolution, and workload management.
Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores.
Experience supporting and working with cross-functional teams in a dynamic environment.
Nice to Have:
Have a few weekend side projects up on GitHub.
Have contributed to an open-source project.
Have worked at a product company.
Have a working knowledge of a backend programming language.
Programming Language (Mandatory) → Candidate should be strong in either Scala or Python (not necessarily both).
Used mainly for writing data pipelines and transformations.
Big Data Framework (Mandatory) → Strong hands-on expertise in Apache Spark.
Must be able to write Spark jobs and also tune/optimize them for better performance (not just basic Spark usage).
Performance Tuning & Optimization (Mandatory) →
Able to handle large datasets efficiently.
Optimize Spark jobs (partitions, shuffles, caching, memory usage, etc.)
DBT (Mandatory) →
Must have hands-on experience with dbt (data build tool) for data transformations and data modeling.
dbt is typically used on top of warehouses such as BigQuery, Snowflake, and Redshift.
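For context, a dbt model is just a SELECT statement that dbt materializes in the warehouse; a minimal sketch follows, where the model name `user_event_counts` and the upstream model `stg_events` are hypothetical:

```sql
-- models/user_event_counts.sql (hypothetical model and source names)
-- dbt compiles ref() to the upstream model's table in the target warehouse.
{{ config(materialized='table') }}

select
    user_id,
    count(*) as event_count
from {{ ref('stg_events') }}
group by user_id
```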