DEUTSCHE TELEKOM DIGITAL LABS PRIVATE LIMITED, Gurugram
Job description
Responsibilities:
You will be involved in the design of data solutions using Hadoop-based technologies together with AWS.
Design and implement the core Big Data platform components: batch processing, live stream processing, in-memory cache, query layer (SQL), rule engine, and action framework.
Design and implement a data access layer that connects to various data sources and uses advanced caching techniques to serve real-time SQL queries with fast response times (see the caching sketch after this list).
Implement scalable solutions that keep pace with ever-increasing data volumes, using big data and cloud technologies such as Spark and Kafka (a minimal streaming sketch follows this list).
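
For illustration only: the stream-processing and Spark/Kafka responsibilities above could look something like the minimal Spark Structured Streaming sketch below, which reads JSON events from a Kafka topic and keeps a running count per event type. The broker address, topic name, and event schema are placeholder assumptions, not details from this posting, and the job presumes Spark 3.x with the spark-sql-kafka connector package available.

    # Minimal sketch, assuming Spark 3.x with the spark-sql-kafka connector on
    # the classpath; broker, topic, and schema are made-up placeholders.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, from_json
    from pyspark.sql.types import StructType, StructField, StringType

    spark = SparkSession.builder.appName("event-stream-counts").getOrCreate()

    schema = StructType([StructField("event_type", StringType())])

    events = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
        .option("subscribe", "events")                     # placeholder topic
        .load()
        .select(from_json(col("value").cast("string"), schema).alias("e"))
        .select("e.event_type")
    )

    # Running count per event type, printed to the console for demonstration.
    counts = events.groupBy("event_type").count()
    query = counts.writeStream.outputMode("complete").format("console").start()
    query.awaitTermination()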
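The data access layer responsibility is similarly open-ended; one common pattern is a cache in front of the backing stores so repeated queries are answered from memory. The sketch below is a deliberately simplified, single-process stand-in built on Python's functools.lru_cache, with run_query as a hypothetical placeholder for a real connector; a production layer would typically use an external cache such as Redis.

    # Simplified single-process stand-in for a cached data access layer.
    # run_query is a hypothetical placeholder for a real SQL/NoSQL connector.
    import time
    from functools import lru_cache

    def run_query(sql: str) -> list:
        """Pretend to hit a slow backing store."""
        time.sleep(0.5)  # simulate network + query latency
        return [("row", sql)]

    @lru_cache(maxsize=1024)
    def cached_query(sql: str) -> tuple:
        # Return an immutable tuple so callers cannot mutate the cached value.
        return tuple(run_query(sql))

    if __name__ == "__main__":
        t0 = time.time()
        cached_query("SELECT 1")   # cold: hits the backing store
        cold = time.time() - t0
        t0 = time.time()
        cached_query("SELECT 1")   # warm: served straight from the cache
        warm = time.time() - t0
        print(f"cold={cold:.3f}s warm={warm:.3f}s")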
Requirements:
Programming languages: good hands-on knowledge of at least one of Java, Scala, or Python.
Data stores: good understanding of SQL and NoSQL databases such as MySQL and MongoDB (a side-by-side query sketch follows this list).
Data infrastructure: familiarity with cloud solutions for data infrastructure, preferably AWS.
Data handling frameworks: good hands-on experience with at least one of Spark, Apache Beam, or Apache Flink.
File formats: familiarity with different data formats such as Apache Parquet, Avro, and ORC (a Parquet round-trip sketch follows this list).
Good understanding of how to set up and optimise data pipelines and the surrounding infrastructure for ETL jobs (an ETL sketch closes this section).
Proficiency with distributed file systems and distributed computation concepts.
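
To make the SQL/NoSQL requirement concrete, the sketch below runs the same logical lookup against MySQL (via pymysql) and MongoDB (via pymongo). All connection details, database, table, and collection names are illustrative assumptions.

    # Side-by-side lookup against MySQL and MongoDB; connection details and
    # database/table/collection names are made-up placeholders.
    import pymysql
    from pymongo import MongoClient

    # SQL: relational lookup with a parameterised query.
    conn = pymysql.connect(host="localhost", user="app",
                           password="secret", database="shop")
    with conn.cursor() as cur:
        cur.execute("SELECT name, email FROM users WHERE country = %s", ("DE",))
        print(cur.fetchall())
    conn.close()

    # NoSQL: the equivalent document query in MongoDB.
    client = MongoClient("mongodb://localhost:27017")
    users = client["shop"]["users"]
    print(list(users.find({"country": "DE"},
                          {"name": 1, "email": 1, "_id": 0})))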
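For the file-format requirement, the snippet below does a small Parquet round trip with pyarrow; Avro and ORC have analogous libraries. The file name and columns are placeholders. Reading back a single column demonstrates the column pruning that makes these columnar formats attractive for analytics.

    # Minimal Parquet round trip with pyarrow (pip install pyarrow).
    # The file name and columns are illustrative placeholders.
    import pyarrow as pa
    import pyarrow.parquet as pq

    table = pa.table({"user_id": [1, 2, 3], "country": ["DE", "IN", "DE"]})
    pq.write_table(table, "users.parquet")  # columnar, compressed on disk

    # Read back only one column: the reader skips the others entirely.
    back = pq.read_table("users.parquet", columns=["country"])
    print(back.to_pydict())  # {'country': ['DE', 'IN', 'DE']}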
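Finally, a sketch of the ETL point: a minimal PySpark batch job that reads raw CSV, filters out malformed rows, and writes partitioned Parquet. Paths and column names are assumptions for illustration; partitioning the output by a frequently filtered column is one of the standard pipeline optimisations the posting alludes to.

    # Minimal batch ETL sketch with PySpark; all paths and columns are
    # placeholder assumptions, not details from this posting.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col

    spark = SparkSession.builder.appName("daily-etl").getOrCreate()

    raw = spark.read.option("header", True).csv("s3://bucket/raw/events/")  # extract
    clean = raw.filter(col("event_type").isNotNull())                       # transform

    # Partitioning by date lets downstream queries prune files they don't need.
    (clean.write.mode("overwrite")
          .partitionBy("event_date")
          .parquet("s3://bucket/curated/events/"))                          # load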