Greetings from TCS!
Job Title : Senior Data Engineer in a Big Data environment
Required Skillset : Python, Spark, Databricks, AWS (S3, Glue, Airflow, CloudWatch, Lambda)
Location : Kolkata
Experience Range : 8+ years
Job Description
CONTEXT
B2C Data Platform is the ENGIE B2C data platform that enables all data-producing domains to make raw or reworked data available in a data lake. In particular, this platform is used to manage the functions and data of the Network Relations business unit (SDSI).
As such, it makes it possible to :
- Integrate and enrich raw data received from distributors
- Make distribution network operator data available to B2C and ENGIE, covering the requirements of all operational and analytical business areas
ROLE AND MISSION DESCRIPTION
Main tasks
- Master Databricks tools (job creation, clusters, notebooks) and query efficiently with SQL
- Keep the platform in operational condition in production (analyze and correct incidents and defects)
- Develop Python data ingestion and transformation jobs with Spark on large volumes of data
- Provide a long-term vision, both operationally and in terms of data platform strategy
- Support and promote best practices
- Participate in technical and functional design workshops
- Write and update technical documentation
Technical skills required, in order of priority :
MUST :
- Python
- Spark
- Databricks
- SQL
SHOULD :
- AWS (S3, Glue, Airflow, CloudWatch, Lambda, IAM)
COULD :
- Big Data
WOULD :
- Git
Methodologies
- CI/CD with GitLab
- JIRA / Confluence
- Scrum