Key Responsibilities :
The Data Engineer will be responsible for designing, building, maintaining, and evolving data products, applying the data architecture best suited to our organization's data needs in support of GPS
Responsible for delivering high-quality data products and analytics-ready data solutions
Develop and maintain data models to support our reporting and analysis needs.
Develop ad-hoc analytic solutions from solution design to testing, deployment, and full lifecycle management.
Optimize data storage and retrieval to ensure efficient performance and scalability
Collaborate with data architects, data analysts and data scientists to understand their data needs and ensure that the data infrastructure supports their requirements
Ensure data quality and integrity through data validation and testing
Implement and maintain security protocols to protect sensitive data
Stay up-to-date with emerging trends and technologies in data engineering and analytics
Proficient in working with data and analytics technologies such as MS Excel macros, Tableau, Cloudera Data Platform (CDP), Domino, AWS / Azure / MS cloud products, and ChatGPT
Participate in the full lifecycle (analysis, design, build, management, and operation) of the enterprise data lake and analytics-focused digital capabilities
Develop cloud-based (AWS) data pipelines to facilitate data processing and analysis
Define data operations support (for example, access management) and bring experience with the supporting tools
Build end-to-end data ETL pipelines from data integration -> data processing -> data integration -> visualization
Responsible for maintaining data acquisition / operations-focused capabilities, including: Data Catalog; User Access Request / Tracking; Data Use Request
Proficient in Python, Spark, SQL, AWS Redshift / DBs, S3, Glue / Glue-Studio, Athena, AI.
Good to have: experience with a metadata platform (such as CDP-Impala or Okera), Neo4j, IAM, CFT, and other native AWS services, plus familiarity with Domino and data lake principles.
Familiarity and experience with cloud infrastructure management, working closely with the cloud engineering team
Participate in effort and cost estimations when required
Partner with other data, platform, and cloud teams to identify opportunities for continuous improvements
Architect and develop data solutions according to legal and company guidelines
Assess system performance and recommend improvements
Required :
5-7 years of experience in the information technology field developing AWS cloud-native data lakes and ecosystems
Deep understanding of cloud technologies, preferably AWS and related services, for delivering and supporting data and analytics solutions / data lakes
Proficient in Python, Spark, SQL, AWS Redshift / DBs, S3, Glue / Glue-Studio, Athena, AI.
3-5 years of experience in operations and production support, including optimizing the supporting processes
At least 1-2 years of experience in an onshore / offshore delivery model
Solid programming skills in Python and Spark, and strong proficiency in AWS cloud
Knowledge of data security and privacy best practices
Ideal Candidates Would Also Have :
Prior experience in global life sciences, especially in the GPS functional area, is a plus
Experience working internationally with a globally dispersed team, including diverse stakeholders, and managing offshore technical development team(s)
Strong communication and presentation skills
Other Qualifications :
Bachelor’s degree in Computer Science, Information Systems, Computer Engineering or equivalent is preferred
Software Engineer II • Hyderabad, India