Role- Data Engineer
Location- Bangalore (Hybrid)
Experience- 4-6 yrs

About Scientist Technologies :
At Scientist Technologies, we believe in driving global progress through scientific innovation, engineering expertise, and policy collaboration. Our mission is to create solutions that empower businesses, societies, and public institutions to tackle the most pressing challenges of our time. We offer a dynamic, collaborative environment where your expertise contributes to groundbreaking projects that weave together science, business, and policy for equitable human progress.

Key Responsibilities :
- Design, develop, and maintain scalable ETL / ELT pipelines using Python and PySpark.
- Ensure efficient data movement, transformation, and integration from various data sources to target systems.
- Manage large-scale data storage systems like HDFS, S3, or Delta Lake.
- Develop and maintain agile, enterprise-grade data pipelines.
- Work with structured and unstructured datasets.
- Transform raw data into performant data models for analysts / scientists.
- Implement high-quality practices : data monitoring, alerting, and data quality (DQ) frameworks.
- Design and develop RESTful Web Services.
- Design and implement end-to-end flows for executing large-scale data mining and analytics processes using platforms like Spark and Kafka.
- Work with Data Scientists to understand the business problems and the analytics solutions, and implement different algorithms in Python.
- Identify and address application issues that affect application integrity.
- Analyze and ensure efficient transition of all technical design documents.
- Work effectively with Tech Leads, Scientists, and Data Science Developers to create and maintain the overall data architecture, priorities, and success measures for the enterprise.
- Address application issues and ensure design document transitions.
- Develop, deploy, and maintain machine learning models and training pipelines.
- Monitor the performance and efficiency of deployed AI models and pipelines.

Key Qualifications :
Programming :
- Proficiency in Python with strong coding and debugging skills.
- Advanced SQL, dimensional modeling, and data architecture.
- Proficiency in cloud technologies, AWS (preferred).
- Experience with orchestration tools (e.g., Airflow).
- Strong knowledge and experience in object-oriented programming in Python.
- Strong knowledge and experience in design and development of RESTful Web Services.
- Strong knowledge and experience in using platforms for parallel processing and job streaming (like Spark and Kafka).
- Experience with both relational databases (e.g., PostgreSQL, MySQL) and NoSQL databases (e.g., Cassandra, MongoDB, DynamoDB).
- Familiarity with containerization and orchestration tools such as Docker and Kubernetes.
- Skills in monitoring tools (Datadog, Splunk, ELK stack).
- Good analytical skills and understanding of basic concepts in data science.
- Experience working on analytic / data science projects like optimization and data mining is a plus.
- Monitoring and Logging : Experience with tools like Datadog, Splunk, or ELK stack for pipeline monitoring and troubleshooting.
- Able to balance and prioritize multiple concurrent projects effectively.
- Effective communication and ability to work in cross-functional teams.
- Degree in Data Engineering, Computer Science, or comparable practical experience.