Senior Software Engineer

Cognite India
Job description

Cognite is revolutionising industrial data management through our flagship product, Cognite Data Fusion - a state-of-the-art SaaS platform that transforms how industrial companies leverage their data. We're seeking a Senior Data Platform Engineer who excels at building high-performance distributed systems and thrives in a fast-paced startup environment. You'll be working on cutting-edge data infrastructure challenges that directly impact how Fortune 500 industrial companies manage their most critical operational data.

Responsibilities:

High-Performance Data Systems:

  • Design and implement robust data processing pipelines using Apache Spark, Flink, and Kafka for terabyte-scale industrial datasets.
  • Build efficient APIs and services that serve thousands of concurrent users with sub-second response times.
  • Optimise data storage and retrieval patterns for time-series, sensor, and operational data.
  • Implement advanced caching strategies using Redis and in-memory data structures (a minimal sketch follows this list).
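
To make the caching point above concrete, here is a minimal read-through cache sketch in Scala using the Jedis client. The key scheme, TTL, and loader signature are illustrative assumptions, not a description of Cognite's actual platform.

    import redis.clients.jedis.JedisPooled

    object ReadThroughCache {
      // Single shared connection pool; host and port are assumed defaults.
      private val redis = new JedisPooled("localhost", 6379)
      private val TtlSeconds = 300L // assumed TTL for hot sensor readings

      // Read-through lookup: serve from Redis, fall back to the primary
      // store on a miss, and repopulate the cache with a short TTL.
      def latestReading(sensorId: String, loadFromDb: String => String): String = {
        val key = s"sensor:latest:$sensorId"
        Option(redis.get(key)).getOrElse {
          val value = loadFromDb(sensorId)    // cache miss: hit the datastore
          redis.setex(key, TtlSeconds, value) // repopulate for subsequent readers
          value
        }
      }
    }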

Distributed Processing Excellence:

  • Engineer Spark applications with a deep understanding of the Catalyst optimiser, partitioning strategies, and performance tuning.
  • Develop real-time streaming solutions processing millions of events per second with Kafka and Flink.
  • Design efficient data lake architectures using S3/GCS with optimised partitioning and file formats (Parquet, ORC); see the sketch after this list.
  • Implement query optimisation techniques for OLAP datastores like ClickHouse, Pinot, or Druid.
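
A minimal sketch of the partitioning and file-format ideas above, assuming Spark writing Parquet to an S3-compatible object store; the bucket paths and column names are invented for illustration:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.{col, to_date}

    object LakeWriter {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("lake-writer").getOrCreate()

        spark.read.parquet("s3a://raw-bucket/sensor-events/")
          .withColumn("event_date", to_date(col("event_ts"))) // derive the partition key
          .repartition(col("event_date"))                     // co-locate each day, avoiding many small files
          .write
          .mode("append")
          .partitionBy("event_date")                          // Hive-style layout lets engines prune files
          .parquet("s3a://lake-bucket/sensor-events/")

        spark.stop()
      }
    }
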
Scalability and Performance:

  • Scale systems to 10K+ QPS while maintaining high availability and data consistency.
  • Optimise JVM performance through garbage collection tuning and memory management.
  • Implement comprehensive monitoring using Prometheus, Grafana, and distributed tracing.
  • Design fault-tolerant architectures with proper circuit breakers and retry mechanisms (a retry sketch follows this list).
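
A minimal retry-with-exponential-backoff sketch for the fault-tolerance point above. The attempt count and delays are illustrative assumptions; a production system would typically add jitter and put a circuit breaker (for example, resilience4j) in front of flaky downstream calls:

    import scala.annotation.tailrec
    import scala.util.{Failure, Success, Try}

    object Retry {
      // Retry a call up to `attemptsLeft` times, doubling the delay after
      // each failure. Real systems would add jitter and cap the backoff.
      @tailrec
      def withBackoff[A](attemptsLeft: Int, delayMs: Long)(op: => A): A =
        Try(op) match {
          case Success(result) => result
          case Failure(_) if attemptsLeft > 1 =>
            Thread.sleep(delayMs)                          // back off before the next attempt
            withBackoff(attemptsLeft - 1, delayMs * 2)(op) // exponential backoff
          case Failure(e) => throw e                       // attempts exhausted: surface the error
        }
    }

    // Usage: Retry.withBackoff(attemptsLeft = 5, delayMs = 100) { callFlakyService() }
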
Technical Innovation:

  • Contribute to open-source projects in the big data ecosystem (Spark, Kafka, Airflow).
  • Research and prototype new technologies for industrial data challenges.
  • Collaborate with product teams to translate complex requirements into scalable technical solutions.
  • Participate in architectural reviews and technical design discussions.
Requirements:

  • Distributed Systems Experience (4-6 years) - Production Spark experience - built and optimised large-scale Spark applications with an understanding of internals.
  • Streaming systems proficiency - implemented real-time data processing using Kafka, Flink, or Spark Streaming.
  • JVM language expertise - strong programming skills in Java, Scala, or Kotlin with performance-optimisation experience.
  • Data Platform Foundations (3+ years) - Big data storage systems - hands-on experience with data lakes, columnar formats, and table formats (Iceberg, Delta Lake).
  • OLAP query engines - worked with Presto/Trino, ClickHouse, Pinot, or similar high-performance analytical databases.
  • ETL/ELT pipeline development - built robust data transformation pipelines using tools like dbt, Airflow, or custom frameworks.
  • Infrastructure and Operations - Kubernetes production experience - deployed and operated containerised applications in production environments.
  • Cloud platform proficiency - hands-on experience with AWS, Azure, or GCP data services.
  • Monitoring and observability - implemented comprehensive logging, metrics, and alerting for data systems.
Technical Depth Indicators:

  • Performance Engineering - System optimisation experience - delivered measurable performance improvements (2x+ throughput gains).
  • Resource efficiency - optimised systems for cost while maintaining performance requirements.
  • Concurrency expertise - designed thread-safe, high-concurrency data processing systems.
  • Data Engineering Best Practices - Data quality frameworks - implemented validation, testing, and monitoring for data pipelines (see the quality-gate sketch after this list).
  • Schema evolution - managed backwards-compatible schema changes in production systems.
  • Data modelling expertise - designed efficient schemas for analytical workloads.
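
As one hedged illustration of a pipeline quality gate, the plain-Spark sketch below fails a batch whose key columns contain nulls before it reaches the lake. The column names and fail-fast policy are assumptions; dedicated frameworks such as Deequ or Great Expectations are common alternatives:

    import org.apache.spark.sql.DataFrame
    import org.apache.spark.sql.functions.col

    object QualityGate {
      // Fail fast if any of the given key columns contain nulls, so a bad
      // batch never lands downstream. Each check is a separate scan; cache
      // `df` first when validating many columns.
      def requireComplete(df: DataFrame, columns: Seq[String]): DataFrame = {
        columns.foreach { c =>
          val nulls = df.filter(col(c).isNull).count()
          require(nulls == 0, s"column '$c' has $nulls nulls; rejecting batch")
        }
        df
      }
    }

    // Usage: QualityGate.requireComplete(batchDf, Seq("sensor_id", "event_ts"))
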
Collaboration and Growth:

  • Technical Collaboration - Cross-functional partnership - worked effectively with product managers, ML engineers, and data scientists.
  • Code review excellence - provided thoughtful technical feedback and maintained high code quality standards.
  • Documentation and knowledge sharing - created technical documentation and participated in knowledge transfer.
  • Continuous Learning - Technology adoption - quickly learned and applied new technologies to solve business problems.
  • Industry awareness - stayed current with big data ecosystem developments and best practices.
  • Problem-solving approach - demonstrated a systematic approach to debugging complex distributed system issues.
Startup Mindset:

  • Execution Excellence - Rapid delivery - consistently shipped high-quality features within aggressive timelines.
  • Technical pragmatism - made smart trade-offs between technical debt, velocity, and system reliability.
  • End-to-end ownership - took responsibility for features from design through production deployment and monitoring.
  • Ambiguity comfort - thrived in environments with evolving requirements and unclear specifications.
  • Technology flexibility - adapted to new tools and frameworks based on project needs.
  • Customer focus - understood how technical decisions impact user experience and business metrics.
Bonus Points:

  • Open-source contributions to major Apache projects in the data space (e.g. Apache Spark or Kafka) are a big plus.
  • Conference speaking or technical blog writing experience.
  • Industrial domain knowledge - previous experience with IoT, manufacturing, or operational technology systems.
Technical Stack:

Primary Technologies:

  • Languages: Kotlin, Scala, Python, Java.
  • Big Data: Apache Spark, Apache Flink, Apache Kafka.
  • Storage: PostgreSQL, ClickHouse, Elasticsearch, S3-compatible systems.
  • Infrastructure: Kubernetes, Docker, Terraform.
Technologies You May Work With:

  • Table Formats: Apache Iceberg, Delta Lake, Apache Hudi.
  • Query Engines: Trino/Presto, Apache Pinot, DuckDB.
  • Orchestration: Apache Airflow, Dagster.
  • Monitoring: Prometheus, Grafana, Jaeger, ELK Stack.