Data Engineer - Financial Infrastructure & Analytics

Meril, Karnataka, India
Job description

About the Role

As a Quantitative Data Engineer, you will be the backbone of the data ecosystem powering our quantitative research, trading, and AI-driven strategies. You will design, build, and maintain the high-performance data infrastructure that enables low-latency, high-fidelity access to market, fundamental, and alternative data across multiple asset classes.

This role bridges quant engineering, data systems, and research enablement, ensuring that our researchers and traders have fast, reliable, and well-documented datasets for analysis and live trading. You’ll be part of a cross-functional team working at the intersection of finance, machine learning, and distributed systems.

Responsibilities

  • Architect and maintain scalable ETL pipelines for ingesting and transforming terabytes of structured, semi-structured, and unstructured market and alternative data.
  • Design time-series optimized data stores and streaming frameworks to support low-latency data access for both backtesting and live trading.
  • Develop ingestion frameworks integrating vendor feeds (Bloomberg, Refinitiv, Polygon, Quandl, etc.), exchange data, and internal execution systems.
  • Collaborate with quantitative researchers and ML teams to ensure data accuracy, feature availability, and schema evolution aligned with modeling needs.
  • Implement data quality checks, validation pipelines, and version control mechanisms for all datasets.
  • Monitor and optimize distributed compute environments (Spark, Flink, Ray, or Dask) for performance and cost efficiency.
  • Automate workflows using orchestration tools (Airflow, Prefect, Dagster) for reliability and reproducibility; a minimal orchestration sketch follows this list.
  • Establish best practices for metadata management, lineage tracking, and documentation.
  • Contribute to internal libraries and SDKs for seamless data access by trading and research applications.
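
Purely as an illustration of the orchestration work described above (not part of the role's requirements): a minimal sketch of a daily market-data ETL DAG, assuming Airflow 2.4+ (where the "schedule" argument replaces "schedule_interval"); the DAG name, cron schedule, and ingest/validate callables are hypothetical placeholders.

    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def ingest_vendor_feed(**context):
        # Placeholder: pull the previous trading day's file from a vendor feed.
        print(f"Ingesting vendor data for {context['ds']}")

    def validate_dataset(**context):
        # Placeholder: run row-count and schema checks before publishing.
        print(f"Validating dataset for {context['ds']}")

    with DAG(
        dag_id="market_data_etl",          # hypothetical DAG name
        start_date=datetime(2024, 1, 1),
        schedule="0 6 * * 1-5",            # weekdays at 06:00 UTC
        catchup=False,
        default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
    ) as dag:
        ingest = PythonOperator(task_id="ingest", python_callable=ingest_vendor_feed)
        validate = PythonOperator(task_id="validate", python_callable=validate_dataset)
        ingest >> validate                 # validate only after ingestion succeeds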

In Trading Firms, Data Engineers Typically:

  • Build real-time data streaming systems to capture market ticks, order books, and execution signals (a consumer sketch follows this list).
  • Manage versioned historical data lakes for backtesting and model training.
  • Handle multi-venue data normalization (different exchanges and instruments).
  • Integrate alternative datasets (satellite imagery, news sentiment, ESG, supply-chain data).
  • Work closely with quant researchers to convert raw data into research-ready features.
  • Optimize pipelines for ultra-low latency where milliseconds can impact P&L.
  • Implement data observability frameworks to ensure uptime and quality.
  • Collaborate with DevOps and infra engineers to scale storage, caching, and compute.
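
As a sketch of the first bullet above: a minimal tick consumer using the kafka-python client. The topic name "market.ticks", the broker address, and the message fields (symbol, price, size) are assumptions for illustration, not details from this posting.

    import json

    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        "market.ticks",                      # hypothetical topic carrying JSON trades
        bootstrap_servers="localhost:9092",  # placeholder broker address
        value_deserializer=lambda v: json.loads(v.decode("utf-8")),
        auto_offset_reset="latest",          # live consumers want fresh ticks, not replay
    )

    for record in consumer:
        tick = record.value
        # Placeholder normalization: map venue-specific fields to a common schema.
        print(tick.get("symbol"), tick.get("price"), tick.get("size"))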

Tech Stack

  • Languages: Python, SQL, Scala, Go, Rust (optional for HFT pipelines)
  • Data Processing: Apache Spark, Flink, Ray, Dask, Pandas, Polars
  • Workflow Orchestration: Apache Airflow, Prefect, Dagster
  • Databases & Storage: PostgreSQL, ClickHouse, DuckDB, Elasticsearch, Redis (a small DuckDB example follows this list)
  • Data Lakes: Delta Lake, Iceberg, Hudi, Parquet
  • Streaming: Kafka, Redpanda, Pulsar
  • Cloud & Infra: AWS (S3, EMR, Lambda), GCP, Azure, Kubernetes
  • Version Control & Lineage: DVC, MLflow, Feast, Great Expectations
  • Visualization / Monitoring: Grafana, Prometheus, Superset, Datadog
  • Tools for Finance: kdb+ / q (for tick data), InfluxDB, QuestDB
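
To give a flavor of this stack in practice: a small sketch that aggregates Parquet tick data into one-minute OHLCV bars with DuckDB. The file path and the column names (ts, price, size) are assumptions for illustration.

    import duckdb

    con = duckdb.connect()
    # Aggregate raw ticks into one-minute OHLCV bars; arg_min/arg_max pick the
    # price at the earliest/latest timestamp within each minute bucket.
    bars = con.execute("""
        SELECT
            date_trunc('minute', ts) AS minute,
            arg_min(price, ts)       AS open,
            max(price)               AS high,
            min(price)               AS low,
            arg_max(price, ts)       AS close,
            sum(size)                AS volume
        FROM read_parquet('ticks/2024-01-02.parquet')
        GROUP BY minute
        ORDER BY minute
    """).df()
    print(bars.head())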

What You Will Gain

  • End-to-end ownership of core data infrastructure in a high-impact, mission-critical domain.
  • Deep exposure to quantitative research workflows, market microstructure, and real-time trading systems.
  • Collaboration with elite quantitative researchers, traders, and ML scientists.
  • Hands-on experience with cutting-edge distributed systems and time-series data technologies.
  • A culture that emphasizes technical excellence, autonomy, and experimentation.

Qualifications

  • Bachelor’s or Master’s in Computer Science, Data Engineering, or a related field.
  • 2+ years of experience building and maintaining production-grade data pipelines.
  • Proficiency in Python, SQL, and frameworks like Airflow, Spark, or Flink.
  • Familiarity with cloud storage and compute (S3, GCS, EMR, Dataproc) and versioned data lakes (Delta, Iceberg).
  • Experience with financial datasets, tick-level data, or high-frequency time series is a strong plus.
  • Strong understanding of data modeling, schema design, and performance optimization.
  • Excellent communication skills with an ability to support multidisciplinary teams.