Data Engineer Intern

TensorStax, Hyderabad, IN

2 days ago

Job description

TensorStax, backed by a $5M seed round, is building the next generation of autonomous agents for data engineering.

The Role

As a Data Engineer, you will design, build, and optimize production-grade pipelines that our agents learn from and eventually operate. You will own the modeling layer in dbt, the orchestration layer in Airflow, and the heavy-lift workloads in Spark.

What You’ll Do

Model complex, interdependent schemas in dbt across hundreds of tables
Build advanced, multi-branch Airflow DAGs with sophisticated dependency and failure handling
Author high-performance Spark jobs (PySpark or Scala) for large-scale batch and incremental workloads
Codify lineage, testing, and metadata so agents can reason about pipeline state
Profile and tune query performance across warehouses and lakehouse engines
Partner with the agent research team to expose realistic failure modes, data drift, and SLA violations for RL training
Containerize and deploy everything on Kubernetes-backed infra

About You

4+ years in data engineering or analytics engineering, shipping pipelines at scale

Deep experience with dbt, including macros, custom tests, and refactoring legacy models

Track record building and debugging complex Airflow DAGs (Sensors, TaskGroups, SubDAG patterns)

Spark power-user capable of distributed joins, window functions, and memory tuning

Solid Python, Git, and CI discipline

Bonus: experience with Iceberg, Delta, or DataFusion; prior RL or agent work

Why TensorStax

Write the pipelines our autonomous agents learn to operate

Work in a tight, senior team that values clean code and measurable impact

Competitive salary, meaningful equity, and hardware budget

Remote-first with optional SF office
