Data Engineer - ETL / PySpark

Talent Velocity - Bhopal

Description :

Experience : 6 to 10 Years

Location : Any Xebia Location (Bangalore, Chennai, Pune, Bhopal, Hyderabad, Jaipur, Gurgaon)

Contract Type : TPC 6 Months (Extendable)

Interview Platform : Flow Career (Ensure weekend availability for interviews)

About the Role :

We are seeking an experienced Data Engineer with deep expertise in Databricks, Python, SQL, and Postgres. The ideal candidate will have practical experience working with Vector Databases (pgvector, Qdrant, Pinecone, etc.) and exposure to Generative AI use cases such as Retrieval-Augmented Generation (RAG) pipelines and embedding-based search.

You will collaborate with cross-functional teams to design, build, and optimize scalable data pipelines and contribute to innovative AI-driven data solutions on Azure.

Key Responsibilities :

  • Design and develop robust ETL / ELT pipelines using Databricks, PySpark, and Azure Data Factory.
  • Implement scalable data processing workflows integrating Delta Lake and Azure Data Lake Storage Gen2.
  • Build, optimize, and maintain Postgres databases (schema design, indexing, performance tuning).
  • Develop and manage RAG pipelines and vector search capabilities using pgvector, Qdrant, or Pinecone.
  • Work on embedding generation and integration with Azure OpenAI or equivalent Generative AI services.
  • Write and optimize complex SQL queries for analytics and data modeling.
  • Automate deployments and manage version control using CI / CD and Git-based workflows.
  • Collaborate with data scientists, ML engineers, and business teams to deliver high-quality data products.

Required Skills :

  • Python (PySpark, pandas, API integration)
  • Databricks (notebooks, Delta Lake, workflows)
  • SQL (data modeling, optimization, analytics)
  • Postgres (schema design, indexing, query tuning)
  • Azure Services (Data Factory, Data Lake Storage Gen2, Synapse Analytics)
  • Vector Databases (pgvector, Qdrant, or Pinecone)
  • Generative AI Exposure (RAG pipelines, embeddings, integration with Azure OpenAI)
  • Version Control & DevOps (Git, CI / CD pipelines)

Additional Information :

  • Immediate joiners preferred.
  • Interviews will be conducted via Flow Career; please ensure weekend availability for interviews.
  • (ref : hirist.tech)
