Lead Data Engineer, Group 42 (G42)

Job Description

The Lead Data Engineer at Inception builds scalable data pipelines that support AI, LLM, and RAG systems. The role focuses on reliable data infrastructure for vector search and retrieval while maintaining data quality across multiple platforms. It requires close collaboration with AI engineers and other engineering teams to deliver secure, high-performance data solutions for enterprise systems.

Job ID: 2805

Date Posted: NA

Expiration Date: NA

Main Duties

  • Design and optimize scalable, efficient data pipelines for advanced AI and LLM workloads.
  • Develop ETL and ELT workflows for structured, unstructured, and streaming enterprise data systems.
  • Build vector database indexing and similarity search pipelines supporting intelligent retrieval systems.
  • Enable RAG, semantic search, and enterprise knowledge retrieval solutions across distributed environments.
  • Ensure data quality, monitoring, observability, and reliability across large-scale production data pipelines.

Essential Qualifications

  • Bachelor’s degree in computer science, engineering, or a related technical field.
  • Eight years of experience in data engineering, distributed systems, and AI infrastructure development.
  • Advanced Python skills for data processing, automation, API development, and distributed workloads.
  • SQL proficiency and practical experience with NoSQL databases in enterprise settings.
  • Practical knowledge of vector databases, data modeling, and ETL pipeline development frameworks.

Preferred Qualifications

  • Experience building and deploying production-grade RAG pipelines for enterprise AI systems.
  • Expertise in combining graph databases with hybrid search systems to build advanced enterprise data retrieval solutions.
  • Knowledge of LLM deployment, inference optimization, and caching methods that support scalable AI systems.
  • Knowledge of cloud security procedures, data management practices, and identity and access control systems across multiple platforms.
  • Experience with Kubernetes and distributed systems for scalable, reliable infrastructure deployments.