About this role:
Our client is seeking an experienced AI Engineer specializing in Retrieval-Augmented Generation (RAG) and fine-tuning techniques. The role can be hybrid or fully remote and is available on a contract or full-time basis.
In this position, you will architect and implement the RAG pipeline, which covers embedding ingestion, vector search using MongoDB Atlas or a similar technology, and context-aware chat generation. You will also design and build Python services with FastAPI that generate and update embeddings, and you will host and apply LoRA/QLoRA adapters for per-user fine-tuning. Automating the data pipelines that feed ingestion is another key part of the role.
Key Responsibilities:
- Architect and implement the RAG pipeline.
- Manage embedding ingestion and vector search using MongoDB Atlas or a similar technology.
- Develop context-aware chat generation solutions.
- Design and build Python-based FastAPI services for generating and updating embeddings.
- Host and apply LoRA/QLoRA adapters for per-user fine-tuning.
- Automate data pipelines for efficient data ingestion.
Required Skills & Qualifications:
- Bachelor’s or Master’s degree in Computer Science or a related technical field.
- Minimum of 5 years of experience in AI engineering or a similar role.
- Proficiency in Python and familiarity with FastAPI.
- Experience with MongoDB Atlas or similar vector search technologies.
- Strong understanding of RAG concepts and fine-tuning techniques such as LoRA/QLoRA.
- Ability to work independently and manage multiple tasks effectively.
What we offer:
Our client provides a dynamic work environment, opportunities for professional growth, and the chance to work on cutting-edge AI technologies. You will be part of a collaborative team that values innovation and creativity.