Artificial Intelligence Engineer
TerraGiG
2 - 5 years
Pune
Posted: 31/01/2026
Job Description
Role: GenAI Engineer
Experience: 5+ years
Work Mode: Onsite
The client is looking for a GenAI Engineer who can design and deploy end-to-end solutions. Key focus areas include Vector Embeddings, RAG Pipelines, and FastAPI integration. Please review the full Job Description below and focus your preparation on the "Deep-Dive" topics provided.
Full Job Description
Role Overview
We are seeking a GenAI Engineer to design, develop, and deploy Generative AI solutions that enhance business workflows and user experiences. The ideal candidate will have strong expertise in LLMs (Large Language Models), prompt engineering, and integration of AI services into scalable applications.
Key Responsibilities
Model Integration: Implement/fine-tune LLMs; build APIs/microservices for GenAI features.
Prompt Engineering: Design, optimize, and evaluate prompts for safety and accuracy.
RAG (Retrieval-Augmented Generation): Develop pipelines for document ingestion, vector embeddings, and semantic search.
App Dev: Integrate GenAI into web/mobile apps using FastAPI, Streamlit, or React.
Optimization: Monitor token usage, latency, and inference costs.
Safety: Implement moderation, bias detection, and responsible AI guidelines.
Required Skills
Python (FastAPI, Flask, Django), LLM APIs (OpenAI, Azure), Vector DBs (Pinecone, Weaviate, FAISS).
Cloud (AWS/Azure/GCP), Docker/K8s, ML fundamentals (embeddings, tokenization).
Real-time AI (SSE/WebSockets).
Preferred Skills
LangChain, LlamaIndex, Image models (Stable Diffusion), MLOps, CI/CD.
Technical Deep-Dive: Vector Embeddings
Since the JD specifically asks for knowledge of embeddings and vector databases, your engineers should be prepared to answer the following:
1. Conceptual Understanding
What are they? They are high-dimensional numerical representations of data (text, images, audio). Unlike keyword search, embeddings capture semantic meaning.
Dimensionality: Be familiar with common sizes (e.g., OpenAI's text-embedding-3-small is 1536-dimensional).
Distance Metrics: Know when to use Cosine Similarity (directional similarity) vs. Euclidean Distance (magnitude-based) vs. Dot Product.
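To make the distinction between these metrics concrete, here is a minimal stdlib-only sketch (the example vectors are hypothetical, chosen so the two vectors point the same way but differ in magnitude):

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(v):
    return math.sqrt(dot(v, v))

def cosine_similarity(a, b):
    # Directional similarity: ignores magnitude, range [-1, 1].
    return dot(a, b) / (norm(a) * norm(b))

def euclidean_distance(a, b):
    # Magnitude-sensitive: 0 only when the vectors are identical.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Same direction, different magnitudes:
a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0]

print(cosine_similarity(a, b))   # 1.0 -- identical direction
print(euclidean_distance(a, b))  # nonzero -- magnitudes differ
print(dot(a, b))                 # 28.0 -- grows with both magnitudes
```

Cosine judges the two vectors identical while Euclidean distance does not, which is why cosine is the default for comparing text embeddings where magnitude carries little meaning.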
2. Implementation Challenges
Chunking: How to break a 100-page PDF into chunks so the embedding captures context without losing detail.
Normalization: Why we normalize vectors to unit length before storing them (on unit vectors, a cheap dot product is equivalent to Cosine Similarity).
Matryoshka Embeddings: (Advanced 2026 topic) Be able to explain how to shorten vectors (e.g., from 3072 to 256 dimensions) without a significant loss of accuracy, reducing storage costs.
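The normalization and truncation steps above can be sketched in a few lines. This is an illustration, not a production pipeline: the input vector is made up, and real Matryoshka-trained models (e.g., OpenAI's text-embedding-3 family) are what make truncation safe in practice.

```python
import math

def normalize(v):
    # Scale to unit length so a dot product equals cosine similarity.
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def shorten(v, dims):
    # Matryoshka-style truncation: keep the first `dims` components,
    # then renormalize so cosine comparisons remain valid.
    return normalize(v[:dims])

vec = normalize([0.9, 0.3, 0.1, 0.05, 0.02, 0.01])
short = shorten(vec, 3)
print(len(short))                              # 3
print(sum(x * x for x in short))               # ~1.0 (unit length again)
```

The key point to articulate in an interview: truncation only preserves accuracy when the model was trained to front-load information into the leading dimensions, and the shortened vector must be renormalized before storage.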
Suggested Preparation Topics
Pillar 1: The RAG Pipeline
Indexing: The flow from Document -> Chunking -> Embedding -> Vector DB.
Retrieval: Explain Top-K retrieval and how to use "Re-ranking" models (like Cohere Rerank) to improve the quality of the top results.
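The indexing and retrieval flow above can be demonstrated end to end in miniature. This sketch uses a toy bag-of-words `embed` function as a stand-in for a real embedding model and a plain list as the "vector DB"; the document text is invented for illustration.

```python
import math

def embed(text, vocab):
    # Toy stand-in for a real embedding model: bag-of-words counts.
    words = text.lower().replace(".", "").split()
    return [float(words.count(w)) for w in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a)) or 1.0
    nb = math.sqrt(sum(x * x for x in b)) or 1.0
    return dot / (na * nb)

# Indexing: Document -> Chunking -> Embedding -> "Vector DB"
doc = "FastAPI serves APIs. Pinecone stores vectors. Embeddings capture meaning."
chunks = [c.strip() + "." for c in doc.split(".") if c.strip()]
vocab = sorted({w for c in chunks for w in c.lower().replace(".", "").split()})
index = [(c, embed(c, vocab)) for c in chunks]

# Retrieval: rank every stored chunk by similarity, keep the Top-K.
def top_k(query, k=2):
    q = embed(query, vocab)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [c for c, _ in ranked[:k]]

print(top_k("where are vectors stored"))
```

In a real pipeline each stage is swapped for production tooling (a chunking library, an embedding API, Pinecone/Weaviate/FAISS), and a re-ranker such as Cohere Rerank would then re-score the Top-K candidates.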
Pillar 2: Engineering (The "Developer" part)
FastAPI: Be ready to code a basic endpoint that takes a user query and returns a streamed response using StreamingResponse.
Streaming (SSE): Explain why we use SSE for LLMs (to reduce "perceived latency" for the user).
Pillar 3: Evaluation & Operations
LLM-as-a-Judge: Using a stronger model (e.g., GPT-4o) to grade the outputs of a smaller model.
Token Management: How to implement a "sliding window" or "summary-based memory" to keep context without hitting token limits or high costs.
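A sliding window can be sketched as below. Assumptions are flagged in comments: the word-count tokenizer is a rough stand-in for a real one (e.g., tiktoken), and the message history is invented.

```python
def sliding_window(messages, max_tokens, count_tokens=lambda m: len(m.split())):
    # Keep the most recent messages whose combined token cost fits the
    # budget. count_tokens is a word-count stand-in for a real tokenizer.
    window, total = [], 0
    for msg in reversed(messages):           # walk newest -> oldest
        cost = count_tokens(msg)
        if total + cost > max_tokens:
            break                            # older messages are dropped
        window.append(msg)
        total += cost
    return list(reversed(window))            # restore chronological order

history = [
    "user: hello there",
    "assistant: hi, how can I help?",
    "user: summarize our chat so far please",
]
print(sliding_window(history, max_tokens=12))  # keeps only the newest turn
```

Summary-based memory extends this idea: instead of silently dropping the older messages, they are compressed into a single summary message that stays in the window.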