About Eloelo
Eloelo is building the future of live interactive entertainment, where real-time voice, video, and AI come together to create deeply engaging social experiences. We operate at a massive scale and are investing heavily in multimodal AI to redefine how users interact, express, and connect on our platform.
Role Overview
We're building Voxy — a deeply personalised AI companion that speaks, listens, and feels real. This isn't a ChatGPT wrapper. We train and fine-tune our own models, run our own inference stacks, and obsess over every millisecond of latency and every token of quality.
We're looking for an AI Engineer who has gone deep — someone who has read the Attention Is All You Need paper and implemented it, who knows why LoRA works at a mathematical level, and who gets uncomfortable when someone says "just call the OpenAI API."
If you've fine-tuned a model, shipped it to production, and then debugged why it degraded three weeks later — we want to talk.
What You'll Own
LLM Fine-tuning for Companion AI — SFT, RLHF, DPO pipelines on open-source base models (Gemma, LLaMA, Mistral family, etc.) to build Voxy's conversational personality engine
Efficient Training — PEFT methods including LoRA, QLoRA, adapter layers; balancing quality vs. compute budget
Text-to-Speech & Voice — Fine-tune TTS models (XTTS, Tortoise, StyleTTS2, or similar); work on prosody, emotion, and low-latency streaming voice output
Text-to-Image — Fine-tune diffusion models (Stable Diffusion, FLUX) using DreamBooth, LoRA, ControlNet for companion avatar generation
Quantisation & Inference — GPTQ, AWQ, GGUF, bitsandbytes; optimise models for GPU and edge targets without killing quality
Evaluation Systems — Build robust eval harnesses: automated metrics, human eval pipelines, regression detection, and model behaviour monitoring
Data Engineering — Own the training data: curation, deduplication, quality filtering, and formatting
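To give a concrete sense of the depth we mean by "knows why LoRA works at a mathematical level": instead of updating a full weight matrix W, LoRA learns a low-rank delta B·A, cutting trainable parameters from d·k to r·(d+k). A minimal numpy sketch (toy dimensions, all numbers illustrative, not our actual config):

```python
import numpy as np

d, k, r = 1024, 1024, 8                  # hidden dims and LoRA rank (toy values)
rng = np.random.default_rng(0)

W = rng.standard_normal((d, k))          # frozen pretrained weight
A = rng.standard_normal((r, k)) * 0.01   # trainable, initialised small
B = np.zeros((d, r))                     # trainable, zero-init so the delta starts at 0
alpha = 16                               # LoRA scaling hyperparameter

delta_W = (alpha / r) * (B @ A)          # low-rank update, rank at most r
W_merged = W + delta_W                   # the "merge" step used at inference time

full_params = d * k
lora_params = r * (d + k)
print(f"trainable params: {lora_params} vs full: {full_params} "
      f"({100 * lora_params / full_params:.1f}%)")
```

Because B is zero-initialised, the merged weight equals the pretrained weight at step zero, which is exactly why LoRA fine-tuning starts from the base model's behaviour rather than perturbing it.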
What We're Looking For
Must-Haves
3–6 years of experience, with at least 2 years directly in ML/AI model work (not just ML infra or MLOps)
Hands-on fine-tuning experience on transformer-based LLMs — not just running training scripts, but understanding why hyperparameters matter
Deep understanding of attention mechanisms — multi-head attention, RoPE, GQA, Flash Attention — can reason about tradeoffs, not just use them
Practical experience with PEFT: LoRA rank selection, target modules, merging strategies, catastrophic forgetting mitigation
Strong grasp of quantisation — INT4/INT8, calibration, quality-compute tradeoffs across GPTQ/AWQ/GGUF
Experience with model evaluation — BLEU/ROUGE are a floor, not a ceiling; you've built task-specific evals
Data-first mindset — can write complex SQL, has built training data pipelines, and understands data quality deeply
Strong DSA fundamentals — can pass a coding round at a top product company; writes efficient, production-quality Python
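On the quantisation bar: production GPTQ/AWQ pipelines do group-wise quantisation with calibration data, but you should be able to reason from first principles about the basic rounding tradeoff. A minimal sketch of symmetric per-tensor INT8 quantisation (illustrative only):

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric per-tensor INT8 quantisation: x is approximated by scale * q."""
    scale = np.abs(x).max() / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(4096).astype(np.float32)  # stand-in for a weight tensor

q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
max_err = np.abs(w - w_hat).max()
print(f"max abs error: {max_err:.4f} (quantisation step = {scale:.4f})")
```

The worst-case error is half the quantisation step, which is why outlier weights (which inflate the scale) hurt INT4/INT8 quality and why per-channel or group-wise schemes with calibration exist.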
Good to Have
Prior work on voice models — TTS fine-tuning, voice cloning, vocoder pipelines (HiFi-GAN, BigVGAN), streaming inference
Experience with diffusion models — training dynamics, classifier-free guidance, LoRA for personalisation
Familiarity with multi-modal architectures — how vision encoders interface with LLMs (LLaVA, Idefics, etc.)
Contributed to open-source ML projects or published papers/technical blogs with real traction
Experience with distributed training — DDP, FSDP, DeepSpeed ZeRO stages
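For the diffusion side, "understands classifier-free guidance" means more than setting a slider: CFG extrapolates from the unconditional noise prediction toward the conditional one. A tiny numpy sketch of the combination step (toy vectors standing in for noise predictions):

```python
import numpy as np

def cfg_step(eps_uncond: np.ndarray, eps_cond: np.ndarray,
             guidance_scale: float) -> np.ndarray:
    """Classifier-free guidance: extrapolate from the unconditional
    prediction toward the conditional one."""
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

eps_u = np.array([0.1, -0.2, 0.3])   # unconditional noise prediction (toy)
eps_c = np.array([0.2,  0.0, 0.1])   # conditional noise prediction (toy)

print(cfg_step(eps_u, eps_c, 1.0))   # scale 1.0: pure conditional prediction
print(cfg_step(eps_u, eps_c, 7.5))   # typical SD-style scale: stronger conditioning
```

At scale 1.0 the result is exactly the conditional prediction; pushing the scale higher over-amplifies the conditioning signal, which is the mechanism behind the familiar saturation artifacts at large guidance values.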
Proven Track Record — This Is Non-Negotiable
You must be able to show at least one of the following:
A fine-tuned model you shipped to production with measurable impact (latency, quality, cost)
An open-source project, HuggingFace model card, or technical writeup that demonstrates depth — not just a notebook
A prior role where you owned an AI model end-to-end, from data to deployment
Research/academic work in NLP, speech, or generative models with real results
We will ask you to walk through your work in detail during interviews. Vague answers don't pass here.
Tech Stack You'll Work With
PyTorch · HuggingFace Transformers / PEFT / Datasets · vLLM / TGI · bitsandbytes · Axolotl / LLaMA-Factory · Weights & Biases · Python · SQL / Athena · CUDA (basic understanding) · GCP / AWS
Why Join Eloelo
Work on a real model stack — not prompt engineering on top of GPT-4
Direct impact on millions of users in a live product
Small, high-ownership team — your decisions ship
Competitive compensation + ESOPs