About Eloelo
Eloelo is building the future of live interactive entertainment, where real-time voice, video, and AI come together to create deeply engaging social experiences. We operate at a massive scale and are investing heavily in multimodal AI to redefine how users interact, express, and connect on our platform.
Role Overview
We're building Voxy — a deeply personalised AI companion that speaks, listens, and feels real. This isn't a ChatGPT wrapper. We train and fine-tune our own models, run our own inference stacks, and obsess over every millisecond of latency and every token of quality.
We're looking for an AI Engineer who has gone deep — someone who has read the Attention Is All You Need paper and implemented it, who knows why LoRA works at a mathematical level, and who gets uncomfortable when someone says "just call the OpenAI API."
If you've fine-tuned a model, shipped it to production, and then debugged why it degraded three weeks later — we want to talk.
What You'll Own
LLM Fine-tuning for Companion AI — SFT, RLHF, DPO pipelines on open-source base models (Gemma, LLaMA, Mistral family, etc.) to build Voxy's conversational personality engine
Efficient Training — PEFT methods including LoRA, QLoRA, adapter layers; balancing quality vs. compute budget
Text-to-Speech & Voice — Fine-tune TTS models (XTTS, Tortoise, StyleTTS2, or similar); work on prosody, emotion, and low-latency streaming voice output
Text-to-Image — Fine-tune diffusion models (Stable Diffusion, FLUX) using DreamBooth, LoRA, ControlNet for companion avatar generation
Quantisation & Inference — GPTQ, AWQ, GGUF, bitsandbytes; optimise models for GPU and edge targets without killing quality
Evaluation Systems — Build robust eval harnesses: automated metrics, human eval pipelines, regression detection, and model behaviour monitoring
Data Engineering — Own the training data: curation, deduplication, quality filtering, and formatting
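To give a concrete sense of the depth we mean by "knows why LoRA works at a mathematical level": instead of updating a full weight matrix W, LoRA learns a low-rank delta B·A, cutting trainable parameters from d·k to r·(d+k). A minimal numpy sketch (toy dimensions, all numbers illustrative, not our actual config):

```python
import numpy as np

d, k, r = 1024, 1024, 8                  # hidden dims and LoRA rank (toy values)
rng = np.random.default_rng(0)

W = rng.standard_normal((d, k))          # frozen pretrained weight
A = rng.standard_normal((r, k)) * 0.01   # trainable, initialised small
B = np.zeros((d, r))                     # trainable, zero-init so the delta starts at 0
alpha = 16                               # LoRA scaling hyperparameter

delta_W = (alpha / r) * (B @ A)          # low-rank update, rank at most r
W_merged = W + delta_W                   # the "merge" step used at inference time

full_params = d * k
lora_params = r * (d + k)
print(f"trainable params: {lora_params} vs full: {full_params} "
      f"({100 * lora_params / full_params:.1f}%)")
```

Because B is zero-initialised, the merged weight equals the pretrained weight at step zero, which is exactly why LoRA fine-tuning starts from the base model's behaviour rather than perturbing it.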
What We're Looking For
Must-Haves
3–6 years of experience, with at least 2 years directly in ML/AI model work (not just ML infra or MLOps)
Hands-on fine-tuning experience on transformer-based LLMs — not just running training scripts, but understanding why hyperparameters matter
Deep understanding of attention mechanisms — multi-head attention, RoPE, GQA, Flash Attention — can reason about tradeoffs, not just use them
Practical experience with PEFT: LoRA rank selection, target modules, merging strategies, catastrophic forgetting mitigation
Strong grasp of quantisation — INT4/INT8, calibration, quality-compute tradeoffs across GPTQ/AWQ/GGUF
Experience with model evaluation — BLEU/ROUGE are a floor, not a ceiling; you've built task-specific evals
Data-first mindset — can write complex SQL, has built training data pipelines, and understands data quality deeply
Strong DSA fundamentals — can pass a coding round at a top product company; writes efficient, production-quality Python
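On the quantisation bar: production GPTQ/AWQ pipelines do group-wise quantisation with calibration data, but you should be able to reason from first principles about the basic rounding tradeoff. A minimal sketch of symmetric per-tensor INT8 quantisation (illustrative only):

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric per-tensor INT8 quantisation: x is approximated by scale * q."""
    scale = np.abs(x).max() / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(4096).astype(np.float32)  # stand-in for a weight tensor

q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
max_err = np.abs(w - w_hat).max()
print(f"max abs error: {max_err:.4f} (quantisation step = {scale:.4f})")
```

The worst-case error is half the quantisation step, which is why outlier weights (which inflate the scale) hurt INT4/INT8 quality and why per-channel or group-wise schemes with calibration exist.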
Good to Have
Prior work on voice models — TTS fine-tuning, voice cloning, vocoder pipelines (HiFi-GAN, BigVGAN), streaming inference
Experience with diffusion models — training dynamics, classifier-free guidance, LoRA for personalisation
Familiarity with multi-modal architectures — how vision encoders interface with LLMs (LLaVA, Idefics, etc.)
Contributed to open-source ML projects or published papers/technical blogs with real traction
Experience with distributed training — DDP, FSDP, DeepSpeed ZeRO stages
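For the diffusion side, "understands classifier-free guidance" means more than setting a slider: CFG extrapolates from the unconditional noise prediction toward the conditional one. A tiny numpy sketch of the combination step (toy vectors standing in for noise predictions):

```python
import numpy as np

def cfg_step(eps_uncond: np.ndarray, eps_cond: np.ndarray,
             guidance_scale: float) -> np.ndarray:
    """Classifier-free guidance: extrapolate from the unconditional
    prediction toward the conditional one."""
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

eps_u = np.array([0.1, -0.2, 0.3])   # unconditional noise prediction (toy)
eps_c = np.array([0.2,  0.0, 0.1])   # conditional noise prediction (toy)

print(cfg_step(eps_u, eps_c, 1.0))   # scale 1.0: pure conditional prediction
print(cfg_step(eps_u, eps_c, 7.5))   # typical SD-style scale: stronger conditioning
```

At scale 1.0 the result is exactly the conditional prediction; pushing the scale higher over-amplifies the conditioning signal, which is the mechanism behind the familiar saturation artifacts at large guidance values.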
Proven Track Record — This Is Non-Negotiable
You must be able to show at least one of the following:
A fine-tuned model you shipped to production with measurable impact (latency, quality, cost)
An open-source project, HuggingFace model card, or technical writeup that demonstrates depth — not just a notebook
A prior role where you owned an AI model end-to-end, from data to deployment
Research/academic work in NLP, speech, or generative models with real results
We will ask you to walk through your work in detail during interviews. Vague answers don't pass here.
Tech Stack You'll Work With
PyTorch · HuggingFace Transformers / PEFT / Datasets · vLLM / TGI · bitsandbytes · Axolotl / LLaMA-Factory · Weights & Biases · Python · SQL / Athena · CUDA (basic understanding) · GCP / AWS
Why Join Eloelo
Work on a real model stack — not prompt engineering on top of GPT-4
Direct impact on millions of users in a live product
Small, high-ownership team — your decisions ship
Competitive compensation + ESOPs