AI Research & Engineering

Research Scientist, Pretraining - Foundation Models

Bengaluru, India

About Soket AI

Soket is an AI research firm headquartered in Bengaluru with a mission to build efficient and generalized intelligence for humanity. We are focused on advancing frontier AI research through the development of large-scale foundation models in math, code, and reasoning that are open, energy-efficient, multilingual, and responsible by design. We are funded and supported by the IndiaAI Mission, Government of India. Our work places a strong emphasis on India and the Global South, where access to high-quality AI systems remains limited despite immense linguistic and cultural diversity.

At Soket, we believe the future of AI should be accessible, scalable, and aligned with real-world societal needs. Our teams work across large language models, multimodal systems, speech technologies, reasoning systems, and large-scale AI infrastructure, with a strong focus on open research and practical deployment. We are deeply passionate about pushing the boundaries of AI research while building systems that are useful, trustworthy, and globally impactful.

Compensation

Rs 80,00,000 – Rs 1,50,00,000 (Includes Equity Benefits)
Compensation will be commensurate with industry standards and will be determined based on the candidate's current compensation, relevant experience, skills, and overall qualifications.

What you will work on

Drive research and development across both pretraining and post-training pipelines for large-scale foundation models.
Proactively explore, evaluate, and implement state-of-the-art research in foundation model pretraining and post-training methodologies and algorithms, driving continual advancement in both theory and practice.
Design and optimize model architectures for reasoning, coding, mathematics, multilingual, and domain-specific capabilities.
Define and evolve data mixture strategies, dataset composition, filtering policies, and curriculum design for frontier model training.
Lead experimentation around scaling laws, model behavior, capability emergence, and training dynamics.
Optimize large-scale training stacks across distributed infrastructure, parallelism strategies, and compute efficiency.
Design and improve post-training systems including SFT, preference optimization, alignment, instruction tuning, and evaluation-driven refinement.
Conduct systematic experimentation on optimizers, training recipes, hyperparameter tuning, and stability improvements.
Build robust evaluation methodologies and benchmarking pipelines to measure capability, safety, reasoning, and generalization.
Collaborate across data, infrastructure, and research teams to translate experimental findings into production-grade model improvements.
Analyze failure modes, model regressions, and emergent behaviors to guide future research directions.
Stay at the frontier of foundation model research and translate new ideas into scalable training systems.

You are a good fit if you:

Have 7+ years of experience in machine learning, deep learning, or foundation model research.
Hold a PhD or Master's degree in Computer Science or a related field (PhD preferred).
Have a strong understanding of transformer architectures, LLM training dynamics, and large-scale optimization.
Love experimenting with models, training recipes, and uncovering why systems behave the way they do.
Have a strong research mindset and enjoy turning ideas into measurable improvements.
Are comfortable operating across both theory and systems aspects of large-scale model development.
Have strong programming and experimentation skills in Python and modern ML tooling.

You are a strong candidate if you have experience with:

Large-scale pretraining and post-training of language or multimodal models.
Architecture research involving Transformers, Mamba, Diffusion, MoE systems, attention mechanisms, or efficiency-oriented designs.
Publications in top-tier AI/NLP conferences such as ACL, EMNLP, NeurIPS, ICML, and ICLR.
Data mixing, curriculum learning, synthetic data generation, and large-scale dataset engineering.
Post-training methods including SFT, RLHF, DPO, preference learning, and alignment techniques.
Distributed training frameworks and optimization stacks such as PyTorch, Megatron-LM (must have experience), DeepSpeed, FSDP, or equivalent systems.
CUDA kernel design and optimization for high-performance training and inference workloads.
Training stability, scaling laws, optimizer research, and compute-efficient model development.
Benchmarking and evaluation of reasoning, coding, multilingual, and agentic capabilities.
Large-scale HPC and GPU cluster environments, including distributed systems and high-performance AI infrastructure.

Why work with Soket?

At Soket, you will get the chance to work on a problem that only a handful of teams in the world are solving today: building frontier foundation models at scale. You will see first-hand how intelligence is baked into large models and work across the entire stack that powers modern AI systems. You will work with supercomputing-scale GPU clusters and tackle challenging problems in petabyte-scale data aggregation and processing, distributed training, model architectures, infrastructure, inference optimization, and large-scale AI deployment.

One day you might be debugging CUDA kernels or NCCL issues, another day optimizing throughput for multi-GPU training runs, building new infrastructure tooling, or experimenting with ideas that make training faster and more efficient. We are a deeply research-driven and engineering-focused team that loves nerding out about systems, scaling laws, training stacks, and AI research. If you enjoy going deep into technical problems and learning from highly talented researchers and engineers, you will feel right at home here. Most importantly, we are building efficient, open, and accessible AI systems for India, the Global South, and ultimately for humanity as a whole.

If this sounds exciting to you, come build the future with us.

Apply Now!

Soket AI Labs is a research-first AI company headquartered in Bengaluru. We are an equal opportunity employer and strongly encourage applications from people of all genders, backgrounds, and ethnicities. We offer competitive compensation, equity participation opportunities, flexible work arrangements across office and remote settings, comprehensive leave policies including parental and wellness leaves, and regular team offsites designed to foster collaboration and innovation.

As an AI-native organization, we use AI systems as part of our candidate assessment and interview processes. Please make sure your resume aligns with the job description. More details about how candidate data is processed and used will be available on the application page.
