Golang Developer with DevOps/LLM Experience - Remote / Telecommute

Remote, USA Full-time
Job Description:

Required Skills:
• Proficiency in Golang for building scalable and performant backend services.
• Deep experience building services on distributed systems in modern cloud environments (e.g., containerization (Kubernetes, Docker), infrastructure as code, CI/CD pipelines, APIs, authentication and authorization, data storage, deployment, logging, monitoring, and alerting).
• Experience working with Large Language Models (LLMs), particularly hosting them to run inference.
• Strong verbal and written communication skills. The candidate's job will involve communicating with local and remote colleagues about technical subjects and writing detailed documentation.
• Experience building or using benchmarking tools for evaluating LLM inference across various model, engine, and GPU combinations.
• Familiarity with LLM performance metrics such as prefill throughput, decode throughput, time per output token (TPOT), and time to first token (TTFT).
• Experience with one or more inference engines, e.g., vLLM, SGLang, or Modular Max.
• Familiarity with one or more distributed inference serving frameworks, e.g., llm-d, NVIDIA Dynamo, or Ray Serve.
• Experience with client and NVIDIA GPUs, using software like CUDA, ROCm, AITER, NCCL, Client, etc.
• Knowledge of distributed inference optimization techniques: tensor/data parallelism, KV cache optimizations, smart routing, etc.

Responsibilities:
• Develop and maintain an inference platform for serving large language models, optimized for the various GPU platforms they will run on.
• Work on complex AI and cloud engineering projects through the entire product development lifecycle (PDLC): ideation, product definition, experimentation, prototyping, development, testing, release, and operations.
• Build tooling and observability to monitor system health, and build auto-tuning capabilities.
• Build benchmarking frameworks that test model serving performance to guide system and infrastructure tuning efforts.
• Build native cross-platform inference support across NVIDIA and client GPUs for a variety of model architectures.
• Contribute to open source inference engines to make them perform better on DigitalOcean cloud.
