Provide support for Broadcom's VKS-based AI infrastructure running multiple LLMs
Manage day-to-day operations for the AI environment and provide end-user support for issues, questions, and clarifications
Strong knowledge of the AI domain
Strong Python Proficiency: Deep experience in Python, as it's the primary language for most AI/ML frameworks and scripting.
LLM API Integration: Proven experience integrating with major Large Language Model (LLM) APIs such as Anthropic (Claude models) and Google (Gemini).
AI Orchestration Frameworks: Hands-on experience with frameworks like LangChain, project Goose, and Google Agent Builder for building multi-step AI workflows and agents.
RAG (Retrieval-Augmented Generation): Practical knowledge of building RAG pipelines. This includes generating embeddings and using vector databases. Knowledge of reranker models.
Prompt Engineering: A strong understanding of how to design, test, and refine effective prompts to get reliable, accurate, and consistent outputs.
Backend & API Development: Experience building and maintaining REST APIs (using frameworks like Flask, FastAPI, or Django) to serve AI-powered endpoints to other internal systems.
Database Knowledge: Proficiency with SQL (e.g., PostgreSQL, MySQL) and/or NoSQL databases for storing logs, user data, and operational metrics.
Core DevOps Skills: Familiarity with Git, Docker, and CI/CD pipelines to properly test and deploy your AI applications.
Model Context Protocol (MCP): Experience developing and running MCP servers. This is critical for building a standardized bridge between our AI agents and our internal operational tools, databases, and APIs, enabling the AI to safely perform actions on the operations team's behalf.
Nice to have:
Experience fine-tuning smaller, open-source models (e.g., Llama 3, Mistral) for specific tasks
Familiarity with simple frontend or app-building tools (like Streamlit or Gradio) to quickly build web-based UIs for the tools you create
Knowledge of cloud provider AI services (Google Vertex AI & Google AI suite)
Strong knowledge of Linux operating systems and VKS Kubernetes platform
Willingness to support end-user issue resolution and to fulfill service and change requests consistent with standard processes and procedures
Works with multiple IT teams to ensure "Keep the Lights On" goal for the AI Environment
Looks for opportunities to improve day-to-day operations by implementing proactive monitoring and management and by automating routine tasks
Develops processes and checklists for handling projects and subsequent operations tasks. Oversees upkeep of processes, procedures, and technical documentation. Makes improvements by incorporating learnings from recent issues, security compliance reports, upgrades, and patches.
Self-driven, with the ability to work in a fast-moving environment.
Coaches and mentors junior team members
Qualifications:
Bachelor's degree in Computer Science, Information Systems Management, Computer Engineering, Mathematics, or a related discipline, or equivalent work experience.
5+ years of experience in Information Technology
2+ years of experience in AI infrastructure, architecture, and design, specifically in highly virtualized environments
Strong knowledge of AI orchestration frameworks, LLM API integrations, RAG, prompt engineering, backend API integrations, and database integrations
Must be self-motivated with excellent teamwork, interpersonal, communication, presentation, and organizational skills
Ability to work effectively with clients, management, support team members and staff members in a fast-paced environment
Deep technical knowledge of Unix/Linux and virtualization technologies, especially VMware and Nutanix
Scripting Skills (Shell, Perl, Python)
Strong problem analysis and troubleshooting skills
Must be proficient in hardware/OS monitoring concepts and automation frameworks
Applies broad concepts and theories to achieve innovative and cost-effective solutions to complex problems
Knowledge of networking and other OS technologies is a plus
Proactive management techniques, exposure to automation, and scripting skills are a plus
Determines own priorities, both tactical and strategic
Additional Job Description:
Compensation and Benefits
The annual base salary range for this position is $81,000 - $130,000.
This position is also eligible for a discretionary annual bonus in accordance with relevant plan documents, and equity in accordance with equity plan documents and equity award agreements.
Broadcom offers a competitive and comprehensive benefits package: medical, dental, and vision plans, 401(k) participation including company matching, Employee Stock Purchase Program (ESPP), Employee Assistance Program (EAP), company-paid holidays, paid sick leave, and vacation time. The company follows all applicable laws for Paid Family Leave and other leaves of absence.
Broadcom is proud to be an equal opportunity employer. We will consider qualified applicants without regard to race, color, creed, religion, sex, sexual orientation, national origin, citizenship, disability status, medical condition, pregnancy, protected veteran status, or any other characteristic protected by federal, state, or local law. We will also consider qualified applicants with arrest and conviction records consistent with local law. If you are located outside the USA, please be sure to fill out a home address, as this will be used for future correspondence.