Senior System Administrator / DevOps Infrastructure Engineer

SambaNova Systems

Software Engineering, Other Engineering, IT
Palo Alto, CA, USA
Posted 6+ months ago

The era of pervasive AI has arrived. In this era, organizations will use generative AI to unlock hidden value in their data, accelerate processes, reduce costs, and drive efficiency and innovation, fundamentally transforming their businesses and operations at scale.

SambaNova Suite™ is the first full-stack generative AI platform, from chip to model, optimized for enterprise and government organizations. Powered by the intelligent SN40L chip, the SambaNova Suite is a fully integrated platform, delivered on-premises or in the cloud, combined with state-of-the-art open-source models that can be easily and securely fine-tuned using customer data for greater accuracy. Once a model has been adapted with their data, customers retain ownership of it in perpetuity, so they can turn generative AI into one of their most valuable assets.

Working at SambaNova

SambaNova is changing the world of AI through revolutionary solutions including hardware acceleration for machine learning applications. This role is key to unleashing developer productivity for SambaNova’s world-class engineering teams. As a Senior System Administrator / DevOps Infrastructure Engineer, you will work with our leading software and infrastructure teams, enabling them to deploy and maintain our best-in-class solutions. You will be key to ensuring that there are an IaC (Infrastructure-as-Code) deployment process and tools that help us maintain environmental consistency and secure tenancy. In addition, you’ll work with operational teams to identify problems through operational response and monitoring.

This role presents a unique opportunity to shape the future of AI and the value it can unlock across every aspect of our organization’s business and operations. It is a high-visibility position that will shape the future of the AI platform in delivering best-in-class cost-performance software solutions. As a hands-on contributor within the engineering team, you will be responsible for automating deployment and ensuring operational accuracy across the infrastructure environments.

Responsibilities

The location for this role is Palo Alto, CA. The work environment is hybrid.

  • Drive the adoption of cutting-edge DevOps technologies, such as GitOps, MLOps, and Kubernetes, to enhance software development, testing, and deployment efficiency
  • Manage the maintenance of leading CI/CD infrastructure and ensure seamless integration with other systems and tools for optimal performance
  • Build, manage and orchestrate our infrastructure, which today is largely an on-premises data center environment
  • Work closely with stakeholders to assess and implement enterprise-level systems that align with the organization's goals and objectives
  • Support engineering teams by enhancing, maintaining, and performance-tuning the engineering compute environment and planning its capacity
  • Constantly improve the engineering environment from design through deployment, including additional refinement and scale-up to support future growth, along with day-to-day operations including monitoring, measuring, and troubleshooting infrastructure and services
  • Occasionally visit SambaNova data center locations in the San Francisco Bay Area

Basic Qualifications

  • 7+ years of experience in infrastructure-related roles such as system administration, DevOps, or SRE
  • Demonstrated ability to design, implement, and maintain secure, scalable, and resilient systems to support software development and deployment

Additional Required Qualifications

  • Strong technical acumen with Python, GitHub, and other relevant technologies and tools
  • Proven experience and ability to lead technical teams and mentor junior members
  • Strong knowledge of containerization, orchestration, and monitoring technologies, such as Docker, Kubernetes (K8s), Prometheus, Grafana, and ELK Stack, with the ability to implement and maintain a secure and scalable infrastructure
  • Ability to drive the adoption of Infrastructure-as-Code (IaC) as a best practice within the DevOps team
  • Proven start-up experience in scaling and supporting engineering teams, with the ability to lead by example and foster a high-performance culture
  • Experience working with open-source tooling to create solutions for an on-premises data center environment
  • Broad experience in system administration supporting Linux (RedHat / CentOS preferred), including automated OS installation, software compilation, package management, virtualization, OS lifecycle management, diagnostic and performance troubleshooting/profiling, and configuration management tools (Ansible, Puppet, Chef, Salt, etc.)
  • Broad experience supporting and maintaining common Linux/Unix applications and services, as well as a deep understanding of DNS, DHCP, LDAP, NFS, AutoFS, Kerberos, PAM, PXE, SNMP, SSH, VNC, X11, HTTP/S, and NTP
  • Experience with high-performance storage solutions such as NetApp, Pure, cluster file systems, and cloud-based solutions
  • Understanding of the complex interplay between networking layers

Preferred Qualifications

  • In-depth understanding of DevOps principles, practices, and tools, including Continuous Integration, Continuous Deployment, and Continuous Delivery, with hands-on experience configuring and maintaining CI/CD pipelines using tools such as Jenkins, Travis CI, or CircleCI
  • Proficient in cloud computing technologies and experience working with Amazon Web Services (AWS), Microsoft Azure, or Google Cloud Platform (GCP)
  • Scripting proficiency in languages such as Python or Bash
  • Experience automating infrastructure and deployment processes
  • Familiarity with version control systems, such as Git and GitHub
  • Experience with GitOps and Infrastructure-as-Code (IaC) principles and tools, such as Terraform or CloudFormation
  • Experience with HPC environments including batch processing using job schedulers

Annual Salary Range and Level

The base salary for this position ranges from $160,000/year up to $190,000/year. This range is based on role, level, and location and reflects the salary target for new hires in the US. Individual pay within the range will depend on a number of factors, including a candidate’s job-related qualifications, skills, competencies and experience, and location.

Benefits Summary for US-Based Full-Time Direct Employment Positions

(The Recruiter will provide benefit details for non-US-based roles)

SambaNova offers a competitive total rewards package, including the base salary, plus equity and benefits. We cover 95% of the premium for employee medical insurance and 77% of the premium for dependents, and offer a Health Savings Account (HSA) with employer contribution. We also offer Dental, Vision, Short/Long-Term Disability, Basic Life, Voluntary Life, and AD&D insurance plans in addition to Flexible Spending Account (FSA) options like Health Care, Limited Purpose, and Dependent Care. Our library of well-being benefits available to you and your dependents includes a full subscription to Headspace, Gympass+ membership with access to physical gyms, One Medical membership, counseling services with an Employee Assistance Program, and much more.

Submission Guidelines

Please note that in order to be considered an applicant for any position at SambaNova Systems, you must submit an application form for each position for which you believe you are qualified.

If you are a new, recent (within the last two years), or upcoming college graduate and are interested in opportunities with SambaNova Systems, please apply through our university job listings.

EEO Policy

SambaNova Systems is an Equal Opportunity/Affirmative Action Employer. All qualified applicants will receive consideration for employment without regard to age (40 and over), color, disability, gender identity, genetic information, marital status, military or veteran status, national origin/ancestry, race, religion, creed, sex (including pregnancy, childbirth, breastfeeding), sexual orientation, or any other applicable status protected by federal, state, or local laws.