Job Summary
As the largest online distributor of restaurant supplies and equipment, WebstaurantStore offers a catalog of more than 430,000 products supported by fast, reliable shipping. Nearly all our technological design, development, and system management are handled in-house, allowing us to build custom, innovative solutions in a rapidly evolving e-commerce landscape.
To support our continued growth, we are seeking an Infrastructure Engineer with a focus on Server Performance to support our on-premises infrastructure. The role centers on designing and building high-performance computing solutions that power AI workloads and other resource-intensive business operations.
This role is primarily focused on on-premises infrastructure rather than cloud administration.
Responsibilities
• Administer and optimize physical servers (Dell), SQL Server environments, GPU-accelerated AI and 3D render servers, and VMware vSphere/ESXi virtual infrastructure (vCenter HA).
• Own infrastructure performance tuning across compute, virtualization, and operating system layers to improve reliability, efficiency, and scalability.
• Proactively identify bottlenecks, capacity constraints, and performance risks using monitoring and observability tools such as Prometheus and Grafana.
• Investigate physical server issues end to end, perform root cause analysis, and determine the most appropriate corrective solution.
• Manage physical and virtual infrastructure and support partner teams including DBRE, Security, SRE, Media, and Automated Warehouses.
• Support server lifecycle management activities including provisioning, configuration, patching, upgrades, hardware refreshes, and decommissioning.
• Perform hardware diagnostics, firmware and BIOS updates, RAID configuration, driver management, and out-of-band administration using tools such as iDRAC or equivalent technologies.
• Participate in disaster recovery and business continuity efforts, including failover testing, backup and recovery validation, and restoration readiness.
• Contribute to infrastructure projects such as migrations, platform rollouts, performance improvement initiatives, and hardware modernization efforts.
• Support Windows Server and Linux-based environments, including RHEL, Ubuntu, and Rocky Linux.
• Collaborate with internal stakeholders to solve complex infrastructure problems, recommend improvements, and strengthen long-term system health.