
100 - 125 Posted: 6 hours ago
Job Description
<p><h3>Overview</h3><p>The Manager will serve as a Linux Subject Matter Expert (SME), responsible for monitoring, maintaining, troubleshooting, and supporting high-performance computing (HPC) nodes critical to our client’s day-to-day operations. The role focuses on ensuring a secure, optimized, and highly available HPC environment, while delivering deep technical expertise and guidance to users and internal teams.</p><p>Candidates must be able to work onsite at least 4 days per week.</p><h3>Key Responsibilities</h3><ul><li>Act as the primary technical expert for Linux-based HPC clusters – ensuring performance, capacity, and availability targets are met.</li><li>Identify, diagnose, and resolve complex second-level issues for hardware, software, network, VPN, and Linux environments; escalate as needed with full documentation.</li><li>Manage daily operations of Linux-based HPC environments, including patching, upgrades, security hardening, and configuration of Ubuntu and RedHat systems.</li><li>Support job submission and workload management using Slurm or OpenHPC, and assist end users in optimizing compute workloads.</li><li>Migrate existing nodes to Linux, ensuring minimal downtime and performance impact.</li><li>Implement and manage cluster patching / automation tools such as Foreman (or similar) to streamline operations.</li><li>Install and configure servers, storage, hypervisors (KVM), and other HPC infrastructure components.</li><li>Automate administrative tasks to improve operational efficiency.</li><li>Execute firewall access requests, monitor security alerts, and assist in incident response.</li><li>Provide second-level support and mentorship to junior technical staff, ensuring knowledge transfer and consistent process execution.</li><li>Develop, maintain, and publish technical documentation, KB articles, and end-user guides for new systems or upgrades.</li><li>Participate in on-call rotations, emergency incident response, and occasional after-hours 
maintenance windows.</li></ul><h3>Education & Experience</h3><p>Diploma or degree in Computer Science, Information Technology, or a related field.</p><ul><li>At least 2 years in IT (with a related university degree) or at least 7 years in IT (with a three-year college diploma).</li><li>Enterprise-level Linux expertise (Ubuntu and / or RedHat) is essential.</li><li>Certifications (e.g., MCSE, CISSP) are strong assets.</li></ul><h3>Specialized Skills</h3><ul><li>Proven track record as a Linux SME in installation, tuning, and operational support.</li><li>In-depth experience with HPC clusters and job scheduling tools such as Slurm, LSF, or GridEngine.</li><li>Strong knowledge of KVM or similar hypervisors.</li><li>Working understanding of network systems, protocols, and standards, including Active Directory integration.</li><li>Identity management experience (Microsoft Identity Manager, Azure AD Connect).</li><li>Solid scripting skills (Bash required; additional scripting languages are an asset).</li><li>Experience applying advanced troubleshooting to resolve performance, configuration, or security issues.</li><li>Excellent problem-solving, organizational, and documentation skills.</li><li>Ability to communicate clearly with both technical and non-technical stakeholders.</li><li>Bilingualism (English / French) is an asset.</li><li>Microsoft Windows knowledge is an asset.</li></ul><h3>Decision Making & Supervision</h3><ul><li>Operate with minimal supervision while making decisions based on analysis, troubleshooting, and established procedures.</li><li>Coordinate with helpdesk, networking, platform, and security teams to ensure alignment of upgrades, patches, and operations.</li></ul><h3>Working Conditions</h3><ul><li>Comfortable office environment with periodic physical tasks (e.g., installing hardware).</li><li>Requires appropriate security clearance.</li><li>Must be willing to provide occasional off-hours support and participate in on-call rotation.</li></ul></p>