Infrastructure Manager - Linux Subject Matter Expert (34285)

Myticas Consulting

Ottawa, Canada

100 - 125 Posted: 7 hours ago

Job Description

<h3>Overview</h3>
<p>The Manager will serve as a Linux Subject Matter Expert (SME), responsible for monitoring, maintaining, troubleshooting, and supporting high-performance computing (HPC) nodes critical to our client’s day-to-day operations. The role focuses on ensuring a secure, optimized, and highly available HPC environment, while delivering deep technical expertise and guidance to users and internal teams.</p>
<p>Candidates must be able to work onsite at least 4 days per week.</p>
<h3>Key Responsibilities</h3>
<ul>
<li>Act as the primary technical expert for Linux-based HPC clusters, ensuring performance, capacity, and availability targets are met.</li>
<li>Identify, diagnose, and resolve complex second-level issues for hardware, software, network, VPN, and Linux environments; escalate as needed with full documentation.</li>
<li>Manage daily operations of Linux-based HPC environments, including patching, upgrades, security hardening, and configuration of Ubuntu and Red Hat systems.</li>
<li>Support job submission and workload management using Slurm or OpenHPC, and assist end users in optimizing compute workloads.</li>
<li>Migrate existing nodes to Linux, ensuring minimal downtime and performance impact.</li>
<li>Implement and manage cluster patching and automation tools such as Foreman (or similar) to streamline operations.</li>
<li>Install and configure servers, storage, hypervisors (KVM), and other HPC infrastructure components.</li>
<li>Automate administrative tasks to improve operational efficiency.</li>
<li>Execute firewall access requests, monitor security alerts, and assist in incident response.</li>
<li>Provide second-level support and mentorship to junior technical staff, ensuring knowledge transfer and consistent process execution.</li>
<li>Develop, maintain, and publish technical documentation, KB articles, and end-user guides for new systems or upgrades.</li>
<li>Participate in on-call rotations, emergency incident response, and occasional after-hours maintenance windows.</li>
</ul>
<h3>Education & Experience</h3>
<p>Diploma or Degree in Computer Science, Information Technology, or related field.</p>
<ul>
<li>Minimum of 2 years in IT (with a related university degree) or 7 years in IT (with a three-year college diploma).</li>
<li>Enterprise-level Linux expertise (Ubuntu and/or Red Hat) is essential.</li>
<li>Certifications (e.g., MCSE, CISSP) are strong assets.</li>
</ul>
<h3>Specialized Skills</h3>
<ul>
<li>Proven track record as a Linux SME in installation, tuning, and operational support.</li>
<li>In-depth experience with HPC clusters and job scheduling tools such as Slurm, LSF, or GridEngine.</li>
<li>Strong knowledge of KVM or similar hypervisors.</li>
<li>Working understanding of network systems, protocols, and standards, including Active Directory integration.</li>
<li>Identity management experience (Microsoft Identity Manager, Azure AD Connect).</li>
<li>Solid scripting skills (Bash required; additional scripting languages are an asset).</li>
<li>Experience applying advanced troubleshooting to resolve performance, configuration, or security issues.</li>
<li>Excellent problem-solving, organizational, and documentation skills.</li>
<li>Ability to communicate clearly with both technical and non-technical stakeholders.</li>
<li>Bilingualism (English/French) is an asset.</li>
<li>Microsoft Windows knowledge is an asset.</li>
</ul>
<h3>Decision Making & Supervision</h3>
<ul>
<li>Operate with minimal supervision while making decisions based on analysis, troubleshooting, and established procedures.</li>
<li>Coordinate with helpdesk, networking, platform, and security teams to ensure alignment of upgrades, patches, and operations.</li>
</ul>
<h3>Working Conditions</h3>
<ul>
<li>Comfortable office environment with periodic physical tasks (e.g., installing hardware).</li>
<li>Requires appropriate security clearance.</li>
<li>Must be willing to provide occasional off-hours support and participate in on-call rotation.</li>
</ul>