Platform Engineer
Job Description
Overview
Principal Platform Engineer, Reliability and Observability
Ncounter is hiring a senior Platform Engineer to own reliability and observability across a mission-critical trading platform. This is a deeply technical role focused on keeping complex, distributed systems stable, measurable, and predictable under real-time load. You will work directly on shared platform services that underpin trading and research workloads, where latency, partial failure, and blind spots in monitoring are not tolerated.
Observability is a core engineering concern here, not a bolt-on toolset. You will design and operate metrics, logging, tracing, and alerting pipelines that ingest high-volume telemetry, expose system behaviour under stress, and materially reduce operational risk. The role blends production engineering, platform tooling, automation, and reliability-led architecture, with direct ownership of systems running at scale.
Responsibilities
- Owning reliability and observability for shared platform services in Linux and Kubernetes environments
- Designing and operating high-throughput metrics, logging, and tracing pipelines for real-time systems
- Hardening services against latency degradation, cascading failure, and outages using reliability engineering principles
- Reducing operational toil through automation, GitOps workflows, and platform tooling
- Improving on-call signal quality through alert design, runbooks, and post-incident learning
- Partnering with engineers to bake observability and resilience into services by default
Core Technical Background
- Strong experience in SRE, production engineering, or platform reliability with ownership of live systems
- Deep Linux systems knowledge, including debugging and performance tuning
- Software engineering skills in Python or Go, plus solid Git and CI/CD experience
- Hands-on expertise with observability stacks covering metrics, logs, traces, and alerting
- Experience operating systems at scale, including HA, DR, and incident response
Nice to Have
- Infrastructure automation with Terraform or Ansible
This is a role for engineers who enjoy understanding how systems really behave under pressure and who want to own reliability as a first-class engineering problem. If you like solving hard platform problems where observability directly drives system correctness, this is worth a conversation.
Frequently Asked Questions
Who is hiring?
This role is with Ncounter Technology Recruitment in Toronto.
Is this a remote position?
This appears to be an on-site role in Toronto.