SRE (Site Reliability Engineer) Job at innovitusa, Salem, OR

  • innovitusa
  • Salem, OR

Job Description

Hiring: W2 Candidates Only

Visa: Open to any visa type with valid work authorization in the USA

We are seeking a highly skilled Site Reliability Engineer (SRE) to build, scale, and maintain our production infrastructure. The ideal candidate blends software engineering expertise with strong operational discipline. You will ensure the reliability, availability, security, and performance of our cloud-based systems while driving automation and continuous improvement across engineering teams.

Key Responsibilities

  • Design, build, and manage highly scalable and reliable infrastructure across cloud environments (AWS/Azure/GCP).
  • Develop automation for deployment, monitoring, scaling, and recovery using tools such as Terraform, Ansible, Helm, or CloudFormation.
  • Implement CI/CD pipelines and partner with development teams to enhance deployment velocity and operational stability.
  • Monitor system performance using tools like Prometheus, Grafana, Datadog, ELK Stack, or CloudWatch.
  • Perform incident response, root cause analysis (RCA), and postmortems to ensure continuous improvement.
  • Build and maintain robust alerting systems and SLOs/SLIs to uphold service-level reliability targets.
  • Improve system resilience through capacity planning, chaos engineering, fault-tolerance testing, and disaster recovery strategies.
  • Maintain and enhance security posture, ensure compliance, and enforce operational best practices.
  • Manage containers and orchestration platforms such as Docker and Kubernetes at scale.
  • Collaborate with cross-functional teams to drive reliability, performance tuning, and cost optimization.

Required Skills & Qualifications

  • Bachelor's degree in Computer Science, Engineering, or a related technical field.
  • 4-8 years of SRE, DevOps, or Cloud Engineering experience.
  • Strong proficiency in cloud platforms: AWS, Azure, or GCP.
  • Expertise with infrastructure-as-code tools (Terraform, CloudFormation, Pulumi, Ansible).
  • Hands-on experience with Kubernetes, Docker, and container orchestration.
  • Strong scripting/programming skills in Python, Go, Bash, or similar.
  • Solid understanding of networking fundamentals (DNS, TCP/IP, load balancing, VPC).
  • Experience with monitoring, log management, and observability tools.
  • Strong problem-solving, debugging, and troubleshooting skills in large-scale distributed systems.
  • Good communication skills and ability to work in fast-paced, collaborative environments.

Preferred Qualifications

  • Experience supporting microservices-based architectures.
  • Knowledge of serverless technologies (Lambda, GCP Cloud Functions, Azure Functions).
  • Experience with GitOps tools (ArgoCD, Flux).
  • Background in security hardening, compliance, or cloud architecture.
  • Familiarity with chaos engineering tools (Gremlin, LitmusChaos).
  • Experience in on-call rotations with strong incident management skills.

Job Tags

Full time
