Sr SRE / Bangalore / 4 to 10 years
Senior Site Reliability Engineer
About the job
Who We Are
The Cloud Infrastructure Engineering team at Cisco drives the technology that's transforming the way IT departments secure their networks and, more importantly, their users. As an Engineer, you will be a member of the team supporting the existing network-as-a-service platform that runs our (and our customers') business, as well as building the team that is designing, developing, and running the next-gen systems that will carry Cisco's new security and network SASE offerings. At its core, the Cloud Infrastructure team provides Infrastructure as a Service for engineering in dozens of our data centers globally.
What You'll Be Doing
Our engineers have experience running large, distributed systems both on bare metal and on cloud services (bonus points for Amazon Web Services). Our systems process 250+ BILLION transactions per day! You will join a team that focuses on quality, pragmatic solutions to issues. Operating in an automation-first environment, you will work alongside our development team to understand the current architecture and moving pieces. We own these applications end-to-end: architecture, software development, infrastructure, and the operational component. The challenges can be significant. At times, it may mean rewriting or refactoring an application in order to automate everything and remove human dependency. Key responsibilities are:
Implement frameworks to scale up product line
Measure, monitor, and improve availability, resilience and latency across micro-services
Automate and streamline global deployments
Participate in a 12x7 on-call rotation
Who You Are
To join the team, we are looking for candidates with a background in Linux/BSD systems and operation at scale. Strong candidates can show skill in networking and automation, with Ansible and Python experience. While it would be great if you already know something about DNS, as long as you have a basic understanding and the ability to learn, we are always happy to teach new team members the ropes. We also want to know that you've been responsible for time-sensitive, mission-critical systems with a high attention to detail. Our services are the heart of the Cisco Umbrella product, and we take ownership of that very seriously -- our SLOs are high. You have proven all the vital skills to build and improve these systems over the course of your career -- this isn't your first rodeo.
8+ years of experience in SRE or software development
Experience scaling large-scale transaction production systems for operational resiliency
Solid knowledge of the Linux kernel, networking, and internet technology
Enjoy troubleshooting and root-causing complex issues across micro-service architectures
Automation with Java, Python, or Go
Experience with one or more virtualization platforms, such as Docker, OpenStack, or Kubernetes
Experience with one or more monitoring and visualization tools, such as Prometheus, Nagios, Datadog, or Grafana
Experience supporting a 12x7 on-call rotation
Qualifications
Background in Debian/Linux systems administration
Familiarity with AWS or another cloud infrastructure
Experience running a highly visible, 24x7 mission-critical service using DevOps practices
Background using Ansible, Terraform, or similar configuration management and automation tools
Ability to participate in a 24/7/365 on-call rotation and resolve production issues within SLAs
The health and safety of Cisco's employees, customers, and partners is a top priority. Our goal is to protect and mitigate the spread of COVID-19 infection for strong business resiliency during the pandemic. Therefore, Cisco requires all new hires to be fully vaccinated against COVID-19 in the U.S., unless otherwise prohibited by applicable law, and in countries where COVID-19 vaccination is legally required. The company will consider legally required accommodations/exceptions for medical, religious, and other reasons as per the requirements of the role and in accordance with applicable law. Additional information will be provided to candidates about the requirements and accommodation process at the offer time based on region.