Site Reliability Engineer - OS/Cloud Native Applications

Cisco Systems Inc.
Feltham, England TW13 4
 

Title: Site Reliability Engineer

Location: Bedfont Lakes, Feltham, UK

We at Cisco are looking for an SRE/Cloud Engineer to join our IT team and enable our IaaS strategy in support of next-generation cloud-native applications.

Why you will love Cisco:

For years, Cisco's vision has been to change the way the world works, lives, plays, and learns. Our vision is more relevant today than ever. We made the Internet what it is today. First, we focused on creating connectivity. Now, we're entering the Internet of Everything transition, an era where we'll help create unprecedented value by connecting the unconnected. The Internet of Everything is a global industry phenomenon that is driving the biggest market transition for Cisco and our customers. This includes the intelligent connection of people, process, data, and things. It's where everything is converged on the Internet, making networked connections more relevant and valuable than before. To help us bring this vision to life, join us in our exciting journey.

We celebrate the creativity and diversity that fuels our innovation. We are dreamers and we are doers.

What You'll Do

You will join our Infrastructure Services (IS) team. You will help us move from scripts and playbooks to a fully automated pipeline running through our Continuous Integration/Continuous Delivery (CI/CD) system. We want to move to zero-touch automation of builds, patches, and upgrades across our hosting infrastructure, VMware, and OpenStack environments.

Responsibilities include:

  • Participate in Agile Scrum
  • Automate OS provisioning, lifecycle, and configuration management using Python, Ansible, Terraform, and other tools
  • Deliver releases through the CI/CD pipeline
  • Handle security patches and issues at the UCS hardware and RHEL OS layers
  • Diagnose and fix L3 support incidents
  • Serve as an Incident Management escalation point for major incidents
  • Undertake root cause analysis of major incidents
  • Work on monitoring and alerting requirements for our hosting platform
  • Deliver bug fixes, new features, and functionality as requested by our customers
  • Participate in the on-call rotation

Who You'll Work With

We are a DevOps team inside Cisco IT, maintaining and building the next-generation hybrid platform, which is hosted on-premises and in cloud environments. Our services will be used by all Cisco business units as we move to cloud-native applications. This is a team of highly motivated individuals leveraging Agile Scrum. We move at a fast pace and are passionate about the cloud, automation, and security. Giving back and contributing to open-source projects is encouraged.

Who You Are

Ideally, you know servers, cloud, and storage; you have proven experience with Python, Git, Ansible, and Terraform; and you understand the needs of IT infrastructure customers.

Technical Expertise:

  • BS/MS and 10 years of relevant experience.
  • Experience in leading large-scale infrastructure with more than 10k compute systems.
  • Red Hat Enterprise Linux build, development, and operations
  • Experience with configuration management tools (Ansible and/or Puppet)
  • Automation of OS configuration, builds, upgrades, and patching.
  • Software development lifecycle, including design, development, testing, packaging, deployment, upgrade, and support, with Git and Jenkins
  • Python, Ruby, or similar programming experience.
  • QA and testing experience of your code and the entire platform.
  • Understanding of security including OS hardening, firewalls, iptables, and working with Security
  • Understanding of Cisco UCS compute infrastructure
  • Availability to be on pager duty on weekends and on a rotation basis
  • Proven ability to respond to critical issues on a 24/7/365 basis and to own problems from beginning to end.

Non-Technical Requirements:

  • Agile software development practices
  • Work with geographically distributed teams
  • Understand IT processes, including architecture, design, implementation, and operations
  • Open-source development experience
  • Self-motivated, able and willing to help where help is needed
  • Able to build and establish relationships, be culturally sensitive, have goal alignment and learning agility

 

Posted: 2020-11-20 Expires: 2020-12-20
