Site Reliability/DevOps Engineer - Cisco Digital Network Architecture
What You'll Do
You are a deeply motivated Site Reliability Engineer with a background in DevOps/SRE software development and operations. The ideal candidate must have experience building, shipping, and operating a software-as-a-service (SaaS) product, and would have managed such products using cloud-native principles with exposure to cloud technologies. This position will enable continuous monitoring and management of infrastructure while providing timely response, within designated SLA times, to service-affecting faults and performance issues. As an SRE, you will work closely with our Managed Services Team to diagnose and characterize issues, provide continuous improvement, and develop infrastructure best practices. You will be driven to build highly scalable, fault-tolerant, and easy-to-administer infrastructure. You must be proactive and organized, diligent about documentation, and passionate about monitoring and automating everything.
Who You'll Work With
Cisco is transforming the networking industry. To make this happen, we are heavily investing in the team responsible for The Network. Intuitive. We are disrupting the industry by building a new networking platform that can learn, adapt, and secure itself at the speed of today's businesses. This Digital Network Architecture platform automates network management and provides our customers with state-of-the-art analytics and insights. This team's innovations span artificial intelligence, machine learning, analytics, IoT, security, automation, and more.
Who You Are
This role is primarily to apply your SRE skills to create a complete self-serve Software Delivery Machine. The targeted platform will support a vast number of cloud and hybrid customers. The candidate is expected to have strong hands-on skills and will guide and contribute technically to the infrastructure engineering.
Develop full-fledged software tooling to deliver programmable infrastructure (infrastructure as code)
Develop tooling to drive end-to-end microservices monitoring and management
Implement Kubernetes compliance and best practices in terms of security, audits, network policies, and reporting
Develop a self-service console to provide infrastructure visibility
Responsibilities
Manage the availability, scalability and performance of the Infrastructure platforms
Create the tools and infrastructure leveraged by the rest of the engineering teams
Diagnose and repair network, application, and hardware bottlenecks
Test and tune network, hardware, and software configurations to maximize performance
Deploy and manage monitoring and diagnostic tools
Monitor systems, databases, and networks for proper operation and performance
Provide 7x24 on-call support for the operations infrastructure
Create and maintain continuous integration (CI) and continuous deployment (CD) environments to facilitate an agile development process
Work is generally expected to take place during normal working hours. However, the Platform Operations Team provides Tier2 and Tier3 7x24x365 on-call escalation, and candidates should be flexible with schedules to meet the needs and demands of the business.
Qualifications
Strong knowledge of core Enterprise LINUX (Red Hat/CentOS) with a focus upon building, maintaining, securing and performance tuning systems
Proven experience in capacity planning, performance tuning, and infrastructure architecture. Experience scaling web, application, and data systems horizontally and vertically
Experience with K8S and other virtual infrastructure platforms
High-level shell fluency + one or more scripting languages (Python, Go, Perl, or similar)
Experience with system automation using Ansible
Experience with monitoring, alerting, and pipeline analysis tools
Experience with queuing/data-pipelining
Experience with SQL/NoSQL systems such as PostgreSQL, MySQL, Cassandra, or Redis
Experience in the development of operational procedures, processes, and scripts
The candidate is expected to have strong hands-on skills and will guide and contribute technically to the product
BS/MS in Computer Science or related area
Four or more years of relevant work experience
Hands-on experience working with Kubernetes infrastructure
Kubernetes Certification is highly preferred
Expert understanding of Kubernetes internals (clustering, scheduling, controllers, API server, etc.)
Very good understanding of container networking
Very good software programming skills using Go/Python/YM
Excellent understanding of microservices architecture
Experience with Kubernetes monitoring tools (Prometheus)
Why Cisco
At Cisco, each person brings their unique talents to work as a team and make a difference. Yes, our technology changes the way the world works, lives, plays and learns, but our edge comes from our people. We connect everything (people, process, data and things) and we use those connections to change our world for the better. We innovate everywhere - from launching a new era of networking that adapts, learns and protects, to building Cisco Services that accelerate businesses and business results. Our technology powers entertainment, retail, healthcare, education and more, from Smart Cities to your everyday devices. We benefit everyone - we do all of this while striving for a culture that empowers every person to be the difference, at work and in our communities.