Job Description
Jabil is a product solutions company providing comprehensive design, manufacturing, supply chain and product management services. Operating from over 100 facilities in 29 countries, Jabil delivers innovative, integrated and tailored solutions to customers across a broad range of industries and end-markets, such as automotive, consumer lifestyle and wearable tech, defense and aerospace, connected home and building, industrial and energy, enterprise and infrastructure, healthcare, mobility, packaging and printing.
As a Site Reliability Engineer within Jabil’s Cloud Test Software Development team, you will directly contribute to the daily operations and development of our Cloud Test Platform, which is deployed at multiple production facilities worldwide. Key responsibilities include providing first-line response to production issues (including, but not limited to, outages, end-user performance, change management, and monitoring), improving the efficiency and usability of production applications, and ensuring that all site software and hardware are kept up to date to maintain high levels of performance and reliability.
MANAGEMENT & SUPERVISORY RESPONSIBILITIES
- Reports to Management
ESSENTIAL DUTIES AND RESPONSIBILITIES
- Lead operational and development support for the software and test infrastructure deployed at our production facilities
- Provide incident response, analysis and corrective actions for all site operations.
- Design, develop, and maintain product packaging, installation, upgrades, management and administration scripts and utilities
- Develop and implement centralized logging and monitoring tools.
- Continually maintain and improve software build methodology, procedures, and environment.
- Monitor and alert based on system metrics, log file analysis, and custom alert rules.
- Define and collaborate on software features, providing insights based on site operations and uptime challenges.
- Actively participate in peer/code reviews.
- Drive closed-loop response to factory test failures by identifying test gaps and future opportunities.
- Expertise in the following programming/scripting languages: C, C++, Python, Ruby, Java, BASH, Expect.
- Experience in the Linux environment and a good understanding of its fundamentals and internals.
- Well-versed in the following container/virtualization environments: VMware, Docker, Kubernetes.
- Solid understanding of large-scale distributed systems in practice, including multi-tier architectures, application security, monitoring and storage systems.
- Experience with application integration using web services.
- Experience with common web APIs (REST, XML-RPC).
- Expertise with networking systems, hardware, software, and protocols, including but not limited to enterprise Ethernet data center switching/routing (L1 – L3).
- Demonstrated systematic problem-solving capability, coupled with strong communication skills and a sense of ownership and drive.
- Ability and desire to debug and optimize code and automate routine tasks.
- Prior experience with code versioning tools (Git preferred).
- Database experience (SQL or similar).
EDUCATION & EXPERIENCE REQUIREMENTS
- BS degree in Electrical/Computer Engineering, Computer Science or related field. MS preferred.
- 1-3 years of software engineering and/or IT operations and infrastructure experience.
- Excellent verbal and written communication skills.
- Experience working in multi-site and multicultural environments.
- Domestic and/or international travel, up to 10%, may be required.