Softworld, Inc.
http://www.softworldinc.com
SITE RELIABILITY ENGINEER – Tier 3 Operations Premium Support
Job Description
TS/SCI clearance with polygraph required for this role.
Required Experience/Skills:
• Linux (at the command-line interface level)
• Docker
• Distributed computing systems
• Network infrastructure (configuration and operation)
• Terraform or other deployment automation pipeline solutions
Preferred Experience/Skills:
• 2 years' experience maintaining distributed systems
• Database Administration
• Scripting (Bash, Python)
Work with the Site Reliability Engineering (SRE) team on shared full-stack ownership of a collection of services and/or technology areas. Understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of production services. Take responsibility for the design and delivery of the mission-critical stack, with a focus on security, resiliency, scale, and performance. Partner with development teams to define and implement improvements in service architecture. Articulate the technical characteristics of services and technology areas, and guide development teams to engineer and add premier capabilities to a Cloud service portfolio. Understand and communicate the scale, capacity, security, performance attributes, and requirements of the service and technology stack. Act as the ultimate escalation point for complex or critical issues that have not yet been documented as Standard Operating Procedures (SOPs).
SRE Tier 3
Reston, VA 20170 US
Posted: 09/15/2023
2023-10-29
Employment Type: Contract
Industry: IT
Job Number: 241264