Site Reliability Engineer
Job Title: Site Reliability Engineer
Location: City of London, London
Contact Name: Ben Small
Job Published: February 14, 2018 13:43
Based in London
A Site Reliability Engineer works with project teams to build and maintain automated deployment processes, encourages common code ownership of the deployment stack among project developers, helps break down silos between operations and development, and monitors and administers cloud-based infrastructure and platforms.
The Analytics Platform is a modern data analytics solution for government. It manages reproducible data pipelines, gives data scientists access to analytical tools including RStudio and IPython/Jupyter notebooks, and provides a hosting platform for web-based data applications. Running within AWS, it uses a number of AWS services and Docker with Kubernetes to provide a dynamic, elastic hosting platform. As such, the successful applicant must have previous experience maintaining and managing applications with a modern container scheduler.
● Take ownership of ongoing infrastructure configuration, management and development
● Configure and tune Linux servers
● Contribute to platform architecture and design
● Design and develop monitoring and logging services for both product components and security functions
● Troubleshoot infrastructure and cluster networking issues, and work to improve reliability and performance
● Assist in developing comprehensive automated testing for infrastructure and deployment pipelines
Skills & Qualifications
* Experience of running Kubernetes clusters in production
* Familiarity with the Kubernetes API and ecosystem
* Knowledge of container networking solutions such as Calico, Weave and Canal
* Building and deploying with Docker using best practices
* Continuous integration/deployment with Jenkins
* Building infrastructure and configuration with Terraform
* Using AWS services and APIs including EC2, VPC, S3, IAM, Route53
* Shell scripting
* HTTP, TLS/SSL
* Monitoring and logging software such as Elasticsearch/Logstash/Kibana, Fluentd, Grafana and Prometheus
* HMG Cloud Security Principles
* Federated identity, OAuth, OIDC, SAML
Lawrence Harvey is acting as an Employment Business with regard to this position.
Visit our website www.lawrenceharvey.com and follow us on Twitter @lawharveyjobs for all live vacancies.