Site Reliability Engineer - Kubernetes/Terraform/Python
Location: Palo Alto
Posted on: August 5, 2022
Pangea is a well-funded Series A rocketship led and backed
by veterans of the security industry. We are a product-led company
whose mission is to deliver an amazing product built specifically
for developers. We are hiring talented software engineers to build
a collection of cloud-agnostic security services. Engineers who are
passionate about innovating in the security space and driven to
deliver exceptional product experiences for developers are an ideal
fit for this role.
- Ensure the quality of orchestration and integration of tools
needed to support daily operations for Cloud Applications and
- Implement cloud provider capabilities and services especially
as they relate to deployment, monitoring and incident/alert
- Implement cloud capabilities to enable and support SLAs of the
entire platform and product and 24x7 availability of services.
- Automate and orchestrate various parts of the CI/CD
- Lead certification efforts regarding testing and implement
performance and scale testing of the product and
- Proficient in networking and service mesh technologies like
Envoy and Istio.
- Proficient in at least one compliance standard (SOC 2,
ISO 27001, PCI, HIPAA, FedRAMP, etc.) and able to implement
- Proficient in infrastructure management and monitoring for
delivering reliable services with the required SLOs, SLAs and SLIs.
- Develop documentation regarding design of implemented
- Experience with Total Cost of Ownership (TCO) & Cost of Goods
Sold (COGS) analysis and benchmarking.
- Coordinate systems design and deployment with the greater
- Build infrastructure as code using Terraform, Ansible and
- Partner with developers and quality engineering teams to
automate the monitoring, alerting, availability and scalability of
our applications and systems.
- Follow SRE best practices and procedures.
- Experience in Go and/or Python
- Scaling and maintaining production systems on AWS and/or
- Managing Kubernetes in a large scale production
- Extensive background in developing and operating large-scale
cloud-based distributed applications
- Direct experience developing/running applications on AWS and
- Laser focus and the ability to design infrastructure solutions for
scalability, reliability, high availability, performance, software
maintainability, and operational excellence
- Well-versed in infrastructure-as-code software (e.g.
Terraform, AWS CloudFormation, Google Cloud Deployment Manager).
- Experience with serverless architecture is preferred (e.g.
- 5 years' experience in continuous integration practices and tools
(Jenkins, Travis CI, CircleCI, etc.)
- Linux administration in a large-scale SaaS environment.
- Experience with monitoring solutions such as CloudWatch,
Stackdriver, Prometheus, Graphite, Grafana, ELK, SignalFx, Splunk,
Alert Logic, Datadog.
- Experience with Kafka, Mesos, Spark, Storm, Cassandra,
Elasticsearch, PostgreSQL, Redis, ZooKeeper, Nginx.
If you like building products for developers that are simple and
intuitive to use, and enjoy being responsible for solving extremely
complex problems, then please submit your application because we
would love to speak with you.
Different people approach problems differently. We need that.
Pangea is committed to diversity as well as inclusion. We are an
Equal Opportunity workplace and Affirmative Action employer. We do
not discriminate in employment decisions on the basis of race,
color, religion, gender (including pregnancy), national origin,
political affiliation, sexual orientation, gender identity or
expression, marital status, disability, genetic information, age,
veteran status, or any other applicable legally protected
characteristic. All employment decisions are made on the basis of
individual qualifications, merit, and business needs.