PaloAltoRecruiter Since 2001
the smart solution for Palo Alto jobs

Site Reliability Engineer - All Levels - (Senior/Lead/Principal) (Multiple Locations)

Company: Salesforce
Location: Palo Alto
Posted on: May 3, 2021

Job Description:

To get the best candidate experience, please consider applying for a maximum of 3 roles within 12 months to ensure you are not duplicating efforts.

Job Category: Products and Technology

Note: By applying to the Site Reliability Engineer posting, you agree that recruiters and hiring managers across the organization who are hiring Site Reliability Engineers will review your resume. Our goal is for you to apply once and have your resume reviewed by multiple hiring teams.

Are you an upcoming or recent graduate (within the past 2.5 years)? Please check out our FutureForce program. We appreciate your interest, but for this role we are seeking industry-experienced engineers.

Salesforce, the Customer Success Platform and world's #1 CRM, empowers companies to connect with their customers in a whole new way. The company was founded on three disruptive ideas: a new technology model in cloud computing, a pay-as-you-go business model, and a new integrated corporate philanthropy model. These founding principles have taken our company to great heights, including being named one of Forbes's "World's Most Innovative Companies" five years in a row and one of Fortune's "100 Best Companies to Work For" eight years in a row. We are the fastest growing of the top 10 enterprise software companies, and this level of growth equals incredible opportunities to grow a career at Salesforce. Together, with our whole Ohana (Hawaiian for "family") made up of our employees, customers, partners and communities, we are working to improve the state of the world.

About Salesforce Tech and Product Engineering:

Our Tech and Product team is tasked with innovating and maintaining a massive distributed systems engineering platform that ships hundreds of features to production for tens of millions of users across all industries every day. Our users count on our platform to be highly reliable, lightning fast, supremely secure, and to preserve all of their customizations and integrations every time we ship. Our platform is deeply customizable to meet the differing demands of our vast user base, creating an exciting environment filled with complex challenges for our hundreds of agile engineering teams every day.

Check out our "We are Salesforce Engineering" video.

Departmental Description:

Salesforce is seeking an engineering candidate to join the Site Reliability organization in one of our US locations. Working closely with counterparts in the Infrastructure and R&D organizations, this organization provides a global team of engineers who monitor cloud service availability and are ready to swiftly repair any service-impacting issues. Seven days a week, 24 hours a day, in a follow-the-sun model, the Site Reliability team keeps the Salesforce cloud and our customers protected. As a member of the Site Reliability team, you will be tasked with detecting and resolving incidents within minutes. This objective is met by monitoring the services, reacting to problems, and proactively addressing issues before they affect performance or availability.

Position Description:

When not fighting fires, the team is responsible for fire prevention through monitoring, automation, self-healing and resiliency initiatives, destructive testing, and game day exercises. The incumbent in this role would demonstrate a strong focus on tactical operations, as well as large-scale production engineering and orchestration.

- Keep the customer-facing services available at top performance by maintaining the constant health of the supporting infrastructure
- Incident Management - act in key response roles during major incidents (e.g. Sev0, Sev1), and participate in the technical review of the incident for problem management
- Problem Management - populate and participate in Root Cause Analyses (RCAs) and hand them off to the Global Solutions team
- Ensure that work carried out by the Site Reliability team complies with the company's internal compliance policy and directives
- Be available to discuss and resolve technical issues and escalations with other technical staff as required
- Work with and lead other members of the team in staying on top of key industry innovation and technology, and assist in team development and growth
- Identify work opportunities and prepare, or assist with the preparation of, technical proposals as required
- Operate in a high-pressure environment, troubleshoot complex issues quickly, and successfully handle multiple priorities
- Work to automate detection and resolution of recurring issues in the production environment

Basic Requirements:

- Bachelor's degree in Computer Science or a related field, OR equivalent experience
- Systems engineering experience in enterprise-scale internet service engineering or a related role
- Expertise in TCP/IP-related technologies (networking protocols, network programming, etc.)
- Expertise in CLI enterprise support of Unix variants (Linux/Solaris/BSD), as well as strong Linux/UNIX knowledge with significant exposure to Red Hat Enterprise Linux and Solaris
- Experience with monitoring implementations and administration
- Strong communication skills (written and oral)
- Past experience in Incident Management and ITIL service operations
- Experience working in a 24/7 team managing large data centers

Preferred Qualifications:

- Master's degree in Computer Science
- Perl/Python/BASH scripting experience
- Prior Chef/Puppet or automated deployment experience
- Experience maintaining monitoring and alert systems
- Experience troubleshooting relational databases and distributed platforms
- Experience maintaining Java applications
- Experience in Docker orchestration and management
- Hands-on experience configuring and managing AWS (Amazon Web Services) using the CLI/SDKs
- Experience managing systems monitoring and alerts
- Experience with JVM optimization and Java server technologies like Tomcat or Jetty

Benefits & Perks:

We have a public-facing website that explains our various benefits, including wellbeing reimbursement, generous parental leave, adoption assistance, fertility benefits, and more.

Accommodations:

If you require assistance due to a disability when applying for open positions, please submit a request via this Accommodations Request Form.

Posting Statement:

At Salesforce we believe that the business of business is to improve the state of our world. Each of us has a responsibility to drive Equality in our communities and workplaces. We are committed to creating a workforce that reflects society through inclusive programs and initiatives such as equal pay, employee resource groups, inclusive benefits, and more. Learn more about Equality at Salesforce.

Salesforce is an Equal Employment Opportunity and Affirmative Action Employer. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender perception or identity, national origin, age, marital status, protected veteran status, or disability status. Salesforce does not accept unsolicited headhunter and agency resumes, and will not pay any third-party agency or company that does not have a signed agreement with Salesforce.

Salesforce welcomes all.

Pursuant to the San Francisco Fair Chance Ordinance and the Los Angeles Fair Chance Initiative for Hiring, Salesforce will consider for employment qualified applicants with arrest and conviction records.

Full time

Keywords: Salesforce, Palo Alto, Site Reliability Engineer - All Levels - (Senior/Lead/Principal) (Multiple Locations), Other, Palo Alto, California

