Site Reliability Engineer

Job Description

The Water Institute is seeking an experienced and motivated Site Reliability Engineer to enhance the resilience of the FloodID application and lead on-call incident response, minimizing downtime and resolving issues swiftly. FloodID provides actionable real-time flood forecasts and custom decision support tools to emergency operations center managers. To learn more, please watch this 5-minute overview video of FloodID on YouTube and see this LinkedIn post.

Originally developed with funding support from the State of Louisiana, FloodID is now in use by two emergency operations centers, one statewide and one citywide, with additional business development efforts to expand usage in and around the Gulf Coast and the Eastern seaboard underway. Currently there is a fractional technical development team, including one senior full-stack developer, one data engineer, and one senior computational scientist (i.e., numerical modeler and HPC lead), supported by a senior product manager and the director of the Digital Solutions department at The Water Institute. This position will report to the Senior Computational Scientist.

Roles and Responsibilities:
• Reduce manual work by developing and maintaining scripts and tools to automate recurring tasks and streamline incident response.
• Implement monitoring tools to proactively identify and resolve potential issues.
• Identify existing infrastructure risks and work with the team to develop pragmatic plans for addressing them.
• Optimize system performance and enhance stability, scalability, and resilience.
• Act as the on-call engineer, responding quickly to incidents, especially during hurricane season (June – Nov).
• Diagnose, troubleshoot, and resolve incidents to minimize user impact.
• Coordinate with colleagues to ensure rapid incident resolution and root cause analysis.
• Document incidents, responses, and resolutions in a clear and structured format.
• Contribute to the team's standard operating procedures for handling incidents.
• Hands-on coding and development.
• Implement best practices for testing, code reviews, and quality assurance to ensure software reliability and security.
• Stay up-to-date with industry trends, emerging technologies, and best practices in site reliability and share knowledge with the team.

ClimateTechList is the web's largest aggregator of climate, clean tech, renewable energy & green jobs. Contact us if you'd like to partner with us or use our current or historical jobs data in any way.

Apply to Job

👉 Please mention that you found the job on ClimateTechList. This helps us get more climate tech companies listed here, thanks!

Get a referral to The Water Institute

If possible, try to get a warm intro/referral to The Water Institute before applying! Do a LinkedIn search to see who you may know at the company. See this LinkedIn post from Steven for more details on this tactic.

All job openings from The Water Institute

Join ClimateTechList Talent Collective

Want to be matched with companies directly? Apply to the talent collective.

Here's how it works:

  1. You submit an application

  2. We'll share your profile with climate tech companies potentially interested in chatting with you

  3. We'll reach out if there's a company interested in talking to you.
