- Ready to take responsibility for all the dimensions of application reliability – including availability, performance, efficiency, latency, change handling, monitoring, emergency response, and scope/capacity planning.
- Handle production incidents and drive the development and operations teams toward resolution.
- Empowered to fix application issues in production and ensure that minor errors do not escalate into major issues.
- Passion for investigating issues and finding their root cause. Recommend software and tools, and collaborate across teams within the organization.
- Proactively diagnose problems using a holistic knowledge set, then code a permanent fix, rewrite a process, or work with third parties to ensure that lessons are learned and the problem does not recur.
- Employ automation tools (such as Chef or Puppet) to automate repetitive tasks and releases for a more efficient workflow.