The DOF Production Support and Application Monitoring Support Team will support the following:

Core Infrastructure

-Creation and configuration of non-prod environments for all in-scope applications.

-Implementation of the ELF for non-production environments

-Triage and resolution of non-prod environment-related issues

-Deployment of application baselines to non-production environments for all in-scope applications

-Certificate renewals

-Configuration and maintenance of the CI/CD pipelines

-Performance environment (PT WYN, PT DOR) creation, configuration, deployment and support.

-DB activities for non-prod: Install/Maintain application schemas, DB issue resolution, DB configuration, DB maintenance scripts

-Git administration, Artifactory, SonarQube, ETL admin, UCD admin, UCD scripts, Jenkins configuration, Logstash, ELK, JMeter

-Onboarding and offboarding of users for access to the non-prod environments.

-Implementation of the ELF for production environments

-Deployment of application baselines to production for all in-scope applications

-Certificate renewals for prod environments

-MOP updates and reviews for all production deployments

-Configuration and maintenance of the CI/CD pipelines

-Production patching activities for in-scope applications

-Production monitoring (Liveness Probe, BM worker node, DataGrid, SOSS, pod restarts)

-DB activities for Production: Install/Maintain application schemas, DB issue resolution, DB configuration, DB maintenance scripts, Optimization

-Git administration, Artifactory, SonarQube, ETL admin, UCD admin, UCD scripts, Jenkins configuration, MDM server admin, Logstash, ELK, JMeter

-Onboarding and offboarding of users for access to the non-prod environments.

AMS Resources

-Manage traffic diversion during deployments

-Validation of code deployment success via backdoor sanity checks in OM

-Post-deployment health monitoring

-Hourly post-deployment reporting

-Production patching activities for in-scope applications

-Production monitoring (Liveness Probe, BM worker node, DataGrid, SOSS, pod restarts)

-Report on System Health Metrics using Dynatrace

-Monitor and action alerts using Bell monitoring tools (Dynatrace, BAM, Grafana)

-Monitor the DB server through daily sanity checks

-Verify Table Space status and warn if it's reaching capacity

-Verify Disk Space status and warn if it's reaching capacity

-Verify Memory and Processor usage and warn if it's reaching capacity

Production Monitoring:

-Diagnosing and tracking Critical (P1) and High (P2) severity Incidents and problems through to Resolution

-Providing the required Production Logs or access to Production Logs to analyze the incidents.

-Provide the Root Cause Analysis for all Critical Incidents.

-Repairing data and associated work caused by invalid data where validation code does not exist or where a documented Incident caused by a transaction results in failures.

-Providing workarounds for Critical and High Incidents

-Updating relevant system, configuration or process documentation.

-Document and promptly notify Bell of any emergency changes required.

-Participate in AMS Operations Governance meetings (assumed to be bi-weekly)

-Responding to Application-related questions, performing data extraction as required

-Handling ad-hoc requests from end users for information, queries, or reports.

-Providing holiday support coverage

-Performing peak period monitoring and reporting for specific critical applications

-Perform daily health checks for Critical applications.
