Cloud Managed Services for Azure Kubernetes Service (AKS)

Annex B

2. Annex B – SCOPE OF SERVICE

2.1 Scope

Sitecore shall provide the Managed Services on a 24/7/365 basis as set forth in this Annex B.

If the Managed Services Team cannot resolve an issue, it will escalate the issue to the Azure Provider or the respective Sitecore Solution Partner for further investigation.

2.2 Azure Infrastructure Support

2.2.1 SERVICE LEVEL AGREEMENT

Sitecore will use commercially reasonable efforts to ensure the Azure Infrastructure listed in Annex C is available twenty-four (24) hours a day, seven (7) days a week, subject to the following:

  • The availability guarantee for the Azure Infrastructure is covered by Azure’s Service Level Agreement for the availability of individual Azure services, which can be found at https://azure.microsoft.com/en-us/support/legal/sla/.
  • The availability commitment is subject to the Uptime Exclusions.

2.2.2 DELIVERABLES

The Managed Services Team will provision the Azure Infrastructure and implement the necessary processes and monitoring tools to provide the following Azure Infrastructure Support:

  • Monitoring
    • Domain monitoring
      • Uptime
      • Traffic (VM network interfaces, load balancers or third party)
    • Virtual machines
      • Percentage CPU
      • Network IN
      • Network OUT
      • Disk Reads operations per second
      • Disk Writes operations per second
      • Memory, if available
      • Change tracking on specific files
    • Azure SQL databases
      • Blocked by firewall
      • CPU percent
      • Physical data read
      • Storage percentage used
      • Deadlocks
      • DTU consumption (if not vCore-based)
      • Failed connections
      • Log write percentage
      • Sessions percentage
      • Connection successful
      • Workers percentage
    • Application gateways/Web Application Firewalls
      • Throughput
      • Unhealthy host count
      • Total requests
      • Failed requests
      • Response status
      • Current connections (depending on the Web Application Firewall SKU and normal traffic metrics)
    • Azure Kubernetes Service
      • Node CPU
      • Node memory working set percentage
      • Node Disk percentage
      • Pods Count
      • Pods OutOfMemory container count
      • Containers CPU exceeded percentage
      • Containers memory working set exceeded percentage
  • Monthly reporting
    • Traffic (e.g., load balancers, Application Insights, Google Analytics)
    • Root Cause Analysis
  • Health and Maintenance
    • Backup schedule – See Annex E – Backup/Restore
    • Monitor and apply OS and application patches, when applicable, to supported environment(s) during Scheduled Maintenance. For AKS, patching is performed by Customer by publishing a new Docker image to Azure Container Registry (ACR).
    • Patches and updates to either software or OS will first be deployed to the Development/Staging environment for Customer approval before Production deployment is scheduled.
  • Security
    • Patch servers regularly during Scheduled Maintenance when applicable
    • Use Just-in-time access to lock down all administrative ports
    • Use multi-factor authentication to additionally secure access to Azure Portal and Azure DevOps
    • Configure antivirus scans, depending on anti-malware software preferences, to identify infections and fix where needed. For Kubernetes, appropriate tooling will be discussed on a case-by-case basis.
    • Customer may not bring third-party software to be installed on the virtual machines except after discussion and upon express approval from Managed Services Team.
  • Monitoring Azure Usage
    • Via the Azure Portal, monitor total account usage and bandwidth for any overages and make recommendations on any account adjustments needed, provided that the Managed Services Team is assigned “Billing Reader” or “Owner” role at subscription level in Azure.
  • Azure Infrastructure Support does not cover:
    • Penetration testing and vulnerability assessment
    • Performance testing
    • Continuous integration/continuous deployment
    • Provision of Azure Resource Manager (“ARM”) templates or Terraform/Ansible scripts for the Azure Infrastructure
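For illustration only, the monitoring deliverables in 2.2.2 could be expressed as threshold rules evaluated against the latest metric samples. The sketch below is a minimal example under stated assumptions: the names ThresholdRule, RULES and breached are hypothetical, and the threshold values are invented for the example, since this Annex does not define any alert thresholds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThresholdRule:
    resource: str     # resource type as listed in 2.2.2
    metric: str       # metric name as listed in 2.2.2
    threshold: float  # hypothetical alert threshold, in percent


# Hypothetical thresholds; the Annex does not specify any values.
RULES = [
    ThresholdRule("Virtual machines", "Percentage CPU", 90.0),
    ThresholdRule("Azure SQL databases", "Storage percentage used", 85.0),
    ThresholdRule("Azure Kubernetes Service", "Node memory working set percentage", 80.0),
]


def breached(rules, samples):
    """Return the rules whose latest sample is above the threshold.

    `samples` maps (resource, metric) to the most recent value in percent.
    Metrics without a sample are skipped rather than treated as breaches.
    """
    result = []
    for rule in rules:
        value = samples.get((rule.resource, rule.metric))
        if value is not None and value > rule.threshold:
            result.append(rule)
    return result


# Example samples (made-up values for demonstration).
samples = {
    ("Virtual machines", "Percentage CPU"): 95.5,
    ("Azure SQL databases", "Storage percentage used"): 40.0,
}
for rule in breached(RULES, samples):
    print(f"ALERT {rule.resource}: {rule.metric} above {rule.threshold}%")
```

In practice the samples would come from whichever monitoring tool is agreed during onboarding; the sketch only shows how a rule list and a set of readings might be compared.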

2.3 Sitecore Application Support

2.3.1 SERVICE LEVEL AGREEMENT

Sitecore will use commercially reasonable efforts designed to ensure that the Sitecore Application is available twenty-four (24) hours a day, seven (7) days a week, which commitment is subject to the Uptime Exclusions.

2.3.2 DELIVERABLES

  • Monitoring
    • Uptime monitoring, which is typically performed using Azure Application Insights.
    • Define and configure monitoring tools and alert triggers.
  • Application Support
    • Create new Sitecore servers, Kubernetes Clusters and databases
    • Assist with Sitecore configuration
    • Log review and analysis during Scheduled Maintenance
    • Cache optimization
    • Sitecore index maintenance
    • Custom configurations (e.g., configuring GeoIP)
    • Log review and analysis after deployments, up to 3 times per calendar month
    • Assist with backup restores
    • Troubleshooting slow performance
    • Maintenance of known DB tables (event_queue, publish_queue)
    • Monitor databases for fragmentation and defragment as needed
    • Archive and remove application logs older than 30 days or as explicitly agreed
    • Assist in communicating with the Sitecore Product Support Team
    • Assist in communicating with Cloudflare, Akamai or other vendors, if necessary
    • Assist in communicating with SearchStax or other SaaS search vendors, if necessary
  • Sitecore Application Support does not cover:
    • Sitecore Application upgrades, patch creation or other work that requires code updates and deployments.
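As an illustration of the log retention deliverable in 2.3.2 (archiving and removing application logs older than 30 days), the following minimal sketch lists candidate files. It assumes that log age is measured by file modification time and that logs match the pattern *.log; both are assumptions made for the example, and the function name logs_to_archive is hypothetical. The sketch only selects files and does not archive or delete anything.

```python
from datetime import datetime, timedelta, timezone
from pathlib import Path

RETENTION_DAYS = 30  # default from 2.3.2; may differ if explicitly agreed


def logs_to_archive(log_dir, now=None, retention_days=RETENTION_DAYS):
    """Return *.log files in log_dir last modified more than retention_days ago.

    Assumption: "older than 30 days" is judged by modification time.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(days=retention_days)).timestamp()
    return sorted(
        path
        for path in Path(log_dir).glob("*.log")
        if path.is_file() and path.stat().st_mtime < cutoff
    )


# List candidates in the current directory without modifying them.
for path in logs_to_archive("."):
    print(path)
```

Separating selection from removal in this way would allow the list to be reviewed before any file is archived or deleted.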

2.4 Incident Management Services

Sitecore shall provide services to address Incidents impacting the Azure Infrastructure or Sitecore Application.

2.4.1 DELIVERABLES

  • Managed Services Team will respond to Incidents as outlined in Annex F – Communication Process.
  • When an Incident meets the criteria as defined in 1.2 Priority Definitions, Managed Services Team shall investigate the Incident and use commercially reasonable efforts to mitigate the impact.
  • Critical Incidents will automatically activate the on-call process to involve the Managed Services Team and ensure the fastest possible resolution. Customer can also activate this process by sending an email to the critical distribution list provided during the Onboarding Phase. Non-critical Incidents are handled during business hours and are not subject to the 24/7 coverage, as noted in Table 2.4.2 – Incident Response Times below.

2.4.2 INCIDENT RESPONSE TIMES

Priority | Incident severity | Initial Managed Services Team response time
1        | Critical          | 30 minutes
2        | Major             | 120 minutes
3        | Minor             | 8 business hours (09:00 – 17:00 EST/EDT)
4        | Low               | 2 business days (09:00 – 17:00 EST/EDT)
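As a worked illustration of the response times in 2.4.2, the sketch below computes the latest initial response time for an Incident. It rests on several assumptions that are not stated in this Annex: timestamps are naive datetimes already expressed in US Eastern time; business days are Monday to Friday with no holiday calendar; "2 business days" is counted as 16 business hours; and Priority 2 is counted in elapsed clock time because the table attaches no business-hours qualifier to it. All function and variable names are hypothetical.

```python
from datetime import datetime, time, timedelta

OPEN, CLOSE = time(9, 0), time(17, 0)  # business hours from 2.4.2

# priority -> (severity, response window, counted in business hours only?)
RESPONSE_TARGETS = {
    1: ("Critical", timedelta(minutes=30), False),
    2: ("Major", timedelta(minutes=120), False),  # assumption: clock time
    3: ("Minor", timedelta(hours=8), True),
    4: ("Low", timedelta(hours=16), True),  # assumption: 2 days x 8 hours
}


def next_business_moment(t):
    """Move t forward to the nearest moment inside business hours."""
    while True:
        if t.weekday() >= 5 or t.time() >= CLOSE:
            # Weekend or after closing: jump to the next day's opening.
            t = datetime.combine(t.date() + timedelta(days=1), OPEN)
        elif t.time() < OPEN:
            t = datetime.combine(t.date(), OPEN)
        else:
            return t


def add_business_time(start, amount):
    """Add `amount` of time to `start`, counting business hours only."""
    t = next_business_moment(start)
    remaining = amount
    while True:
        end_of_day = datetime.combine(t.date(), CLOSE)
        available = end_of_day - t
        if remaining <= available:
            return t + remaining
        remaining -= available
        t = next_business_moment(end_of_day)


def response_deadline(priority, reported_at):
    """Return the latest initial response time for an Incident."""
    _severity, window, business_only = RESPONSE_TARGETS[priority]
    if business_only:
        return add_business_time(reported_at, window)
    return reported_at + window


# Example: an Incident reported on Friday, 1 March 2024 at 16:00 Eastern.
reported = datetime(2024, 3, 1, 16, 0)
for priority, (severity, _window, _flag) in RESPONSE_TARGETS.items():
    print(priority, severity, response_deadline(priority, reported))
```

For a Priority 3 Incident reported one hour before close on a Friday, the remaining seven business hours carry over to the following Monday, which is why the business-hours priorities are handled separately from the clock-time priorities.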