This article provides troubleshooting steps for when you receive a CPU critical alert for Magento Commerce in New Relic. Immediate action is required to resolve the issue.
The alert will look something like the following, depending on the alert notification channel you selected.
Affected products and versions
Magento Commerce Cloud Pro.
You will receive a managed alert in New Relic if you have signed up for Managed alerts for Magento Commerce and one or more of the alert thresholds have been surpassed. Magento developed these alerts, using insights from Support and Engineering, to give customers a standard set of alerts.
Do:
- Abort any scheduled deployment until this alert is cleared.
- Put your site into maintenance mode immediately if your site is or becomes completely unresponsive. For steps, refer to DevDocs Installation Guide > Enable or disable maintenance mode. Make sure to add your IP to the exempt IP address list so that you can still access your site for troubleshooting. For steps, refer to DevDocs Maintain the list of exempt IP addresses.
Don't:
- Launch additional marketing campaigns that may bring additional pageviews to your site.
- Run indexers or additional crons that may put additional stress on CPU or disk.
- Do any major administrative tasks (for example, Magento Admin tasks or data imports and exports).
- Clear your cache.
If you perform any of the "Don't" actions before you have investigated and resolved the cause of the alert, your site may become unresponsive (if you are not already experiencing a site outage).
Follow these steps to identify and troubleshoot the cause.
Because this is a critical alert, it is highly recommended you complete Step 1 before you try to troubleshoot the issue (Step 2 onwards).
- Check if a Magento support ticket exists. For steps, refer to KB Track your support tickets. Support may have received a New Relic threshold alert, created a ticket and started working on the issue. If no ticket exists, create one. The ticket should have the following information:
- Use New Relic APM's Transaction page to identify transactions with performance issues:
- Sort transactions by ascending Apdex scores. Apdex measures user satisfaction with the response time of your web applications and services. A low Apdex score can indicate a bottleneck (a transaction with a high response time). The bottleneck is usually related to the database, Redis, or PHP. For steps, refer to New Relic View transactions with highest Apdex dissatisfaction.
- Sort transactions by highest throughput, slowest average response time, most time-consuming, and other thresholds. For steps, refer to New Relic Find specific performance problems.
- If you are still struggling to identify the source, use New Relic APM's Infrastructure page to identify resource-heavy services. For steps, refer to New Relic Infrastructure monitoring Hosts page > Processes tab.
- If you identify the source, SSH into the environment to investigate further. For steps, refer to DevDocs Magento Commerce Cloud > SSH into your environment.
- If you are still struggling to identify the source:
- Review recent trends to identify issues with recent code deployments or configuration changes (for example, new customer groups and large changes to the catalog). It is recommended that you review the past 7 days of activity for any correlations in code deployments or changes.
- Consider checking for and disabling flat catalogs. For steps, refer to Slow performance, slow and long running crons.
- If you suspect that you are experiencing a DDoS attack, try blocking bot traffic. For steps, refer to How to block malicious traffic for Magento Commerce Cloud on Fastly level.
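To make the Apdex sorting step above more concrete, the following minimal Python sketch computes an Apdex score per transaction and lists the transactions from lowest to highest score. It uses the standard Apdex formula (satisfied requests plus half of the tolerating requests, divided by all requests). The threshold T of 0.5 seconds, the transaction names, and the response times are illustrative assumptions, not values taken from your New Relic account.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Transaction:
    name: str                    # hypothetical transaction name
    response_times: List[float]  # seconds, one entry per request


def apdex(response_times: List[float], t: float = 0.5) -> Optional[float]:
    """Standard Apdex: (satisfied + tolerating / 2) / total requests.

    satisfied:  response time <= T
    tolerating: T < response time <= 4T
    frustrated: response time > 4T (counts as zero)
    T = 0.5 s is an assumed threshold; adjust it to match your settings.
    """
    if not response_times:
        return None  # no samples, no score
    satisfied = sum(1 for rt in response_times if rt <= t)
    tolerating = sum(1 for rt in response_times if t < rt <= 4 * t)
    return (satisfied + tolerating / 2) / len(response_times)


def worst_first(transactions: List[Transaction], t: float = 0.5) -> List[Tuple[str, float]]:
    """Return (name, score) pairs sorted by ascending Apdex score."""
    scored = [(tx.name, apdex(tx.response_times, t)) for tx in transactions]
    scored = [(name, score) for name, score in scored if score is not None]
    return sorted(scored, key=lambda pair: pair[1])


# Made-up sample data for illustration only.
sample = [
    Transaction("catalog/category/view", [0.2, 0.3, 0.4, 0.6]),
    Transaction("checkout/cart/add", [0.4, 1.5, 2.5, 3.0]),
    Transaction("cms/index/index", [0.1, 0.2]),
]

for name, score in worst_first(sample):
    print(f"{score:.3f}  {name}")
```

The transactions at the top of the output have the lowest scores and are the first candidates to investigate for database, Redis, or PHP bottlenecks.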