Welcome to the era of instant gratification. Users expect that their applications will work quickly and correctly the first time, whether it is Office 365 or a mobile shopping transaction. And they won’t wait for a fix.
Unfortunately, application problems are increasingly difficult to diagnose in today’s cloud and mobile environments. With the prevalence of DevOps, the number of release cycles has increased, which in turn introduces more opportunity for problems. Microservices, containers, and virtual machines (VMs) have increased both agility and application complexity.
Application dependency mapping in these scenarios becomes increasingly difficult because many of the services are short-lived and dynamic. However, application performance monitoring can help cut through the noise to find bottlenecks quickly and improve the overall digital experience.
Where’s the Problem?
With slow mobile and cloud applications, sluggishness is often not the root cause, but rather a symptom of an underlying infrastructural issue hidden from view. Issues that occur within end user devices, the network, the cloud, web servers, application servers, infrastructure, and the app can all cause mobile application outages or slowdowns. Let’s look at each of these potential problems in detail:
- End-user devices (smartphone, tablet, laptop, desktop) – Although the closest to the customer, the device is often the most difficult to diagnose. There are so many variations in devices/carriers that it’s hard to pinpoint the problem without using end user experience monitoring. Typical problems include: device malfunction, OS failure/OS out of date, and geographic/carrier issues.
- Network (yours and/or cloud provider’s) – Backhauling network traffic to a central data center for security and data protection can also impact performance, especially in cloud-based and mobile applications. More data-intensive applications such as video and rich media can also slow the network. Typical problems include: excessive retransmission, network congestion, network latency, packet loss, and jitter.
- Cloud services (IaaS, PaaS, SaaS) – If you use services from AWS, Microsoft Azure, or Google Cloud, your application could suffer performance slowdowns when any of these underlying services are impacted. Compounding the problem, often only part of the stack has migrated to the cloud, so it may be difficult to determine whether the problem lies within the services that you control. It can also be difficult to monitor performance, ensure a consistent user experience, and enforce SLAs. Typical problems include: regional outages/slowdowns and lack of failover strategies.
- Infrastructure (Server/VM) – Server problems contribute to many major outages, and infrastructure configuration problems are common in most of these instances. Typical problems include: configuration error, device malfunction, outdated OS, CPU saturation, overcommitted hypervisors, load balancer issues, and poorly performing database queries or an overloaded database.
- Web server – Nothing causes a problem faster than a web server error. Link or page errors can immediately halt an application in its tracks. Typical problems include: missing link, page not found, and internal server error.
- APIs – Specific APIs will be unique to your application, but some of them to watch include: user authentication/single sign-on, pricing and merchandising, supply chain and logistics, payment gateways and billing systems, and advertising APIs.
- App itself – Microservices and DevOps practices have accelerated release cycles and introduced greater complexity and inter-dependencies. Typical problems include: bad data call, memory leak, microservices failure, issue with downstream web services, authentication error, and code error.
Application Performance and Mobile Devices
Ensuring application performance on mobile devices is more difficult than on laptops or PCs. This is due in part to the wide variation in devices, carriers, and operating systems. IT cannot install a monitoring agent on employee-owned devices due to privacy concerns and, of course, IT cannot install agents on its customers’ smartphones.
Instrumenting the app itself enables IT to monitor performance while steering clear of privacy issues. Key metrics to measure include: crashes, errors, service performance response time, network, battery, and signal strength. It is also important to monitor the end-to-end business workflow, such as the time to process a claim, in order to manage service level agreements and identify problems before users are impacted.
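As a rough illustration of workflow-level instrumentation, the sketch below times one business workflow and records whether it succeeded. The names `track_workflow`, `process_claim`, and `recorded_events` are hypothetical and not taken from any particular monitoring product; a real app would forward the events to its monitoring backend instead of keeping them in a list.

```python
import time
from contextlib import contextmanager

# Hypothetical in-memory sink, used here only to keep the sketch self-contained.
recorded_events = []

@contextmanager
def track_workflow(name):
    """Record the duration and outcome of one run of a business workflow."""
    start = time.perf_counter()
    status = "ok"
    try:
        yield
    except Exception:
        # Mark the run as failed, then let the error propagate unchanged.
        status = "error"
        raise
    finally:
        recorded_events.append({
            "workflow": name,
            "duration_s": time.perf_counter() - start,
            "status": status,
        })

def process_claim(claim_id):
    """Placeholder for the real claim-processing logic (assumed for illustration)."""
    with track_workflow("process_claim"):
        time.sleep(0.01)  # stands in for the actual work
        return f"claim {claim_id} processed"
```

Each call to `process_claim` appends one event with the workflow name, elapsed time, and status, which is the kind of end-to-end measurement that can later be compared against a service level agreement.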
Developers should instrument internal applications before deploying them to an app store. For third-party apps, many application performance monitoring vendors provide a wrapper that instruments mobile apps without tagging the code.
When a slowdown does occur, prudent organizations follow a consistent process to identify the root cause in mobile applications:
1) Isolate problems to the code, network, or infrastructure layers, and use code-level stack traces to speed resolution.
2) Analyze the performance of the app across device and OS versions, geographies, and carriers to identify trends.
3) Track usage, crashes, errors, HTTP performance, and volume relative to thresholds and geography.
4) Compare the performance of mobile apps across geographies, carriers, devices, and OS versions to optimize performance.
5) Trace transactions from the user, over the network, and into the backend.
6) Reconstruct incidents to fix issues across data centers, cloud services, and containers/microservices.
Troubleshoot problems by monitoring internal and public-facing mobile apps, assessing performance by geographic location and drilling down to analyze key metrics.
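The segment comparison in the steps above (by device, OS version, geography, or carrier) could be sketched as follows. This is a minimal example under assumed data: each sample is a dictionary with a hypothetical `response_ms` field plus whatever segment fields the app reports, and the median is used as one reasonable summary among several.

```python
from collections import defaultdict
from statistics import median

def median_by_segment(samples, keys):
    """Group response times by the given segment fields and return each group's median."""
    groups = defaultdict(list)
    for sample in samples:
        segment = tuple(sample[k] for k in keys)
        groups[segment].append(sample["response_ms"])
    return {segment: median(values) for segment, values in groups.items()}

def slowest_segments(samples, keys, top=3):
    """Return the `top` segments with the highest median response time, slowest first."""
    medians = median_by_segment(samples, keys)
    return sorted(medians.items(), key=lambda item: item[1], reverse=True)[:top]

# Hypothetical samples, for illustration only.
samples = [
    {"carrier": "A", "os": "17", "response_ms": 100},
    {"carrier": "A", "os": "17", "response_ms": 120},
    {"carrier": "A", "os": "16", "response_ms": 140},
    {"carrier": "B", "os": "17", "response_ms": 300},
    {"carrier": "B", "os": "16", "response_ms": 500},
]
```

Calling `slowest_segments(samples, ["carrier"])` ranks carriers by median response time, and passing `["carrier", "os"]` drills down one level further.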
Fixing slowdowns means IT is in reactive mode. To improve the digital experience, application performance monitoring should provide proactive insights. By setting performance thresholds at the transaction level, operations and DevOps teams can remedy problems before users are impacted and proactively enforce SLAs with providers.
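A transaction-level threshold check might look like the sketch below. It assumes latency samples are already collected per transaction name and uses a 95th-percentile, nearest-rank comparison; the function names, the percentile choice, and the threshold values are illustrative assumptions, not a prescribed method.

```python
import math

def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def breached_transactions(latencies_ms, thresholds_ms, pct=95):
    """Return {transaction: (observed, limit)} for transactions whose percentile exceeds the limit."""
    breaches = {}
    for name, limit in thresholds_ms.items():
        samples = latencies_ms.get(name, [])
        if not samples:
            continue  # nothing measured yet for this transaction
        observed = percentile(samples, pct)
        if observed > limit:
            breaches[name] = (observed, limit)
    return breaches

# Hypothetical measurements and limits, for illustration only.
latencies_ms = {
    "checkout": [100] * 18 + [900, 950],
    "login": [50] * 20,
}
thresholds_ms = {"checkout": 500, "login": 200}
```

Running `breached_transactions(latencies_ms, thresholds_ms)` flags only `checkout`, which could then trigger an alert before most users notice the slowdown.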
Optimizing the Cloud and Mobile Experience
Cloud and mobile application slowdowns happen even in the best of circumstances. To remediate the situation as quickly as possible, monitor your transactions using end-user experience monitoring coupled with application performance monitoring, network monitoring, and infrastructure monitoring. This way, you can get ahead of the finger-pointing and isolate application problems quickly. Combat slow mobile and cloud apps through better visibility from the end-user/device perspective, with full-stack monitoring extending to the network, infrastructure, and the app itself.
To learn how Best-in-Class companies are using a network data approach to meet performance challenges in real time, and the steps your organization can take to build a real-time platform for addressing application performance problems, check out this in-depth research report by Aberdeen’s Jim Rapoza.
Gayle Levin is director of solutions marketing at Riverbed Technologies.