The real-time monitoring aspects of ControlUp and the incredible power of its Script and Automation Actions are so impressive that they can sometimes overshadow the value of ControlUp Insights.
ControlUp Insights uses machine learning to provide recommended sizing of your virtual machines based on their usage. Are they over spec’d, under spec’d or right-sized? ControlUp Insights can tell you. ControlUp also uses machine learning in its forecasts: host forecasting looks at whether the usage of your hosts is trending up or down, and to what level. You can use this information to plan hardware acquisitions, which lets you keep performance high by avoiding congestion.
ControlUp Insights is a fantastic tool, and I want to show you some of the metrics I use to help me do my job better.
Published App Usage Details
Published App Usage Details has become my most-used metric. I love that I can get usage details for all my apps published across four different Citrix Virtual Apps and Desktops sites quickly and easily. One of the obvious benefits is knowing how much demand there is for each app I have published, but I also find this information really useful for gatekeeping the User Acceptance Testing (UAT) process.
When I’m on a conference call and an app owner gives the green light for their app to move to production, saying it has been thoroughly tested, and I can see it has been launched once for a grand total of 3 minutes of use, I know to ask more probing questions and to push back. When an app owner states that 2,000 users will be using an app and Insights tells me it has had 300 unique users in the last month, I know the numbers don’t lie (the app owners do), and it’s time to scale down my VDAs since the demand isn’t what they claimed it would be. ☺
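The two checks above boil down to simple arithmetic, so here is a minimal Python sketch of them. Everything in it is an assumption for illustration: the `AppUsage` fields, the function names, and the thresholds are hypothetical and do not reflect any Insights export format or API. You would fill in the numbers from whatever the report shows you.

```python
from dataclasses import dataclass


@dataclass
class AppUsage:
    """Usage figures for one published app (hypothetical shape, typed in by hand)."""
    name: str
    launches: int
    total_minutes: float
    unique_users: int


def uat_looks_thin(usage, min_launches=10, min_minutes=60.0):
    """Return True when the recorded usage is too small to back a 'thoroughly tested' claim.

    The thresholds are arbitrary placeholders; pick values that suit your own UAT policy.
    """
    return usage.launches < min_launches or usage.total_minutes < min_minutes


def demand_ratio(usage, claimed_users):
    """Return observed unique users as a fraction of the number the app owner claimed."""
    if claimed_users <= 0:
        raise ValueError("claimed_users must be positive")
    return usage.unique_users / claimed_users


# The two situations described above, with made-up app names.
uat_app = AppUsage("Example UAT App", launches=1, total_minutes=3, unique_users=1)
prod_app = AppUsage("Example Prod App", launches=4200, total_minutes=90000, unique_users=300)

print(uat_looks_thin(uat_app))                   # one launch, 3 minutes: push back
print(f"{demand_ratio(prod_app, 2000):.0%} of the claimed user count showed up")
```

A ratio well below 100% after a month is the cue to revisit how many VDAs the app really needs.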
Branch Mapping and Protocol Latency
Since the time the COVID-19 pandemic reared its ugly head, my workplace (like most others) has sent a large swath of our workforce to work from home. In spite of it all, we also opened a new campus during these times. I found Branch Mapping within ControlUp—along with the Protocol Latency metrics—to be incredibly powerful. It lets us see what latency is like at individual sites, but even more importantly, we can see latency across all of our sites plotted out on a single chart that shows which sites deviate from the norm of all other sites. Tagging subnets and/or IP ranges with their locations—including for those working remotely—can be monotonous in large sites, particularly if you have to go through your AD Sites and Services manually to retrieve the information, BUT the juice is definitely worth the squeeze.
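To make the subnet tagging idea concrete, here is a small Python sketch of the underlying lookup: given a client IP, find the most specific tagged subnet that contains it. This only illustrates the concept using the standard `ipaddress` module; it is not how ControlUp implements Branch Mapping, and the subnets and location names are invented.

```python
import ipaddress

# Hypothetical subnet-to-location tags, similar to what you might collect
# from AD Sites and Services. None of these ranges come from a real site.
BRANCH_SUBNETS = {
    "10.10.0.0/16": "Main Campus",
    "10.20.0.0/16": "New Campus",
    "10.20.5.0/24": "New Campus - Lab",
    "172.16.100.0/22": "VPN / Work From Home",
}

# Sort the most specific (longest prefix) networks first so that a /24 tag
# wins over the /16 that contains it.
_NETWORKS = sorted(
    ((ipaddress.ip_network(cidr), name) for cidr, name in BRANCH_SUBNETS.items()),
    key=lambda pair: pair[0].prefixlen,
    reverse=True,
)


def branch_for(client_ip):
    """Return the location tag for a client IP, or 'Unmapped' if no subnet matches."""
    addr = ipaddress.ip_address(client_ip)
    for network, name in _NETWORKS:
        if addr in network:
            return name
    return "Unmapped"


for ip in ("10.20.5.7", "10.20.9.1", "172.16.103.255", "8.8.8.8"):
    print(ip, "->", branch_for(ip))
```

The "Unmapped" bucket is worth watching: a large number of sessions landing there usually means a subnet has not been tagged yet.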
While you can see session latency, slow logons, and more in real time via the ControlUp Console (thanks to its excellent real-time monitoring), ControlUp Insights also shows network trends via protocol latency for a single site, or even for all sites, so you can compare locations, see which ones have subpar performance, and take corrective action. Specific to the work-from-home surge, if you see spikes for remote workers, it may indicate an ISP issue or something amiss with a person’s home network setup.
See Scout Bees for an excellent way to detect ISP-related issues in real time and alert on them, plus a lot more.
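If you ever want to reproduce the "which sites deviate from the norm" comparison on numbers you have copied out of a report, one simple approach is a robust outlier check based on the median and the median absolute deviation (MAD). The Python sketch below is my own illustration under that assumption. It is not the method Insights uses for its chart, and the site names and latency values are made up.

```python
from statistics import median


def deviating_sites(latency_ms, threshold=3.5):
    """Return the sites whose average latency is unusually high compared with the rest.

    latency_ms maps a site name to its average protocol latency in milliseconds.
    A site is flagged when its robust z-score, 0.6745 * (value - median) / MAD,
    exceeds the threshold. Only high-latency outliers are reported.
    """
    if not latency_ms:
        return []
    values = list(latency_ms.values())
    mid = median(values)
    mad = median([abs(v - mid) for v in values])
    if mad == 0:
        # More than half of the sites share the same value, so the score is
        # undefined; report nothing rather than guess.
        return []
    return [
        site
        for site, value in latency_ms.items()
        if 0.6745 * (value - mid) / mad > threshold
    ]


# Invented averages for five locations.
sites = {"HQ": 20, "Branch A": 22, "Branch B": 19, "New Campus": 24, "WFH": 95}
print(deviating_sites(sites))
```

Using the median rather than the mean keeps one badly behaved site from dragging the "norm" toward itself.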
Host Forecasts
We recently had an undertaking that involved moving to new hosts with newer processors. While ControlUp Insights provides some great information for right-sizing your environment, we didn’t lean on it for this purpose, as our hardware, our Citrix VDA distribution, and pretty much everything else sizing-wise follow guidance from the vendor that provides our most business-critical app.
However, what I found insightful throughout the host replacement project was the Hosts Forecast feature. There has been something deeply satisfying about seeing our RAM and CPU usage percentages going down (and they’re forecasted to go down even further)!
These forecasts can be useful not just for large sweeping changes like hardware replacements, but also for changes like firmware upgrades, anti-virus upgrades, and more. If you target these changes to groups of machines on a subset of hosts before deploying widely, you can see what the potential impact will be when you do deploy it more widely. You can also, of course, go from your host-level data into computer stats and into user session details to get a complete picture.
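For a feel of what "trending up or down, and to what level" means numerically, here is a deliberately simple Python sketch that fits a straight line to daily average usage and projects it forward. To be clear, this is an assumption-laden illustration: ControlUp's forecasts are based on machine learning and are certainly more sophisticated than a least-squares line, and the sample numbers are invented.

```python
def linear_trend(samples):
    """Fit y = intercept + slope * day to evenly spaced samples (ordinary least squares)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def project(samples, days_ahead):
    """Extend the fitted line days_ahead past the last sample, clamped to 0-100 percent."""
    slope, intercept = linear_trend(samples)
    value = intercept + slope * (len(samples) - 1 + days_ahead)
    return max(0.0, min(100.0, value))


# Invented daily average CPU usage (%) on a pilot group of hosts after a change.
cpu = [70, 68, 66, 64, 62]
slope, _ = linear_trend(cpu)
print(f"trend: {slope:+.1f} points per day")
print(f"projected usage in 5 days: {project(cpu, 5):.0f}%")
```

Running the same comparison for the pilot hosts and for untouched hosts gives a rough before-and-after picture ahead of a wider rollout.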
Logon Duration
Every monitoring and analytics product worth its salt provides data on user logon durations. The Analyze User Logon Script Action helps ControlUp stand out on this front, as it provides unmatched granularity and detail. But did you know that Insights provides a high-level overview of the average logon duration over a selected period of time, for your organization or for a specific set of machines or sites, AND shows you how your performance compares with others in the community? Most ControlUp customers use the cloud-based version, which lets ControlUp build a community baseline for all of us.
I have a Citrix site where we don’t do any profile management—all data is stored within the main app’s database. Most app settings are configured on the back end and not through Group Policy. As you might imagine, the logon duration is super short. We look like rock stars compared to the community, but that’s not the reality for most apps or desktops.
We have other sites where we do use FSLogix Profile Containers, more Group Policy Objects, and Citrix WEM. We’ve got full virtual desktops with several apps that run on startup. This is closer to reality, and when a site is first built and not yet fine-tuned, we look a lot less impressive. Having that baseline to compare against gives great perspective and a realistic goal to aim for.
On the topic of products like FSLogix and WEM, the data from ControlUp can help you determine how to configure what should be excluded from the Profile Containers, what processes to optimize in WEM and more.
Virtual Expert Findings
The first dashboard you see when you log in to ControlUp Insights is your Top Insights. This shows you a large range of important metrics that are worth reviewing as part of your daily health checks. I typically look at the average logon durations, which users have the longest logon durations, total user and session counts, and machine resource usage. Looking at these every day will familiarize you with what is normal, and what is abnormal, for your environment and provide a quick indicator that something requires your attention.
Personally, I generate an email report using a data dump from ControlUp via the Export Schedule feature. This is a great way to extend and customize a daily report, and to provide information to others who either do not have access to Insights or only care about a few metrics, which makes full access to Insights over the top for their needs.
The Top Insights dashboard is a lot more powerful than my static email report and empowers me to drill deeper into each metric, so I can determine what I need to act on.
If I see that one person consistently has longer logon durations, I can dive into their logon and solve the problem (e.g. someone possibly dragging something huge into their session or a home network issue). If I see that a certain process suddenly appears or becomes the most CPU-intensive when it usually isn’t, I can dive into that (e.g. an updater process or some sort of data grabber/phone home mechanism that could and should be disabled in an enterprise environment).
I already mentioned the Environment Sizing dashboard, which is excellent for planning and fine-tuning, but there is also an Environment Assessment dashboard. Combined, the two can show you what percentile you are in for CPU and RAM usage/sizing versus other organizations, where to focus your fine-tuning, and, best of all, very direct recommendations (e.g., Add 2GB of memory, Remove 2 CPUs).
What I love about ControlUp is that it doesn’t just give you useful information to look at and digest; it presents that information in a way that is easily actionable, and thanks to Automated Actions, we can potentially automate some of our fixes as well. Whether it’s addressing our top Windows Event errors, slow logons, network performance, or even corrective actions for disk activity, ControlUp empowers me to be proactive and efficient, which makes my job easier and ultimately makes our colleagues’ work experience better.
Thanks to Melissa and Trentent from ControlUp: Trentent for providing sanitized screenshots that I can share, and Melissa for doing such a great job cleaning up my grammar!
Photo by Patrick Ward on Unsplash