Finding end-user models

Harder than hunting for needles in the haystack


The SAS Institute platform offers your analyst community a broad set of data aggregation, data cleansing and analytical tools. Because of this breadth of capability, you may have analysts or teams creating what regulatory groups would consider models subject to model governance and compliance review. Furthermore, many SAS workloads qualify under the CECL, CCAR and IFRS 9 regulatory guidelines as End User Computing workloads, which need to be inventoried, reviewed and placed under a governance scope.

Do you know who these teams and analysts are? Can you prove that your model governance process has identified all these models? Chances are, many of these models and End User Computing instances have gone unidentified, in many cases because they’re hiding in plain sight. SAS workloads can be created and executed, and their results generated, through multiple means; there isn’t just one way to execute a SAS workload.

Our clients have developed a variety of approaches to identify these SAS analytics workloads themselves, with mixed results.

  • Some send surveys to every analyst, asking SAS users to self-identify and describe their workloads. This relies on the analysts’ willingness to participate, which can bias the results. We’ve seen users who actively avoid spending time on these surveys, or who put in the minimum effort, leaving gaps in the evidence.
  • Some clients run an inventory of SAS users registered in the central SAS administrative tier. However, not every user needs to be registered to execute a SAS workload, many registered SAS users are actually inactive, and only a few are both active and registered. This approach also doesn’t yield any information about the type of workloads they execute, who else in the organization uses the outputs of these models or their data assets.
  • Some clients scan their file systems for SAS files, but many users store their files on local hard drives, have archived them, or keep them in SAS project files (which are binary and encoded). Furthermore, some of these files haven’t been executed in years, and a file scan won’t tell you that, or whether a given file is analytically relevant.
  • A few clients scan their SAS server logs, but these logs only include actively run workloads, not workloads that ran six months ago or that run on personal workstations.
  • Finally, whichever approach is used, most SAS workloads are not actually analytical in nature, so a manual review must sift through hundreds or thousands of workloads to turn up the few that are actually of interest.
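
To make the file-scan approach and its blind spot concrete, here is a minimal Python sketch. It is an illustration only, not how Corios Rosetta works; the function name `scan_for_sas_files` and the default `.sas` extension filter are assumptions made for this example. It walks a directory tree and reports each matching file with its last-modified time.

```python
from datetime import datetime, timezone
from pathlib import Path


def scan_for_sas_files(root, extensions=(".sas",)):
    """Return (path, last_modified) pairs for matching files under root.

    Hypothetical helper for illustration. The timestamp is the file's
    modification time, which is NOT the time the program last executed.
    """
    hits = []
    for path in Path(root).rglob("*"):
        # Compare suffixes case-insensitively so PROGRAM.SAS is also found.
        if path.is_file() and path.suffix.lower() in extensions:
            modified = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
            hits.append((path, modified))
    # Oldest first, so long-untouched programs surface at the top.
    return sorted(hits, key=lambda hit: hit[1])
```

Even a complete scan like this only sees the file systems it is pointed at, and the modification time says nothing about when, or whether, a program was last run or whether it contains analytics.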

Corios has developed a software and consulting services approach, called Corios Rosetta, to systematically identify all these workloads and categorize them into analytics, data transformation and other relevant buckets, down to the individual line of code.

We see clients needing three key capabilities: identifying analytic workloads by content type, identifying users who build analytic workloads, and reviewing the code of the analytics workloads in detail for selected users. Here are some examples of each capability.

In the first image below, the Keyword Report provides a listing of the categories of SAS workloads. We’ve highlighted the Analytics User category on the left, with a focus on statistical models, and displayed on the right the listing of all jobs that use linear regression, plus sample lines of code from the workloads that include that technique.
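
As a rough illustration of the idea behind a keyword report, the Python sketch below tags lines of SAS code by the procedure they invoke. It is a simplified assumption for this post, not the Corios Rosetta implementation: the category map, the function name `categorize_lines` and the sample dataset and variable names (`loans`, `loss`, `ltv`, `fico`) are hypothetical. The procedure names themselves (`PROC REG`, `PROC GLM`, `PROC LOGISTIC`, `PROC SQL`, `PROC SORT`, `PROC TRANSPOSE`) are standard SAS procedures.

```python
import re

# Hypothetical keyword-to-category map, for illustration only.
CATEGORY_BY_PROC = {
    "reg": "analytics",        # linear regression
    "glm": "analytics",
    "logistic": "analytics",
    "sql": "data transformation",
    "sort": "data transformation",
    "transpose": "data transformation",
}

# Matches a PROC statement at the start of a line, ignoring case.
# A real parser would also handle comments, macros and several
# statements on one line; this sketch does not.
PROC_PATTERN = re.compile(r"^\s*proc\s+(\w+)", re.IGNORECASE)


def categorize_lines(sas_source):
    """Return (line_number, category, proc_name, line_text) per PROC statement."""
    findings = []
    for number, line in enumerate(sas_source.splitlines(), start=1):
        match = PROC_PATTERN.match(line)
        if match:
            proc = match.group(1).lower()
            category = CATEGORY_BY_PROC.get(proc, "other")
            findings.append((number, category, proc, line.strip()))
    return findings


if __name__ == "__main__":
    # Made-up sample program, not taken from any client workload.
    sample = (
        "proc sort data=loans; by id; run;\n"
        "proc reg data=loans;\n"
        "  model loss = ltv fico;\n"
        "run;\n"
    )
    for finding in categorize_lines(sample):
        print(finding)
```

Filtering the findings to the "analytics" category and the `reg` procedure gives the kind of listing described above: which jobs use linear regression, and on which lines.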

In the next capability, the top table in the High Value Workloads report displays the various scores we apply to all users and their workloads, highlighting Score 17 for “High Value Analytics”, and identifying Max Devault (note: this is a pseudonym) as a high value analytics user in the Risk and Asset Analytics team. His 15 workloads are itemized in the table at the bottom of the screen.

Following the breadcrumbs on Max Devault’s workloads, the Workload Syntax report shows an example of the specific analytic code that Max wrote for one of his workloads from the prior report. If you look closely, you can see the guts of an equity pricing model taking shape.

If you find these examples interesting, click here to read more about Corios Rosetta and how we can help you identify and catalog all your analytics workloads for model governance and End User Computing compliance.

Robin Way

The Founder and President of Corios, Robin’s professional passion lies in democratizing and demystifying the science of applied analytics. An established thought leader fueled with 30 years’ experience in the design, development, execution and improvement of applied analytics models, Robin welcomes every opportunity to move the analytics conversation forward.

Connect with him on LinkedIn, or reach out to Corios to get in touch.