Tag Archives: BRE BRMS Adaptive

Forensic Analytics and the search for “robust” solutions

12 Jan

Happy New Year!

This entry has been sitting in my “to publish” file for some time. There is much more to be said on the topic. however, in the interest of getting it out … enjoy!

=======================================================

This entry was prompted by the article in the INFORMS ANALYTICS Magazine article titled Forensic Analytics: Adapting to a Growing Pandemic by Priti Ravi who is a senior manager with Mu Sigma and specializes “in providing analytics-driven advisory services to some of the largest retail, pharmaceutical and technology clients spread across the United States.”

Ms. Ravi writes a good article that left me hanging. Her conclusion was that the industry lacks access to sophisticated and intelligent monitoring equipment, and there exists a need for a “robust fraud management systems” that “offer a collective set of techniques” to implement a “complex adaptive approach.” I could not agree more. However, where are these systems? Perhaps even what are these systems?

Adaptive Approaches

To the last question first. What is a Complex Adaptive Approach? If you Google the phrase, the initial entries involve biology and ecosystems. However, wikipedia’s definition encompasses medicine, business and economics (amongst others) as areas of applicability. From an analytics perspective, I define complex adaptive challenges as those that  are impacted by the execution of the analytics – by doing the analysis, the observed behaviors change. This is inherently true of fraud as the moment perpetrators  understand (or believe) they can be detected, behavior will change. However, it also applies to a host of other type of challenges: criminal activity, regulatory compliance enforcement, national security; as well as things like consumer marketing and financial investment.

In an article titled Images & Video: Really Big Data the authors (Fritz Venter the director of technology at AYATA; and Andrew Stein the chief adviser at the Pervasive Strategy Group. define an approach they call “prescriptive analytics” that is ideally suited to adaptive challenges. They define prescriptive analytics as follows:

“Prescriptive analytics leverages the emergence of big data and computational and scientific advances in the fields of statistics, mathematics, operations research, business rules and machine learning. Prescriptive analytics is essentially this chain of transformations whereby structured and unstructured big data is processed through intermediate representations to create a set of prescriptions (suggested future actions). These actions are essentially changes (over a future time frame) to variables that influence metrics of interest to an enterprise, government or another institution.”

My less wordy definition:  adaptive approaches deliver a broad set of analytical capabilities that enables a diverse set of integrated techniques to be applied recursively.

What Does the Robust Solution Look Like?

Defining adaptive analytics this way, one can identify characteristics of the ideal “robust” solution as follows:

  • A solution that builds out a framework that supports the broad array of techniques required.
  • A solution that is able to deal with the the challenges of recursive processing. This is very data and systems intensive. Essentially for every observation evaluated, the system must determine whether or not the observation changes any PRIOR observation or assertion.
  • A solution that engages users and subject matter experts to effectively integrate business rules. In an environment where traditional predictive analytic models have a short shelf life (See Note 1), engaging with the user community is often the mechanism to quickly capture environmental changes. For example, in the banking world, tracking call center activity will often identify changes in fraud behavior faster than a neural network set of models. Engaging the User in the analytical process will require user interfaces, and data visualization approaches that are targeted at the user population, and integrate with the organization’s work processes. Visualization will engage non technical users to help them apply their experience and intuition to the data to expose insights. The census bureau has an interesting page, and if you look at Google Images, you can get an idea of visualization approaches.
  • A solution that provides native support for statistical and mathematical functions supporting activities associated with data mining : clustering, correlation, pattern discovery, outlier detection, etc.
  • A solution that structures unstructured data: categorize, cluster, summarize, tag/extract. Of particular importance here is the ability to structure text or other unstructured data into taxonomies or ontologies related to the domain in question.
  • A solution that persists data with the rich set of metadata required to support complex analytics. While it is clearer why unstructured data must be organized into a taxonomy / ontology, this also applies to structured data. Organizing data consistently across the variety of sources allows non obvious relationships to be exposed, and application of more complex analytical approaches.
  • A solution that is relatively data agnostic  – data will come from many places and exist in many forms. The solution must manage the diversity and provide a flexible way to integrate new data into the analytical framework.

What are Candidate Tools ?

And now to the second question: where are these tools? It is hard to find tools that claim to be “adaptive analytic” tools; or “prescriptive analytics” tools or systems in the sense that I have described them above. I find it interesting that over the last five years, major vendors have subsumed complex analytical capabilities into a more easily understandable components. Specifically, you used to be able to find Microsoft  Analytical Services easily on their site. Now it is part of MS SQL Server as SSAS; much the same way that the reporting service is now part of the database offer as SSRS (reporting services). There was a time a few years ago when you had to look really hard on the MS site to find Analytical Services. Of course since then Microsoft has integrated various BI acquisitions into the offer and squared away their marketing communication. Now their positioning is squarely around  BI and the database. Both of these concepts are easier to sell at the executive level, than the notion of prescriptive or adaptive analytics.

The emergence of databases and appliances optimized around analytics has simplified the message on the data side. everyone knows they need a database, and now they have one for analytics. At the decision maker level, that is a much easier decision than trying to figure out what kind of analytical approach the organization is going to adopt. People like Teradata have always supported analytics through the integration of SAS and now R as in-database functionality. However, Greenplum, Neteeza and others have incorporated SAS and the open source analytical “R” . In addition, we have seen the emergence (not new but much more talked about it seems) of the columnar database. The one I hear about most is the Sybase IQ product; although there have been a number of posts on the topic on here, here, and here.

My point here is that vendors have too hard a time selling complex analytical solutions, and have subsumed the complex capabilities into the concepts that are easier to package, position and communicate around; namely; database products and Business Intelligence products. The following are product sets that are candidates for the integrated approach. We start with the big players first and work towards that are less obviously candidates.

SAS

The SAS Fraud Framework provides an integration of all the SAS components that required to implement a comprehensive analytics solution around adaptive challenges (all kinds of fraud, compliance, money laundering, etc. as examples). This is a comprehensive suite of capabilities that spans all activities: data capture, ingest, and quality; analytics tools (including algorithm libraries), data visualization and reporting / BI capabilities. Keep in mind that SAS is a company that sells the building blocks, and the Fraud Framework is just that, a framework within which customers can build out capabilities. This is not a simple plug and play implementation process. It takes time and investment and the right team within the organization. The training has improved, and it is now possible to get comprehensive training.

As with any implementation of SAS, this one comes with all the caveats associated with comprehensive enterprise systems that integrate  analytics into the fabric of an organization. The Gartner 2013 BI report indicates that SAS “very difficult to implement”. This theme echoes across the product set.  Having said that   when it comes to integrated analytic of the kind we have been discussing all, of the major vendors suffer from the same implementation challenges – although perhaps for different reasons.

Bottom line however, is that SAS is a company grounded in analytics – the Fraud Framework has everything needed to build out a first class system. However, the corporate culture builds products for hard core quants, and this is reflected in the Gartner comments.

IBM

IBM is another company that has the complete offer. They have invested heavily in the analytics space, and between their ETL tools; the database/ appliance and Big Data capabilities; the statistical product set that builds off SPSS; and, the Cognos BI suite users can build out the capabilities required. Although these products are being integrated into a seamless set of capabilities, they remain somewhat separate and this probably explains some of the implementation challenges reports. Also, the product side of the IBM operation does not necessarily speak with the Global Services side of the house.

I had thought when IBM purchased Systems Research & Development (SRD) in 2005 that they were going to build out capabilities that SRD and Jeff Jonas had developed. Jeff heads up the Entity Analytics group within IBM Research, and his blog is well worth the read. However, the above product set appears to have remained separated from the approaches and intellectual knowledge that came with SRD. This may be on purpose – from a marketing perspective, buy the product set, and then buy IBM services to operationalize the system is not a bad approach.

Regardless, as the saying goes, no one ever got fired for buying IBM” probably still holds true. However, like SAS beware of the implementation! Any one of the above products (SPSS, Cognos, and Infosphere) require attention when implementing. However, when integrating as an operational whole, project leadership needs to ensure that expectations as to the complexity and time frame are communicated.

Other Products

There are many other product sets and I look forward to learning more about them. Once I post this, someone is going to come back and mention “R” and other open source products. There are plenty out there. However, be aware that while the products may be robust, many are not delivered as an integrated package.

With respect to open source tools, it is worth noting that the capabilities inherent in Hadoop – and the related products, lend themselves to adaptive analytics in the sense that operators can consistently re-link and re-index on the fly without having to deal with where and how the data is persisted. This is key in areas like signals intelligence, unstructured data analysis, and even structured data analysis where the notion of semantic equivalence is shifting. This is a juicy topic all by itself and worthy of a whole blog entry.

Notes:

  1. Predictive analytics relies on past observations to predict future observations. In an adaptive environment, the inputs to those predictive models continually change as a result of the outputs using the past observations.
Advertisement

TIBCO – Buys another company

1 Apr

TIBCO buys another company in the analytics space. I have always thought that with Spotfire, the Enterprise Service Bus business, and the acquisition of Insightful some years ago, TIBCO had the makings of a company that was putting together the Big Data analytical stack. With this purchase, the have added a geo capability. Will they ever get all these pieces integrated to create a solutions package – like SAS’s Fraud Framework? Not sure why they have not done that to date. It may just be that it is too hard to sell complete solutions, and it is easier to get in the door with a point solution? Anyway – I like Spotfire, and anything they do to build out the back end is good stuff. Price point still seems a little high for a point solution, but they seem to be making it work for them, so who am I to argue… interesting to see how this plays out.

See also here – as they post in the MDM Magic Quadrant as well.

Business Rules Engines & “Prescriptive Analytics”

24 Dec

Someone asked me the other day about how to best evaluate business rule engines (BRE) or business rules management systems (BRMS). The following were the quick notes. BRE are part of a solution. I like what this month’s INFORM magazine said when talking about “Prescriptive Analytics” (page 14); which I find falls into the same category of “Adaptive Analytics”. They define prescriptive analytics as follows:

“Prescriptive analytics leverages the emergence of big data and computational and scientific advances in the fields of statistics, mathematics, operations research, business rules and machine learning. Prescriptive analytics is essentially this chain of transformations whereby structured and unstructured big data is processed through intermediate representations to create a set of prescriptions (suggested future actions). These actions are essentially changes (over a future time frame) to variables that influence metrics of interest to an enterprise, government or another institution.”

With that in mind to frame a discussion of BRMS…

Why was the BRE created? Different companies have approached things differently, which has resulted in differing feature sets. TIBCO has a rules engine that was built around their Enterprise Service Bus (ESB) business; SAS has a rules engine feature that is organized around processing rules within the construct of the various analytical packages (money laundering; fraud); Streambase has a rules engine optimized around applying “real time” rules in streaming data within the financial trading data.

Many BRE / BRMS started as workflow or content management tools, and conflate the idea of a rules engine with workflow related management. This is important to understand as certain applications of BRE need to be extracted away from the workflow management aspect of things

The above tends to frame why BRE products are built the way they are. However, there are a core set of capabilities that exist within all rules engines:

Rule Management. Does the BRMS provide the rules authoring environment that is appropriate for the particular installation? One is looking for the ability of the rules authoring approach to support a level of abstraction around how business entities are identified and how rules are applied to those entities (Business Entity Model).

How does the BRMS support the ability to evaluate current rules and re-use rules, or portions of rules? This is related to how the BRMS supports the evolution of rules in order to optimize them.

Processing functions: When creating rules, what functions exist within the rules authoring component of the engine that can be called on by the user? This discussion falls into two categories:

  1. Functions that are universal – think of these as functions that are similar to those available excel: sum; count; average; absolute value, etc.
  2. Functions that are external to the rules engine but whose invocation is handled by the tool.

The latter is perhaps most important from an analytical perspective in that few rules engines appear to have capabilities to apply a complex rule based on a native set of functions. Additionally, the ability to call external functions, or integrate non-BRE logic, impacts a BRE’s ability to deal with file based data and the associated processing that is part of the Hadoop / MapReduce approach to the Big Data challenge.

How rules are triggered. How does the tool support rule management from a processing perspective? In complex event processing there needs to be a way to manage when rules are executed. This becomes important when a collection of rules must be executed only on receipt of all the data required. This particular perspective is framed by how the solution has been architected. In general, there is a desire to reduce the input / output activity with the data source(s) if that activity does not produce an actionable result.

Sourcing the input. What functions are native to the BRE that provides access to data? Are there connectors to all the major databases; unstructured sources; web sites (open source); file systems? This is perhaps less of an issue than it used to be as the notion of data virtualization has evolved and tools are available to easily format data for delivery to the BRE/

Interfaces. What interface capabilities exist to all of the BRE components? Does the BRE have well published API’s that expose the functions required to effectively integrate the tool into the overall workflow. The ability to reach into the system to expose information of interest is critical. This is most important from an operational perspective. In the management of complex event processing rules, the ability to look into the system and create a simplified view of what can be an overwhelming level of data is critical.

See also:

  • December Informs magazine article on analysis of imagery and “Presecriptive Analytics” This frames the discussion on the combination of capabilities that needs to exist in a decision support capability for complex and adaptive problems; a rules based approach is one f the capabilities.
  • Semantic Reasoning
  • Complex Event Processing

%d bloggers like this: