Tag Archives: visualization

The addition of analytical functions to databases

14 Nov

The trend has been for database vendors to integrate analytical functions into their products, thereby moving the analytics closer to the data (versus moving the data to the analytics). There are interesting comments on this in the article below, from Curt Monash’s excellent blog.
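
To make the distinction concrete, here is a minimal sketch using Python’s built-in sqlite3 as a stand-in for an analytic database. The table and column names are hypothetical, and a real MPP platform like Teradata pushes far richer functions than SUM down to the data; the point is only where the computation happens.

```python
import sqlite3

# Stand-in for an analytic database; table and columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("east", 100.0), ("east", 250.0), ("west", 175.0)])

# Moving the data to the analytics: pull every row out, then aggregate
# client-side -- the whole table crosses the wire.
rows = conn.execute("SELECT region, amount FROM sales").fetchall()
totals = {}
for region, amount in rows:
    totals[region] = totals.get(region, 0.0) + amount

# Moving the analytics to the data: push the aggregation into the
# database and fetch only the (small) result set.
totals_in_db = dict(
    conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region"))

assert totals == totals_in_db
```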

What was interesting to me was not the central premise of the story, that Curt does not “think [Teradata’s] library of pre-built analytic packages has been a big success”, but rather the BI vendors that are reportedly planning to integrate with those libraries: Tableau, TIBCO Spotfire, and Alteryx. This is interesting because these are the rapid risers in the space, companies that have risen to prominence on the basis of data visualization and ease of use, not on the basis of their statistical analytics or big data prowess.

Tableau and Spotfire specifically focused on ease of use and visualization as an extension of Excel spreadsheets. More recently they have started to market themselves as being able to deal with “big data” (i.e. being Hadoop buzzword-compliant). With integration into a Teradata stack, and presumably front-end functionality wired into some of those back-end capabilities, one might expect to see some interesting features. TIBCO actually acquired an analytics company, Insightful. Are they finally going to integrate the lot on top of a database? I have said it before, and I will say it again: TIBCO has the ESB (Enterprise Service Bus), the visualization tool in Spotfire, and the analytical product in Insightful; hooking these all together on a Teradata stack would make a lot of sense, especially since Teradata and TIBCO are both well established in the financial sector. (To be fair to TIBCO, they seem to be moving in this direction, but it has been some time since I used the product.)

Alteryx is interesting to me in that they have gone after SAS in a big way. I read their white paper and downloaded the free product. They keep harping on the fact that they are simpler to use than SAS, and the white paper is fierce in its criticism of SAS. I gave their tool a quick run-through and came away with two thoughts: 1) the interface, while it does not require coding/scripting as SAS does, cannot really be called simple; and 2) they are not trying to do the same things as SAS. SAS occupies a different space in the BI world than these tools have traditionally occupied. However…

Do these tools begin to move into the SAS space by integrating with foundational data capabilities? The reason SAS is less easy to use than the products of these rapidly growing players is that those players have not yet tackled the really tough analytics problems in the big data space. The moment they start to tackle big data mining problems requiring complex and recursive analytics, will they start to look more like SAS? If you think I am picking on SAS, swap out SAS for the IBM stack of Cognos, SPSS, Netezza, Streams, and BigInsights, and see how easy that is! Not to mention the price tag that comes with it.

What is certain is that these “new” players in the statistical and BI spaces will do whatever they can to make advanced capabilities available to a broader audience than has traditionally been the case with SAS or SPSS (IBM). This will have the effect of making analytically enhanced insights more broadly available within organizations, and that has to be a good thing.

Article Link and copy below

October 10, 2013

Libraries in Teradata Aster

I recently wrote (emphasis added):

My clients at Teradata Aster probably see things differently, but I don’t think their library of pre-built analytic packages has been a big success. The same goes for other analytic platform vendors who have done similar (generally lesser) things. I believe that this is because such limited libraries don’t do enough of what users want.

The bolded part has been, shall we say, confirmed. As Randy Lea tells it, Teradata Aster sales qualification includes the determination that at least one SQL-MR operator be relevant to the use case. (“Operator” seems to be the word now, rather than “function”.) Randy agreed that some users prefer hand-coding, but believes a large majority would like to push work to data analysts/business analysts who might have strong SQL skills, but be less adept at general mathematical programming.

This phrasing will all be less accurate after the release of Aster 6, which extends Aster’s capabilities beyond the trinity of SQL, the SQL-MR library, and Aster-supported hand-coding.

Randy also said:

  • A typical Teradata Aster production customer uses 8-12 of the prebuilt functions (but now they seem to be called operators).
  • nPath is used in almost every Aster account. (And by now nPath has morphed into a family of about 5 different things.)
  • The Aster collaborative filtering operator is used in almost every account.
  • Ditto a/the text operator.
  • Several business intelligence vendors are partnering for direct access to selected Teradata Aster operators — mentioned were Tableau, TIBCO Spotfire, and Alteryx.
  • I don’t know whether this is on the strength of a specific operator or not, but Aster is used to help with predictive parts failure applications in multiple industries.

And Randy seemed to agree when I put words in his mouth to the effect that the prebuilt operators save users months of development time.

Meanwhile, Teradata Aster has started a whole new library for relationship analytics.


Data Visualisation » Martin’s Insights

23 Apr

This is a good article on data visualization. The author notes in his considerations section that “real data can be very difficult to work with at times and so it must never be mistaken that data visualisation is easy to do purely because it is more graphical.” This is a good point. In fact, in some respects, determining the right visualization can be harder than simply working with the data directly; the raw data, however, is much harder to use to communicate key insights to a diverse audience.

What rarely gets enough attention is that in order to create interesting visualizations, the underlying data needs to be structured and enhanced to feed the visualizations appropriately. The recent Boston bombing, where one of the bombers slipped through the system due to a name misspelling, recalled a project from years ago where we enhanced the underlying data to identify “similarities” between entities (people, cars, addresses, etc.). For each of the entities, the notion of similarity was defined differently: for addresses it was geographic distance; for names it was semantic distance; for cars it was matching on a number of different variables; and for text narratives in documents we used the same approach that plagiarism tools use. In that project, a name misspelling, and the ability to tune the software to resolve names based on our willingness to accept false positives, allowed us to identify linkages that revealed networks. Once a link was established, we went back and validated it.

In that example, the amount of metadata generated to create a relatively simple link chart was significant; it was the bulk of the work. In terms of data generated, it is not unusual for the created data to dwarf the original data set, especially if text exploitation and other unstructured data mining approaches are used.
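
The methods from that project are not public, so purely as an illustration of the pattern, here is a minimal sketch of pairwise similarity scoring with hypothetical record fields and thresholds. It uses difflib from Python’s standard library as a stand-in for name (edit) distance, and the haversine formula for geographic distance between addresses.

```python
import math
from difflib import SequenceMatcher

# Hypothetical thresholds -- in practice these are tuned against the
# false-positive rate you are willing to accept.
NAME_THRESHOLD = 0.85
ADDRESS_THRESHOLD_KM = 0.5

def name_similarity(a: str, b: str) -> float:
    """Edit-style similarity in [0, 1]; catches simple misspellings."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def geo_distance_km(lat1, lon1, lat2, lon2) -> float:
    """Great-circle (haversine) distance between two coordinates."""
    r = 6371.0  # Earth radius, km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = (math.sin(dp / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def same_person(rec_a: dict, rec_b: dict) -> bool:
    """Link two person records if the names are close enough; the
    threshold trades recall against false positives."""
    return name_similarity(rec_a["name"], rec_b["name"]) >= NAME_THRESHOLD

def nearby_addresses(addr_a: dict, addr_b: dict) -> bool:
    """Link two address records if they are geographically close."""
    return geo_distance_km(addr_a["lat"], addr_a["lon"],
                           addr_b["lat"], addr_b["lon"]) <= ADDRESS_THRESHOLD_KM

# A misspelled name still links the two records.
print(same_person({"name": "Jon Smith"}, {"name": "John Smith"}))   # True
print(nearby_addresses({"lat": 42.3601, "lon": -71.0589},
                       {"lat": 42.3611, "lon": -71.0579}))          # True
```

Every such link becomes metadata layered on top of the source records, which is exactly why the derived data can end up dwarfing the original set.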

So… next time the sales guy shows you the nifty data visualization tool, ask about the data set used and how much massaging it needed.

http://www.martinsights.com/?p=492&goback=%2Egde_4298680_member_232053156

TIBCO – Buys another company

1 Apr

TIBCO buys another company in the analytics space. I have always thought that with Spotfire, the Enterprise Service Bus business, and the acquisition of Insightful some years ago, TIBCO had the makings of a company that was putting together the Big Data analytical stack. With this purchase, they have added a geo capability. Will they ever get all these pieces integrated to create a solutions package, like SAS’s Fraud Framework? I am not sure why they have not done that to date. It may just be that complete solutions are too hard to sell, and it is easier to get in the door with a point solution. Anyway, I like Spotfire, and anything they do to build out the back end is good stuff. The price point still seems a little high for a point solution, but they seem to be making it work for them, so who am I to argue? It will be interesting to see how this plays out.

See also here, as they are placed in the MDM Magic Quadrant as well.

Link

Microsoft Powerpivot

17 Feb

I have always thought that Tableau was initially just a cool way to create Excel pivot tables and populate the results in a graphic – something you can do in Excel, but was  a lot easier in Tableau. Is Powerpivot the  MS answer to these tools that have leveraged excel, but have not used the excel visualization capabilities or been willing/able to write the VB code to get Excel to do what you want it to do?

I do not have Office 2013, but look forward to playing with this when I do.

The next thing – how will visualizations interact with Big Data?

29 Jan

If big data is to be really useful for the masses, it needs to be distilled down to something that can be intuitively understood on mobile devices.

A very positive outlook on the BI space: the cloud will bring it to the masses; visualization will make it understandable.
