Archive | Visualization

The addition of analytical functions to databases

14 Nov

The trend has been for database vendors to integrate analytical functions into their products, thereby moving the analytics closer to the data (rather than moving the data to the analytics). There are interesting comments on this in the article below, from Curt Monash’s excellent blog.
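
The difference is easy to see in miniature. Here is a minimal Python sketch (an in-memory SQLite table with made-up names, standing in for a real warehouse) contrasting the two approaches; the in-database analytic libraries discussed below extend the same idea from SUM/GROUP BY to far richer operators.

    import sqlite3

    # In-memory stand-in for a warehouse table; table and column names are illustrative.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)",
                     [("east", 100.0), ("east", 250.0), ("west", 75.0)])

    # Moving the data to the analytics: pull every row out, aggregate client-side.
    rows = conn.execute("SELECT region, amount FROM sales").fetchall()
    totals = {}
    for region, amount in rows:
        totals[region] = totals.get(region, 0.0) + amount

    # Moving the analytics to the data: push the aggregation into the database,
    # so only the small result set crosses the wire.
    pushed = dict(conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region"))

    assert totals == pushed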

What was interesting to me was not the central premise of the story, that Curt does not “think [Teradata’s] library of pre-built analytic packages has been a big success”, but rather the BI vendors that are reportedly planning to integrate with those libraries: Tableau, TIBCO Spotfire, and Alteryx. This is interesting because these are the rapid risers in the space, companies that came to prominence on the basis of data visualization and ease of use, not on the basis of their statistical analytics or big data prowess.

Tableau and Spotfire specifically focused on ease of use and visualization as an extension of Excel spreadsheets. They have more recently started to market themselves as being able to deal with “big data” (i.e., being Hadoop buzzword compliant). With integration into a Teradata stack, and presumably with front-end functionality wired into some of these back-end capabilities, one might expect to see some interesting features. TIBCO actually acquired an analytics company. Are they finally going to integrate the lot on top of a database? I have said it before, and I will say it again: TIBCO has the ESB (Enterprise Service Bus), the visualization tool in Spotfire, and the analytical product (Insightful); hooking these all together on a Teradata stack would make a lot of sense, especially since Teradata and TIBCO are both well established in the financial sector. To be fair to TIBCO, they seem to be moving in this direction, but it has been some time since I used the product.

Alteryx is interesting to me in that they have gone after SAS in a big way. I read their white paper and downloaded the free product. They keep harping on the fact that they are simpler to use than SAS, and the white paper is fierce in its criticism of SAS. I gave their tool a quick run-through and came away with two thoughts: 1) the interface, while it does not require coding/scripting as SAS does, cannot really be called simple; and 2) they are not trying to do the same things as SAS. SAS occupies a different space in the BI world than these tools have traditionally occupied. However…

Do these tools begin to move into the SAS space by integrating with foundational data capabilities? The reason SAS is less easy to use than the products of these rapidly growing players is that those players have not yet tackled the really tough analytics problems in the big data space. The moment they start to tackle big data mining problems requiring complex and recursive analytics, will they start to look more like SAS? If you think I am picking on SAS, swap out SAS for the IBM Cognos, SPSS, Netezza, Streams, and BigInsights stack, and see how easy that is! Not to mention the price tag that comes with it.

What is certain is that these “new” players in the statistical and BI spaces will do whatever they can to make advanced capabilities available to a broader audience than has traditionally been the case with SAS or SPSS (IBM). This will have the effect of making analytically enhanced insights more broadly available within organizations, and that has to be a good thing.

Article link and copy below.

October 10, 2013

Libraries in Teradata Aster

I recently wrote (emphasis added):

My clients at Teradata Aster probably see things differently, but I don’t think their library of pre-built analytic packages has been a big success. The same goes for other analytic platform vendors who have done similar (generally lesser) things. I believe that this is because such limited libraries don’t do enough of what users want.

The bolded part has been, shall we say, confirmed. As Randy Lea tells it, Teradata Aster sales qualification includes the determination that at least one SQL-MR operator be relevant to the use case. (“Operator” seems to be the word now, rather than “function”.) Randy agreed that some users prefer hand-coding, but believes a large majority would like to push work to data analysts/business analysts who might have strong SQL skills, but be less adept at general mathematical programming.

This phrasing will all be less accurate after the release of Aster 6, which extends Aster’s capabilities beyond the trinity of SQL, the SQL-MR library, and Aster-supported hand-coding.

Randy also said:

  • A typical Teradata Aster production customer uses 8-12 of the prebuilt functions (but now they seem to be called operators).
  • nPath is used in almost every Aster account. (And by now nPath has morphed into a family of about 5 different things.)
  • The Aster collaborative filtering operator is used in almost every account.
  • Ditto a/the text operator.
  • Several business intelligence vendors are partnering for direct access to selected Teradata Aster operators — mentioned were Tableau, TIBCO Spotfire, and Alteryx.
  • I don’t know whether this is on the strength of a specific operator or not, but Aster is used to help with predictive parts failure applications in multiple industries.

And Randy seemed to agree when I put words in his mouth to the effect that the prebuilt operators save users months of development time.

Meanwhile, Teradata Aster has started a whole new library for relationship analytics.
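
As an aside for readers who have not met nPath: it is Aster’s operator for matching patterns over ordered event sequences. The following Python sketch is not Aster syntax, just a rough illustration (with made-up clickstream data) of the kind of computation such an operator performs.

    from collections import defaultdict

    # Hypothetical clickstream rows: (user_id, timestamp, event).
    events = [
        ("u1", 1, "search"), ("u1", 2, "view"), ("u1", 3, "purchase"),
        ("u2", 1, "view"),   ("u2", 2, "search"),
    ]

    # Group each user's events in time order (nPath's PARTITION BY / ORDER BY).
    paths = defaultdict(list)
    for user, _ts, event in sorted(events):
        paths[user].append(event)

    def search_then_purchase(path):
        """Crude stand-in for a pattern such as 'search ... purchase'."""
        return "search" in path and "purchase" in path[path.index("search"):]

    converted = [user for user, path in paths.items() if search_then_purchase(path)]
    print(converted)  # ['u1']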

Primer on Big Data, Hadoop and “In-memory” Data Clouds

25 Aug

This is a good article. There have been a number of pieces recently on the hype around big data, but the fact of the matter is that the technology people are calling “big data” is here to stay, and it is going to change the way complex problems are handled. The article provides an overview and, for those looking for products, a good set of links.

This is a good companion piece to the articles by Wayne Eckerson referenced in this post.

Debate over NSA collecting information … can the media begin to report substantively?!!

7 Jun

So this business of the NSA collecting data should come as no surprise to anyone. The media is having a field day! The issue is whether or not the intel community is doing this legally. Has the FISA court done anything illegal? The court is guided by a set of rules that are meant to be transparent, known to the non-intel world, and approved by Congress. Did it follow these rules when allowing the NSA to collect what it collected? Did it restrict the use of that data appropriately? These are the questions. Remember after 9/11, when everyone was asking (with outrage) why we had not connected the dots? But back to my area of concern…

CNN has pulled out their favorite privacy pundit, Jim Harper from the Cato Institute. Jim is well spoken and very learned in the field of privacy and policy. However, he makes a statement in this interview that I find incredible: he says that collecting all the data from every American’s phone calls “can’t possibly be useful for link-based investigation.” Really? I cannot think of a better way of using phone call data than in link-based analysis. Methinks you need to stick with policy, Jim!! Anyone out there care to explain this comment?
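
For those unfamiliar with the term, link-based analysis treats call records as edges in a graph and asks neighborhood and reachability questions of it. A minimal Python sketch, with made-up numbers, of the sort of query involved:

    from collections import defaultdict, deque

    # Hypothetical call-detail records: (caller, callee) pairs.
    calls = [("A", "B"), ("B", "C"), ("C", "D"), ("X", "Y")]

    # Build an undirected call graph.
    graph = defaultdict(set)
    for a, b in calls:
        graph[a].add(b)
        graph[b].add(a)

    def within_hops(start, max_hops):
        """Return every number reachable from `start` in <= max_hops calls (BFS)."""
        seen = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            if seen[node] == max_hops:
                continue
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen[neighbor] = seen[node] + 1
                    queue.append(neighbor)
        return {n: d for n, d in seen.items() if n != start}

    # Two-hop neighborhood of a number of interest.
    print(within_hops("A", 2))  # {'B': 1, 'C': 2}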

As a matter of policy, there are probably some questions to be answered. The FISA court has been criticized for approving everything without question. I would like the news agencies to focus on that, and on whether or not the court is working as envisioned to protect our privacy.

Have a look at this post, which is homeland-security oriented. They are harvesting things differently there, but the same privacy concerns apply.

The Making of an Intelligence-Driven Organization

6 Jun

Interesting presentation, but I really liked the Prezi. If you have not seen one of these, have a look.

The discussions/handout covered many points, including:

  • As a discipline, intelligence seeks to remain an independent, objective advisor to the decision maker.
  • The realm of intelligence is that of judgment and probability, not prescription.
  • The intelligence product does NOT tell the decision maker what to do, but rather identifies the factors at play and how various actions may affect outcomes.
  • Intelligence analysts must actively review the accuracy of their mind-sets by applying structured analytic techniques coupled with divergent thinking.
  • Critical thinking clarifies goals, examines assumptions, discerns hidden values, evaluates evidence, accomplishes actions, and assesses inferences/conclusions.
  • Networking, coordinating, cooperating, collaboration, and multi-sector collaboration accomplish different goals and require different levels of human resources, trust, skills, time, and financial resources, but they are worth it to ensure coverage of issues.
  • Counterintelligence and security protect your own position.
  • And more…

I liked the stages of Intelligence-Driven Organizations in the Prezi.

Data Visualisation » Martin’s Insights

23 Apr

This is a good article on data visualization. The author indicates in his considerations section that “real data can be very difficult to work with at times and so it must never be mistaken that data visualisation is easy to do purely because it is more graphical.” This is a good point. In fact, in some respects, determining the right visualization can be harder than simply working with the data directly; working with the data directly, however, makes it much harder to communicate key insights to a diverse audience.

What rarely gets enough attention is that in order to create interesting visualizations, the underlying data needs to be structured and enhanced to feed the visualizations appropriately. The recent Boston bombing, where one of the bombers slipped through the system due to a name misspelling, recalled a project years ago where we enhanced the underlying data to identify “similarities” between entities (people, cars, addresses, etc.). For each type of entity, the notion of similarity was defined differently: for addresses it was geographic distance; for names it was semantic distance; for cars it was matching on a number of different variables; and for text narratives in documents we used the same approach that the plagiarism tools use.

In that project, a name misspelling, and the ability to tune the software to resolve names based on our willingness to accept false positives, allowed us to identify linkages that revealed networks. Once a link was established, we went back and validated it. The amount of metadata generated to create a relatively simple link chart was significant; it was the bulk of the work. In terms of data generated, it is not unusual for the data created to dwarf the original data set, especially if text exploitation and other unstructured data mining approaches are used.
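
As a rough illustration of the name-similarity piece (this is not the software we used, just a Python sketch with a simple edit-based distance and made-up names), note how the threshold is the dial that trades false positives for recall:

    from difflib import SequenceMatcher
    from itertools import combinations

    # Hypothetical name records, including one misspelling.
    names = ["John Smith", "Jon Smyth", "Alice Jones"]

    def name_similarity(a, b):
        """Edit-based similarity in [0, 1]; one of many possible name distances."""
        return SequenceMatcher(None, a.lower(), b.lower()).ratio()

    # The threshold is the tuning knob: lowering it accepts more false positives
    # in exchange for catching more misspelled or variant names.
    THRESHOLD = 0.8
    links = [(a, b) for a, b in combinations(names, 2)
             if name_similarity(a, b) >= THRESHOLD]
    print(links)  # links the two Smith variants, not Alice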

So… next time the sales guy shows you the nifty data visualization tool, ask about the data set used and how much massaging it needed.

http://www.martinsights.com/?p=492

TIBCO buys another company

1 Apr

TIBCO buys another company in the analytics space. I have always thought that with Spotfire, the Enterprise Service Bus business, and the acquisition of Insightful some years ago, TIBCO had the makings of a company that was putting together the big data analytical stack. With this purchase, they have added a geo capability. Will they ever get all these pieces integrated to create a solutions package, like SAS’s Fraud Framework? I am not sure why they have not done that to date. It may just be that complete solutions are too hard to sell, and it is easier to get in the door with a point solution. Anyway, I like Spotfire, and anything they do to build out the back end is good stuff. The price point still seems a little high for a point solution, but they seem to be making it work for them, so who am I to argue… It will be interesting to see how this plays out.

See also here, as they appear in the MDM Magic Quadrant as well.

Microsoft Powerpivot

17 Feb

I have always thought that Tableau was initially just a cool way to create Excel pivot tables and present the results in a graphic: something you can do in Excel, but that was a lot easier in Tableau. Is PowerPivot the MS answer to these tools that have leveraged Excel, but have not used Excel’s visualization capabilities or been willing/able to write the VB code to get Excel to do what you want it to do?
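
For anyone who has not looked under the hood, a pivot table is just a group-and-aggregate reshaped into a cross-tab. A quick sketch of the same operation in Python’s pandas, with illustrative data:

    import pandas as pd

    # Illustrative sales data; in Excel this would be the source range of the pivot.
    df = pd.DataFrame({
        "region":  ["East", "East", "West", "West"],
        "quarter": ["Q1", "Q2", "Q1", "Q2"],
        "sales":   [100, 250, 75, 130],
    })

    # Rows = region, columns = quarter, cells = summed sales: a pivot table.
    pivot = pd.pivot_table(df, index="region", columns="quarter",
                           values="sales", aggfunc="sum")
    print(pivot)
    # Output, roughly:
    # quarter   Q1   Q2
    # region
    # East     100  250
    # West      75  130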

I do not have Office 2013, but look forward to playing with this when I do.

The next thing: how will visualizations interact with big data?

29 Jan

If big data is to be really useful for the masses, it needs to be distilled down to something that can be intuitively understood on mobile devices.

A very positive outlook on the BI space: the cloud will bring it to the masses; visualization will make it understandable.
