Archive by Author

Gartner 2012 MDM Survey

22 Feb

Given what I just put out about TIBCO, I thought I should post this as well, since they show up here under MDM too.

Magic Quadrant for Master Data Management of Customer Data Solutions

18 October 2012 ID:G00233198
Analyst(s): John Radcliffe, Bill O’Kane

VIEW SUMMARY

Organizations selecting a solution for master data management of customer data still face challenges. Overall, functionality continues to mature, but some of the well-established vendors’ messages are getting more complex. Meanwhile, less-established vendors continue to build up their credentials.

Link

Microsoft PowerPivot

17 Feb

Microsoft PowerPivot

I have always thought of Tableau as initially just a cool way to create Excel pivot tables and populate the results in a graphic – something you can do in Excel, but that was a lot easier in Tableau. Is PowerPivot the MS answer to these tools that have leveraged Excel but have not used Excel's visualization capabilities, or been willing or able to write the VB code to get Excel to do what you want it to do?
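
For comparison, the same pivot-and-chart pattern is only a few lines outside of Excel as well. Here is a minimal sketch in Python using pandas – the data and column names are invented purely for illustration:

```python
# Minimal sketch: pivot some sample sales records and chart the result.
# Column names and data are invented for illustration only.
import pandas as pd
import matplotlib.pyplot as plt

sales = pd.DataFrame({
    "region":  ["East", "East", "West", "West", "East", "West"],
    "quarter": ["Q1", "Q2", "Q1", "Q2", "Q1", "Q2"],
    "revenue": [120, 135, 90, 110, 60, 75],
})

# Equivalent of an Excel pivot table: regions as rows, quarters as columns.
pivot = sales.pivot_table(index="region", columns="quarter",
                          values="revenue", aggfunc="sum")

# And the "populate the results in a graphic" step.
pivot.plot(kind="bar", title="Revenue by region and quarter")
plt.show()
```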

I do not have Office 2013, but look forward to playing with this when I do.

Link

Another reason why Data Management and Analysts cannot lead separate lives

14 Feb

Another reason why Data Management and Analysts cannot lead separate lives

I found this article interesting in that it points out why the bridge between the data side of the house and the analytical side must be well established – if the data team implements a design that does not support analytics, it has material impacts. I know this is blindingly obvious, but ….

I have recently been in a number of discussions where the attitude was that we are going to build the data warehouse using best practices and years of experience, and it really does not matter what you are going to do with the data. I know it sounds crazy, but… you know what I am talking about – we see it all the time.

The article itself tests performance of a columnar versus a traditional row-oriented relational approach to persisting data, and reports a surprising result – a 4,100% improvement! I would be interested in other studies that have looked at the difference between data architectures when performing analytical tasks.
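
As a toy illustration of why the storage layout matters for analytical queries – this is just a sketch, not the article's benchmark – compare summing one attribute when records are stored row by row versus as a contiguous column:

```python
# Toy illustration of row-oriented vs column-oriented layouts for one
# analytical query (sum of a single attribute). Not the article's benchmark,
# just a sketch of why layout matters.
import time
import numpy as np

n = 500_000

# Row-oriented: each record is a small dict, as a row store might hand it back.
rows = [{"id": i, "amount": float(i % 100), "region": i % 5} for i in range(n)]

# Column-oriented: the attribute of interest is one contiguous array.
amount_col = np.array([r["amount"] for r in rows])

t0 = time.perf_counter()
row_total = sum(r["amount"] for r in rows)      # touches every whole record
t1 = time.perf_counter()
col_total = amount_col.sum()                    # scans one contiguous column
t2 = time.perf_counter()

print(f"row-wise:    {row_total:.0f} in {t1 - t0:.4f}s")
print(f"column-wise: {col_total:.0f} in {t2 - t1:.4f}s")
```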

Good Perspective on Analytics

11 Feb

Interesting post on healthcare fraud, with a couple of good points: the traditional view of predictive analytics assumes you know what you are predicting, but fraud is adaptive – it changes. As a result, being predictive requires a more diverse approach that draws on a range of analytical techniques.
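
As a sketch of what a "range of analytical approaches" can look like in practice – the rules, weights and threshold below are invented – one option is to blend a static business rule with a simple statistical outlier score rather than lean on a single predictive model:

```python
# Sketch: combine a static business rule with a simple statistical outlier
# score. Rules, weights and threshold are invented for illustration; the
# point is that no single model carries the whole decision.
import numpy as np

rng = np.random.default_rng(42)
claim_amounts = rng.lognormal(mean=6.0, sigma=0.5, size=1000)

# Approach 1: a hand-written rule (e.g., flag unusually large claims).
rule_score = (claim_amounts > 1500).astype(float)

# Approach 2: an unsupervised outlier score (z-score on log amounts).
log_amt = np.log(claim_amounts)
z = np.abs((log_amt - log_amt.mean()) / log_amt.std())
outlier_score = z / z.max()

# Blend the signals; review the highest-scoring claims first.
combined = 0.5 * rule_score + 0.5 * outlier_score
review_queue = np.argsort(combined)[::-1][:10]
print("claims to review first:", review_queue)
```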

Gartner BI & Analytics Magic Quadrant is out…

10 Feb

The report can be obtained here, along with some industry analysis here.

Well the top right quadrant is becoming a crowded place.

Gartner BI Quadrant 2013

I have not had time to really go over this and compare it to last year's, but the trends and challenges we have been seeing are reflected in this report; some interesting points:

  1. All of the enterprise-level systems are reported to be hard to implement. This is no surprise – what always surprises me is that companies blame this on one vendor or another – they are all like that! It has to be one of the decision criteria when selecting one of the comprehensive tool sets.
  2. My sense is that IBM is coming along – and is in the running to be the uber BI/analytics company. However, the write-up indicates that growth through acquisition is still happening. This has traditionally led to confusion in the product line and difficulty in implementation. This is especially the case when you implement in a big data or streaming environment.
  3. TIBCO and Tableau continue to go head to head. I see Spotfire on top from a product perspective, with its use of "R", the purchase of Insightful, and its traditional enterprise service bus business to build on. HOWEVER, Gartner calls out the cost model as something that holds Spotfire back. This is especially true when compared to Tableau. My sense is that if TIBCO is selling an integrated solution, then they can embed the cost of the BI capabilities in the total purchase, and this is how they are gaining traction. Regardless – Spotfire is a great product and TIBCO is set to do great things, but their price point sets them up against SAS and IBM, while their flagship component sets them up against Tableau at a lower price point. My own experience is that this knocks them out of early-stage activity, and hence they are often not "built in" to the later-stage activity.
  4. SAS continues to dominate where analytics and Big Data are involved. However, it is interesting to note that Gartner calls out that they are having a hard time communicating business benefit. This is critical when you are selling an enterprise product at a premium price. Unlike IBM, which can draw on components that span the enterprise, SAS has to build the enterprise value proposition on the analytics stack alone – this is not a problem unique to SAS – building the value proposition for enterprise-level analytics is tough.
  5. Tableau is the darling of the crowd and moves into the Gartner Leaders' Quadrant for the first time. The company has come out with a number of "Big Data" type features: connectors to Hadoop, and the article refers to in-memory and columnar databases. While these things are important, and the lack of them was holding the company back from entering certain markets, this is a bit at odds with their largest customer segment and their traditional positioning approach. Addressing larger deployments with a more integrated approach takes them more directly into the competitive sphere of the big guys (SAP, IBM and SAS), and also into the sphere of TIBCO Spotfire.
  6. It would be interesting to run the Gartner analysis along different use cases (Fraud, Risk Management, Consumer Market Analysis, etc.). In certain circles one hears much about companies like Palantir, which has a sexy interface and might do well against Spotfire and Tableau, but is not included here. Detica is another company that may do well. SAS would probably come out on top in certain areas, especially with the new Visual Analytics component. There are probably other companies that have comprehensive BI solutions for particular markets – if anyone has information on these types of solutions, I would be interested in a comment.

More to follow – and there is probably much more to say as things continue to evolve at a pace!

Link

It is about what you do with the data!!

7 Feb

Hadoop and Big Data – it is about what you do with the data!!

Some good videos from TechTarget and Wayne Eckerson – for a data guy he talks a lot about analytics.

The next thing – how visualizations interact with Big Data?

29 Jan

If big data is to be really useful for the masses, it needs to be distilled down to something that can be intuitively understood on mobile devices.

Very positive outlook on the BI space – Cloud will bring it to the masses; visualization will make it understandable.

We need to think about ETL differently!

26 Jan

This blog was started to write about analytics – so here I go again on ETL! It seems that if you are working on Big Data, it always starts with the data, and in many respects that is the thing that is most difficult – or perhaps requires the most wrenching changes – see Creating an Enterprise Data Strategy for some interesting facts on data strategies.

ETL is a chore at the best of times. Analysts are generally in a rush to get data into a format that supports the analytical task of the day. Often this means taking data from the source and performing the integration required to make it analytically ready. This is often done at the expense of any effort by the data management folks to apply controls aimed at data quality.

This has created a tension between the data management side of the house and the analytical group. The data management folks are focused on getting data into an enterprise warehouse or data mart in a consistent format, with data defined and structured in accordance with the definitions and linkages set through the data governance process. Analysts, on the other hand – especially those engaged in adaptive types of analytical challenges – always seem to be looking at data through a different lens. Analysts often want to apply different entity resolution rules; want to understand new linkages (which implies new schemas); and generally seek to apply a much looser structure to the data in order to expose insights that are often hidden by the enterprise ETL process.
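
A minimal sketch of that tension (the field names and matching rules are invented): the same raw records resolve to different entities depending on which rules the analyst applies.

```python
# Sketch: the same raw customer records resolved under two different
# entity-resolution rules. Field names and rules are invented; the point
# is that the "right" linkage depends on the analytical question.
records = [
    {"name": "ACME Corp",        "email": "ap@acme.com", "tax_id": "11-111"},
    {"name": "Acme Corporation", "email": "ap@acme.com", "tax_id": None},
    {"name": "ACME Corp",        "email": "hr@acme.com", "tax_id": "11-111"},
]

def resolve(records, key_fn):
    """Group records into entities using whatever key the analyst chooses."""
    entities = {}
    for r in records:
        entities.setdefault(key_fn(r), []).append(r)
    return entities

# The warehouse might resolve strictly on tax_id...
by_tax_id = resolve(records, lambda r: r["tax_id"])

# ...while a marketing analysis might prefer to resolve on email domain.
by_domain = resolve(records, lambda r: r["email"].split("@")[1])

print(len(by_tax_id), "entities by tax id;", len(by_domain), "by email domain")
```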

This mismatch in requirements can be addressed in many ways. However, a key starting step is to redefine the meaning of ETL within an organization. I like the framing attributed to Michael Porter, who defines a "Lifecycle of Transformation" showing how data is managed from the raw or source state through to its application in a business context.

Value Chain of Transformation

I am pretty sure that Michael Porter does not think of himself as an ETL person, and the article (page 14) I obtained this from indicates that this perspective is not ETL. However, I submit that the perspective that ETL stops once you have data in the warehouse or the data mart is just too limiting, and it creates a false divide. Data must be both usable and actionable – not just usable. By looking at the ETL challenge across the entire transformation (does that make ETL TL TL TL …?), practitioners are more likely to meet the needs of business users.
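
As a rough sketch of treating transformation as a chain rather than a single source-to-warehouse hop (the stage names and logic are invented), each stage takes the previous stage's output one step closer to a business decision:

```python
# Sketch: transformation as a chain of stages rather than a single
# source-to-warehouse hop. Stage names and logic are invented.
raw = [
    {"cust": " acme ", "amt": "120.50", "dt": "2013-01-05"},
    {"cust": "ACME",   "amt": "80.00",  "dt": "2013-01-20"},
]

def to_warehouse(rows):
    """Classic ETL: clean, type and standardize into the governed shape."""
    return [{"customer": r["cust"].strip().upper(),
             "amount": float(r["amt"]),
             "date": r["dt"]} for r in rows]

def to_analysis_ready(rows):
    """Further transformation for the analyst: aggregate per customer."""
    totals = {}
    for r in rows:
        totals[r["customer"]] = totals.get(r["customer"], 0.0) + r["amount"]
    return totals

def to_business_context(totals, threshold=150.0):
    """Final transformation: turn the numbers into an actionable flag."""
    return {c: ("review" if t > threshold else "ok") for c, t in totals.items()}

print(to_business_context(to_analysis_ready(to_warehouse(raw))))
```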

Related discussions for future entries:

  • Wayne Eckerson has a number of articles on this topic. My favorite: Exploiting Big Data Strategies for Integrating with Hadoop by Wayne Eckerson; Published: June 1, 2012.
  • The limitations placed on analytics by applying a schema independent of the analytical context are one of the drawbacks of "old school" RDBMSs. The ability of a file-based Hadoop/MapReduce-oriented analytical environment to apply the schema later in the process is a key benefit of that approach – a minimal sketch of the idea follows below.
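
Here is that schema-on-read sketch (the file layout and field names are invented): the raw file stays untouched, and each analysis imposes its own structure at read time.

```python
# Sketch of schema-on-read: the raw file is kept as-is and each analysis
# applies its own structure when it reads. File layout is invented.
import csv
import io

raw_file = io.StringIO(
    "2013-01-05|acme|120.50|east\n"
    "2013-01-20|globex|80.00|west\n"
)

def read_with_schema(f, schema):
    """Apply whatever column names this analysis cares about, at read time."""
    f.seek(0)
    for row in csv.reader(f, delimiter="|"):
        yield dict(zip(schema, row))

# One analysis reads the file with one schema...
txns = list(read_with_schema(raw_file, ["date", "customer", "amount", "region"]))

# ...a later analysis imposes a different vocabulary on the same bytes.
events = list(read_with_schema(raw_file, ["event_date", "account", "value", "market"]))

print(txns[0]["customer"], events[1]["market"])
```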

Open Source versus COTS

26 Jan

Public Sector Big Data: 5 Ways Big Data Must Evolve in 2013

Much of this article rings true. However, the last section requires some explanation:

“One could argue that as open source goes in 2013, Big Data goes as well. If open source platforms and tools continue to address agency demands for security, scalability, and flexibility, benefits within from Big Data within and across agencies will increase exponentially. There are hundreds of thousands of viable open source technologies on the market today. Not all are suitable for agency requirements, but as agencies update and expand their uses of data, these tools offer limitless opportunities to innovate. Additionally, opting for open source instead of proprietary vendor solutions prevents an agency from being locked into a single vendor’s tool that it may at some point outgrow or find ill-suited for their needs.”

I take exception to this in that the decision to go open source versus COTS is really not that simple. It depends on a number of things: the nature of your business; the resources available to you; and the enterprise platforms and legacy already in place, to name a few. If you implement a COTS tool improperly you can be locked into using that tool – just the same as if you implement an open source tool improperly.

How locked in you are to any tool is largely a question of how the solution is architected! Be smart and take your time making sure the logical architecture provides the right level of abstraction, and with it modularity and flexibility. This article talks about agile BI architectures – we need to be thinking the same way about system architectures.
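
As a rough sketch of what that abstraction can look like in code (the interface and class names are invented), the application talks to a small interface, and whether the implementation behind it is COTS or open source becomes a swappable detail:

```python
# Sketch: an abstraction layer that keeps the application from binding
# directly to any one product. Interface and class names are invented;
# real implementations would wrap a COTS API or an open source library.
from abc import ABC, abstractmethod

class SearchIndex(ABC):
    """What the application actually needs from 'an indexer'."""
    @abstractmethod
    def index(self, doc_id: str, text: str) -> None: ...
    @abstractmethod
    def search(self, term: str) -> list[str]: ...

class InMemoryIndex(SearchIndex):
    """Stand-in implementation; could be replaced by a Lucene-backed or
    vendor-backed class without touching the calling code."""
    def __init__(self):
        self._docs = {}
    def index(self, doc_id, text):
        self._docs[doc_id] = text.lower()
    def search(self, term):
        return [d for d, t in self._docs.items() if term.lower() in t]

def build_report(idx: SearchIndex) -> list[str]:
    # Application logic depends only on the interface.
    return idx.search("fraud")

engine = InMemoryIndex()
engine.index("doc1", "Quarterly fraud review")
engine.index("doc2", "Revenue forecast")
print(build_report(engine))   # -> ['doc1']
```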

My feeling is that we are headed to a world where COTS products work in conjunction with Open Source – currently there are many examples of COTS products that ship with Open Source components – how many products ship with a Lucene indexer for example?