Lots of discussion on AI Governance & Privacy this week

15 Sep

Subcommittee hearing on oversight of AI here

I like this quote: “I recommend that since so many risks of AI systems come from within
relationships where people are on the bad end of an information asymmetry, lawmakers
should implement broad, non-negotiable duties of loyalty, care, and confidentiality as
part of any broad attempt to hold those who build and deploy AI systems accountable.”

It seems to me that if one follows this logic, we end up with principle-based legislation that will present challenges in building control models. It will take time for best practices to emerge. Do we end up with something that looks like GDPR but for AI?

Blumenthal & Hawley Announce Bipartisan Framework on Artificial Intelligence Legislation

Comprehensive framework would establish an independent oversight body, allow enforcers & victims to seek legal accountability for harms, promote transparency, & protect personal data

The good thing about the way this is written up is that many of the data and PII best practices already on the books are captured; transparency and how children’s data is managed are the two that caught my eye.

SB-362 Data broker registration: accessible deletion mechanism. (2023-2024)

Much wailing and gnashing of teeth here. This is one of those things that in principle sounds great, but in practice will be complex – maybe in this day and age that applies to all privacy data management. My biggest issue surrounds what organizations do until this all gets sorted out – what does “good” look like from the regulator perspective?

This is summed up in the following from Alex LaCasse at the IAPP: “From a purely practical perspective, in a relatively short time period, there are now many varying privacy laws that require companies to quickly and wholly change their operations and technical infrastructure, let alone their business practices that are reliant on data,” Kelley Drye & Warren Partner Alysa Hutnik, CIPP/US, said. “In the meantime, companies are devoting millions to revamp their operations to comply with these laws in good faith, knowing that realistically their interpretation of these laws may be off, and many more millions of dollars will need to be spent to course-correct based on future regulations and regulatory guidance.”

I am reminded of a comment a lawyer friend made back in 2017 when GDPR was all the rage: “if you wait until the details are sorted out in court, then you will not have wasted millions – far cheaper to pay me $60k to defend this position than to do a system upgrade and have to redo it every time legal opinions are released” (and yes, he said $60k, which sounds too low to me!)

From a practical perspective, I keep coming back to the core privacy principles – which basically align to GDPR and CCPA Rights and Obligations. We need to be able to execute on those rights at some level, get those foundations in place, and be in a position to fine-tune when the details emerge.

Core Principles:

  • Lawfulness, fairness and transparency
  • Purpose limitation
  • Data minimization
  • Accuracy
  • Storage limitation
  • Integrity and confidentiality (security)
  • Accountability
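As one illustration of what a foundation might look like, the storage limitation principle can be expressed as a simple retention check. This is a minimal sketch only: the record fields (`id`, `collected_on`), the one-year retention period, and the function name are all assumptions made for illustration, not anything prescribed by GDPR, CCPA, or a regulator.

```python
from datetime import date, timedelta

# Hypothetical retention period; the right value depends on purpose and legal advice.
RETENTION = timedelta(days=365)

def overdue_for_deletion(records, today, retention=RETENTION):
    """Return the ids of records held longer than the retention period."""
    return [r["id"] for r in records if today - r["collected_on"] > retention]

# Illustrative records only.
records = [
    {"id": 1, "collected_on": date(2022, 1, 1)},
    {"id": 2, "collected_on": date(2023, 6, 1)},
]

overdue = overdue_for_deletion(records, today=date(2023, 9, 15))
print(overdue)
```

The point of keeping the rule this small is that the retention period becomes a parameter that can be fine-tuned when regulatory guidance settles, without reworking the surrounding process.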

Managing Software Development – The Debate Continues …

13 Sep

This post falls into the category of amusing stuff to do over morning coffee.

McKinsey wrote an article that is creating much discussion. I am sure this was written in good faith in an attempt to solve a management problem associated with software development teams. That article is here. The response from Dave Farley (YouTube link below) is pretty funny.

I no longer manage software teams, and when I last did, agile was just beginning. However, I love to keep track of all of the discussion about management techniques – generally those that are either trying to fix the mistakes of agile, or debunk it entirely. This is relevant from a data perspective, as data and applications are obviously tied together and any data project is going to get tied up into a DevSecOps or a DataOps discussion – both seeking to be agile in one form or another.

My high-level takeaway from the McKinsey article was that they were trying to focus on the soft issues or capabilities. Indeed, this is the issue Farley had with the article: the metrics were not measurable. This is typical McKinsey; they are a management consulting company after all. Farley argues that one can stick with established metrics and measures and do better. His point is that it is the production of the team that counts. However, that still does not get at understanding the management challenge at the individual level. Nowadays, I would have to think that team productivity is impacted when people leave, and thus within the scope of the McKinsey discussion is the idea of retention, which is not a bad idea. Are you looking at things as a manager, in which case retention matters, or as a developer, in which case quality and speed are what should be measured?

Either way, have a read and a look – amusing musings!

Business Framework for Analytics Implementation

14 Sep

Updated 9/14/20 with new links. It is a bit ironic that I linked to the Dataversity site, and they do not use persistent identifiers to label their data assets, so all my links are dead. Note to practitioners – if you are not using persistent identifiers, the institutional knowledge captured in your data assets lasts only as long as the identifier!

I went looking for this deck as I was having a discussion on governance that is as old as the hills; essentially, how do you link data governance activities to business activity in order to answer the question: why does data governance exist?

The other discussion that got me looking at this article again was how we go about building an operating model for organizations where the Governance team is doing more than responding to quality requests – how does the team proactively address data issues?

Both of these are tied to the article below. The Hoshin Framework (at least as it is presented below) ties strategic initiatives all the way down to identified data capabilities that can be addressed proactively to support the business strategy. 

A note on the spreadsheet. This spreadsheet is not for the faint of heart. The spreadsheet supports the thought exercise used to shape discussions and your communication with stakeholders. The key point to take away is that the spreadsheet gives you the ability to relate governance budget to strategic goals, funded programs, current projects, and metrics. Think of it as the audit worksheets – no one ever sees those, and the auditor reports out only the results.

Original Post.

In my previous post I discussed some analytical phrases that are gaining traction. Related to that, I have had a number of requests for the deck that I presented at the Enterprise Dataversity – Data Strategy & Analytics Forum. I have attached the presentation here. NOTE: This presentation was done a few years ago while I was with CMMI (now ISACA); as a result, it is tied to their Data Management Maturity Model. I talked about analytics, and my colleague on the presentation addressed data maturity.

Also, while I am posting useful things that people keep asking for, here is a set of links that Jeff Gentry did on management frameworks for a Dataversity Webinar. Of particular interest to me was the mapping of the Hoshin Strategic Planning Framework to the CMMI Data Management Maturity Framework. The last link is the actual Excel spreadsheet template.

Links:

  1. Webinar Recording: http://www.dataversity.net/cdo-webinar-cdo-interview-with-jeff-gentry-favorite-frameworks/. Here is a link to the deck.
  2. Link to Using Hoshin Frameworks. Hoshin is bigger than just this matrix, and is a heavy process for most people. However, the following gives you some background: http://www.slideshare.net/Lightconsulting/hoshin-planning-presentation-7336617
  3. Hoshin Framework linked to DMM: Data Analytics Strategy and Roadmap Template 20160204D.xlsx

What am I doing…?

17 Nov

Someone asked the other day if I was still blogging – the answer is yes, but… During the summer I joined the teaching team at University of Maryland to teach a course in data governance and data quality in the Graduate Information Management Program. Between creating the course content and teaching the course, it has consumed all of the creative energy that I normally put into the blog articles.

Hang tight. I will be posting again soon.

Data Prep – More than a Buzzword?

25 Feb

“Data Prep” has become a popular phrase over the last year or so – why? At a practical level, data preparation tools are providing the same functionality that traditional ETL (extract, transform, load) tools provide. Are data prep tools just a marketing gimmick to get organizations to buy more ETL software? This blog seeks to address why data prep capabilities have become a topic of conversation within the data and analytics communities.

Traditionally, data prep has been viewed as slow and laborious, often associated with linear, rigid methodologies. Recently, however, data prep has become synonymous with data agility. It is a set of capabilities that pushes the boundaries of who has access to data, and how they can apply it to business challenges. Looked at this way, data prep is a foundational capability for digital transformation, which I define as the ability of companies to evolve in an agile fashion in some key dimension of their business model. The business driver of most transformation programs is to fundamentally change key business performance metrics, such as revenue, margins, or market share. Viewed in this way, data prep tools are a critical addition to the toolbox when it comes to driving key business metrics.

Consider the way that data usage has evolved, and the role that data prep capabilities are playing.

Analytics is maturing. Analytics is not a new idea. However, for years it was a function relegated to Operations Research (OR) folks and statisticians. This is no longer the case. As BI and reporting tools grew more powerful and increasingly enabled self service for end users, users began asking questions that were more analytical in nature.

Data-driven decisions require data “in context.” Decision-making and the process that supports it require data to be evaluated in the context of the business or operational challenge at hand. How management perceives an issue will drive what data is collected and how it is analyzed. In the 1950s and 1960s, operations research drove analytics, and the key performance indicators were well established. These included time in process, mean time to failure, yield and throughput. All of these were well understood and largely prescriptive. Fast forward to now. Analytics is broadly applied and used well beyond the scope of operations research. New types of analysis, driven in large part by social media trends, are much less prescriptive, and value is driven by context. Examples include key opinion leaders, fraud networks, perceptual mapping, and sentiment analysis.

Big data is driving the adoption of machine learning. Machine learning requires the integration of domain expertise with the data in order to expose “features” within the data that enhance the effectiveness of machine learning algorithms. The activity that identifies and organizes these features is called “feature engineering.” Many data scientists would not equate “data preparation” with feature engineering, yet there is a strong correlation to what an analyst does. A business analyst invariably creates features as they prepare their data for analysis: 1) observations are placed on a timeline; 2) revenue is totaled by quarter and year; 3) customers are organized by location, by cumulative spend, and so on. Data prep in this context is the organization of data around domain expertise, and is a critical input to the harnessing of big data through automation.
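To make the analyst's three steps concrete, here is a minimal sketch in plain Python. The transaction records, field names, and the `build_features` helper are all hypothetical and exist only to illustrate the idea; a real data prep tool would do this at far greater scale and with far more care.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transaction records; field names and values are illustrative only.
transactions = [
    {"customer": "A", "location": "MD", "date": date(2018, 5, 20), "amount": 250.0},
    {"customer": "A", "location": "MD", "date": date(2018, 1, 15), "amount": 100.0},
    {"customer": "B", "location": "VA", "date": date(2018, 2, 3), "amount": 400.0},
    {"customer": "B", "location": "VA", "date": date(2019, 1, 10), "amount": 50.0},
]

def quarter(d):
    """Return the calendar quarter (1-4) for a date."""
    return (d.month - 1) // 3 + 1

def build_features(rows):
    # 1) Place observations on a timeline.
    ordered = sorted(rows, key=lambda r: r["date"])
    revenue_by_quarter = defaultdict(float)
    revenue_by_year = defaultdict(float)
    spend_by_location = defaultdict(float)
    running_spend = defaultdict(float)
    enriched = []
    for r in ordered:
        year, q = r["date"].year, quarter(r["date"])
        # 2) Total revenue by quarter and by year.
        revenue_by_quarter[(year, q)] += r["amount"]
        revenue_by_year[year] += r["amount"]
        # 3) Organize customers by location and by cumulative spend.
        spend_by_location[r["location"]] += r["amount"]
        running_spend[r["customer"]] += r["amount"]
        enriched.append({**r, "year": year, "quarter": q,
                         "cumulative_spend": running_spend[r["customer"]]})
    return (enriched, dict(revenue_by_quarter),
            dict(revenue_by_year), dict(spend_by_location))

enriched, by_quarter, by_year, by_location = build_features(transactions)
print(by_quarter)
```

Each derived column (`quarter`, `cumulative_spend`) and each rollup is a feature in the machine learning sense: domain knowledge about how the business thinks about time and customers, encoded into the data before any algorithm sees it.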

Data science is evolving and data engineering is now a thing. Data engineering focuses on how to apply and scale the insights from data science into an operational context. It’s one thing for a data scientist to spend time organizing data for modest initiatives or limited analysis, but for scaled-up operational activities involving business analysts, marketers and operational staff, data prep must be a capability that is available to staff with a more generalized skill set. Data engineering supports building capabilities that enable users to access, prepare and apply data in their day-to-day work.

“Data Prep” in the context of the above is enabling a broader community of data citizens to discover, access, organize and integrate data into these diverse scenarios. This broad access to data using tools that organize and visualize is a critical success factor for organizations seeking the business benefits of digitally enabling their organization. Future blogs will drill down on each of the above to explore how practitioners can evolve their data prep capabilities and apply them to business challenges.

The topic of protecting personal information will grow in importance in 2019

19 Nov
IAPP Annual Report 2018
For those interested in the protection of personal information, the IAPP has published an interesting, albeit rather hefty, IAPP-EY Annual Privacy Governance Report 2018, and the NTIA has released its comments from industry on pending privacy regulation. I noted that the IAPP report indicates most solutions are still largely or entirely manual. I am not sure how this does not become a management nightmare as organizations evolve their data maturity to align operations and marketing more closely. Data management as a process discipline and some degree of automation are going to be critical capabilities to ensure personal information is protected. There are simply too many opportunities for error when this is done manually.
I recently published an article in TDAN on automating data management and governance through machine learning. It is not just about ML; other capabilities will be required. However, as long as organizations rely on manual processes only, they open themselves up to risk and place the burden on management to enforce policies that are often resisted because they are perceived as a burden on actually doing business. Data management as a process discipline in conjunction with automated processes will reduce operational overhead and risk.

Architecting the Framework for Compliance & Risk Management

24 Oct

Really quick visit to the Data Architecture Summit this year. I wish I could have stayed longer, but I had to get back to a project.

My presentation was on creating audit defensibility that ensures practices are compliant and performed in a way that is scalable, transparent, and defensible; thus creating “Audit Resilience.” Data practitioners often struggle with viewing the world from the auditor’s perspective. This presentation focused on how to create the foundational governance framework supporting a data control model required to produce clean audit findings. These capabilities are essential in a world where due diligence and compliance with best practices are critical to addressing the impacts of security and privacy breaches.

Here is the deck. This was billed as an intermediate presentation and we had a mixed group of business folks and IT people with good questions and dialogue. I am looking forward to the next event.

Agile – we just keep trying to make it work!

3 Aug

In the summer of 2013, I must have been thinking about Agile approaches to development as I wrote two blogs on the topic:

I was interested to see that Martin Fowler released an article on yet another approach to fixing what is wrong with agile: the Agile Fluency Model. The article provides a good, comprehensive write-up on this approach. However, go back to look at the links in the above blogs. There are a number of amusing ones: this one from Martin Fowler titled Flaccid Scrum, and these two very amusing ones here and here. They all refer to the same set of challenges facing how agile is implemented.

I am not sure I have anything to add to the debate. However, I do note that successful teams invariably: 1) involve a white board; 2) engage in lively and dynamic dialogue around the challenge; and 3) have team members with an intuitive, user-centric understanding of the problems the team seeks to solve.

I guess I am also surprised that we are still talking about how to “do” agile!

Link to Agile Fluency Model Diagnostic

Update: Interesting article here by Joshua Seckel