Avoiding Data-Tunnel-Vision and Achieving Human-Machine Symbiosis

The classic human error is assuming we’re doing a good job because we have nothing better to compare ourselves to.

Humans operate on small data sets and produce flawed analyses. Machines operate on far larger ones and are capable of carrying out advanced analysis, achieving far more accurate results.

For example, far too often we’ll see people go to a certain college, get a certain degree, and then have a successful career, and assume the same path will work for us as well. In practice, our conclusion may be flawed for a number of reasons, including:


  • Humans aren’t able to work well with large data sets, so our results are not always reflective of the many other statistical outcomes.

  • Even when we do have a larger data set than usual, our brains aren’t able to perform advanced analysis, not to mention the biases that often mislead us.

  • Even when we’re capable of seeing it, we often fail to separate correlation from causation.

In the context of AI transformations, employees often don’t see the urgent need to improve their tools, methods, and tech stack, because they see only the data context in front of them, not the larger data context and all the data they are missing.


In other words, the exponential rate at which data is created means we can be so far behind that we aren’t even capable of recognizing what we are missing.

Often, we, as humans, choose to filter out large amounts of data based on generic axioms because the alternative requires us to analyze an impossible amount of data.

While the advantage of a small data set is that we can perform higher-quality manual analysis, we often lose out on a great deal of information from the data we don’t include in our analysis.


Example Case: Binary Segmentation of Data:

Imagine a human analyst tasked with categorizing data into “interesting” and “non-interesting” categories based on his corporation’s strategic goals. Given that he can’t analyze all the data manually, he filters it down to a select group.

This may be done randomly (in the worst case), or based on certain axioms, informed guesses, or past indicators. However, in doing so, the analyst takes on a number of strategic risks:


  1. By working with a smaller data set, the analyst gets less information. While he may be able to categorize the filtered data into the two groups with 100% accuracy, he misses countless data points that might also be “interesting” but were simply excluded in the initial filtration.

  2. Analysts often create closed loops of rules for filtering data. Since past behaviour doesn’t always indicate future behaviour, this causes them to miss many interesting and relevant data points in a way that ensures those points are never revealed (see the sketch after this list).

  3. Working with small, closed-loop data sets can create a false sense of certainty in the validity of our results, with no substantial way of knowing what we are missing. It’s easy to filter data when we have known questions. But what about when we have unknown questions with answers of value?

    In other words, analysts may not actually be able to define in advance what will be interesting, and so they base their filtration on dangerously limited assumptions.

    Even with known questions, our assumptions about the relevant data sources may be incorrect, or based on probability rather than on the quality of the individual “interesting” results we’d be missing out on.
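
As referenced in the list above, here is a minimal sketch of the kind of closed-loop, rule-based pre-filter described in point 2. The records, sources, and filtering rule are all invented for illustration:

```python
# A minimal sketch (hypothetical data and rules) of closed-loop filtration:
# a hard-coded rule set decides what the analyst ever gets to see.

records = [
    {"id": 1, "source": "industry_report", "topic": "competitor pricing"},
    {"id": 2, "source": "niche_forum", "topic": "emerging material shortage"},
    {"id": 3, "source": "industry_report", "topic": "quarterly earnings"},
    {"id": 4, "source": "patent_filing", "topic": "novel battery chemistry"},
]

# The analyst's axiom: only sources that proved useful in the past survive.
TRUSTED_SOURCES = {"industry_report"}

def manual_filter(record):
    """Rule-based pre-filtration: anything outside the trusted set is
    discarded before analysis, so it can never be flagged as interesting."""
    return record["source"] in TRUSTED_SOURCES

reviewed = [r["id"] for r in records if manual_filter(r)]
never_seen = [r["id"] for r in records if not manual_filter(r)]

print("Reviewed:", reviewed)      # [1, 3]
print("Never seen:", never_seen)  # [2, 4]: the forum signal and the patent
                                  # filing that might have been the real finds
```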


How ML and AI Engines can Solve This:

Assuming the human analyst is good at his job, he will have a high rate of accuracy (precision). For the sake of this example, let’s say he has a 100% rate of accuracy in segmenting the data.

On the other hand, a decently trained classification/segmentation machine-learning engine may only reach 85% accuracy (precision) after training. Our initial response, as humans, is to oppose the use of the algorithm, saying that only accurate results are of value.

However, if a machine were able to review all the data (say, 100× the human-reviewed volume), it would likely uncover new sources of information we didn’t even know existed, not to mention “interesting” subjects we may not have even known about.

Quite often, 85% accuracy on 10,000 data points will be more valuable to a company than 100% accuracy on 100 data points. In other words, our instinctive definition of accuracy is skewed by the small context in which we see it. This is true when dealing with pre-existing categories of “interesting” topics, but even more so when we can’t predict what all the “interesting” topics will be: while we may think we can foresee every “interesting” category, we may simply be wrong.
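
To make that concrete, here is a back-of-the-envelope sketch using the hypothetical numbers above. The 10% prevalence and 80% recall figures are assumptions added purely for illustration:

```python
# Rough, hypothetical comparison of "perfect on a little" vs. "imperfect on a lot".
# Prevalence and recall are illustrative assumptions, not measured values.

PREVALENCE = 0.10  # assume 10% of all data points are truly "interesting"

# Human analyst: 100% precision, but only 100 pre-filtered points get reviewed.
human_reviewed = 100
human_true_finds = int(human_reviewed * PREVALENCE)  # 10 interesting items found

# ML engine: 85% precision, but it reviews all 10,000 points.
machine_reviewed = 10_000
machine_recall = 0.80                                   # assumed for illustration
truly_interesting = int(machine_reviewed * PREVALENCE)  # 1,000 such items exist
machine_true_finds = int(truly_interesting * machine_recall)  # 800 found
machine_flags = round(machine_true_finds / 0.85)        # ~941 items flagged
false_alarms = machine_flags - machine_true_finds       # ~141 to weed out

print(f"Human finds {human_true_finds} interesting items.")
print(f"Machine surfaces {machine_true_finds}, plus {false_alarms} false alarms "
      f"that a human reviewer can quickly discard.")
```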

In cases of known questions, we can at least make an informed assumption as to how to filter the data for manual analysis. But in cases of unknown questions, the value of an engine that learns from vast numbers of data points, not to mention one that can operate in an unsupervised manner, is immense.
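
As one possible illustration, and not a description of any specific deployment, an unsupervised anomaly detector such as scikit-learn’s IsolationForest can flag unusual data points without being told in advance what “interesting” means. The data below is synthetic:

```python
# A minimal sketch of unsupervised discovery: surfacing unusual points without
# pre-defining what "interesting" looks like. Data is synthetic and illustrative.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)

# 1,000 "ordinary" points, plus a handful no filtering rule was looking for.
ordinary = rng.normal(loc=0.0, scale=1.0, size=(1000, 2))
oddities = rng.normal(loc=5.0, scale=0.5, size=(5, 2))
data = np.vstack([ordinary, oddities])

# contamination is an assumed guess at the share of unusual points.
model = IsolationForest(contamination=0.01, random_state=42)
labels = model.fit_predict(data)  # -1 marks points flagged as anomalous

flagged = np.where(labels == -1)[0]
print(f"{len(flagged)} points flagged for human review out of {len(data)}")
```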

While the machine’s results may require human review as quality control, that task will usually be far more cost-effective than having the human execute a flawed analysis on a minimal data set in the first place.

Often, we turn to manual analysis because it gives us a sense of security: within the small context of the data set we are analyzing, we have an almost perfect rate of accuracy (precision). However, this is a false sense of security.


As managers and leaders, we have to train ourselves to look at the larger data context and constantly suspect we are missing something. A failure to question our own coverage, in this case, can be detrimental to an operation.

None of this is to say there is no place for human analysis. But even when we need a highly accurate analysis done, choosing the small data set out of the larger one is not a task for humans.

This is one for machines. Ironically, today, our first and most dangerous filtration is done by humans, and usually not based on conclusive evidence. Even if an algorithm’s accuracy rate isn’t good enough on its own, it will still be far better at alerting us to the vast array of diverse “interesting” topics hidden in the data, which the analyst can then use as a guide for deeper manual analysis.


Creating Human-Machine Symbiosis:

We need to create human-machine analysis teams that truly work together: teams where machines guide humans in selecting the data most likely to be relevant for high-accuracy human analysis, and humans guide machines by defining their underlying logic and technology, training them, and measuring them, constantly.
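
One cycle of such a team might look like the following schematic sketch; the model, the analyst function, and the data are toy stand-ins invented for illustration:

```python
# A schematic sketch of one human-machine analysis cycle. The model and
# "analyst" below are stand-ins; the point is the division of labour.
import random

class ToyModel:
    """Stand-in for a trained classifier that scores relevance in [0, 1]."""
    def score(self, record):
        return record["signal"] + random.uniform(-0.1, 0.1)

    def retrain(self, labelled):
        print(f"Retraining on {len(labelled)} human-labelled examples")

def analyst_label(record):
    """Stand-in for the human's high-accuracy judgment on a single record."""
    return record["signal"] > 0.5

def run_cycle(model, all_records, top_k=5):
    # Machine role: score the full data set, surface the most promising slice.
    candidates = sorted(all_records, key=model.score, reverse=True)[:top_k]
    # Human role: high-accuracy analysis of that slice, producing labels.
    labelled = [(r, analyst_label(r)) for r in candidates]
    # The human judgments feed back into the machine, closing the loop.
    model.retrain(labelled)
    return labelled

records = [{"id": i, "signal": random.random()} for i in range(1000)]
run_cycle(ToyModel(), records)
```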


Looking ahead, it’s critical for managers to be able to distinguish between the role of the machine and the role of the human. Doing this well will allow humans and machines to work in conjunction, as “human-machine teams”.

If, in the past, one of the manager’s key responsibilities was to assign the most appropriate team member to every task, today that role has expanded to include evaluating which parts of each task should be human-centric and which should be machine-centric.

Those able to pair the humanities-oriented task of putting these technologies into a human and business context with the technological know-how to guide their implementation will be best positioned to lead a successful transformation.

We must always attempt to “know the unknowable”, even within our data. The key to achieving that is first recognizing that we all suffer from data-tunnel-vision; only then can we attempt to achieve the human-machine symbiosis that will transform an operation.
