Analytics Defined

For this post, I provide a link to my article entitled ‘Analytics Defined: A Conceptual Framework’, which was published in the June 2016 issue of ‘ORMS Today’. A PDF copy of the article has been placed on the Indiana University – Purdue University Fort Wayne website. The link follows:

http://www.ipfw.edu/centers/business-analytics/pdf/DefiningAnalytics.pdf

The core ideas in this article were first presented in four posts on this blog.

These ideas were further developed for a panel discussion on 11/1/15 (https://cld.bz/KAj90ao#104/z [bottom of page]) and an invited presentation on 11/3/15 (https://cld.bz/KAj90ao#269/z) at the INFORMS National Meeting in Philadelphia.

Problem Centricity

Based on a recent discussion on LinkedIn titled “OR and Data Science”, there appears to be quite a bit of uncertainty surrounding the question of how operations research and data science compare to each other. This uncertainty is surprising since the disciplines of operations research and data science focus on different issues and have different objectives. These differences become evident when you examine the educational backgrounds of operations research analysts and data scientists, the capabilities that employers require them to possess, and the types of projects that they work on.

Educational Background

The following table compares the core course list of the University of California Berkeley Master of Information and Data Science program with the North Carolina State University Master of Operations Research program.

Comparison of Courses

Operations Research (NC State) | Data Science (UC Berkeley)
--- | ---
Introduction to Operations Research | Research Design and Application for Data and Analysis
Introduction to Mathematical Programming | Exploring and Analyzing Data
Linear Programming | Storing and Retrieving Data
Design and Analysis of Algorithms | Applied Machine Learning
Algorithmic Methods In Nonlinear Programming | Data Visualization and Communication
Dynamic Systems and Multivariable Control I | Experiments and Causal Inference
Computer Methods and Applications | Behind the Data: Humans and Values
Probability and Stochastic Processes I | Scaling Up! Really Big Data
Stochastic Models In Industrial Engineering | Applied Regression and Time Series Analysis
Nonlinear Programming | Machine Learning at Scale
Integer Programming | Synthetic Capstone Course
Dynamic Programming |
Probability and Stochastic Processes II |
Applied Stochastic Models In Industrial Engineering |
Queues and Stochastic Service Systems |
Computer Simulation Techniques |
Stochastic Simulation Design and Analysis |

There is essentially no overlap between these two lists. Moreover, the operations research program focuses on mathematical modeling of systems and optimization, while the data science program focuses on acquiring, managing and analyzing data and using it for prediction.

Required Skills

The job skills for a data scientist and a decision scientist (operations research) that are listed on the website of COBOT Systems (an analytics startup company) are shown below:

Decision Scientist (Operations Research) – Apply Operations Research & Decision Analytics

Linear Programming (Scheduling, Transportation, Assignment), Dynamic Programming, Integer Programming, Simulation, Queuing, Inventory, Maintenance, Decision Trees/Chains, Markov Chains, Influence Diagrams, Bayesian Networks, Incentive Plans, AHP, MCDM, Game Theory

Data Scientist – Apply Statistics & Data Analytics

Clustering, Classification Trees, Correlations, Multiple Regression, Logistic Regression, Forecasting, Sampling & Surveying, Reliability, Data Mining, Design of Experiments, Statistical Quality Control, Statistical Process Control, Machine Learning, Data Visualization

As can be seen, there are no common items on these lists! And, as in the case of the master's programs, the emphasis for operations research is on systems modeling and optimization, while the emphasis for data science is on statistical analysis and prediction.

Type of Projects

The following table lists operations research projects described in Impact Magazine (British OR Society), and data science projects mentioned by Anthony Goldbloom (founder of Kaggle) in a YouTube video:

Comparison of Projects

Operations Research (Impact) | Data Science (Kaggle)
--- | ---
Effectively allocate new product inventory to retail stores | Determine when a jet engine needs servicing
Optimally schedule customer service representatives | Predict whether a chemical compound will have molecular activity
Reduce the processing time of a cancer screening test | Detect whether a specific disease is present in an image of the eye
Create a fair schedule for a sports league | Predict which type of used car will be easiest to sell

Again, a comparison of these projects tells the same story: operations research projects involve improving or optimizing a system, while data science projects involve analyzing data to make a prediction.

The Fundamental Difference

The preceding comparisons highlight the fundamental difference between operations research and data science:

Operations research is a problem-centric discipline, in which a mathematical model of a problem or system is created to improve or optimize that problem or system;

Data science is a data-centric discipline, in which a mathematical model of a dataset is created to discover insights or make a prediction.
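
The contrast can be made concrete with a toy sketch. Everything in it is an assumption made for illustration: the cost matrix, the observations, and the function names are hypothetical, brute-force enumeration stands in for the mathematical programming methods an operations research analyst would actually use, and a least-squares line stands in for a real predictive model.

```python
from itertools import permutations

# Problem-centric (operations research): model a system, then optimize it.
# Hypothetical data: cost[i][j] is the cost of assigning worker i to task j.
cost = [
    [9, 2, 7],
    [6, 4, 3],
    [5, 8, 1],
]

def best_assignment(cost):
    """Enumerate every one-to-one assignment and return the cheapest one.

    Returns (assignment, total_cost), where assignment[i] is the task
    given to worker i. Brute force is only practical for tiny instances.
    """
    n = len(cost)
    def total(p):
        return sum(cost[i][p[i]] for i in range(n))
    best = min(permutations(range(n)), key=total)
    return best, total(best)

# Data-centric (data science): fit a model to a dataset, then predict.
# Hypothetical observations: (hours in service, measured wear).
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]

def fit_line(data):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(slope, intercept, x):
    """Use the fitted line to predict y at a new x."""
    return slope * x + intercept
```

With these made-up numbers, `best_assignment(cost)` gives worker 0 task 1, worker 1 task 0 and worker 2 task 2, at a total cost of 9, while `fit_line(data)` gives a slope of about 1.94 that `predict` can extrapolate to unseen hours. The first output is a decision about the system; the second is a statement about the data.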

The Consequences Of Obscurity

In a recent blog post, Polly Mitchell-Guthrie, when referring to an operations research project at UPS, wrote: “Does it really matter what we call it, if people value what was done and want to share the story? If it leads to the expansion of OR I don’t care if its called analytics.” In a 2011 blog post, Professor Michael Trick went even further, stating: “The lines between operations research and business analytics are undoubtedly blurred and further blurring is an admirable goal.”

The desire for an association with the very popular, and wildly hyped, terms analytics and business analytics is perhaps understandable. Unfortunately, these terms are associated not with the problem-centric paradigm of operations research, but with the data-centric world of IT/big data. I have been told by people who would know (an entrepreneur in the analytics space and a leader of a data science team) that when executives and IT leaders talk about analytics, they do not include operations research.

The terms analytics and business analytics are strongly associated with the word data: gathering it, cleaning it, mining it, analyzing it, presenting it, and attempting to gain insights from it. These activities, in turn, are closely associated with disciplines such as statistics, data science, computer science, and information technology. As a result, the analytics universe is diverse, and much larger than the operations research community. (Interestingly, Professor Trick, in the above-mentioned post, acknowledges that: “We are part of the business analytics story, but we are not the whole story, and I don’t think we are a particularly big part of the story.”)

Blurring the distinction between operations research and analytics would obscure the distinctive approach and unique capabilities of operations research. Operations research would no longer have a unique identity, and would become lost in a larger data-centric universe that is characterized by extreme, data-focused publicity. Were this to occur, consider how the following questions would be answered:

  • Will students decide to spend years of their lives studying operations research? Will they even know that such a discipline exists?
  • Will universities continue to offer programs in operations research? Will they continue to require MBA students to take operations research courses? Will they continue to hire professors who specialize in operations research?
  • Will companies form new operations research groups, or maintain existing ones? Will IT leaders decide to add operations research analysts to their data science teams? Will jobs and consulting assignments exist for operations research analysts?

I am afraid that obscurity will not lead to the “expansion of OR”; it will lead OR into oblivion.

What Is Analytics?

There is a surprising admission in an article entitled ‘What Is Analytics?’:

“It’s not likely that we’ll ever arrive at a conclusive definition of analytics…”

In an article entitled ‘Operational research from Taylorism to Terabytes: A research agenda for the analytics age’ the authors state:

“…may be the lack of any clear consensus about analytics’ precise definition, and how it differs from related concepts.”

The failure to construct a single definition that encompasses the meaning of analytics is not surprising: the word analytics is used in three different ways, with three separate meanings, and therefore, analytics requires three separate definitions:

  • analytics is used as a synonym for statistics or metrics. Examples are website analytics (how many views or clicks) or scoring analytics (number of points scored per 100 possessions).
  • analytics is used as a synonym for data science. Examples are data analytics, predictive analytics, or operations research and advanced analytics [the preceding phrase refers to two separate things: operations research and data science (advanced analytics)].
  • analytics is used to represent all of the quantitative decision sciences. This is the Davenport ‘Competing on Analytics’ usage.

Once it is recognized that three definitions are needed, it becomes possible to answer questions about analytics that previously caused problems. For example:

Question – Is analytics a discipline?

Answer – no, yes, no

The answer depends on which meaning of analytics we are referring to:

  • analytics = statistics/metrics. No. This is a type of measurement, is context sensitive, and essentially involves counting.
  • analytics = data science. Yes. Data science can be considered to be a discipline that combines elements of statistics and computer science.
  • analytics = all quantitative decision sciences. No. Analytics represents disciplines, but is not itself a discipline. (See Confusion Over Analytics)

So, not only can we arrive at a conclusive definition of analytics, we can (and must) arrive at three conclusive definitions of analytics!

Should We Re-Brand Operations Research?

There are some in the operations research community who want to re-brand operations research. They would like to be called analytics professionals. The reasoning behind this appears to be the following:

Operations research is not that popular;

Analytics is very popular;

They would like to be popular, so;

They will call themselves analytics professionals, and then;

They will be popular.

Here, I will not dwell on the flawed premises or faulty logic embodied in this reasoning. Instead, I will focus on the consequences of a successful re-branding. When considering these consequences, we should keep the following points in mind:

  • While those promoting analytics have trouble defining it, they are in agreement that it encompasses many different disciplines (see Confusion Over Analytics), such as statistics, computer science, data science, big data, business intelligence and operations research.
  • Operations research represents a tiny fraction of the IT/analytics universe.
  • The existence of generic analytics professionals would imply that there is no longer a meaningful distinction to be made between the ‘former’ disciplines of statistics, computer science and operations research.

To help you envision a post re-branding period, I offer two scenarios. In both, an IT executive is speaking to the leader of what was once an operations research group, but is now an analytics group after being re-branded. Remember, operations research no longer exists!

Scenario A

“Alice, I am assigning you and your team to be part of our data quality initiative.”

“But, Sir.”

“No buts Alice, big data is our priority — we must have high quality data!”

Six months later….

“Well done Alice. You and your team have reduced the error rate by 6%. I’m going to make this assignment to data quality permanent.”

Scenario B

“Tom, I am assigning you and your team to our text analytics initiative.”

“But Sir.”

“No buts Tom, our competitors are all heavily involved in this area — we will not be left behind!”

Six months later….

“Tom, you and your team don’t seem to be up to the task; all of your projects are months behind schedule. I’m going to have to let you and your team go. Report to human resources and pick up your termination package.”

Conclusion

So, in one case those who have re-branded survive, and in the other case they do not. In both cases, the practice of operations research ends.

Will Operations Research Survive?

There have been some troubling signs: a 2010 article in OR/MS Today suggested that analytics would subsume operations research; a 2013 LinkedIn discussion asked “Will Big Data end Operations Research?”; and ominously, even INFORMS seems to be distancing itself from operations research.

How should we react to this? Should we:

  • Take early retirement, move to Vermont, and open a bed and breakfast?
  • Claim to be analytics professionals, and hope no one asks us about Hadoop or NoSQL?
  • Return to school to study data science?

No! None of the above will be necessary. To understand why, it is necessary to go back to first principles.

The original meaning of the name operational research (what operations research is called in Great Britain, where it was invented) was, literally, scientific research on operations. The name was meant to distinguish scientific research on operations from scientific research on the underlying technology of some product, e.g. radar. In the late 1930s the British Government funded scientific research directed towards creating radar equipment with sufficient range and precision to locate attacking aircraft. They also initiated an operations research study to determine the most effective way to deploy the radar stations, and integrate them into an effective air defense system.

This type of scientific research, and the scientific method upon which it is based, is a problem-solving paradigm. Operations research is the application of this problem-solving paradigm to the solution of operational and management problems.

During the summer of 1940, this paradigm arguably saved Great Britain from defeat. Today, as the Edelman Competition routinely demonstrates, this paradigm creates benefits so great that they transform entire organizations. And it is because of this paradigm that operations research can create value that can be created in no other way. This value (lower costs, higher profits, military advantage, more efficiency, better service) was needed in 1940, is in evidence all around us today, and will be in demand for as long as human civilization persists.

So, there is no cause for alarm. Just continue ‘Doing Good with Good OR’.

Analytics: A Conceptual Framework

In the 12/17/14 INFORMS Today Podcast, Glenn Wegryn observes that analytics is divided into two distinct camps. He notes that they tend to come from different organizational backgrounds and he describes them in the following way:

  • Data Centric – use data to find interesting insights and information to predict or anticipate what might happen;
  • Decision Centric – understand the business problem, then determine the specific methodologies and information needed to solve the specific problem.

That analytics appears to be divided into distinct camps should not be surprising, since, as I explained in Confusion Over Analytics, analytics should be understood as a conceptual grouping of the quantitative decision sciences as a whole. Therefore, it is to be expected that disciplines within the quantitative decision sciences have distinctive backgrounds, methods and approaches.

The data centric/decision centric categorization can be a useful way to think about analytics, since two disciplines contained within the analytics conceptual grouping fit these categories perfectly: data science (data centric) and operations research (decision/problem centric). Using this categorization, a framework can be constructed within which the various types of analytics, data science, and operations research can be related to each other in a logically consistent way.

[Figure: Diagram of an Analytics Framework]

Both common uses of the term analytics appear in the preceding diagram: to represent statistics and computer science and to represent all the quantitative decision sciences. This conceptual framework highlights a promising area for collaboration between data science and operations research (prescriptive analytics), while recognizing that most prescriptive quantitative analysis does not require intensive data analysis.