Decision support in data analysis
The title of this post may seem reversed, as we usually talk about data analysis as the foundation for decision support systems, which rely heavily on data analysis methods, including statistical analysis, machine learning, data mining and visualization. This relation is, however, bi-directional: as modern analysis methods become more advanced, they also become more complex and more difficult to use to their full potential. Decision support elements can redefine the data analysis experience in simple usage scenarios (an inexperienced user working with basic algorithms) as well as advanced ones (an experienced user working with advanced algorithms).
We live in a data analysis world. More and more data are available about us and about the systems and environments that surround us (our bodies, our homes, our activities in real life and online). Very often the question is no longer what else to measure, but what to do with the data that we already have or can easily access. Technical solutions follow, as collecting, storing and processing massive amounts of data gets continuously cheaper, while algorithms become more sophisticated and effective. Data analysis seems to be more available than ever, and its applications are starting to touch every aspect of our lives, providing insight into the behavior of societies, organizations and individuals.
Data analysis is everywhere
Data analysis is no longer the domain of researchers alone. It has become a big business focused on extracting non-obvious information from available data (e.g. consumer preferences from shopping behaviors). On a daily basis we are overloaded with charts, reports, and direct or indirect recommendations. These are the products of data analysis aimed at answering generic questions, usually disconnected from personal context, and often delivered as an addition to purchased hardware or a service. It is a different story if we want to look for answers to our own questions. This still requires significant effort, access to tools, data in the right format and, most of all, data analysis skills (not only in the strictly technical sense).
Data analysis is essentially a decision process aimed at solving a problem through data exploration, transformations, application of analysis methods and, eventually, utilization of the results. Practical data analysis covers at least two areas: the domain of a problem (questions, measures, expected results) and the data analysis space (tools, methods, constraints). With the increasing complexity of both areas, it is becoming more difficult to find a data analyst (or a team) with strong competencies on both sides. And having both is critical, since a domain problem needs to be translated into a data analysis problem before it can be analyzed, and the results need to be brought back to the domain space.
Data analysis needs decision support
The main reason for adding decision support to the data analysis experience is to shift the focus from the analysis space to the problem domain. Users should spend most of their time working on the actual problem: selecting the right questions, controlling the process, and interpreting the results so they can be used in practice. Decision support is aimed at empowering new users to use data analysis effectively, starting with basic recommendations, moving on to interactive assistance, and eventually giving contextual support within the problem domain as well. And it can go further by enabling new analysis scenarios beyond the original configuration of an individual working with a dataset.
The core functionality of analysis decision support can be divided into four groups of goal-oriented tasks. The first group includes protecting the user from common mistakes, like applying an analysis method to a data stream of an incorrect type. The second group is about explaining and providing insight into data quality, process status or candidate results. The third group includes guiding the user through key decision points related to the selection of specific data analysis methods, or simply to the application of a suitable analysis template. The last group of tasks is aimed at automating the whole process, detecting interesting characteristics of the data, and eventually delivering not only answers, but also recommendations for the right questions.
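To make the first and third groups a little more concrete, here is a minimal Python sketch of a guard that refuses to apply a method to a stream of the wrong type and suggests suitable alternatives instead. It is only an illustration under our own assumptions: the names `Stream`, `METHOD_REQUIREMENTS`, `check_applicable` and `suggest_methods`, as well as the two stream kinds, are hypothetical and do not describe the actual framework.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical catalogue: which kinds of data each analysis method accepts.
METHOD_REQUIREMENTS = {
    "mean": {"numeric"},
    "linear_trend": {"numeric"},
    "frequency_count": {"numeric", "categorical"},
}


@dataclass
class Stream:
    """A named data stream with a declared kind (assumed metadata)."""
    name: str
    kind: str  # "numeric" or "categorical" in this sketch
    values: list = field(default_factory=list)


def check_applicable(method: str, stream: Stream) -> List[str]:
    """Return human-readable problems; an empty list means 'go ahead'."""
    problems = []
    accepted = METHOD_REQUIREMENTS.get(method)
    if accepted is None:
        problems.append(f"Unknown method '{method}'.")
    elif stream.kind not in accepted:
        problems.append(
            f"Method '{method}' expects {sorted(accepted)} data, "
            f"but stream '{stream.name}' is {stream.kind}."
        )
    if not stream.values:
        problems.append(f"Stream '{stream.name}' contains no values.")
    return problems


def suggest_methods(stream: Stream) -> List[str]:
    """Guide the user: list the methods that fit this stream's kind."""
    return sorted(
        name for name, kinds in METHOD_REQUIREMENTS.items()
        if stream.kind in kinds
    )


if __name__ == "__main__":
    mood = Stream("mood", "categorical", ["good", "bad", "good"])
    for problem in check_applicable("mean", mood):
        print("Warning:", problem)
    print("Try instead:", suggest_methods(mood))
```

Returning a list of explanations rather than raising an exception is a deliberate choice in this sketch: the point of decision support is to tell the user what went wrong and what to try next, not just to stop the analysis.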
Analysis decision support will enable new scenarios
Obviously, more advanced scenarios have stronger technical requirements, not only in the scope of analysis, but also in data management and human-computer interaction. Intelligent decision support for data analysis requires more information about input data streams, as well as effective mechanisms for managing data flows between the local context, shared spaces, and external systems (with emphasis on privacy and security requirements). The user experience must be built around the user: it must be customizable, adaptive, and based on individual requirements and preferences (including accessibility). These requirements become even more interesting when we move from individual to group, organizational or research scenarios.
In the first post of this blog we mentioned "a new type of framework" we’re working on. After this post we can be a little more precise and describe this project as an analysis decision support framework. This is still not a complete description, and we will keep expanding it in the upcoming posts. It does, however, emphasize the importance of intelligent decision support as one of the key elements of the framework we are implementing. The framework starts with the core engineering requirements and a focus on time-oriented data and the data analysis experience. But with the application of decision support and other elements, we can aim towards new and very exciting scenarios.