
The information overload problem stems from the computerization of information
in the information age, with data drawn from numerous sources: health,
insurance, finance, education and security, to name a few. Simply put, the data
generated by everyday activity is of value to decision makers, analysts and
individuals. Recently, however, we have been faced with more information than
we can handle. In the face of this flood we become less informed, because the
amount of data produced and stored has outgrown our ability to extract
meaningful information from it. According to Endsley (2000), there is an
information gap: more data does not necessarily equal more information. Yet the
information overload problem is not inevitable. Confronting it requires systems
that combine the perceptual and cognitive abilities of the human with the
computational power of modern computers. One way to achieve this is through
visual analytics.



The basic idea of visual analytics is to represent information visually,
allowing the human to interact with it directly, gain insight, draw conclusions
and ultimately make better decisions. Keeping the human in the loop, as opposed
to relying on fully automatic techniques, reduces the risk of errors and bias
and improves productivity and user acceptance (Endsley, Designing for Situation
Awareness). Appropriate visual representations and metaphors support the human
cognitive process because they reduce the complex cognitive work needed to
perform certain tasks. Visual analytics is more than visualization alone; it
can be seen as an integral approach combining visualization, human-computer
interaction and data analysis. Visual analytics tools and techniques are used
to derive insight from massive, dynamic and often conflicting datasets by
providing timely, defensible and comprehensible assessments. For better
informed decisions, it is essential to include humans in the data analysis
process, combining their flexibility, creativity and domain knowledge with the
enormous storage capacity and computational power of today’s computers.




In general, visual analytics is defined as “the science of analytical
reasoning facilitated by interactive visual interfaces”. More precisely, visual
analytics is an iterative process that involves information gathering, data
preprocessing, knowledge representation, interaction and decision making. The
ultimate goal is to gain insight into the problem at hand, which is described
by vast amounts of scientific, forensic or business data from heterogeneous
sources. To reach this goal, visual analytics combines the strengths of
machines with those of humans. It supports these processes by unifying and
integrating the previously independent approaches of visualization, statistics
and data mining to address big data challenges in a holistic manner. The main
idea is to create software that facilitates analytical reasoning by leveraging
the human capacity to perceive, understand and reason about complex information
and events. The earliest known research following this idea was the Research
Agenda for Visual Analytics, “Illuminating the Path” (Thomas, 2005), written in
the wake of the 9/11 terrorist attacks and focused on US Homeland Security.
Since then, visual analytics applications have been extended to complex big
data problems in health, government, astrophysics, cyber-security, education,
transportation, business and finance, to name a few.

[Figure 1: Scope of visual analytics]

The scope of visual analytics (Fig 1) combines approaches from information and
scientific visualization, data management, statistical data mining and
automated analysis, knowledge discovery and human factors, which aid
communication between human and computer as well as decision making.







Visual Data Mining: An Introduction and Overview. Accessed Jan 12, 2018.



\section{Data Preparation}

In the modern world we are faced with rapidly increasing amounts of data. This
data is mostly stored in its raw state, without filtering or cleaning, which
makes it unsuitable and error-prone for the tasks ahead. Raw data often comes
in forms that are difficult to work with, e.g. incomplete records, inconsistent
formats and duplicated rows. Using such data without cleaning it puts data
quality at risk and introduces uncertainty, because the data is inconsistent
and messy.
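
As a rough illustration of the problems named above, the following minimal
sketch profiles a small table for missing values and duplicated rows before any
analysis takes place. The records, the field names and the `profile` helper are
hypothetical and serve only as an example; a real project would more likely use
a dedicated data-cleaning tool.

```python
from collections import Counter

# Hypothetical raw records; field names and values are illustrative only.
records = [
    {"id": "1", "date": "2018-01-12", "amount": "10.5"},
    {"id": "2", "date": "12/01/2018", "amount": ""},
    {"id": "1", "date": "2018-01-12", "amount": "10.5"},  # exact duplicate
    {"id": "3", "date": None, "amount": "7"},
]


def is_missing(value):
    """Treat None and blank strings as missing."""
    return value is None or str(value).strip() == ""


def profile(rows):
    """Count missing values per field and exact duplicate rows."""
    missing = Counter()
    seen = set()
    duplicates = 0
    for row in rows:
        for field, value in row.items():
            if is_missing(value):
                missing[field] += 1
        # A row is a duplicate if all of its (field, value) pairs were seen before.
        key = tuple(sorted(row.items(), key=lambda kv: kv[0]))
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return {"rows": len(rows), "missing": dict(missing), "duplicates": duplicates}


report = profile(records)
print(report)
```

The report only describes the problems; deciding whether to drop, repair or
keep the affected rows remains a judgment for the analyst.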


R.A. Fisher and C.R. Rao also stress the cross-examination of data as an
important stage before analysis begins. For incomplete data, methods are
available that perform validation checks and apply advanced analytical
techniques to help characterize data quality, for example by spotting missing
values and replacing them (Svolba, 2015). Because everything that follows is
drawn from this data, it is important for the analyst to carry out these
checks. Bad data quality is an enormous blow for an analyst. If data is
considered unfit for further processing (often a matter of completeness), the
project may be postponed or cancelled. Poor quality also leads to a lack of
trust and a high level of uncertainty in the analysis results. Data quality can
be improved using analytical and statistical methods. However, supplying values
for missing data can hide important facts, such as data that was intentionally
omitted or a faulty sensor. Tools: OpenRefine.


As a result, analysts may be able to determine more easily when expected
information is missing; sometimes the fact that information is missing offers
important clues in the assessment of the situation.
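
One way to keep missingness visible is to record it explicitly whenever a gap
is filled. The sketch below is only an illustration under simple assumptions:
the sensor readings are invented, `impute_with_flags` is a hypothetical helper,
and mean imputation stands in for whatever replacement method a project
actually uses.

```python
from statistics import mean

# Hypothetical sensor readings; None marks a value that was never recorded.
readings = [21.0, None, 22.0, None, 23.0]


def impute_with_flags(values):
    """Fill gaps with the mean of the observed values, and return a parallel
    list of flags so the analyst can still see which values were missing."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to impute from")
    fill = mean(observed)
    filled = [fill if v is None else v for v in values]
    was_missing = [v is None for v in values]
    return filled, was_missing


filled, was_missing = impute_with_flags(readings)
print(filled)
print(was_missing)
```

Keeping the flags next to the filled values lets a later visualization mark
imputed points differently, so that a run of gaps (for instance from a faulty
sensor) is not hidden by the replacement values.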


There seems to be an overlap between the fields in the visual analytics scope,
which enables seamless integration of infrastructure.




Large-scale data more often than not exceeds what can be shown on a
conventional desktop display, since only a small portion can be displayed at
once. To avoid cluttered displays and workspaces, data mining techniques such
as filtering, aggregation, principal component analysis and other reduction
methods are used to reduce or compress the amount of data viewed. Scalability
in general is a key challenge in visual analytics, as it determines not only
the computational techniques and algorithms but also the appropriate rendering
techniques. Visual analytics is tasked with providing an overview of a dataset
while at the same time maximizing the amount of detail available for gaining
insight (Keim, Challenges in Visual Data Analysis).
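
Filtering and aggregation, two of the reduction techniques mentioned above, can
be sketched in a few lines. The transaction rows, the threshold and the helper
names below are assumptions made purely for illustration.

```python
from collections import defaultdict

# Hypothetical transaction records: (region, amount).
rows = [("north", 120.0), ("south", 80.0), ("north", 30.0),
        ("east", 5.0), ("south", 20.0), ("east", 45.0)]


def filter_rows(data, min_amount):
    """Filtering: keep only the rows relevant to the current question."""
    return [r for r in data if r[1] >= min_amount]


def aggregate(data):
    """Aggregation: collapse many rows into one (count, total) pair per region."""
    summary = defaultdict(lambda: [0, 0.0])
    for region, amount in data:
        summary[region][0] += 1
        summary[region][1] += amount
    return {region: tuple(stats) for region, stats in summary.items()}


overview = aggregate(filter_rows(rows, min_amount=10.0))
print(overview)
```

The display then has to show one mark per region rather than one per
transaction, while the underlying rows remain available when more detail is
requested.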


Using wall-sized displays, it is possible to compare several hundred of these
growth matrices. However, some datasets, such as those of Facebook, Google, IBM
and various other sectors, contain billions of records, which cannot be viewed
even on several wall displays. With no way to explore them adequately, these
large datasets, collected for their potential usefulness, become useless and
the databases become data “dumps”.



Visual analytics aims to integrate the user into the exploration process by
leveraging human perceptual and cognitive abilities to explore large datasets.
Visual representations translate data into visual forms that highlight
features such as anomalies and commonalities, and they support user interaction
so that these trends can be analyzed and understood. The goal is to present the
user with visual representations or metaphors that closely match the
information being represented, thereby allowing insights and hypotheses to be
generated in support of the sensemaking process.

An important part of the process is not only analyzing data with different
algorithms and combinations of algorithms, but also determining which
visualization components are best suited to a particular dataset.




\subsection{Geo-spatial Analysis}


This data type refers to the movement and/or position of objects on a map or
chart. Sources include geographical measurements, GPS data and remote tracking
sensors. The data consists of two dimensions, longitude and latitude, which are
plotted as x-y coordinates on a map or chart. This type of data helps to create
a sense of spatial and situational awareness.


Scalability also poses a risk, as the number of data points to be visualized
often causes cluttered views. To deal with this, data points are counted and
aggregated into units, which are then depicted by their density (encoded with
colour or size).
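
A minimal sketch of such density aggregation is given below, assuming a simple
square grid over longitude and latitude. The sample points and the
`bin_points` helper are hypothetical, and the fixed cell size is an assumption;
a real system would typically adapt it to the zoom level.

```python
import math
from collections import Counter

# Hypothetical GPS points as (longitude, latitude) in degrees.
points = [(-0.12, 51.50), (-0.13, 51.51), (-0.10, 51.52),
          (2.35, 48.85), (2.36, 48.86)]


def bin_points(pts, cell_size):
    """Assign each point to a square grid cell and count the points per cell."""
    counts = Counter()
    for lon, lat in pts:
        cell = (math.floor(lon / cell_size), math.floor(lat / cell_size))
        counts[cell] += 1
    return counts


density = bin_points(points, cell_size=1.0)
for cell, count in density.items():
    print(cell, count)
```

Each cell count can then be mapped to a colour or a symbol size, so that the
map shows one mark per cell instead of one per point.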





\subsection{Temporal Analysis}

Temporal analysis seeks methods that exploit the temporal nature of real-world
data to help identify patterns, trends and correlations among data elements
over time. The data is animated against time to provide a narrative of a
sequence of events and to show how certain elements evolve.


Because temporal data is a function of time, we face complexities of scale: we
may wish to look for trends at hourly, daily or monthly granularity, as well as
for those that recur on a yearly basis.
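
The effect of granularity can be illustrated by counting the same events at
several time scales. The timestamps below are invented, and `count_by` and the
`FORMATS` table are hypothetical names used only for this sketch.

```python
from collections import Counter
from datetime import datetime

# Hypothetical event timestamps.
events = [
    datetime(2018, 1, 12, 9, 15),
    datetime(2018, 1, 12, 9, 45),
    datetime(2018, 1, 12, 14, 5),
    datetime(2018, 2, 3, 9, 30),
]

# One truncation format per granularity.
FORMATS = {
    "hour": "%Y-%m-%d %H:00",
    "day": "%Y-%m-%d",
    "month": "%Y-%m",
    "year": "%Y",
}


def count_by(timestamps, granularity):
    """Count events per time bucket at the requested granularity."""
    fmt = FORMATS[granularity]
    return Counter(ts.strftime(fmt) for ts in timestamps)


for level in ("hour", "day", "month", "year"):
    print(level, dict(count_by(events, level)))
```

The same four events form three hourly buckets but only one yearly bucket,
which is why a temporal view usually needs to let the analyst switch between
scales.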




\subsection{Network Analysis}

This type of data consists of objects, called actors or nodes, and the
connections between them, called edges, which together model real-world
systems. It is intended to show relationships between entities and is often
displayed in a hierarchical format. Examples include electrical power grids,
e-mail exchanges, social network communications, transportation networks and
customer shopping behavior. Basic measures of density, centrality and proximity
help to uncover interesting insights in the data and can be used to make
inferences.
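
Two of these measures, density and degree centrality, are simple enough to
compute by hand. The sketch below assumes an undirected network without
self-loops; the e-mail edges and the helper names are hypothetical.

```python
# Hypothetical undirected e-mail network: each pair exchanged messages.
edges = [("ann", "bob"), ("ann", "cat"), ("ann", "dan"), ("bob", "cat")]


def build_adjacency(edge_list):
    """Map each node to the set of its neighbours."""
    adj = {}
    for a, b in edge_list:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def density(adj):
    """Fraction of all possible undirected edges that are present."""
    n = len(adj)
    if n < 2:
        return 0.0
    m = sum(len(nbrs) for nbrs in adj.values()) / 2
    return 2 * m / (n * (n - 1))


def degree_centrality(adj):
    """Share of the other nodes that each node is directly connected to."""
    n = len(adj)
    if n < 2:
        return {node: 0.0 for node in adj}
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}


adjacency = build_adjacency(edges)
print(density(adjacency))
print(degree_centrality(adjacency))
```

In this toy network the node connected to every other node receives the
highest centrality, which is the kind of observation a node-link view could
then encode as node size.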



\subsection{Text Analysis}

Textual data consists of documents, multimedia web content and hypertext. It
differs from most other data types in that it cannot easily be expressed as
numbers, so most standard visualization techniques cannot be applied. Word
clouds, which help identify keywords, and document snippets are examples of
ways to visualize the results of text-based information retrieval.
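
The counts behind a word cloud can be sketched as a simple term-frequency
table. The stop-word list, the sample text and the `keyword_counts` helper
below are assumptions for illustration; real text analysis would use a much
fuller stop-word list and more careful tokenization.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real one would be far longer.
STOPWORDS = {"the", "of", "and", "a", "to", "in", "is", "with"}


def keyword_counts(text, top_n=3):
    """Return the top_n most frequent non-stop-words with their counts."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return counts.most_common(top_n)


sample = ("Visual analytics combines visualization with data analysis. "
          "Visual analytics supports analysis of data.")

for word, count in keyword_counts(sample, top_n=4):
    print(word, count)
```

In a word cloud each count would drive the font size of its word, turning text
that has no natural numeric form into something that can be drawn.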



\section{Interactive Techniques}

The path to exploratory information discovery was expressed by Shneiderman:
overview first, zoom and filter, details on demand. These actions can also be
described as the key framework of information foraging. Visual representations
alone do not satisfy the user's analytical needs. Interaction techniques are
required to support the dialogue between the user and the data, which reveals
insightful information, for instance when the user zooms in on particular
subsets of the data or changes the underlying visual metaphor. These
interactive tools provide a mechanism of communication between users and the
visualization system. Common interactive tools for visual analytics
applications include:




\section{Sensemaking – Simon Attfield}


Sensemaking is the cognitive process of developing interpretations of the
world. Interactive visualizations in visual analytics are important in enabling
sensemaking, as users continually interact with information systems to develop
a mental model or picture of a problem domain or activity. Klein (1999)
classified the human sensemaking process into two parts: naturalistic and
normative.


\subsection*{Naturalistic Sensemaking}

Naturalistic sensemaking operates when the sensemaker reasons backwards from
one or more observed consequents and in turn infers a possible outcome. The
significance of this process lies in the ability to draw inferences from
limited information. Observing certain visualizations can enable users to make
predictions or form hypotheses based on prior knowledge. Naturalistic
sensemaking is, however, subject to false interpretations and biases.

\subsection*{Normative Sensemaking}




The power of the sensemaking process comes from devising external aids that
enhance our cognitive abilities. Visual analytics facilitates the reasoning
process by representing information visually and allowing humans to interact
directly with these representations to gain insights and reach conclusions that
ultimately lead to better decision making. Pirolli and Card (1995) explain that
sensemaking takes place in two loops: the information foraging loop and the
sensemaking loop. The information foraging loop is the manipulation and
transformation of data to reveal insights, whilst the sensemaking loop involves
reviewing and organizing the insights generated in the foraging loop for
effective communication and action. During data analysis tasks, analysts derive
and confirm hypotheses by interactively exploring data using the various
techniques listed above. Nowadays, the challenge is not acquiring and analyzing
data to derive new knowledge but rather understanding and analyzing the results
of our analyses (M. Gladwell, 2009). The visualization pipeline model and the
visual analytics process model focus on exploring and gaining insight into
data. However, visual analytics systems offer little support for capturing
findings (into evidence files), organizing these findings (into schemas),
constructing arguments to validate hypotheses, and presenting them.

