A mind mapping-based
literature review (MMBLR) approach to study the Big Data topic
Joseph Kim-keung Ho
Independent Trainer
Hong Kong, China
Abstract: Big Data emerged in the early 2000s as a topic
in eCommerce. It is examined in this paper by means of the mind mapping-based
literature review (MMBLR) approach of Ho (2016) to unveil its knowledge
structure. The exercise indicates that
its knowledge structure comprises four main themes with associated ideas,
viewpoints and empirical findings. In addition, the article illustrates how to
conduct a mind mapping-based literature review. All in all, the article offers
academic and pedagogical value on the topics of Big Data and the MMBLR
approach. The exercise itself also provides a stimulating intellectual learning
experience for the literature reviewer.
Keywords: Big Data, literature review, mind map, the
mind mapping-based literature review (MMBLR) approach
Please cite the article as: Ho, J.K.K. 2016. “A mind mapping-based
literature review (MMBLR) approach to study the Big Data topic” Joseph KK Ho e-resources blog October 9
(url address: http://josephho33.blogspot.hk/2016/10/a-mind-mapping-based-literature-review.html).
Introduction
The topic of Big Data is a
relatively recent one, having emerged in the early 2000s. It is of academic and
pedagogical interest to the writer, who has been a lecturer on eCommerce. In
this article, the writer presents his literature review findings on Big Data
using the mind mapping-based literature review (MMBLR) approach. This approach
was proposed by this writer this year and has been employed to review the
literature on a number of topics, such as supply chain management, strategic
management accounting and customer relationship management (Ho, 2016). The
overall aims of this exercise are to:
1. Render an image of the knowledge structure of
Big Data via the application of the MMBLR approach;
2. Illustrate how the MMBLR approach can be
applied in literature review, especially in preliminary literature review.
The findings from the
review exercise offer academic and pedagogical value to those who are
interested in the topics of Big Data, literature review and the MMBLR approach.
In addition, this exercise facilitates this writer’s intellectual learning
on these three topics. The next section gives a brief introduction to the MMBLR
approach. After that, an account of how it is applied to study Big Data is
presented.
An application of the mind mapping-based
literature review (MMBLR) approach
The mind mapping-based
literature review (MMBLR) approach was developed by this writer this year (Ho,
2016). It uses mind mapping as a literature review exercise that complements
thematic analysis (see the Literature on mind
mapping Facebook page and the Literature
on literature review Facebook page). The MMBLR approach
is made up of two steps. Step 1 is a thematic analysis of the literature on the
topic chosen for study. Step 2 makes use of the findings from step 1 to produce
a complementary mind map. The MMBLR approach is a relatively straightforward
and brief exercise for studying an academic topic. It is also an interpretive
exercise in the sense that reviewers with different research interests
and intellectual backgrounds will inevitably select somewhat different ideas,
facts and findings in their thematic analysis (i.e., step 1 of the MMBLR
approach). Also, to conduct the approach, the reviewer needs to perform a
literature search beforehand. Naturally, what a reviewer gathers from a
literature search depends on what library/e-library facility is available to
the reviewer. The next section presents the writer’s findings from the MMBLR
approach step 1; afterward, a companion mind map is provided based on the MMBLR
approach step 1 findings.
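The two steps can also be sketched in code. The following is a minimal illustration only and is not part of the MMBLR approach as published: the `Point` class, the `thematic_analysis` and `build_mind_map` functions and the sample key phrases are hypothetical names chosen here, and the assignment of a point to a theme, which in practice is the reviewer's interpretive judgement, is simply supplied by hand.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    """One idea, viewpoint, concept or finding noted during the review."""
    theme: str       # theme label assigned by the reviewer (interpretive)
    key_phrase: str  # short phrase summarising the quoted point
    source: str      # in-text citation of the article


def thematic_analysis(points):
    """Step 1: group the reviewer's points under their themes."""
    themes = defaultdict(list)
    for point in points:
        themes[point.theme].append(point)
    return dict(themes)


def build_mind_map(topic, themes):
    """Step 2: turn the step 1 findings into a mind map (a nested dict)."""
    return {
        "topic": topic,
        "branches": [
            {"theme": theme, "nodes": [p.key_phrase for p in pts]}
            for theme, pts in themes.items()
        ],
    }


# Hypothetical sample input; the key phrases are illustrative summaries.
points = [
    Point("Definitions and characteristics", "Three V's", "Gandomi and Haider, 2015"),
    Point("Policy considerations", "Privacy concerns", "Hayashi, 2014"),
    Point("Definitions and characteristics", "Low value density", "Gandomi and Haider, 2015"),
]
mind_map = build_mind_map("Big Data", thematic_analysis(points))
```

Keeping the source citation on each point mirrors the way the thematic analysis findings carry their references, so a reader of the mind map can trace any node back to an article.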
A thematic analysis on the Big Data
literature
Step 1 of the MMBLR approach is
a thematic analysis of the literature on the topic under investigation (Ho,
2016). In our case, this is the Big Data topic. The writer gathered some
academic articles from some universities’ e-libraries as well as via Google
Scholar. With the academic articles gathered, the writer conducted a literature
review on them to gather a set of ideas, viewpoints, concepts and findings
(called points here). These points from the Big Data literature were then
grouped into four themes. The thematic analysis endeavour is interpretive.
Some of the themes are further divided into sub-themes. The thematic analysis
findings are as follows:
Theme 1:
Definitions and characteristics of Big Data
Point 1.1.
“Big data is a term that describes large volumes of high velocity, complex and variable
data that require advanced techniques and technologies to enable the capture, storage, distribution,
management, and
analysis of the information”
(TechAmerica Foundation’s Federal Big Data Commission, 2012);
Point 1.2.
“Laney (2001)
suggested that Volume, Variety,
and Velocity (or the Three V’s) are the three dimensions of challenges in data
management. The Three V’s have emerged as a common framework to describe big
data” (Gandomi and Haider, 2015);
Point 1.3.
“Big Data is a loosely defined term used to
describe data sets so large and complex that they become awkward to work with
using standard statistical software..” (Snijders, Matzat
and Reips, 2012);
Point 1.4.
“big data are often characterized by relatively “low
value density”. That is, the data received in the original form usually has a
low value relative to its volume. However, a high value can be obtained by
analyzing large volumes of such data” (Gandomi and Haider, 2015);
Point
1.5.
“We define Big Data as a
cultural, technological, and scholarly phenomenon that rests on the interplay
of: (1) Technology … (2) Analysis … (3) Mythology
…” (Boyd
and Crawford, 2012);
Point
1.6.
“Big Data is less about data
that is big than it is about a capacity to search, aggregate, and
cross-reference large data sets…” (Boyd and Crawford, 2012);
Point 1.7.
“The fast evolution of big data
technologies and the ready acceptance of the concept by public and private
sectors left little time for the discourse to develop and mature in the
academic domain” (Gandomi and Haider, 2015);
Theme 2:
Business and technological trends related to Big Data
Theme 2.1:
Business-related
Point 2.1.1.
“….organizations are swimming in
an expanding sea of data that is either too voluminous or too unstructured to
be managed and analyzed through traditional means. Among its burgeoning sources
are the clickstream data from the Web, social media content (tweets, blogs,
Facebook wall postings, etc.) and video data from retail and other settings and
from video entertainment…” (Davenport, Barth and Bean, 2012);
Point 2.1.2.
“….
In business, economics and other fields … decisions will increasingly be based
on data and analysis rather than on experience and intuition” (Lohr, 2012);
Point 2.1.3.
“..A key tenet of big data is that the world
and the data that describe it are constantly changing, and organizations that
can recognize the changes and react quickly and intelligently will have the
upper hand…” (Davenport, Barth and Bean, 2012);
Point 2.1.4.
“…. Over time, we believe big data may well
become a new type of corporate asset that will cut across business units and
function much as a powerful brand does, representing a key basis for
competition. If that’s right, companies need to start thinking in earnest about
whether they are organized to exploit big data’s potential and to manage the
threats it can pose” (Brown, Chui and Manyika, 2011);
Point 2.1.5.
“As the volume of data
explodes, organizations will need analytic tools that are reliable, robust and
capable of being automated. At the same time, the analytics, algorithms and
user interfaces they employ will need to facilitate interactions with the
people who work with the tools…” (Davenport, Barth and Bean, 2012);
Theme 2.2:
Technology-related
Point 2.2.1.
“..the computer tools for gleaning knowledge and
insights from the Internet era’s vast trove of unstructured data are fast
gaining ground.” (Lohr, 2012);
Point 2.2.2.
“..…Over the past few years,
nearly all major companies, including EMC, Oracle, IBM, Microsoft, Google,
Amazon, and Facebook, etc. have started their big data projects..” (Chen, Mao and Liu, 2014);
Point 2.2.3.
“It is estimated that the business data volume
of all companies in the world may double every 1.2 years ….. The continuously
increasing business data volume requires more effective real-time analysis so
as to fully harvest its potential…” (Chen, Mao and Liu, 2014);
Point 2.2.4.
“Although major innovations in
analytical techniques for big data have not yet taken place, one anticipates
the emergence of such novel analytics in the near future. For instance,
real-time analytics will likely become a prolific field of research because of
the growth in location-aware social media and mobile apps.” (Gandomi and Haider, 2015);
Theme 3: Management
practices and challenges related to Big Data
Theme 3.1:
Associated technology-related
Point
3.1.1.
“..The development of cloud computing provides
solutions for the storage and processing of big data. On the other hand, the
emergence of big data also accelerates the development of cloud computing…” (Chen, Mao and Liu, 2014);
Point
3.1.2.
“..…..At
present, the data processing capacity of IoT [internet of things] has fallen
behind the collected data and it is extremely urgent to accelerate the
introduction of big data technologies to promote the development of IoT.” (Chen, Mao and Liu, 2014);
Point
3.1.3.
“In the IoT [internet of things]
paradigm, an enormous amount of networking sensors are embedded into various
devices and machines in the real world.…
The big data generated by IoT has different characteristics compared
with general big data..” (Chen, Mao
and Liu, 2014);
Point
3.1.4.
“the specialized tools of Big
Data also have their own inbuilt limitations and restrictions. For example,
Twitter and Facebook are examples of Big Data sources that offer very poor
archiving and search functions…” (Boyd and Crawford, 2012);
Theme 3.2:
The data management-related
Point
3.2.1.
“Most DBMSs
[database management systems] are designed for efficient transaction processing:
adding, updating, searching for, and retrieving small amounts of information in
a large database…..…The trouble comes when we want to take that accumulated
data, collected over months or years, and learn something from it and naturally
we want the answer in seconds or minutes! The pathologies of big data are
primarily those of analysis…” (Jacobs, 2009);
Point
3.2.2.
“The latest advances of
information technology (IT) make it more easily to generate data. … Therefore,
we are confronted with the main challenge of collecting and integrating massive
data from widely distributed data sources…” (Chen, Mao and Liu, 2014);
Point
3.2.3.
“Presently, Hadoop is widely used
in big data applications in the industry, e.g., spam filtering, network
searching, clickstream analysis, and social recommendation…” (Chen,
Mao and Liu, 2014);
Point
3.2.4.
“In the big data paradigm, the
data center not only is a platform for concentrated storage of data, but also
undertakes more responsibilities, such as acquiring data, managing data,
organizing data, and leveraging the data values and functions.” (Chen, Mao and Liu, 2014);
Point
3.2.5.
“Data collection is to utilize
special data collection techniques to acquire raw data from a specific data
generation environment. Four common data collection methods are shown as
follows. – Log files….– Sensing…– Methods
for acquiring network data -Libpcap-based
packet capture technology..” (Chen,
Mao and Liu, 2014);
Point
3.2.6.
“Large data sets from Internet
sources are often unreliable, prone to outages and losses, and these errors and
gaps are magnified when multiple data sets are used together…” (Boyd and Crawford, 2012);
Point
3.2.7.
“The first challenge brought
about by big data is how to develop a large scale distributed storage system
for efficiently data processing and analysis…” (Chen, Mao and Liu, 2014);
Point 3.2.8.
“Data can frequently be
collected passively, without much effort or even awareness on the part of those
being recorded. And because the cost of storage has fallen so much, it is
easier to justify keeping data than discarding it,” observe Viktor Mayer-Schönberger
and Kenneth Cukier…..” (Hayashi,
2014);
Point
3.2.9.
“..distributed
analysis of big data comes with its own set of “gotchas.” One of the major
problems is nonuniform distribution of work across nodes…” (Jacobs, 2009);
Point 3.2.10.
“It
is estimated that the analytics-ready structured data forms only a small subset
of big data. The unstructured data, especially data in video format, is the
largest component of big data that is only partially archived” (Gandomi and Haider, 2015);
Point 3.2.11. “Traditional data management systems are not capable of
handling huge data feeds instantaneously. This is where big data technologies
come into play. They enable firms to create real-time intelligence from high
volumes of ‘perishable’ data” (Gandomi and Haider, 2015);
Point
3.2.12. “….. Data is not only becoming more available
but also more understandable to computers…” (Lohr, 2012);
Theme 3.3:
The data analysis-related
Point
3.3.1.
“..Big Data introduces two new popular types
of social networks derived from data traces: ‘articulated networks’ and
‘behavioral networks’…” (Boyd
and Crawford, 2012);
Point
3.3.2.
“An anthropologist working for
Facebook or a sociologist working for Google will have access to data that the
rest of the scholarly community will not’. Some companies restrict access to
their data entirely; others sell the privilege of access for a fee; and others
offer small data sets to university-based researchers. This produces
considerable unevenness in the system…” (Boyd and Crawford, 2012);
Point
3.3.3.
“….Understanding networks and network formation
is a core topic in complexity research and its underlying sociological and
social-psychological processes should receive more attention in the analysis of
Big Data…” (Rainie and Wellman, 2012);
Point
3.3.4.
“Too often, Big Data enables
the practice of apophenia: seeing patterns where none actually exist, simply
because enormous quantities of data can offer connections that radiate in all
directions….” (Boyd
and Crawford, 2012);
Point
3.3.5.
“researchers have the tools and
the access, while social media users as a whole do not. Their data were created
in highly context-sensitive spaces, and it is entirely possible that some users
would not give permission for their data to be used elsewhere…” (Boyd and Crawford, 2012);
Point
3.3.6.
“..….Because large data sets
can be modeled, data are often reduced to what can fit into a mathematical
model. Yet, taken out of context, data lose meaning and value..” (Boyd and Crawford, 2012);
Point
3.3.7.
“As Gitelman (2011) observes,
data need to be imagined as data in the first instance, and this process of the
imagination of data entails an interpretative base: ‘every discipline and
disciplinary institution has its own norms and standards for the imagination of
data’…” (Boyd and Crawford, 2012);
Point
3.3.8.
“..…Big Data and whole data are
also not the same. Without taking into account the sample of a data set, the
size of the data set is meaningless…..” (Boyd and Crawford, 2012);
Point
3.3.9.
“in order to enable effective data analysis, we shall
pre-process data under any circumstances to integrate the data from different
sources..” (Chen, Mao and Liu, 2014);
Point 3.3.10.
“the following techniques represent a relevant subset of
the tools available for big data analytics. .. Text analytics …. Audio analytics …. Video analytics….. Social media analytics…. Predictive
analytics” (Gandomi and Haider, 2015);
Point
3.3.11. “..we can divide data analysis research into
six key technical fields, i.e., structured data analysis, text data analysis,
web data analysis, multimedia data analysis, network data analysis, and mobile
data analysis…” (Chen, Mao and Liu, 2014);
Point 3.3.12.
“The overall process of extracting insights from big data can be broken
down into five stages ….. These five stages form the two main sub-processes:
data management and analytics” (Gandomi and Haider, 2015);
Point
3.3.13. “If you have a random way of showing people
different things on your website, then you can pretty quickly, with a very
small number of observations, start to figure out what’s working and what
isn’t. In real time, you can begin to refine your presentation…” (Ransbotham, 2012);
Point
3.3.14. “Research insights can be found at any level,
including at very modest scales. In some cases, focusing just on a single
individual can be extraordinarily valuable…” (Boyd and Crawford, 2012);
Theme
3.4: General management-related
Point
3.4.1.
“Executives interested in leading a big data transition can start with
two simple techniques. First, they can get in the habit of asking “What do the
data say?” when faced with an important decision and following up with
more-specific questions such as “Where did the data come from?,” “What kinds of
analyses were conducted?,” ….. Second, they can allow themselves to be
overruled by the data …” (McAfee and Brynjolfsson, 2012);
Point
3.4.2.
“…Five Management Challenges… a transition to
using big data….. Leadership…. Talent management. … Technology. … Decision making. …. Company culture. …” (McAfee and Brynjolfsson, 2012);
Point
3.4.3.
“The more
companies characterized themselves as data-driven, the better they performed on
objective measures of financial and operational results…” (McAfee and Brynjolfsson, 2012);
Point
3.4.4.
“..…companies that learn to take
advantage of big data will use realtime information from sensors, radio
frequency identification and other identifying devices to understand their
business environments at a more granular level, to create new products and
services, and to respond to changes in usage patterns as they occur…” (Davenport,
Barth and Bean, 2012);
Point 3.4.5.
“Mayer-Schönberger and Cukier
explain three new imperatives: 1. Use all the data, not just a sample… 2. Accept
messiness [Inaccuracies in measurements] 3. Embrace
correlation” (Hayashi, 2014);
Point 3.4.6.
“Big data are worthless in a vacuum. Its potential value is unlocked only
when leveraged to drive decision making. To enable such evidence-based decision
making, organizations need efficient processes to turn high volumes of
fast-moving and diverse data into meaningful insights” (Gandomi and Haider, 2015);
Point
3.4.7.
“Through research on the five
core industries that represent the global economy, the McKinsey report pointed
out that big data may give a full play to the economic function, improve the
productivity and competitiveness of enterprises and public sectors, and create
huge benefits for consumers.” (Chen, Mao
and Liu, 2014);
Point
3.4.8.
“we can identify big data’s key elements.
First, companies can now collect data across business units and, increasingly,
even from partners and customers (some of this is truly big, some more granular
and complex). Second, a flexible infrastructure can integrate information and
scale up effectively to meet the surge. Finally, experiments, algorithms, and
analytics can make sense of all this information….” (Brown,
Chui and Manyika, 2011);
Point
3.4.9.
“Some literature … discuss
obstacles in the development of big data applications. The key challenges are
listed as follows: Data representation… Redundancy
reduction and data compression…
Data life cycle management… Analytical
mechanism… Data confidentiality… Energy management… Expendability
and scalability… Cooperation” (Chen, Mao and Liu, 2014);
Theme 4:
Policy considerations on Big Data
Point
4.1.
“.. major developed countries, including the US
and UK, are preparing diverse policies and measures that include bolstering
R&D investments and fostering experts to activate the big data industry in
a bid to become competitive in the smart ecosystem environment.” (Kwon, Kwak and Kim, 2015);
Point
4.2.
“Byung-Yeol et al. (2013) [Jang, 2013] emphasized
the importance of creating convergence services based on big data.” (Kwon, Kwak and Kim, 2015);
Point
4.3.
“Kyu-nam (2014) [Kim, 2014]
insisted that, in order for the big data industry to generate value as a future
growth engine, we should establish a structural framework based on social
consensus beforehand” (Kwon, Kwak and Kim,
2015);
Point
4.4.
“… Becoming data scientists
requires the convergence of various educational fields such as mathematics, science,
statistics, IT, and business.” (Kwon, Kwak and Kim,
2015);
Point 4.5.
“..the use of big data and predictive analytics
raises a number of difficult issues. One very hot topic is privacy concerns. In
2012, Target ignited a media firestorm after consumers learned that the company
was using its quantitative methods to predict which customers were pregnant” (Hayashi, 2014);
Referring to Figure 1, there are
four main themes, namely, “Definitions and characteristics of Big Data” (theme 1),
“Business and technological trends related to Big Data” (theme 2), “Management practices
and challenges related to Big Data” (theme 3), and “Policy considerations on Big
Data” (theme 4). Themes 2 and 3 have sub-themes. Each of the themes has a set
of associated points (i.e., ideas, viewpoints, concepts and findings). Together
they provide an organized way to comprehend the knowledge structure of the Big
Data topic. The referencing indicated on the points identified informs
readers where to find the academic articles to learn more about the details of
these points. The process of conducting the thematic analysis is an exploratory
as well as synthetic learning endeavour on the literature. Now that the
structure of the themes, sub-themes and their associated points is finalized,
the reviewer is in a position to move forward to step 2 of the MMBLR approach.
The MMBLR approach step 2 finding, i.e., a companion mind map, is presented in
the next section.
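As a rough check on the shape of this structure, the numbering of the points listed above can be tallied per theme. The sketch below is illustrative only: the `last_point_number` table is transcribed by hand from the point labels in this section, and the variable names are hypothetical.

```python
from collections import Counter

# Highest point number under each theme or sub-theme, transcribed from the
# thematic analysis findings above (e.g. theme 3.2 ends at Point 3.2.12).
last_point_number = {
    "1": 7,
    "2.1": 5,
    "2.2": 4,
    "3.1": 4,
    "3.2": 12,
    "3.3": 14,
    "3.4": 9,
    "4": 5,
}

# Roll the sub-theme counts up to their parent theme (the leading number).
points_per_theme = Counter()
for label, count in last_point_number.items():
    points_per_theme[label.split(".")[0]] += count

for theme, count in sorted(points_per_theme.items()):
    print(f"Theme {theme}: {count} points")
print(f"Total: {sum(points_per_theme.values())} points")
```

On this tally, theme 3 carries 39 of the 60 points, which suggests that the articles gathered for this review dwell mostly on management practices and challenges.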
Mind mapping on the Big Data theme
By adopting the findings from
the MMBLR approach step 1 on the Big Data topic, the writer constructs a
companion mind map shown as Figure 1.
Referring to the mind map on
Big Data, the topic label is shown right at the centre of the map as a large
blob. Four main branches are attached to it, corresponding to the four themes
identified in the thematic analysis. In the same vein, two branches, associated
with themes 2 and 3, have sub-branches, which represent the sub-themes
recognized in the thematic analysis findings (i.e., the MMBLR approach step 1).
The links and ending nodes with key phrases represent the points from the
thematic analysis. As a whole, the mind map renders an image of the knowledge
structure on Big Data based on the thematic analysis findings. See also the Literature on big data Facebook page for
additional information on Big Data. Constructing the mind map is part of the
reviewer’s learning process in the literature review. On the whole, the
mind mapping process is speedy and entertaining. The resultant mind map also
serves as useful presentation and teaching material. This mind mapping
experience confirms the writer’s previous experience in using the MMBLR
approach (Ho, 2016).
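The layout described above (a central topic, theme branches, sub-theme sub-branches and ending nodes) is a tree, so it can be written down compactly. A minimal sketch follows, assuming a nested-dictionary representation; the `render_outline` function is a hypothetical helper, and only the themes and sub-themes are included, with the ending-node key phrases left out.

```python
def render_outline(label, children, depth=0):
    """Return the mind map as indented text lines, one node per line."""
    lines = ["  " * depth + label]
    for child_label, grandchildren in children.items():
        lines.extend(render_outline(child_label, grandchildren, depth + 1))
    return lines


# Fragment of the Big Data mind map: themes and sub-themes only.
big_data_map = {
    "Theme 1: Definitions and characteristics of Big Data": {},
    "Theme 2: Business and technological trends related to Big Data": {
        "Theme 2.1: Business-related": {},
        "Theme 2.2: Technology-related": {},
    },
    "Theme 3: Management practices and challenges related to Big Data": {
        "Theme 3.1: Associated technology-related": {},
        "Theme 3.2: The data management-related": {},
        "Theme 3.3: The data analysis-related": {},
        "Theme 3.4: General management-related": {},
    },
    "Theme 4: Policy considerations on Big Data": {},
}

outline = render_outline("Big Data", big_data_map)
print("\n".join(outline))
```

An indented outline of this kind is only a textual stand-in for the drawn mind map, but it is quick to produce and can be pasted into most mind mapping tools that accept indented text.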
Concluding remarks
The application of the MMBLR approach to Big
Data presented here mainly illustrates the approach in practice, as its procedures
have already been refined through its use on an array of topics (Ho,
2016). This article introduces neither new steps nor new ideas to the approach.
In this respect, the exercise reported here primarily offers some pedagogical
value as well as some stimulated learning on Big Data. Nevertheless, the
thematic findings and the image of the knowledge structure on Big Data in the
form of a mind map should also be of academic value to those who research
this topic.
Bibliography
1. Boyd, D. and K.
Crawford. 2012. “Critical questions for Big Data” Information,
Communication & Society 15(5): 662-679 (DOI:
10.1080/1369118X.2012.678878).
2. Brown, B., M. Chui and J. Manyika. 2011. “Are you ready
for the era of ‘big data’?” McKinsey
Quarterly October: 1-12.
3. Chen, M., S. Mao and Y. Liu. 2014. “Big Data:
A Survey” Mobile Networks and Applications 19:
171-209.
4. Davenport,
T.H., P. Barth and R. Bean. 2012. “How ‘Big Data’ Is Different” MIT Sloan Management Review 54(1) Fall:
43-46.
5. Gandomi, A. and M. Haider.
2015. “Beyond the hype: Big data concepts, methods, and analytics” International Journal of Information Management
35, Elsevier: 137-144.
6. Gitelman, L. 2011. “Notes for the Upcoming
Collection ‘Raw Data’ is an Oxymoron” [Online] (url address:
https://files.nyu.edu/lg91/public/) (visited on July 23, 2011).
7.
Hayashi, A.M. 2014. “Thriving in a Big Data World” MIT Sloan Management Review 55(2) Winter: 35-39.
8.
Ho, J.K.K. 2016. Mind mapping for
literature review – an ebook, Joseph KK Ho publication folder October 7 (url
address:
http://josephkkho.blogspot.hk/2016/10/mind-mapping-for-literature-review-ebook.html).
9. Jacobs, A. 2009. “The Pathologies of Big Data” Communications of the ACM 52(8) August:
36-44.
10. Jang, B.Y., et al., 2013. “Big data-based
converged service development policies” STEPI. Science & Technology Policy 23 (3): 4–16.
11. Kim, K.N. 2014. “Big Data 2.0 Era, Key Issues
and Political Implications” Korea Information Society Development Institute.
12. Kwon, T.H., J.H. Kwak and K. Kim. 2015. “A study on the establishment of
policies for the activation of a big data industry and prioritization of
policies: Lessons from Korea” Technological
Forecasting & Social Change 96, Elsevier: 144-152.
13. Laney, D. 2001. “3-d data management:
controlling data volume, velocity and variety” META Group Research Note, February 6.
14. Literature on big data Facebook page, maintained by Joseph, K.K. Ho (url address:
https://www.facebook.com/Literature-on-big-data-1780021068946904/).
15. Literature on literature review Facebook page, maintained by Joseph,
K.K. Ho (url address: https://www.facebook.com/literature.literaturereview/).
16. Literature on mind mapping Facebook page, maintained by Joseph, K.K.
Ho (url address: https://www.facebook.com/literature.mind.mapping/).
17. Lohr, S. 2012. “The Age of Big Data” The New York Times February 11.
18. McAfee, A. and E. Brynjolfsson. 2012. “Big
Data: The Management Revolution” Harvard
Business Review October: 60-68.
19. Rainie, L. and B. Wellman. 2012. Networked.
The new social operating system, MIT Press. Cambridge.
20. Ransbotham, S. 2012. “Why Detailed Data Is As Important
As Big Data” Interviewed by Kiron, D. MIT
Sloan Management Review April: 1-5.
21. Snijders, C., U. Matzat and U. Reips. 2012. “‘Big Data’:
Big Gaps of Knowledge in the Field of Internet Science” International Journal of Internet Science 7(1): 1-5.
22.
TechAmerica Foundation’s Federal
Big Data Commission. 2012. “Demystifying big data:
A practical guide to transforming the business of Government” (url address: http://www.techamerica.org/Docs/fileManager.cfm?f=techamerica-bigdatareport-final.pdf).
The pdf version can be downloaded from: https://www.academia.edu/29023049/A_mind_mapping-based_literature_review_MMBLR_approach_to_study_the_Big_Data_topic