Scientific training in the era of big data: A new pedagogy for graduate education

Research output: Contribution to journal (Review article)

Abstract

The era of "big data" has radically altered the way scientific research is conducted and new knowledge is discovered. Indeed, the scientific method is rapidly being complemented and even replaced in some fields by data-driven approaches to knowledge discovery. This paradigm shift is sometimes referred to as the "fourth paradigm" of data-intensive and data-enabled scientific discovery. Interdisciplinary research with a hard emphasis on translational outcomes is becoming the norm in all large-scale scientific endeavors. Yet, graduate education remains largely focused on individual achievement within a single scientific domain, with little training in team-based, interdisciplinary data-oriented approaches designed to translate scientific data into new solutions to today's critical challenges. In this article, we propose a new pedagogy for graduate education: data-centered learning for the domain-data scientist. Our approach is based on four tenets: (1) Graduate training must incorporate interdisciplinary training that couples the domain sciences with data science. (2) Graduate training must prepare students for work in data-enabled research teams. (3) Graduate training must include education in teaming and leadership skills for the data scientist. (4) Graduate training must provide experiential training through academic/industry practicums and internships. We emphasize that this approach is distinct from today's graduate training, which offers training in either data science or a domain science (e.g., biology, sociology, political science, economics, and medicine), but does not integrate the two within a single curriculum designed to prepare the next generation of domain-data scientists. We are in the process of implementing the proposed pedagogy through the development of a new graduate curriculum based on the above four tenets, and we describe herein our strategy, progress, and lessons learned. 
While our pedagogy was developed in the context of graduate education, the general approach of data-centered learning can and should be applied to students and professionals at any stage of their education, including at the K-12, undergraduate, graduate, and professional levels. We believe that the time is right to embed data-centered learning within our educational system and, thus, generate the talent required to fully harness the potential of big data.

Language: English (US)
Pages: 12-18
Number of pages: 7
Journal: Big Data
Volume: 5
Issue number: 1
DOI: 10.1089/big.2016.0014
State: Published - Mar 1 2017

Fingerprint

Education
Curricula
Students
Medicine
Data mining
Big data
Pedagogy
Graduate education
Economics
Industry

Keywords

  • big data analytics
  • big data infrastructure design
  • big data training
  • business intelligence
  • data science
  • graduate education
  • scientific discovery
  • team science

ASJC Scopus subject areas

  • Information Systems
  • Computer Science Applications
  • Information Systems and Management

Cite this

@article{d207e9b620fb4fcb8f65cab7420eba9a,
title = "Scientific training in the era of big data: A new pedagogy for graduate education",
abstract = "The era of {"}big data{"} has radically altered the way scientific research is conducted and new knowledge is discovered. Indeed, the scientific method is rapidly being complemented and even replaced in some fields by data-driven approaches to knowledge discovery. This paradigm shift is sometimes referred to as the {"}fourth paradigm{"} of data-intensive and data-enabled scientific discovery. Interdisciplinary research with a hard emphasis on translational outcomes is becoming the norm in all large-scale scientific endeavors. Yet, graduate education remains largely focused on individual achievement within a single scientific domain, with little training in team-based, interdisciplinary data-oriented approaches designed to translate scientific data into new solutions to today's critical challenges. In this article, we propose a new pedagogy for graduate education: data-centered learning for the domain-data scientist. Our approach is based on four tenets: (1) Graduate training must incorporate interdisciplinary training that couples the domain sciences with data science. (2) Graduate training must prepare students for work in data-enabled research teams. (3) Graduate training must include education in teaming and leadership skills for the data scientist. (4) Graduate training must provide experiential training through academic/industry practicums and internships. We emphasize that this approach is distinct from today's graduate training, which offers training in either data science or a domain science (e.g., biology, sociology, political science, economics, and medicine), but does not integrate the two within a single curriculum designed to prepare the next generation of domain-data scientists. We are in the process of implementing the proposed pedagogy through the development of a new graduate curriculum based on the above four tenets, and we describe herein our strategy, progress, and lessons learned. 
While our pedagogy was developed in the context of graduate education, the general approach of data-centered learning can and should be applied to students and professionals at any stage of their education, including at the K-12, undergraduate, graduate, and professional levels. We believe that the time is right to embed data-centered learning within our educational system and, thus, generate the talent required to fully harness the potential of big data.",
keywords = "big data analytics, big data infrastructure design, big data training, business intelligence, data science, graduate education, scientific discovery, team science",
author = "Jay Aikat and Carsey, {Thomas M.} and Karamarie Fecho and Kevin Jeffay and Ashok Krishnamurthy and Mucha, {Peter J.} and Arcot Rajasekar and Ahalt, {Stanley C.}",
year = "2017",
month = "3",
day = "1",
doi = "10.1089/big.2016.0014",
language = "English (US)",
volume = "5",
pages = "12--18",
journal = "Big Data",
issn = "2167-6461",
publisher = "Mary Ann Liebert Inc.",
number = "1",

}

TY - JOUR

T1 - Scientific training in the era of big data: A new pedagogy for graduate education

T2 - Big Data

AU - Aikat, Jay

AU - Carsey, Thomas M.

AU - Fecho, Karamarie

AU - Jeffay, Kevin

AU - Krishnamurthy, Ashok

AU - Mucha, Peter J.

AU - Rajasekar, Arcot

AU - Ahalt, Stanley C.

PY - 2017/3/1

Y1 - 2017/3/1

N2 - The era of "big data" has radically altered the way scientific research is conducted and new knowledge is discovered. Indeed, the scientific method is rapidly being complemented and even replaced in some fields by data-driven approaches to knowledge discovery. This paradigm shift is sometimes referred to as the "fourth paradigm" of data-intensive and data-enabled scientific discovery. Interdisciplinary research with a hard emphasis on translational outcomes is becoming the norm in all large-scale scientific endeavors. Yet, graduate education remains largely focused on individual achievement within a single scientific domain, with little training in team-based, interdisciplinary data-oriented approaches designed to translate scientific data into new solutions to today's critical challenges. In this article, we propose a new pedagogy for graduate education: data-centered learning for the domain-data scientist. Our approach is based on four tenets: (1) Graduate training must incorporate interdisciplinary training that couples the domain sciences with data science. (2) Graduate training must prepare students for work in data-enabled research teams. (3) Graduate training must include education in teaming and leadership skills for the data scientist. (4) Graduate training must provide experiential training through academic/industry practicums and internships. We emphasize that this approach is distinct from today's graduate training, which offers training in either data science or a domain science (e.g., biology, sociology, political science, economics, and medicine), but does not integrate the two within a single curriculum designed to prepare the next generation of domain-data scientists. We are in the process of implementing the proposed pedagogy through the development of a new graduate curriculum based on the above four tenets, and we describe herein our strategy, progress, and lessons learned. 
While our pedagogy was developed in the context of graduate education, the general approach of data-centered learning can and should be applied to students and professionals at any stage of their education, including at the K-12, undergraduate, graduate, and professional levels. We believe that the time is right to embed data-centered learning within our educational system and, thus, generate the talent required to fully harness the potential of big data.

KW - big data analytics

KW - big data infrastructure design

KW - big data training

KW - business intelligence

KW - data science

KW - graduate education

KW - scientific discovery

KW - team science

UR - http://www.scopus.com/inward/record.url?scp=85016420133&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85016420133&partnerID=8YFLogxK

U2 - 10.1089/big.2016.0014

DO - 10.1089/big.2016.0014

M3 - Review article

VL - 5

SP - 12

EP - 18

JO - Big Data

JF - Big Data

SN - 2167-6461

IS - 1

ER -