Awash in Data

Awash in Data is a free, open-source e-book authored by the fabulous Tim Erickson. Designed for both teachers and students, this rich, interactive resource walks users through introductory lessons in data science with CODAP (Common Online Data Analysis Platform). Embedded CODAP documents bring each lesson to life. Give it a go – create a data science club at your school!

From Awash in Data…

Introduction

This book leads you through a few introductory lessons in data science. You can think of it as a self-guided textbook for teachers or students. If you’re a teacher, you might assign chapters, problems, or projects for students to read and do.

In this book, you will use CODAP, the Common Online Data Analysis Platform, to do your data analysis. CODAP is free and web-based; that is, it runs in your browser. You do not even need to sign in or make an account to use it.

Smelling Like Data Science

Data science is becoming a big deal in our society. It’s a hot profession with lots of well-paid jobs. Even if you are not a data scientist, data science is in your life. Every time you do a web search, get directions on your phone, or see a recommendation for a movie, a song, or a brand of ketchup, somebody did some data science to bring you that information. When you hear about the latest unemployment figures or the trends in income inequality, that information comes from data science too. And when you hear people worry that some technological convenience will lead to greater surveillance and loss of privacy, once again, data science is what would make it possible.

It sounds like we had better learn about data science, but it must be super complicated, right? Data science involves huge data sets and sophisticated computing techniques like machine learning and artificial intelligence. Doesn’t it take years of study, and buckets of talent, to learn that stuff?

As with anything, it does take years to be an expert. And experts may disagree about talent. But there are underlying ideas and ways of thinking that you can experience right now, and that’s what we hope this book will give you. We’ll use medium-sized data sets (at most a few thousand cases at a time), along with a few common-sense techniques and a drag-and-drop data platform, to help you get an idea of what data science, for lack of a better term, smells like. When you’re done, you’ll be able to use that “sniff test” to recognize a data science problem; you’ll have a better idea of what went into the data that you see and use, making you a more critical and competent citizen; and you’ll be better able to study data science in earnest, if you so desire.

How should we start?

And even before that, what is data science, really?

As of the spring of 2020, the COVID spring, no one really knows. Still, everyone pretty much agrees that it lives somewhere in the borderlands between statistics and computer science.

We can use our emerging sniff test to recognize it. We’ll use two main ideas:

  • A data science problem often begins with a feeling of being awash in data.
  • Data science uses data moves to manipulate data.