1:30-2:50, M/W, Mini 4
Instructor: David Newbury
Phone: (773)-547-2272
Email: dnewbury@andrew.cmu.edu
We live in a world awash with data. Cheap storage, ubiquitous sensors, and an ever-more-connected world provide an enormous amount of information every day. As humanity moves from an environment of data scarcity to one of data overload, it is important to understand what can (and what should) be done with the ever-increasing amounts of information that society produces. Digital technology gives us the capability to manipulate and store that information, but communication of ideas is still a task for humans. How can we use those qualities that make us human to better understand our data-driven world?
Data visualization is the art of understanding how humans understand our visual environment, and of applying that knowledge to the visual representation of information. Through the technique of data visualization we can express the stories that underlie the data, and identify specific elements that allow us to explain, persuade, and inform ourselves and others.
There are two major practices of data visualization. One is exploratory data visualization, where visualization is used to understand new information. The other is explanatory data visualization, where the visualization is used to communicate a specific understanding of data. While the lines between these two practices are not well-defined, this course will focus primarily on explanatory data visualization.
This course will also focus on small data, by which I mean datasets that fit within a spreadsheet or within a computer's memory. It will focus on visualization, not data manipulation. And it will focus on techniques rather than on technologies. While software is an essential part of data visualization, the goal of this course is to focus on communicating information to people, and specific software is merely one means to an end.
At the end of this course, you will:
At the end of this course, you will have a complete case study, from brief to visualization, demonstrating your solution to a data visualization problem in a given domain.
This course is designed for graduate-level students with an interest in data visualization and communication. It does not assume or require any programming abilities beyond the ability to write basic spreadsheet formulas. It does not require sophisticated statistical abilities or math skills. It does require competence with spreadsheets and the ability to write clearly.
The course will also require a strong aptitude for self-guided learning of software tools. While we will demonstrate software like Tableau and Carto, the course will not focus on teaching you how to use specific software tools; instead, it will help you understand what tools you might want to use and direct you towards resources to learn those tools. I am always willing to help answer specific questions, but the focus of this course will be on the visualizations themselves, not on the tools we use to help make those visualizations.
As the required textbook, we will use Data Visualisation: A Handbook for Data Driven Design by Andy Kirk (ISBN 978-1-4739-1214-4). This will serve as a primary reference material for much of what we cover in class. It's available on Amazon, which is probably the most convenient way to get it.
(Note that there is a hardcover version of this book, but it's VERY expensive and intended for libraries. There's also an e-book version of it, which is slightly cheaper. It doesn't matter to me which one you get, but you will need to have a copy of it.)
In addition to this book, we will also heavily use online resources and examples, which will be available as links on the course website.
All software required for use in this course will be freely available, or freely available for non-commercial use. Other software can be used, of course, but will not be required.
Project Type | Number of Graded Parts | Total Grade % |
---|---|---|
In-Class Quizzes | 10 of 11 quizzes (3% each) | 30% |
Technology Reviews | 4 of 6 workshops (5% each) | 20% |
Final Project | 5 components (10% each) | 50% |
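The weights in the table above can be worked through as a small calculation. This is a minimal sketch, not an official grade calculator: it assumes every quiz, review, and project component is scored from 0 to 100, and that "10 of 11" and "4 of 6" mean the lowest scores are dropped. The function name `course_score` is hypothetical.

```python
def course_score(quizzes, reviews, project_parts):
    """Return the weighted course score (0-100).

    Assumes each individual item is scored on a 0-100 scale.
    """
    # Keep the best 10 of 11 quizzes and the best 4 of 6 technology reviews.
    best_quizzes = sorted(quizzes, reverse=True)[:10]
    best_reviews = sorted(reviews, reverse=True)[:4]
    # Weights from the table: 3% per quiz, 5% per review, 10% per project part.
    weighted = 3 * sum(best_quizzes) + 5 * sum(best_reviews) + 10 * sum(project_parts)
    return weighted / 100


# One missed quiz and two missed reviews do not lower a perfect score.
print(course_score([100] * 10 + [0], [100] * 4 + [0, 0], [100] * 5))
```

Under these assumptions, a missed quiz or review only costs points once the drops in that category are used up.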
Data visualization is fundamentally a subjective experience, and as such this will not be an exam-heavy course. Instead, much of the work will focus on projects, with additional grading of written critiques of existing visualizations. The course will focus on understanding the techniques of data visualization through developing creative visualizations using new tools to solve defined projects, and through deep analysis of existing visualizations.
There will be three categories of assignments for this course:
At the beginning of every class, we will have a short in-class quiz where we answer questions about a visualization. These questions will be based on criteria discussed in the readings for that class. These quizzes will not be graded. Towards the end of each class period, there will be a second quiz, using the same questions, but about a second visualization. These will be completed in-class as group work, where each group will consist of three students. The group quiz will be graded.
There will be 11 of these quizzes, and you can drop your lowest score. Each of these quizzes will be worth 3% of your final grade, and they can only be done in class. There will not be an opportunity to make these up after class, so if you must miss class for illness or family emergency, please let me know before the class to make arrangements.
Each Wednesday throughout the course we will be exploring a software tool for creating data visualizations. There will be an in-class demonstration and mini-workshop where you will be introduced to the week's tool, and there will also be a take-home assignment for each workshop that builds upon the work done in class. There will be six of these assignments, and you can drop your lowest two scores.
Week | Date | Tool | Link |
---|---|---|---|
Week 1 | March 22 | Google Sheets | https://www.google.com/sheets/ |
Week 2 | March 29 | Highcharts.js | http://www.highcharts.com |
Week 3 | April 5 | D3.js | https://d3js.org |
Week 4 | April 12 | Tableau Public | https://public.tableau.com/s/ |
Week 5 | April 19 | P5.js | https://p5js.org |
Week 6 | April 26 | Carto | https://carto.com |
These take-home explorations are designed to provide you with a structure for exploring the tool. I don't expect that you will master the tool in a single workshop, so these will be evaluated on effort and exploration, not on proficiency. They will involve customizing the work done in class, and as such are designed to be done individually. Projects that are substantially the same as another student's work will be given a 0.
The completed explorations are due the Monday following the workshop.
The second half of the course will be focused on working through the process of creating a single data communication project. We will take this project from initial analysis, through creation of a brief, identification of techniques, and creation of visualizations, to a final presentation. These visualization projects will be done as individual projects, but I highly recommend working in groups around the same underlying dataset, though on different briefs. This will allow teamwork and collaboration, while still allowing for individual variance.
The available datasets are:
There will be five parts to the communication project:
Each of these components will be worth 10% of your grade. Additional information about the details of these assignments will be provided later in the course.
Late work will not be accepted.
Laptops will be permitted in class, and will be required for use on Wednesdays. Please be respectful of the class and your classmates with your use of laptops by avoiding distracting content. If you feel the need to use your laptop for non-course purposes, please sit in the back two rows of the class. Phones should be on silent, and obviously you shouldn't talk on the phone during class. Laptops and phones cannot be used during quizzes.
If in-classroom technology becomes disruptive to the class, this policy will be revisited.
There is a graded in-class quiz as part of every class, so it is in your best interest to attend. Because you can drop one quiz, you can miss one class without penalty, but I suggest reserving this for an actual emergency. If you must miss class for illness, family emergency, or other valid reason, please let me know before the class to make arrangements.
We'll use the standard Heinz grading scale for this class.
Letter | Score | Letter | Score | Letter | Score |
---|---|---|---|---|---|
A+ | 99.0-100% | B+ | 88.0-90.9% | C+ | 78.0-80.9% |
A | 94.0-98.9% | B | 84.0-87.9% | C | 74.0-77.9% |
A- | 91.0-93.9% | B- | 81.0-83.9% | C- | 71.0-73.9% |
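As an illustration of how the scale above reads, here is a minimal sketch that looks up a letter for a percentage score. It assumes the lower bound of each band is the cutoff (so a score such as 90.95% falls in B+), and it returns `None` below 71.0% because the table lists no grades there. The names `GRADE_BANDS` and `letter_grade` are hypothetical.

```python
# Lower bound of each band in the grading table, highest first.
GRADE_BANDS = [
    (99.0, "A+"), (94.0, "A"), (91.0, "A-"),
    (88.0, "B+"), (84.0, "B"), (81.0, "B-"),
    (78.0, "C+"), (74.0, "C"), (71.0, "C-"),
]


def letter_grade(score):
    """Return the letter for a percentage score, or None if below the table."""
    for lower_bound, letter in GRADE_BANDS:
        if score >= lower_bound:
            return letter
    return None  # Assumption: the table does not define grades under 71.0%.


print(letter_grade(86.5))
```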
Most classes will have slides, and those slides will be made available after every class. If you feel the need to record the class, discuss this with me prior to recording. Any recordings made during class are for personal use only, and cannot be distributed in any fashion or format.
Please, don't eat in the class—it's distracting. Beverages are fine—I'll be drinking coffee, so it's only fair that you can, too.
Due to the sleeping habits of small children and professional obligations, I strongly prefer that you email me instead of calling or texting. I will make every effort to respond to emails as soon as I can.
If you wish to request an accommodation due to a documented disability, please inform your instructor and contact Disability Resources at [email protected] or 412-268-2013 as soon as possible.
Homework must be individual work unless otherwise stated. You are encouraged to consult each other on clarification, technical and conceptual issues, and on interpreting the data, but you must do individual problem solving and derive your own solutions, including your own computer and design work. If you have any question concerning whether an act is appropriate, please consult me or the appropriate university official before acting. The minimum penalty for cheating on an assignment is zero credit for the work submitted, and the maximum penalty is failing the course.
You are responsible for being familiar with the university standard for academic honesty and plagiarism. Please see the CMU Student Handbook for information. In order to deter and detect plagiarism, online tools and other resources are used in this class.
David Newbury is a developer and consultant who has been working at the intersection of technology and the arts for the past 15 years. He currently runs the Art Tracks project at the Carnegie Museum of Art, a research project exploring data standards and visualization around the history of ownership of art. He is also leading development efforts for the American Art Collaborative, a multi-museum exploration of Linked Data and its role in cultural heritage. David works regularly with the Frank-Ratchye Studio for Creative Inquiry at CMU, most recently on the Terrapattern project, a visual-similarity search project for satellite photography. He runs the Pittsburgh New Media Arts Meetup, taught interactive Design at the University of Illinois, is an instructor for the Western Pennsylvania Regional Data Center's Data 101 Seminar Series, and regularly lectures on data visualization and technology use in artistic practice.
Take care of yourself. Do your best to maintain a healthy lifestyle this semester and take some time to relax. This will help you achieve your goals and cope with stress.
All of us benefit from support during times of struggle. You are not alone. There are many helpful resources available on campus and an important part of the college experience is learning how to ask for help. Asking for support sooner rather than later is often helpful.
If you or anyone you know experiences any academic stress, difficult life events, or feelings like anxiety or depression, we strongly encourage you to seek support. Counseling and Psychological Services (CaPS) is here to help: call 412-268-2922 and visit their website at http://www.cmu.edu/counseling/. Consider reaching out to a friend, faculty or family member you trust for help getting connected to the support that can help.