What is data? The word means many things to many people. (Consider "data" as it relates to your phone contract, for instance!) For our purposes, a definition we like is "units of information observed, collected, or created in the course of research".
Data observed, collected, or created for research purposes can be numbers, text, images, audio clips, and video clips. But in this section on using data as sources, we're going to concentrate on numerical data.
Using numeric data in those portions of your final product that require evidence can really strengthen your argument for your answer to your research question. At other times, even if data are not strictly necessary, numeric data can be particularly persuasive and sharpen the points you want to make in other portions of your final product devoted to, say, describing the situation surrounding your research question.
For example, for a term paper about the research question "Why is there a gap in the number of people who qualify for food from foodbanks and the number of people who use foodbanks?," you could find data on the website of Feeding America, the nation's largest network of foodbanks. Some of that data may be the number of people who get food from a foodbank annually, with the number of seniors and children broken out. Those data won't answer your research question, but they will help you describe the situation around that question and help your audience develop a fuller understanding.
Similarly, for a project with the research question "How do some birds in Australia use 'smart' hunting techniques to flush out prey, including starting fires?," you might find a journal article with data about how many people have observed these techniques and estimates of how frequently the techniques are used and by how many bird species.
There are two ways of obtaining data:
Numeric research data can be found all over the place. A lot of it can be found as part of another source, such as books; journal, newspaper, and magazine articles; and web pages. In these cases, the data do not stand alone as a distinct element, but instead are part of the larger work.
When searching for data in books and articles and on web pages, terms such as statistics or data may or may not be useful search terms. That's because many writers don't use those terms in their scholarly writing. They tend to use the words findings or results when talking about the data that could be useful to you. In addition, statistics is a separate discipline and using that term will turn up lots of journals in that area, which won't be helpful to you. So use the search terms data and statistics with caution, especially when searching library catalogs.
Even without using those search terms, many scholarly sources you turn up are likely to contain data. Once you find potential sources, skim them for tables, graphs, or charts. These items are displays or illustrations of data gathered by researchers. However, sometimes data and interpretations are solely in the body of the narrative text and may be included in sections called "Results" or "Findings". (That shouldn't keep you from displaying the data in charts, graphs, or tables as you like in your own work, though. See Data Visualization later in this section).
If the data you find in a book, article, or web page is particularly helpful and you want more, you could contact the author to request additional numeric research data. Researchers will often discuss their data and its analysis – and sometimes provide some of it (or occasionally, all). Some may link to a larger numeric research data set. However, if a researcher shares his or her data with you, it may be in a raw form. This means that you might have to do additional analysis to make it useful in answering your question.
Depending on your research question, you may need to gather data from multiple sources to get everything you need to answer it and make your argument for that answer.
For instance, in our example related to foodbanks above, we suggested where you could find statistics about the number of people who get food from American foodbanks. But with that research question ("Why is there a gap in the number of people who qualify for food from foodbanks and the number of people who use foodbanks?"), you would also need to find out from another source how many people qualify for foodbanks based on their income and compare that number with how many people actually use foodbanks.
Finding Data, Data Depositories, and Directories
Sometimes the numeric research data you need may not be in the articles, books, and web sites that you've found. But that doesn't mean that it hasn't been collected and packaged in a useable format. Governments and research institutions often publish data they have collected in discipline-specific data depositories that make data available online. Here are some examples:
The United Nations and just about every country provide information as numeric data available online. Free and accessible data like this is called open data. The U.S. federal government, all states, and many local governments provide open data. You can find them (among other places) at site: .gov.
Other data are available through vendors who publish the data collected by researchers. Here are some examples:
Don't know of a depository that could contain data in your discipline? Check out a data directory such as re3data.org.
Evaluating data for relevance and credibility is just as important as evaluating any other source. And as with any other source, there is never a 100% perfect source of data. So just as is pointed out in Evaluating Sources, you'll have to make educated guesses (inferences) about whether the data are good enough for your purpose.
Critical thinking as you evaluate sources is something your professors will expect. But you'll benefit in other ways, too, because you'll be practicing a skill necessary for the rest of your life, both in the workplace and in your personal life. It's those skills that will keep you from being duped by fake news and taken advantage of by posts that are ignorant or, sometimes, simply scams.
To evaluate data, you'll need to find out how the data were collected. If the data are in another source, such as a book; web page; or newspaper, magazine, or research journal article, evaluate that source in the usual way. If the book or newspaper, magazine, or web page got the data from somewhere else, do the same evaluation of the source from which the book or article got the data. The article, book, or web page should cite where the data came from. If it doesn't, then that is a black mark against using that data. (The data in a research journal article are often the work of the authors of the article. But you'll want to be sure they provide information about how they collected the data).
In addition, if the data are in a research journal article, read the entire article, including the section called Methodology, which tells how the data were collected. Then determine the data's relevance to your research question by considering such questions as:
Research articles are sometimes difficult to read until you get used to them. Here's a helpful PDF: https://violentmetaphors.files.wordpress.com/2018/01/how-to-read-and-understand-a-scientific-article.pdf
To evaluate the credibility of the data in a research journal article you have already read, take the steps recommended in Evaluating Sources, plus consider these questions:
Modern software can help you display your data in ways that are striking and often even beautiful. But the best criterion for judging whatever display you use is whether it helps you and your audience understand your data better than text alone would, perhaps even revealing points that you would otherwise have missed.
Specific kinds of charts and graphs accomplish different things, which is important to keep in mind as you evaluate data and data sources. For instance:
It's important to decide what you want a display to do before making your final choice. Studying your data first so you know what you have will help you make that decision. It may also be conventional in your discipline to display your data in certain ways. Examining the sources you were assigned to read in your course or asking your professor will help you learn what's considered conventional.
Your professors will be examining your visual display to make sure you did not misrepresent the data. For example, the proportions of slices in a pie chart all have to add up to 100%. If yours don't, you've done something wrong.
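If you compute your slices from raw counts, a quick check can catch this kind of mistake before you build the chart. The sketch below is a minimal Python illustration, not a required method: the function names and the survey counts are invented for this example, and the 0.1 tolerance is an assumption meant to allow for rounding.

```python
import math


def pie_percentages(counts):
    """Convert raw category counts into percentages of the whole."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("counts must add up to a positive total")
    return {label: 100 * n / total for label, n in counts.items()}


def slices_are_valid(percentages, tolerance=0.1):
    """Return True if the slices add up to 100% within a small tolerance."""
    return math.isclose(sum(percentages.values()), 100.0, abs_tol=tolerance)


# Hypothetical survey counts, invented for illustration only.
counts = {"Weekly": 12, "Monthly": 30, "Rarely": 18}
percentages = pie_percentages(counts)
print(percentages)
print(slices_are_valid(percentages))

# Slices typed in by hand that double-count some responses fail the check.
print(slices_are_valid({"Weekly": 25.0, "Monthly": 50.0, "Rarely": 35.0}))
```

Percentages that have been rounded for labels (for example, three equal slices shown as 33.3% each) may add up to 99.9% rather than exactly 100%, which is why the check allows a small tolerance.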
It's easy to get overwhelmed by all the choices to be made between potential displays and what each can do. Here are two sites to help you sort them out once you know your data:
http://datavizproject.com/
https://datavizcatalogue.com/
If you aren't ready yet to use some of the specialized tools for display, make it a point to learn how to use the data display capabilities in Microsoft Word and/or Excel. You can find helpful tutorials on the Web. Good search statements to find those tutorials are:
If you are OSU staff, students, or faculty, OSU Libraries' Research Commons can help you choose a display, recommend a tool to accomplish it, and check out your finished data visualization before you have to turn it in. Contact the data visualization specialist.
If you are interested in displaying geospatial data on a map, consider how the Research Commons also helps OSU students, staff, and faculty find geospatial data and choose tools to display them.
Data is not copyrightable, but the expression of data is. So as with any other information source, you should cite any data you use from a source, whether it appeared in an article or you downloaded the data from a repository on the Web.
Unfortunately, data citation standards do not exist in many disciplines, although the DataCite initiative is working on them. Current workarounds include:
Once you have your data, you can examine them and make an interpretation. Sometimes, you can do so easily. But not always.
What if…
…you had a lot of information? Sometimes data can be very complicated and may include thousands (or millions…or billions…or more!) of data points. Suppose you only have a date and the high temperature for Columbus – but you have this for 20 years' worth of days. Do you want to calculate the average highs for each month based upon 20 years' worth of data by hand or even with a calculator?
…you want to be able to prove a relationship? Perhaps your theory is that social sciences students do better in a certain class than arts and humanities or life and physical science students. You may have a huge spreadsheet of data from 20 years' worth of this course's sections and would need to use statistical methods to see if a relationship between major and course grade exists.
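The temperature scenario above is the kind of task where a few lines of code can replace hours of hand calculation. The following Python sketch assumes your data are pairs of a date and a high temperature. The column names, the function name, and the four sample rows are made up for illustration; a real 20-year file would have roughly 7,300 rows.

```python
import csv
import io
from collections import defaultdict
from datetime import date
from statistics import mean


def monthly_average_highs(rows):
    """Average the daily highs for each calendar month across all years.

    rows: iterable of (date, high_temperature) pairs.
    Returns a dict mapping month number (1-12) to the mean high.
    """
    highs_by_month = defaultdict(list)
    for day, high in rows:
        highs_by_month[day.month].append(high)
    return {month: mean(values) for month, values in sorted(highs_by_month.items())}


# Made-up sample standing in for a much larger file of daily highs.
SAMPLE_CSV = """date,high_f
2001-01-15,34
2002-01-15,38
2001-07-04,86
2002-07-04,90
"""

reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
rows = [(date.fromisoformat(r["date"]), float(r["high_f"])) for r in reader]

averages = monthly_average_highs(rows)
for month, avg in averages.items():
    print(f"Month {month:2d}: {avg:.1f}")
```

To use it with your own file, you would replace the sample text with an opened CSV file whose columns match the names used here.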
You may find yourself using special software, such as Excel, SAS, and SPSS, in such situations.
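For the question about majors and course grades, one approach that can be sketched without special software is a permutation test: shuffle the grades among the groups many times and see how often chance alone produces a gap between group averages as large as the one you observed. This is only one of several possible statistical methods, chosen here because it is easy to follow. The function names, the number of shuffles, and the grades below are all invented for illustration, and a small result suggests a relationship rather than proving one.

```python
import random
from statistics import mean


def spread_of_means(groups):
    """Gap between the highest and lowest group average."""
    means = [mean(g) for g in groups]
    return max(means) - min(means)


def permutation_p_value(groups, n_shuffles=5000, seed=0):
    """Estimate how often random regrouping gives a gap at least as large as observed."""
    rng = random.Random(seed)
    observed = spread_of_means(groups)
    pooled = [grade for g in groups for grade in g]
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        # Deal the shuffled grades back into groups of the original sizes.
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if spread_of_means(shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate above zero.
    return (hits + 1) / (n_shuffles + 1)


# Hypothetical course grades by major, invented for illustration only.
grades_by_major = {
    "social sciences": [88, 91, 85, 90, 87, 92],
    "arts and humanities": [78, 82, 75, 80, 79, 81],
    "life and physical sciences": [80, 77, 83, 79, 81, 78],
}

p = permutation_p_value(list(grades_by_major.values()))
print(f"Estimated p-value: {p:.4f}")
```

A value close to 0 means a gap this large rarely appears by chance; a value close to 1 means the observed gap is what random grouping would typically produce.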
Many people may have a tendency to look for data to prove their hypothesis or idea, as opposed to really answering their research questions. However, you may find that the opposite happens: the data may actually disprove your hypothesis. You should never try to manipulate data so that it gives credence to your desired outcome. While it may not be the answer you wanted to find, it is the answer that exists. You may, of course, look for other sources of data – perhaps there are multiple sources of data for the same topic with differing results. Inconclusive or conflicting findings do happen and can be the answer (even if it's not the one you wanted!).
Conflicting results on the same topic are common. This is the reality of research because, after all, the questions researchers are studying are complicated. When you have conflicting results you can't just ignore the differences – you'll have to do your best to explain why the differences occurred.
Source: Teaching and Learning, Ohio State University Libraries, https://ohiostate.pressbooks.pub/choosingsources/chapter/data-as-sources/
This work is licensed under a Creative Commons Attribution 4.0 License.