Data collection - What is it and why is it important?

Meaningful data analysis relies on thorough and rigorous data collection. Let's look at data collection methods, strategies and considerations for collecting data, and ways to establish rigor and transparency so you can persuade your audience that your research is built on reliable data.
Susanne Friese
Product specialist, trainer and author of the book "Qualitative Data Analysis with ATLAS.ti"
Roehl Sybing
Content creator and qualitative data expert
  1. Data collection - an overview
  2. Data in research
  3. Data collection examples
  4. Types of data
  5. Data collection methods
  6. Considerations when collecting data
  7. Data organization

Data collection - an overview

The data collected for your study forms the basis of your analysis. Gathering data in a transparent and thorough manner strengthens the rest of your research and makes it more persuasive to your audience.

Figure 1: Interviews and focus groups are common forms of qualitative data collection.

We will look at the data collection process, the methods of data collection that exist in quantitative and qualitative research, and what ATLAS.ti can do to help you make sense of your collected data.

Data in research

Data can be any sort of information that people use to better understand the world around them. Having information to draw from prevents us from making blind guesses or acting on uneducated intuitions.

Necessity of data collection skills

As a result, collecting data is critical to the fundamental objective of research as a vehicle to organize knowledge. While this may seem intuitive, it's important to acknowledge that researchers must be as skilled in data collection as they are in data analysis.

Collecting the right data

Imagine a simple research question: what factors do people consider when buying a car? It would not be possible to ask every living person how they think about car purchases. Even if it were possible, not everyone drives a car, so asking non-drivers seems unproductive. As a result, the researcher conducting a study to devise data reports and marketing strategies has to take a sample of the relevant data to ensure reliable analysis and findings.
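
The car-buying example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `population` list, its `drives` field, and the `draw_sample` helper are all hypothetical, and a real study would build its sampling frame from actual records.

```python
import random

# Hypothetical sampling frame: 30 people, every third one is a non-driver.
population = [{"id": i, "drives": i % 3 != 0} for i in range(1, 31)]


def draw_sample(people, sample_size, seed=0):
    """Return a simple random sample drawn from drivers only."""
    # Keep only the people relevant to the research question.
    relevant = [person for person in people if person["drives"]]
    # Never ask for more respondents than the relevant group contains.
    size = min(sample_size, len(relevant))
    # A seeded generator makes the draw reproducible and reportable.
    rng = random.Random(seed)
    return rng.sample(relevant, size)


sample = draw_sample(population, 5)
```

Filtering before sampling keeps non-drivers out of the study, and fixing the seed means the same sample can be reported and reproduced later.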

Data collection examples

In the broadest terms, any sort of data gathering contributes to the research process. In any work of science, researchers cannot make empirical conclusions without relying on some body of data to make rational judgments.

Various examples of data collection in the social sciences include:
  • responses to a survey about product satisfaction
  • interviews with students about their career goals
  • results of an experimental vitamin supplement regimen
  • observations of workplace interactions and practices
  • transactional data about customer behavior

Data science and scholarly research do not pose many restrictions on data collection, except that the data set should be specific and clearly defined, ruling out any irrelevant data or unsupported intuitions while developing new theory or key findings.

Types of data

Researchers can collect data themselves (primary data) or use third-party data (secondary data). Which type to choose depends on what data is appropriate and relevant to the research.

First-party data

Original research relies on first-party data, or primary data that the researcher collects themselves for their own analysis. When you collect information yourself in a primary study, you are more likely to obtain data of the quality you require.

Because the researcher is most aware of the inquiry they want to conduct and has tailored the research process to their inquiry, first-party data collection has the greatest potential for reliability between the data collected and the insights that are generated as a result.

Ethnographic research, for example, relies on first-party data collection since a description of a culture or a group of people is contextualized through a comprehensive understanding of the researcher and their relative positioning to that culture.

Third-party data

Researchers can also use publicly available secondary data that other researchers have generated to produce new insights. Online databases and literature reviews are good examples where researchers can find existing data to conduct research on a previously unexplored inquiry. Third-party data carries considerations of data accuracy given that the researcher can only conduct limited quality control of the data that has already been collected.

Big data

A relatively new consideration in data collection and data analysis has been the advent of big data, where data scientists employ automated processes to collect data in large amounts.

Figure 2: Data collection can produce large amounts of data that can be challenging to process. Photo by Mike Baumeister.

The advantage of collecting data at scale is that a thorough analysis of a greater scope of data can potentially generate more reliable findings. However, this is a daunting task because it is time-consuming and arduous. Moreover, it requires skilled data scientists to sift through large data sets to filter out irrelevant data and generate useful insights.

Data collection methods

Different methods for gathering data exist depending on the research inquiry you want to conduct.

Quantitative data collection methods

Quantitative methods are used to collect numerical data, which can then be processed statistically to test hypotheses or gain insights. Quantitative data gathering is typically aimed at measuring a particular phenomenon (e.g., the amount of awareness a brand has in the market, the efficacy of a particular diet, etc.) in order to test hypotheses (e.g., that a particular diet is more effective than other diets).

Some qualitative methods of research can contribute to quantitative data collection and analysis. Online surveys and questionnaires with multiple choice questions can produce structured data ready to be analyzed. A survey platform like Qualtrics, for example, aggregates survey responses in a spreadsheet to allow for numerical or frequency analysis.
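
As a rough sketch of the frequency analysis mentioned above, the following Python counts the answers in one multiple-choice column of a spreadsheet export. The column names and the `SURVEY_CSV` data are made up for illustration and do not reflect the export format of any particular survey platform.

```python
import csv
import io
from collections import Counter

# Hypothetical export: one row per respondent, one column per question.
SURVEY_CSV = """respondent_id,satisfaction
r1,Satisfied
r2,Neutral
r3,Satisfied
r4,Dissatisfied
r5,Satisfied
"""


def response_frequencies(csv_text, column):
    """Count how often each answer appears in one multiple-choice column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column] for row in reader)


def response_shares(counts):
    """Convert raw counts into proportions of all responses."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {answer: n / total for answer, n in counts.items()}


counts = response_frequencies(SURVEY_CSV, "satisfaction")
shares = response_shares(counts)
```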

Qualitative data collection methods

Analyzing qualitative data is important for describing a phenomenon (e.g., the requirements for good teaching practices), which may lead to the creation of a hypothesis or the development of a theory. Behavioral data, transactional data, and data from social media monitoring are all different forms of data that can be collected qualitatively.

Consideration of tools or equipment for collecting data is also important. Primary data collection methods in observational research, for example, employ tools such as audio and video recorders, notebooks for writing field notes, and cameras for taking photographs. As long as the products of such tools can be analyzed, those products can be incorporated into a study's data collection.

Employing multiple data collection methods

Moreover, qualitative researchers seldom rely on one data collection method alone. Ethnographic researchers, in particular, incorporate direct observation, interviews, focus group sessions, and document collection in their data collection process to produce the most contextualized data for their research. Mixed methods research employs multiple data collection methods, including qualitative and quantitative data, and just as many tools to study a phenomenon from as many different angles as possible.

Figure 3: Primary data collection tools can be as simple as notebooks or smartphone cameras. Photo by Kari Shea.

New forms of data collection

External data sources such as social media data and big data have also gained contemporary focus as social trends change and new research questions emerge. This has prompted the creation of other data collection methods beyond traditional processes in research.

Ultimately, there are countless data collection instruments used for qualitative methods, but the key objective is to be able to produce data that can be analyzed. ATLAS.ti can help researchers analyze text, audio, video, and images to accommodate as many forms of data as possible.

Considerations when collecting data

Research relies on empiricism and credibility at all stages of a research inquiry. As a result, there are various data collection problems and issues that researchers need to keep in mind.

Data quality issues

Think about a picture taken with a smartphone camera and a picture taken with a professional camera. If your analysis depends on capturing the fine-grained details that some data collection tools may miss, then you should carefully consider data quality issues regarding the precision of your data collection.

Quantitative data collection especially relies on precise data collection tools to evaluate outcomes, but researchers conducting qualitative data collection should also be concerned with quality assurance for the collected data. If a study involving direct observation requires multiple observers in different contexts, researchers should take care that all observers gather data in a similar fashion so that all data can be analyzed in the same way.

Figure 4: Quality assurance in data collection requires ensuring data consistency. Photo by Julia Koblitz.
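
One way to check whether two observers record the same events consistently is to compare their labels for the same set of observations. The sketch below computes simple percent agreement and Cohen's kappa, a standard chance-corrected agreement statistic that the text above does not itself name. The observer labels are invented for illustration.

```python
from collections import Counter


def percent_agreement(labels_a, labels_b):
    """Share of observations on which both observers chose the same label."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both observers must label the same non-empty set.")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


def cohens_kappa(labels_a, labels_b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(labels_a)
    observed = percent_agreement(labels_a, labels_b)
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: probability both pick the same label independently.
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # Both observers used a single identical label throughout.
    return (observed - expected) / (1 - expected)


# Hypothetical labels from two observers watching the same eight moments.
observer_a = ["talk", "talk", "silent", "talk", "silent", "silent", "talk", "silent"]
observer_b = ["talk", "talk", "silent", "silent", "silent", "silent", "talk", "talk"]

agreement = percent_agreement(observer_a, observer_b)
kappa = cohens_kappa(observer_a, observer_b)
```

A low kappa on a pilot round would suggest that the observers need a shared protocol or further training before full data collection begins.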

Data quality is a crucial consideration when gathering information. Even if the researcher has chosen an appropriate method for data collection, is the data that they collect useful and detailed enough to provide the necessary analysis to answer the given research inquiry?

Interviews and focus groups are one example where data quality is consequential in qualitative data collection. Recordings may lose some of the finer details of social interaction such as pauses, filler words, or utterances that aren't loud enough for the microphone to pick up. If you are conducting an interview for a study where such details are relevant to your analysis, then you should consider employing tools that collect sufficiently rich data to capture these aspects of interaction.

Data integrity

Inaccurate data can confound the data analysis process, as faulty data makes drawing conclusions and making decisions difficult or impossible. Failure to establish the integrity of data collection can cast doubt on the findings of a given study.

Accurate data collection is just one aspect researchers should consider to protect data integrity. After that, it is a matter of preserving the data after data collection. How is the data stored? Who has access to the collected data? To what extent can the data be changed between data collection and research dissemination?

Owing to these questions, data integrity is an issue of research ethics as well as research credibility. The researcher needs to establish that the data presented for research dissemination is an accurate representation of the phenomenon under study.
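
One possible way to make "has the data changed since collection?" a checkable question is to record a checksum for every file when it is collected and compare it again before dissemination. The sketch below uses SHA-256 from Python's standard library. The function names and the idea of a per-folder manifest are assumptions made for illustration, not a prescribed procedure.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file's relative path to its digest at collection time."""
    root = Path(folder)
    return {
        path.relative_to(root).as_posix(): file_digest(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(folder, manifest):
    """List files altered, removed, or added since the manifest was built."""
    current = build_manifest(folder)
    return sorted(
        name
        for name in set(manifest) | set(current)
        if manifest.get(name) != current.get(name)
    )
```

Storing the manifest separately from the data, with restricted write access, also helps answer the question of who is able to change the collected data.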

Imagine a photograph of wildlife that has aged so much that its colors have become distorted. If the findings depend on describing the colors of a particular animal or plant, then not preserving the integrity of the data presents a serious threat to the credibility of the research and the researcher.


Transparency

As explored earlier, researchers rely on both intuition and data to make interpretations about the world. As a result, researchers have an obligation to explain how they collected data and how much of it they collected. Establishing research transparency allows other researchers to examine a study and determine if it is credible.

To address this need, research papers typically have a methodology section, which includes the tools employed for data collection and the breadth and depth of the data that is collected for the study. Absent this description, the research in question may not be sufficiently transparent for scholars to analyze, critique, and develop.


How to gather data is also a key concern, especially in social sciences where people's perspectives represent the collected data. In interviews and focus groups, how questions are framed may change the nature of the answers that participants provide. In market research, researchers have to carefully design questions when working with customer data and customer surveys to gather feedback. Even in the hard sciences, researchers have to consider whether the data collection equipment they use for gathering data produces accurate data sets for analysis.

Finally, the different methods of data collection raise questions about whether the data says what we think it says. Consider how researchers might establish monitoring systems for online tracking as a source of behavioral data. When a user spends a certain amount of time on a mobile app, are they deeply invested in using the app, or do they leave it open while they work on other tasks?

Data collection is only as useful as the extent to which the resulting data analysis leads to useful determinations about the research inquiry being pursued. While it is tempting to collect as much data as possible, the inferences that the researcher makes when examining and analyzing the data are ultimately what determine the impact of the research.

Data organization

Data analysis after collecting data is only possible if the data is sufficiently organized into a form that can be easily sorted and understood. Imagine collecting social media data, which could be millions of posts from millions of social media users every day. You can dump every single post into a file, but how can you make sense of it?

Data organization is especially important when dealing with unstructured data. The researcher needs to structure the data in some way that facilitates the analytical process, and ATLAS.ti can facilitate this process in numerous ways.


Transcripts

Collecting data in focus groups, interviews, or other similar interactions produces raw video and audio recordings. ATLAS.ti certainly allows you to view and listen to these recordings directly so you can code them for later analysis. However, most traditional analyses of interview and focus group data benefit from conversion into text form.

Recordings are typically transcribed so that the text can be analyzed and incorporated into research papers or presentations. Transcription can be a tedious task, especially if a researcher has to deal with hours of audio data. These days, researchers have the choice of manually transcribing their raw data or using automated transcription services to greatly speed up this process.

In ATLAS.ti, researchers can view videos side-by-side with synchronized transcripts that they can directly edit. Moreover, researchers can also view or hear the non-verbal elements of their raw data while they view the transcripts to facilitate a richer analysis. Researchers manually transcribing their recordings in ATLAS.ti can easily control playback of the data as they transcribe. For researchers importing already created transcripts, ATLAS.ti can automatically code for each person speaking in the data.

Survey data

In popular online survey platforms, customer data found in surveys is aggregated in a spreadsheet that presents all records in an easy-to-reference manner. The Import Survey tool in ATLAS.ti can convert that spreadsheet into manageable documents so that records can be viewed and analyzed one at a time. This allows researchers to narrow their inquiry to a specific set of respondents for deeper analysis.
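
The general idea of turning one spreadsheet into one document per respondent can be sketched as follows. This is not how the Import Survey tool is implemented. It is a minimal illustration with a made-up export (`SURVEY_EXPORT`) and a hypothetical helper (`rows_to_documents`).

```python
import csv
import io

# Hypothetical export: one row per respondent, one column per question.
SURVEY_EXPORT = """respondent_id,age_group,What do you like about the product?,What would you improve?
r1,18-29,Easy to set up,Battery life
r2,30-44,Good price,
"""


def rows_to_documents(csv_text, id_column):
    """Turn each spreadsheet row into one text document keyed by respondent."""
    documents = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        lines = [
            f"{question}: {answer}"
            for question, answer in row.items()
            # Skip the identifier itself and any unanswered questions.
            if question != id_column and answer
        ]
        documents[row[id_column]] = "\n".join(lines)
    return documents


documents = rows_to_documents(SURVEY_EXPORT, "respondent_id")
```

Each value in `documents` can then be read and analyzed on its own, which mirrors the record-by-record view described above.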

Field notes and artifacts

In ethnographic research or research involving direct observation, gathering data often means writing notes or taking photographs during field work. While field notes can be typed into a document for analysis in ATLAS.ti, the researcher can also scan their notes into an image or a PDF, either of which can also be incorporated into an ATLAS.ti project. This degree of flexibility allows researchers to code all forms of data that aren't textual in nature but can still provide useful data points for analysis and theoretical development.

Figure 5: Data collection through taking notes in observational research. Photo by Tom Rogerson.


Coding

Coding is among the most fundamental skills in qualitative research as its main role is to reduce large data sets into patterns of compact codes for later analysis. If you are dealing with dozens or hundreds of pages of qualitative data, then applying codes to your data is a key method for condensing, synthesizing, and understanding the data.

Figure 6: Coding data in ATLAS.ti makes the collected data easier to understand and analyze.
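
To show how codes condense data outside of any particular tool, here is a small Python sketch. The `coded_segments` list, its quotations, and the code names are all invented. The structure of document, quotation, and applied codes is an assumption chosen for illustration rather than a representation of ATLAS.ti's internal format.

```python
from collections import Counter

# Hypothetical coded segments: (document, quotation, codes applied).
coded_segments = [
    ("interview_01", "I mostly looked at the price.", ["cost"]),
    ("interview_01", "My brother recommended the brand.", ["social influence", "brand"]),
    ("interview_02", "It had to be cheap to insure.", ["cost"]),
    ("interview_02", "I trust that brand.", ["brand"]),
]


def code_frequencies(segments):
    """Count how many quotations each code has been applied to."""
    return Counter(code for _, _, codes in segments for code in codes)


def quotations_for(segments, code):
    """Retrieve every quotation tagged with a given code."""
    return [(doc, quote) for doc, quote, codes in segments if code in codes]


frequencies = code_frequencies(coded_segments)
cost_quotes = quotations_for(coded_segments, "cost")
```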


Document groups

Document groups are an important tool in ATLAS.ti when dealing with large numbers of documents, because documents can be classified into multiple categories. This means that if you are looking for a specific set of data, you can look at documents in a particular group while filtering out all irrelevant data for your inquiry.

Figure 7: Working with document groups in ATLAS.ti gives you access to the relevant data you're looking for.
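
The logic of classifying a document into several groups and then filtering by group can be illustrated with plain Python sets. The file names, group names, and the `documents_in` helper below are hypothetical and only model the concept, not ATLAS.ti's own implementation.

```python
# Hypothetical project: each document can belong to several groups at once.
document_groups = {
    "interview_01.txt": {"interviews", "first-time buyers"},
    "interview_02.txt": {"interviews", "repeat buyers"},
    "survey_r1.txt": {"survey", "first-time buyers"},
    "field_notes.pdf": {"observations"},
}


def documents_in(groups_by_document, *required_groups):
    """Return documents that belong to every one of the requested groups."""
    wanted = set(required_groups)
    return sorted(
        name
        for name, groups in groups_by_document.items()
        if wanted <= groups  # Subset test: all requested groups are present.
    )


interviews = documents_in(document_groups, "interviews")
first_time_interviews = documents_in(document_groups, "interviews", "first-time buyers")
```

Requesting more than one group narrows the selection, which is how irrelevant documents are filtered out of a given inquiry.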
