Managing Tensions between Theory and Practice: An Educator’s Guide to Data Saturation with ATLAS.ti
This blog article was written by one of our Certified ATLAS.ti Professional Trainers, Kenny Cheah, Ph.D. (Department of Educational Management, Planning and Policy, Faculty of Education, University of Malaya, Kuala Lumpur, Malaysia).
In this ATLAS.ti blog, I shall take the opportunity to share my views on data saturation with the computer-assisted qualitative data analysis software (CAQDAS) ATLAS.ti. Without a doubt, it is one of the questions my students ask most frequently in the ATLAS.ti training that is held quarterly at my university. Usually, I explain the meaning of saturation as a concept before I guide them through its application with ATLAS.ti. As I am writing this blog for a wider audience, I foresee that I may not exactly address the reader’s specific context, concerns, or area of work. However, if you are in the field of education as I am, you may find this approach relevant for teaching your students about this matter.
Definition of Saturation Point in Qualitative Research
As a synthesis of my understanding, data saturation is regarded as the point in the research process when additional data no longer adds to the level of knowledge. While the process of data analysis is usually reflective and iterative, this point can be illustrated by a flattened curve, as shown in Figure 1 below. Reaching this stage signals to researchers that further data collection can be discontinued.
Figure 1: Illustration to indicate the saturation point
Under the flattened curve is the discovered knowledge of the researcher, while the red line above the saturation point (flattened curve) indicates the undiscovered knowledge if there is no limitation to research. Metaphorically speaking, if the researcher is likened to a sponge, then the amount of water that the sponge can hold is dependent on its capacity to hold the water (that symbolizes discovered knowledge). The capacity of the sponge can also symbolize the limitations of the researcher and the research design. In academia, the contributions of all researchers build the body of knowledge.
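For readers who like to see the idea in numbers, the flattened curve can be approximated by counting how many distinct codes have been discovered after each successive document. The following is only a minimal sketch in Python, outside ATLAS.ti: the function name `cumulative_new_codes` and the code labels are hypothetical, and it assumes that you have already listed which codes were applied to each document.

```python
from typing import List, Set


def cumulative_new_codes(codes_per_document: List[Set[str]]) -> List[int]:
    """Return the running total of distinct codes after each document."""
    seen: Set[str] = set()
    curve: List[int] = []
    for codes in codes_per_document:
        seen |= codes            # add any codes not encountered before
        curve.append(len(seen))  # total distinct codes discovered so far
    return curve


# Hypothetical codes applied to four successive documents.
documents = [
    {"vision", "delegation", "mentoring"},
    {"vision", "mentoring", "community"},
    {"delegation", "community"},
    {"vision", "mentoring"},
]

curve = cumulative_new_codes(documents)
print(curve)  # [3, 4, 4, 4]
```

In this made-up example, the total stops rising after the second document, which is the numerical counterpart of the flattened curve. It says nothing about the undiscovered knowledge above the curve, which remains a matter of research design and limitations.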
Critiques of Data Saturation in Qualitative Research
To critics, data saturation is a weak concept in qualitative research although it is commonly used to indicate the research is sufficient. Some of my colleagues argued that the saturation point should not be regarded as the only factor to stop the research from continuing because there is a possibility that new knowledge can be found if the research is continued with more samples, and across other locations.
Due to this issue, my students constantly worry that their samples are ‘not enough’, or that they should try a quantitative or mixed-method approach ‘to satisfy’ the examiners’ expectations. Therefore, I need to consider the functions of ATLAS.ti and how they could be used to support students in their candidature defense when this question is put to them.
Operationalizing Data Saturation in Qualitative Research
So how do I address my students’ concerns? What are the feasible ways to convince examiners, with ATLAS.ti, that saturation has been reached? In one of the training sessions, I started by asking my students to explain their understanding of triangulation. To facilitate their learning, I used the definition given by Fusch and Ness (2015) to explain my point:
“Data saturation is reached when there is enough information to replicate the study, when the ability to obtain additional new information has been attained, and when further coding is no longer feasible.”
(Cited from Fusch, P. I., & Ness, L. R. (2015). Are we there yet? Data saturation in qualitative research. The Qualitative Report, 20(9), 1408.)
As a trainer, I then decoded this message by breaking the statement into a few points so that it could be applied in practice in ATLAS.ti. I organized the key characteristics of data saturation into key pointers, interpretation, and application in ATLAS.ti, as indicated in Table 1 below.
|Key pointers||Interpretation||Application in ATLAS.ti|
|a. “Enough information”||Information to the researcher(s) is enough to address the research questions set out in the purpose of the study||
1. Scope the research objectives and corresponding questions in the project comment section.
2. Justify to yourself in a memo, what you want or do not want to investigate in your study
3. Operationalize the unit of analyses in the research and define in the comment section as a code group, or the individual code itself.
|b. “Replication of study”||Research design in the study could facilitate future researchers to repeat the study be it for comparisons, generalization, predictions or other relevant application of research||
1. Use a memo to describe the researcher’s reflective and iterative process of data analyses.
2. Look at the codes, group them into the code groups that you think could address the primary research question.
3. Use the memo, or the comment section of the code groups, to describe vividly your encounters, experiences, and the evaluative adjustments between the plan and its implementation.
4. You can also convert the memo into a document and code it according to the unit of analysis, as a method of processual analysis for investigator and methodological triangulation.
|c. “Ability to obtain new information has been attained”||Data collection methods have been exhausted, or there are no better or innovative ways to collect quality data||
1. Use the Code-Document table in ATLAS.ti to show the data sources.
2. Apply the triangulation method of analyses across sources.
3. Depend on the memo to see the progressions of your thoughts, the repetition of information, and its familiarity with you.
4. Evaluate the number of your questions with the quality of the respondents’ answers so that they are within the scope of your unit of analyses. If there is no need to add further because the information is sufficient, then you could stop asking new ones. However, if you do not intend to use the unit of analyses to limit your curiosity, you can keep exploring from one idea to another by asking further questions in the memo. This could lead to new sources of information as you think out of the box for more questions.
|d. “Further coding is no longer feasible”||Codes are densely numbered with quotations, and the tendency of reusing codes increases as more sources of data are analyzed||
1. Analyze the data in the Code Manager and Quotation Manager to evaluate the density of codes.
2. Follow this with the Code-Document Table analysis in ATLAS.ti.
3. Scan across the density counts in the Code-Document Table. If the codes are set as rows and the documents as columns, observe the code-density patterns from left to right.
4. Use a memo to evaluate the saturation stage of your research with a checklist of questions (see Figure 5 later in this article).
Table 1: Key pointers, interpretation, and application in ATLAS.ti
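The left-to-right scan described in row d of Table 1 can also be pictured outside ATLAS.ti. The sketch below is not ATLAS.ti’s own function or export format. It simply assumes that you have a list of coded quotations as (document, code) pairs, and the names `code_document_table`, `P1`, `P2`, and the code labels are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def code_document_table(
    quotations: List[Tuple[str, str]],
    codes: List[str],
    documents: List[str],
) -> Dict[str, List[int]]:
    """Count quotations per code (rows) and per document (columns)."""
    counts = Counter(quotations)  # keys are (document, code) pairs
    return {
        code: [counts[(doc, code)] for doc in documents]
        for code in codes
    }


# Hypothetical coded quotations: (document, code).
quotations = [
    ("P1", "vision"), ("P1", "vision"), ("P1", "delegation"),
    ("P2", "vision"), ("P2", "mentoring"), ("P2", "mentoring"),
]
documents = ["P1", "P2"]
codes = ["vision", "delegation", "mentoring"]

table = code_document_table(quotations, codes, documents)
for code, row in table.items():
    print(f"{code:<12}", row)  # read each row from left to right
```

Reading each row from left to right shows whether a code keeps being reused in later documents or appears only once, which is the same pattern you would look for in the Code-Document Table.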
Factors to Data Saturation in Qualitative Research
It can be said that the saturation point relates not only to the level of knowledge, but also to the number of samples, or sample size, in qualitative research. A good practice is to determine which sample size gives the research the best opportunity to reach data saturation. Hence, the researcher always needs to set clear criteria for purposive sampling before the research begins. Also, neither a large nor a small sample size guarantees that data saturation can be reached within a stipulated time frame.
The sample size of the research also depends on whether the samples are homogeneous or heterogeneous. If the samples are heterogeneous, such as those spanning different contexts, more data needs to be collected than for a homogeneous sample found in a single context. For this reason, I encourage novice researchers to conduct case studies (be they single or multiple) in a homogeneous context first, to familiarise themselves with the process of reaching saturation points.
In reality, there are many other factors, both internal and external to the researcher, that help explain the saturation point of a study. As such, the researcher should draw on his or her journey and experience to probe, identify, and explain these processual factors so that the research community can understand this point of the research and ultimately accept the study as trustworthy or rigorous. Another factor to highlight is that not all knowledge can be coded into explicit knowledge, owing to internal limitations (like language and sensory ability) and external limitations (like cultural norms and the effectiveness of the communication cycle). These difficulties contribute to the researcher’s challenges in reaching a saturation point, and are best explained in the limitations of the study. When the limitations of the research are highlighted, the study becomes acceptable because its blind spots are self-acknowledged, which in turn invites future researchers to address those limitations in their own research.
Techniques to Demonstrate Data Saturation with ATLAS.ti
In the following discussion, I shall guide you through the use of ATLAS.ti to indicate the data saturation point of one unit of analysis. I was investigating the leadership strategies of two award-winning principals in my city. In terms of planning, I scoped my research objectives to this unit of analysis in a theoretical and practical sense, and I specifically chose the two principals so that I could write about their success stories in one of the local educational leadership journals.
Suppose I do not have much prior knowledge about principals’ strategies. I will therefore have to read and code their transcripts. In terms of school context, I have data to justify why I needed the two principals to be homogeneous in terms of school characteristics and the environmental challenges within them. Additionally, I must set the boundary in my research objective by stating that its core purpose is to compare the similarities of leadership strategies between the two. In ATLAS.ti, these points can be written in the Project comment section.
After all their transcripts are read and coded, I use the Code-Document Table to analyze the density of the codes across each successive article that I read. As the codes are built in ATLAS.ti, I analyze the Code-in-Use (CIU) for each of the articles that I read. Figures 2 to 4 are screenshots from ATLAS.ti to illustrate the processes involved.
Figure 2: Click Analyse as a start for Conducting Code-in-Use (CIU) to analyze for saturation
Figure 3: Click Code-Document Table to lead to the Unit of Analyses, Code-in-Use (CIU), and Primary Sources of Data
Figure 4: Observing Data Saturation with Code-Document Table
The Code-Document table in ATLAS.ti enables the researcher to identify changing patterns that are repeated between the respondent(s), and within the respondent(s). With the help of a memo in ATLAS.ti, the researcher could ask the following questions to assist him/her to determine if data saturation is reached. Figure 5 below shows a checklist that I designed to determine the saturation level of research.
Figure 5: Personal checklist to determine the saturation level of research.
In practice, the researcher would also need to probe and collect more data through follow-ups before the unit of analysis can be regarded as thoroughly explored. For illustration purposes, the number of quotations grounded in each code (or group of codes) approaches a flattened curve, as shown in Figure 6.
Figure 6: Flattened curve symbolizes the stage where there are no significant findings from the code (or group codes) addressing the unit(s) of analysis
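One way to make the flattening explicit is to set a simple stopping rule: the curve is treated as flat once a chosen number of consecutive sources adds nothing new. The sketch below is an assumption for illustration only, not a rule prescribed by ATLAS.ti or by the literature. The function name `saturation_point` and the `window` of two sources are hypothetical choices, and the input is any running total, such as the number of distinct codes found after each source.

```python
from typing import List, Optional


def saturation_point(curve: List[int], window: int = 2) -> Optional[int]:
    """Return the index of the first source at which the running total
    has not risen over the last `window` sources, or None if it is
    still rising."""
    for i in range(window, len(curve)):
        if curve[i] == curve[i - window]:
            return i
    return None


# Hypothetical running totals of distinct codes after each source.
flattening = [3, 4, 4, 4]
still_rising = [1, 2, 3, 4]

print(saturation_point(flattening))    # 3 (the fourth source)
print(saturation_point(still_rising))  # None
```

The choice of `window` is a judgment call that should be justified in a memo. A flat curve only supports the argument for saturation; it does not replace the researcher’s reflection on the quality of the data.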
This process of comparing each source of data with the next in the Code-Document Table can serve as an alternative argument for students to defend that their unit of analysis has reached the point of saturation. As mentioned earlier, researchers need to set the boundaries of their research objectives at the outset, justify their purposive sampling, and state other known or unknown limitations of their research, so that the ‘undiscovered knowledge’ can be acknowledged as the blind spot of the present researcher and left for future researchers to address.
Additionally, I found that another approach to enhance and attain data saturation is through the triangulation of multiple sources of data. The reason is that triangulation adds depth to the data that are collected. Triangulation is important to improve the trustworthiness of findings, and there are four types of triangulation: (a) data; (b) investigator; (c) theoretical; and (d) methodological triangulation. For application in ATLAS.ti, these triangulation types can also be regarded as the four sources of data (or documents) to conduct code-document table analyses. As illustrated in Figure 6, when the Code-in-Use (CIU) are analysed through the Code-Document table (across four or more sources), researchers will conclude that the saturation point is reached when there are no significant findings in the code(s). Otherwise, the Code-in-Use (CIU) has to be further probed or analyzed within sources because they have not reached the saturation stage yet.
Figure 7: Analyses for the Saturation stage by considering the different sources of triangulation (Denzin, 2012).
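To picture the comparison across triangulation sources, the sketch below lists, for each code, the sources in which it appears, and flags the codes that are not yet supported by every source. It is only an illustration under assumptions: the function name `codes_by_source_coverage`, the quotation counts, and the code labels are hypothetical, and the four source names simply follow the four triangulation types mentioned above.

```python
from typing import Dict, List


def codes_by_source_coverage(
    counts_by_source: Dict[str, Dict[str, int]]
) -> Dict[str, List[str]]:
    """Map each code to the list of sources in which it has quotations."""
    all_codes = set().union(*(c.keys() for c in counts_by_source.values()))
    return {
        code: [
            source
            for source, counts in counts_by_source.items()
            if counts.get(code, 0) > 0
        ]
        for code in sorted(all_codes)
    }


# Hypothetical quotation counts per code for each triangulation source.
counts_by_source = {
    "data": {"vision": 5, "mentoring": 2},
    "investigator": {"vision": 3, "mentoring": 1},
    "theoretical": {"vision": 4},
    "methodological": {"vision": 2, "mentoring": 1, "community": 1},
}

coverage = codes_by_source_coverage(counts_by_source)
needs_probing = [
    code for code, sources in coverage.items()
    if len(sources) < len(counts_by_source)
]
print(needs_probing)  # ['community', 'mentoring']
```

In this made-up example, only “vision” is supported by all four sources, so the other two codes would be candidates for further probing within sources before saturation is claimed.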
This blog is written to assist novice researchers who face the problem of countering critics on data saturation, and more particularly to describe how I used ATLAS.ti to address the tension between theory and practice. Failure to reach data saturation harms the validity of the research, but you can capitalize on the functions in ATLAS.ti to explain and justify your position to your critics if you are queried on this matter. Nonetheless, I acknowledge that there is no one-size-fits-all method to reach data saturation, and each choice must be justified with sound theoretical, realistic, and logical implications. If there are further concerns that cannot be addressed in your research, you could also highlight them as limitations of your research, because the attainment of data saturation is never that straightforward.
About the author
Dr. Kenny Cheah Soon Lee is a senior lecturer in the Institute of Educational Leadership. His interests are organizational management, leadership and behavior, contemporary management, and outdoor education and recreation management. He is a Certified ATLAS.ti Professional Trainer.