Co-occurrence Analysis with ATLAS.ti
Written by: Dr. Susanne Friese
ATLAS.ti provides many tools that allow you to analyze your coded data. In this article I explain how to conduct a code co-occurrence analysis. I first explain the operators you need to know about and then I walk you through some examples and show you how you can visualize findings and write it all up in a report. And then all you need to do is to apply it to your own research. As Freeman (2017) rightly noted: “As novice researchers take on the analytic task, they begin to gain concrete understanding for three fundamental principles […]: First, the need to do analysis to understand analysis; second, the importance of understanding the relationship between analysis and interpretation; and third, the essential role of writing” (p. 122).
The Code Co-occurrence Table
Proximity operators are used to analyze the spatial relations (e.g., distance, embeddedness, overlapping, co-occurrence) between coded data segments.
First, I explain some of the technical details that you need to be familiar with when running a code co-occurrence analysis. In order to better understand what is happening when you create a code co-occurrence table, I need to explain the proximity operators that are behind a co-occurrence analysis.
Proximity describes the spatial relation between quotations. Quotations can be embedded in one another, one can enclose the other, one can overlap the other, or be overlapped by the other quotation.
Proximity operators differ from the other operators in one important aspect. When using the query tool, you need to pay attention to where you insert them in a query. While “A OR B” is equal to “B OR A,” this does not hold for any of the proximity operators: “A WITHIN B” is not equal to “B WITHIN A.” When building a query, always enter the expressions in the order in which they appear in their natural language manifestation.
The embedding operators describe quotations that are contained in one another and that are coded with certain codes.
Quotations being enclosed by quotations: A being enclosed by B (WITHIN) retrieves all quotations coded with A that are contained within data segments coded with B.
Quotations enclosing quotations: A ENCLOSES B retrieves all quotations coded with A that contain quotations coded with B.
Let us assume you have coded a biographic interview. During the interview, respondents have talked about different time periods in their lives. All these sections are coded. Within those sections, among other things, they talked about the role of friendship. This also has been coded.
Now you are interested in reading everything about ‘friendship’ in the time period that was coded with ‘childhood’. To find those segments, you can use the WITHIN operator:
Friendship WITHIN Childhood.
If you enter Childhood WITHIN friendship, you do not find anything, as such a constellation does not exist.
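The asymmetry of the embedding operators can be sketched in a few lines of Python. This is only an illustration of the logic described above, not how ATLAS.ti is implemented: the `Quotation` class, its fields, the positions, and the function names are all assumptions made for the example, and identical segments are simply treated as contained in one another.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Quotation:
    doc: str    # document the segment belongs to (hypothetical field)
    start: int  # first position of the segment, e.g. a character offset
    end: int    # last position of the segment (inclusive)
    code: str   # code applied to the segment

def contains(outer, inner):
    """True if `inner` lies inside `outer` in the same document.
    Assumption: segments with identical boundaries count as contained."""
    return (outer.doc == inner.doc
            and outer.start <= inner.start
            and inner.end <= outer.end)

def within(quotes, a, b):
    """A WITHIN B: quotations coded A that lie inside a quotation coded B."""
    return [q for q in quotes if q.code == a
            and any(contains(o, q) for o in quotes if o.code == b)]

def encloses(quotes, a, b):
    """A ENCLOSES B: quotations coded A that contain a quotation coded B."""
    return [q for q in quotes if q.code == a
            and any(contains(q, i) for i in quotes if i.code == b)]

# Made-up segments from one biographic interview.
quotes = [
    Quotation("D1", 0, 500, "Childhood"),
    Quotation("D1", 120, 180, "Friendship"),  # inside the childhood section
    Quotation("D1", 700, 760, "Friendship"),  # outside the childhood section
]

friendship_in_childhood = within(quotes, "Friendship", "Childhood")
childhood_in_friendship = within(quotes, "Childhood", "Friendship")
```

Here `friendship_in_childhood` holds only the first ‘Friendship’ segment, while `childhood_in_friendship` is empty, which mirrors the point that the order of the codes matters.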
An example for the use of the ENCLOSES operator is: Find all blog posts that contain information about sources of happiness:
Finding overlapping quotations
The overlap operators describe quotations that overlap one another:
Overlaps (quotation overlapping at start): A OVERLAPS B retrieves all quotations coded with A that overlap quotations coded with B
Overlapped by (quotations overlapping at end): A OVERLAPPED BY B retrieves all quotations coded with A that are overlapped by quotations coded with B.
For example, being able to ask exactly whether code A overlaps code B, or vice versa, is useful when working with video data, where the order of events is often more important than in interview data. Consider a classroom situation. The teacher stands at the blackboard explaining something (A). The door opens, and a student comes in (B). Does the teacher continue with the lesson (A ENCLOSES B), or does he or she turn to the pupil who comes in (A is overlapped by B)?
Please note, ATLAS.ti can only retrieve quotations and not the intersection of the overlapping segments as this is not a quotation! This is illustrated in the figure below.
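The classroom example can be sketched as follows. The direction of the two operators is an assumption inferred from that example: “A OVERLAPPED BY B” is read here as A starting first and B running past the end of A, and “A OVERLAPS B” as the mirror image. The `Quotation` tuple, the time stamps, and the function names are hypothetical, and this is not ATLAS.ti's own code.

```python
from typing import NamedTuple

class Quotation(NamedTuple):
    start: int  # start time in seconds (made-up values)
    end: int    # end time in seconds
    code: str

def overlapped_by(quotes, a, b):
    """A OVERLAPPED BY B (assumed reading): A starts first,
    B starts inside A and continues after A has ended."""
    return [q for q in quotes if q.code == a and any(
        q.start < o.start <= q.end < o.end
        for o in quotes if o.code == b)]

def overlaps(quotes, a, b):
    """A OVERLAPS B (assumed reading): B starts first,
    A starts inside B and continues after B has ended."""
    return [q for q in quotes if q.code == a and any(
        o.start < q.start <= o.end < q.end
        for o in quotes if o.code == b)]

video = [
    Quotation(0, 60, "teacher explains"),
    Quotation(45, 75, "student enters"),
]

hits = overlapped_by(video, "teacher explains", "student enters")
```

Note that `hits` contains the complete ‘teacher explains’ segment (0 to 60 seconds) and not the 15 seconds where both segments intersect, in line with the remark that only quotations can be retrieved.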
Finding co-occurring quotations
Often when exploring the relation between two or more codes, you do not really care whether something overlaps or is overlapped by or is within it or encloses it. If this is the case, you simply use the COOC operator. The code co-occurrence operator is a short-cut for a combination of the four proximity operators discussed above, plus the operator AND. AND is a Boolean operator, but it also finds co-occurrence, namely all coded segments that overlap 100%.
The more general co-occurrence operator is quite useful when working with transcripts. In interviews, people often jump back and forth in time or between contexts, and therefore it often does not make much sense to use the specific embedding or overlap operators. With other types of data, they are however quite useful. Think of video data where it might be important whether action A was already going on before action B started or vice versa. Or if you have coded longer sections in your data like biographical time periods in a person’s life and then did some more fine-grained coding within these time periods. The WITHIN operator comes in very handy in such instances. The same applies when working with pre-coded survey or focus group data where all questions/speakers are automatically coded by ATLAS.ti. Using the WITHIN operator you can ask, for instance, for all quotations coded with ‘topic x’ WITHIN ‘question 5’ or by ‘speaker y’.
The co-occurrence operator, essentially the combination of the five operators, is also used when running the Code Co-occurrence Explorer or Code Co-occurrence Table.
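To make the idea of a combined operator concrete, the following sketch classifies the relation between two segments and treats any of the five outcomes as a co-occurrence. Segments are plain `(start, end)` tuples in the same document; the function names are hypothetical, and the direction assigned to OVERLAPS and OVERLAPPED BY is an assumption based on the classroom example.

```python
def relation(a, b):
    """Return how segment `a` relates to segment `b`, or None if they
    do not touch. Segments are (start, end) tuples with inclusive ends."""
    (s1, e1), (s2, e2) = a, b
    if s1 > e2 or s2 > e1:
        return None              # no shared positions at all
    if (s1, e1) == (s2, e2):
        return "AND"             # identical segments, 100% overlap
    if s2 <= s1 and e1 <= e2:
        return "WITHIN"          # a lies inside b
    if s1 <= s2 and e2 <= e1:
        return "ENCLOSES"        # a contains b
    # Remaining case: partial overlap in one direction or the other.
    return "OVERLAPPED BY" if s1 < s2 else "OVERLAPS"

def cooc(a, b):
    """COOC: true if any of the five relations holds."""
    return relation(a, b) is not None
```

Unlike the individual proximity operators, `cooc` in this sketch is symmetric: `cooc(a, b)` and `cooc(b, a)` always agree, which is why the order of codes does not matter when you only ask for co-occurrence.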
Running a Code Co-occurrence Analysis
I will now show how to make use of the co-occurrence operators using the Code Co-occurrence Table. I will use the Children & Happiness sample project.
If you want to follow along in ATLAS.ti, you can download a specially prepared version of the Children & Happiness project from the companion website.
We will look at a few research questions and how to find answers to them. When reading through the examples, think about how you can transfer this knowledge to investigate the data in your own projects. Here is the first research question:
RQ1: Do parents with one child differ from parents with two or more children regarding the positive and negative effects of parenting they report?
If you look at the sample project, you will find two documents (D3 and D5) that contain comments from multiple people on a parenting blog and comments on an article published by the New York Times Magazine. As each document contains responses from multiple respondents, sociodemographic characteristics needed to be coded. Document groups could not be used here. You can find more information on this in the chapter on project setup in the full book.
As you can see from Figure 13, each response was coded with sociodemographic codes like gender: male and gender: female; having 1 child or 2 or more children; and with codes that describe other aspects like various positive and negative effects of parenting.
The relationships between these various categories of codes can be explored using the Code Co-occurrence Table. To open it:
- Select Analyze / Co-Oc Table from the main ribbon or menu.
- For row codes, select all ‘effects positive’ and ‘effects negative’ codes. Type ‘effect’ into the search field to filter the list of codes, which makes it easier to select them.
- For column codes, select the two codes ‘#fam: 1 child’ and ‘#fam: 2 or more children’.
- You can click on the compress option if you want to remove all rows that show no results. The table then looks as follows:
The cells of the table show the number of co-occurrences. If you click on a cell, you can retrieve the quotations for the codes in the rows and columns. In the figure above, the retrieved quotations are for the column code ‘#fam: 1 child’ and for the row code ‘effects neg: more worries/stress’ (see the blue box).
When preparing the table, you do not have to consider the order of the codes in the query. This is only relevant for the Query Tool. Depending on your interest, you can either read the quotations of the column or the row code. Both are provided.
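What a cell count represents can be sketched as below. This is a simplified model, not ATLAS.ti's algorithm: it assumes that a cell counts every pair of touching quotations in which one carries the row code and the other the column code (a single quotation carrying both codes also counts once). The quotations and the code name ‘effects pos: personal growth’ are made up for illustration.

```python
from collections import Counter

# Hypothetical quotations: (document, start, end, set of codes).
quotations = [
    ("D3", 0, 400, {"#fam: 1 child"}),
    ("D3", 50, 120, {"effects neg: more worries/stress"}),
    ("D3", 200, 260, {"effects pos: personal growth"}),
    ("D3", 500, 900, {"#fam: 2 or more children"}),
    ("D3", 600, 650, {"effects pos: personal growth"}),
    ("D5", 0, 300, {"#fam: 2 or more children", "effects pos: personal growth"}),
]

def cooccurrence_table(quotations, row_codes, col_codes):
    """Count, per (row code, column code) cell, the pairs of quotations
    in the same document that share at least one position."""
    table = Counter()
    for doc_a, s_a, e_a, codes_a in quotations:
        for doc_b, s_b, e_b, codes_b in quotations:
            if doc_a != doc_b or s_a > e_b or s_b > e_a:
                continue  # different document or no shared positions
            for r in row_codes & codes_a:
                for c in col_codes & codes_b:
                    table[(r, c)] += 1
    return table

rows = {"effects neg: more worries/stress", "effects pos: personal growth"}
cols = {"#fam: 1 child", "#fam: 2 or more children"}
table = cooccurrence_table(quotations, rows, cols)

# Rough equivalent of the compress option: keep rows with at least one hit.
compressed_rows = sorted(r for r in rows if any(table[(r, c)] for c in cols))
```

Because the count is taken over touching pairs regardless of which segment encloses or overlaps the other, the order of row and column codes has no effect on the numbers in this sketch.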
What you can see in the table is that there is a shift to writing more about positive effects of parenting when having two or more children. The positive effect that stands out for parents with one child is personal growth. When we now begin to describe this in a memo for this research question (see more on memo writing in the full book), we move from analysis to interpretation.
We could, for example, apply self-consistency theory to explain the findings arguing that parents with two or more children feel compelled to report positive effects as otherwise they would need to question their own decision of why to have more than one child. Another explanation could be that life as a parent gets easier with more experience. Reading the data behind the numbers will likely give you some clues regarding which explanation might be more appropriate.
If you wonder about the low frequencies in the above table, it is worth noting that this is just a small sample project that is used here for illustrative purposes. Scientific conclusions cannot be drawn from it. It is, however, still fun to explore this data further as you do get meaningful results. For instance, if you look at the relationship between reported effects of parenting and whether people believe children make you happy or not, you also see an interesting trend:
People who think that children make a person unhappier report more negative effects of parenting; with those who think that the level of happiness does not change with children, it is a mixed effect; those who believe children contribute to happiness report only positive effects. This result might trigger ideas about which other relations to explore, for example, the relationship between the attitude codes and number of children:
We see the same trend. Those with two or more children write more often that they think children add to happiness and they also report more positive effects of parenting. So, piece by piece, the analysis comes together. Reading the quotations that you can access by clicking on a number will help you with the interpretation of the data.
Exporting results: If you want to continue to work with the resulting numbers, you can export the table as an Excel file. If you want to export the quotations, click on the ‘burger menu’ above the quotation list.
The results of the code co-occurrence analysis have shown that there are a number of relations between number of children, the way parenting has been described, and perceived level of happiness. The tables give you a 2-dimensional view, and we can only relate two categories or dimensions at a time. If we now move on to the networks trying to represent our findings there, we get a multidimensional picture.
Bringing in all the codes that we had in the tables resulted in a network that was difficult to comprehend. Therefore, I created smart codes that already capture the relationship between number of children and perceived happiness. Then I brought in the co-occurring effects of parenting codes by setting a global filter. You will learn more about smart codes and global filter settings in a sequel to this article.
In the network, we see that all described negative effects are related to having one child and feeling less or equal levels of happiness. Negative effects on relationships can also occur even though one feels happier with the child. Those relate to quarrels over one partner doing most of the additional work while the other partner is not “pulling” their weight.
“I would have to say it is not the child that makes you unhappy but maybe when your partner/spouse is not “pulling” their weight and you start adding up the lack of assistance/help they provide (dishes, laundry, meals etc…) I am happy to do those things for my child (and do not keep a running tally) but if I start comparing how much I do and how much my life has changed in comparision to my spouse (when we both work)-that makes me unhappy. Ha ha ha.
I love my spouse but it just seems like the least he can get away with…the least he will do” (female, 3:163).
Personal growth is mentioned as a positive effect by parents with one, or two or more children who perceive an equal level or greater level of happiness. If we now add all other parents into the network that have not written about their perception of happiness, we get the following picture:
In order not to clutter the network, I added code groups for responses by parents with different numbers of children. You can see that parents with one child perceive parenthood much more negatively overall. For parents with two or more children, the effect on careers can be both positive and negative. In the perception of some parents with several children, the sacrifices made for the children do not offset the gains:
“However, I have never felt that the time and money and effort I exerted to keep them healthy and happy and occupied offset all the sacrifices I have made in my own personal life, despite the pride I feel when consider their achievements. My overriding sentiment is resentment” (female, 3:68).
There is a shift to more positivity the more children you have, but this picture is not free of ambiguity. This can be explored further by looking at other topics that were coded. For instance, which sources of happiness and which reasons for having or not having children are mentioned? How does this relate to how parenting and happiness are perceived?
To come back to the quote at the beginning of this article (the relationship between analysis, interpretation and writing), a lot of what I described above is analysis. I hinted at the link to interpretation when bringing in some ideas about how the results might be explained (e.g., by drawing on existing theories). The first step is to begin to write down what you see. Networks can help you relate findings that otherwise might stand isolated side-by-side, and they can help you develop a storyline of what you want to tell about your research. Describing it all, however, is not enough; you also need to add explanations and interpretations regarding what you see and relate it to the literature and existing theory or knowledge.
Freeman, Melissa (2017). Modes of Thinking for Qualitative Data Analysis. NY: Routledge.
Friese, Susanne (2019). Qualitative Data Analysis With ATLAS.ti. London: Sage.
Friese, Susanne (2020). Co-occurrence Analysis with ATLAS.ti. Retrieved from
About the author
Dr. Susanne Friese
Dr. Susanne Friese started working with computer software for qualitative data analysis in 1992. Her initial contact with CAQDAS tools was from 1992 to 1994, when she was employed at Qualis Research in the USA. In the following years, she worked with the CAQDAS Project in England (1994 – 1996), where she taught classes on The Ethnograph and Nud*ist (today NVivo). Two additional software programs, MAXQDA and ATLAS.ti, followed shortly. Susanne has accompanied numerous projects around the world in a consulting capacity, authored didactic materials and is the author of the ATLAS.ti User’s Manual, sample projects and other documentation. The third edition of her book “Qualitative Data Analysis with ATLAS.ti” was published in early 2019 with SAGE publications. Susanne’s academic home is the Max Planck Institute for the Study of Religious and Ethnic Diversity in Göttingen (Germany), where she pursues her methodological interests, especially regarding qualitative methods and computer-assisted qualitative data analysis.