
Co-occurrence Analysis with ATLAS.ti

ATLAS.ti provides many tools that allow you to analyze your coded data. In this article, I explain how to conduct a code co-occurrence analysis. I first explain the operators you need to know, and then I walk you through some examples.
Susanne Friese
Product specialist, trainer and author of the book "Qualitative Data Analysis with ATLAS.ti"
  1. Introduction
  2. Use of the Within operator
  3. Use of the Encloses operator
  4. Finding overlapping quotations
  5. Finding co-occurring quotations
  6. Running a Code Co-occurrence Analysis
  7. Exporting Results
  8. References

Introduction

Proximity operators analyze the spatial relations (e.g., distance, embeddedness, overlapping, co-occurrence) between coded data segments.

First, I explain some technical details that you need to be familiar with when running a code co-occurrence analysis. To better understand what is happening when you create a code co-occurrence table, I need to explain the proximity operators behind a co-occurrence analysis.

Proximity Operators

Proximity describes the spatial relation between quotations. Quotations can be embedded in one another; one can enclose the other, overlap the other, or be overlapped by the other quotation.

Figure 1: Proximity operators

Proximity operators differ from the other operators in one important respect. When using the query tool, you need to pay attention to where you insert them in a query. While “A OR B” is equal to “B OR A,” this does not hold for any of the proximity operators: “A WITHIN B” is not equal to “B WITHIN A.” When building a query, always enter the expressions in the order in which they appear when you state the query in natural language.

Embedding Operators

The embedding operators describe quotations that are contained in one another and that are coded with certain codes.

Quotations enclosed by quotations: A WITHIN B retrieves all quotations coded with A that are contained within data segments coded with B.

Quotations enclosing quotations: A ENCLOSES B retrieves all quotations coded with A that contain quotations coded with B.

Use of the Within operator

Let us assume you have coded a biographic interview. During the interview, respondents talked about different periods in their lives. Also, they spoke about the role of friendship during those periods of their lives. You code both aspects.

If you are interested in reading everything about ‘friendship’ in the time period coded with ‘childhood,’ you can use the WITHIN operator to find those segments.

Friendship WITHIN Childhood.

Figure 2: An example for the use of the WITHIN operator

If you enter Childhood WITHIN Friendship, you do not find anything, as no such constellation exists in the data.
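The asymmetry of WITHIN can be pictured as interval containment. The following is a minimal Python sketch, not ATLAS.ti's implementation. It assumes that quotations are character ranges in a single document, and the names Quotation, is_inside, within, and encloses are hypothetical. It also assumes that two identical ranges do not count as one being within the other.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quotation:
    code: str
    start: int  # first character position of the segment
    end: int    # position just after the last character


def is_inside(inner, outer):
    """True if inner lies inside outer and the two ranges are not identical (assumption)."""
    same = (inner.start, inner.end) == (outer.start, outer.end)
    return outer.start <= inner.start and inner.end <= outer.end and not same


def within(quotes, code_a, code_b):
    """A WITHIN B: quotations coded A that are contained in a quotation coded B."""
    bs = [q for q in quotes if q.code == code_b]
    return [a for a in quotes if a.code == code_a and any(is_inside(a, b) for b in bs)]


def encloses(quotes, code_a, code_b):
    """A ENCLOSES B: quotations coded A that contain a quotation coded B."""
    bs = [q for q in quotes if q.code == code_b]
    return [a for a in quotes if a.code == code_a and any(is_inside(b, a) for b in bs)]


# Invented positions, for illustration only.
quotes = [
    Quotation("Childhood", 0, 500),
    Quotation("Friendship", 120, 180),   # inside the childhood segment
    Quotation("Friendship", 650, 700),   # outside of it
]

print(within(quotes, "Friendship", "Childhood"))  # one Friendship quotation
print(within(quotes, "Childhood", "Friendship"))  # empty list
```

Swapping the two codes changes the result, which is why the order of the operands matters for proximity operators.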

Use of the Encloses operator

An example of the use of the ENCLOSES operator: find all blog posts that contain information about sources of happiness.

Figure 3: An example for the use of the ENCLOSES operator

Finding overlapping quotations

The overlap operators find quotations that overlap one another:

Overlaps (quotations overlapping at the start): A OVERLAPS B retrieves all quotations coded with A that overlap quotations coded with B.
Overlapped by (quotations overlapping at the end): A OVERLAPPED BY B retrieves all quotations coded with A overlapped by quotations coded with B.

The ability to ask precisely whether code A overlaps code B, or vice versa, is useful when working with video data, where the order of events can be of interest. Consider a classroom situation. The teacher stands at the blackboard explaining something (A). The door opens, and a student comes in (B). Does the teacher continue with the lesson (A ENCLOSES B), or do they turn to the pupil who comes in (A OVERLAPPED BY B)?

Please note that ATLAS.ti can only retrieve quotations and not the intersection of the overlapping segments as this is not a quotation! This is illustrated in the figure below.
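The difference between a retrieved quotation and the overlapping area can be sketched in a few lines of Python. This is an illustration only, not ATLAS.ti code. The names Quotation, overlapped_by, and intersection are hypothetical, the time stamps are invented, and the reading of OVERLAPPED BY (A starts first and B continues past the end of A) is an assumption taken from the classroom example.

```python
from typing import NamedTuple


class Quotation(NamedTuple):
    code: str
    start: float  # seconds into the video
    end: float


def overlapped_by(a, b):
    """Assumed reading of A OVERLAPPED BY B: a starts first, b begins inside a and ends later."""
    return a.start < b.start < a.end < b.end


def intersection(a, b):
    """The shared time span of two segments; this span is not itself a quotation."""
    return (max(a.start, b.start), min(a.end, b.end))


# Invented time stamps for the classroom example.
teacher = Quotation("Teacher explains", 0.0, 95.0)
student = Quotation("Student enters", 80.0, 120.0)

if overlapped_by(teacher, student):
    # A query returns the whole quotation coded with A ...
    print("retrieved quotation:", teacher)
    # ... and not the area that both segments share.
    print("shared span (not retrievable):", intersection(teacher, student))
```

The query result is the full 'Teacher explains' segment, even though only its last part coincides with the student entering.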

Figure 4: ATLAS.ti only retrieves quotations, not the overlapping area

Finding co-occurring quotations

When exploring the relationship between two or more codes, you often do not care whether a quotation overlaps another one, is overlapped by it, lies within it, or encloses it. If this is the case, you use the COOC operator. The code co-occurrence operator is a shortcut for combining the four proximity operators discussed above, plus the operator AND. AND is a Boolean operator, but it also finds co-occurrences, namely all coded segments that overlap 100%.
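One way to picture COOC as a shortcut is a function that names the relation between two segments and a second function that only asks whether any relation exists. The Python sketch below is an illustration under stated assumptions, not the way ATLAS.ti is implemented. Segments are (start, end) ranges with an exclusive end, the function names relation and cooc are hypothetical, and the direction assigned to OVERLAPS and OVERLAPPED BY is an assumed reading.

```python
def relation(a, b):
    """Name how segment a relates to segment b, or return None if they share nothing."""
    (a0, a1), (b0, b1) = a, b
    if a1 <= b0 or b1 <= a0:
        return None                # the segments do not touch
    if (a0, a1) == (b0, b1):
        return "AND"               # 100% overlap
    if b0 <= a0 and a1 <= b1:
        return "WITHIN"            # a lies inside b
    if a0 <= b0 and b1 <= a1:
        return "ENCLOSES"          # a contains b
    if a0 < b0:
        return "OVERLAPPED BY"     # assumed: a starts first, b runs past the end of a
    return "OVERLAPS"              # assumed: b starts first, a runs past the end of b


def cooc(a, b):
    """COOC: true for any of the five relations above."""
    return relation(a, b) is not None


print(relation((2, 5), (0, 10)))   # WITHIN
print(relation((0, 10), (2, 5)))   # ENCLOSES
print(cooc((0, 6), (4, 10)))       # True
print(cooc((0, 3), (3, 6)))        # False
```

Unlike the individual proximity operators, cooc gives the same answer when the two segments are swapped.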

Figure 5: Five ways of how quotations can co-occur

The more general co-occurrence operator is quite helpful when working with transcripts. In interviews, people often jump back and forth in time or between contexts, and therefore it usually does not make much sense to use the specific embedding or overlap operators. With other types of data, they are, however, quite helpful. Think of video data where it might be essential whether action A was already going on before action B started or vice versa. Or you may have coded longer sections in your data, like biographical periods in a person’s life, and then done some more fine-grained coding within these periods. The WITHIN operator comes in very handy in such instances. The same applies when working with pre-coded survey or focus group data where ATLAS.ti automatically codes all questions/speakers. Using the WITHIN operator, you can ask, for instance, for all quotations coded with ‘topic X’ WITHIN ‘question 5’ or WITHIN ‘speaker Y’.

The co-occurrence operator is also used when running the Code Co-occurrence Explorer or Code Co-occurrence Table.

Running a Code Co-occurrence Analysis

I will now show how to use the co-occurrence operators using the Code Co-occurrence Table. I will use the Children & Happiness sample project that you can find on the ATLAS.ti website.

We will look at a few research questions and how to find answers to them. When reading through the examples, think about how you can transfer this knowledge to investigate the data in your projects. Here is the first research question:

RQ1: Do parents with one child differ from parents with two or more children regarding the positive and negative effects of parenting they report?

If you look at the sample project, you will find two documents (D3 and D4) that contain comments from multiple people on a parenting blog and comments on an article published by the New York Times Magazine. As each document contains responses from various respondents, sociodemographic characteristics needed to be coded. Document groups could not be used here. You can find more information about this in the book Qualitative Data Analysis with ATLAS.ti.

As you can see from the figure below, each response was coded with sociodemographic codes, like gender or number of children, and with codes that describe other aspects, like various positive and negative effects of parenting.

Figure 6: Coding of multiple aspects that can be related to each other in a co-occurrence analysis

The relationships between these codes can be explored using the Code Co-occurrence Table. To open it, select Analyze / Co-Oc Table from the main ribbon or menu.

For column codes, select the folder 'sociodemographics' and from the category 'No. of children' the two sub-codes '1 child' and '2 or more'. For row codes, select all sub-codes of the categories 'negative effect' and 'positive effects' found in the folder 'Effects of parenting.'

Figure 7: Selecting the variables for the co-occurrence analysis

The result is shown in the figure below. If you click on the 'Compress' option in the ribbon (or in the toolbar in the Mac version), you can remove all rows that show no results.

Figure 8: Example of a code co-occurrence analysis

The cells of the table show the number of co-occurrences. If you click on a cell, you can retrieve the quotations for the codes. You can see in the table that there is a shift to writing more about the positive effects of parenting when having two or more children. The positive effects that stand out for parents with one child are personal growth and positive emotions.
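To make the meaning of a cell more concrete, here is a small Python sketch that counts co-occurrences for row and column codes. It is a simplified illustration and not a description of how ATLAS.ti computes the table. The function names cooccur and cooc_table are hypothetical, and the quotation positions are invented, so the resulting numbers are not those of the sample project.

```python
from collections import Counter
from itertools import product

# (code, start, end) character ranges in one document; all values are invented.
quotations = [
    ("1 child", 0, 200),
    ("personal growth", 50, 120),
    ("negative effect", 130, 190),
    ("2 or more", 300, 600),
    ("personal growth", 310, 400),
    ("positive emotions", 380, 450),
]


def cooccur(q1, q2):
    """True if the two segments share at least one character position."""
    return q1[1] < q2[2] and q2[1] < q1[2]


def cooc_table(quotations, row_codes, col_codes):
    """Count, for every (row code, column code) pair, the co-occurring quotation pairs."""
    counts = Counter()
    for q1, q2 in product(quotations, repeat=2):
        if q1[0] in row_codes and q2[0] in col_codes and cooccur(q1, q2):
            counts[(q1[0], q2[0])] += 1
    return counts


rows = ["personal growth", "positive emotions", "negative effect"]
cols = ["1 child", "2 or more"]
table = cooc_table(quotations, rows, cols)

# Print one line per row code with the counts for each column code.
for r in rows:
    print(r, [table[(r, c)] for c in cols])
```

Each count corresponds to a cell, and the quotation pairs behind a count are what you read when you click on that cell.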

The results can also be visualized in a histogram or a Sankey Diagram. Below you see a histogram. Based on the colors used for the codes, you can see that there is much more red (negative) in the data for families with one child and much more green (positive) in the data for families with two or more children.

Figure 9: Visualizing the relationship between positive and negative experience of parenthood and number of children

To explain the results, we could, for example, apply self-consistency theory and argue that parents with two or more children feel compelled to report positive effects, as otherwise they would need to question their own decision to have more than one child. Another explanation could be that life as a parent gets easier with more experience. Reading the data behind the numbers will likely give you clues regarding which explanation might be more appropriate.

When we begin to describe this in a memo for this research question, we move from analysis to interpretation. See more on memo writing in the book Qualitative Data Analysis with ATLAS.ti.

If you wonder about the low frequencies in the above table or histogram, it is worth noting that this is just a small sample project used here for illustrative purposes. Scientific conclusions cannot be drawn from it. However, it is still fun to explore this data further, as you do get meaningful results. For instance, if you look at the relationship between reported effects of parenting and whether people believe children make you happy or not, you also observe an interesting trend:

Figure 10: Relating effects of parenting and attitude

People who think that children make a person unhappier report more negative effects of parenting. Those who believe that the level of happiness does not change with children report a mix of positive and negative effects. Those who think children contribute to happiness report only positive effects. Below, these results are visualized in the form of a Sankey Diagram. Highlighted are the positive and negative attitudes and how they relate to the experience of parenthood.

Figure 11: Visualizing the relationship between attitude and experience of parenthood

This result might trigger ideas about which other relations to explore, such as the relationship between the attitude codes and the number of children.

Figure 12: Relationship between attitude and number of children

We see the same trend. Those with two or more children write more often that they think children add to happiness and report more positive parenting effects. So, piece by piece, the analysis comes together. Reading the quotations that you can access by clicking on a number will help you interpret the data.

Exporting Results

If you want to continue to work with the resulting numbers, you can export the table as an Excel file. To export the qualitative data, i.e., the quotations for a cell, click on the ‘burger menu’ above the quotation list.

Figure 13: Exporting the qualitative data for a table cell

References

  • Friese, Susanne (2019). Qualitative Data Analysis With ATLAS.ti. London: Sage.