What a person is looking at is not only interesting for nonverbal communication in social interactions, but is also useful for exploring more general questions concerning human attention. Since the late twentieth century, video-based eye-trackers have been used to track the eyes in real time (Singh & Singh, 2012) by measuring the position of an infrared light reflection on the cornea (i.e., the transparent layer forming the front of the eye) relative to the pupil (Carter & Luke, 2020). This method allows researchers to track gaze behavior and identify what guides visual attention. In the last 20 years, eye-tracking research has gained much popularity and become a common measurement tool in many areas of science (Carter & Luke, 2020).
However, even though eye-tracking research has yielded valuable insights in recent years, the method has some important limitations. The need for a lab, an expensive eye-tracker, an experienced researcher who is familiar with the method, and a calibration procedure make eye-tracking research rather elaborate, expensive, and time-consuming. Furthermore, these requirements preclude field research in natural environments. These restrictions recently sparked an interest in the use of common webcams to infer the eye-gaze locations of participants (e.g., Bott et al., 2017; Semmelmann & Weigelt, 2018). Moreover, the social and economic pressures of the COVID-19 pandemic reinforced this existing interest in webcam-based eye-tracking, as it would allow studies to move online.
The use of a webcam as eye-tracker would make the research quicker, easier, and cheaper, as no lab, experimenter, or dedicated hardware is needed. Moving from lab to web could also allow researchers to reach a larger and more diverse participant pool more quickly, or to access otherwise hard-to-reach samples (e.g., patients with dementia, or US and Chinese participants for a US-based researcher comparing the two groups). Data collection would no longer be limited by time or location, as individuals could participate whenever they wanted from the comfort of their homes. Importantly, research in other fields has already shown that the benefits of online research do not necessarily come at a price. Data quality has been shown to be similar to that of lab research (Kees et al., 2017; Walter et al., 2019), and several effects from other fields have already been replicated in online settings (e.g., Dodou & de Winter, 2014; Gosling et al., 2004; Klein et al., 2014; Semmelmann & Weigelt, 2018).
Some studies have already successfully implemented eye-tracking libraries in their online experiments (Semmelmann & Weigelt, 2018; Slim & Hartsuiker, 2021; Yang & Krajbich, 2021). Semmelmann and Weigelt (2018), for example, demonstrated some basic gaze properties (i.e., a fixation task, a pursuit task, and a free viewing task) with online webcam-based eye-tracking. Slim and Hartsuiker (2021) and Yang and Krajbich (2021) both successfully replicated a behavioral eye-tracking experiment (a visual world experiment and a food choice task, respectively), although in both studies most participants did not pass the initial calibration/validation phase and were excluded (73% and 61% exclusions, respectively). Moreover, the latter two studies did not directly compare online webcam-based eye-tracking to lab-based eye-tracking. In conclusion, it remains to be established to what extent online webcam-based eye-tracking could be a valid replacement for lab-based eye-tracking, and what the cost would be in terms of capturing cognitive effects on gaze behavior.
The first effect we aimed to replicate was the cascade effect, originally shown by Shimojo et al. (2003). The cascade effect refers to the phenomenon that when people choose which of two presented faces they find more attractive, their gaze is initially distributed evenly between the faces, but then gradually shifts toward the face they eventually choose. Here, we operationalize the cascade effect as the likelihood of looking at the eventually chosen face during the final 100 ms before the decision is reported.
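As a rough illustration, this operationalization can be computed per trial as the share of gaze samples that fall on the chosen face within the final 100-ms window, averaged across trials. The sketch below assumes hypothetical field names (samples with a timestamp `t` and a `side` label, `responseTime`, `chosenSide`); it is not the analysis code used in the study.

```javascript
// Sketch of quantifying the cascade effect: the proportion of gaze samples
// on the eventually chosen face during the final 100 ms before the response.
// All trial/sample field names are illustrative assumptions.
function cascadeLikelihood(trials) {
  const perTrial = trials
    .map(trial => {
      const windowStart = trial.responseTime - 100; // last 100 ms before the decision
      const inWindow = trial.samples.filter(
        s => s.t >= windowStart && s.t < trial.responseTime
      );
      if (inWindow.length === 0) return null; // no usable samples in the window
      const onChosen = inWindow.filter(s => s.side === trial.chosenSide).length;
      return onChosen / inWindow.length;
    })
    .filter(p => p !== null);
  // Average across trials: chance level is 0.5 with two faces on screen.
  return perTrial.reduce((a, b) => a + b, 0) / perTrial.length;
}
```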
All participants gave informed consent before taking part in the study. The task was computerized and completed online. In the first part of the experiment, participants provided some demographic information, and we verified that they had a working webcam and that it was placed correctly on their computer. Next, participants saw an instruction screen detailing optimal conditions for webcam-based eye-tracking (see Semmelmann & Weigelt, 2018). Once participants indicated that they had set up their environment according to these instructions, they proceeded to an eye-tracking calibration phase, in which they were instructed to look at and click on a series of white squares, followed by the main task.
After completing 18 trials, participants received a short debriefing and were thanked for their participation. The demographics and webcam check were programmed in and hosted on Qualtrics, and the eye-tracking part was programmed in PsychoJS and hosted on Pavlovia. For the eye-tracking part, we made use of the WebGazer open-source eye-tracking library (Papoutsaki et al., 2016). The entire study was conducted in English.
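WebGazer runs entirely in the browser, estimates on-screen gaze coordinates from the webcam stream, and self-calibrates from mouse clicks, which is why the calibration phase can simply ask participants to look at and click on-screen targets. The snippet below is a minimal sketch of collecting gaze samples with WebGazer's public API; the sample buffer and its field names are our own illustration, not the experiment code of this study.

```javascript
// Minimal sketch of gaze-sample collection with WebGazer in the browser
// (assumes webgazer.js has been loaded via a <script> tag).
const gazeSamples = []; // illustrative buffer, not the study's own code

window.webgazer
  .setGazeListener((data, elapsedTime) => {
    // data is null until WebGazer has a face/eye model for the current frame
    if (data === null) return;
    gazeSamples.push({ x: data.x, y: data.y, t: elapsedTime }); // px coordinates, ms
  })
  .begin(); // starts the webcam stream and the self-calibrating gaze regression
```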
Based on our preregistered exclusion criteria, no trials were excluded because the corrected midline deviated by more than 25% of the total screen width from the true midline; no trials were excluded because the standard deviation of the corrected midline exceeded 25% of the screen width; 12% of the measurement points were excluded because they fell outside all of the specified AOIs; and 16% of the trials were excluded because the response time was below 0.5 s or above 30 s.
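To make these criteria concrete, the sketch below applies them to hypothetical trial objects; the field names (`midlineDev`, `midlineSD`, `rt`, `aoi`) and the example screen width are our own assumptions, not the preregistered analysis code.

```javascript
// Illustrative application of the preregistered exclusion criteria.
// Field names and the screen width are hypothetical assumptions.
const SCREEN_W = 1920; // total screen width in px (example value)

function applyExclusions(trials) {
  return trials
    .filter(t =>
      Math.abs(t.midlineDev) <= 0.25 * SCREEN_W && // corrected midline within 25% of true midline
      t.midlineSD <= 0.25 * SCREEN_W &&            // corrected midline stable within the trial
      t.rt >= 0.5 && t.rt <= 30                    // response time between 0.5 s and 30 s
    )
    .map(t => ({
      ...t,
      samples: t.samples.filter(s => s.aoi !== null), // drop points outside every AOI
    }));
}
```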
The second effect we replicated was the novelty preference effect. This effect refers to the finding that people are more likely to attend to new stimuli than to stimuli they have already seen. This effect is typically demonstrated with the visual paired-comparison task and has been shown by Crutcher et al. (2009), among others.
The first part of the procedure was the same as in Study 1 (i.e., informed consent, demographics and webcam check, and instructions about optimal conditions for webcam-based eye-tracking, followed by a calibration phase). After this first part, participants proceeded to the main novelty preference task.
Each trial of the main task started with a fixation cross (2000 ms), followed by a familiarization phase which consisted of two identical images on the left and right side of the screen (5000 ms). After the familiarization phase, participants saw a black screen (2000 ms), followed by the test phase (5000 ms), in which participants saw two images: one that was the same as the one presented during the familiarization phase and another one that was novel. The left or right positioning of the novel stimulus was randomized across trials. Each experimental trial ended with a black screen (7000 ms; Fig. 4). There were 10 trials in total. Stimuli were black-and-white, horizontally oriented clipart images selected from the Snodgrass and Vanderwart (1980) database. They were presented on a light gray background, centered vertically, were 472 × 331 px in size, and were 295 px apart from each other.
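For illustration, one such trial can be encoded as a simple timeline of phases with durations; the structure below is our own sketch (not the PsychoJS experiment code), with the side of the novel image randomized per trial as described above.

```javascript
// Illustrative encoding of one novelty-preference trial as a phase timeline
// (durations in ms); the structure and names are assumptions for exposition.
function makeTrial(familiarImage, novelImage) {
  const novelOnLeft = Math.random() < 0.5; // side of the novel image randomized per trial
  return [
    { phase: 'fixation',        duration: 2000 },                    // central fixation cross
    { phase: 'familiarization', duration: 5000,
      left: familiarImage, right: familiarImage },                   // two identical images
    { phase: 'blank',           duration: 2000 },                    // black screen
    { phase: 'test',            duration: 5000,
      left:  novelOnLeft ? novelImage    : familiarImage,
      right: novelOnLeft ? familiarImage : novelImage },             // familiar vs. novel
    { phase: 'blank',           duration: 7000 },                    // inter-trial black screen
  ];
}
```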
We replicated the visual world paradigm effect: when people hear utterances while looking at a visual display of common objects, some of which are mentioned in the utterances, they tend to look more at the images of the objects they hear mentioned. This effect has been shown by Huettig and Altmann (2005), among others.
The proportion of time participants looked at the target versus distractors in the online version (top) versus the lab version (bottom) of the experiment. The 400 ms time interval on which we based our analyses is shown in yellow
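The underlying measure can be sketched as a time-binned proportion of looks: within each time bin, the share of gaze samples on the target among all samples falling on the target or a distractor. The bin size, window length, and field names below are illustrative assumptions, not the study's analysis code.

```javascript
// Sketch of a proportion-of-looks time course. Sample fields (t in ms from
// word onset, aoi as 'target', 'distractor', or null) and the bin size are
// illustrative assumptions.
function proportionOfLooks(samples, binMs = 50, totalMs = 4000) {
  const nBins = Math.ceil(totalMs / binMs);
  const onTarget = new Array(nBins).fill(0);
  const onAnyAoi = new Array(nBins).fill(0);
  for (const s of samples) {
    const bin = Math.floor(s.t / binMs);
    if (bin < 0 || bin >= nBins || s.aoi === null) continue; // skip out-of-window/AOI points
    onAnyAoi[bin] += 1;
    if (s.aoi === 'target') onTarget[bin] += 1;
  }
  // Per-bin proportion; null where a bin contains no AOI samples.
  return onTarget.map((n, i) => (onAnyAoi[i] > 0 ? n / onAnyAoi[i] : null));
}
```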
We examined whether alternative ways of analyzing the data could bring the online data closer to the lab data. This could give an indication of which analysis choices are most consequential for the results. To do this, we applied an additional, stricter exclusion criterion to the online data: we excluded participants who indicated at the end of the experiment that their data were unreliable. The results can be found in Table 2.
There are several reasons that could explain why the effect sizes of the online webcam-based studies were smaller than those of lab-based eye-tracking studies. For example, it is possible that webcam-based eye-tracking leads to an underestimation of the true effect size. This could be due to smaller numerators (i.e., if the mean differences between the test values of online webcam-based eye-tracking are smaller) or larger denominators (i.e., if the standard deviations of online webcam-based eye-tracking are larger). The third study of the current paper revealed both that the numerator of online webcam-based eye-tracking was 41% smaller than that of lab-based eye-tracking, and that the denominator was 32% larger.
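A quick back-of-the-envelope calculation (our own arithmetic, not the paper's analysis code) shows how these two changes combine multiplicatively for a standardized effect size:

```javascript
// A 41% smaller numerator (mean difference) and a 32% larger denominator
// (standard deviation) imply an online effect size of roughly
// (1 - 0.41) / (1 + 0.32) of the lab value.
const shrinkage = (1 - 0.41) / (1 + 0.32);
console.log(shrinkage.toFixed(2)); // ≈ 0.45, i.e., a ~55% smaller effect size
```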
At the same time, original effect sizes often overestimate the true effect sizes. It has been shown that the effect sizes of replications are typically 50% smaller than those of the original studies (Camerer et al., 2018). This is argued to be caused by exaggerated effect size estimates in the existing literature due to a combination of publication bias and questionable research practices (e.g., Simmons et al., 2011; Sterling, 1959). In the third study of the current paper, we indeed found the effect size of the lab-based replication to be lower than that of the original study. However, the effect size of the online webcam-based replication was even lower than that of the lab-based replication, so replication per se could not fully account for the effect size shrinkage in online webcam-based eye-tracking. This indicates that the decreased effect sizes of online webcam-based eye-tracking are probably caused by a combination of both factors: overestimation in the original studies and noisier webcam-based measurement.