Category Archives: user engagement

In 2014, Heather O’Brien, E. Yom-Tov and I published a 100+ pages article on measuring user engagement:

M. Lalmas, H. O’Brien and E. Yom-Tov. Measuring User Engagement, Synthesis Lectures on Information Concepts, Retrieval, and Services, Morgan & Claypool Publishers, 2014

A pre-prodcution PDF version can be found here. Enjoy.



Bounce, Shallow, Deep & Complete: Four Levels of User Engagement in News Reading

This is the second blog post on a paper that will be presented at WSDM 2016 [1], on metrics of user engagement using viewport time. This work is in collaboration with Dmitry Lagun, and was carried out while Dmitry was a student at Emory University, and as part of a Yahoo Faculty Research and Engagement Program.

In a previous blog post, I motivated using viewport time (the time a user spends viewing an article at a given viewport position) as a coarse, but more robust instrument to measure user attention during news reading. In this blog, I describe how to use viewport time to classify each individual page view into one of the following four levels of user engagement:


Four levels of User Attention

  • A Bounce indicates that users do not engage with the article and leave the page relatively quickly. We adopt 10 seconds dwell time threshold to determine a Bounce. Other thresholds can be used, for example accounting for genre (politics versus sport) or device (mobile versus tablet)
  • If the user decides to stay and read the article but reads less than 50% of it, we categorize such a page view as Shallow engagement, since the user has not fully consumed the content. The percentage of article read is defined as the proportion of the article body (main article text) having a viewport time longer than 5 seconds. Note that using 50% is rather arbitrary and used only to distinguish between extreme cases of shallow reading and consumption of the entire article.
  • On the other hand, if the user decides to read more than 50% of the article content, we refer to this as Deep engagement, since the user most likely needs to scroll down the article, indicating greater interest in the article content.
  • Finally, if after reading most of the article the user decides to interact (post or reply) with comments, we call such experience Complete. The users are fully engaged with the article content to the point of interacting with its associated comments.

To understand what insights these engagement levels actually bring, we compare them with four sets of measures, by reporting the mean and standard errors between them and viewport time broken down for the engagement levels. This analysis is based on the viewport data for 267,210 page views on an online news website from Yahoo.

  • Dwell time.
  • Viewport time for article header (usually a title which may include small image thumbnail), body (main body of the article), and comment block.
  • Percentage of the total viewport time spent viewing one of the above mentioned regions.
  • Comment clicks showing the number of clicks on the comment block.

Means and standard errors of the fined grained measures for Bounce, Shallow, Deep and Complete (all with statistical significant differences)

Dwell time and viewport time on head, body and comment increase from Bounce to Complete. We also note that the distribution of the percentage of time among these blocks changes in an interesting manner. The viewport time on head steadily decreases from 0.31 for Bounce to 0.09 for Complete indicating that users spend an increasing amount of time reading content deeper in the article. The percentage of article read steadily increases from Bounce to Complete, as expected. With respect to this, Bounce (12%) and Shallow (23%) clearly represent low levels of engagement with the article, since less than 25% of the article was read. On the other hand, Deep and Complete correspond to the situations when the majority (83%) of the article was read. The number of comment clicks is highest for Complete (3.14), followed by Shallow (0.43), suggesting that users may engage with comments even if they do not read a large proportion of the article.


Mean viewport time at different viewport positions for each of the levels of user engagement (thickness of the line corresponds to the standard error of the mean)

Finally, we compute the average viewport time at varying vertical position. Each of the four curves corresponds to one of the engagement levels. For the page views in the Bounce case, users rarely scroll down the page, whereas many users in the Shallow case spend approximately another 5 seconds of viewport time at lower scroll positions. Deep engagement is characterized by significant time spent on the entire article (peak at the first screen amounts to about 50 seconds) with a steady position decay of the viewport time towards the bottom. Interestingly, the viewport time profile for Complete engagement no longer monotonically decays with the position; instead it has a bi-modal shape. This could be due to a significant time users spend viewing and interacting with comments, which are normally placed right after the main article content.

Our analysis shows that the proposed four user engagement levels are intuitive, and bring more refined insights about how user engage with articles, than using dwell time alone. We recall that the engagement levels can be derived using the viewport time information, which can be computed through scalable and non-intrusive instrumentation. In a future blog post, I will describe how viewport time can be used to predict these levels of engagement based on the textual topics of a news article.

Viewport time: From user engagement to user attention in online news reading

This is the first blog post on a paper that will be presented at WSDM 2016 [1], on metrics of user engagement using viewport time. This work is in collaboration with Dmitry Lagun, and was carried out while Dmitry was a student at Emory University, and as part of a Yahoo Faculty Research and Engagement Program.

Figure 1 (a): Example page showing a pattern of user attention, which is the most common one when reader’s attention decays monotonically towards the bottom of the article.

Figure 1 (a): Example page showing the most common pattern of user attention, where the reader attention decays monotonically towards the bottom of the article.

Figure 1 (b): Example page showing a pattern of user attention with an unusual distribution of attention indicating that content positioned closer to the end of the article attracts significant portion of user attention.

Figure 1 (b): Example page showing an unusual distribution of attention, indicating that content positioned closer to the end of the article attracts significant portion of user attention.

Online content providers such as news portals constantly seek to attract large shares of online attention by keeping their users engaged. A common challenge is to identify which aspects of the online interaction influence user engagement the most, so that users spend time on the content provider site. This component of engagement can be described as a combination of cognitive processes such as focused attention, affect and interest, traditionally measured using surveys. It is also measured through large-scale analytical metrics that assess users’ depth of interaction with the site. Dwell time, the time spent on a resource (for example a webpage or a web site) is one such metric, and has proven to be a meaningful and robust metric of user engagement in many contexts.

However, dwell time has limitations. Consider Figure 1 above, which shows examples of two webpages (news articles) of a major news portal, with associated distribution of time users spend at each vertical position of the article. The blue densities on the right side indicate the average amount of time users spent viewing a particular part of the article. We see two patterns:

In (a) users spend most of their time towards the top of the page, whereas in (b) users spend significant amount of time further down the page, likely reading and contributing comments to the news articles. Although the dwell time for (b) is likely to be higher (the data indeed shows this), it does not tell us much about user attention on the page, neither it allows us to differentiate between consumption patterns with similar dwell time values.

Many works have looked at the relationship between dwell time and properties of webpages, leading to the following results:

  • A strong tendency to spend more time on interesting articles rather than on uninteresting ones.
  • A very weak correlation between article length and associated reading times, indicating that most articles are only read in parts, not in their entirety. When these two correlate, they do so only to some extent, suggesting that users have a maximum time-budget to consume an article.
  • The presence of videos and photos, the layout and textual features, and the readability of the webpage can influence the time users spend on a webpage.

However, dwell time does not capture where on the page users are focusing, namely the user attention. Hence, the suggestion of using other measurements to study user attention.

Studies of user attention using eye-tracking provided numerous insights about typical content examination strategies, such as top to bottom scanning of web search results. In the context of news reading, gaze is a reliable indicator of interestingness and correlates with self-reported engagement metrics, such as focused attention and affect. However, due to the high cost of eye-tracking studies, a considerable amount of research was devoted to finding more scalable methods of attention measurement, which would allow monitoring attention of online users at large scale. Mouse cursor tracking was proposed as a cheap alternative to eye-tracking. Mouse cursor position was shown to align with gaze position, when users perform a click or a pointing action in many search contexts, and to infer user interests in webpages. The ratio of mouse cursor movement to time spent on a webpage is also a good indicator of how interested users are in the webpage content, and cursor tracking can inform about whether users are attentive to certain content when reading it, and what their experience was.

However, despite promising results, the extent of coordination between gaze and mouse cursor depends on the user task e.g. text highlighting, pointing or clicking. Moreover, eye and cursor are poorly coordinated during cursor inactivity, hence limiting the utility of mouse cursor as an attention measurement tool in a news reading task, where minimal pointing is required. Thus, we propose to use instead viewport time to study user attention.

Viewport is defined as the position of the webpage that is visible at any given time to the user. Viewport time is the time a user spends viewing an article at a given viewport position.

Viewport time has been used as an implicit feedback information to improve search result ranking for subsequent search queries, to help eliminating position bias in search result examination, and to detect bad snippets and improve search result ranking in document summarization. Viewport time was also successfully used to infer user interest at sub-document level on mobile devices, and was helpful in evaluating rich informational results that may lack active user interaction, such as click.

Our work adds to this body of works, and explores viewport time, as a coarse, but more robust instrument to measure user attention during news reading.

Figure 2. Distribution of viewport time averaged across all page views.

Figure 2. Distribution of viewport time averaged across all page views.

Figure 2 shows the viewport time distribution computed from all page views on a large sample of news articles. It has a bi-modal shape with the first peak occurring at approximately 1000 px and the second, less pronounced peak at 5000 px, suggesting that most page views have the viewport profile that falls between cases (a) and (b) of Figure 1. This also shows that on average user spends significantly smaller amount of time at lower scroll positions – the viewport time decays towards the bottom of the page. The fact that users spend substantially less time reading seemingly equivalent amount of text (top versus bottom of the article) may also explain the weak correlation between article length and the dwell time reported in several works.

Although users often remain in the upper part of an article, some users do find the article interesting enough to spend significant amount of time at the lower part of the article, and even to interact with the comments. Thus, some articles entice users to deeply engage with their content.

In this paper, we build upon this observation and employ viewport data to develop user engagement metrics that can measure to what extent the user interaction with a news article follows the signature of positive user engagement, i.e., users read most of the article and read/post/reply to a comment. We then develop a probabilistic model that accounts for both the extent of the engagement and the textual topic of the article. Through our experiments we demonstrate that such model is able to predict future level of user engagement with a news article significantly better than currently available methods.

Story-focused reading in online news

I worked for several years with Janette Lehmann as part of her PhD looking at user engagement across sites. This blog post describes our work on inter-site engagement in the context of online news reading. The work was done in collaboration with Carlos Castillo and Ricardo Baeza-Yates [1].

Online news reading is a common activity of Internet users. Users may have different motivations to visit a news site: some users want to remain informed about a specific news story they are following, such as an important sport tournament or a contentious important political issue; others visit news portals to read about breaking news and remain informed about current events in general.

story-focusWhile reading news, users sometimes become interested in a particular news item they just read, and want to find more about it. They may want to obtain various angles on the story, for example, to overcome media bias or to confirm the veracity of what they are reading. News sites often provide information on different aspects or components of a story they are covering. They also link to other articles published by them, and sometims even to articles published by other news sites or sources. An example of an article having links to others is shown on the right, at the bottom of the article.

We performed a large-scale analysis of this type of online news consumption: when users focus on a story while reading news. We referred to this as story-focused reading. Our study is based on a large sample of user interaction data on 65 popular news sites publishing articles in English. We show that:

  • Story-focused reading exists, and is not a trivial phenomenon. This type of news reading differs from a user daily consumption of news.
  • Story-focused reading is not simply a consequence of the fact that some stories are more popular, have more articles written about them, or covered by more news providers.
  • Story-focused reading is driven by the interest of the users. Even users that can be considered as casual news readers (they only read few articles) engage in story-focused reading.
  • When engaged in story-focused reading, users spend more time reading and visit more news providers. Only when users read many articles about a story, the reading time decreases. Our analysis suggests that this could be due to news articles containing mostly the same information.

The strategies that readers employ to find articles related to a story depend on how deep they want to delve into the story. If users are only reading a few articles about a story, they tend to gather all information from a single news site. In the case of deeper story-focused reading, where users are interested in the story details or specific information, they often use search and social media sites to access sites. Furthermore, many users are coming from less popular news sites and blogs, which makes sense, because blogs frequently link their posts to mainstream news sites when discussing an event and users are following these links to likely gather further information or confirm the veracity of what they are reading.

Strategies that keep users engaged with a news site include recommending news articles to users or integrating interactive features (e.g., multimedia content, social features, hyperlinks) into news articles. News providers can promote story-focused reading and increase engagement by linking their articles to other related content. Embedding links to related content into news articles and hyperlinks in general are an important factor that influences the stages of engagement (period of engagement, disengagement, and re-engagement). Having internal links within the article text promotes story-focused reading and as a result keeps users engaged:

It leads to a longer period of engagement (reading sessions are longer) and earlier re-engagement (shorter absence time). Providing links to external content does not have a negative effect on user engagement; the period of engagement remains the same (reading sessions are the same), and the re-engagement begins even sooner (shorter absence time).

This does not mean that news providers should just provide links; they should provide the right ones in terms of quantity and quality. The type, the position, and the number of links play an important role. Users tend to click on links that bring them to other news articles within the same news site, or to articles published by less known sources, probably because they provide new or less mainstream information. However, it is not a good strategy to offer too many such links, as this is likely to confuse or annoy users. Too many inline links can have detrimental effect on users’ reading experience. Finally, when engaged in story-focused reading, users tend to click on links that are close to the end of the article text.

The linking strategies of news providers affect the way users engage with their news sites, which by itself is not new. However, our results are in contradiction with the linking strategy that aims at keeping users as long as possible on a site by linking to other content on the site.

Instead, it can be beneficial (long-term) to entice users to leave the site (e.g., by offering them interesting content on other sites) in a way that users will want to return to it.

News providers could adapt their sites when they identify a user engaging in story-focused reading in various ways:

  • Such information could be integrated in the personalised news recommender of the news site. Story-related articles in the news feed could be highlighted or content frames containing information and links related to the story could be presented on the front page.
  • It might be also beneficial to provide and link to topic pages containing latest updates, background information, blog entries, eye witness reports, etc. related to the story.

Story-focused reading also brings new opportunities for news providers to drive traffic to their sites by collecting the most interesting articles and statements around a story, i.e., becoming a news story curator, and publishing them via social media channels or email newsletters.

Promoting Positive Post-Click Experience for Native Advertising

Since September 2013, I have been working on user engagement in the context of native advertising. This blog post describes our first paper on this work, published at the Industry Track of ACM Knowledge Discovery & Data Mining (KDD) conference in 2015 [1]. This is work in collaboration with Janette Lehmann, Guy Shaked, Fabrizio Silvestri and Gabriele Tolomei.

Feed-based layouts, or streams, are becoming an increasingly common layout in many applications, and a predominant interface in mobile applications. In-stream advertising has emerged as a popular online advertising because it offers a user experience that fits nicely with that of the stream, and is often referred to as native advertising. In-stream or native ads have an appearance similar to that of the items in the stream, but clearly marked with a “Sponsored” label or a currency symbol e.g. “$” to indicate that they are in fact adverts.

A user decides if he or she is interested in the ad content by looking at its creative. If the user clicks on the creative he or she is redirected to the ad landing page, which is either a web page specifically created for that ad, or the advertiser homepage. The way user experiences the landing page, the ad post-click experience, is particularly important in the context of native ads because the creatives have mostly the same look and feel, and what differs mostly is their landing pages. The quality of the landing page will affect the ad post-click experience.

A positive experience increases the probability of users “converting” (e.g., purchasing an item, registering to a mailing list, or simply spending time on the site building an affinity with the brand). A positive post-click experience does not necessarily mean a conversion, as there may be many reasons why conversion does not happen, independent of the quality of the ad landing page. A more appropriate proxy of the post-click experience is the time a user spends on the ad site before returning back to the publisher site:

“the longer the time, the more likely the experience was positive”

The two most common measures used to quantify time spent on a site are dwell time and bounce rate. Dwell time is the time between users clicking on an ad creative until returning to the stream; bounce rate is the percentage of “short clicks” (clicks with dwell time less than a given threshold). On a randomly sampled native ads served on a mobile stream, we showed that these measures were indeed good proxies of post-click experience.

We also saw that users clicking on ads promoting a positive post-click experience, i.e. small bounce rate, were more likely to click on ads in the future, and their long-term engagement was positively affected.

Focusing on mobile, we found that a positive ad post-click experience was not just about serving ads with mobile-optimised landing pages; other aspects of an landing page affect the post-click experience. We therefore put forward a learning approach that analyses ad landing pages, and showed how these can predict dwell time and bounce rate. We experimented with three types of landing page features, related to the actual content and organization of the ad landing page, the similarity between the creative and the landing page, and ad past performance. The later type were best at predicting dwell time and bounce rate, but content and organization features performed well, and have the advantages to be applicable for all ads, not only for those that have been served.

Finally, we deployed our prediction model for ad quality based on dwell time on Yahoo Gemini, an unified ad marketplace for mobile search and native advertising, and validated its performance on the mobile news stream app running on iOS. Analyzing one month data through A/B testing, returning high quality ads, as measured in terms of the ad post-click experience, not only increases click-through rates by 18%, it has a positive effect on users: an increase in dwell time (+30%) and a decrease in bounce rate (-6.7%).

This work has progressed in two ways. We have improved the prediction model using survival random forests and considered new landing page features, such as text readability and the page structure [2]. We are also working with advertisers to help improving the quality of their landing pages. More about this in the near future.

Cursor movement and user engagement measurement

Many researchers have argued that cursor tracking data can provide enhance ways to learn about website visitors. One of the most difficult website performance metrics to accurately measure is user engagement, generally defined as the amount of attention and time visitors are willing to spend on a given website and how likely they are to return. Engagement is usually described as a combination of various characteristics. Many of these are difficult to measure, for example, focused attention and affect. These would traditionally be measured using physiological sensors (e.g. gaze tracking) or surveys. However, it may be possible that this information could be gathered through an analysis of cursor data.

This work [1] presents a study that asked participants to complete tasks on live websites using their own hardware in their natural environment. For each website two interfaces were created: one that would appear as normal and one that was intended to be aesthetically unappealing, as shown below. The participants, who were recruited through a crowd-sourcing platform, were tracked as they used modified variants of the Wikipedia and BBC News websites. There were asked to complete reading and information-finding tasks.

wiki_normal wiki_ugly








The aim of the study was to explore how cursor tracking data might tell us more about the user than could be measured using traditional means. The study explored several metrics that might be used when carrying out cursor tracking analyses. The results showed that it was possible to differentiate between users reading content and users looking for information based on cursor data. They also showed that the user’s hardware could be predicted from cursor movements alone. However, no relationship between cursor data and engagement was found. The implications of these results, from the impact on web analytics to the design of experiments to assess user engagement, are discussed.

This study demonstrates that designing experiments to obtain reliable insights about user engagement and its measurement remains challenging. Not finding a signal may not necessary means that the signal does not exist, but that some of the metrics used were not the correct ones. In hindsight, this is what we believe happened. The cursor metrics were not the right ones to differentiate between the levels of engagement experience as examined in this work. Indeed, recent work [2] showed that more complex mouse movement metrics did correlate with some engagement metrics.

  1. David Warnock and Mounia Lalmas. An Exploration of Cursor tracking Data. ArXiv e-prints, February 2015.
  2. Ioannis Arapakis, Mounia Lalmas Lalmas and George Valkanas. Understanding Within-Content Engagement through Pattern Analysis of Mouse Gestures, 23rd International Conference on Information and Knowledge Management (CIKM), November 2014.

Online, users multitask

screenshot multitaskingWe often access several sites within an online session. We may perform one main task (when we plan a holiday, we often compare offers from different travel sites, go to a review site to check hotels), or several totally unrelated tasks in parallel (responding to an email while reading news). Both are what we call online multitasking. We are interested in the extent to which multitasking occurs, and whether we can identify patterns.

Our dataset

Our dataset consists of one month of anonymised interaction data from a sample of 2.5 millions users who gave their consent to provide browsing data through a toolbar. We selected 760 sites, which we categorised according to the type of services they offer. Examples of services include mail, news, social network, shopping, search, and sometimes cater to different audiences (for example, news about sport, tech and finance). Our dataset contains 41 million sessions, where a session ends if more than 30 minutes have elapsed between two successive page views. Finally, continuous page views of the same site are merged to form a site visit.

How much multitasking in a session?

On average, 10.20 distinct sites are visited within a session, and for 22% of the visits the site was accessed previously during the session. More sites are visited and revisited as the session length increases. Short sessions have on average 3.01 distinct sites with a revisitation rate of 0.10. By contrast, long sessions have on average 9.62 different visited sites with a revisitation rate of 0.22.

We focus on four categories of sites: news (finance), news (tech), social media, and mail. We extract for each category a random sample of 10,000 sessions. As shown in Figure 1 below, the sites with the highest number of visits within a session belong to the social media category (average of 2.28), whereas news (tech) sites are the least revisited sites (average of 1.76). The other two categories have on average 2.09 visits per session.

Visits and absence time
Figure 1: Site visit characteristics for four categories of sites: (Left) Distribution of time between visits; and (Right) Average and standard deviation of number of visits and time between visits.

What happens between the visits to a site?

We call  the time between visits to a site within the session absence time. We see three main patterns with the four categories of sites, as shown in Figure 1 above (right):

  • social media sites and news (tech) sites have an average absence time of 4.47 minutes and 3.95 minutes, respectively, although the distributions are similar;
  • news (finance) sites have a skewer distribution, indicating a higher proportion of short absence time for sites in this category;
  • mail sites have the highest absence time, 6.86 minutes on average.

However, the media of the distributions of the absence time across all categories of sites is less than 1 minute, and this for all categories. That is, many sites are revisited after a short break. We speculate that a short break corresponds to an interruption of the task being performed by the user (on the site), whereas a longer break indicates that the user is returning to the site to perform a new task.

How do users switch between sites?

Users can switch between sites in several ways:

  1. hyperlinking: clicking on a link,
  2. teleporting: jumping to a page using bookmarks or typing an URL,  or
  3. backpaging: using the back button on the browser, or when several tabs or windows are ope and the user returns to one of them).

The way users revisit sites varies depending on the session length. Teleporting and hyperlinking are the most important mechanisms to re-access a site during short sessions (30% teleporting and 52% hyperlinking for short sessions), whereas backpaging becomes more predominant in longer sessions. Tabs or the back button are often used to revisit a site.

Patterns of multitasking
Figure 2: (Top) Visit patterns described by the average time spent on the site at the ith visit in a session. (Bottom) Usage of navigation types described by the proportion of each navigation type at the ith visit in a session.

We also look at how users access a site at each revisit, for the four categories of sites. This is shown in Figure 2 (bottom).

  • For all four categories of sites, the first visit is often through teleportation. Accessing a site in this manner indicates a high level of engagement, in particular in terms of loyalty, with the site, since users are likely to have bookmarked the site at some previous interaction with it. In our dataset, teleportation is more frequently used to access news (tech) sites than news (finance) sites.
  • After the first visit, backpaging is increasingly used to access a site. This is an indication that users leave the site by opening a new tab or window, and then return to the site later to continue whatever they were doing on the site.
  • However, in general, users still revisit a site mostly through hyperlinking, suggesting that links still have an important role in directing users to a site. In our dataset, news (finance) sites are mostly accessed through links; users are directed to sites of this category via a link.

Time spent at each revisit

For each site, we select all sessions where the site was visited at least four times. We see four main patterns, which are shown in Figure 2 (top):

  • The time spent on social media sites increases at each revisit (a case of increased attention). The opposite is observed for mail sites (a case of decreased attention). A possible explanation is that, for mail sites, there are less messages to read in subsequent visits, whereas for social media sites, users have more time to spend on them eventually because the other tasks they were doing are getting finished.
  • News (finance) is an example of category for which neither a lower or higher dwell time is observed at each subsequent revisit (a case of constant attention). We hypothesise that each visit corresponds either to a new task or a user following some evolving piece of information such as checking the latest stock price figures.
  • The time spent on news (tech) sites at each revisit is fluctuating. Either no patterns exist or the pattern is complex, and cannot easily be described (a case of complex attention). However, when looking at the first two visits or the last two visits, in both cases, more time is spent in each second visit. This may indicate that the visits belong to two different tasks, and each task is performed in two distinct visits to the site. Teleportation is more frequent at the 1st and 3rd visits, which confirms this hypothesis (Figure 2, bottom).

Take away message

Multitasking exists, as many sites are visited and revisited during a session. Multitasking influences the way users access sites, and this depends on the type of site.

This work was done in collaboration with Janette Lehmann, Georges Dupret and Ricardo Baeza-Yates. More details about the study can be found in  Online Multitasking and User Engagement, ACM International Conference on Information and Knowledge Management (CIKM 2013), 27 October – 1 November 2013, San Francisco, USA.

Photo credits: D&D (Creative Commons BY).

How engaged are Wikipedia users?

Wikipedia Recently, we were asked: “How engaged are Wikipedia users?” To answer this question, we visited Alexa, a Web Analytics site, and learned that Wikipedia is one of the most visited sites in the world (ranked 6th), that users spend on average around 4:35 minutes per day on Wikipedia, and that many visits to Wikipedia come from search engines (43%). We also found studies about readers’ preferences, Wikipedia growth, and Wikipedia editors. There is however little about how users engage with Wikipedia, in particular about those not contributing content to Wikipedia.

Can we do more?

Beside reading and editing articles, users perform many other actions: they look at the revision history, search for specific content, browse through Wikipedia categories, visit portal sites to learn about specific topics, or visit the community portal. Although discussing an article is a sign of a highly engaged user, performing several actions within the same visit to Wikipedia is also a sign of a highly engaged user. It is this latter type of engagement we looked into.

Action networks

action_networkWe collected 13 months (September 2011 to September 2012) of browsing data from an anonymized sample of approximately 1.3M users.  We identified 48 actions such as reading an article, editing, opening an account, donating, visiting a special page. We then built a weighted action network: nodes are the actions and two nodes are connected by an edge if the two corresponding actions were performed during the same visit to Wikipedia. Each node has  a weight representing the number of users performing the corresponding action (the node traffic). Each edge has a weight representing the number of users that performed the two corresponding actions (the traffic between the two nodes).

Engagement over time

We use the following metrics to measure engagement on Wikipedia based on actions:

  • TotalNodeTraffic: total number of actions (sum of all node weights)
  • TotalEdgeTraffic: total number of pairwise actions (sum of all edge weights)
  • TotalTrafficRecirculation: actual network traffic with respect to maximum possible traffic (TotalEdgeTraffic/TotalNodeTraffic).

We calculated these metrics for the 13 months under consideration and plotted their variations over time. An increase in TotalNodeTraffic means that more users visited Wikipedia. An increase in TotalTrafficRecirculation means that more users performed at least two actions while on Wikipedia, our chosen indicator of high engagement in Wikipedia. We observe that TotalNodeTraffic increased first then became more or less stable. By contrast, TotalTrafficRecirculation mostly decreased, but we see a small peak in January 2011.

rcTraffic_monthlyTwo important events happened in our 13-month period. During the donation campaign (November to December 2011) more users visited Wikipedia (higher TotalNodeTraffic value). We speculate that many users became interested in Wikipedia during the campaign. However, because TotalTrafficRecirculation actually decreased for the same period, although more users visited Wikipedia, they did not perform two (or more) actions while visiting Wikiepedia; they did not become more engaged with Wikipedia. However, during the SOPA/PIPA protest (January 2012), we see a peak in TotalNodeTraffic and TotalTrafficRecirculation. More users visited Wikipedia and many users became more engaged with Wikipedia; they also read articles, gathered information about the protest, donated money while visiting Wikipedia.

rcTraffic_weekdays+endWe detected different engagement patterns on weekdays and weekends. Whereas more users visited Wikipedia during weekdays (high value of TotalNodeTraffic), users that visited Wikipedia during the weekend were more engaged (high value of TotalTrafficRecirculation). On weekends, users performed more actions during their visits.

People behave differently on weekdays compared to weekends. The same happens with Wikipedia.

Did the donation campaign make Wikipedia more engaging?

meaganmakes - 182-365+1 [cc] - 2 So which actions became more frequent as a result of the donation campaign? As expected, we observed a significant traffic increase on the “donate” node during the two months; many users made a donation. In addition, the traffic from some nodes to other nodes  increased but only slightly. Additional actions were performed;  for instance, more users created a user account, visited community-related pages, all within the same session. However, overall, users mostly performed individual actions since TotalTrafficRecirculation decreased during that time period.

So the campaign was successful in terms of donation, but less in terms of making Wikipedia more engaging.

This is a write-up of the presentation given by Janette Lehmann at TNETS Satellite, ECCS, Barcelona, September 2013.

Measuring user engagement for the “average” users and experiences: Can psychophysiological measurement help?

3081315619_fe0647a5d8_mI recently attended the Input-Output conference in Brighton, UK. The theme of the conference was “Interdisciplinary approaches to Causality in Engagement, Immersion, and Presence in Performance and Human-Computer Interaction”. I wanted to learn about  psychophysiological measurement.

I am myself on a quest: understand what is user engagement and how to measure it, with a focus on web applications with thousands to millions of users. To this end, I am looking at three measurement approaches: self-reporting (e.g., questionnaires); observational methods (e.g., facial expression analysis, mouse tracking); and of course web analytics (dwell time, page views, absence time).

Observational methods include measurement from psychophysiology, a branch of physiology that studies the relationship between physiological processes and thoughts, emotions, and behaviours. Indeed, the body responds to physiological processes: when we exercise, we sweat; when we get embarrassed, our cheeks get red and warm.

relaxCommon measurements include:

  • Event-related potentials – the electroencephalogram (EEG) is based on recordings of electrical brain activity measured at the surface of the scalp.
  • Functional magnetic resonance imaging (fMRI) – this technique involves imaging blood oxygenation using an MRI machine
  • Cardiovascular measures – heart rate (HR); beats per minute (BPM); heart rate variability (HRV).
  • Respiratory sensors – monitor oxygen intake and carbon dioxide output.
  • Electromyographic (EMG) sensors – measure electrical activity in muscles.
  • Pupillometry – measures measure variations in the diameter of the pupillary aperture of the eye in response to psychophysical and/or psychological stimuli.
  • Galvanic skin response (GSR) – measures perspiration/sweat gland activity, also called Skin Conductance Level  (SCL).
  • Temperature sensors – measure changes in blood flow and body temperature.

I learned how these measures are used, why, and some outcomes. But I started to ask myself. Yes these measures can help understanding engagement (and other related phenomena) for extreme cases, for example:

  • patient with a psychiatric disorder (such as depersonalisation disorder),
  • strong emotion caused by an intense experience (a play where the audience is part of the stage, or when on a roller coaster ride), or
  • total immersion (while playing a computer game), which actually goes beyond engagement.

In my work, I am measuring user engagement for the “average” users and experiences; millions of users who visit a news site on a daily basis to consume the latest news. Can these measures tell me something?

Some recent work published in the Journal of Cyberpsychology, Behavior, and Social Networking explored many of the above measures to study the body responses of 30 healthy subjects during a 3-minute exposure to a slide show of natural panoramas (relaxation condition), their personal social network account (Facebook), and a mathematical task (stress condition). They found differences in the measures depending on the condition. Neither the subjects nor the experiences were “extreme”. However, the experiences were different enough. Can a news portal experiment with three comparably distinct conditions?

Psychophysiology measurement can help understanding user engagement and other  phenomena. But to be able to do so for the average users or experiences, we are likely to need to conduct “large-ish scale” studies to obtain significant insights.

How large-ish? I do not know.

This is in itself an interesting and important question to ask, a question to keep in mind when exploring these types of measurement, as they are still expensive to conduct, cumbersome, and obtrusive. This is a fascinating area to dive into.

Image/photo credits: The Cognitive Neuroimaging Laboratory, and Image Editor and benarent ((Creative Commons BY).

Today I am giving a keynote at the 18th International Conference on Application of Natural Language to Information Systems (NLDB2013), which is held at MediaCityUK, Salford.

I have now started to think at what are the questions to ask when evaluating user engagement. In the talk, I discuss these questions through five studies we did. Also included are questions asked when

  • evaluating serendipitous experience in the context of entity-driven search using social media such as Wikipedia and Yahoo! Answers.
  • evaluating the news reading experience when links to related articles are automatically generated using “light weight” understanding techniques.

The slides are available on Slideshare.

Relevant published papers include:

I will write about these two works in later posts.