Airbnb at KDD 2023

Alex Deng
The Airbnb Tech Blog

10 min read

Dec 22, 2023

KDD (Knowledge Discovery and Data Mining) is a flagship conference in data science research. Hosted annually by a special interest group of the Association for Computing Machinery (ACM), it’s where you’ll learn about some of the most ground-breaking developments in data mining, knowledge discovery, and large-scale data analytics.

Airbnb had a significant presence at KDD 2023, with two papers accepted into the main conference proceedings and 11 talks and presentations. In this blog post, we’ll summarize our team’s contributions and share highlights from an exciting week of research talks, workshops, panel discussions, and more.

Although search ranking is a problem that researchers have been working on for decades, there are still many nuances to explore. For example, at Airbnb, guests are often searching over a period of days or weeks, not minutes. And because Airbnb is a two-sided marketplace, there are factors, like the potential for hosts to cancel the booking, that we’d like to account for in ranking.

Optimizing Airbnb Search Journey with Multi-task Learning, our paper accepted at KDD 2023, presents Journey Ranker, a new multi-task deep learning model. The core insight here is that for this kind of long-term search task, we want to optimize for intermediate steps in the user journey.

The Journey Ranker base module assists guests in reaching positive milestones. There’s also a Twiddler module that assists guests in avoiding negative milestones. The modules work off a shared feature representation of listing and guest context, and their output scores are combined.
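To make the two-module idea concrete, here is a minimal toy sketch of a ranker with a shared representation, a base head for positive milestones, and a twiddler head for negative milestones. Everything in it is an assumption for illustration: the class name `ToyJourneyRanker`, the tiny hand-set weights, and in particular the combination rule (positive score minus a weighted negative score) are hypothetical and are not taken from the paper.

```python
import math
from dataclasses import dataclass
from typing import List


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def dot(weights: List[float], values: List[float]) -> float:
    return sum(w * v for w, v in zip(weights, values))


@dataclass
class ToyJourneyRanker:
    """Hypothetical two-head scorer; not the model from the paper."""
    shared_w: List[List[float]]  # one weight row per shared hidden unit
    base_w: List[float]          # head for positive milestones
    twiddler_w: List[float]      # head for negative milestones
    penalty: float = 1.0         # assumed weight on the negative score

    def shared(self, features: List[float]) -> List[float]:
        # Shared representation of listing and guest context.
        return [math.tanh(dot(row, features)) for row in self.shared_w]

    def score(self, features: List[float]) -> float:
        hidden = self.shared(features)
        p_positive = sigmoid(dot(self.base_w, hidden))
        p_negative = sigmoid(dot(self.twiddler_w, hidden))
        # Assumed combination rule: reward positive, penalize negative.
        return p_positive - self.penalty * p_negative


ranker = ToyJourneyRanker(
    shared_w=[[0.5, -0.2], [0.1, 0.4]],
    base_w=[1.0, 0.5],
    twiddler_w=[-0.3, 0.8],
)
listing_features = [1.0, 2.0]
print(ranker.score(listing_features))
```

The point of the sketch is only the structure: both heads read the same hidden vector, so either head can be swapped or re-weighted without touching the other.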

Because of its modular design, Journey Ranker can be used whenever there are positive or negative milestones to consider. We’ve implemented it in several Airbnb search and other products to drive improvements in business metrics.

We also co-presented a tutorial on Data-Centric AI (DCAI). DCAI is a fast-growing field in deep learning because, as model design matures, innovation is increasingly driven by data. We shared DCAI best practices and trends for developing training data, developing inference data, maintaining data, and creating benchmarks, with many examples from working with LLMs.

Online experimentation (e.g., A/B testing) is a common way for organizations like Airbnb to make data-driven decisions. But high variance is frequently a challenge. For example, it’s hard to prove that a change in our search UX will drive value when bookings are infrequent and depend on a large number of interactions over a long period of time.

Our paper Variance Reduction Using In-Experiment Data: Efficient and Targeted Online Measurement for Sparse and Delayed Outcomes presents two new methods for variance reduction that rely solely on in-experiment data:

  1. A framework for a model-based leading indicator metric that continually estimates progress toward a delayed binary outcome.
  2. A counterfactual treatment exposure index that quantifies how much a user is impacted by the treatment.

In testing, both methods achieved a variance reduction of 50% or more. These techniques have greatly improved our experimentation efficiency and impact.

With more than 50% variance reduction, the new model-based leading indicator metric (listing-view utility, on the right) aligns with the target uncancelled booking metric much better than other indicators such as listing-view with dates (on the left).
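To put a 50% variance reduction in perspective: for a standard two-sample comparison with a fixed effect size, significance level, and power, the required sample size is proportional to the metric's variance, and the confidence interval width is proportional to its square root. The small sketch below works through that arithmetic. The function names are hypothetical, and this is generic experiment math rather than anything specific to the paper's methods.

```python
import math


def relative_sample_size(variance_reduction: float) -> float:
    """Fraction of the original sample size needed for the same power.

    Assumes sample size scales linearly with metric variance, which holds
    for a standard two-sample z-test at fixed effect size and power.
    """
    if not 0.0 <= variance_reduction < 1.0:
        raise ValueError("variance_reduction must be in [0, 1)")
    return 1.0 - variance_reduction


def relative_ci_width(variance_reduction: float) -> float:
    """Confidence interval width relative to the original, at equal sample size."""
    return math.sqrt(relative_sample_size(variance_reduction))


reduction = 0.5
print(relative_sample_size(reduction))  # half the sample for the same power
print(relative_ci_width(reduction))     # about 0.71 of the original width
```

In other words, halving the variance lets an experiment reach the same power with roughly half the traffic, or about half the run-time if traffic arrives at a steady rate.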

Another interesting challenge in online experimentation is avoiding interference bias, which can happen when your A/B test subjects compete with each other. For example, if you ran an A/B test where group B saw lower booking prices, they might “cannibalize” the bookings from group A. There are two imperfect solutions: clustering (isolating the options for participants) and switchbacks (grouping participants by time intervals). Airbnb presented a keynote talk on this topic at KDD’s 2nd Workshop on Decision Intelligence and Analytics for Online Marketplaces.
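As a rough illustration of the switchback idea, the sketch below assigns everyone who arrives in the same time window to the same arm, so that the two arms do not compete for the same inventory at the same moment. The function name `switchback_arm`, the one-hour default window, and the hash-based assignment are all assumptions made for this example; the talk itself is not the source of any of these details.

```python
import hashlib
from datetime import datetime, timezone


def switchback_arm(ts: datetime, window_minutes: int = 60, salt: str = "exp-1") -> str:
    """Return "A" or "B" for the time window that contains `ts`.

    The arm depends only on the window index (and a per-experiment salt),
    never on the individual user, so all traffic in a window shares one arm.
    """
    window_index = int(ts.timestamp()) // (window_minutes * 60)
    digest = hashlib.sha256(f"{salt}:{window_index}".encode()).digest()
    return "B" if digest[0] % 2 else "A"


early = datetime(2023, 8, 6, 10, 5, tzinfo=timezone.utc)
late = datetime(2023, 8, 6, 10, 55, tzinfo=timezone.utc)
print(switchback_arm(early), switchback_arm(late))  # same window, same arm
```

Hashing the window index keeps the schedule deterministic and reproducible; a real design would also need to handle carry-over effects between adjacent windows, which this sketch ignores.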

Also at the workshop, we presented the paper The Price is Right: Removing A/B Test Bias in a Marketplace of Expirable Goods. It discusses the problem of lead-day bias, which arises when items like concert tickets, air travel, and Airbnb bookings vary in value based on the distance from their expiration date. This can wreak havoc on A/B tests, and in the paper we present several mitigation strategies, such as limited rollout, smart overlapping of experiments, and a Heterogeneous Treatment Effect (HTE) remixed estimator, to correct for bias and accelerate the R&D process.

Together with limited rollout and smart overlapping of experiments, the HTE-remixed estimator can provide a sufficiently robust estimate of the long-term experiment impact from the short-term result and significantly shorten the experiment run-time.

In marketing, the million-dollar question is: how much should you spend per channel? This can be reframed as a causal inference problem: how many incremental conversions does each channel drive?

When we look at marketing activities across Nielsen’s Designated Market Areas (DMAs), we find moderate to strong correlation across channels. This makes it hard to isolate the impact of one channel from another. In fact, when we include the correlated channels in the same regression, the coefficients flip signs for most channels, a clear sign of multicollinearity.

Existing solutions to multicollinearity, such as shrinkage estimators, principal component analysis, and partial linear regression, are particularly helpful for prediction problems but work less well for our use case, where we need to maintain business interpretability while isolating causality. Our approach, described in the paper Hierarchical Clustering as a Novel Solution to Multicollinearity, is to hierarchically cluster DMAs based on their similarity in marketing impressions over time. With such clustering, cross-channel correlation dropped by as much as 43% and the channel coefficients no longer flip signs.
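The clustering step can be sketched in a few lines. The example below groups DMAs by how similarly their impression series move over time, using a plain average-linkage agglomerative procedure. It is a minimal sketch under stated assumptions: the distance (one minus the Pearson correlation), the average linkage, the fixed target number of clusters, and the toy data are all choices made for illustration, and the paper may use different ones. It also assumes no series is constant, since the correlation is undefined in that case.

```python
import math
from itertools import combinations
from typing import Dict, List


def pearson(x: List[float], y: List[float]) -> float:
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def cluster_dmas(series: Dict[str, List[float]], n_clusters: int) -> List[List[str]]:
    """Average-linkage agglomerative clustering on correlation distance."""
    dist = {
        frozenset((a, b)): 1.0 - pearson(series[a], series[b])
        for a, b in combinations(series, 2)
    }

    def linkage(c1: List[str], c2: List[str]) -> float:
        # Mean pairwise distance between members of the two clusters.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [[name] for name in series]
    while len(clusters) > n_clusters:
        # Merge the two closest clusters, then repeat.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


# Made-up weekly impressions for four hypothetical DMAs.
impressions = {
    "dma_a": [10, 20, 30, 40],
    "dma_b": [12, 22, 33, 41],
    "dma_c": [40, 30, 20, 10],
    "dma_d": [39, 31, 22, 9],
}
print(cluster_dmas(impressions, n_clusters=2))
```

With this toy data, the two DMAs whose impressions rise together end up in one cluster and the two whose impressions fall together end up in the other.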

Not only does our method provide an intuitive and effective solution to multicollinearity, it also circumvents the need for complex transformation and preserves the interpretability of the data and the results throughout, which makes it broadly applicable to causal inference problems.

We presented this paper at the new KDD workshop, Causal Inference and Machine Learning in Practice: Use cases for Product, Brand, Policy, and beyond. Airbnb’s Totte Harinen co-organized this workshop, which strongly resonated with KDD’s audience: it had 12 papers and 4 invited talks from 37 authors in 14 institutions.

In addition, we were invited to present two talks and one poster at KDD’s 2nd Workshop on End-End Customer Journey Optimization, and joined the workshop’s panel discussion. One of these talks covered CLV (customer lifetime value) modeling. At Airbnb, we want to grow our brand and community by growing all users. Our CLV ecosystem applies two frameworks:

  1. The value of Airbnb customers. We use traditional ML approaches along with research into more customer-lifecycle-focused architectures (i.e., HMMs). We augment this with demand-supply incrementality modeling to properly account for guest and host contributions to value.
  2. The value growth that Airbnb delivers to customers. By accounting for long-term incremental effects of booking on Airbnb along with incremental contributions from marketing and attribution systems, we can measure incremental changes in CLV and optimize toward them.

Causal inference can also be applied to search. At the Customer Journey Optimization workshop, we presented our paper Low Inventory State: Identifying Under-Served Queries for Airbnb Search, which explored the problem of searches that return a low number of results. Whether or not that number is “too low” and will deter a guest from booking depends on search parameters and intent to book. For a given search query, we can use causal inference to determine the incremental effect of an additional result on the probability of booking. Our model outperforms non-causal methods and can help with supply management as well.

Finally, our poster discussed how we measure the effects of national TV advertising campaigns. We combined TV exposure data and demographic data with data on Airbnb onsite behavior using a third-party identity graph. We were able to resolve disparate datasets to a unique identifier and model individual households.

We use propensity score matching to estimate TV effects, and then scale these estimates to a nationally representative population. We leverage this data to provide tactical insights for marketing and understand how long TV effects take to decay.
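For readers unfamiliar with propensity score matching, here is a minimal sketch of the matching step. It assumes each household already has an estimated propensity (the probability of seeing the ad given its covariates), pairs every exposed household with the unexposed household whose propensity is closest, and averages the outcome differences. The `Household` fields, the caliper value, matching with replacement, and the toy data are assumptions for illustration, not details from the poster.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Household:
    exposed: bool      # saw the TV ad
    propensity: float  # estimated P(exposed | covariates), assumed given
    booked: float      # outcome, e.g. 1.0 if the household booked


def matched_effect(households: List[Household], caliper: float = 0.05) -> Optional[float]:
    """Average outcome difference between exposed households and their
    nearest-propensity unexposed match (with replacement).

    Exposed households with no match within `caliper` are dropped.
    Returns None if nothing could be matched.
    """
    exposed = [h for h in households if h.exposed]
    unexposed = [h for h in households if not h.exposed]
    if not unexposed:
        return None
    diffs = []
    for h in exposed:
        match = min(unexposed, key=lambda c: abs(c.propensity - h.propensity))
        if abs(match.propensity - h.propensity) <= caliper:
            diffs.append(h.booked - match.booked)
    return sum(diffs) / len(diffs) if diffs else None


sample = [
    Household(True, 0.80, 1.0),
    Household(True, 0.30, 0.0),
    Household(False, 0.79, 0.0),
    Household(False, 0.31, 0.0),
    Household(False, 0.10, 1.0),
]
print(matched_effect(sample))
```

This yields an effect estimate for the exposed households only; scaling it to a nationally representative population, as described above, is a separate reweighting step that the sketch does not attempt.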

The plot above (from a simulated study, for illustration) shows the results of an analysis for a TV campaign that ran from August to October. We can see that the TV campaign was effective at increasing bookings for households that saw an Airbnb TV ad and was more effective for one subgroup (purple line) than the other subgroup.

How can you achieve science at scale in a medium-to-large engineering organization? At KDD’s 2nd Workshop on Applied Machine Learning Management, we shared Airbnb’s solution for data science reproducibility and reuse, Onebrain. The core of Onebrain is a coding standard for configuring data science projects entirely in YAML. Onebrain’s backend abstracts away CI/CD, configuration/dependency management, and command-line parsing. Since it’s “just code,” Onebrain projects can be checked into a version-controlled repo, and any repo can be a Onebrain repo.

User interaction with Onebrain happens via a CLI. With a single command, anyone can use an existing project as a template for their own work, or generate a one-click URL to spin up a server and run the project. Usage is growing fast, with over 200 distinct projects and over 500 users at Airbnb within just a year.

While most of our research focuses on high-order data use cases like models, data capture is critical since it’s the starting point for any analysis. Event logging libraries typically capture actions on and impressions of app components (buttons, sections, pages). But with this level of granularity, it can be difficult to abstract out user behavior, measure the total time spent on a surface, or understand the context surrounding an action.

At the 2nd Workshop on End-End Customer Journey Optimization, we spoke about a new type of client-side event called Sessions. Part of Airbnb’s client-side logging solution, Sessions provide a way to track user context and behaviors across the Airbnb product. Unlike traditional time-based sessions used in web analytics, these Sessions can be tied to various aspects of the Airbnb user experience. For example, they can be tied to specific surfaces like the checkout page, API calls used for observability, or even internal states of the app that abstract away complex UI components. The flexibility of Sessions allows us to capture a wide range of user interactions and better understand their journey throughout our platform.
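One way to picture a client-side event of this kind is as a record that opens when the user enters a surface, collects the actions taken there together with their context, and closes when the user leaves. The sketch below is purely hypothetical: the `SurfaceSession` class, its fields, and the example values are invented for illustration and do not describe Airbnb's logging API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SurfaceSession:
    """Hypothetical session record tied to one surface or app state."""
    name: str                      # e.g. a surface such as the checkout page
    started_at: float              # start time in seconds
    context: Dict[str, str] = field(default_factory=dict)
    actions: List[str] = field(default_factory=list)
    ended_at: Optional[float] = None

    def record(self, action: str) -> None:
        # Actions inherit the session's context instead of repeating it.
        self.actions.append(action)

    def end(self, at: float) -> None:
        self.ended_at = at

    @property
    def duration(self) -> Optional[float]:
        """Total time spent on the surface, or None while still open."""
        if self.ended_at is None:
            return None
        return self.ended_at - self.started_at


session = SurfaceSession("checkout_page", started_at=100.0, context={"listing_id": "123"})
session.record("tap_confirm")
session.end(at=160.5)
print(session.duration, session.actions, session.context)
```

Compared with isolated button or impression events, a record like this makes time on a surface a single field and keeps every action attached to the context it happened in.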

KDD is an amazing opportunity for data scientists from around the world, and across industry and academia, to come together and exchange learnings and discoveries. We were honored to be invited to share techniques we’ve developed through applied research at Airbnb. The techniques and insights we presented at KDD have been essential to improving Airbnb’s platform, business, and user experience. We’re constantly motivated by innovations happening around us, and we’re thrilled to give back to the community and eager to see what kinds of new applications and developments may come about as a result.

Below, you’ll find a full list of the talks and papers shared in this article along with the team members who contributed. If you can see yourself on our team, we encourage you to apply for an open role today.

Optimizing Airbnb Search Journey with Multi-task Learning

Authors: Chun How Tan, Austin Chan, Malay Haldar, Jie Tang, Xin Liu, Mustafa Abdool, Huiji Gao, Liwei He, Sanjeev Katariya

Variance Reduction Using In-Experiment Data: Efficient and Targeted Online Measurement for Sparse and Delayed Outcomes

Authors: Alex Deng, Michelle Du, Anna Matlin, Qing Zhang

Beyond the Simple A/B Test: Mitigating Interference Bias at Airbnb

Speaker: Ruben Lobel

The Price is Right: Removing A/B Test Bias in a Marketplace of Expirable Goods

Authors: Thu Le, Alex Deng

Unveiling the Guest & Host Journey: Session-Based Instrumentation on Airbnb Platform

Speaker: Shant Torosean

Dedicated to Long-Term Journey: Growing Airbnb Through Measuring Customer Lifetime Value

Speakers: Sean O’Donell, Jason Cai, Linsha Chen

Low Inventory State: Identifying Under-Served Queries for Airbnb Search

Authors: Toma Gulea, Bradley Turnbull

Measuring TV Campaigns at Airbnb

Speakers: Adam Maidman, Sam Barrows

Tutorial: Data-Centric AI

Presenters: Daochen Zha, Huiji Gao

Hierarchical Clustering As a Novel Solution to the Notorious Multicollinearity Problem in Observational Causal Inference

Authors: Yufei Wu, Zhiying Gu, Alex Deng, Jacob Zhu, Linsha Chen

Onebrain: Microprojects for Data Science

Authors: Daniel Miller, Alex Deng, Narek Amirbekian, Navin Sivanandam, Rodolfo Carboni