When history fails you, borrow from geography
How Airbnb used sequential geographic recovery signals and prior propagation to generate reliable corridor-level forecasts when local data was scarce.

By: Harrison Katz

The problem with unprecedented shocks

Almost every forecasting system is built on the same implicit assumption: the future will resemble the past. You train on historical data, you validate on holdout periods, and you trust that past patterns will at least roughly indicate future performance. When this assumption breaks, the model does not gracefully degrade; it fails confidently. It produces precise, apparently well-calibrated intervals around the wrong answer.

The acute phase of COVID, from early to late 2020, was a clear illustration of this, and we wrote about it in a previous post. But the more interesting forecasting problem was not the shutdown. It was everything that came after.

The period from late 2020 through 2022 was not a single coherent regime. It was a sequence of overlapping, asynchronous changes: vaccine rollouts that reached some markets months before others, border reopenings that followed their own country-level timelines, and reclosures triggered by new variants that hit different corridors at different moments. (A corridor is a pairing of the traveler's origin city and destination city.) Demand was not recovering uniformly. It was rebounding unevenly across every corner of the world, in ways that had no historical precedent and no single governing pattern.

The standard response to a shock is to wait for each affected market to accumulate its own post-shock data and retrain locally. But COVID was among the biggest shocks the travel industry has faced in decades. With markets worldwide reopening and reclosing on staggered schedules, waiting for markets to settle meant forecasting blind for months at a time, across all markets, just when timely projections were most needed.

So we started building something different.
When we could not simply look backward in time for relevant examples, we looked sideways across geographies instead.

The insight: geography as a time machine

The key observation was that the recovery was not happening everywhere at once. It was unfolding sequentially, and messily, often punctuated by further reclosings and reopenings. Vaccines reached some markets in early 2021 and others months or quarters later. Some borders reopened in spring and reclosed by autumn. Demand in one corridor could be surging while an adjacent corridor was still effectively shut.

This sequential, asynchronous structure was operationally painful. But it contained more information than was obvious at first glance.

One of the clearest signals we track is the mean lead time for bookings: how far in advance guests book relative to their travel dates, measured as a ratio against the same period in 2019, the last fully pre-pandemic year, which serves as the baseline. When there is a disruption, and the pandemic as a whole was the largest disruption we've ever seen, lead times compress sharply as travelers shorten planning horizons for the trips they do take, then lengthen again as conditions stabilize. The figure below shows this signal for Europe and North America across the major phases of the pandemic.

The key observation is not the shape of either curve in isolation. It is the lag between them. Europe's first wave of booking lead time compression hit in February 2020. North America's came roughly four to six weeks later and followed the same trajectory. The reopening recovery was partial in both regions, because the travelers who returned first were booking short-lead-time trips rather than resuming normal planning horizons. And when vaccine rollout arrived, the direction reversed: North America turned the corner in December 2020, while Europe was still in its second wave trough, and did not begin its recovery until February and March 2021.

Figure 1.
Mean booking lead time as a ratio vs. 2019 baseline, Europe and North America, Feb 2020 to Jun 2021. Each region cycled through similar phases, but on its own timeline. Phase labels reflect the different timing that applies to each region.

Once we could see how demand responded to reopening in one of the two markets, we had a genuine signal about how demand was likely to respond when the other market reopened later. It was not a perfect signal. The markets were distinct, the timing varied, and the traveler mix was somewhat different. But the underlying dynamics were related. Travelers responded to reopened borders, to restored flight routes, and to lifted entry requirements, in ways that were not completely idiosyncratic to each corridor. Corridors in the earlier-reopening market were ahead of the later-reopening market in time, but both were experiencing the same underlying phenomena.

Doing the math for demand increases with reopening

In Bayesian terms, the structure is as follows. A brief glossary: c is a corridor; θ_c denotes the demand parameters for corrid