The peril of Uber's plan to screen out riders with bad ratings

Drivers who get consistently low ratings are forced off the platform - the same fate will soon apply to impolite passengers

One notable feature of the mobile ride-hailing company Uber is that customers give drivers ratings, with five stars signaling excellence and one star indicating unacceptable service. Drivers who get consistently low ratings are forced off the platform. At the same time, the platform allows drivers to rate riders - although thus far those ratings haven’t been put to much use except in extreme cases.

But the company just announced a shift: it plans to start screening out riders whose ratings don’t make the cut.

Perhaps understandably, drivers have applauded the move. It might give customers an incentive to be more courteous, or at least establish some baseline level of civility. And it brings about a pleasing symmetry: Drivers have stressed about their ratings for years; it seems only fair to make riders sweat a bit, too.

The economist in me might even want Uber to go further: instead of simply banning rude or insufferable riders, it could charge them higher prices. That’s more or less the way credit ratings work: they raise the cost of borrowing for individuals deemed to be bigger credit risks.
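
To make the pricing analogy concrete, here is a minimal sketch, in Python, of what a ratings-based surcharge might look like. The rating bands and multipliers are invented purely for illustration; nothing like this has been announced by Uber.

```python
# Hypothetical ratings-based pricing, analogous to risk-based loan pricing.
# The bands and surcharges below are illustrative assumptions only.

def nuisance_multiplier(rider_rating: float) -> float:
    """Map a rider's average star rating to a fare multiplier."""
    if rider_rating >= 4.5:
        return 1.00  # courteous riders pay the base fare
    if rider_rating >= 4.0:
        return 1.10  # mild surcharge
    if rider_rating >= 3.5:
        return 1.25
    return 1.50      # riders near the cutoff pay the steepest premium

base_fare = 20.00
print(f"Fare for a 3.7-star rider: {base_fare * nuisance_multiplier(3.7):.2f}")
```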

The growing ubiquity and impact of ratings in our lives are real concerns. Nevertheless, in ride-hailing, ratings-based pricing may have significant advantages.

It could motivate riders to improve their behavior, in much the same way that consumers try to raise their credit scores by making payments on time or avoiding excessive debt. This might actually do more to improve behavior in the Uber ecosystem than outright disqualification would, since barred riders might simply try to skirt the ban by opening new accounts. Indeed, without collecting more detailed user information, there isn’t much Uber can do to stop riders from sidestepping their ratings by getting new phone numbers and credit cards.
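
A hedged sketch suggests why that loophole is hard to close. Suppose the platform tries to link new signups to banned accounts by comparing hashed identifiers; the field names and matching rule below are assumptions for illustration. A rider who changes both phone number and payment card simply never matches.

```python
# Illustrative ban-evasion check: compare hashed identifiers from a new
# signup against those of banned accounts. All values are made up.

import hashlib

def fingerprint(value: str) -> str:
    """Hash an identifier so it can be compared without storing it in the clear."""
    return hashlib.sha256(value.strip().lower().encode()).hexdigest()

banned_fingerprints = {
    fingerprint("+1-555-0100"),       # phone number from a banned account
    fingerprint("4111111111111111"),  # card number from a banned account
}

def matches_banned_account(phone: str, card: str) -> bool:
    """Flag a signup if any identifier matches a banned account's."""
    return bool({fingerprint(phone), fingerprint(card)} & banned_fingerprints)

# A fresh phone AND a fresh card slip through - exactly the loophole above.
print(matches_banned_account("+1-555-0199", "4000056655665556"))  # False
```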

But there are challenges in using Uber’s rider scores like credit ratings.

The first problem is verifiability: whereas credit ratings are based on verifiable transactions and repayment history, Uber ratings are based on hearsay and subjective assessments of driver and rider performance. That means they’re more likely to reflect prejudices. Moreover, when there’s a dispute, it’s hard to ascertain what happened. That has already caused problems. For example, there’s a scam called - I kid you not - vomit fraud, in which drivers claim riders got sick while in transit and tack on phony clean-up fees of as much as $150 (Dh550).

A second problem is reciprocity: credit companies rate us, but we never get to rate them in return, no matter how much we may want to. Uber drivers and riders rate each other, and it’s already common for drivers to ask riders to give them five stars in exchange for a reciprocal five-star rating. The more riders’ ratings matter, the more willing riders will be to make these sorts of exchanges. That would reduce the amount of actual information the ratings contain.

Theoretically, that’s where algorithms come in - a good algorithm should be able to separate the signal in ratings from the noise. If a driver or rider gets a large number of five-star ratings that are followed by complaints after the fact, for example, the algorithm should infer that some sort of ratings pressure may be going on.
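
As a rough illustration of that inference, one could flag accounts whose five-star ratings are disproportionately followed by complaints. The data shape and threshold below are assumptions for the sake of the sketch, not Uber’s actual method.

```python
# Sketch of a ratings-pressure check: what share of a driver's five-star
# trips were later followed by a complaint? The cutoff is illustrative.

def ratings_pressure_score(trips: list[dict]) -> float:
    """Share of five-star trips followed by an after-the-fact complaint."""
    five_star = [t for t in trips if t["stars"] == 5]
    if not five_star:
        return 0.0
    return sum(t["complaint_filed"] for t in five_star) / len(five_star)

trips = [
    {"stars": 5, "complaint_filed": True},
    {"stars": 5, "complaint_filed": True},
    {"stars": 5, "complaint_filed": False},
    {"stars": 4, "complaint_filed": False},
]

if ratings_pressure_score(trips) > 0.25:  # hypothetical cutoff
    print("Possible ratings pressure: discount these five-star ratings.")
```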

But even algorithms sometimes reflect inherent biases in the data, or implicit biases in the data-generating process. Perhaps taking customers to the airport is especially lucrative, which makes drivers prefer those rides (and thus give those riders higher ratings); this could result in a bias in favor of wealthier jetsetters or businesspeople. Or if drivers don’t enjoy traveling to pick up passengers in low-income areas, the algorithm might infer a problem with riders based there, and this could quickly translate into a bias against the poor. Of course, we’d like our rating algorithms to self-correct for these biases, too - but while there’s hope, doing so is notoriously difficult.
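
One simple form of self-checking, sketched here with invented data and an arbitrary tolerance, is to audit average rider ratings across pickup areas and flag large gaps for human review; a real audit would need far more data, controls and care.

```python
# Toy bias audit: compare average rider ratings by pickup area and flag
# suspiciously large gaps. Data and tolerance are invented for illustration.

from collections import defaultdict
from statistics import mean

trips = [
    ("airport", 4.9), ("airport", 4.8),
    ("low_income_zone", 4.2), ("low_income_zone", 4.1),
]

ratings_by_area = defaultdict(list)
for area, stars in trips:
    ratings_by_area[area].append(stars)

averages = {area: mean(stars) for area, stars in ratings_by_area.items()}
gap = max(averages.values()) - min(averages.values())
if gap > 0.5:  # hypothetical tolerance
    print(f"Rating gap of {gap:.2f} stars across areas - audit for bias.")
```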

And indeed, Uber has already faced concerns about whether its rating system indirectly reinforces discrimination on the platform. (Of course, credit-rating companies routinely face the same sort of problem and have tried to develop solutions.)

So as Uber starts making riders’ ratings count, there’s serious work to be done to ensure that the system is as objective and evenhanded as possible.

Yet Uber already has experience dealing with these issues on the driver side. And almost nobody - other than the bad apples themselves - would argue that the ride-hailing ecosystem would be better if Uber didn’t screen out unsafe or otherwise unacceptable drivers. It’s hard to make a case that enforcement on the rider side should be any different.

Bloomberg