- Credit-card data is one of the original alternative-data streams used by hedge funds and other investors.
- But the pandemic has disrupted consumer spending habits, and data vendors have been forced to do more hand-holding with clients.
- Banks are using techniques like post-stratification weighting and “swarming” to help make sense of data.
One of the earliest and most popular forms of alternative data is proving more difficult to handle these days.
Investors like hedge funds have long leaned on credit-card data to help suss out everything from new retail trends to the health of specific businesses.
But the coronavirus pandemic has transformed shopping habits, and that’s prompted some tough discussions in the ecosystem of data buyers and sellers.
“Relying on set-it-and-forget-it type models may not be a good idea because your outcome may be completely wrong,” Inna Kuznetsova, CEO of 1010data, told Business Insider.
For one hedge fund, Coatue’s $350 million quant fund, that means returning all outside capital while the firm reworks its strategy. It had pulled back severely from the markets in early April thanks to the instability and uncertainty in its data feeds.
One source had told Business Insider that Coatue uses an e-commerce data set that showed a spike in visits to retailers’ websites while physical locations were closed — which could have overstated how well certain retailers were doing overall.
How credit-card data is getting skewed
Some of the ways that stay-at-home orders have skewed credit-card data are relatively easy for investors to predict and work around.
Spending on board games and web cameras, which saw explosive growth in the spring, is likely unsustainable and only temporary, Kuznetsova said. A hedge fund that specializes in video games has soared this year, partially thanks to people turning to new games while stuck at home.
Other shifts, however, are more nuanced.
For instance, Kuznetsova said there were huge increases in the size of purchases per customer at quick-service restaurants even as the overall number of visits decreased.
The shift, she said, was likely due to a single member of the household buying meals for everyone as opposed to each person getting their own. In reality, people’s spending habits weren’t necessarily changing, but credit-card data made it appear as if one person had tripled his or her spending.
“Depending on what criteria, what indicator you use for your investment decisions, you risk making a very wrong decision unless you combine several different factors and understand what is really happening in this sector,” Kuznetsova said.
“This is the difference between using multiple sources of data and looking at the complexity, trying to track and understand what happens, as opposed to automating all the predictive analysis.”
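The quick-service restaurant effect Kuznetsova describes can be illustrated with a small sketch. The household, card IDs, and dollar amounts below are hypothetical and are not drawn from 1010data's figures; the point is only that per-card and per-visit metrics can triple while total spending stays flat.

```python
def summarize(transactions):
    """Summarize a list of (card_id, amount) tuples."""
    total = sum(amount for _, amount in transactions)
    visits = len(transactions)
    cards = {card for card, _ in transactions}
    return {
        "total_spend": total,
        "visits": visits,
        "avg_ticket": total / visits,
        "spend_per_card": total / len(cards),
    }

# Hypothetical household of three, $10 per person per meal.
# Before: each person pays for their own meal on their own card.
before = [("card_a", 10.0), ("card_b", 10.0), ("card_c", 10.0)]
# During stay-at-home orders: one person buys for everyone.
during = [("card_a", 30.0)]

before_stats = summarize(before)
during_stats = summarize(during)

for key in before_stats:
    print(key, before_stats[key], "->", during_stats[key])
```

In this toy case, total spend is unchanged, visits fall, and both average ticket size and spend per card triple. A model keyed to only one of those indicators would read the same behavior very differently.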
How banks and investors are reworking alt-data
That’s not to say credit-card data has been rendered entirely useless. But firms are changing how they work with and interpret the figures.
Ryan Preclaw, who co-leads Barclays’ data-science research group, told Business Insider the bank applies something called post-stratification weighting. The approach takes in all of the data and reweights it at the end to account for discrepancies between individuals’ usage, he said.
Post-stratification weighting differs from traditional panels used in credit-card data where the behaviors of a group are extrapolated out to represent a wider audience — a concept similar to that of Nielsen TV ratings, where a smaller number of people are considered representative of a wider viewership.
“You’re taking a broader pool of people. You’re less likely to have accidentally selected somebody who ends up with a non-representative behavior change,” Preclaw said.
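For readers unfamiliar with the term, here is a minimal sketch of post-stratification weighting in its textbook form. It is not Barclays' implementation, which the article does not detail. The age strata, spending figures, and population shares are hypothetical, and `poststratified_mean` is an illustrative helper name.

```python
from collections import defaultdict

def poststratified_mean(panel, population_share):
    """Reweight stratum averages by known population shares.

    panel: list of (stratum, value) observations.
    population_share: dict mapping stratum -> share of the real population
    (shares are assumed to sum to 1).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for stratum, value in panel:
        sums[stratum] += value
        counts[stratum] += 1
    missing = set(population_share) - set(counts)
    if missing:
        raise ValueError(f"no panel observations for strata: {sorted(missing)}")
    # Average within each stratum, then combine using population shares.
    return sum(
        share * sums[stratum] / counts[stratum]
        for stratum, share in population_share.items()
    )

# Hypothetical panel that over-represents younger cardholders.
panel = [("under_35", 120.0)] * 3 + [("35_plus", 60.0)]
population_share = {"under_35": 0.25, "35_plus": 0.75}

raw = sum(value for _, value in panel) / len(panel)
weighted = poststratified_mean(panel, population_share)
print("raw panel mean:", raw)
print("post-stratified mean:", weighted)
```

Every observation is kept; the correction happens at the end, when each group's average is scaled to its share of the wider population rather than its share of the panel.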
Jeremy Brunelli, UBS Evidence Lab’s global head of frameworks, told Business Insider that a key piece of the bank’s approach to handling alternative data is something called “swarming.”
The idea is to remain agnostic to the dataset, instead applying any information that could answer the question at hand. The result could be relying on receipt transaction data or app usage instead of just credit-card data.
And while the practice was standard at UBS prior to COVID-19, Brunelli said the pandemic has definitely shifted some of the “channels,” or ways in which people are going about making purchases, which has created additional noise and further demonstrated the benefits of swarming.
“I would say my experience with credit card data is that it can be hit or miss,” Brunelli said.
“It’s not as efficacious as I think people might assume it to be. Some of our clients who are deep, heavy users of alternative data prefer to look at lots of different signals, in a sense, to create a mosaic around a theme.”
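One generic way to read several signals together is to put each on a common scale and then average them. The sketch below does that with z-scores. It is an illustration of the "mosaic" idea only, not UBS's swarming methodology, which the article does not describe in detail. The signal names and readings are hypothetical.

```python
from statistics import mean, stdev

def zscore_latest(history):
    """Standardize the latest reading against the earlier readings."""
    baseline, latest = history[:-1], history[-1]
    spread = stdev(baseline)
    if spread == 0:
        raise ValueError("baseline has no variation")
    return (latest - mean(baseline)) / spread

def composite(signals):
    """Return the average z-score across signals plus the per-signal scores."""
    scores = {name: zscore_latest(series) for name, series in signals.items()}
    return mean(list(scores.values())), scores

# Hypothetical weekly index readings for one retailer; the last value is the newest.
signals = {
    "card_spend": [100, 102, 98, 100, 130],
    "receipts": [100, 101, 99, 100, 101],
    "app_usage": [50, 52, 48, 50, 51],
}

overall, scores = composite(signals)
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
print(f"composite: {overall:.2f}")
```

Here the card-spend series alone shows a dramatic jump, while the other two sources are close to normal, so the combined reading is far more muted than the card data suggests. An equal-weight average is the simplest possible choice; how much weight each source deserves is itself a judgment call.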
While credit-card data “can provide insight into how consumer habits change,” it’s not the end-all, be-all data source some might think it is, said Man Group’s Hinesh Kalian, who is the head of data science for the $104 billion hedge-fund manager.
Man Group uses its data-science team to combine credit-card, geolocation, web traffic, and hiring data to get a more holistic look at the retail landscape. Credit-card data providers have worked to increase the panel size since the pandemic, but “they only capture a portion of the macro experience,” he told Business Insider.
“We don’t solely rely on credit card data,” he added.
Data customers need more handholding, more options
To that point, the coronavirus has given other data sets a chance to shine. Geolocation data, which identifies the physical location of electronic devices, has proven to be extremely useful, Barclays’ Preclaw said.
From understanding overall traffic in public places to seeing activity at factories or restaurants, Preclaw said, geolocation data has been useful to understand reopening progress.
“I think geolocation data is having its moment,” Preclaw said. “It has been, even more than credit-card data right now, the single most important and useful source of insights in our opinion.”
To be sure, no data set has proven to be a silver bullet.
And now, just as data users are making sense of how the pandemic has altered shopping habits, there’s another question they’ll need to consider: which changes will stick around for good?
“There’s a lot of decisions that get made in the course of preparing this stuff for people to use and those decisions really matter,” Preclaw added. “Hopefully the thing that’ll make people more aware of is the fact that they actually have to think about what those decisions are and what the potential consequences are.”
Uncertainty around the future of consumer behavior
Amit Jain is the CEO of Bridg, an alt-data company that tracks spending in the restaurant industry. He primarily sells his data to retailers and to corporations that work with restaurants, like Pepsi, but has worked with Wall Street clients on one-off projects.
The question all clients are trying to answer, though, is the same: “How many of these customers will come back?”
“What no one knows is if the consumer behavior has changed forever,” he told Business Insider.
To that end, Jain has “definitely seen some want for hand-holding from clients.” He’s doubled his data science team — from three people to six — to help crank out research reports on his datasets and help clients better understand the underlying trends they might be missing.
This approach highlights a split that’s been emerging in the alt-data space for some time now — with providers choosing to either prioritize raw feeds that quant funds with significant data-science teams can mold as they see fit, or to build out internal data-science teams that help clients use the data.
Some alt-data providers have been forced to reevaluate their offerings as hedge funds continue to build out their own teams focused on alternative data.
But both sides of the data equation realize the dangers of relying on just one source of data, no matter how predictive past results appear.
Kristina Fan, CEO of 7 Chord, an artificial-intelligence bond trading platform, used an example made famous by “Black Swan” author Nassim Taleb to prove that point on a webinar hosted by the AI & Machine-Learning in Trading conference on Monday.
As the tale goes, for 1,000 days a turkey is fed every day by a farmer, until day 1,001 — a few days before Thanksgiving — when the farmer comes out to kill the turkey, a dramatic way of explaining a “Black Swan” event.
“The turkey learns pretty quickly that his prediction model was off,” Fan said, explaining that the turkey would of course assume the farmer would feed it, not kill it.
But with additional data, Fan said, the turkey could have seen it coming — and been prepared for its “Black Swan.”
“If the turkey sees gravy sales are spiking in the weeks leading up, that could be a sign,” Fan added.