- Recent congressional inquiries into Envestnet’s Yodlee and Avast’s Jumpshot highlight the risks that alternative-data aggregators and their hedge-fund clients face when they rely on datasets that could disappear overnight.
- Business Insider spoke with about a dozen alternative-data providers and consumers to get a sense of how to sell and use a product, respectively, that is not guaranteed to always be there.
- “It’s a problem, and it’s a largely unhedgeable risk,” Tammer Kamel, the cofounder and CEO of Quandl, said.
Tammer Kamel built — and sold — a business around finding unique datasets financial-services clients can’t get anywhere else.
His company, Quandl, which was bought by Nasdaq at the end of 2018, relies on third-party data streams of unique information, such as satellite images and credit-card transactions, to create products hedge funds and asset managers can use to either generate or reinforce their investment strategies.
But collecting rare datasets is only half the battle. Data aggregators like Quandl run the risk that the originators they pull from will shut off with little to no warning. That was the case in 2018, when a data stream Quandl was using essentially went dark overnight after its provider was acquired by a company that no longer wanted to license data.
The issue sits at the core of every data aggregator in the alternative-data space, Kamel told Business Insider.
“With uniqueness comes this risk that if the source evaporates, there is no substitute for it. That is why it was valuable in the first place,” Kamel said. “It is a problem, and it is largely an unhedgeable risk.”
While the shutdown was frustrating, Kamel acknowledged that winding down the data product wasn’t the end of the world. Quandl’s customers, who include quant funds and corporate organizations and number more than 400,000 users on its platform, understood the situation and were willing to continue to work with the company.
But clients aren’t always so understanding, especially if the reason a data source is shut down stems from regulatory issues.
“That’s the nightmare scenario,” Kamel said. “The dataset disappearing, that’s annoying. If the dataset proves to be ill-gotten, that’s the disaster.”
The risk is becoming ever more real as Avast’s Jumpshot has been permanently shut down due to privacy issues while Envestnet’s Yodlee is currently dealing with congressional inquiries. Yodlee, which has data on credit-card transactions, and Jumpshot, which provided metrics on clickstream data, are far from the only ones affected though.
Hedge funds and data aggregators build investment models and research projects off the raw data provided by companies like Yodlee and Jumpshot. There has already been collateral damage. Connexity, a Los Angeles-based online-retailer conglomerate, shut down its website-traffic-analytics arm, Hitwise, within a week of Jumpshot closing because it was extremely reliant on the now-shuttered data stream.
“Any model, any strategy we build, cannot be dependent on one or two data providers,” Yin Lou, the head of Wolfe Research’s quant division, said.
How hedge funds protect themselves
Lou said the firm uses more than 100 outside data providers and that 80% of the providers have a clear backup in case the primary option goes dark. For example, the firm uses RavenPack data for social-media tracking and backs it up with Refinitiv’s social-media data if needed.
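The primary-and-backup arrangement Lou describes can be illustrated with a minimal Python sketch. It is not Wolfe Research’s actual system: the feed functions, the `FeedUnavailable` exception, and the sample payload are all hypothetical stand-ins for real vendor connections.

```python
from typing import Callable, Dict, List, Sequence, Tuple


class FeedUnavailable(Exception):
    """Raised by a fetcher when its provider cannot deliver data (hypothetical)."""


Fetcher = Callable[[], Dict[str, float]]


def fetch_with_backup(
    fetchers: Sequence[Tuple[str, Fetcher]],
) -> Tuple[str, Dict[str, float]]:
    """Try each (name, fetcher) pair in priority order; return the first success."""
    errors: List[str] = []
    for name, fetch in fetchers:
        try:
            return name, fetch()
        except FeedUnavailable as exc:
            # Record the failure and fall through to the next provider.
            errors.append(f"{name}: {exc}")
    raise FeedUnavailable("all providers failed: " + "; ".join(errors))


def primary_feed() -> Dict[str, float]:
    # Hypothetical primary vendor that has gone dark.
    raise FeedUnavailable("license terminated")


def backup_feed() -> Dict[str, float]:
    # Hypothetical backup vendor returning a placeholder value.
    return {"sentiment": 0.12}


source, data = fetch_with_backup(
    [("primary", primary_feed), ("backup", backup_feed)]
)
```

In this sketch the caller learns which provider actually served the data, so a model can record when it is running on the backup rather than the primary.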
While there’s not a risk to the entire business of the largest hedge funds, losing a data feed as widely used as Yodlee could disproportionately affect some teams, Lou said.
“I do know many consumer PMs were using Yodlee data extensively,” he added.
Smaller hedge funds might also run into problems of overreliance on a single dataset, not because they are not careful, but because they can’t afford to pay so many different data providers, Steve Iannini, who runs the alt-data company P Street Advisors, said.
Organizations that can afford both the data streams and the personnel to turn that information into investment ideas make up the top of the hedge-fund industry — quant funds like Two Sigma and Renaissance Technologies and multistrategy funds like Point72, Citadel, and ExodusPoint.
“There’s a relative few number of hedge funds that are really in the data game,” Iannini said.
Campbell & Co., a Baltimore hedge fund that runs billions in its systematic strategies, has built an expectation for errors in data feeds, Kevin Cole, the firm’s chief investment officer, said.
Cole said the firm has had practice dealing with data streams going dark during government shutdowns, which temporarily halt the release of data such as economic output and unemployment figures.
When a data feed is cut off, Campbell has to decide how long a model can run without certain data. If the outage goes on long enough, the manager will allocate away from the model or even shut it down until either the data can be replaced or the feed comes back.
“It should be assumed it will happen, not that it is an exception to the fact,” he said.
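One way to picture that decision is a rule that scales a model’s allocation by how stale its input feed has become. The Python sketch below is only an illustration, not Campbell’s method: the grace period, the cutoff, and the linear taper between them are all assumptions.

```python
from datetime import date


def model_weight(
    last_update: date,
    today: date,
    grace_days: int,
    cutoff_days: int,
    base_weight: float = 1.0,
) -> float:
    """Scale a model's allocation by the age of its most recent data.

    Assumed policy: full weight within the grace period, zero weight at or
    beyond the cutoff, and a linear taper in between.
    """
    if cutoff_days <= grace_days:
        raise ValueError("cutoff_days must be greater than grace_days")
    stale_days = (today - last_update).days
    if stale_days <= grace_days:
        return base_weight
    if stale_days >= cutoff_days:
        return 0.0
    remaining = (cutoff_days - stale_days) / (cutoff_days - grace_days)
    return base_weight * remaining


# Feed last updated 10 days ago, with a 5-day grace period and a 15-day cutoff.
weight = model_weight(date(2020, 1, 1), date(2020, 1, 11), 5, 15)
```

A step function that simply switches the model off at the cutoff would also fit the description; the taper is just one choice among several.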
Some investors are choosing to buy less alternative data altogether, instead taking the matter into their own hands. That’s the tactic for Mike Chen, the director of portfolio management at PanAgora Asset Management, a $43 billion firm.
About half the alternative data the firm ingests is collected by PanAgora itself, Chen told Business Insider. With the explosion of the space in recent years, Chen said it was harder to find datasets that aren’t already being pitched across Wall Street, and datasets lose their value once they are widely distributed.
Chen also said PanAgora’s investing strategy of finding data to reinforce ideas, rather than finding ideas in data, lends itself to sourcing its own data instead of going to vendors.
“We are becoming ever more cynical,” Chen said. “Compared to three to five years ago, PanAgora now has a much higher onboard threshold for alternative datasets. As a result, our adoption rate of external alternative datasets have been lower.”
An industry in need of rules
While the Yodlee scrutiny from Congress has been interpreted as a challenge for the entire alternative-data industry, many providers are hoping it gives more clarity to an industry that many say is still in its Wild West days.
Emmett Kilduff, the founder and CEO of the data aggregator Eagle Alpha, told Business Insider the industry would welcome some help from rulemakers.
Kilduff, whose firm has over 1,200 datasets on its platform, said vendors, intermediaries, and even buyers would be happy to adapt their practices to meet standards that might be set.
“It’s not helpful, frankly, that there is not enough regulation or guidance from regulators or governments,” he said. “It’s a gap that needs to be corrected.”
Iannini considers himself pretty lucky on the regulation front: His company, which is focused on satellite imagery, is one of the few alt-data providers that have a clear legal and regulatory framework to work with.
The government has already laid out the rules for how high the resolution of a picture from a satellite can be, and the biggest buyer of satellite images is the US government, Iannini said, so surprises like the ones that ensnared Yodlee are unlikely.
Others feel the regulatory spotlight would be better trained on another industry: advertisers. Quandl’s Kamel said that while it’s easy to paint Wall Street as the boogeyman making money off people’s personal information, that’s far from the truth.
Investors are happy to have anonymized data, as they are looking for overall market trends. Ad companies are the ones in search of specific data about people to better understand how to sell them products.
“Hedge funds or the finance industry doesn’t give a damn about your personal information,” Kamel said. “The stuff that adtech is doing with your data is far creepier and far more pernicious and far more threatening than anything Wall Street is ever going to do with consumer data because it doesn’t matter. The aggregate is all that matters to Wall Street.”
However, some feel the outlook for data aggregators is much more grim. Marta Lopata, the chief growth officer at Thinknum, told Business Insider firms could consider themselves truly protected only if they sourced their own data.
Lopata said that before she launched Thinknum, which sells companies data that it scrapes from the web, a lot of thought went into where the startup would be best suited to source data. Something like the internet, which is a constant source of never-ending data, seemed like a better bet than sources that might have been outside the company’s control, she added.
“I think the way you can really protect yourself in a space is choosing to bet on data sources that you originate. I think that’s really the future of the alt-data space,” Lopata said. “If you cannot be the originator of the data source, you are going to be at risk because you can’t control whether the data originators will cut you off or the regulations will change, and you’re not completely in control. It’s a very volatile space.”