- A group of five alt data firms maintains a blacklist of hedge funds they won’t work with.
- The list currently has 12 funds on it, including some big names, according to one vendor.
- Funds can end up on the list by stealing vendors’ data collection processes, among other ways.
The alternative data business, for all the industry’s growth in recent years, is still a hard nut to crack.
Data vendors need a product with a history and clear money-making ability, as well as a connected sales team to get it in front of the right hedge funds and companies. This data is pulled from thousands of non-traditional sources, like email receipts and cell-phone geolocation trackers, and has become a requirement for many hedge funds looking for an edge. But, with how fast the markets move, one style of data can be the go-to source one day, and rejected the next.
So, when hedge funds try to get around paying for data, vendors are understandably frustrated. It’s gotten to the point that five major vendors, including web-scraping data-seller Thinknum, have for two years now maintained an informal blacklist of funds they won’t work with.
Justin Zhen, the cofounder of Thinknum, told Insider that the list currently has 12 funds on it, including “some big names,” but declined to disclose which funds are on it or the other vendors that contribute to the list. The way a manager ends up on the list is simple: They want what a vendor has, but don’t want to pay for it.
How to get blacklisted
Theft is one of the main ways funds get in hot water with data vendors. Some managers will try to pry from the sales team how other funds are using the data, or rip off the methodology the vendor uses internally to create its datasets.
“For us, we see a lot of firms try to get ideas on our system of crawling,” said Zhen, whose web-scraping firm capitalized on the Reddit-induced market madness earlier this year by creating WallStreetBets-focused datasets. The firm is constantly taking snapshots of different webpages to notice changes the instant they happen.
Then there are funds that take sample data and don’t delete it from their systems afterward, even though nearly all vendors make deletion a condition of getting a sample.
“Usually if a firm will do this to one vendor, they’ll do it to another,” Zhen said. It has never risen to the level of legal action for Thinknum, but it has come close, he said, with some harsh letters sent out.
Other ways to end up on the blacklist are less nefarious, but still hard for vendors to deal with. One main way funds get on vendors’ bad side is by constantly trialing data but never buying it.
“It becomes a drain on resources,” Zhen said.
Funds in this situation are often still figuring out exactly how they’ll use alternative data in their firm, and are trying different techniques. Thinknum and others have patience, but it’s not unlimited.
“It’s not always malicious. If a firm doesn’t buy, that’s ok. They don’t go on the blacklist. But if you sample five times in two years, that’s not ok,” he said.
Still a trust-based business
Once a fund is on the list, it’s not there forever. Often, a blacklisting is tied to an individual at a hedge fund, and once that person leaves for another job, dialogue can open back up, Zhen said. For excessive samplers, coming back with a ready-made offer can be enough to get bumped off the list.
Lorn Davis, VP of corporate and product strategy at Facteus, which is not one of the vendors that use the blacklist, said the expectation is that large hedge funds will try to source their own data to avoid paying companies like his. At website-traffic-tracking vendor SimilarWeb, which also isn’t among the five vendors that share a blacklist, the firm has noticed funds trying to find the same signal its data provides through cheaper datasets.
“You’ve got to convince the user of the long-term value, that it’s not just a short-term buy,” said Ed Lavery, director of investor intelligence for the newly public company.
And protecting your edge in sales conversations is critical, Davis said.
“It’s my job to make sure I don’t divulge proprietary information,” he said. Facteus compiles transaction data from scores of sources, but doesn’t reveal to potential clients where it gets its information, for example.
“This is the game, you have to have a level of trust,” Davis said. “They’re going to test it, you cannot lie about data.”
For funds that end up on a blacklist, uncovering the cause is critical, said Chris Petrescu, a former data executive at ExodusPoint who now runs his own consultancy.
“It could be trivial, like adding a clause in a contract, it might relate to someone that is no longer at the fund, or it could be ego-related,” he said. In the meantime, though, managers find an alternative as fast as possible because, in the data business, “time is money.”