The Future of Market Data in the Cloud

In a previous series, Exegy covered the rise of machine learning and how, at its base level, machine learning classifies and organizes large datasets and then applies algorithms to make predictions. To make these predictions, firms have to sort through oceans of data, and doing so can be time-consuming and costly. Luckily, the future of this process is market data in the cloud.

In 2020, the world reached over 64 zettabytes of available and globally consumed data. Experts predict that by 2025 we will produce over 180 zettabytes of data. As in all other aspects of the tech race underlying capital market success, firms that find a way to leverage this data will thrive, while others will be left behind.

However, innovations in the cloud space have produced a viable solution for data sourcing by bringing buyers and sellers together in a marketplace. The advent of cloud-based data marketplaces (DMs) is revolutionizing the way data is ingested, analyzed, and shared. Cloud marketplaces currently serve buy-side innovators, hedge fund managers, developers, and quantitative researchers by allowing them to search datasets by budget, use case, date, and size.

Firms no longer need to communicate with an overwhelming number of different vendors, each with different rules and systems. In cloud-based data marketplaces, datasets are consolidated on one platform. This article will go over how cloud-based marketplaces are affecting the way financial data is sourced and analyzed, how DMs operate, what kind of data they offer, and how they can benefit you.

Market Data in the Cloud - BitsandBytes graphic

Data Procurement with Cloud-Based DMs

Data ‘hunters’ can face many challenges when trying to acquire new datasets for their firms. Traditionally, firms have been unable to optimize for cost savings because high demand has been paired with only a few dominant providers.

Firms also struggle with managing current inventory, knowing which datasets add value to existing libraries, managing compliance and entitlements, and implementing data-specific strategies that account for a continuously evolving world.

Market data in the cloud has become popular because it solves most of these costly sourcing issues. Some data marketplace models forgo tiresome and expensive ETL and FTP processes by allowing customers to access datasets without ever leaving the cloud platform. Customers can try datasets before they buy them, run algorithms, and test data against internal resources, all in one location. This saves time and cuts the infrastructure costs of downloading and formatting data and moving it between platforms, steps that certain providers make unnecessary. Some cloud platforms can also handle data entitlement concerns by acting as a middleman between data providers and data consumers.

Benefits to Data Providers

Many data marketplaces also benefit data providers by allowing firms to showcase their own unique datasets and their value analysis. Cloud marketplaces also supply their own value analysis, either through internal quantitative research teams or through the community of users sourcing the data.

Non-traditional data providers and non-financial groups can also benefit as data providers by selling their ‘exhaust data.’ The increasing ease of use of these platforms is driving down data costs, giving firms the opportunity and resources to experiment with more forms of alternative data that these non-traditional sellers can deliver.

Data Marketplaces

Exegy has experience partnering with cloud-based data marketplaces as a provider. Furthermore, our specialty as a company lies in helping our existing and prospective clients source affordable data and create alpha through predictive, machine-learning-based analytics. We are therefore well positioned to survey the available data marketplaces and their offerings.

Most data marketplaces lean towards serving their clients analysis-ready core financial data at a lower cost than most vendors while also vetting hard-to-find alternative datasets on a smaller scale. For firms whose R&D budgets are almost entirely consumed by market data costs, affordable access to the research data needed to grow the business and produce insight can make a huge difference. Core market datasets can enrich existing data without the risks that come with alternative data, such as the frequent inability to manage the data in a time series, the need to process images and audio files, and missing data points that render the entire set useless.

However, alternative data shouldn’t be overlooked. In 2018, Nasdaq acquired an alternative data platform previously known as Quandl. Many took this event as a turning point for alternative data; Nasdaq’s willingness to answer client demands for alt data set the tone for other providers, some of whom had previously been hesitant to enter the space. While sourcing and utilizing alternative data can be challenging, it can also enrich existing datasets and provide uniquely valuable insights. For instance, Internet of Things (IoT) data can show you what thousands of Americans’ grocery lists look like, their gas usage, or when they’re about to buy new sneakers. Hedge funds have also been widely reported to use corporate jet data to predict merger and acquisition activity.

A lot of alternative and exhaust data is being funneled into commercial cloud providers like Amazon Web Services (AWS) and Google Cloud Platform (GCP), which attract a much wider audience than just financial participants. The scale of their cloud services can often result in a sea of irrelevant features and data for those looking to stay within the scope of professional capital markets. However, these companies have adapted: while offering their own data marketplaces, they also provide the underlying technology for companies like those listed below.

This list of cloud-based data marketplaces features those that specialize in core market data in the cloud as well as alternative offerings:

TMX Grapevine

TMX Grapevine is a cloud-based analytics-as-a-service platform that leverages Amazon Web Services (AWS). The analytics-as-a-service platform is a part of the larger TMX Group Limited—a leading Canadian financial services company and exchange group.

TMX Grapevine is slightly different from other cloud platforms in that TMX’s focus is supplying an analytics-ready environment and analytics-proven data. This is ideal for firms looking for a packaged, turnkey solution. TMX Grapevine comes in three variations: TMX Grapevine Lite, TMX Grapevine Explore, and TMX Grapevine Pro.

At the Grapevine Explore level, users gain access to TMX’s core data, which exceeds 20 petabytes and covers Canada, the US, and Europe. Costs are scalable and vary widely; pricing is divided into the three aforementioned tiers and subdivided by data group.

One thing that sets TMX apart is that it also offers TMX Logicly, an ETF-specific analytics platform grown out of a collaboration between ETFLogic and TSX, a subsidiary of TMX Group.

Their data types include:

  • Equity Intraday Trades
  • Trades and Quotes Tick Data
  • Essential Analytics for Options and Futures
  • Essential Analytics for Equities

IEX Cloud

IEX Cloud, owned by IEX Group, is a cloud-based data marketplace that specializes in financial data. While this DM leans away from some of the more abstract alternative datasets, it offers international end-of-day stock prices, auction data, fundamental company data, CEO compensation, income statements, cryptocurrency, forex, quarterly earnings, and its own data from the Investors Exchange.

Some of their featured datasets include:

  • CityFalcon, a company that creates insights from financial news content using Natural Language Processing
  • Fundamental data provided by New Constructs that gives insight into the profitability of publicly and privately owned companies
  • Stock Earnings, Estimates, Price Targets and Analyst Recommendations from Refinitiv
  • Audit Analytics, a dataset that tracks government regulation and the market’s response
  • Language metrics that rank sentiment on company filings from BRAIN

IOWArocks

IOWArocks works closely with the fintech industry and has experience with the pains of accessing proprietary data with legacy infrastructure. Their ‘Field of Dreams’ model seeks to level the playing field for all financial market participants by bringing proprietary access to cost-conscious firms.

Their core competencies include a focus on data monetization: they specialize in collecting, sourcing, and integrating data. Thanks to their pre-established infrastructure for internal and external data monetization, IOWArocks possesses a broad suite of APIs. Their systems run on AWS servers, but they remain cloud agnostic.

IOWArocks also offers a granular search function to assist clients in instantly finding the data they need.

Some of their datasets include:

  • ESG Data that can help firms outperform the market
  • Mobile GPS Location Data
  • Online food purchase data
  • Defense Industry Manufacturing Data

CloudQuant

CloudQuant supplies traditional market data but has a strong focus on alternative dataset sourcing and research. Their platform was designed with complex datasets in mind: it offers a single interface for all datasets, saving clients the upfront cost of integration and allowing them to proceed immediately to trialing the data.

CloudQuant is a hybrid cloud and on-prem technology solution and is part of the Amazon Partner Network and Google Advantage Network. Connecting to their Liberator data fabric can be done in about two minutes via an internet connection. Their notable features include the ability to switch from research to production seamlessly and built-in point-in-time functionality that allows users to keep track of multiple versions of their datasets.
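
To make the point-in-time idea concrete, here is a minimal Python sketch. It is not CloudQuant's implementation or API: the `PointInTimeStore` class, its `record` and `as_of` methods, and the sample key and values are all hypothetical, chosen only to show how keeping every revision of a data point lets a researcher ask what was known on a given date.

```python
from bisect import bisect_right
from datetime import datetime


class PointInTimeStore:
    """Hypothetical store that keeps every revision of a value,
    together with the time at which that revision became known."""

    def __init__(self):
        self._known_at = {}  # key -> sorted list of "known at" timestamps
        self._values = {}    # key -> values aligned with self._known_at

    def record(self, key, known_at, value):
        """Add a revision without overwriting earlier ones."""
        times = self._known_at.setdefault(key, [])
        values = self._values.setdefault(key, [])
        idx = bisect_right(times, known_at)
        times.insert(idx, known_at)
        values.insert(idx, value)

    def as_of(self, key, when):
        """Return the latest revision known at or before `when`, else None."""
        times = self._known_at.get(key, [])
        idx = bisect_right(times, when)
        if idx == 0:
            return None
        return self._values[key][idx - 1]


store = PointInTimeStore()
# Illustrative figures only: an initial release and a later restatement.
store.record("ACME_eps_2021Q1", datetime(2021, 4, 20), 1.10)
store.record("ACME_eps_2021Q1", datetime(2021, 5, 15), 1.04)

print(store.as_of("ACME_eps_2021Q1", datetime(2021, 5, 1)))  # before the restatement
print(store.as_of("ACME_eps_2021Q1", datetime(2021, 6, 1)))  # after the restatement
```

A backtest that queries `as_of` with the simulated trading date sees only the values that were available at that time, which is the usual motivation for versioned datasets.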

Data sellers can also benefit from CloudQuant with third-party analysis from their quantitative research teams. The company seeks to empower its customers to profit from new data sources through its data scouting services, machine learning stack, and available analysis tools.

Some of their alternative datasets and derived indicators include:

Signum’s Liquidity Lamp Summary

Exegy’s predictive trading solutions arm—Signum—developed a portfolio of real-time signals that includes Quote Vector, Quote Fuse, and Liquidity Lamp. Quote Vector and Quote Fuse predict the direction and timing of changes to the NBBO. Liquidity Lamp reveals and tracks institutional investor reserve order activity.

Liquidity Lamp Summary, the dataset showcased on CloudQuant, is a daily summary of reserve order buying and selling activity on a per-stock, per-market basis derived from the real-time Liquidity Lamp signal. This provides a focused view of informed investors who use iceberg orders to deftly move large volumes of shares and is far timelier than scouring quarterly Form 4 and Form 13F regulatory filings.
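
As a rough illustration of how a real-time signal can be rolled up into a daily per-stock, per-market summary, consider the Python sketch below. It is not Signum's method or the actual LLS schema: the event tuple layout, the symbol `ACME`, the market codes, and the share counts are all assumptions made up for the example.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical intraday events: (timestamp, symbol, market, side, shares).
# The field layout and the values are illustrative assumptions.
events = [
    (datetime(2021, 6, 1, 9, 45), "ACME", "XNAS", "buy", 5_000),
    (datetime(2021, 6, 1, 11, 2), "ACME", "XNAS", "sell", 2_000),
    (datetime(2021, 6, 1, 14, 30), "ACME", "XNYS", "buy", 1_500),
    (datetime(2021, 6, 2, 10, 15), "ACME", "XNAS", "buy", 3_000),
]


def summarize_daily(events):
    """Sum buy and sell shares per (trading day, symbol, market)."""
    totals = defaultdict(lambda: {"buy": 0, "sell": 0})
    for ts, symbol, market, side, shares in events:
        totals[(ts.date(), symbol, market)][side] += shares
    return dict(totals)


summary = summarize_daily(events)
for (day, symbol, market), sides in sorted(summary.items()):
    print(day, symbol, market, "buy:", sides["buy"], "sell:", sides["sell"])
```

Each output row corresponds to one stock on one market for one day, which is the granularity the summary dataset is described as having.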

Liquidity Lamp Summary (LLS) can enhance the performance of a broad range of quantitative strategies, including long/short strategies at both the individual-symbol level and the index level. Signum’s internal teams have created offensive and defensive strategies based on LLS data; Diesel and Octane are both long/short predictive strategies that outperform the market.

Signum’s goal is to democratize access to proprietary datasets and actionable predictive insights. Cloud-based data marketplaces help Signum bring opportunity to previously locked-out clients by creating an economy of scale that drives down market data prices and encourages the development of a competitive marketplace.

To learn more about market data in the cloud and Signum’s portfolio of real-time signals, or Exegy’s breadth of real-time, ultra-low-latency, historical data, execution, and trading platform solutions, book a free consultation.

Want to know more about Signum’s Liquidity Lamp Summary data? Learn how it can inform predictive strategies that generate unique alpha.
