 
    Understanding the Internals of Binance and Poloniex Using Machine Learning and Data Visualizations
TL;DR →
Centralized exchanges remain one of the black boxes of the crypto landscape. Despite being a gateway into the world of decentralized finance, centralized exchanges are very…well…centralized in nature. Their internal architecture remains a mystery to even the top experts in the crypto market. At IntoTheBlock, we are hard at work applying cutting-edge machine learning techniques to unlock insights about crypto-assets. In the process of learning to correctly identify the components of exchanges (hot wallets, cold wallets, deposit addresses and withdrawal addresses), our models have uncovered some fascinating insights about the behavior of crypto exchanges.
Centralized exchanges remain one of the black boxes of the crypto landscape. Despite being a gateway into the world of decentralized finance, centralized exchanges are very…well…centralized in nature and hide many of their intrinsic mechanics from the scrutiny of public blockchains. Not surprisingly, ten years after the creation of Bitcoin, the internal architecture of centralized exchanges remains a mystery to even the top experts in the crypto market.
At IntoTheBlock, we are hard at work applying cutting-edge machine learning techniques to unlock insights about crypto-assets. One of the hardest challenges we undertook a few months ago was building a series of machine learning classifiers that learn to identify known actors in the crypto market, such as exchanges, mixers, miners, OTC desks and several others. Identifying exchanges has proven to be by far the most difficult of these challenges but, after several iterations and hundreds of experiments with machine learning models, we started to see some encouraging results. One interesting side effect of training models to learn in the wild is that they start uncovering all sorts of knowledge. In our case, our classifiers started discovering internal patterns in the architecture of centralized exchanges that haven’t been made public. Today, I would like to illustrate some of those patterns using two of the most popular crypto exchanges in the market: Binance and Poloniex.
Before we go deeper into the internal architecture details, it is worth highlighting the main components of centralized exchanges. A few days ago, I published an initial article that dives deep into some of those details, but a quick refresher might help. In general, the address topology of a centralized crypto exchange is based on four fundamental types of addresses:
· Hot Wallets: Hot wallets are typically the main interaction point between external parties and an exchange. Exchanges use this type of wallet to make an asset available to trade.
· Cold Wallets: Exchanges use cold wallets as secure storage for crypto-assets. This type of wallet typically holds larger amounts of assets that are not intended to be traded frequently.
· Deposit Addresses: Deposit addresses are on-chain addresses, often temporary, used to transfer funds into an exchange. The purpose of this type of address is to facilitate user-to-exchange money flows.
· Withdrawal Addresses: Withdrawal addresses are on-chain addresses, often temporary, used to transfer funds out of the main exchange wallet. Sometimes withdrawal addresses play a dual role as deposit addresses.
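The four address types above can be written down as a small data model. This is only a minimal sketch for illustration: the names `AddressRole` and `LabeledAddress`, the example address and exchange strings, and the `confidence` field are all hypothetical and are not IntoTheBlock's actual schema. An address carries a set of roles because, as noted above, a withdrawal address can sometimes double as a deposit address.

```python
from dataclasses import dataclass
from enum import Enum


class AddressRole(Enum):
    """The four address types described above (hypothetical naming)."""
    HOT_WALLET = "hot_wallet"
    COLD_WALLET = "cold_wallet"
    DEPOSIT = "deposit"
    WITHDRAWAL = "withdrawal"


@dataclass(frozen=True)
class LabeledAddress:
    """An on-chain address together with the roles a classifier assigned to it."""
    address: str
    exchange: str
    roles: frozenset   # set of AddressRole; one address may play several roles
    confidence: float  # assumed to be a score between 0 and 1

    def has_role(self, role):
        return role in self.roles


# Made-up example: a withdrawal address that also acts as a deposit address.
example = LabeledAddress(
    address="addr_example_1",
    exchange="ExampleExchange",
    roles=frozenset({AddressRole.WITHDRAWAL, AddressRole.DEPOSIT}),
    confidence=0.9,
)
```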
These components often combine into fascinating patterns that enable different types of transactions. For instance, the following visualization illustrates funds being moved from an exchange’s hot wallet to withdrawal addresses, then routed to deposit addresses in another exchange, and finally ending up in that exchange’s hot wallet.
The goal of our machine learning classification models is to identify each one of these components with a high degree of accuracy. Notice the legend on the top right of the visualization, which identifies the different actors and the confidence of each estimate. In the process of learning to correctly identify these components, our models have uncovered some fascinating insights about the behavior of crypto exchanges like Binance and Poloniex.
Visualizations are a very intuitive way to understand complex interaction patterns such as transactions in and across centralized crypto exchanges. It is important to understand that different exchanges use different architectures to enable various transaction flows. This is particularly notable in the case of Poloniex and Binance.
The following visualization illustrates some of the details of Poloniex’s architecture for processing Bitcoin transactions. This exchange uses two main hot wallets (highlighted in green). The main hot wallet (right) processes a larger number of transactions, while the secondary hot wallet collects the transactions from deposit addresses and then transfers the funds to the main hot wallet. Another aspect to highlight is that Poloniex constrains withdrawals to a small number of addresses (1 to 10).
Binance exhibits a very different architecture, possibly due to its larger volume. For instance, withdrawals from the hot wallets go to a much larger number of addresses (often more than 100).
A particularly curious pattern in the Binance exchange is that deposit addresses transfer funds not only to hot wallets but also to temporary addresses which, in turn, move the funds to hot wallets.
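The difference in withdrawal fan-out can be measured directly once hot wallets are labeled. The sketch below assumes the counts above refer to the recipients of a single withdrawal transaction, which the article does not state explicitly. The transaction layout (a dict with `id`, `inputs`, `outputs`), the function name `withdrawal_fanout` and all addresses are made up for illustration.

```python
def withdrawal_fanout(transactions, hot_wallets):
    """Map tx id -> number of distinct recipients, for transactions funded by a hot wallet."""
    fanout = {}
    for tx in transactions:
        # Skip transactions that are not funded by any labeled hot wallet.
        if not any(sender in hot_wallets for sender in tx["inputs"]):
            continue
        # Outputs that return to a hot wallet (e.g. change) are not withdrawals.
        recipients = {addr for addr in tx["outputs"] if addr not in hot_wallets}
        fanout[tx["id"]] = len(recipients)
    return fanout


# Toy data: one small fan-out and one large fan-out withdrawal.
toy_transactions = [
    {"id": "tx_small", "inputs": ["hot_a"], "outputs": ["w1", "w2", "w3", "hot_a"]},
    {"id": "tx_large", "inputs": ["hot_b"], "outputs": [f"w{i}" for i in range(120)]},
    {"id": "tx_other", "inputs": ["user_x"], "outputs": ["dep_1"]},
]
fanout = withdrawal_fanout(toy_transactions, hot_wallets={"hot_a", "hot_b"})
```

On real data, the distribution of these counts per exchange would be the quantity to compare, rather than any single transaction.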
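One way to look for this pattern is to separate direct deposit routes from routes that pass through one unlabeled intermediate address. This is a hedged sketch, not the method our classifiers use: it assumes transfers are available as `(source, destination)` pairs and that a `roles` dict holds the labels already assigned, and it treats any unlabeled middle hop as a candidate temporary address. The function name and all addresses are hypothetical.

```python
from collections import defaultdict


def find_deposit_routes(edges, roles):
    """Return (direct, indirect) routes from deposit addresses to hot wallets."""
    successors = defaultdict(set)
    for src, dst in edges:
        successors[src].add(dst)

    direct, indirect = [], []
    for src in sorted(successors):
        if roles.get(src) != "deposit":
            continue
        for mid in sorted(successors[src]):
            if roles.get(mid) == "hot_wallet":
                direct.append((src, mid))
            elif roles.get(mid) is None:
                # Unlabeled hop: keep it only if it forwards funds to a hot wallet.
                for dst in sorted(successors.get(mid, ())):
                    if roles.get(dst) == "hot_wallet":
                        indirect.append((src, mid, dst))
    return direct, indirect


# Toy graph: dep1 goes straight to the hot wallet, dep2 goes through tmp1.
edges = [
    ("dep1", "hot1"),
    ("dep2", "tmp1"), ("tmp1", "hot1"),
    ("dep3", "tmp2"), ("tmp2", "elsewhere"),
]
roles = {"dep1": "deposit", "dep2": "deposit", "dep3": "deposit", "hot1": "hot_wallet"}
direct, indirect = find_deposit_routes(edges, roles)
```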
As two of the main crypto exchanges in the market, Poloniex and Binance constantly interact with each other. The following visualizations illustrate some bidirectional flows: in one direction, transactions go from Binance’s hot wallets to withdrawal addresses, then to Poloniex’s deposit addresses and its corresponding hot wallet. In the other direction, transactions go from Poloniex’s withdrawal addresses to Binance’s deposit addresses and on to its hot wallet.
These are just some of the fascinating insights into Binance and Poloniex’s transaction architectures that can be extracted by applying machine learning models to blockchain datasets. Different centralized exchanges use different architectures to adapt to their transaction volume. The use of data science reveals some fascinating patterns that explain some of the behavior of centralized exchanges in ways that are impossible by just looking at prices and order books.
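Once each hop of a flow is labeled with an exchange and a role, the direction of a cross-exchange flow can be read off the labels. The sketch below is only an illustration under assumptions: it expects a complete four-hop path (hot wallet, withdrawal address, deposit address, hot wallet), represents each hop as an `(exchange, role)` tuple, and uses a hypothetical function name.

```python
EXPECTED_ROLES = ("hot_wallet", "withdrawal", "deposit", "hot_wallet")


def cross_exchange_direction(path):
    """Return (source_exchange, destination_exchange) for a labeled path, or None."""
    if len(path) != len(EXPECTED_ROLES):
        return None
    exchanges = [exchange for exchange, _ in path]
    path_roles = tuple(role for _, role in path)
    if path_roles != EXPECTED_ROLES:
        return None
    src, dst = exchanges[0], exchanges[3]
    # The first two hops must belong to the source, the last two to the destination.
    if exchanges[1] != src or exchanges[2] != dst or src == dst:
        return None
    return (src, dst)


binance_to_poloniex = [
    ("Binance", "hot_wallet"), ("Binance", "withdrawal"),
    ("Poloniex", "deposit"), ("Poloniex", "hot_wallet"),
]
direction = cross_exchange_direction(binance_to_poloniex)
```

Reversing the exchange labels in the path gives the opposite direction, which is how the two flows described above would be told apart.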
This story on HackerNoon has a decentralized backup on Sia.