TPS (transactions per second) has become a symbol of performance in the blockchain industry. Networks are often compared by a single number, with claims of “tens of thousands of TPS.” The problem is that TPS by itself says very little about a system’s real throughput.

TPS is highly dependent on transaction types, block configuration, and measurement methodology. The same blockchain can report radically different TPS values without any architectural changes — simply by tuning execution parameters such as block size or block time. The number goes up, but the understanding of performance does not.

Recent experimental studies of Bitcoin and Ethereum show that nominal TPS figures diverge significantly from actual execution throughput under realistic workloads. Other work demonstrates that TPS varies sharply with transaction complexity and smart contract execution costs, while execution models and the degree of transaction-level parallelism impose hard limits on achievable throughput. This is why modern benchmarking efforts focus on workload normalization rather than raw TPS numbers, as reflected in recent benchmark suites for EVM-compatible blockchains.

The goal of this article is to show when TPS is a useful engineering metric — and when it becomes misleading. To do that, I built a custom blockchain based on Substrate, fixed a standard transaction workload, and systematically varied execution parameters: block size, block time, the share of block space available to user transactions, and the way load is generated. Based on these experiments, I analyze what TPS actually measures, how strongly it depends on configuration, and why it should not be treated as an inherent characteristic of a blockchain.

It is important to clearly define the scope of the experiment upfront. I intentionally focus on execution-layer throughput and do not analyze network propagation, block dissemination latency, or the behavior of a distributed validator cluster. This simplification removes many external variables and allows TPS to be studied as a function of the execution model and system configuration.

All benchmark code and experimental setups used in this article are open source and available on GitHub.

What TPS Is — and How to Think About It

TPS is a throughput metric: how many transactions a blockchain can execute and include in blocks per second. That’s it. TPS tells you nothing about latency, confirmation time, or finality. And it definitely doesn’t describe how much actual “work” those transactions perform.

The first problem with TPS is that transactions are not equal. A simple native token transfer and a complex smart contract call can differ by orders of magnitude in computational cost. If the transaction type isn’t fixed, any TPS number is meaningless.

The second problem is how TPS is counted. Are failed transactions included? What about system operations or contract calls? Different answers to these questions can change TPS dramatically without touching the execution engine at all.

Sometimes you’ll see a theoretical TPS ceiling expressed as:

(maximum block size / minimum transaction size) / block time

This gives you an upper bound, not real performance. Real blocks always contain a mix of transaction types, and execution is constrained by CPU time, scheduling, and built-in safety margins. That’s why claims like “10,000 TPS” without a clearly defined workload don’t say much.
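As a quick sketch, the ceiling formula reduces to integer arithmetic. The numbers below (a 5 MiB block, a ~150-byte transfer, 6-second blocks) are illustrative assumptions, not measurements:

```rust
// Theoretical TPS ceiling: (max block size / min tx size) / block time.
// All inputs are illustrative assumptions.
fn tps_ceiling(max_block_bytes: u64, min_tx_bytes: u64, block_time_secs: u64) -> u64 {
    (max_block_bytes / min_tx_bytes) / block_time_secs
}

fn main() {
    // e.g. a 5 MiB block, a ~150-byte transfer, 6-second block time
    let ceiling = tps_ceiling(5 * 1024 * 1024, 150, 6);
    println!("{ceiling} TPS upper bound"); // real blocks never reach this
}
```

Even in this idealized form, every input is a configuration choice, which is the point: the ceiling moves whenever the parameters do.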

Finally, there is no universal definition of a “transaction.” In smart contract platforms, a single transaction may execute a substantial amount of logic. In UTXO-based systems, a transaction is often little more than signature verification. Comparing TPS across these systems without normalizing the workload is methodologically wrong.

If you want TPS to mean something, you have to fix the workload first. In this article, I use a simple baseline:

non-conflicting native token transfers between pre-funded accounts.

This is not a model of real applications. It’s a clean reference point. Without that, TPS is just a number you can make bigger by changing the rules.

This choice is based on four simple principles.

1. Universality

Every blockchain has a native token. And every blockchain supports transferring that token between accounts. Using native token transfers makes the benchmark applicable to most networks without adapting it to a specific virtual machine, ABI, or smart contract model.

This avoids testing language runtimes or contract frameworks instead of the execution layer itself.

2. Comparability

Account creation differs significantly across blockchains. In some systems, accounts are created implicitly on first transfer. In others, they require an explicit operation. Some networks introduce storage deposits, rent, or minimum balance rules.

If account creation is part of the benchmark, TPS immediately becomes dependent on the account model of a specific chain. Using pre-funded accounts removes this variable and makes results easier to compare across systems.

3. Representativeness

In most blockchains, the majority of transactions are transfers between already existing accounts. This workload is realistic, largely independent of application-specific logic, and reflects baseline pressure on the execution layer.

It’s not meant to model complex applications, but it does represent a common and fundamental class of activity.

4. No State Conflicts

Non-conflicting transfers mean that each transaction operates on its own pair of accounts with independent nonces. This avoids artificial serialization caused by multiple transactions competing for the same state.

Execution engines that support parallel transaction processing can apply their optimizations under these conditions.

If conflicting transactions are used instead — for example, many transfers from the same account — the benchmark starts measuring more than throughput. It also captures lock contention, nonce handling, and architecture-specific scheduling behavior.
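The non-conflicting workload can be sketched as follows: every transfer gets its own disjoint (sender, receiver) pair, so no two transactions read or write the same account state. Integer indices stand in for real pre-funded keypairs here; this is an illustration, not the benchmark's actual generator:

```rust
// Sketch: a non-conflicting workload assigns each transfer a disjoint
// (sender, receiver) pair, so no two transactions touch the same accounts
// and each sender's nonce is independent.
fn non_conflicting_pairs(num_transfers: usize) -> Vec<(usize, usize)> {
    (0..num_transfers)
        .map(|i| (2 * i, 2 * i + 1)) // account 0 → 1, 2 → 3, 4 → 5, ...
        .collect()
}

fn main() {
    assert_eq!(non_conflicting_pairs(3), vec![(0, 1), (2, 3), (4, 5)]);
}
```

A conflicting workload, by contrast, would reuse the same sender across many transfers, forcing sequential nonce handling regardless of the engine's parallelism.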

Benchmark Setup

Once the workload is fixed — non-conflicting native token transfers between pre-funded accounts — the next step is to fix the execution environment. Without this, any TPS number quickly turns into an abstraction.

For the experiments, I built a custom blockchain based on Substrate, using the current polkadot-sdk-solochain-template as a starting point. Substrate was chosen deliberately for its flexibility: it allows fine-grained control over block parameters, execution rules, and transaction weights. This makes it possible to study how specific architectural decisions translate directly into throughput.

The node was run locally. I intentionally did not measure network communication, block propagation latency, or the behavior of a distributed validator cluster. Including networking would introduce too many uncontrolled variables and make the results noisy and harder to interpret. Removing network effects makes the experiment more predictable and the measurements more stable. The goal here is to study execution-layer throughput, not the resilience of a p2p network.

This is a fundamental assumption. I am measuring the upper bound of execution performance, not how the system behaves in a real distributed environment.

In its baseline configuration, Substrate operates in a deliberately conservative mode. Blocks are produced every six seconds, and only 75% of block capacity is available for user transactions. The remaining space is reserved for system extrinsics. In addition, part of the block time slot is explicitly left for networking and internal operations. This configuration prioritizes network stability under load, but it also limits the maximum achievable throughput.

This reserve is not accidental. Its purpose is to keep the system operational during bursts of user activity and to prevent critical system operations from being crowded out. We have already seen what happens when such a buffer is missing. During the April 30, 2022 Solana Mainnet Beta outage, a large-scale NFT mint effectively pushed control-plane traffic out of the system. Validators failed to stay in sync, the network degraded, and a manual cluster restart was required.

This is exactly why Substrate does not allow blocks to be fully filled with user transactions by default. The 75% limit is a stability safeguard, not a technical limitation of the execution engine.

I use this default configuration as the baseline. To make the experiments easier to run and reason about, I centralized all relevant parameters in the runtime and exposed them via feature flags. In particular, the following parameters are defined in one place:

- target block time and slot duration;
- the share of block capacity available to user transactions;
- the maximum block size;
- execution time allocation within the slot.

// Execution-related runtime configuration
mod block_times {
    use sp_runtime::Perbill;
    // Target block time
    #[cfg(not(feature = "fast-block"))]
    pub const MILLI_SECS_PER_BLOCK: u64 = 6000;
    #[cfg(feature = "fast-block")]
    pub const MILLI_SECS_PER_BLOCK: u64 = 3000;
    // Slot duration used by Aura / Timestamp
    pub const SLOT_DURATION: u64 = MILLI_SECS_PER_BLOCK;
    // Time reserved for transaction execution
    pub const EXECUTION_TIME_SEC: u64 = 2;
    // Share of block capacity available to normal user transactions
    #[cfg(not(feature = "full-block"))]
    pub const NORMAL_DISPATCH_RATIO: Perbill = Perbill::from_percent(75);
    #[cfg(feature = "full-block")]
    pub const NORMAL_DISPATCH_RATIO: Perbill = Perbill::one();
    // Maximum block size
    #[cfg(not(feature = "full-block"))]
    pub const BLOCK_LEN_BYTES: u32 = 5 * 1024 * 1024;
    #[cfg(feature = "full-block")]
    pub const BLOCK_LEN_BYTES: u32 = 20 * 1024 * 1024;
}

This setup allows me to vary configurations systematically without rewriting the runtime for each experiment, and to ensure that compared modes differ only in explicitly defined parameters.
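For intuition, the interaction of `NORMAL_DISPATCH_RATIO` and `BLOCK_LEN_BYTES` reduces to simple arithmetic. This is a standalone sketch mirroring the constants above, not Substrate code itself:

```rust
// Sketch: effective block space available to user transactions under the
// default and full-block configurations defined in the runtime constants.
fn user_bytes(block_len: u64, ratio_percent: u64) -> u64 {
    block_len * ratio_percent / 100
}

fn main() {
    // default: 75% of a 5 MiB block
    println!("default:    {} bytes", user_bytes(5 * 1024 * 1024, 75));
    // full-block: 100% of a 20 MiB block
    println!("full-block: {} bytes", user_bytes(20 * 1024 * 1024, 100));
}
```

The jump from roughly 3.9 MiB to 20 MiB of user-transaction space is purely configurational, which is exactly what the full-block experiments exploit.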

From there, I began modifying execution parameters step by step. In full-block mode, the entire block capacity becomes available to user transactions and the maximum block size is increased. In practice, this removes the safety margin built into the default configuration. Throughput increases, but in a real distributed environment this would reduce resilience to overload.

In fast-block mode, the block production interval is reduced to three seconds. This nearly doubles how often transactions can be included in blocks and directly impacts TPS. However, shorter slots also mean less time for block propagation and a higher fork risk in a real network. In this local experiment, I deliberately ignore this factor to explore the execution engine’s throughput limits.

Special attention was paid to how load is generated. In the initial version of the benchmark, transactions were sent sequentially. The resulting TPS values were extremely low. Further analysis showed that the bottleneck was not blockchain execution, but transaction generation and submission itself. After switching to parallel submission and distributing load across multiple threads, TPS increased by an order of magnitude.

This leads to an important methodological takeaway: when measuring TPS, it is critical to ensure that the bottleneck is the execution system, not the benchmark.
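The switch from sequential to parallel submission can be sketched with plain `std` threads. Here `submit` is a stub standing in for the real RPC call, and the worker count is an assumption; the point is only the load-distribution shape:

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

// Stub standing in for signing and sending an extrinsic over RPC.
fn submit(_tx_id: u64) {}

// Sketch: spread submission across worker threads so the node, not the
// load generator, becomes the bottleneck. Each worker owns a disjoint
// slice of the workload (ids w, w + workers, w + 2*workers, ...).
fn submit_parallel(total_txs: u64, workers: u64) -> u64 {
    let sent = Arc::new(AtomicU64::new(0));
    let handles: Vec<_> = (0..workers)
        .map(|w| {
            let sent = Arc::clone(&sent);
            thread::spawn(move || {
                let mut id = w;
                while id < total_txs {
                    submit(id);
                    sent.fetch_add(1, Ordering::Relaxed);
                    id += workers;
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    sent.load(Ordering::Relaxed)
}

fn main() {
    assert_eq!(submit_parallel(10_000, 8), 10_000);
}
```

Disjoint id slices also keep the workload non-conflicting across threads, so the generator never serializes on a shared sender account.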

Overall, the experiment varies four classes of parameters: transaction type, block size, block frequency, and load submission strategy. This makes it possible to treat TPS not as a fixed property of a network, but as a function of execution configuration. Understanding this dependency is the focus of the analysis that follows.

Results and Interpretation

Once the execution environment and transaction workload are fixed, we can move on to the results. It is important to be explicit about what these numbers represent. The measured TPS values are not the “speed of the blockchain in general,” but the outcome of a specific execution configuration under a specific workload.

Below are the median TPS values observed across different scenarios. The baseline configuration is substrate-basic, defined as a 6-second block time, 75% of block capacity available to user transactions, and the default block size.
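Using the median rather than the mean damps outlier runs (a warm-up block or a GC pause does not skew the headline figure). A minimal sketch, with placeholder samples rather than the article's measured data:

```rust
// Sketch: median TPS across benchmark runs. Sample values are placeholders.
fn median_tps(mut samples: Vec<u64>) -> u64 {
    samples.sort_unstable();
    let n = samples.len();
    if n % 2 == 1 {
        samples[n / 2]
    } else {
        (samples[n / 2 - 1] + samples[n / 2]) / 2
    }
}

fn main() {
    // one outlier run (480) barely moves the reported figure
    assert_eq!(median_tps(vec![120, 480, 130, 125, 128]), 128);
}
```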

The first observation concerns how transactions are submitted. When conflicting transfers were sent sequentially, TPS was extremely low — on the order of a few dozen transactions per second. The cause was not Substrate itself, but the benchmark becoming the bottleneck. The node spent most of its time idle, waiting for the next transaction to arrive. After switching to parallel submission, the workload began to saturate blocks and TPS increased by more than an order of magnitude. This highlights a critical point: before drawing any conclusions, it is essential to verify that you are measuring the blockchain, not the limitations of the testing tool.

In the baseline configuration (6-second blocks with 75% of block space available to user transactions), non-conflicting transfers and conflicting transfers achieved the same TPS. This is an important result. In Substrate’s current execution model, transactions are executed sequentially, so the absence of state conflicts does not increase throughput. In other words, there is no transaction-level parallelism in this configuration.

Replacing standard transfers with the lightweight remark transaction increased TPS by roughly 25%. The reason is straightforward: the operation has a much lower execution weight, barely modifies state, and allows more transactions to fit into a block. This directly demonstrates how sensitive TPS is to the computational cost of a transaction. At this point, it becomes obvious that comparing TPS across different networks without specifying the transaction type is fundamentally misleading.

Enabling full-block mode increased TPS by approximately 70% compared to the baseline configuration. There is no magic here — the 75% limit is simply removed and the maximum amount of user transactions per block is increased. This is a purely configurational change. It boosts throughput, but at the cost of making the system more sensitive to overload, since the safety margin reserved for system operations is no longer present (as discussed earlier).

Further reducing the block time to three seconds (fast-block mode) nearly doubled TPS compared to the baseline configuration. This result is expected: if blocks are produced twice as often, throughput increases almost proportionally, all else being equal. In a real distributed environment, however, shorter block times increase fork probability and place higher demands on the network. In a local experiment, this factor does not manifest, which is important to keep in mind when interpreting the results.

Transaction batching deserves separate attention. With batching, tens, hundreds, or even thousands of transfers are packed into a single transaction, significantly reducing per-transaction overhead. This is conceptually similar to what rollups do. If TPS is counted by the number of internal operations, the metric jumps dramatically — into the tens of thousands of operations per second. If, instead, TPS is counted by the number of extrinsics, throughput barely changes. This clearly shows how sensitive TPS is to what is defined as a “transaction.” Batching is a powerful optimization technique for fees and overhead, but it also illustrates how easily TPS numbers can be inflated.
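The counting ambiguity is easy to make concrete. With assumed numbers (60 batch extrinsics per block, 1,000 transfers per batch, 6-second blocks), the same chain reports two TPS figures three orders of magnitude apart:

```rust
// Sketch: the same block yields wildly different TPS depending on whether
// "transaction" means an extrinsic or an internal operation.
// All numbers are illustrative assumptions.
fn tps(count_per_block: u64, block_time_secs: u64) -> u64 {
    count_per_block / block_time_secs
}

fn main() {
    let extrinsics_per_block = 60;   // assumed batch extrinsics per block
    let transfers_per_batch = 1_000; // assumed internal operations per batch
    let block_time = 6;

    // counted by extrinsics: modest
    println!("{} TPS", tps(extrinsics_per_block, block_time));
    // counted by internal operations: inflated by three orders of magnitude
    println!("{} ops/s", tps(extrinsics_per_block * transfers_per_batch, block_time));
}
```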

For reference, I ran the same type of test on Solana. The observed TPS was significantly higher, which is consistent with Solana’s parallel execution model and optimized pipeline. At the same time, direct comparison with Substrate requires caution. The account model, transaction format, and execution engine are fundamentally different. Without normalizing the workload, such comparisons can be misleading.

System behavior under overload also differs. In the Substrate configuration used here, all correctly formed transactions that enter the mempool will eventually be included in blocks. Given enough time, transactions are delayed but not dropped.

In Solana’s architecture, transaction lifetime is bounded. If a transaction is not processed within the valid window defined by blockhash expiration, it becomes invalid. Under high load, some transactions may simply fail to land in a block in time.

This means that high nominal throughput does not guarantee the absence of losses. When transaction generation outpaces the network’s ability to include them in blocks, a portion of transactions can be dropped. Exact drop rates are not reported here, as they depend directly on load generation rate and submission parameters.

A detailed analysis of transaction behavior in the Solana network and the mechanics of landing transactions on the TPU is provided in Anza’s technical report.

This further reinforces the core point: comparing networks purely by TPS, without accounting for architectural differences and transaction handling policies, can produce an incomplete or distorted picture.

When TPS Is Useful — and When It Isn’t

The results show that TPS is neither a constant nor an inherent “property of a blockchain.” It is a derivative metric, shaped by a set of parameters. In practice, TPS is directly influenced by:

- block size and block production interval;
- the share of block space available to user transactions;
- the computational cost of the chosen transaction type;
- the load generation and submission strategy;
- what is counted as a “transaction.”

In these experiments, higher TPS was achieved not by changing the underlying architecture, but by removing safety margins, shortening block time, or modifying transaction structure. This means that TPS can be significantly altered through configuration alone.

As a result, the TPS number by itself is meaningless without explicitly stating:

- the transaction workload;
- the block configuration: size, time, and dispatch ratio;
- the load generation model;
- what counts as a single transaction.

This is why TPS is useful as an internal engineering metric — for comparing different modes of the same system under fixed assumptions. Outside of that context, or when used to compare different architectures without workload normalization, it easily becomes misleading.

TPS should not be treated as a universal measure of network performance. That does not make it useless. Under the right conditions, TPS is a valid and sometimes necessary engineering tool.

1. Comparing changes within a single system

TPS is most useful when comparing different configurations of the same blockchain under fixed assumptions. If transaction type, hardware environment, and load generation remain constant, changes in TPS reflect real changes in system capacity.

This is exactly how TPS was used in this experiment:

- baseline versus full-block mode;
- 6-second versus 3-second block time;
- standard transfers versus lightweight remark and batched transactions.

Under these conditions, TPS is a valid measure of capacity change. What matters is not the absolute TPS value, but the delta between configurations.

2. Capacity planning

TPS is useful for estimating maximum system throughput under a given configuration. If an application is expected to handle a peak load — for example, 400 transfers per second — TPS measurements can determine whether the current setup can sustain it. If not, block parameters must be adjusted, block frequency increased, or the architecture scaled. In this context, TPS serves as a capacity planning tool.
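A capacity check of this kind is a one-line comparison once a safety margin is chosen. Both inputs below are hypothetical, not results from the article:

```rust
// Sketch: capacity planning with a measured TPS figure. Require measured
// throughput to exceed peak load by a headroom factor (an assumed policy).
fn sustains(measured_tps: f64, peak_load_tps: f64, headroom: f64) -> bool {
    measured_tps >= peak_load_tps * (1.0 + headroom)
}

fn main() {
    // e.g. 650 TPS measured vs. a 400 transfers/s peak, with 30% headroom
    assert!(sustains(650.0, 400.0, 0.3));
    // 450 TPS measured would not clear the 520 TPS bar
    assert!(!sustains(450.0, 400.0, 0.3));
}
```

The headroom factor matters: a setup that barely matches peak load in a clean local benchmark will fall short once networking and system extrinsics reclaim their share.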

3. Regression testing

TPS is also well suited as a performance regression metric. If TPS drops after changes to the runtime or execution engine under the same workload, this signals a performance regression. There is no need to know the “correct” TPS value in absolute terms. What matters is that, under identical conditions, a new version does not become slower than the previous one. For this reason, TPS can be used effectively in CI as a regression signal.
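A CI gate of this shape compares only the relative delta against the previous run; the absolute value never enters the check. The 5% tolerance below is an assumed policy choice to absorb run-to-run noise:

```rust
// Sketch: a TPS regression gate for CI. Only the delta against the
// baseline matters; the tolerance is an assumed policy value.
fn is_regression(baseline_tps: f64, current_tps: f64, tolerance: f64) -> bool {
    current_tps < baseline_tps * (1.0 - tolerance)
}

fn main() {
    // within noise: no alarm
    assert!(!is_regression(500.0, 490.0, 0.05));
    // a real slowdown: fail the build
    assert!(is_regression(500.0, 430.0, 0.05));
}
```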

Conclusion

TPS is a useful engineering metric — but only under clearly defined assumptions. It describes the throughput of a specific execution configuration under a specific transaction workload and a specific load generation model. Nothing more.

The experiments in this article show that TPS can be significantly increased through configuration alone: by enlarging blocks, shortening block times, using lighter transactions, or batching multiple operations together. A higher TPS number does not necessarily indicate a better architecture. In many cases, it simply reflects a different trade-off between performance, stability, and safety.

For this reason, TPS is meaningful as an internal analysis tool within a single system, where assumptions and parameters are fixed. Outside of that context — when transaction types, execution models, and architectural constraints differ — TPS quickly turns into a marketing number rather than a measure of real performance.