
Before writing this article, I had a couple of conversations with engineers about building peer-to-peer applications: what they think about it, their experiences, and their approaches. Often, we didn’t get past the “what they think about it” part of the conversation, because many of them don’t actually know what p2p is beyond its famous misconceptions. A few have built applications they believe are p2p simply because some parties communicate directly, but those applications don’t fit the real nature of p2p (they are actually client-server).

In this article, I will try to debunk some of these misconceptions, explain what p2p applications really are, the protocol, application design, and some common technologies or tools to leverage when building p2p applications.

Client-Server introduction

In the mid-1990s, with the fast adoption of the World Wide Web and HTTP, the internet quickly transformed from an early peer-to-peer network into a more centralized, content-consuming network. With this transformation, the client-server architecture became the most commonly used model for data transfer and content delivery, giving rise to terms like “web server”. It solidified the idea of dedicated computer systems that run server software and deliver content on request. In this model, some computers are designated as servers (hosts) and others as clients. The designated servers need to be online all the time and have good connectivity. The server provides clients with data and can also receive data from clients.

What misconceptions did we find?

Illegal use case

Peer networks are often assumed to be useful only for file sharing. Peer-to-peer (P2P) networks were made popular by file-sharing services, but their uses are far more extensive than this. These days, they are essential in many domains, including distributed computing, blockchain technology, and even secure communication. Their decentralized structure is what makes them so adaptable.

Lack of security

Another common misconception is that peer networks inherently lack security. It is critical to recognize that security in peer networks is dependent on the protocols and encryption mechanisms utilized. Many modern peer-to-peer systems use strong security features to protect data and maintain privacy.

Totally off servers

Some believe that because an application is p2p, no servers are involved. In practice, many p2p systems today rely on some centralized components to support their operations; these components are designed to facilitate peer discovery, signaling, and even coordination. For example, BitTorrent makes use of trackers or DHTs for peer discovery. Completely serverless P2P is extremely difficult in practice due to NAT traversal, bootstrapping, and discovery challenges.

Privacy and Anonymity

Peer-to-peer does not guarantee privacy. IP addresses are visible to peers, traffic patterns can be analyzed, and without effective encryption and anonymizing routing such as Tor, P2P software might leak sensitive data (content hashing alone protects integrity, not privacy).

What exactly is peer-to-peer, then?

Simply put, p2p networks allow computers or participants to connect and share resources without depending on a central server to hold or relay those resources. In this architecture, any participant can function as both a server and a client. Having both roles allows the participants to send and receive information and services from one another.

On client-server networks, clients request services or resources from a centralized server, which controls access to those resources. In contrast, peer-to-peer networks enable peers to share resources without the need for an intermediary to administer access control or data dissemination. One of the primary benefits of peer-to-peer networks is their decentralization, which makes them more resilient to censorship and network outages. Furthermore, P2P networks can make better use of resources because each peer on the network can contribute resources like bandwidth, computing power, or storage space.

But how does this network protocol or architecture work?

When you form a network of connections where each participant communicates directly with the others without any central authority or server, you have built the basic architecture of a peer-to-peer network. Every connected participant is a peer, and these peers connect through various protocols. Peers can join and leave the network dynamically, whenever they want to.

The sequence diagram above illustrates:

Joining Phase - How a new peer discovers and connects to existing peers
Dual Role Activation - The peer becoming both client and server
Peer Operations - Bidirectional resource requests showing the dual nature
Network Scaling - How new peers join and the network grows horizontally
Collaborative Operations - Multi-peer transactions and resource sharing
Dynamic Reorganization - What happens when peers leave, and how the network self-organizes

Each participant in a peer-to-peer network serves as both a client and a server. This dual function differs from the usual client-server architecture, in which clients request services and servers deliver them. As additional peers join a peer-to-peer network, its resources and capabilities increase, allowing it to scale horizontally. This scalability improves the network's reliability and flexibility. Peers can discover and connect via a variety of mechanisms, such as centralized directories, distributed hash tables (DHTs), and peer exchange protocols.

These methods allow for more effective networking and communication among peers. Tixati, the Kad network, and BitTorrent are three notable examples of distributed networks that use DHTs. The self-organizing nature of P2P networks allows them to adapt and reorganize as peers join and leave, maximizing resource utilization and spreading the load over several peers.

In a P2P network, every peer has the same capabilities and can: initiate or complete transactions, share resources, collaborate on tasks, and exchange information.
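The dual client/server role can be sketched with nothing but the Go standard library. This is a minimal illustration, not the expense tracker's code: the `Peer` type, `StartPeer`, `Ask`, and the one-line text protocol are all hypothetical, and both peers run on the loopback interface so NAT traversal and discovery are out of scope.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"strings"
)

// Peer is a hypothetical node that both serves requests and issues them.
type Peer struct {
	Name     string
	listener net.Listener
}

// StartPeer opens a listener on a random loopback port and starts serving
// in the background (the "server" half of the peer).
func StartPeer(name string) (*Peer, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return nil, err
	}
	p := &Peer{Name: name, listener: ln}
	go p.serve()
	return p, nil
}

// Addr returns the address other peers can dial.
func (p *Peer) Addr() string { return p.listener.Addr().String() }

// Close stops accepting new connections.
func (p *Peer) Close() error { return p.listener.Close() }

func (p *Peer) serve() {
	for {
		conn, err := p.listener.Accept()
		if err != nil {
			return // listener was closed
		}
		go p.handle(conn)
	}
}

// handle answers a single newline-terminated request.
func (p *Peer) handle(conn net.Conn) {
	defer conn.Close()
	line, err := bufio.NewReader(conn).ReadString('\n')
	if err != nil {
		return
	}
	fmt.Fprintf(conn, "%s got %q\n", p.Name, strings.TrimSpace(line))
}

// Ask dials another peer and sends one request (the "client" half).
func (p *Peer) Ask(addr, msg string) (string, error) {
	conn, err := net.Dial("tcp", addr)
	if err != nil {
		return "", err
	}
	defer conn.Close()
	if _, err := fmt.Fprintf(conn, "%s\n", msg); err != nil {
		return "", err
	}
	reply, err := bufio.NewReader(conn).ReadString('\n')
	if err != nil {
		return "", err
	}
	return strings.TrimSpace(reply), nil
}

func main() {
	alice, err := StartPeer("alice")
	if err != nil {
		panic(err)
	}
	defer alice.Close()
	bob, err := StartPeer("bob")
	if err != nil {
		panic(err)
	}
	defer bob.Close()

	// Each peer is a client toward the other and a server for the other.
	fromBob, err := alice.Ask(bob.Addr(), "ping")
	if err != nil {
		panic(err)
	}
	fromAlice, err := bob.Ask(alice.Addr(), "ping")
	if err != nil {
		panic(err)
	}
	fmt.Println(fromBob)
	fmt.Println(fromAlice)
}
```

Neither peer is special here: the same `Peer` value accepts connections and opens them, which is the property that separates p2p from a client-server design where the roles are fixed.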

What you should consider when building peer-to-peer applications

Think network topology, decentralization, node/peer discovery, direct communication protocol, resource sharing, synchronization mechanisms, etc.

We are going to build a peer-to-peer expense tracker application. This application allows peers to join a network, create expense groups, add expenses, and get balances. For simplicity, I will only add and explain code snippets relating to the peer-to-peer architecture and its concerns.

Network Layer

Connection Management

When building p2p applications, you need a robust strategy for peer discovery and connection establishment. This typically involves bootstrap nodes or a lightweight discovery service to find initial peers, plus a protocol, such as a gossip protocol, for peers to share information about other peers they know. Think of the bootstrap node as the genesis connection. Plan for NAT traversal using STUN/TURN servers, and implement connection pooling to maintain a stable set of active peers rather than connecting to everyone. libp2p, which originated in the IPFS project, is a great library: a modular network stack that allows developers to build decentralized peer-to-peer applications.
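The bootstrap, peer exchange, and connection pooling ideas above can be sketched as a small bookkeeping structure. Everything here is an assumption made for illustration: `PeerTable`, `Learn`, `Connect`, and `Share` are hypothetical names, addresses are plain strings, and no real connections are opened.

```go
package main

import (
	"fmt"
	"sort"
)

// PeerTable is a hypothetical bounded view of the network kept by one peer.
type PeerTable struct {
	self      string
	maxActive int
	known     map[string]bool // every address heard about so far
	active    map[string]bool // the subset we keep connections to
}

// NewPeerTable creates a table that keeps at most maxActive connections.
func NewPeerTable(self string, maxActive int) *PeerTable {
	return &PeerTable{
		self:      self,
		maxActive: maxActive,
		known:     make(map[string]bool),
		active:    make(map[string]bool),
	}
}

// Learn records addresses received from a bootstrap node or from gossip
// and returns how many of them were new.
func (t *PeerTable) Learn(addrs ...string) int {
	added := 0
	for _, a := range addrs {
		if a == t.self || t.known[a] {
			continue
		}
		t.known[a] = true
		added++
	}
	return added
}

// Connect marks a known peer as active if the pool still has room.
// It reports whether the peer is active after the call.
func (t *PeerTable) Connect(addr string) bool {
	if t.active[addr] {
		return true
	}
	if !t.known[addr] || len(t.active) >= t.maxActive {
		return false
	}
	t.active[addr] = true
	return true
}

// Share returns the sorted list of known peers to hand to another peer.
func (t *PeerTable) Share() []string {
	out := make([]string, 0, len(t.known))
	for a := range t.known {
		out = append(out, a)
	}
	sort.Strings(out)
	return out
}

func main() {
	// The bootstrap node already knows two other peers.
	bootstrap := NewPeerTable("bootstrap", 8)
	bootstrap.Learn("peer-b", "peer-c")

	// A newcomer starts with only the bootstrap address and a pool of 2.
	newcomer := NewPeerTable("newcomer", 2)
	newcomer.Learn("bootstrap")
	newcomer.Connect("bootstrap")

	// Peer exchange: the newcomer learns the rest from the bootstrap node.
	newcomer.Learn(bootstrap.Share()...)
	for _, addr := range newcomer.Share() {
		fmt.Println(addr, "active:", newcomer.Connect(addr))
	}
}
```

The pool limit is the point of the sketch: the newcomer knows three peers but keeps only two connections, which is roughly what a connection manager does in a real stack.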

Topology, Structure, or Overlay

Choose a network structure that is appropriate for your use case. Unstructured networks (random mesh) are simple and durable, but less efficient for lookups. Structured overlays, such as DHTs (Kademlia and Chord), provide efficient key-based routing. Hybrid techniques frequently work best, incorporating structure where necessary while keeping mesh resilience. libp2p also provides modules like libp2p-kad-dht, which adds support for DHT-based routing and lookups.
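To make key-based routing concrete, here is a minimal sketch of the XOR distance metric that Kademlia-style DHTs use to decide which peers are "closest" to a key. It is an illustration only: the `ID` type, `NewID`, and `Closest` are hypothetical, IDs are derived by hashing names with SHA-256, and a real DHT would keep k-buckets instead of sorting a full peer list.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
	"sort"
)

// ID is a 256-bit identifier shared by peers and keys.
type ID [sha256.Size]byte

// NewID derives an identifier by hashing a name (an assumption for this sketch).
func NewID(name string) ID { return sha256.Sum256([]byte(name)) }

// xor returns the bitwise XOR of two IDs, which Kademlia treats as a distance.
func xor(a, b ID) ID {
	var d ID
	for i := range a {
		d[i] = a[i] ^ b[i]
	}
	return d
}

// Closest returns up to k peers ordered by XOR distance to the key.
func Closest(peers []string, key string, k int) []string {
	target := NewID(key)
	sorted := append([]string(nil), peers...) // do not reorder the caller's slice
	sort.Slice(sorted, func(i, j int) bool {
		di := xor(NewID(sorted[i]), target)
		dj := xor(NewID(sorted[j]), target)
		// Big-endian byte comparison equals numeric comparison of the distances.
		return bytes.Compare(di[:], dj[:]) < 0
	})
	if k > len(sorted) {
		k = len(sorted)
	}
	return sorted[:k]
}

func main() {
	peers := []string{"peer-a", "peer-b", "peer-c", "peer-d"}
	// The peers closest to the key are the ones that would store its value.
	fmt.Println(Closest(peers, "group:rent", 2))
}
```

Because every peer computes the same distances, any peer can work out which peers are responsible for a key without asking a central index.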

func SetupLibp2p(ctx context.Context, bootstrapAddressStr, bootstrapSetupPort, appProtocol string) (*dht.IpfsDHT, host.Host, error) {
	var bootstrapNode bool

	if len(bootstrapAddressStr) == 0 {
		bootstrapNode = true
	}

	var h host.Host
	var bootstrapPeer *peer.AddrInfo
	var err error

	if bootstrapNode {
		bHost, err := libp2p.New(
			libp2p.ListenAddrStrings(
				fmt.Sprintf("/ip4/0.0.0.0/tcp/%s", bootstrapSetupPort),
				fmt.Sprintf("/ip4/0.0.0.0/udp/%s/quic", bootstrapSetupPort),
			),
		)
		if err != nil {
			return nil, nil, err
		}
		h = bHost
	} ...
}

In the above code block, we intend to set up the peer-to-peer network by leveraging the libp2p library. The function determines whether this node will act as a bootstrap node (the initial connection point in the network) or a regular client node. If no bootstrap address is provided, this node becomes the bootstrap node itself. For bootstrap nodes, we create a libp2p host that listens on a specific port using both TCP and QUIC (UDP-based) transports. The 0.0.0.0 address means it accepts connections on all network interfaces.

...
else {
    bootstrapAddr, err := ma.NewMultiaddr(bootstrapAddressStr)
    if err != nil {
        return nil, nil, fmt.Errorf("invalid bootstrap address: %w", err)
    }
    cBootstrapPeer, err := peer.AddrInfoFromP2pAddr(bootstrapAddr)
    if err != nil {
        return nil, nil, fmt.Errorf("failed to parse bootstrap peer: %w", err)
    }
    bootstrapPeer = cBootstrapPeer

    cHost, err := libp2p.New(
        libp2p.ListenAddrStrings("/ip4/0.0.0.0/tcp/0"),
    )
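    if err != nil {
        // Return the error like every other failure path instead of exiting the process.
        return nil, nil, fmt.Errorf("failed to create host: %w", err)
    }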
    h = cHost
}

For client nodes, the code parses the bootstrap node's multiaddress (a flexible addressing format for P2P networks) and creates a host that listens on a random port (port 0).

Data Layer

Storage and Replication

Here, you want to design how data is going to be stored and distributed across peers. Each peer can store data locally (in a DHT (Distributed Hash Table), a local data structure, or an in-memory store) and then broadcast updates to other peers via a pub-sub mechanism. Decentralized storage providers like Filecoin and Arweave offer storage without any central authority. You might use content-addressed storage (hash-based) for immutability, partition data using consistent hashing, or replicate popular content. Consider how much data each peer stores and implement garbage collection for old or unpopular content.
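Content-addressed storage is easy to sketch: the key of a block is the hash of its bytes, so identical content always gets the same key and any peer can verify what it receives. The `BlockStore` type and its methods below are hypothetical and in-memory only; they are not part of the expense tracker.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
)

var (
	ErrNotFound = errors.New("block not found")
	ErrCorrupt  = errors.New("block does not match its key")
)

// BlockStore is a hypothetical in-memory content-addressed store.
type BlockStore struct {
	blocks map[string][]byte
}

// NewBlockStore creates an empty store.
func NewBlockStore() *BlockStore {
	return &BlockStore{blocks: make(map[string][]byte)}
}

// Put stores data under the hex-encoded SHA-256 of its bytes and returns the key.
func (s *BlockStore) Put(data []byte) string {
	sum := sha256.Sum256(data)
	key := hex.EncodeToString(sum[:])
	s.blocks[key] = append([]byte(nil), data...) // keep a private copy
	return key
}

// Get returns the block and re-checks its hash, so data that was altered
// (for example by an untrusted peer) is rejected instead of returned.
func (s *BlockStore) Get(key string) ([]byte, error) {
	data, ok := s.blocks[key]
	if !ok {
		return nil, ErrNotFound
	}
	sum := sha256.Sum256(data)
	if hex.EncodeToString(sum[:]) != key {
		return nil, ErrCorrupt
	}
	return data, nil
}

func main() {
	store := NewBlockStore()
	key := store.Put([]byte(`{"group":"rent","amount":1200}`))
	fmt.Println("stored under", key)

	data, err := store.Get(key)
	fmt.Println(string(data), err)
}
```

Since the key is derived from the content, a block can never be updated in place; a changed expense record becomes a new block with a new key, which is where the immutability comes from.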

...
dstore := dsync.MutexWrap(ds.NewMapDatastore())
protocolPrefix := protocol.ID(appProtocol)

if bootstrapNode {
    kadDHT, err = dht.New(
        ctx,
        h,
        dht.Datastore(dstore),
        dht.Mode(dht.ModeServer),
        dht.BootstrapPeers(peer.AddrInfo{}),
        dht.ProtocolPrefix(protocolPrefix),
        dht.Validator(&AllowAnyValidator{}),
    )
    if err != nil {
        return nil, nil, err
    }
    if err := kadDHT.Bootstrap(ctx); err != nil {
        log.Printf("Bootstrap error: %v", err)
    }

We create an in-memory datastore wrapped in a mutex so that DHT operations (PUT and GET) are thread-safe. The protocol prefix lets the DHT operate in a custom protocol namespace, which isolates this network from others. We then create the DHT (Kademlia) in server mode with no bootstrap peers. Note that dht.BootstrapPeers is given an empty peer.AddrInfo here; for non-bootstrap peers, dht.BootstrapPeers should be set to the address info of the existing bootstrap peer, like below:

else {
    kadDHT, err = dht.New(ctx, h,
        dht.Datastore(dstore),
        dht.Mode(dht.ModeServer),
        dht.ProtocolPrefix(protocolPrefix),
        dht.Validator(&AllowAnyValidator{}),
        dht.BootstrapPeers(*bootstrapPeer),
    )
    if err != nil {
        return nil, nil, err
    }

The bootstrap process initializes the DHT’s routing table. For client nodes, the configured bootstrap peer address tells the DHT where to begin discovering the rest of the network.

Consistency Model

Taking into consideration the decentralized nature of p2p architecture, consistency guarantee models have to be explicit. Strong consistency in p2p can be very difficult to achieve, so most p2p applications employ various eventual consistency strategies with conflict resolution like last-write-wins, CRDTs, version vectors, etc.
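Of the strategies listed, version vectors are the one that detects a conflict rather than resolving it. The sketch below is a generic, hypothetical implementation (`VersionVector`, `Compare`, and `Merge` are illustrative names): each peer keeps a counter per peer ID, and two states are concurrent when neither vector dominates the other.

```go
package main

import "fmt"

// VersionVector maps a peer ID to the number of updates seen from that peer.
type VersionVector map[string]uint64

// Ordering describes how two version vectors relate.
type Ordering int

const (
	Equal Ordering = iota
	Before
	After
	Concurrent
)

func (o Ordering) String() string {
	return [...]string{"Equal", "Before", "After", "Concurrent"}[o]
}

// Compare reports whether a happened before b, after b, equals b,
// or is concurrent with b (a conflict that needs resolution).
func Compare(a, b VersionVector) Ordering {
	aLess, bLess := false, false
	for id, av := range a {
		bv := b[id] // a missing entry counts as zero
		if av < bv {
			aLess = true
		}
		if av > bv {
			bLess = true
		}
	}
	for id, bv := range b {
		if _, ok := a[id]; !ok && bv > 0 {
			aLess = true
		}
	}
	switch {
	case aLess && bLess:
		return Concurrent
	case aLess:
		return Before
	case bLess:
		return After
	default:
		return Equal
	}
}

// Merge returns the entry-wise maximum, which dominates both inputs.
func Merge(a, b VersionVector) VersionVector {
	out := make(VersionVector, len(a)+len(b))
	for id, v := range a {
		out[id] = v
	}
	for id, v := range b {
		if v > out[id] {
			out[id] = v
		}
	}
	return out
}

func main() {
	base := VersionVector{"alice": 1}
	aliceEdit := VersionVector{"alice": 2}         // alice updated again
	bobEdit := VersionVector{"alice": 1, "bob": 1} // bob updated from base

	fmt.Println(Compare(base, bobEdit))      // base is an ancestor of bob's edit
	fmt.Println(Compare(aliceEdit, bobEdit)) // neither side saw the other's edit
	fmt.Println(Merge(aliceEdit, bobEdit))
}
```

When `Compare` reports `Concurrent`, the application still has to pick a resolution rule, such as last-write-wins or a CRDT merge.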


package crdt

import (
	"sync"
)

// PNCounter represents a Positive-Negative Counter
type PNCounter struct {
	P  map[string]int64 // positive increments
	N  map[string]int64 // negative increments
	mu sync.RWMutex
}

// NewPNCounter creates a new PNCounter
func NewPNCounter() *PNCounter {
	return &PNCounter{
		P: make(map[string]int64),
		N: make(map[string]int64),
	}
}

// Increment increments the counter by delta
func (c *PNCounter) Increment(id string, delta int64) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.P[id] += delta
}

// Decrement decrements the counter by delta
func (c *PNCounter) Decrement(id string, delta int64) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.N[id] += delta
}

// Value returns the current value
func (c *PNCounter) Value() int64 {
	c.mu.RLock()
	defer c.mu.RUnlock()
	var p, n int64
	for _, v := range c.P {
		p += v
	}
	for _, v := range c.N {
		n += v
	}
	return p - n
}

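// Merge merges another PNCounter by taking the per-peer maximum of each map.
// It reads other without holding other.mu, so callers should pass a counter
// (or a snapshot) that is not being mutated concurrently.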
func (c *PNCounter) Merge(other *PNCounter) {
	c.mu.Lock()
	defer c.mu.Unlock()
	for id, v := range other.P {
		c.P[id] = max(c.P[id], v)
	}
	for id, v := range other.N {
		c.N[id] = max(c.N[id], v)
	}
}

func max(a, b int64) int64 {
	if a > b {
		return a
	}
	return b
}

The PN-Counter CRDT structure is what we use to track balances in the application from a distributed standpoint. Whenever a request comes in to impact the balance of a participant, we increment and decrement the PN-Counter accordingly (positive and negative counter). This way, we are able to reconcile the final value even when the request comes from multiple peers.

The mutex ensures thread-safe operations. Keeping increments (P) and decrements (N) in separate maps is crucial because it allows the counter to handle both increases and decreases in a conflict-free manner: each map only ever grows, so we avoid the problem of concurrent updates overwriting each other. Each peer writes only to its own entry, so its operations are independent of the others and can be merged deterministically by taking the maximum per entry.

This repo directory contains other CRDT structures, such as OrMap and OrSet implementations, that the application uses to track and merge data and to resolve conflicts that may arise from contributions by multiple peers.
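For readers who have not met an OR-Set (observed-remove set) before, here is a minimal generic sketch. It is not the repository's implementation: the `ORSet` type, its method names, and the caller-supplied unique tags are assumptions made for illustration. The idea is that a remove only cancels the add tags it has observed, so an add made concurrently on another peer survives the merge.

```go
package main

import "fmt"

// ORSet is a hypothetical observed-remove set. Every add carries a unique
// tag; a remove tombstones only the tags it has seen.
type ORSet struct {
	adds    map[string]map[string]bool // element -> add tags
	removes map[string]map[string]bool // element -> tombstoned tags
}

// NewORSet creates an empty set.
func NewORSet() *ORSet {
	return &ORSet{
		adds:    make(map[string]map[string]bool),
		removes: make(map[string]map[string]bool),
	}
}

// mark records tag under elem, creating the inner map on first use.
func mark(m map[string]map[string]bool, elem, tag string) {
	if m[elem] == nil {
		m[elem] = make(map[string]bool)
	}
	m[elem][tag] = true
}

// Add inserts elem with a tag that must be unique across all peers
// (for example "peerID:sequence").
func (s *ORSet) Add(elem, tag string) { mark(s.adds, elem, tag) }

// Remove tombstones every add tag currently observed for elem.
func (s *ORSet) Remove(elem string) {
	for tag := range s.adds[elem] {
		mark(s.removes, elem, tag)
	}
}

// Contains reports whether elem has at least one add tag without a tombstone.
func (s *ORSet) Contains(elem string) bool {
	for tag := range s.adds[elem] {
		if !s.removes[elem][tag] {
			return true
		}
	}
	return false
}

// Merge takes the union of add tags and of tombstones from other.
func (s *ORSet) Merge(other *ORSet) {
	for elem, tags := range other.adds {
		for tag := range tags {
			mark(s.adds, elem, tag)
		}
	}
	for elem, tags := range other.removes {
		for tag := range tags {
			mark(s.removes, elem, tag)
		}
	}
}

func main() {
	a, b := NewORSet(), NewORSet()
	a.Add("alice", "a:1")
	b.Merge(a)

	// Concurrently: b removes alice while a adds her again.
	b.Remove("alice")
	a.Add("alice", "a:2")

	a.Merge(b)
	b.Merge(a)
	fmt.Println(a.Contains("alice"), b.Contains("alice"))
}
```

After both merges the two replicas agree, and the concurrent re-add wins because its tag was never observed by the remove.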

Security Layer

Trust and Reputation

To counter the misconception that peer-to-peer networks lack security and trust, we need to put mechanisms in place that prevent malicious behavior. Implement reputation systems where peers track the reliability of others, use cryptographic signatures to verify data authenticity, and design protocols resistant to attacks such as the Sybil attack.
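A reputation system can start out very small. The sketch below is a hypothetical local scorer, not part of the article's application: `Reputation`, `Record`, and `Banned` are illustrative names, and the reward of +1 and penalty of -5 are arbitrary assumptions. Each peer would keep its own instance and update it as it validates messages from others.

```go
package main

import "fmt"

// Reputation is a hypothetical per-peer score table kept locally by one node.
type Reputation struct {
	scores   map[string]int
	banBelow int // peers scoring below this are ignored
}

// NewReputation creates a table that bans peers whose score drops below banBelow.
func NewReputation(banBelow int) *Reputation {
	return &Reputation{scores: make(map[string]int), banBelow: banBelow}
}

// Record updates a peer's score after one interaction. Invalid messages
// (bad signature, malformed data) cost more than valid ones earn, so a
// peer cannot cheaply offset misbehavior.
func (r *Reputation) Record(peerID string, valid bool) {
	if valid {
		r.scores[peerID]++
		return
	}
	r.scores[peerID] -= 5
}

// Score returns the current score; unknown peers start at zero.
func (r *Reputation) Score(peerID string) int { return r.scores[peerID] }

// Banned reports whether messages from this peer should be dropped.
func (r *Reputation) Banned(peerID string) bool {
	return r.scores[peerID] < r.banBelow
}

func main() {
	rep := NewReputation(-10)

	for i := 0; i < 3; i++ {
		rep.Record("honest-peer", true)
		rep.Record("noisy-peer", false)
	}

	fmt.Println("honest-peer:", rep.Score("honest-peer"), "banned:", rep.Banned("honest-peer"))
	fmt.Println("noisy-peer:", rep.Score("noisy-peer"), "banned:", rep.Banned("noisy-peer"))
}
```

A local score like this does not stop Sybil attacks on its own, because a banned peer can return under a fresh identity; it needs to be combined with something that makes identities costly or verifiable.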

package utils

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/sha256"
	"fmt"
	"math/big"
	"sort"
)

// GenerateKeyPair generates a new ECDSA key pair
func GenerateKeyPair() (*ecdsa.PrivateKey, []byte, error) {
	privateKey, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, fmt.Errorf("failed to generate key pair: %w", err)
	}

	// Use the standard uncompressed format (0x04 + x + y)
	publicKey := make([]byte, 65)
	publicKey[0] = 0x04
	privateKey.PublicKey.X.FillBytes(publicKey[1:33])
	privateKey.PublicKey.Y.FillBytes(publicKey[33:65])

	return privateKey, publicKey, nil
}

// SignData signs data with a private key
func SignData(privateKey *ecdsa.PrivateKey, data []byte) ([]byte, error) {
	hash := sha256.Sum256(data)
	r, s, err := ecdsa.Sign(rand.Reader, privateKey, hash[:])
	if err != nil {
		return nil, fmt.Errorf("failed to sign data: %w", err)
	}

	// Encode signature as fixed-width r || s (32 bytes each). big.Int.Bytes()
	// drops leading zeros, so FillBytes is used to keep the length at 64.
	signature := make([]byte, 64)
	r.FillBytes(signature[:32])
	s.FillBytes(signature[32:])
	return signature, nil
}

// VerifySignature verifies a signature against public key and data
func VerifySignature(publicKeyBytes, data, signature []byte) error {
	if len(signature) != 64 {
		return fmt.Errorf("invalid signature length")
	}
	// Reject anything that is not a 65-byte uncompressed point to avoid
	// slicing out of range below.
	if len(publicKeyBytes) != 65 || publicKeyBytes[0] != 0x04 {
		return fmt.Errorf("invalid public key encoding")
	}

	// Parse public key
	x := new(big.Int).SetBytes(publicKeyBytes[1:33]) // Skip the 0x04 prefix
	y := new(big.Int).SetBytes(publicKeyBytes[33:65])
	publicKey := &ecdsa.PublicKey{
		Curve: elliptic.P256(),
		X:     x,
		Y:     y,
	}

	// Parse signature
	r := new(big.Int).SetBytes(signature[:32])
	s := new(big.Int).SetBytes(signature[32:])

	hash := sha256.Sum256(data)

	if !ecdsa.Verify(publicKey, hash[:], r, s) {
		return fmt.Errorf("signature verification failed")
	}

	return nil
}

// CreateOperationData creates canonical data for signing operations
func CreateOperationData(operation string, groupID, userID string, timestamp int64, extraData map[string]interface{}) []byte {
	data := fmt.Sprintf("%s|%s|%s|%d", operation, groupID, userID, timestamp)

	// Go map iteration order is not deterministic, so sort the keys to make
	// the signer and the verifier produce identical bytes.
	keys := make([]string, 0, len(extraData))
	for key := range extraData {
		keys = append(keys, key)
	}
	sort.Strings(keys)

	for _, key := range keys {
		data += fmt.Sprintf("|%s:%v", key, extraData[key])
	}

	return []byte(data)
}

In the code block above, we implemented some cryptographic primitives for securing the p2p data access and operation layer. We want to make sure that only participants can contribute to the network, and we are doing this by providing authentication, integrity, and non-repudiation. This utility package implements ECDSA (Elliptic Curve Digital Signature Algorithm) for signing and verifying operations in the peer-to-peer network.
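To see what integrity buys us, here is a small self-contained round trip: sign an operation, verify it, then show that a tampered payload fails verification. This sketch is separate from the utility package above and makes its own assumptions: `SignedOp`, `NewKey`, `Sign`, and `Verify` are hypothetical names, the payload string is made up, and it uses the standard library's ASN.1 signature helpers (`ecdsa.SignASN1` and `ecdsa.VerifyASN1`) instead of the fixed-width r || s encoding.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/sha256"
	"fmt"
)

// SignedOp is a hypothetical envelope: the operation bytes plus a signature.
type SignedOp struct {
	Payload   []byte
	Signature []byte // ASN.1 DER-encoded ECDSA signature
}

// NewKey generates a P-256 key pair for a peer.
func NewKey() (*ecdsa.PrivateKey, error) {
	return ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
}

// Sign hashes the payload with SHA-256 and signs the digest.
func Sign(priv *ecdsa.PrivateKey, payload []byte) (SignedOp, error) {
	digest := sha256.Sum256(payload)
	sig, err := ecdsa.SignASN1(rand.Reader, priv, digest[:])
	if err != nil {
		return SignedOp{}, fmt.Errorf("sign: %w", err)
	}
	return SignedOp{Payload: payload, Signature: sig}, nil
}

// Verify recomputes the digest and checks the signature against it.
func Verify(pub *ecdsa.PublicKey, op SignedOp) bool {
	digest := sha256.Sum256(op.Payload)
	return ecdsa.VerifyASN1(pub, digest[:], op.Signature)
}

func main() {
	priv, err := NewKey()
	if err != nil {
		panic(err)
	}

	op, err := Sign(priv, []byte("add_expense|group-1|alice|1700000000|amount:25"))
	if err != nil {
		panic(err)
	}
	fmt.Println("original verifies:", Verify(&priv.PublicKey, op))

	// A relaying peer changes the amount but cannot produce a new signature.
	tampered := SignedOp{
		Payload:   []byte("add_expense|group-1|alice|1700000000|amount:2500"),
		Signature: op.Signature,
	}
	fmt.Println("tampered verifies:", Verify(&priv.PublicKey, tampered))
}
```

The same check covers authentication as well: a signature made with a different peer's key fails against Alice's public key, so a peer cannot submit operations in her name.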

Final Thoughts

Building peer-to-peer applications requires a shift in thinking from regular client-server architectures. While P2P applications come with their own set of challenges, highlighted in this article, such as NAT traversal, peer discovery, and consistency management, they offer compelling advantages in terms of resiliency and resource distribution.

For a complete version of the peer-to-peer application behind the sample code above, here is the repo containing a work-in-progress but fully operational p2p system as described in this article.

Building Actual Peer-to-peer Applications: Outside Misconceptions