The dark web 101: what it is, how it works, and why people use it
Fun fact: the internet only connects about 5.5 billion people as of 2024. Not surprisingly, access is most widespread in developed countries, as you can see in the animation below from the United Nations.

There are infinite ways to classify the internet, but I think that the most mysterious one is the breakdown of clearnet, deepweb, and darkweb. Most people that are aware of the distinction between those three terms don’t get too hot and bothered about the darkweb or the deepweb, but to the layman, the darkweb sounds like a scary place where you can only find spooky and unsavory things. I’ve always had a fascination with tech so I thought I’d do some research myself to understand things a little better.
Unfortunately, to truly understand the dark web you also kind of need to understand the basics of what makes the regular old internet tick. If you’re already familiar with the basics, you can skip ahead to Some Definitions.
The Modern Internet
A Flash of Light as the Birth of the Web
Before the mushroom clouds in the Oppenheimer movie were a twinkle in Christopher Nolan’s eye, there was the real Oppenheimer. He’s famous for his quotes, one of my favorites being: “We knew the world would not be the same. A few people laughed, a few people cried.” No one at the time could have imagined that the nuclear bomb would spark a mad race to prevent further destruction which resulted in the internet.
In the 1960s, the United States Air Force was trying to improve its ability to ensure mutually assured destruction (MAD). My seventh-grade history teacher loved that term for all the right reasons. MAD was the big idea to prevent future nuclear conflict. It all boiled down to the most basic of human fears: if you nuke us, we nuke you. Simple game theory says that the stable equilibrium is no nuclear launches as long as both sides can still communicate and coordinate a retaliatory strike.
To guarantee that communication, the military needed systems that could survive jamming, attacks, and partial blackouts. These systems needed to send messages over long distances without relying on a direct connection.
Enter packet switching: the foundation of the modern internet.
Packet Switching, ARPANET, and TCP/IP
The core issue with point-to-point communication systems is that we want to send arbitrarily long messages over an unreliable messaging medium (all of them are, to some extent). For the sake of argument, let’s say that you’re trying to send a message cross country. It’s critically important that the message arrive and you don’t want to take any chances that your recipient misses even a piece.
You could send the message in its entirety over radio, or send an encoded/encrypted version over Morse code or telegram, but intermittent connections mean data loss.
Packet switching is the idea that you can chunk a message into small packets of information that are transmitted over an arbitrary network medium. Each packet need only contain a marker for where it starts, the destination address, the sender’s address, an ID for ordering the pieces, the piece of the message itself, and lastly a little marker for where it ends. This way, instead of needing to send the whole message from point A to point B in one go, you can reliably transport most of it in pieces and just have the recipient re-request any missing pieces. You can see a sample of a data packet here:

If you break a message into 100 packets and your network drops 10%, you just resend the missing packets until the recipient has all of them. All you need is a network of computers willing to relay packets. The first such network was ARPANET, created by ARPA (the agency now known as DARPA). ARPANET didn’t initially support lost packet recovery - if the network failed to deliver, the sender and receiver simply got stuck waiting.
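To make the "resend the missing pieces" idea concrete, here is a minimal sketch in Python. It is a toy simulation, not how any real network stack is written: the function names, the 8-byte packet size, and the random "lossy network" are all my own assumptions for illustration.

```python
import random


def packetize(message, size):
    """Split a message into numbered chunks of at most `size` bytes."""
    return {seq: message[i:i + size]
            for seq, i in enumerate(range(0, len(message), size))}


def lossy_send(packets, loss_rate, rng):
    """Pretend network: each packet independently survives with probability 1 - loss_rate."""
    return {seq: data for seq, data in packets.items()
            if rng.random() >= loss_rate}


def transfer(message, size=8, loss_rate=0.1, seed=0):
    """Keep resending whatever the receiver is still missing until it has everything."""
    rng = random.Random(seed)
    packets = packetize(message, size)
    received = {}
    rounds = 0
    while len(received) < len(packets):
        # The receiver "re-requests" only the sequence numbers it has not seen yet.
        missing = {seq: packets[seq] for seq in packets if seq not in received}
        received.update(lossy_send(missing, loss_rate, rng))
        rounds += 1
    # Sequence numbers let the receiver put the pieces back in order.
    reassembled = b"".join(received[seq] for seq in sorted(received))
    return reassembled, rounds


message = b"The quick brown fox jumps over the lazy dog"
result, rounds = transfer(message, size=8, loss_rate=0.1, seed=42)
print(result == message, "after", rounds, "round(s) of sending")
```

The point is just that sequence numbers plus a retry loop are enough to get a complete message across a network that loses some of its packets.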
To fix this, early developers created a protocol that sits on top of the network called Transmission Control Protocol/Internet Protocol (TCP/IP). This protocol had roughly 4 ground rules:
- Each distinct sub-network in the broader network could/should stand on its own. No overarching network could impose requirements for other networks to connect.
- Communications (i.e. packets) would be forwarded on a best-effort basis. If a packet didn’t make it to the final destination, it would be up to the packet source to retransmit.
- Intermediate connection points would store no information about the individual flows of packets, to keep them simple and low latency.
- No global authority would/could own the network
They also needed to handle the following issues:
- Algorithms to prevent lost packets from permanently disabling communications and enabling them to be successfully retransmitted from the source.
- Providing for host-to-host “pipelining” so that multiple packets could be en route from source to destination at the discretion of the participating hosts, if the intermediate networks allowed it.
- Gateway functions to allow it to forward packets appropriately. This included interpreting IP headers for routing, handling interfaces, breaking packets into smaller pieces if necessary, etc.
- The need for end-to-end checksums, reassembly of packets from fragments and detection of duplicates, if any.
- The need for global addressing
- Techniques for host-to-host flow control.
- Interfacing with the various operating systems
- There were also other concerns, such as implementation efficiency, internetwork performance, but these were secondary considerations at first.
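A couple of the items above (end-to-end checksums, reassembly, and duplicate detection) are easy to sketch. The snippet below is only an illustration under my own assumptions: it uses CRC-32 from Python's `zlib` as a stand-in checksum (real TCP uses a different 16-bit checksum), and the `make_segment` and `receive` helpers are hypothetical names, not part of any real protocol implementation.

```python
import zlib


def make_segment(seq, payload):
    """Attach a sequence number and a checksum to a chunk of data."""
    return {"seq": seq, "payload": payload, "checksum": zlib.crc32(payload)}


def receive(segments):
    """Reassemble a message, ignoring corrupted and duplicate segments."""
    accepted = {}
    for seg in segments:
        if zlib.crc32(seg["payload"]) != seg["checksum"]:
            continue  # damaged in transit; the sender would have to retransmit it
        if seg["seq"] in accepted:
            continue  # already have this piece, so it is a duplicate
        accepted[seg["seq"]] = seg["payload"]
    # Sequence numbers restore the original order regardless of arrival order.
    return b"".join(accepted[seq] for seq in sorted(accepted))


# Segments arrive out of order, one is duplicated, and one is corrupted
# before a clean retransmission shows up.
corrupted = dict(make_segment(2, b"!"), payload=b"?")
arrivals = [
    make_segment(1, b"world"),
    make_segment(0, b"hello "),
    make_segment(1, b"world"),
    corrupted,
    make_segment(2, b"!"),
]
print(receive(arrivals))
```

Note that the checks live entirely at the two ends of the conversation, which fits the ground rule that intermediate connection points keep no per-flow state.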
HTTP/HTTPS
There’s a fundamental issue with sending TCP/IP packets: packet contents are cleartext. Anyone along the route can read them, like passing a note to your crush through a classroom and every kid opens it on the way.
What we need is some way to hide messages from prying eyes while still letting the correct recipient, and nobody else, read them. Enter encryption. Cryptography is a field that is far too deep to even start with a 101 in this post, but maybe I’ll get into it another time. For now, it suffices to say that there is a protocol called Hypertext Transfer Protocol (HTTP) which sends data between two communicating services, and that it can be encrypted using Transport Layer Security (TLS). This encrypted variant of HTTP is called HTTPS (HTTP Secure).
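If you want to see what "HTTP wrapped in TLS" looks like from a program, Python's standard library is enough for a rough sketch. The `fetch_status` helper and the `example.com` host are placeholders of my own; the only thing the snippet actually runs is the creation of a default TLS context, which checks certificates and hostnames out of the box.

```python
import http.client
import ssl

# A default context verifies the server's certificate chain and hostname.
context = ssl.create_default_context()
print("certificate required:", context.verify_mode == ssl.CERT_REQUIRED)
print("hostname checked:", context.check_hostname)


def fetch_status(host, path="/"):
    """Make one HTTPS request and return the HTTP status code.

    Hypothetical helper: it needs network access, so it is defined here
    but not called.
    """
    conn = http.client.HTTPSConnection(host, context=context, timeout=10)
    try:
        conn.request("GET", path)
        return conn.getresponse().status
    finally:
        conn.close()


# Example (requires network access): fetch_status("example.com")
```

Everything inside the request (path, headers, body) is encrypted on the wire, but the fact that your machine opened a connection to that server is still visible to anyone watching the network.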
Summary
The modern internet is a stack of different protocols:
- Physical - the actual wires connecting computers
- Data link - allows sending “frames” of data
- Network - layer that coordinates sending packets across a network, including addressing, routing, and traffic control
- Transport (TCP, UDP)
- Session - managing continuous streams of data between two clients
- Presentation - encoding, compression, encryption/decryption
- Application (HTTP, HTTPS, FTP, FTPS, SFTP, SMTP) - high level protocols that coordinate how to actually exchange data
These layers enable encrypted communication over unreliable networks. Fleshing out this summary would take a literal textbook, so I won’t go into more detail on ARPANET, TCP/IP, HTTP, SSL, or TLS. If you’re interested, you can follow A Brief History of the Internet and online resources like the OSI Model overview on Wikipedia that describe the modern web stack. Those articles have much better technical writers than I am and include more detail than I’ll ever be able to address.
With the basics covered, we can define the terms from the introduction.
Some Definitions
- clearnet, surface web - the shallowest layer of the internet. Sites that are accessible by the broad public with limited anonymity or barriers to entry. For example, public Twitter, the NYTimes, Reddit, etc. You can confidently send messages to and receive messages from clearnet sites, but you can’t guarantee that the communication is encrypted or anonymous. Snoops can figure out who the communicating parties are pretty easily by monitoring packet traffic.
- deep web - hidden and unindexed websites. You can’t find these by searching Google, and you may need to log in to even know they exist. This includes banking websites and pages hidden behind paywalls.
- dark web - web sites and services which are unindexed, and inaccessible except by special encryption schemes that statistically guarantee the user and service’s anonymity and security against the most basic attack vectors.

Who Needs the Darkweb?
Even though the modern internet stack using HTTPS hides message contents, it does not hide WHO is communicating. To normal people, this is no big deal. But it could literally be life or death for:
- Political dissidents: try being a Russian critic of Putin or a Chinese critic of the CCP. These folks could use a way to communicate and share information without anybody knowing that they’re communicating
- Journalists: need to be able to communicate with sources securely (provided by HTTPS) AND ANONYMOUSLY
- Whistleblowers: they’ll need to communicate anonymously with tip off points
- Criminals: probably obvious why they would want to communicate anonymously
- People looking to prevent digital advertisers from tracking them
- People that want to support better privacy for all internet users
The dark web solves this problem by introducing a statistically anonymous communication protocol.
The Onion Router
If you’re in any of the above categories, you might want to use onion routing.
The protocol is pretty simple. Instead of sending messages directly from sender to recipient, packets are routed in a secure layer from sender to intermediary 1, then intermediary 2, then to intermediary 3 and finally to the recipient.
To guarantee anonymity of the sender and recipient, the original message is wrapped in multiple layers of encrypted envelopes. If intermediaries pass enough messages, there is a statistically low probability that somebody watching messages passed through the network could back out the original sender and intended recipient. That snoop would need to be watching a large portion of the network simultaneously.
Palo Alto Networks has a pretty good overview of TOR. The procedure is roughly as follows:
- A user builds a TOR circuit by selecting three nodes to relay messages and sets up an encryption key with each of them. Let’s call them N1, N2, and N3.
- The user encrypts the private message that she wants to send in three layers. Let’s say encryption of a message M with the key for node i is denoted by fi(M). The user sends a message M1 that looks like f1(f2(f3(M))) to relay node N1.
- Node 1 decrypts its layer of M1 to get the inner message M2 = f2(f3(M)). It reads that the next node in sequence is N2 and forwards M2 to N2.
- Node 2 decrypts its layer of M2 to get the inner message M3 = f3(M). It reads that the next node in sequence is N3 and forwards M3 to N3.
- Node 3 decrypts its layer of M3 to get M. It reads that the next node in sequence is the intended recipient (R) and forwards M to R.
- The recipient receives the message and decrypts it. The recipient now has the option to reply, and each relay node can rewrap messages at each step of the way to send messages back.
What’s great about this is that at each step, the sender only knows the alias for the service, and each relay only knows about their immediate neighbors in the message. Even better, the receiver knows nothing about the sender except the contents of the message, and the sender knows nothing about the receiver except the alias of the service and the responses that they send.
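The layering is easier to see in code. Below is a toy sketch of the wrap-and-peel procedure, with loud caveats: the "cipher" is a homemade XOR keystream built from SHA-256 that is NOT secure and is not what TOR uses, the keys are hard-coded strings, and `wrap` and `peel` are names I made up. It only shows the shape of the idea, where each relay can remove exactly one layer and learns nothing but the next hop.

```python
import hashlib


def keystream_xor(key, data):
    """Toy cipher for illustration only: XOR with a SHA-256-derived keystream.

    Applying it twice with the same key returns the original data.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def wrap(message, route):
    """Build the onion from the inside out.

    `route` is a list of (relay_key, next_hop_name) pairs, first relay first.
    """
    onion = message
    for key, next_hop in reversed(route):
        onion = keystream_xor(key, next_hop.encode() + b"|" + onion)
    return onion


def peel(key, onion):
    """What one relay does: remove its layer, read the next hop, pass the rest on."""
    plain = keystream_xor(key, onion)
    next_hop, _, inner = plain.partition(b"|")
    return next_hop.decode(), inner


# Hypothetical keys for relays N1, N2, N3; the last hop is the recipient R.
route = [(b"key-N1", "N2"), (b"key-N2", "N3"), (b"key-N3", "R")]
packet = wrap(b"meet at noon", route)

hops = []
for key, _ in route:
    next_hop, packet = peel(key, packet)
    hops.append(next_hop)

print(hops, packet)
```

Notice that N1 only ever sees "forward this blob to N2", and N3 only sees "forward this message to R". No single relay in the middle of the chain sees both the sender and the recipient.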

Let’s assume now that there is some globally omniscient observer that can view every connection and every message in and out of every network node.
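If the communicators use TOR, then it becomes much harder for said observer to correlate the first wrapped message with the final message sent to the receiver, because the message looks different at every hop. It is not impossible, though: an observer that really can watch both ends of a circuit can still try to match traffic by timing and volume, and an observer that can control all of the relay nodes in a circuit can simply unwrap the whole route.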
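To put a rough number on "control all of the relay nodes", here is a back-of-the-envelope calculation. It assumes the three relays are picked uniformly at random from the whole network, which is a simplification of mine (as far as I know, real TOR clients weight their choices and reuse entry nodes), and the network sizes are made-up numbers.

```python
import math


def p_all_relays_compromised(total_relays, compromised_relays, hops=3):
    """Chance that every relay in a randomly chosen circuit belongs to the adversary.

    Assumes `hops` distinct relays drawn uniformly at random (a simplification).
    """
    return math.comb(compromised_relays, hops) / math.comb(total_relays, hops)


# Hypothetical network: 1,000 relays, of which an adversary runs 100.
p = p_all_relays_compromised(1000, 100)
print(f"{p:.6f}")
```

Under these assumptions, an adversary running 10% of the relays owns a full circuit roughly once in a thousand tries, which is why a snoop needs to control or watch a large portion of the network before this kind of attack pays off.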

Operational Security (OpSec) Considerations
Before we start, you might ask: why do I need this? Presumably you want to avoid letting bad actors (read: Facebook? OpenAI? Hostile governments?) surveil you. You also want to avoid exposing yourself to attacks and theft.
There are people that write multiple textbooks on the topic of opsec. I can’t possibly cover enough to save you from yourself, and frankly I bet I don’t know enough to protect myself 100% (does anybody really?). With that caveat in mind, read the below posts to maintain your safety when browsing the dark web. As it turns out, these tips are also applicable to the regular web.
You should start with reading the basics on this GitHub gist and this Reddit post. Obviously, there are levels to this, and the only people that truly need to go to the furthest extremes are probably sending messages via carrier pigeon.
Common mistakes that put you at risk
- ID re-use: reusing the same usernames, e-mail addresses or passwords across different services and platforms can make you more susceptible to risk. If one service gets hacked or has bad security practices, your sensitive access credentials are exposed more broadly.
- Weak authentication: simple passwords or lack of multi-factor authentication (MFA) can leave you open to simple brute force attacks or other exploits. I won’t get into how modern password security is implemented - you can do some digging on that.
- Sharing too much: just like saying too much makes you a target, revealing too much info online can do the same. Real g’s move in silence like lasagna.
- Predictable routines: go to the same website every day at the same time? Anybody snooping your traffic might be able to identify you by using that information.
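On the weak authentication point, a little arithmetic shows why "simple" passwords fall to brute force. This is only a sketch: the guess rate below is a number I picked for illustration rather than a measured figure, and it ignores smarter attacks like dictionary lists or reused credentials.

```python
import math


def keyspace(alphabet_size, length):
    """Number of possible passwords of exactly this length."""
    return alphabet_size ** length


def entropy_bits(alphabet_size, length):
    """The same quantity expressed in bits."""
    return length * math.log2(alphabet_size)


def worst_case_seconds(alphabet_size, length, guesses_per_second):
    """Time to try every possible password at a fixed guess rate."""
    return keyspace(alphabet_size, length) / guesses_per_second


GUESSES_PER_SECOND = 1e9  # hypothetical attacker speed, purely an assumption

for label, alphabet, length in [
    ("6 lowercase letters", 26, 6),
    ("12 printable ASCII characters", 95, 12),
]:
    seconds = worst_case_seconds(alphabet, length, GUESSES_PER_SECOND)
    print(f"{label}: {entropy_bits(alphabet, length):.1f} bits, "
          f"{seconds:.3g} seconds to exhaust")
```

Length and a bigger character set both multiply the work an attacker has to do, and MFA helps precisely because it means a guessed password is not enough on its own.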
Frequently Asked Questions
I told my friends that I’m writing something about this and heard a suite of questions… some better than others.
1. Is the dark web illegal?
I cannot emphasize this enough: this is not legal advice and is merely my personal, non-professional opinion. That said, the legality of the dark web depends on jurisdiction. If you live in the United States and are not otherwise committing a crime, the law as of the time of this posting suggests that you are not committing any crimes.
In other countries, this is not necessarily the case. It’s possible that using TOR services can put you at significant personal risk. For example, I can imagine certain east Asian countries might not be too thrilled.
2. Can the police track TOR usage?
I cannot emphasize this enough: this is not legal advice and is merely my personal, non-professional opinion. In theory, it is absolutely possible for police to track TOR usage. In practice, this should be exceedingly difficult because anonymous and uncompromised relay hosts probably aren’t actively sharing your messages with law enforcement. If you’re worried that the police are using their limited resources to track your TOR usage, it’s probably more likely that they’re using higher quality and lower tech exploits (ever heard of wire-tapping warrants?).
3. What’s the difference between Tor and VPNs?
A Virtual Private Network (VPN) acts as an overlay to pass messages between participants. The VPN generally has a known set of relays or at least a set of relays that are all controlled by the same provider. As such, compromising the intermediary means compromising your privacy.
TOR is a protocol that passes messages between relay circuit members in a big game of telephone. The relay strategy allows roughly anonymous communication between parties without requiring a dedicated and trusted third party to act as the relay.
Conclusion
One of my favorite authors, Douglas Adams, says: “Space is big. You just won’t believe how vastly, hugely, mind-bogglingly big it is. I mean, you may think it’s a long way down the road to the chemist’s, but that’s just peanuts to space.”
Even if I’m not as entertaining as the Hitchhiker’s Guide to the Galaxy, I can at least say that the internet is pretty big too. Most of us are exposed to the internet via public services, but that’s just the tip of the iceberg. There is just so much data hidden below the surface, just out of reach, and hidden deep so nobody can find it without knowing it’s there.
It might seem dark and mysterious, but at the end of the day these deep and hidden services are just another evolution of the data privacy craze that started millennia ago with basic cyphers and has extended into the data age. I’m 100% confident that encryption will continue to evolve in ways that we could never have imagined.
In the meantime, let me know if there are any topics around the dark web that you’d like me to dig into in further detail.
#Onion #Tor #Opsec #Privacy #Dark Web #Internet #Tls #Arpanet #Clearnet