Since the early days of the internet, the client-server model has been one of its fundamental characteristics. In this model, a client such as a web browser requests a resource and a server responds with it. As internet adoption expanded, so did the number of clients.
Another interesting thing about the internet is that it was designed for speed, not security. Its core architecture is about moving packets of data from place to place quickly, and security was left to the nodes on the network. HTTP, which stands for Hypertext Transfer Protocol, is the foundation of data communication for the World Wide Web. Resources on the web, such as websites, audio, video, and files, are transferred using HTTP. It is an application-layer protocol that lets people navigate the web through hyperlinks.
The result of overlooking security has become apparent some 50 years later. Cyber attacks are more common, state and non-state actors target individuals for nefarious purposes, and enterprises are increasingly vulnerable to large-scale DDoS attacks.
What happened to HTTP?
To solve some of these issues, IPFS was developed. IPFS stands for InterPlanetary File System and is a peer-to-peer hypermedia protocol; as the name suggests, it is a distributed file system. The protocol allows nodes to exchange all types of hypermedia, such as audio, video, and text, in a decentralized way. Since the network is peer-to-peer (P2P), there is no central point of failure, and users can often obtain content faster because it can come from any peer that holds a copy. This is made possible by a data structure called a Merkle DAG (Directed Acyclic Graph), in which every node is identified by a hash of its content and of the links to its children.
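To make the Merkle DAG idea concrete, here is a minimal sketch in Python. It is not how IPFS encodes its objects: the chunk size, the JSON encoding, and the names `node_hash`, `build_dag`, and `read_dag` are assumptions made for illustration. It only shows the core property that a node's identifier depends on its data and on the identifiers of its children.

```python
from __future__ import annotations

import hashlib
import json


def node_hash(data: bytes, links: list[str]) -> str:
    """Hash a node's payload together with the hashes of its children."""
    encoded = json.dumps({"data": data.hex(), "links": links}, sort_keys=True)
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


def build_dag(content: bytes, chunk_size: int = 4) -> tuple[str, dict[str, dict]]:
    """Split content into chunks and return (root_hash, store).

    Each chunk becomes a leaf node; a single root node links to every leaf.
    Identical chunks hash to the same key, so they are stored only once.
    """
    store: dict[str, dict] = {}
    leaf_hashes: list[str] = []
    for start in range(0, len(content), chunk_size):
        chunk = content[start:start + chunk_size]
        leaf = node_hash(chunk, [])
        store[leaf] = {"data": chunk, "links": []}
        leaf_hashes.append(leaf)
    root = node_hash(b"", leaf_hashes)
    store[root] = {"data": b"", "links": leaf_hashes}
    return root, store


def read_dag(root: str, store: dict[str, dict]) -> bytes:
    """Reassemble the content by following links from the root."""
    node = store[root]
    return node["data"] + b"".join(read_dag(h, store) for h in node["links"])


root_a, store_a = build_dag(b"hello ipfs")
root_b, _ = build_dag(b"hello IPFS")
print(read_dag(root_a, store_a))  # the original bytes, rebuilt from the chunks
print(root_a != root_b)           # a change in any chunk changes the root hash
```

Because the root hash covers every leaf hash, knowing the root is enough to check every chunk received from a peer.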
Currently, HTTP (Hypertext Transfer Protocol) is used for the above purposes, but it has some critical drawbacks:
- Centralization
- Inefficiency
- Ephemerality
Centralization
HTTP requests involve two parties: the client and the server. If the server decides to change the content, or to deny access to it, there is nothing the client can do. To authenticate the identity of a website, an SSL/TLS certificate is needed, and it can only be obtained from a trusted third-party provider (a certificate authority). The security of such certificates therefore also depends on that third party. Moreover, technology giants have ended up controlling the flow of information.
Inefficiency
The World Wide Web uses HTTP to exchange content. In this scheme, servers and CDNs (content delivery networks) provide access to websites, media, and more. In a P2P network, both dedicated servers and CDNs can be replaced by peers that share content when it is requested. Servers may go down or be attacked by cyber-criminals, but a P2P network is much harder to take down. IPFS can also speed up access to such resources, since any node that hosts the data can serve it over the P2P network.
Ephemerality
Over the years, information originally posted on the internet has slowly moved out of reach, because content is archived, servers are lost, or funding runs dry. A P2P network can ensure (in theory) that files don't get lost, provided nodes are given an incentive to store them.
An Example
IPFS attempts to overcome some of these drawbacks through the clever use of cryptography and decentralization. In fact, it is an entire protocol suite, which means it can manage node identities, connections to peers, routing, data exchange, and name resolution.
Here is how it works. Suppose you and a friend wish to collaborate on a computer program. You can use git, but collaboration then typically depends on the availability of a central repository where the source code is hosted, such as GitHub. As history shows, a central point of failure is always dangerous: GitHub has faced DDoS attacks in the past. If you were both using IPFS to share the code, losing it to an attack on a single host becomes much less likely. This is because nodes on IPFS can choose to pin files, which keeps them stored and available to others. If you pin the file containing the source code, you will still have a copy even if your friend loses theirs.
This example may not seem compelling for sharing files between two people, but consider the advantage when every node can share files in this manner. One implication is that the internet becomes far more permanent, as long as users choose to actively keep copies of the files they consider important. Websites don't go down and content remains available as long as at least one node keeps a copy, and content can be delivered faster than it is today. IPFS can also model versioned data, much like a version control system, which means you can collaborate with people on, say, a blog, just as you would use git for software development.
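The pinning idea can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions: the `Node` class, `content_id`, and `fetch` are hypothetical names, peers are plain Python objects rather than machines on a network, and a SHA-256 hex digest stands in for a real IPFS content identifier.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Stand-in content identifier: the SHA-256 hex digest of the data."""
    return hashlib.sha256(data).hexdigest()


class Node:
    """A toy peer that keeps only the content it has chosen to pin."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.pinned = {}

    def pin(self, data: bytes) -> str:
        cid = content_id(data)
        self.pinned[cid] = data
        return cid

    def unpin(self, cid: str) -> None:
        self.pinned.pop(cid, None)


def fetch(cid: str, peers):
    """Return the content from the first peer that holds a valid copy."""
    for peer in peers:
        data = peer.pinned.get(cid)
        # Re-hash the data so a corrupted copy is never accepted.
        if data is not None and content_id(data) == cid:
            return data
    return None


you, friend = Node("you"), Node("friend")
source = b"print('hello world')"
cid = you.pin(source)
friend.pin(source)

friend.unpin(cid)                            # your friend loses their copy
print(fetch(cid, [friend, you]) == source)   # still retrievable from your node

you.unpin(cid)                               # nobody pins it any more
print(fetch(cid, [friend, you]))             # no peer can serve it now
```

The last two lines show the flip side of the permanence claim: content survives only while at least one node keeps it pinned.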
A question people often ask is: how can I be sure that someone will have the file, website, or video that I want? In IPFS, nodes pin files of their own choosing, so availability is not guaranteed by default. Giving nodes an incentive to store other people's files is where blockchains, and hence cryptocurrencies, enter the mix.
Crypto* and IPFS
In IPFS, every node has its own public and private keys. The public key is hashed to generate the Node ID, which uniquely identifies a node on a network communicating over IPFS. Unlike HTTPS, where SSL/TLS certificates are used for authentication, an IPFS node can authenticate itself by signing a small piece of data with its private key. Moreover, to incentivise nodes to store content, a cryptocurrency specific to the network can be created. Sia is a decentralized storage platform that uses two digital tokens, called Siacoins and Siafunds. Another decentralized storage network is Filecoin, whose cryptocurrency is also called Filecoin. Filecoin works as an incentive layer on top of IPFS, allowing nodes to mine blocks based on how much storage they provide. It uses a consensus mechanism called Proof-of-Spacetime, in which blocks are created by miners who store data.
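As a rough sketch of how an identifier can be derived from a public key, the snippet below hashes some key bytes with SHA-256, prepends the two-byte multihash header for SHA-256 (0x12 for the hash function, 0x20 for the digest length), and encodes the result in base58. This is a simplification: the function names are hypothetical, the random bytes merely stand in for a real public key, and a real node also serializes the key in a specific format before hashing, which is skipped here.

```python
import hashlib
import os

# Bitcoin-style base58 alphabet (no 0, O, I, or l).
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58_encode(raw: bytes) -> str:
    """Encode bytes as base58 text."""
    number = int.from_bytes(raw, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = ALPHABET[remainder] + encoded
    # Each leading zero byte is written as the first alphabet character.
    padding = len(raw) - len(raw.lstrip(b"\x00"))
    return ALPHABET[0] * padding + encoded


def node_id_from_public_key(public_key: bytes) -> str:
    """Derive an illustrative node ID: base58(multihash(sha256(key)))."""
    digest = hashlib.sha256(public_key).digest()
    multihash = bytes([0x12, 0x20]) + digest  # sha2-256, 32-byte digest
    return base58_encode(multihash)


# Assumption: 32 random bytes stand in for a real public key.
stand_in_public_key = os.urandom(32)
print(node_id_from_public_key(stand_in_public_key))
```

Since the ID is a hash of the key, anyone who receives the public key can recompute the ID and check that it matches the node they meant to talk to.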
Hyperlinks in IPFS
Can you see the address bar on this page? It shows the link to this content. On closer observation, the address consists of the name of the website and the location of this page on that site. The website name is eventually converted to an IP address by a process called DNS resolution. This means that if the address changes, say when the website is renamed or the page is moved to some other location, people won't be able to access the page without the new link. Another problem is that if I change the content of this page after you have shared the link with a friend, what you saw before sharing the link may not be what your friend sees later. In simple terms, addresses may point to different things at different times.
IPFS, on the other hand, uses content-based addressing instead of today's location-based addressing. Objects such as websites and images are identified by a unique hash of their content. So if you know the correct hash of an object, you can be sure that you are seeing the correct version.
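The difference between the two addressing schemes can be sketched in a few lines of Python. The dictionaries below are stand-ins for the web and for an IPFS-like store, `example.org/post` is a made-up location, and a SHA-256 hex digest is used in place of a real IPFS content identifier.

```python
import hashlib


def address_of(content: bytes) -> str:
    """Content-based address: derived from the bytes themselves."""
    return hashlib.sha256(content).hexdigest()


# Location-based addressing: one address, whatever currently lives there.
web = {"example.org/post": b"version 1"}
seen_by_you = web["example.org/post"]
web["example.org/post"] = b"version 2"       # the author edits the page
seen_by_friend = web["example.org/post"]
print(seen_by_you == seen_by_friend)         # same link, different content

# Content-based addressing: the address changes whenever the content does.
store = {}


def put(content: bytes) -> str:
    address = address_of(content)
    store[address] = content
    return address


def get(address: str) -> bytes:
    content = store[address]
    # Anyone can verify what they received by hashing it again.
    if address_of(content) != address:
        raise ValueError("content does not match its address")
    return content


v1 = put(b"version 1")
v2 = put(b"version 2")
print(v1 != v2)          # each version has its own address
print(get(v1))           # the old address still yields the old version
```

A link built from `v1` can never silently start returning the second version, which is exactly the guarantee that location-based links lack.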
Future of IPFS
IPFS can be used as a mounted global filesystem, a data sharing system, a linked and encrypted communications platform, and more. Other uses of IPFS include social media networks, peer-to-peer marketplaces, and infrastructure for DApps (decentralized apps). Since IPFS is still in active development, more use cases will come up as time passes.
Stay tuned for more!