TLDR:
Storing data on-chain is expensive, so most NFTs store only a URI on-chain that points to external data storage holding their NFT metadata.
Centralised data storage like Amazon’s S3 is an option, but this loses the decentralised nature of the NFT since the data is held by a third party.
Decentralised data storage like IPFS and Arweave solves this, as the data is stored in a separate decentralised network specialised in data storage.
Another week, another topic! This week I’m going to discuss why developers prefer to use IPFS over AWS in the Web3 space, and what these names even mean.
As I mentioned a few weeks back in the post on Consensus Algorithms, I’m working on a free short course on “NFT Tech”! I’m excited about the course as I hope it’ll give people without a technical background a better understanding of the fundamental concepts behind this powerful technology.
The course will ease people in, so more complex topics like today’s will likely appear later in the course.
On-chain storage is expensive
NFTs store data on the blockchain that’s directly associated with them. For example, if an NFT represents a piece of digital art then this artwork needs to be stored with that NFT somehow.
However, storing data on-chain is expensive by design, so that blockchains don’t grow into massive files that only massive data warehouses can maintain. Keeping blockchains relatively small makes it less costly to run an individual node, which helps their networks stay decentralised.
So, although some developers do find neat tricks to store NFT data directly on-chain, the vast majority bypass this cost by saving only a URI to the metadata on-chain!
With Ethereum NFTs this metadata usually follows the “ERC721 Metadata JSON Schema” like the one I grabbed from Azuki below:
What is a URI?
URI stands for “Uniform Resource Identifier”, which sounds complicated, but in practical terms it’s just a link that points to a resource somewhere on the Internet. It works much like a website URL such as “https://google.com”, except that the start is not necessarily “https://”. For example, Azuki#0’s file above has the following URI:
ipfs://QmZcH4YvBVVRJtdn4RdbaqgspFU8gH6P9vomDpBVpAL3u4/0
However, if you copy and paste this URI directly into most browsers it won’t work; an exception is the Brave browser, which has native IPFS integration. Instead, you’ll need to turn it into a URL with “ipfs.io” at the start, like this:
https://ipfs.io/ipfs/QmZcH4YvBVVRJtdn4RdbaqgspFU8gH6P9vomDpBVpAL3u4/0
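The conversion above is mechanical enough to sketch in a few lines of Python. Treat this as a minimal sketch rather than an official IPFS tool: the function name ipfs_to_gateway_url and its gateway parameter are my own inventions, and it assumes the URI has the plain “ipfs://hash/path” shape shown above.

```python
IPFS_SCHEME = "ipfs://"


def ipfs_to_gateway_url(uri: str, gateway: str = "https://ipfs.io") -> str:
    """Turn an ipfs:// URI into an https:// URL served by an HTTP gateway.

    Anything that is not an ipfs:// URI is returned unchanged.
    """
    if not uri.startswith(IPFS_SCHEME):
        return uri
    # Everything after "ipfs://" is the content hash plus an optional path.
    path = uri[len(IPFS_SCHEME):]
    return f"{gateway.rstrip('/')}/ipfs/{path}"


if __name__ == "__main__":
    print(ipfs_to_gateway_url("ipfs://QmZcH4YvBVVRJtdn4RdbaqgspFU8gH6P9vomDpBVpAL3u4/0"))
    # prints https://ipfs.io/ipfs/QmZcH4YvBVVRJtdn4RdbaqgspFU8gH6P9vomDpBVpAL3u4/0
```

Passing a different gateway argument swaps “ipfs.io” for another gateway without touching the hash, which is the part that actually identifies the file.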
Within the metadata itself the image is also stored as a URI. In the Azuki metadata above you can see it stores the name, attributes, and an image with this URI:
ipfs://QmYDvPAXtiJg7s8JdRBSLWdgSphQdac8j1YuQNNxcGE1hg/0.png
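To show how an app might read those fields, here is a small Python sketch. The JSON in it is illustrative only: the image URI is the one quoted above, but the name and the attribute are placeholders I made up, not Azuki’s real values, and summarise_metadata is a hypothetical helper.

```python
import json

# Illustrative metadata only. The image URI comes from the post; the name and
# the single attribute are placeholders, NOT the real Azuki #0 values.
SAMPLE_METADATA = """
{
  "name": "Example NFT #0",
  "image": "ipfs://QmYDvPAXtiJg7s8JdRBSLWdgSphQdac8j1YuQNNxcGE1hg/0.png",
  "attributes": [
    {"trait_type": "Background", "value": "Placeholder"}
  ]
}
"""


def summarise_metadata(raw_json: str) -> dict:
    """Parse NFT metadata JSON and pull out the fields discussed above."""
    metadata = json.loads(raw_json)
    return {
        "name": metadata.get("name"),
        "image": metadata.get("image"),
        # Missing attributes are treated as an empty list.
        "trait_count": len(metadata.get("attributes", [])),
    }


if __name__ == "__main__":
    print(summarise_metadata(SAMPLE_METADATA))
```

Note that the image is just another URI, so whatever applies to where the metadata is stored applies to the artwork too.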
Centralised off-chain data
Considering that most NFTs store their metadata off-chain, and that we want our NFTs to live “forever” just like the blockchain will, it’s pretty important to understand exactly where that data is stored.
Most websites you interact with use centralised storage services such as Amazon’s cloud platform, AWS, or the equivalent from another big player like Google or Microsoft. Here these Big Tech companies look after your data and ensure that you can access it fast and reliably.
AWS’s most commonly used solution for pure data storage is called S3. An NFT’s metadata can be stored on S3 and there would be no immediate problems: it would still look the same within OpenSea, Blur or any other platform that visualises the NFT. However, what happens if at some point in the future Amazon ceases to exist? Or if it changes the way it handles its S3 service and the on-chain URI breaks?
While both are unlikely, they are still possible, and relying on a centralised player undermines much of the point of owning an NFT that is supposedly decentralised and able to last forever.
That’s where decentralised data storage like IPFS and Arweave comes in. Both offer alternatives to centralised data storage and are the most commonly used options in Web3. As you can see in the previous example, Azuki uses IPFS.
Decentralised off-chain data
Both IPFS and Arweave store data in an entirely different way to AWS, one in which you are not bound to a centralised entity.
IPFS stands for “InterPlanetary File System” and is a protocol that creates a decentralised file-sharing network where you can store folders and files across a network rather than on a single computer. You can think of it like an improved version of using torrents. But why would random people on the internet just store your data for you?
Well, that’s where Filecoin comes in. It is a cryptocurrency that adds a financial incentive for storage providers on the network to rent out their storage to clients and get paid. This may sound complicated, but at its core it’s not too dissimilar to how Bitcoin miners spend electricity to earn Bitcoin: here network participants rent out storage in return for Filecoin!
The beauty here is that when using IPFS you are no longer dependent on a single centralised third party. As long as you keep paying for your data to be stored, typically through a “pinning service”, you can be confident that it will continue to exist without needing to know who exactly is storing it for you.
Even “pinning services” don’t lock you in to a single provider. One of the most popular is Pinata, but you can use others like Infura or Temporal and switch between them if you want, since the IPFS URI stays the same no matter which service is keeping the data available. Moreover, uploading data to IPFS is also simple and free with apps like NFT.storage.
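The reason you can move between pinning services is that IPFS identifies data by a hash of its content rather than by where it is hosted. The toy Python sketch below illustrates only that idea. It is a deliberate simplification: real IPFS identifiers are built differently (the data is split into chunks and the hash is encoded in a special format), and toy_content_id is a made-up name, not part of any IPFS library.

```python
import hashlib


def toy_content_id(data: bytes) -> str:
    """Toy stand-in for an IPFS content identifier.

    Real IPFS identifiers are NOT plain SHA-256 hex strings; this only shows
    that the identifier depends on the bytes, not on who stores them.
    """
    return hashlib.sha256(data).hexdigest()


if __name__ == "__main__":
    artwork = b"pretend these bytes are an image"

    # Two different providers holding the same bytes derive the same ID.
    id_at_provider_a = toy_content_id(artwork)
    id_at_provider_b = toy_content_id(artwork)
    print(id_at_provider_a == id_at_provider_b)

    # Changing even one byte produces a different ID, so the file behind an
    # identifier cannot be silently swapped for another one.
    print(toy_content_id(artwork + b"!") == id_at_provider_a)
```

This is also a contrast with an ordinary web link, where whoever controls the server can change what the same address returns.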
Arweave is another solution that achieves something very similar to IPFS, providing a decentralised, permanent storage platform through a data structure called a "blockweave". Its main difference to IPFS is that it focuses on providing permanent storage rather than a file-sharing system. However, to the end user it looks pretty much the same.
And that’s a wrap for today!
So next time you see an Ethereum NFT, you can check what its tokenURI function returns on Etherscan to see whether its data lives on a more long-lasting decentralised storage system like IPFS or Arweave, or whether the project isn’t thinking about the long term and is storing it on a centralised service like AWS’s S3.
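If you’d rather not eyeball the result, the check can be roughed out in Python. This is a hedged sketch, not a definitive classifier: classify_token_uri is a hypothetical helper, it assumes Arweave links use the “ar://” prefix, and it only looks at the shape of the URI, so an ordinary “https://” link is flagged as possibly centralised without proving where the file really lives. The S3-style address in the example is invented for illustration.

```python
from urllib.parse import urlparse


def classify_token_uri(uri: str) -> str:
    """Guess the kind of storage behind a tokenURI from its shape alone."""
    parsed = urlparse(uri)
    scheme = parsed.scheme.lower()
    if scheme == "ipfs":
        return "IPFS"
    if scheme == "ar":  # assumption: Arweave URIs start with ar://
        return "Arweave"
    if scheme in ("http", "https"):
        # Gateways such as ipfs.io expose IPFS content under an /ipfs/ path.
        if parsed.path.startswith("/ipfs/"):
            return "IPFS via an HTTP gateway"
        return "ordinary web host (possibly centralised)"
    return "unknown"


if __name__ == "__main__":
    examples = [
        "ipfs://QmZcH4YvBVVRJtdn4RdbaqgspFU8gH6P9vomDpBVpAL3u4/0",
        "https://ipfs.io/ipfs/QmZcH4YvBVVRJtdn4RdbaqgspFU8gH6P9vomDpBVpAL3u4/0",
        # Hypothetical S3-style address, invented for this example.
        "https://example-bucket.s3.amazonaws.com/metadata/0.json",
    ]
    for uri in examples:
        print(classify_token_uri(uri), "<-", uri)
```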