Using Git Instead of Your Stupid Blockchain
1. Using Git Instead of Your Stupid Blockchain
Much has been made lately[1] of using the blockchain for supply chain applications. Proponents argue that it could enable greater transparency and allow companies to more easily locate and contain issues in their supply chain[2]. They aren't wrong about this, but they are wrong when they say "the right tool to solve this is a blockchain!"
1.1. Git vs Blockchain
Git and the blockchain were designed by different people to solve different problems, but at their base they share the same data structure: the Merkle tree. This data structure is the key to why blockchain proponents aren't entirely wrong. Briefly, a Merkle tree is a tree[3] in which every leaf is referenced by the hash[4] of its contents, while every non-leaf node is referenced by the hash of the hashes of its children. The result is that it becomes easy to verify that the data stored in the tree is what you expect it to be, and that it hasn't been tampered with[5][6].
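To make the "hash of hashes" idea concrete, here is a minimal sketch of a two-leaf Merkle tree in shell. It assumes the coreutils `sha256sum` tool is available, and the `hash_of` helper and the sample records are made up for illustration. It is not how Git computes its object IDs (Git also hashes a small header and uses SHA-1 by default); it only shows why changing one leaf changes the root.

```shell
# Hash helper: SHA-256 of stdin, hex digest only (assumes coreutils sha256sum).
hash_of() { sha256sum | cut -d' ' -f1; }

# Two leaves: each is referenced by the hash of its contents.
# The records themselves are hypothetical sample data.
leaf_a=$(printf '%s' '((sku . 1001) (lot . A17))' | hash_of)
leaf_b=$(printf '%s' '((sku . 1002) (lot . B03))' | hash_of)

# The parent is referenced by the hash of its children's hashes.
root=$(printf '%s%s' "$leaf_a" "$leaf_b" | hash_of)

# Tamper with one leaf and recompute: the root changes as well.
leaf_b_tampered=$(printf '%s' '((sku . 1002) (lot . B04))' | hash_of)
root_tampered=$(printf '%s%s' "$leaf_a" "$leaf_b_tampered" | hash_of)

echo "root:          $root"
echo "tampered root: $root_tampered"
```

Anyone holding the original root can detect the edit by recomputing it, without needing to compare every leaf by hand.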
So, what's the difference? Well, their purposes, for a start. The blockchain[7] was designed to process transactions between an arbitrary number of untrusted participants, with no single point of truth or accountability. This means that everyone is accountable to each other, and everyone has to verify everything[8]. Git, meanwhile, was developed to track revisions in files that began at a single point in history, and can be directly administered by a person or organization[9].
The trick here is that, just as this essay is merely a textual representation of the thoughts in my mind, any object can be represented textually. Git can therefore be used to track the status and history of an arbitrary object so represented[10], while also avoiding the cost, speed, and complexity problems that come with the blockchain[11].
1.2. Implementation
"Well, that's all well and good," I hear you say, "but how does a 20-year-old source code version control system do the same thing as the shiny new blockchain I pay IBM $100k per month to use?"
Well, let's first consider how Walmart[12] uses theirs:
1.2.1. Walmart[13][14]
In their blockchain[15], Walmart tracks individual packages through their product ID, lot/batch codes, purchase order, and timestamps for each phase (e.g., for a piece of produce, harvesting, processing, shipping, and receiving). At each point in the supply chain, whoever is currently touching the product adds a new "transaction" indicating what the touch was, which product it was, the batch, and when the touch happened. This history is recorded in the blockchain, and it allows an analyst to grab the entire history for a single product in an instant. This may seem daunting for the non-technical supply chain professional, but it's actually quite simple in concept.
- Example Structure
First, I'll define a structure for the repository in which the tracked products will live. There are a handful of ways to do this[16], but the simplest is to define it by SKU. We will then "place an order" for one of those SKUs, using a textual representation based on s-expressions.
>> mkdir $product_sku
>> cd $product_sku
>> git init
Now, we will insert the basic metadata about the item: the SKU number[17], the purchase order[17], and the timestamp for when the order was placed[18].
>> touch $purchase_order # Plaintext file to hold sexps
((sku . my-product-sku) (po . my-purchase-order) (po-timestamp . my-po-timestamp))
Afterwards, we add the PO to the repository and officially begin its history.
>> git add $purchase_order
>> git commit -m "PO $purchase_order placed" # Automatically timestamped and searchable
Now, the next person to touch the PO (e.g., the harvesting company) will have a copy of our Git history, synchronized with our central company server; they can then take a branch of it, update the information in the current PO, commit the change, and then push it upstream to us[19]. We sync it[20], then the next processor branches and records their changes, then we merge again, and so on until the product is returned to us and placed on the shelf. When the product is finally sold, the final timestamp in its s-expression is placed and committed, and then it sits in the archive in case any review is necessary. The s-expressions themselves can be stored in a primary database, and the Git record is used as the blockchain record is, for verification[21].
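The hand-off described above can be sketched end to end on one machine. Everything specific here is an assumption made for illustration: the directory names, the `harvest` branch name, the PO number, the extra `lot` and `harvest-timestamp` fields, and the throwaway Git identity. Local directories stand in for the company server and the supplier's copy.

```shell
set -e
# Throwaway identity so commits work on an unconfigured machine (hypothetical values).
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)
po=PO-0001   # hypothetical purchase order number, doubling as the file name

# Our side: create the SKU repository and place the order.
git init -q "$work/sku-1001"
cd "$work/sku-1001"
git symbolic-ref HEAD refs/heads/main   # pin the branch name regardless of local defaults
echo '((sku . 1001) (po . PO-0001) (po-timestamp . "2024-01-05T09:00:00Z"))' > "$po"
git add "$po"
git commit -q -m "PO $po placed"

# Supplier side: clone, branch, record the harvest, push the branch upstream.
git clone -q "$work/sku-1001" "$work/harvester"
cd "$work/harvester"
git checkout -q -b harvest
echo '((sku . 1001) (po . PO-0001) (po-timestamp . "2024-01-05T09:00:00Z") (lot . A17) (harvest-timestamp . "2024-01-09T14:30:00Z"))' > "$po"
git commit -q -am "PO $po harvested, lot A17"
git push -q origin harvest

# Our side again: audit the pushed branch, then merge it into the master copy.
cd "$work/sku-1001"
git merge -q --no-ff -m "Merge harvest record for PO $po" harvest

# Every change made to this PO file, oldest first.
history=$(git log --reverse --format='%s' -- "$po")
echo "$history"
```

The `--no-ff` merge is a design choice rather than a requirement: it leaves an explicit record of when we accepted the supplier's update, separate from when they made it.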
The key is that this is all automatable with simple and easy-to-understand tools, and can be self-administered without exorbitant payments to IBM, while also being light enough that the processing can be done manually in the event of mass system failure. This also means that you don't need to be Walmart to use it, and any business with a supply chain can gain the benefits that Walmart's blockchain provides at a fraction of the cost. Moreover, the way that the system has been structured means that, at small scales, Git itself can do the heavy lifting and output reports in human-readable format, without any extra programming effort required!
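As a rough illustration of Git producing the report by itself, the sketch below records three touches on a single PO file and then prints one line per touch with `git log`. The `record` helper, the field names, the timestamps, and the demo identity are all hypothetical; a real deployment would use whatever record format the company already has.

```shell
set -e
# Throwaway identity so commits work on an unconfigured machine (hypothetical values).
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
po=PO-0001   # hypothetical purchase order number, doubling as the file name

# record STAGE SEXP: append one s-expression line to the PO file and commit it.
record() {
    echo "$2" >> "$po"
    git add "$po"
    git commit -q -m "PO $po $1"
}
record placed   '(po-timestamp . "2024-01-05T09:00:00Z")'
record shipped  '(ship-timestamp . "2024-01-12T08:15:00Z")'
record received '(receive-timestamp . "2024-01-14T16:40:00Z")'

# One line per touch: commit date, who committed it, and what happened.
report=$(git log --reverse --date=short --format='%ad  %an  %s' -- "$po")
echo "$report"
```

Swapping the format string for `-p` would show the exact lines each party added, which is usually what an auditor wants to see.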
1.3. Q & A
So, I've discussed why Git beats the blockchain, demo'd the system, and am now hankering for dinner. However, I'm sure there are a few questions that you want to ask.
1.3.1. What about fakery? Doesn't the blockchain absolve me from the need to manually audit my suppliers?
Nope, it never did. Garbage in, garbage out applies to all record-keeping systems[22].
1.3.2. How do I administer this?
Hire me![23] Seriously, though, it's a matter of basic Git administration skills[24] and knowing what you want to record and how you want to record it[25].
1.3.3. Does this scale?
Sure, especially if you parcel your repositories in a manner conducive to how your business operates.
1.3.4. But what about transparency? I want everyone to be able to see what I and my suppliers are doing!
Just throw it on a server! If you don't want to do that, services such as Codeberg, GitHub, and GitLab will do it for you.
1.4. Conclusion
The main benefit of any of these systems is actually tracking where things are coming from and going to in a single, coherent, machine-processable way. The Merkle tree makes it easy to run and difficult to defraud, but you could get similar results with EDI, or RSS feeds from your suppliers, or even template emails sent to a company server.
Footnotes:
[1] I.e., within the past 8 years.
[2] E.g., if Chipotle knew where the E. coli was coming from in 2015, they could have quickly isolated the contaminated shipments and kept doing business, instead of shitting the bed.
[3] In the mathematical sense.
[4] In the cryptographic sense.
[5] As tampering with any point in the history requires tampering with the history that references the piece that you've tampered with.
[6] Yeah, yeah, git blame someone-else. Don't @ me. The master copy should be on your own server, with proper controls in place to handle merges.
[7] In the Bitcoin sense.
[8] Hence mining.
[9] I.e., you can control what is added to/subtracted from your repository of files.
[10] Or in another format, but the greatest benefit comes from human-readable and easily-processable formats.
[11] And cannot be bypassed due to its inherent design.
[12] The gold standard in supply chain management, as my professors in the Sam Walton School of Business tell me.
[13] This is based on Walmart's openly-available tech blogs and reports. Nobody there wanted to talk to me about it while I had my eyebrow raised.
[14] I am going to ignore that 2022 HBR article discussing the blockchain implementation that solved a problem that didn't actually need a blockchain (just a standard data format and a bit of plumbing), because I think the Merkle tree is actually useful here.
[15] I.e., in the contents of the Merkle tree's leaves.
[16] Depending largely on what aspects about them are the most important when tracking or querying.
[17] Again, for easy access when the contents of the file are eventually loaded into the company's primary database.
[18] Much of this will be pseudocode, since the actual implementation will depend heavily on the company's actual format for product data.
[19] Where we can audit it at our leisure.
[20] Or "merge" in Git lingo.
[21] This caching speeds the Git-based system up even further, and it was already faster than the blockchain.
[22] And the GIGO problem that mining solves isn't applicable here, because you can't double-spend a physical object.
[23] Disclaimer: N-C Softworks does not yet have a consulting division ready to implement this. However, raise an issue on my official GitHub repository and we can talk.
[24] For which many in-depth books are available.
[25] Which is easier said than done, but you'd have to do it for the blockchain anyway; IBM would make you pay more so their people could do it for you badly.