Using Git Instead of Your Stupid Blockchain

Much has been made lately¹ of using the blockchain for supply chain applications. Proponents argue that it could enable greater transparency and allow companies to more easily locate and contain issues in their supply chain². They aren't wrong in this, but they are wrong when they say "the right tool to solve this is a blockchain!"

1.1. Git vs Blockchain

Git and the blockchain were designed by different people to solve different problems, but at their base they share the same data structure: the Merkle tree. This data structure is the key to why blockchain proponents aren't entirely wrong. Briefly, a Merkle tree is a tree³ in which every leaf is referenced by the hash⁴ of its contents, while every non-leaf is referenced by the hash of the hashes of its children. The result is that it becomes easy to verify that the data stored in the tree is what you expect it to be, and that it hasn't been tampered with⁵,⁶.
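To make the tamper-evidence concrete, here is a minimal sketch of a two-leaf Merkle tree in shell. It assumes `sha256sum` from GNU coreutils is available; the `digest` helper and the leaf contents are made up for illustration, and Git's real object format differs in detail.

```shell
#!/bin/sh
# Toy two-leaf Merkle tree. Illustrative only: Git's real object format
# differs, and the leaf contents below are made up.

# digest: hypothetical helper that prints the SHA-256 hex digest of its argument.
digest() { printf '%s' "$1" | sha256sum | cut -d' ' -f1; }

leaf_a=$(digest '(sku . my-product-sku)')
leaf_b=$(digest '(po . my-purchase-order)')
root=$(digest "$leaf_a$leaf_b")    # parent = hash of its children's hashes

# Change one leaf and rebuild the root.
forged_b=$(digest '(po . some-other-order)')
forged_root=$(digest "$leaf_a$forged_b")

if [ "$root" = "$forged_root" ]; then
    echo "roots match: no change detected"
else
    echo "roots differ: tampering detected"
fi
```

Anyone holding only the original root can spot the forged leaf, because the root is derived from every hash beneath it.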

So, what's the difference? Well, their purposes, for a start. The blockchain⁷ was designed to process transactions between an arbitrary number of untrusted participants, with no single point of truth or accountability. This means that everyone is accountable to each other, and everyone has to verify everything⁸. Git, meanwhile, was developed to track revisions in files that began at a single point in history, and can be directly administered by a person or organization⁹.

The trick here is that, just as this essay is merely a textual representation of the thoughts in my mind, any object can be represented textually, and Git can track the status and history of that representation¹⁰, while also avoiding the cost, speed, and complexity problems that come with the blockchain¹¹.

1.2. Implementation

"Well, that's all well and good," I hear you say, "but how does a 20-year-old source code version control system do the same thing as the shiny new blockchain I pay IBM $100k per month to use?"

Well, let's first consider how Walmart¹² uses theirs:

1.2.1. Walmart¹³,¹⁴

In their blockchain¹⁵, Walmart tracks individual packages through their product ID, lot/batch codes, purchase order, and timestamps for each phase (e.g., for a piece of produce, harvesting, processing, shipping, and receiving). At each point in the supply chain, whoever is currently touching the product adds a new "transaction" indicating what the touch was, which product it was, the batch, and when the touch happened. This history is recorded in the blockchain, and it allows an analyst to grab the entire history for a single product in an instant. This may seem daunting for the non-technical supply chain professional, but it's actually quite simple in concept.

Example Structure
First, I'll define a structure for the repository in which the tracked products will live. There are a handful of ways to do this¹⁶, but the simplest is to define it by SKU. We will then "place an order" for one of those SKUs, using a textual representation based on s-expressions.
```
mkdir "$product_sku"   # One repository per SKU
cd "$product_sku"
git init
```
Now, we will insert the basic metadata about the item: the SKU number¹⁷, the purchase order¹⁷, and the timestamp for when the order was placed¹⁸.
```
touch "$purchase_order"  # Plaintext file to hold the s-expressions
```
```
((sku . my-product-sku)
 (po . my-purchase-order)
 (po-timestamp . my-po-timestamp))
```
Afterwards, we add the PO to the repository and officially begin its history.
```
git add "$purchase_order"
git commit -m "PO $purchase_order placed"  # Automatically timestamped and searchable
```
Now, the next person to touch the PO (e.g., the harvesting company) will have a copy of our Git history, synchronized with our central company server; they can then take a branch of it, update the information in the current PO, commit the change, and then push it upstream to us¹⁹. We sync it²⁰, then the next processor branches, records their changes, and merges, and so on until the product is returned to us and placed on the shelf. When the product is finally sold, the final timestamp is added to its s-expression and committed, and then the record sits in the archive in case any review is necessary. The s-expressions themselves can be stored in a primary database, and the Git record is used the way the blockchain record is: for verification²¹.
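The hand-off described above can be sketched end to end on one machine. This is only an illustration under stated assumptions: a local bare repository stands in for the central company server, and the `as` helper, the `harvest` branch name, the `my-purchase-order` file name, the `harvest-timestamp` field, and the `buyer`/`harvester` identities are all hypothetical.

```shell
#!/bin/sh
set -e
# Sketch only: the directory layout, branch name, file name, and identities
# are hypothetical stand-ins, and a local bare repository plays the server.
work=$(mktemp -d)

as() {  # as <name>: set the Git identity used for the commits that follow
    export GIT_AUTHOR_NAME="$1" GIT_AUTHOR_EMAIL="$1@example.com"
    export GIT_COMMITTER_NAME="$1" GIT_COMMITTER_EMAIL="$1@example.com"
}

git init -q --bare "$work/central.git"

# We place the order and publish it.
as buyer
git clone -q "$work/central.git" "$work/buyer" 2>/dev/null
cd "$work/buyer"
printf '%s\n' '((sku . my-product-sku)' ' (po . my-purchase-order)' \
    ' (po-timestamp . my-po-timestamp))' > my-purchase-order
git add my-purchase-order
git commit -q -m "PO my-purchase-order placed"
git push -q origin HEAD

# The harvesting company clones, branches, records its touch, and pushes.
as harvester
git clone -q "$work/central.git" "$work/harvester"
cd "$work/harvester"
git checkout -q -b harvest
printf '%s\n' '((sku . my-product-sku)' ' (po . my-purchase-order)' \
    ' (po-timestamp . my-po-timestamp)' \
    ' (harvest-timestamp . my-harvest-timestamp))' > my-purchase-order
git commit -q -a -m "PO my-purchase-order harvested"
git push -q origin harvest

# Back on our side: review the branch, then merge it and publish the result.
as buyer
cd "$work/buyer"
git fetch -q origin
git --no-pager diff HEAD origin/harvest   # the audit step: see what changed
git merge -q --ff-only origin/harvest
git push -q origin HEAD

# One line per touch, oldest first.
git --no-pager log --reverse --format='%an: %s'
```

In a real deployment the suppliers would push over SSH or HTTPS to a server you control, and the merge step is where your own access controls and review would sit.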

The key is that this is all automatable with simple, easy-to-understand tools and can be self-administered without exorbitant payments to IBM, while also being light enough that the processing can be done manually in the event of mass system failure. This also means that you don't need to be Walmart to use it: any business with a supply chain can gain the benefits that Walmart's blockchain provides at a fraction of the cost. Moreover, the way the system is structured means that, at small scales, Git itself can do the heavy lifting and output reports in human-readable format, without any extra programming effort required!
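As a rough illustration of Git doing the reporting by itself, the sketch below builds a throwaway repository with two touches and then asks Git for the history of one PO. The `record` helper, the file name, the `harvest-timestamp` field, and the identities are assumptions made for the example; only stock `git log` and `git fsck` are used.

```shell
#!/bin/sh
set -e
# Sketch only: the PO file name, field names, and identities are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_COMMITTER_NAME="central" GIT_COMMITTER_EMAIL="central@example.com"

record() {  # record <who> <commit message> <full file contents>
    printf '%s\n' "$3" > my-purchase-order
    git add my-purchase-order
    git commit -q --author="$1 <$1@example.com>" -m "$2"
}

record buyer "PO my-purchase-order placed" \
'((sku . my-product-sku)
 (po . my-purchase-order)
 (po-timestamp . my-po-timestamp))'

record harvester "PO my-purchase-order harvested" \
'((sku . my-product-sku)
 (po . my-purchase-order)
 (po-timestamp . my-po-timestamp)
 (harvest-timestamp . my-harvest-timestamp))'

# Report: who touched this PO, when, and why, oldest first.
report=$(git log --reverse --date=iso --format='%an | %ad | %s' -- my-purchase-order)
printf '%s\n' "$report"

# Full audit trail, including the exact change made at each touch.
git --no-pager log --reverse -p -- my-purchase-order

# Integrity check: recomputes object hashes and exits non-zero on corruption.
git fsck
```

The `--format` placeholders (`%an`, `%ad`, `%s`) are standard `git log` fields, so the same one-liner can feed a spreadsheet or a database import with a different separator.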

1.3. Q & A

So, I've discussed why Git beats the blockchain, demo'd the system, and am now hankering for dinner. However, I'm sure there are a few questions that you want to ask.

1.3.1. What about fakery? Doesn't the blockchain absolve me from the need to manually audit my suppliers?

Nope, it never did. Garbage in, garbage out applies to all record-keeping systems²².

1.3.2. How do I administer this?

Hire me!²³ Seriously, though, it's a matter of basic Git administration skills²⁴ and knowing what you want to record and how you want to do it²⁵.

1.3.3. Does this scale?

Sure, especially if you parcel your repositories in a manner conducive to how your business operates.

1.3.4. But what about transparency? I want everyone to be able to see what I and my suppliers are doing!

Just throw it on a server! If you don't want to do that, services such as Codeberg, GitHub, and GitLab will do it for you.

1.4. Conclusion

The main benefit of any of these systems is actually tracking where things are coming from and going to in a single, coherent, machine-processable way. The Merkle tree makes it easy to run and difficult to defraud, but you could get similar results with EDI, or RSS feeds from your suppliers, or even template emails sent to a company server.

Footnotes:

¹

I.e., within the past 8 years.

²

E.g., if Chipotle knew where the E. coli was coming from in 2015, they could have quickly isolated the contaminated shipments and kept doing business, instead of shitting the bed.

³

In the mathematical sense.

⁴

In the cryptographic sense.

⁵

As tampering with any point in the history requires tampering with the history that references the piece that you've tampered with.

⁶

Yeah, yeah, git blame someone-else. Don't @ me. The master copy should be on your own server, with proper controls in place to handle merges.

⁷

In the Bitcoin sense.

⁸

Hence mining.

⁹

I.e., you can control what is added to/subtracted from your repository of files.

¹⁰

Or in another format, but the greatest benefit comes from human-readable and easily-processable formats.

¹¹

And cannot be bypassed due to its inherent design.

¹²

The gold standard in supply chain management, as my professors in the Sam Walton School of Business tell me.

¹³

This is based on Walmart's openly-available tech blogs and reports. Nobody there wanted to talk to me about it while I had my eyebrow raised.

¹⁴

I am going to ignore that 2022 HBR article discussing the blockchain implementation that solved a problem that didn't actually need a blockchain (just a standard data format and a bit of plumbing), because I think the Merkle tree is actually useful here.

¹⁵

I.e., in the contents of the Merkle tree's leaves.

¹⁶

Depending largely on what aspects about them are the most important when tracking or querying.

¹⁷

Again, for easy access when the contents of the file are eventually loaded into the company's primary database.

¹⁸

Much of this will be pseudocode, since the actual implementation will depend heavily on the company's actual format for product data.

¹⁹

Where we can audit it at our leisure.

²⁰

Or "merge" in Git lingo.

²¹

This caching speeds the Git-based system up even further, and it was already faster than the blockchain.

²²

And the GIGO problem that mining solves isn't applicable here, because you can't double-spend a physical object.

²³

Disclaimer: N-C Softworks does not yet have a consulting division ready to implement this. However, raise an issue on my official GitHub repository and we can talk.

²⁴

For which many in-depth books are available.

²⁵

Which is easier said than done, but you'd have to do it for the blockchain anyway; IBM would make you pay more so their people could do it for you badly.