Recently I discovered that some computer code I'd written will outlive me by many centuries. A copy of it has been stored in a chilly cave in the Arctic Circle.
It's part of a fascinating project by GitHub, the 2020 Arctic Vault program, which brings modern technologies into a surprisingly primitive environment. The program delivers an unexpected honor for a wide swath of the 100 million code repositories currently hosted on GitHub's servers by archiving all this material in perpetuity in an exotic archipelago in Norway, near the northernmost town in the world.
GitHub's vice president of special projects, Thomas Dohmke, tells news.com.au that GitHub is uniquely positioned for the archival and has the responsibility to protect and preserve the collaborative work of millions of developers around the world. On its webpage for the project, GitHub strikes a similarly grand tone, calling open source software "a hidden cornerstone of modern civilization, and the shared heritage of all humanity."
"We will protect this priceless knowledge by storing multiple copies, on an ongoing basis, across various data formats and locations," he said.
On a visit, GitHub's CEO Nat Friedman described the storage location, a decommissioned coal mine, as "more mine-y and rustic and raw-hole-in-the-rock than I thought it would be," according to a recent article in Bloomberg. The news service goes on to note that, to Friedman, it's a natural next step. Open source software, in his view, is one of the great achievements of our species, up there with the masterpieces of literature and fine art.
And it's not the only priceless knowledge being stored in this remote location. According to Bloomberg, the other shelves in the mine include Vatican archives, Italian movies, Brazilian land registry records, and the recipe for a certain burger chain's special sauce.
But what's the rationale for this massive effort? The project's page cites the threat of code being abandoned, forgotten, or lost. Worse yet, how would the code otherwise be saved in case of a global catastrophe?
"There exists a range of possible futures in which working modern computers exist, but their software has largely been lost to bit rot. The GitHub Archive Program will include much longer-term media to address the risk of data loss over time," the site notes.
Of course, the code repository service has also given some thought to how the future might use our code. "There is a long history of lost technologies from which the world would have benefited, as well as abandoned technologies which found unexpected new uses," explains the project web page. "It is easy to envision a future in which today's software is seen as a quaint and long-forgotten irrelevancy until an unexpected need for it arises."
Future historians might see the significance in our age of open source ubiquity, volunteer communities, and Moore's Law.
Which code repositories make the cut? According to GitHub: The archive will include every repo with any commits between the announcement at GitHub Universe on Nov. 13 and 02/02/2020, every repo with at least 1 star and any commits from the year before the snapshot (02/02/2019 to 02/02/2020), and every repo with at least 250 stars. Plus, gh-pages for any repository that meets the aforementioned criteria.
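To make those three rules concrete, here is a minimal Python sketch of the selection logic as described above. It is not GitHub's actual implementation. The `Repo` class, its field names, and the function names are hypothetical, the announcement year (2019) is inferred from the snapshot date, and "any commits in the window" is approximated by checking the date of the most recent commit made on or before the snapshot.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Dates from the criteria quoted in the article. The year of the
# Nov. 13 announcement (2019) is inferred, not stated in the text.
ANNOUNCEMENT = date(2019, 11, 13)
SNAPSHOT = date(2020, 2, 2)
YEAR_BEFORE_SNAPSHOT = date(2019, 2, 2)


@dataclass
class Repo:
    """Hypothetical, simplified view of a repository at snapshot time."""
    name: str
    stars: int
    # Most recent commit on or before the snapshot, or None if empty.
    last_commit: Optional[date]


def committed_since(repo: Repo, start: date) -> bool:
    """True if the repo has a commit between `start` and the snapshot."""
    return repo.last_commit is not None and start <= repo.last_commit <= SNAPSHOT


def makes_the_cut(repo: Repo) -> bool:
    """Apply the three inclusion rules; any one of them is sufficient."""
    if repo.stars >= 250:
        return True
    if committed_since(repo, ANNOUNCEMENT):
        return True
    return repo.stars >= 1 and committed_since(repo, YEAR_BEFORE_SNAPSHOT)


if __name__ == "__main__":
    for repo in [
        Repo("fresh-and-unstarred", 0, date(2019, 12, 1)),
        Repo("stale-and-unstarred", 0, date(2019, 6, 1)),
        Repo("one-star-last-summer", 1, date(2019, 6, 1)),
        Repo("popular-but-dormant", 300, date(2015, 1, 1)),
    ]:
        print(repo.name, makes_the_cut(repo))
```

Under these rules a repository with no stars at all still qualifies if it saw a commit after the announcement, while a dormant project needs at least 250 stars to be included.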
The Norwegian data-storing company Piql, whose custom film and archiving technologies will allow the project to store terabytes of data for over 1,000 years, brags that code is now headed into the gold standard of long-term data storage.
But besides offering vault storage services, Piql also offers a unique form of data digitization. Piql is storing the code on hundreds of reels of film made from polyester and silver halide. Bloomberg points out they're coated with an iron oxide powder for added Armageddon-resistance. Each of its microfilm-like frames holds over 8.8 million pixels. Piql explains that its method involves converting 1s and 0s into QR codes. "No electricity or other human intervention is needed as the climatic conditions in the Arctic are ideal for long-term archival of film," explained a Piql web page.
"By using a self-contained and technology-independent storage medium, future generations will be able to read back the information," according to Piql. The project also includes instructions on how to unpack and read the code.
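For a rough sense of scale, the frame figure can be turned into a back-of-the-envelope capacity estimate. The Python sketch below assumes each pixel stores exactly one bit, which is an illustrative upper bound and not Piql's published encoding; a real QR-style layout spends part of every frame on error correction and alignment, so the usable capacity would be lower. The function names are hypothetical.

```python
# "Over 8.8 million pixels" per frame, as reported in the article.
PIXELS_PER_FRAME = 8_800_000

# Assumption for illustration only: one bit per pixel, no overhead.
BITS_PER_PIXEL = 1


def max_bytes_per_frame(pixels: int = PIXELS_PER_FRAME,
                        bits_per_pixel: int = BITS_PER_PIXEL) -> int:
    """Upper bound on raw bytes a single frame could hold."""
    return pixels * bits_per_pixel // 8


def min_frames_needed(payload_bytes: int) -> int:
    """Fewest frames that could hold the payload under the same assumption."""
    per_frame = max_bytes_per_frame()
    return -(-payload_bytes // per_frame)  # ceiling division


if __name__ == "__main__":
    print(max_bytes_per_frame())          # 1100000 bytes, about 1.1 MB
    print(min_frames_needed(10 ** 12))    # 909091 frames for one terabyte
```

Even under this generous assumption, a single terabyte would need roughly 900,000 frames.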
Bloomberg even notes that there's a treaty in place which keeps Svalbard neutral in times of war. Because it's all stored on offline film reels, GitHub doesn't have to worry about power outages. An added layer of security comes from its remote location. One GitHub video points out that the Svalbard archipelago is home to the northernmost town in the world as well as thousands of polar bears. The video's description notes that though it's called the GitHub Arctic Code Vault, it's actually closer to the North Pole than the Arctic Circle.
It's been fun to watch the reactions to GitHub's video. "The future will be amazed by my JavaScript Calculator," joked one comment.
Others couldn't resist commenting on the Arctic location. ("Now my code can freeze before it even gets run.") Another naysayer even quipped, "When your code is so bad that you need to bury it under the permafrost."
GitHub's FAQ says the company plans to re-evaluate the program (and its storage medium) every five years, at which point it'll decide whether to take another snapshot.
And if you're curious what it's like in a Svalbard mine, a nearby coal mine is offering tours. "Most of Svalbard's old Norwegian and Russian coal mines have shut down," explains Bloomberg, "so locals have rebranded their vast acres of permafrost as an attraction to scientists, doomsday preppers, and scientist doomsday preppers."
Link:
GitHub's Plan to Freeze Your Code for Thousands of Years - thenewstack.io