UPDATED 12:00 EDT / JULY 16 2020

GitHub preserves its open-source software code deep in the Arctic for future generations

GitHub Inc. said today it has delivered a copy of all of the open-source software code stored on its website to a data repository at the Arctic World Archive, a long-term archival facility buried 250 meters deep in the permafrost of an Arctic mountain.

The operation is part of the GitHub Archive Program, a project announced last year that aims to preserve today’s open-source software for future generations. To do that, GitHub said, it will store its code in an archive called the GitHub Arctic Code Vault, which it says has been built to last for a thousand years.

GitHub said it carried out the operation in partnership with a long-term data storage company called Piql, which copied the entire contents of its active public repositories and wrote that data to 186 reels of hardened microfilm. The microfilm was then shipped to Svalbard, a Norwegian archipelago inside the Arctic Circle, and transported to a decommissioned coal mine set within a mountain that’s now home to the Arctic World Archive.

Once there, the encoded microfilm was placed inside the GitHub Arctic Code Vault, which is a deep chamber that’s buried inside hundreds of meters of permafrost.

The operation is still far from complete, though. In a blog post, GitHub said the next step is to create what it calls a “Tech Tree,” which is a document that contains the technical history and cultural context of the GitHub Archive Program.

The idea with the Tech Tree is to compile existing works that provide a more detailed understanding of modern computing and software development, open-source software and its applications, and popular programming languages. The Tech Tree will also explain the various technologies that make software possible, including microprocessors, networking, electronics and even pre-industrial technologies. That, GitHub said, is intended to help the archive’s inheritors better understand today’s world and its technologies, and perhaps even enable them to recreate computers to use the archived software.

To recognize the millions of developers who have contributed to the open-source software that’s now stored in the vault, GitHub has also created a new badge that will be displayed in the highlights section of each user’s profile.

The archive program is just one of several initiatives GitHub has launched to try to preserve the open-source software code it hosts. The company is also working with the Internet Archive, a nonprofit organization that provides free public access to various collections of digitized content.

GitHub said the Internet Archive began archiving the content from its public software repositories in April this year, using its Wayback Machine to archive the raw data as Web ARChive files. To date, it has archived more than 55 terabytes of code, GitHub said.

The Internet Archive is also planning to make those archived repositories available via “git clone” in a form that also preserves repo comments, issues and other metadata, and to make them easily accessible via the internet. That initiative is “well underway” and initial archiving is expected to start this month, GitHub said.

GitHub is also partnering with another nonprofit group, Software Heritage, to preserve and share the source code of its software commons. Software Heritage has already archived more than 130 million different projects, 100 million of which came from GitHub, the company said. The Software Heritage archive is accessible now.

GitHub further announced a partnership with Project Silica, which aims to develop a sustainable, reliable storage technology for long-lived data.

Project Silica is using new techniques in ultrafast laser optics to store data in fused quartz glass, using a process that permanently changes the physical structure of the material. Quartz glass is a durable storage medium that offers “unparalleled data lifetimes of upwards of tens of thousands of years,” GitHub said. It can also resist electromagnetic interference, water and heat, it said.

Images: GitHub
