An overview of Microsoft Project Silica and its archive use
Microsoft's Project Silica stores data in quartz glass, similar to the crystals in Superman films. The team now has four distinct areas of research.
Microsoft's project to store data on quartz glass has progressed in the four years since the company showed off its first proof of concept.
In collaboration with Warner Bros., Microsoft stored a digitized version of the 1978 Superman movie in a piece of quartz glass about the size of a drink coaster. The glass was only 2 mm thick and 75 mm on each side, yet it held the entire movie, which required more than 75 GB of storage. Microsoft can now store up to 7 TB of data in a piece of glass the same size.
Project Silica is a Microsoft Research project that is developing an advanced archival storage system specifically for the cloud. The project's research team uses ultrafast laser optics to store the data, along with microscopy and artificial intelligence to read it. The result is a storage medium that can potentially last thousands of years without degradation.
What is Microsoft's Project Silica?
Existing storage technologies will soon reach their practical limits as demand for long-term cloud storage continues to grow at unprecedented rates. Most cold data is currently stored on magnetic tape, optical disks, HDDs and -- to a lesser degree -- SSDs. None of these are a cost-effective or sustainable platform for handling the vast amounts of archival data that will live in the cloud. Each one was created before the cloud existed and designed to support multiple uses.
No storage technology has been built specifically to store cold data at cloud scale until now. Microsoft Project Silica was designed from the ground up for cloud-based archival data storage. As the project has demonstrated, quartz glass can serve as a reliable medium to store cold data for long periods of time.
Project Silica is part of the broader Optics for the Cloud project, a Microsoft program to advance the adoption of optical technologies in the cloud. Most of the work on Project Silica occurs at Microsoft Research's Cambridge lab in the U.K., where a team of physicists, electrical engineers, optics experts and researchers is building a completely new type of storage system.
Microsoft Project Silica researchers use quartz glass for their storage medium because of its durability. They've baked it, boiled it, microwaved it, demagnetized it and scoured it with steel wool. After all that, they were still able to read the data.
Quartz glass is plentiful and relatively inexpensive compared to other media. It can store vast amounts of cold data indefinitely. It also does not require expensive environmental management such as temperature and humidity controls or protection from electromagnetic field (EMF) energy. It can retain data for thousands of years without experiencing bit rot, so it avoids the costly cycles of rewriting data to next-generation storage media.
The Project Silica team focuses solely on building storage for archiving data at cloud scale. It's not trying to provide consumers with new ways to run their personal computers or watch movies at home. The team is concerned only with developing a storage medium that can handle large amounts of cold data that's seldom accessed, whether that means every few months or every few years.
How does Project Silica work?
Project Silica builds on earlier efforts by researchers at the University of Southampton. They demonstrated the ability to store data in fused silica, a non-crystalline form of silicon dioxide that's found in quartz crystals, sand and other materials. Their first success was in 2013, when they stored a 300 KB text file in fused silica glass, which they dubbed 5D memory crystal.
The Project Silica team uses similar technologies to those employed by the University of Southampton researchers. The project is now much broader in scope, however, with the team's efforts broken down into four distinct areas of research: the Write Lab, Read Lab, Decode Lab and Library Lab.
The Write Lab
This lab encodes data in the quartz glass media, which are referred to as platters. To encode the data, the team aims a femtosecond laser -- the type of laser used in LASIK eye surgery -- at the glass. The laser etches nanoscale gratings (voxels) directly into the glass rather than on its surface or an embedded foil layer. It emits ultrashort optical pulses that permanently change the structure of the glass.
A voxel can be thought of as a 3D pixel capable of encoding multiple bits. The laser writes the voxels in 2D layers across the XY plane, focusing the beam at different positions to vary the voxel shapes. To create voxels in different layers, the laser changes the beam's focal depth within the glass. A piece of glass that's 2 mm thick can support hundreds of layers of voxels.
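The layered layout lends itself to a rough capacity estimate and a simple addressing scheme. The Python sketch below is illustrative only: the voxel pitch, layer count and bits-per-voxel values are placeholder assumptions rather than figures published by Microsoft, and the function names are hypothetical.

```python
def platter_capacity_bytes(side_mm, pitch_um, layers, bits_per_voxel):
    """Estimate the raw capacity of a square platter written in stacked 2D layers.

    All arguments are assumptions supplied by the caller; none come from
    Project Silica documentation.
    """
    voxels_per_side = (side_mm * 1000) // pitch_um  # mm -> micrometers
    voxels_per_layer = voxels_per_side ** 2         # one XY plane
    total_bits = voxels_per_layer * layers * bits_per_voxel
    return total_bits // 8


def voxel_address(index, voxels_per_side):
    """Map a linear voxel index to a (layer, row, column) position."""
    layer, within_layer = divmod(index, voxels_per_side ** 2)
    row, column = divmod(within_layer, voxels_per_side)
    return layer, row, column


if __name__ == "__main__":
    # Hypothetical values: 75 mm side, 1 um pitch, 200 layers, 4 bits per voxel.
    print(platter_capacity_bytes(75, 1, 200, 4))
    print(voxel_address(17, 4))
```

The estimate ignores error-correction overhead and any unused margin around the platter edge, so a real platter would hold less than the raw figure.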
The Read Lab
This lab retrieves data from the glass platters after the data has been written. A platter is read immediately after it's written to verify its accuracy and then again whenever the data is needed at any point in the future.
The reading is achieved through a process called polarization-sensitive microscopy, which is carried out by a computer-controlled, high-speed microscope. The read drive takes advantage of a voxel characteristic known as form birefringence, in which the voxel exhibits refractive properties different from the surrounding silica.
When polarized light interacts with a voxel, nanometer-scale shifts occur in the light's electric field. The range of the shift is referred to as the voxel's retardance. The light's polarization angle also changes. These two birefringence properties -- retardance and angle change -- make it possible to encode multiple bits per voxel. Once the voxels are created, the properties remain stable for the lifetime of the glass.
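One way to picture how two measurable properties yield multiple bits per voxel is to treat each voxel as a symbol chosen from a grid of retardance and angle levels. The Python sketch below assumes two retardance levels and four angle levels (eight states, or three bits per voxel). These level counts and all function names are hypothetical and are not taken from Project Silica.

```python
RETARDANCE_LEVELS = 2  # assumed number of distinguishable retardance values
ANGLE_LEVELS = 4       # assumed number of distinguishable angle values


def bits_per_voxel():
    """Whole bits that one voxel can carry with the assumed level counts."""
    return (RETARDANCE_LEVELS * ANGLE_LEVELS).bit_length() - 1


def encode_symbol(value):
    """Split a symbol value into (retardance_level, angle_level)."""
    return divmod(value, ANGLE_LEVELS)


def decode_symbol(retardance_level, angle_level):
    """Recombine the two measured levels into the original symbol value."""
    return retardance_level * ANGLE_LEVELS + angle_level


def bytes_to_voxels(data):
    """Pack bytes into a list of (retardance_level, angle_level) voxels."""
    n = bits_per_voxel()
    bits = "".join(f"{byte:08b}" for byte in data)
    bits += "0" * (-len(bits) % n)  # pad to a whole number of voxels
    return [encode_symbol(int(bits[i:i + n], 2)) for i in range(0, len(bits), n)]


def voxels_to_bytes(voxels, length):
    """Unpack voxels back into `length` bytes, discarding any padding bits."""
    n = bits_per_voxel()
    bits = "".join(f"{decode_symbol(r, a):0{n}b}" for r, a in voxels)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, length * 8, 8))
```

A round trip such as `voxels_to_bytes(bytes_to_voxels(b"Silica"), 6)` returns the original bytes. A physical system would also have to cope with measurement noise, which this sketch leaves out.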
Data is read from the glass by shining regular light through the platter and measuring the two birefringence properties. The microscope includes a camera for capturing images that characterize polarization changes. To read different layers in the glass, the optics focus at different depths. The images are then sent to the decoder for interpretation.
The Decode Lab
This lab focuses on the technologies needed to decode the images produced by the read drive. Microsoft Project Silica uses machine learning algorithms to interpret these images. The algorithms require multiple images of each set of voxels to decode their patterns.
Project Silica also uses deep learning and neural network technologies to address the variability and noise that can come with reading the data. The output from these analytics is a 2D array of probability distributions, which then serves as input to the error-correction processes. The final output is the usable data.
Although the decoding process is directly tied to the reading process, the two are treated as separate operations. The reader captures the images, and the decoder interprets the images. In this way, dealing with voxel complexities becomes an offline operation, separate from the process of physically reading the data.
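The handoff from probability distributions to error correction can be sketched in a few lines of Python. This is a simplified illustration, not Microsoft's decoder: it takes a 2D grid in which each cell is a list of symbol probabilities, picks the most likely symbol and marks low-confidence voxels as `None` so a later error-correction stage could treat them as erasures. The function name and the 0.6 threshold are assumptions.

```python
def hard_decisions(prob_grid, min_confidence=0.6):
    """Turn a 2D grid of probability distributions into a 2D grid of symbols.

    Each cell of `prob_grid` is a list of probabilities, one per possible
    symbol. A cell whose best probability falls below `min_confidence`
    becomes None, flagging it for the error-correction stage.
    """
    decided = []
    for row in prob_grid:
        decided_row = []
        for dist in row:
            best = max(range(len(dist)), key=dist.__getitem__)
            decided_row.append(best if dist[best] >= min_confidence else None)
        decided.append(decided_row)
    return decided


if __name__ == "__main__":
    grid = [
        [[0.9, 0.1], [0.5, 0.5]],
        [[0.2, 0.8], [0.05, 0.95]],
    ]
    print(hard_decisions(grid))
```

Because this step works only on captured images and their derived probabilities, it can run offline, long after the platter has gone back to its shelf.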
The Library Lab
This lab writes, reads and houses the glass platters. When not being read, the platters sit in large media storage panels that look like bookcases. These panels, like the glass platters, are entirely passive. They require no electricity, special climate controls or protection from EMF energy.
The platters are not inserted into special cartridges or locked onto the shelves. They are held in place by gravity and remain stationary unless they are being moved to or from a read drive. When data is requested, a special robot, or shuttle, retrieves the platter and brings it to the reader. After the data has been read, the shuttle returns the platter to the shelf. The storage panels include multiple read drives at either end to streamline this process.
Numerous shuttles can run concurrently across the storage panels. The shuttles are battery-operated, self-contained units that traverse the shelves on rails. They can also move up or down between levels through a process called crabbing. Any shuttle can retrieve any platter from any shelf and bring it to any reader. The library's design also prevents a platter with stored data from being overwritten, while naturally accommodating air-gap storage.
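The retrieval flow can be modeled as a small dispatcher. The Python sketch below is a toy model under stated assumptions: the `Library` class, its method names and the first-available assignment policy are all hypothetical, and the write-once check stands in for the overwrite protection the library design provides.

```python
class Library:
    """Toy model of a platter library with interchangeable shuttles and readers."""

    def __init__(self, shuttles, readers):
        self.idle_shuttles = list(shuttles)
        self.free_readers = list(readers)
        self.written = set()  # IDs of platters that already hold data

    def write(self, platter_id):
        """Record a platter as written; a written platter is never rewritten."""
        if platter_id in self.written:
            raise ValueError(f"platter {platter_id} already holds data")
        self.written.add(platter_id)

    def fetch(self, platter_id):
        """Assign any idle shuttle and any free reader to a read request.

        Returns (shuttle, reader), or None if the request has to wait.
        """
        if platter_id not in self.written:
            raise KeyError(f"platter {platter_id} has no stored data")
        if not self.idle_shuttles or not self.free_readers:
            return None
        return self.idle_shuttles.pop(0), self.free_readers.pop(0)

    def release(self, shuttle, reader):
        """Return the shuttle and reader to the pool once the platter is shelved."""
        self.idle_shuttles.append(shuttle)
        self.free_readers.append(reader)
```

A caller would invoke `fetch` when data is requested and `release` once the shuttle has put the platter back. A real scheduler would also weigh travel distance, battery charge and queued requests.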
What is the future of glass storage?
Project Silica researchers have come a long way since the Superman test case. They have now prototyped a full-scale media library that demonstrates the efficacy of all four labs. The researchers are currently working on the next developmental stage, although they've offered few details. Microsoft has not provided a timetable for when Project Silica might be ready to deliver large-scale, production-ready storage.
It's uncertain who might be best served by the new storage system or whether the system will find a home other than Microsoft's own data centers. Another unknown is the potential impact on the storage industry. Project Silica could benefit organizations with a wide range of archival storage needs, or it might prove worthwhile only to those with vast amounts of archival data.
What is clear is that bold new storage platforms are needed to handle the anticipated growth in data. Project Silica aims to address at least part of those needs by targeting cold storage requirements in the cloud. Whether quartz glass might one day be used for other purposes remains uncertain, but undoubtedly it has the potential to handle vast stores of archival data.
Robert Sheldon is a technical consultant and freelance technology writer. He has written numerous books, articles and training materials related to Windows, databases, business intelligence and other areas of technology.