The Digitization Process
What is a Wax Cylinder?
A wax cylinder is a tube 4 to 6 inches (10 to 15 cm) long and about 2 inches (5 cm) in diameter. It is made of a malleable wax that holds its shape but can be cut away. During recording, a cutting stylus cuts a groove that winds around the exterior of the cylinder. Depending on the period of the recording and whether the cylinder was commercially produced, the grooves are anywhere from 4 to 20 μm deep. (A μm, pronounced "micron", is a millionth of a meter. For reference, a human hair is around 50 μm thick.) The cutter bobbed up and down as it cut the groove, at a speed dictated by the volume of the sound it was recording, and this up and down motion causes the groove to vary in depth. Traditionally, to play back the audio on a wax cylinder, a stylus is placed in the groove and the cylinder is rotated underneath it. As the cylinder rotates, the bottom of the groove guides the playback stylus up and down, recreating the motion of the cutting stylus and, with it, the original sound.
Creating Images of Cylinders
The cornerstone of the optical process is a chromatic confocal microscope, also referred to as a confocal probe. The probe shines a light on a surface and, by analyzing the color and intensity of the reflected light, measures the height at 180 points on that surface. During an optical scan, the probe makes rapid successive measurements (around 1000 per second) as the cylinder rotates underneath it. Once the cylinder has made a full rotation, the probe is moved along the length of the cylinder to the next stretch that has not been measured, and the cylinder is rotated again. This sequence is repeated until the height of the cylinder's entire surface has been measured, and the individual measurements are then stitched together to produce a high-resolution, three-dimensional map of the cylinder. That map can be archived as a preservation copy, and it can be cleaned and analyzed to produce the audio recorded on the cylinder. A file containing a complete map is between 1 GB and 5 GB in size and takes around 3 hours to create.
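The stitching step can be pictured with a small sketch. It assumes, beyond what is stated above, that each probe reading is a short line of heights running along the cylinder's axis and that every pass records the same number of readings per rotation. The function name `stitch_passes` and the tiny sample data are hypothetical and are not taken from the project's actual software.

```python
def stitch_passes(passes):
    """Join per-rotation passes into one height map.

    passes[p][a] is the list of heights (one per probe point) read on
    pass p at angular step a. The result has one row per angular step,
    with the readings of every pass laid side by side along the axis.
    """
    if not passes:
        return []
    n_angles = len(passes[0])
    if any(len(p) != n_angles for p in passes):
        raise ValueError("every pass must cover the same angular steps")
    height_map = []
    for a in range(n_angles):
        row = []
        for single_pass in passes:
            row.extend(single_pass[a])
        height_map.append(row)
    return height_map


# Two passes, three angular steps, two probe points per reading
# (the probe described above reads 180 points at a time).
pass_0 = [[1.0, 1.1], [1.2, 1.3], [1.4, 1.5]]
pass_1 = [[2.0, 2.1], [2.2, 2.3], [2.4, 2.5]]
height_map = stitch_passes([pass_0, pass_1])
```

Each row of `height_map` then corresponds to one angular position of the cylinder, and each column to one position along its length.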
Turning Pictures Into Sound
The data taken during an optical scan is turned into sound by a computer program that reads in the height measurements and produces a depth image. The depth image displays the cylinder as if it had been cut open down its length and flattened out. The height of the cylinder's surface is encoded in the gray-scale value of the image: black corresponds to the deepest points and white to the highest. Once the image is produced, an algorithm uses edge finding to locate the bottom of the grooves, where the stylus would have rested, and calculates the speed of the stylus over the surface (by taking a derivative along the groove bottom). The speed of the stylus is directly tied to the sound that drove the recording stylus, so this is enough to recreate the sound recorded on the cylinder. During this process, cleaning and noise reduction can be applied using data analysis and image processing techniques.
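The derivative step can be illustrated with a minimal sketch. It is not the project's algorithm: in place of edge finding it simply takes the lowest point of each cross-section as the groove bottom, and it assumes the strip of the depth image has already been cut out so that each row crosses a single groove. The names, the synthetic V-shaped groove, and the sampling rate are all hypothetical.

```python
import math


def groove_bottom_heights(strip):
    """Return the lowest height in each cross-section of one groove.

    strip[n] is the list of heights across the groove at step n along it.
    """
    return [min(row) for row in strip]


def stylus_velocity(heights, samples_per_second):
    """Central-difference derivative of the groove-bottom height."""
    dt = 1.0 / samples_per_second
    return [
        (heights[i + 1] - heights[i - 1]) / (2.0 * dt)
        for i in range(1, len(heights) - 1)
    ]


# Synthetic data: a V-shaped groove whose bottom follows a 10 Hz tone.
RATE = 1000.0        # assumed samples per second along the groove
TONE_HZ = 10.0
AMPLITUDE_UM = 5.0   # depth modulation in micrometers


def v_profile(bottom):
    """Cross-section with its lowest point, `bottom`, at the center."""
    return [bottom + abs(col - 5) for col in range(11)]


strip = [
    v_profile(AMPLITUDE_UM * math.sin(2 * math.pi * TONE_HZ * n / RATE))
    for n in range(200)
]
bottoms = groove_bottom_heights(strip)
velocity = stylus_velocity(bottoms, RATE)  # proportional to the audio signal
```

On this synthetic groove the computed velocity follows a cosine at the same 10 Hz, which is the derivative of the sine that shaped the groove bottom.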
Scanning at UC Berkeley
UC Berkeley is the latest institution to build an optical scanning workstation. The scanning machine at Berkeley is the only existing workstation that can accommodate three cylinders at once, which complicates the mechanics of the system but allows for greater speed and efficiency of digitization. The system, built in 2015, is now running and will be dedicated exclusively to the Hearst Museum Collection for the next two years. After that, the hope is to make it available to scholars with collections outside the university.
Why This Method?
Optical vs. Stylus Transfers: Resolution
The optical process captures higher-frequency audio than is typically possible with the traditional method of transfer, a hard stylus. Higher-frequency audio produces changes in groove depth that happen very quickly as the cylinder rotates, which translates into changes that are very close together on the surface. A traditional stylus is limited by its own size: if these changes are closer together than the stylus is wide, the stylus cannot fit between them and cannot resolve them. (See the image below on the left.) Optical scanning uses light with a spot size much smaller than a stylus, so it can capture the tightly spaced changes in height that correspond to higher-frequency audio. The much finer resolution of an optical scan recovers audio outside the range of a stylus transfer (see the image on the bottom right), and it allows for the collection of the data necessary to eliminate the high-frequency effects of dirt and decay. It is also the reason for the high-frequency "hissing" noise present in audio from an optical scan. This hiss is the sound of the texture of the wax itself and of small defects, and it can be removed readily with commercial audio processing software.
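The link between frequency and spacing on the surface can be worked out with a little arithmetic. The sketch below assumes a cylinder about 5 cm in diameter turning at 160 rpm (a playback speed chosen for illustration, not stated in this text), and the sensor widths passed to `can_resolve` are made-up numbers rather than measured stylus or spot sizes.

```python
import math


def surface_speed_m_per_s(diameter_m, rpm):
    """Speed at which the cylinder surface passes under the sensor."""
    return math.pi * diameter_m * rpm / 60.0


def wavelength_um(frequency_hz, diameter_m=0.05, rpm=160.0):
    """Length along the groove, in micrometers, of one cycle of a tone."""
    speed = surface_speed_m_per_s(diameter_m, rpm)
    return speed / frequency_hz * 1e6


def can_resolve(frequency_hz, sensor_width_um, diameter_m=0.05, rpm=160.0):
    """Rough rule from the text: one cycle must be wider than the sensor."""
    return wavelength_um(frequency_hz, diameter_m, rpm) > sensor_width_um


for tone_hz in (1000.0, 5000.0, 10000.0):
    print(tone_hz, "Hz occupies about",
          round(wavelength_um(tone_hz), 1), "micrometers of groove")
```

Under these assumptions, doubling the frequency halves the length of groove that one cycle occupies, which is why a wide contact point loses the highest frequencies first.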
Noise Reduction: Digital Cleaning to Remove Defects
Using edge detection algorithms, an optical scan can be cleaned with a technique called "blob cleaning". During blob cleaning, the image is searched for "blobs", which in this case are dust, particles, surface decay, cracks, or other damage. These features have sharp edges, which create very rapid changes in the surface height and introduce undesirable high-frequency noise (pops or crackles) into the audio. Blob cleaning isolates the sharp edges of these defects from the slow, gentle changes of the recorded audio, then smooths the edges to minimize their effects.
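The idea of separating sharp defects from gently varying audio can be sketched in one dimension. This is not the blob cleaning algorithm itself: it swaps in a simple running-median test along a single line of heights instead of edge detection on the image, and the function name, window size, and threshold are assumptions chosen for the example.

```python
import math
from statistics import median


def clean_spikes(heights, window=5, threshold=2.0):
    """Replace samples that stand far from their local median.

    A sample is treated as a defect when it differs from the median of
    the surrounding `window` samples by more than `threshold`.
    """
    half = window // 2
    cleaned = list(heights)
    for i in range(len(heights)):
        lo = max(0, i - half)
        hi = min(len(heights), i + half + 1)
        local = median(heights[lo:hi])
        if abs(heights[i] - local) > threshold:
            cleaned[i] = local
    return cleaned


# A slow, gentle undulation (the "audio") with one sharp defect added.
smooth = [5.0 * math.sin(2 * math.pi * i / 50) for i in range(50)]
noisy = list(smooth)
noisy[20] += 20.0  # a speck of dust, in micrometers
cleaned = clean_spikes(noisy)
```

The slowly varying samples pass through untouched because neighboring heights stay close to the local median, while the isolated spike is pulled back toward the surrounding surface.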
Delicate Materials
Because the optical scanning method does not require contact with the cylinders, it is less invasive than traditional playback methods. If a cylinder can be mounted and moved on the machine and have light shone on it, it can be scanned. This allows access to materials that could not withstand traditional playback (cylinders that are broken, cracked, in pieces, or otherwise unstable), and it avoids inflicting further damage with a wandering or sharp stylus (see the middle image for a cylinder damaged by a stylus). Methods to protect delicate cylinders during scanning include using preservation-friendly polyethylene bands to hold together cylinders that might break and using custom-built support pieces to ensure that cylinders with missing pieces are thoroughly supported.