Research

The Department of Physics at MIT
The Scientific Data Flood: A Case Study of "How Much Information?"

Stuart Madnick, John Norris Maguire Professor of Information Technology, MIT Sloan School of Management & Professor of Engineering Systems, MIT School of Engineering

MacKenzie Smith, Associate Director of Technology, MIT Libraries

Kate Clopeck, Masters of Science, Technology and Policy Program, MIT

June 2009

Abstract:
MIT’s Physics department has about 90 experimental physics faculty, who generate massive amounts of data. The nature and size of the projects vary, but they tend to run continuously over months or years. For example, the Compact Muon Solenoid detector, housed at CERN, produces about 8000 terabytes (TB) of experimental data per year, plus a similar amount of simulation data, all of which is processed multiple times. The data is stored in an internationally distributed, tiered system that provides backup and sharing. Scientists at MIT pull data in chunks of about 500 TB, which are then filtered and analyzed on campus. About 500 users worldwide store working data at MIT, with 1 TB in a RAID array allocated to each. Although the underlying physical sensors can remain fixed for years, the amount of raw experimental data still increases as collection and processing methods are upgraded. A rough extrapolation is that the Physics department as a whole stores about 2 × 10^18 bytes (2 exabytes) of new data a year. Other papers examine other labs at MIT.
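To put the figures quoted in the abstract on a common scale, the short sketch below converts them between terabytes, petabytes, and exabytes. It rests on two assumptions that the abstract does not state: that the units are decimal (1 TB = 10^12 bytes), and that "a similar amount" of simulation data means an equal amount. The variable names are illustrative only, and the sketch does not reproduce how the authors arrived at their 2-exabyte extrapolation.

```python
# Assumption: decimal (SI) units. The abstract does not say whether
# "terabyte" means 10**12 or 2**40 bytes.
TB = 10**12
PB = 10**15
EB = 10**18

# Figures quoted in the abstract.
cms_experimental_bytes = 8000 * TB            # CMS experimental data per year
# Assumption: "a similar amount" of simulation data is treated as equal.
cms_with_simulation_bytes = 2 * cms_experimental_bytes
user_working_bytes = 500 * 1 * TB             # about 500 users, 1 TB each
department_bytes = 2 * EB                     # rough departmental extrapolation


def to_unit(n_bytes, unit):
    """Express a byte count in the given unit (TB, PB, or EB)."""
    return n_bytes / unit


print("CMS experimental data per year:", to_unit(cms_experimental_bytes, PB), "PB")
print("CMS including simulation:", to_unit(cms_with_simulation_bytes, PB), "PB")
print("User working storage at MIT:", to_unit(user_working_bytes, TB), "TB")
print("Department estimate per year:", to_unit(department_bytes, TB), "TB")
```

Under these assumptions, the departmental estimate of 2 exabytes corresponds to two million terabytes, which is far larger than the yearly CMS output on its own.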