Research
The Department of Physics at MIT (PDF)
The Scientific Data Flood: A Case Study of "How Much Information?"
Stuart Madnick, John Norris Maguire Professor of Information Technology, MIT Sloan School of
Management & Professor of Engineering Systems, MIT School of Engineering
MacKenzie Smith, Associate Director of Technology, MIT Libraries
Kate Clopeck, Masters of Science, Technology and Policy Program, MIT
June 2009
Abstract:
MIT’s Physics department has about 90 experimental physics faculty, who generate massive amounts
of data. The nature and size of each project vary, but projects tend to run continuously over months
or years. For example, the Compact Muon Solenoid detector, housed at CERN, produces about
8000 terabytes per year of experimental data, plus a similar amount of simulation data, all of which
is processed multiple times. The data is stored in an internationally distributed, tiered system that
provides backup and sharing. Scientists at MIT pull off data in chunks of about 500 TB, which are
then filtered and analyzed on campus. About 500 worldwide users store working data at MIT, with 1
TB in a RAID array allocated for each. Although the underlying physical sensors can remain fixed
for years, the amount of raw experimental data still increases as the methods of collection and
processing are upgraded. A rough extrapolation is that the Physics department as a whole stores
about 2 × 10^18 bytes a year (2 exabytes) of new data. Other papers examine other labs at MIT.
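To put the figures quoted in the abstract on a common scale, the short sketch below converts them to bytes. It assumes decimal units (1 TB = 10^12 bytes), which the abstract does not state, and the variable names are illustrative only. The ratio at the end is a rough sense of scale, not a claim about where the CMS data is stored.

```python
# Unit conversion for the figures quoted in the abstract.
# Assumption: decimal (SI) units, 1 TB = 10**12 bytes. The abstract does not
# say whether decimal or binary units were intended.
TB = 10**12
PB = 10**15
EB = 10**18

# CMS detector: about 8000 TB/year of experimental data, plus a similar
# amount of simulation data (taken here as equal, which is an approximation).
cms_experimental_bytes = 8000 * TB
cms_simulation_bytes = 8000 * TB
cms_total_bytes = cms_experimental_bytes + cms_simulation_bytes

# Working storage at MIT: about 500 users with 1 TB allocated to each.
working_bytes = 500 * 1 * TB

# Department-wide extrapolation from the abstract: 2 exabytes per year.
dept_bytes = 2 * EB

# How many times larger the department estimate is than one year of CMS data.
ratio = dept_bytes // cms_total_bytes

print(f"CMS per year:       {cms_total_bytes // PB} PB")
print(f"Working storage:    {working_bytes // TB} TB")
print(f"Department per year: {dept_bytes // TB:,} TB")
print(f"Department estimate / CMS per year: {ratio}")
```

Under these assumptions, one year of CMS experimental plus simulation data comes to 16 PB, and the department-wide estimate of 2 exabytes is 125 times that.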