These may seem like new-fangled problems spurred on by the Internet. But they're not. These are dilemmas faced by society at the advent of every new way of proliferating data — from the printing press to the Xerox machine.
"If you look at the early 20th century, the number of paper documents was growing exponentially as all of these huge government agencies were popping up," says John Wonderlich, policy director at the Sunlight Foundation in Washington. An advocate for institutional transparency, the Sunlight Foundation, among other functions, leans on government to make public records available. "Back then they didn't know what to throw out, what to standardize, or how to organize. The challenges we face with data are in similar scope. That's why it's so important that these issues are addressed head-on before it's too late."
Since before the modern Internet existed, the Departments of Defense and State have been heading for an information crisis. Data from the two agencies now fill about half of the US government's 2,100 digital storage facilities, and take up twice as much storage as all federal departments used 10 years ago, on account of the raging war machine and new-age digital diplomacy. Even with those resources, the State Department has only enough capacity to keep six months' worth of e-mail backups, and, as evidenced by highly publicized disorganization in the VA health-care system, the Department of Defense is hardly a model bureaucracy itself. The disarray is daunting, especially to those responsible for sorting out the mess.
"[The] growth in redundant infrastructure investments is costly, inefficient, unsustainable, and has a significant impact on energy consumption," Chief Information Officer Kundra wrote in a March 2010 memo. "In addition to the energy impact, information collected from agencies in 2009 shows relatively low utilization rates of current infrastructure and limited reuse of data centers within or across agencies. The cost of operating a single data center is significant, from hardware and software costs to real-estate and cooling costs."
The 36-year-old Kundra cut his teeth as chief technology officer for former District of Columbia Mayor Adrian Fenty. Some have criticized Kundra for the breadth of his plans, and others for not doing enough, but his early efforts are already netting results. Next year, the Department of Defense will move most of its data storage into a new state-of-the-art 5,000-square-foot facility, while the Department of Homeland Security is on track to consolidate 24 data centers into two over the next five years.
The model that agencies are using to streamline data is called virtualization. A policy-speak term du jour, virtualization in this case means connecting vast resources and discarding overlapping information. Under the Federal Data Center Consolidation Initiative, all government agencies are being forced to clean out their server closets, starting with major users like the Department of the Treasury, which is scheduled to close 13 data centers in the near future. It's a slow start: only an estimated 20 percent of federal databases have been virtualized so far. But it's a start.
Not enough people
As a society, we've decided we want government to be more transparent and more accountable. The government has responded, but the more information we keep, the harder it becomes to search for what we want. Data, in and of itself, isn't the answer unless you've got people to make sense of it. Even with millions being spent on storage solutions, public workers warn that the bottleneck in the flow of information is a shortage of good old-fashioned manpower.