On May 20, 2003, the Defense Advanced Research Projects Agency (DARPA) issued its "Report to Congress regarding the Terrorism Information Awareness Program" (TIA). The Report, mandated by Congress and written to "assess the likely impact of the implementation" of TIA on civil liberties and privacy, was an opportunity for DARPA to make a careful review of the components of TIA and require accountability for each of these components. Unfortunately, the Report did not take advantage of this opportunity.
The Report makes one thing quite clear: TIA is being tested on "real problems" using "real data" pertaining to U.S. persons, apparently from Defense Department (DoD) intelligence files.
Otherwise, the Report doesn't shed much light on the issues that concern EFF. It provides an overview of the various TIA components, including some that we hadn't heard of before. Unfortunately, several of these new programs only make us more worried about TIA: If successful, they'll make surveillance and dataveillance even more powerful.
The Report also provides a few not-very-reassuring clues to the government's thinking about privacy and civil liberties. As far as the government's concerned, existing law protects our privacy. But there's little concern for data accuracy, and there's no mention of TIA's accountability to individuals. Also conspicuously absent is any concrete discussion of privacy or civil liberties issues in the actual use of TIA.
In short, the Report is a major disappointment. The government had an opportunity to open public discourse about TIA; for the most part, it chose to hide behind broad and vague generalities.
Formerly "Total Information Awareness," TIA has been renamed "Terrorism Information Awareness." The renaming is intended to correct the impression "that TIA was a system to be used for developing dossiers on U.S. citizens." TIA's intent, DARPA says, is to "protect citizens by detecting and defeating foreign terrorist threats before an attack." Report, Executive Summary p. 1 (ES-1).
This change seems purely cosmetic, reminiscent of the FBI's renaming its Carnivore tool "DCS-1000." There is no question that TIA, if implemented, will process information about U.S. persons. For instance, TIA technologies will be tested on a "realistic world of synthetic transaction data" that simulates "the behavior of normal people, unusual-but-benign people, and terrorists." Appendix p. 11 (A-11).
More important, EFF's concerns are not limited to the compilation of dossiers. The problems with TIA and similar programs like CAPPS II include both privacy and its close cousin, accountability. These issues exist whenever the government can query and analyze vast amounts of personal information, whether in one giant database or divided among many smaller databases in both government and private hands. Keep in mind that a major goal of at least one of the components of TIA, the Genisys program, is to "create technology that enables many physically disparate heterogeneous databases to be queried as if it were one logical 'virtually' centralized database." A-11.
When TIA was first announced, we knew about these programs: Genoa (which was ending); Genoa II; Genisys; Evidence Extraction and Link Discovery (EELD); Wargaming the Asymmetric Environment (WAE); Translingual Information Detection, Extraction and Summarization (TIDES); Effective, Affordable, Reusable Speech-to-Text (EARS); Human Identification at a Distance (HumanID); Bio-Surveillance; Communicator; and Babylon. Information on all of these programs is available from EFF's TIA pages.
The Report describes TIA as encompassing five overall "threads": secure collaborative problem solving; structured discovery with sources and methods security; link and group understanding; context-aware visualization; and decision making with corporate memory. ES-2,3; A-3-5. The programs fall into three categories: advanced collaborative and decision support programs; language translation programs; and data search, pattern recognition, and privacy protection programs. Report, p. 2-3 (R-2-3).
The new TIA programs are:
The Report's description of these programs is provided in the appendix to this review. Clearly, it is reasonable to expect that programs will continue to be added, which again highlights the need for close oversight. If TIA is permitted to continue, EFF will not be surprised if DARPA's new "LifeLog" program, for instance, joins the TIA "surveillance product line" in the next year or two.
The Report states that TIA funding "for FY 2003 through FY 2005 as proposed in the FY 2004 President's Budget submission is $53,752,000." ES-2.
This number is misleading, because it only counts the line item for TIA -- which is separate from the line items for EELD, Genisys, and so on. According to EFF's arithmetic, the budget for all TIA programs described in the Report is about $140 million in FY 2003 and about $169 million in FY 2004.
Unsurprisingly, the Report strongly emphasizes privacy issues. Most obviously, the Report highlights the Genisys Privacy Protection Program, a subcomponent of the Genisys database or "data repository" technology program.
While DARPA has talked about the need for operational or technical (as opposed to legal) TIA privacy safeguards for some time, and deserves credit for having done so, EFF is disappointed by the superficiality of the Report's discussion. The remainder of this review will identify shortcomings in the Report's approach to privacy.
The best example needs to be highlighted here: while the Report tries to reassure us that TIA is being developed with concern for privacy and civil liberties, it tells us that TIA is being tested on real data about real U.S. persons. "TIA's research and testing activities have depended entirely on (1) information legally obtainable and usable by the Federal Government under existing law, or (2) wholly synthetic, artificial data that has been generated to resemble and model real-world patterns of behavior." R-27 (emphasis in original).
This statement is troubling. It's OK that TIA R&D is using synthetic data (sort of like "The Sims" gone wild). It would be even more interesting to know the full set of "character attributes" used to generate these 10 million imaginary people. And that this synthetic data includes imaginary people "calling other imaginary people, traveling places, and buying things" tells us that TIA really is intended to analyze the full spectrum of transactions in everyday life.
But what about the "information legally obtainable and usable by the Federal Government under existing law"? The Report doesn't say much about how much or what kind of this information is actually being used. We know from TIA Program Directive Number 2 (Data Concerning Information About U.S. Persons) that DARPA is using DoD "intelligence entities" as "test nodes." Directives, p. 4 (D-4). These intelligence entities apparently include the Central Intelligence Agency, the National Security Agency, the Defense Intelligence Agency and DoD's Counterintelligence Field Activity. A-2.
More to the point, the directive says: "During experiments, DARPA, contract and contract support personnel analyze real data with various tools to examine real problems. . . . As a result of these experiments, interesting results from an intelligence perspective may be generated. Judgments regarding the value of such results and any subsequent production of intelligence is the purview of the operational users and analysts, not DARPA." D-5. We see here, all too clearly, that DARPA has already washed its hands as to the potential effects of using TIA on data about real people.
A clear message of the Report is that TIA is not intended to create a giant government database. R-27 ("the TIA Program is not attempting to create or access a centralized database that will store information gathered from various publicly or privately held databases").
But this message is no comfort. As noted above, part of TIA aims to make physically disparate heterogeneous databases seem like a giant "virtual" database. If so, does it really matter that there is no "real" centralized database? People are already concerned about the loss of "practical obscurity" as searchable public records databases go online and as search engines make it easier to find information about them across many websites.
The Report admits that "ultimate implementation of some of the component programs of TIA may raise significant and novel privacy and civil liberties policy issues." R-27. But it does little to address these issues. Instead, the Report addresses privacy issues that might arise during DARPA's development of TIA. And even here, the Report raises more questions than it answers: TIA is being tested on "real data" about real people.
When the Report does talk about specific TIA programs, it takes shortcuts. Of the 18 TIA programs, the Report identifies only eight that raise privacy concerns: Genisys, EELD, SSNA, MInDet, Bio-ALIRT, HumanID, ARM, and NGFR. But almost in the same breath, the Report sets aside Bio-ALIRT and the three "human identification" tools because "they are not the programs that have given rise to the greatest level of concern (or that gave rise to this report)." R-31.
This is a pretty blatant dodge. For example, ARM and NGFR weren't even funded in FY 2003; there's hardly been time (or public record information) for them to "give rise" to concerns.
The Report does recognize that "the various tools for human identification at a distance (HumanID, ARM, and NGFR) may raise significant privacy issues if deployed in particular contexts." R-35. But it doesn't discuss how those issues would or should be resolved.
Even for the remaining four programs -- Genisys, EELD, SSNA and MInDet -- the discussion is sparse. The Report identifies the main privacy issues for these programs as: aggregation of data, unauthorized access to TIA, and unauthorized use of TIA. R-33. It doesn't seem to think that authorized use of TIA raises a major privacy issue.
But it doesn't address those issues so much as it deflects them. First, the Report emphasizes DARPA's commitment to TIA's effectiveness and accuracy. R-33. Unfortunately, it's not clear how TIA's effectiveness will be evaluated. "We can never know for certain that there is a terrorist plan out there to be detected until after the fact; therefore, DoD is developing collateral measures of performance." R-15. DARPA admits that testing TIA's data-mining technologies is a "very difficult problem" and that it's just beginning these tests. R-17.
Second, the Report emphasizes privacy protection technologies, like automated audit trails, selective revelation, and anonymization. R-34. But the probable effectiveness of these technologies is not discussed. No information is given about the current state of these techniques or how well they will work in a large and complex system.
Third, the Report relies heavily on the mantra that existing law protects privacy. For instance, each operational component of DoD that hosts TIA tools or technologies is supposed to "prepare a substantive legal review that . . . analyzes the legal issues raised by the underlying program to which the TIA tools will be applied." R-34. Maybe these reviews will be more enlightening than the Report itself.
The Report tells us that TIA must "operate within the confines of existing law." R-32; R-28 ("This report does not recommend any changes in statutory law"). There are three problems here. First, there's no reason to think that existing law adequately protects personal privacy or civil liberties. For example, Watergate-era laws like the Privacy Act are widely regarded today as under-enforced and riddled with loopholes; the Foreign Intelligence Surveillance Act's reliance on secret courts and proceedings is of highly questionable constitutionality; and the Fourth Amendment's constitutional protections have been greatly weakened by the Supreme Court's restricted concept of "reasonable expectation of privacy."
Second, the gaps in existing privacy law are widening because of new technologies that expose more of our lives to others and eliminate "reasonable" privacy expectations. The rise of the Internet and e-mail means that warrantless surveillance can gather information about people's reading and viewing habits -- something that was unlikely to happen before the Internet.
Third, "existing" privacy law changes. Since 9/11, the passage of the USA-PATRIOT Act, the Homeland Security Act, and the Aviation Security Act has caused tectonic shifts in the privacy landscape, and not for the better. Tellingly, the Report actually lists the USA-PATRIOT Act and the Homeland Security Act as laws that "might either constrain or (as a logistical matter) completely block deployment of TIA search tools." R-18.
And sometimes, when new technologies might make surveillance harder for the government, new laws or regulations lighten the government's burden. The use of encryption has been held back by government regulation of encryption export; when the FBI told Congress that digital telephony might hinder its ability to wiretap phones, the Communications Assistance for Law Enforcement Act (CALEA) was enacted to ensure that the FBI will always be able to intercept phone conversations.
By law, this Report was required to "assess the likely impact of the implementation" of TIA on civil liberties as well as privacy. EFF's concerns about programs like TIA and CAPPS II always include accountability, because accountability is essential to both "fair information principles" and civil liberties other than privacy. But there's even less discussion in the Report of accountability and civil liberties issues.
The Report emphasizes administrative controls on TIA's development, such as a DoD oversight board and a Federal Advisory Committee of outside experts. R-31. Such administrative controls are no substitute for true public accountability; the lack of information in this congressionally-mandated report should make that clear.
Privacy Act concepts like the right to a copy of one's records, the right to dispute or correct information believed to be inaccurate, the right to know how one's personal information is used and who has access to it, and the right to know what institutions and record systems contain personal information all revolve around accountability. But the Report doesn't discuss these issues -- even though TIA is already being tested on real data about real people.
For the ordinary person, TIA is a giant suspicion-generating machine. TIA's most obvious purpose is to identify suspected terrorists (although, given the recent allegations about the use of the Homeland Security Department to track Democratic legislators in Texas, one should be concerned that TIA will be used for other purposes). How do you clear your name if a TIA analyst, aided by an "intelligent agent," mistakenly decides that you're suspicious? Will you even know?
Amazingly, while EFF worries about the accuracy and quality of the data that TIA would use, the Report blithely dismisses the issue: "TIA does not, in and of itself, raise any particular concerns about the accuracy of individually identifiable information." R-32. The Report's logic is that TIA is "simply a tool for more efficiently inquiring about data in the hands of others," and this concern about data quality "would exist regardless of the method employed." R-32-33. It's remarkable that the government can so easily ignore the harm that suspicion based on bad data might cause to people, given the problems we already see with "no-fly" and other watchlists.
The Report defines civil liberties as "relat[ing] primarily to the protection of the individual's constitutional rights to, among others, freedom of expression, freedom of the press and assembly, freedom of religion, interstate travel, equal protection, and due process of law." R-27. But it says nothing meaningful about how implementing TIA might affect these civil liberties, even though some impacts are pretty obvious.
We noted above, for instance, that the Report recognized that TIA's human identification tools raised privacy issues. But they raise obvious civil liberties issues as well. ARM (Activity Recognition Monitoring) is intended to improve the ability to interpret crowd behavior. In conjunction with NGFR (Next-Generation Face Recognition) and HumanID, the ability to monitor political demonstrations, religious assemblies, and gatherings of all kinds will be enhanced. We don't even have to add in the other tools -- the potential for chilling effects on protected expressive activity is clear.
Even without TIA, we've had hints of the problems. One example: an FBI database, the Violent Gang and Terrorist Organization File (VGTOF), is expanding. In 1995 VGTOF was mainly used to track violent urban street gangs; today, it includes categories like "anarchists," "militia," "white supremacist," "black extremist," "animal rights extremist," "environmental extremist," "radical Islamic extremist," and "European origin extremist." And of course, data accuracy is a problem here. The Denver police department had for years been keeping secret files on political activists such as the American Friends Service Committee, a Quaker peace-activist group, and the pro-gun lobby. Last summer, when a man listed in the Denver files as a gun-rights group member got into a fender-bender, a police officer checking VGTOF found him described as "a member of a terrorist organization" and part of a "militia." According to a Denver police memo, the officer reported the stop to the FBI as a "terrorist contact." The Denver police and the FBI decline to comment on how the man ended up in VGTOF.
We have no good information about how many mistakes are in these databases; we should be especially concerned by their reliance on inherently fuzzy concepts like "extremist." And yet only recently the Justice Department exempted the FBI's National Crime Information Center (NCIC) database, which provides over 80,000 law enforcement agencies with access to data on wanted persons, missing persons, gang members, stolen cars, boats, and other information, from the Privacy Act requirements of accuracy, relevance, timeliness, and completeness. Why? Because "it is impossible to determine in advance what information is accurate, relevant, timely and complete."
Finally, the Report is almost silent on how the very existence and use of TIA might cause mission creep, source creep, and so on. There are hints -- the Report recognizes that human identification tools might be used "to justify longer retention of stored surveillance tapes of public places." R-35. Here the Report implicitly recognizes that technology not only can make surveillance practices more efficient, but can also expand their range or scope.
Elsewhere, however, the Report is blind to this dynamic. For instance, the Report finds that because TIA "take[s] the data as it finds it" in private databases, TIA does not pose the privacy concern that "parties whose databases would be queried [would] begin collecting data that they do not already collect." R-32. But it is just as plausible that once TIA begins using private databases, there will be political, legal or social pressure for private parties to collect more information or to store it longer. The obvious historical precedent is the bank records retention requirements of the Bank Secrecy Act made infamous in California Bankers Association v. Shultz. Even without TIA, there has been talk of requiring ISPs to retain records of their subscribers' Internet use.
EFF's criticisms may seem unfairly harsh. Congress certainly did not expect DARPA to produce a rigorous dissertation on privacy and civil liberties. Nevertheless, we are disappointed by the lack of concrete discussion. In our experience, researchers usually think a great deal about how their work might be used, and often have a better idea of their work's implications than do outsiders. EFF hoped, perhaps vainly, that some of that concrete thinking about TIA's implications would be revealed in the Report. Instead, the Report is largely content to speak in broad and vague terms about what TIA may accomplish and how the privacy and civil liberties concerns might be addressed if everything works.