EFF REVIEW OF MAY 20 REPORT ON TOTAL INFORMATION AWARENESS

<?php

include("eff_setup2.php");

$smarty = new EFFSmarty;

$smarty->assign('title','Review of May 20 Report on Total Information Awareness');

// if breadcrumb == true, then it fills in the right trail from the issue
// array
$smarty->assign('breadcrumb','true');

// example:
$issue = array("Issues" => "/issues/", "Privacy" => "/issues/privacy/", "TIA" => "/issues/privacy/tia/");

$smarty->assign('issue',$issue);

$content  = '
<div id="featuretext">

<h1>EFF REVIEW OF MAY 20 REPORT ON TOTAL INFORMATION AWARENESS</h1>
<br />
<h2>EXECUTIVE SUMMARY</h2>
<br />
<p> On May 20, 2003, the Defense Advanced Research Projects Agency
(DARPA)
issued its "<a href="20030520_tia_report.php">Report to Congress regarding the Terrorism Information
Awareness
Program</a>" (TIA).  The Report, mandated by Congress and written to
"assess[] the
likely impact of the implementation" of TIA on civil liberties and
privacy, was
an opportunity for DARPA to make a careful review of the components of
TIA and
require accountability for each of these components.  Unfortunately,
the Report
did not take advantage of this opportunity. </p>

<p> The Report makes one thing quite clear:  TIA is being tested on
"real
problems" using "real data" pertaining to U.S. persons, apparently
from Defense
Department (DoD) intelligence files. </p>

<p> Otherwise, the Report doesn\'t shed much light on the issues that
concern
EFF.  It provides an overview of the various TIA components, including
some that
we hadn\'t heard of before.  Unfortunately, several of these new
programs only
make us more worried about TIA:  If successful, they\'ll make
surveillance and
dataveillance even more powerful. </p>

<p> The Report also provides a few not-very-reassuring clues to the
government\'s
thinking about privacy and civil liberties.  As far as the
government\'s
concerned, existing law protects our privacy.  But there\'s little
concern for
data accuracy, and there\'s no mention of TIA\'s accountability to
individuals. 
Also conspicuously absent is any concrete discussion of privacy or
civil
liberties issues in the actual use of TIA. </p>

<p> In short, the Report is a major disappointment.  The government
had an
opportunity to open public discourse about TIA; for the most part, it
chose to
hide behind broad and vague generalities. </p>

<ol type="I">
<li><h3>Is there anything new in the Report?</h3>
<ol type="A">
<li><h4>A new name</h4>
<p> Formerly "Total Information Awareness," TIA has been renamed
"Terrorism
Information Awareness."  The renaming is intended to correct the
impression
"that TIA was a system to be used for developing dossiers on U.S.
citizens." 
TIA\'s intent, DARPA says, is to "protect citizens by detecting and
defeating
foreign terrorist threats before an attack."  Report, Executive
Summary p. 1
(ES-1). </p>

<p> This change seems purely cosmetic, reminiscent of the FBI\'s
renaming its
Carnivore tool "DCS-1000."  There is no question that TIA, if
implemented, will
process information about U.S. persons.  For instance, TIA
technologies will be
tested on a "realistic world of synthetic transaction data" that
simulates "the
behavior of normal people, unusual-but-benign people, and terrorists."
Appendix
p. 11 (A-11). </p>

<p> More important, EFF\'s concerns are not limited to the compilation
of
dossiers.  The problems with TIA and similar programs like CAPPS II
include both
privacy and its close cousin, accountability.  These issues exist
whenever the
government can query and analyze vast amounts of personal information,
whether
in one giant database or divided among many smaller databases in both
government
and private hands.  Keep in mind that a major goal of at least one of
the
components of TIA, the Genisys program, is to "create technology that
enables
many physically disparate heterogeneous databases to be queried as if
it were
one logical \'virtually\' centralized database."  A-11. </p>
</li>
<li><h4>New programs</h4>
<p>
When TIA was first announced, we knew about these programs: Genoa
(which was ending); Genoa II; Genisys; Evidence Extraction and Link
Discovery (EELD); Wargaming the Asymmetric Environment (WAE);
Translingual Information Detection, Extraction and Summarization
(TIDES); Effective, Affordable, Reusable Speech-to-Text (EARS); Human
Identification at a Distance (HumanID); Bio-Surveillance;
Communicator; and Babylon.  Information on all of these programs is
available from EFF\'s <a href="index.php">TIA pages</a>. 
</p>

<p>
The Report describes TIA as encompassing five overall "threads":
secure collaborative problem solving; structured discovery with
sources and methods security; link and group understanding;
context-aware visualization; and decision making with corporate
memory.  ES-2,3; A-3-5.  The programs fall into three categories:
advanced collaborative and decision support programs; language
translation programs; and data search, pattern recognition, and
privacy protection programs.  Report, p. 2-3 (R-2-3).
</p>

<p>
The new TIA programs are:
</p>
<ul>
	<li>Rapid Analytical Wargaming (RAW), which seeks to provide
decision-makers with the ability to better anticipate future
political, policy, security, and military/terrorism activity;</li>
	<li>Futures Markets Applied to Prediction (FutureMAP), which seeks
to use "policy markets" in which experts trade "outcome futures" to
answer questions like "will terrorists attack Israel with bioweapons
next year?";</li>
	<li>Global Autonomous Language Exploitation (GALE), which seeks to
teach computers to find critical foreign intelligence information from
broadcasts, conversations, newswires and the Internet and then provide
it to humans without their specifically requesting it;</li>
	<li>Scalable Social Network Analysis (SSNA), which aims to model
networks of connections like social interactions, financial
transactions, telephone calls, and organizational memberships;</li>
	<li>MisInformation Detection (MInDet), which seeks to detect
intentional misinformation and inconsistencies in publicly available
data and to identify false or misleading statements in textual
documents;</li>
	<li>Activity, Recognition, and Monitoring (ARM), which seeks to
automate the ability to capture, identify and classify human
activities in surveillance environments (including crowds) using
video, agile sensors, low power radar, infrared, and radio frequency
tags; and</li>
	<li>Next-Generation Face Recognition (NGFR), which seeks to
improve face-recognition technology using 3-D imagery and processing
techniques, infrared and multispectral imagery, and expression
analysis.</li>
</ul>
<p>
The Report\'s description of these programs is provided in the
appendix to this
review.  Clearly, it is reasonable to expect that programs will
continue to be
added, which again highlights the need for close oversight.  If TIA is
permitted
to continue,  EFF will not be surprised if DARPA\'s new "LifeLog"
program, for
instance, joins the TIA "surveillance product line" in the next year
or two.
</p>
</li>
<li><h4>How much are we spending on TIA?</h4>

<p> The Report states that TIA funding "for FY 2003 through FY 2005 as
proposed
in the FY 2004 President\'s Budget submission is $53,752,000."  ES-2.
</p>

<p> This number is misleading, because it only counts the line item
for TIA --
which is separate from the line items for EELD, Genisys, and so on.
According
to EFF\'s arithmetic, the budget for all TIA programs described in the
Report is
about $140 million in FY 2003 and about $169 million in FY 2004. </p>
</li>
<li>

<h4>Lip service paid to privacy and civil liberties concerns even
while TIA is
experimenting with real data about real U.S. persons
</h4>

<p>
Unsurprisingly, the Report strongly emphasizes privacy issues.  Most
obviously,
the Report highlights the Genisys Privacy Protection Program, a
subcomponent of
the Genisys database or "data repository" technology program.
</p>

<p>
While DARPA has talked about the need for operational or technical (as
opposed
to legal) TIA privacy safeguards for some time, and deserves credit
for having
done so, EFF is disappointed by the superficiality of the Report\'s
discussion. 
The remainder of this review will identify shortcomings in the
Report\'s approach
to privacy.
</p>

<p>
The best example needs to be highlighted here:  while the Report tries
to
reassure us that TIA is being developed with concern for privacy and
civil
liberties, it tells us that TIA is being tested on real data about
real U.S.
persons. "TIA\'s research and testing activities have depended
entirely on (1)
information legally obtainable and usable by the Federal Government
under
existing law, or (2) wholly synthetic, artificial data that has been
generated
to resemble and model real-world patterns of behavior."  R-27
(emphasis in
original).
</p>

<p>
This statement is troubling.  It\'s OK that TIA R&amp;D is using
synthetic data (sort
of like "The Sims" gone wild).  It would be even more interesting to
know the
full set of "character attributes" used to generate these 10 million
imaginary
people.  And that this synthetic data includes imaginary people
"calling other
imaginary people, traveling places, and buying things" tells us that
TIA really
is intended to analyze the full spectrum of transactions in everyday
life.
</p>

<p>
But what about the "information legally obtainable and usable by the
Federal
Government under existing law"?  The Report doesn\'t say much about
how much or
what kind of this information is actually being used.  We know from
TIA Program
Directive Number 2 (Data Concerning Information About U.S. Persons)
that DARPA
is using DoD "intelligence entities" as "test nodes."  Directives, p.
4 (D-4).
These intelligence entities apparently include the Central
Intelligence Agency,
the National Security Agency, the Defense Intelligence Agency and
DoD\'s
Counterintelligence Field Activity.  A-2.
</p>

<p>
More to the point, the directive says:  "During experiments, DARPA,
contract and
contract support personnel analyze real data with various tools to
examine real
problems. . . .  As a result of these experiments, interesting results
from an
intelligence perspective may be generated.  Judgments regarding the
value of
such results and any subsequent production of intelligence is the
purview of the
operational users and analysts, not DARPA."  D-5.   We see here, all
too
clearly, that DARPA has already washed its hands as to the potential
effects of
using TIA on data about real people.
</p>
</li>
</ol>
</li>
<li>
<h3>The Report\'s discussion of privacy is too limited</h3>
<ol type="A">
<li><h4>A red herring:  giant databases</h4>
<p>
A clear message of the Report is that TIA is not intended to create a
giant
government database.  R-27 ("the TIA Program is not attempting to
create or
access a centralized database that will store information gathered
from various
publicly or privately held databases").
</p>

<p>
But this message is no comfort.  As noted above, part of TIA aims to
make
physically disparate heterogeneous databases seem like a giant
"virtual"
database.  If so, does it really matter that there is no "real"
centralized
database?  People are already concerned about the loss of "practical
obscurity"
as searchable public records databases go online and as search engines
make it
easier to find information about them across many websites.
</p>
</li>
<li><h4>Few TIA programs are actually evaluated</h4>
<p>
The Report admits that "ultimate implementation of some of the
component
programs of TIA may raise significant and novel privacy and civil
liberties
policy issues."  R-27.  But it does little to address these issues.
Instead,
the Report addresses privacy issues that might arise during DARPA\'s
development
of TIA. And even here, the Report raises more questions than it
answers:  TIA is
being tested on "real data" about real people.
</p>

<p>
When the Report does talk about specific TIA programs, it takes
shortcuts.  Of
the 18 TIA programs, the Report identifies only eight that raise
privacy
concerns:  Genisys, EELD, SSNA, MInDet, Bio-ALIRT, HumanID, ARM, and
NGFR.  But
almost in the same breath, the Report sets aside Bio-ALIRT and the
three "human
identification" tools because "they are not the programs that have
given rise to
the greatest level of concern (or that gave rise to this report)."
R-31.
</p>

<p>
This is a pretty blatant dodge.  For example, ARM and NGFR weren\'t
even funded
in FY 2003; there\'s hardly been time (or public record information)
for them to
"give rise" to concerns.
</p>

<p>
The Report does recognize that "the various tools for human
identification at a
distance (HumanID, ARM, and NGFR) may raise significant privacy issues
if
deployed in particular contexts."  R-35.  But it doesn\'t discuss how
those
issues would or should be resolved.
</p>
</li>
<li>
<h4>Little concrete discussion of privacy</h4>
<p>
Even for the remaining four programs -- Genisys, EELD, SSNA and MInDet
-- the
discussion is sparse.  The Report identifies the main privacy issues
for these
programs as:  aggregation of data, unauthorized access to TIA, and
unauthorized
use of TIA.  R-33.  It doesn\'t seem to think that authorized use of
TIA raises a
major privacy issue.
</p>

<p>
But it doesn\'t address those issues so much as it deflects them.
First, the
Report emphasizes DARPA\'s commitment to TIA\'s effectiveness and
accuracy.  R-33.
 Unfortunately, it\'s not clear how TIA\'s effectiveness will be
evaluated.  "We
can never know for certain that there is a terrorist plan out there to
be
detected until after the fact; therefore, DoD is developing collateral
measures
of performance."  R-15.  DARPA admits that testing TIA\'s data-mining
technologies is a "very difficult problem" and that it\'s just
beginning these
tests.  R-17.
</p>

<p>
Second, the Report emphasizes privacy protection technologies, like
automated
audit trails, selective revelation, and anonymization.  R-34.  But the
probable
effectiveness of these technologies is not discussed.  No information
is given
about the current state of these techniques or how well they will work in a
large and
complex system.
</p>

<p>
Third, the Report relies heavily on the mantra that existing law
protects
privacy.  For instance, each operational component of DoD that hosts
TIA tools
or technologies is supposed to "prepare a substantive legal review
that . . .
analyzes the legal issues raised by the underlying program to which
the TIA
tools will be applied."  R-34.  Maybe these reviews will be more
enlightening
than the Report itself.
</p>
</li>
</ol>
</li>
<li><h3>The Report ignores problems in existing privacy law</h3>
<p>
The Report tells us that TIA must "operate within the confines of
existing law."
 R-32; R-28 ("This report does not recommend any changes in statutory
law"). 
There are three problems here.  First, there\'s no reason to think
that existing
law adequately protects personal privacy or civil liberties.  For
example,
Watergate-era laws like the Privacy Act are widely regarded today as
under-enforced and riddled with loopholes; the Foreign Intelligence
Surveillance
Act\'s reliance on secret courts and proceedings is of highly
questionable
constitutionality; and the Fourth Amendment\'s constitutional
protections have
been greatly weakened by the Supreme Court\'s restricted concept of
"reasonable
expectation of privacy."
</p>

<p>
Second, the gaps in existing privacy law are widening because of new
technologies that expose more of our lives to others and eliminate
"reasonable"
privacy expectations.  The rise of the Internet and e-mail means that
warrantless surveillance can gather information about people\'s
reading and
viewing habits -- something that was unlikely to happen before the
Internet.
</p>

<p>
Third, "existing" privacy law changes.  Since 9/11, the passage of the
USA-PATRIOT Act, the Homeland Security Act, and the Aviation Security
Act has
caused tectonic shifts in the privacy landscape, and not for the
better. 
Tellingly, the Report actually lists the USA-PATRIOT Act and the
Homeland
Security Act as laws that "might either constrain or (as a logistical
matter)
completely block deployment of TIA search tools."  R-18.
</p>

<p>
And sometimes, when new technologies might make surveillance harder
for the
government, new laws or regulations lighten the government\'s burden.
The use of
encryption has been held back by government regulation of encryption
export;
when the FBI told Congress that digital telephony might hinder its
ability to
wiretap phones, the Communications Assistance for Law Enforcement Act
(CALEA) was
enacted to ensure that the FBI will always be able to intercept phone
conversations.
</p>
</li>
<li><h3>The Report gives short shrift to other civil liberties
issues</h3>

<p>
By law, this Report was required to "assess[] the likely impact of the
implementation" of TIA on civil liberties as well as privacy.
EFF\'s concerns
about programs like TIA and CAPPS II always include accountability,
because
accountability is essential to both "fair information principles" and
civil
liberties other than privacy. But there\'s even less discussion in the
Report of
accountability and civil liberties issues. 
</p>
<ol type="A">
<li><h4>Public accountability in TIA\'s development </h4>
<p>
The Report emphasizes administrative controls on TIA\'s development,
such as a
DoD oversight board and a Federal Advisory Committee of outside
experts.  R-31. 
Such administrative controls are no substitute for true public
accountability;
the lack of information in this congressionally mandated report should
make that
clear.    
</p>
</li>
<li><h4>Accountability in the use of TIA</h4>
<p>
Privacy Act concepts like the right to a copy of one\'s records, the
right to
dispute or correct information believed to be inaccurate, the right to
know how
one\'s personal information is used and who has access to it, and the
right to
know what institutions and record systems contain personal information
all
revolve around accountability.  But the Report doesn\'t discuss these
issues --
even though TIA is already being tested on real data about real
people.
</p>

<p>
For the ordinary person, TIA is a giant suspicion-generating machine.
TIA\'s
most obvious purpose is to identify suspected terrorists (although,
given the
recent allegations about the use of the Homeland Security Department
to track
Democratic legislators in Texas, one should be concerned that TIA will
be used
for other purposes).  How do you clear your name if a TIA analyst,
aided by an
"intelligent agent," mistakenly decides that you\'re suspicious?  Will
you even
know?
</p>

<p>
Amazingly, while EFF worries about the accuracy and quality of the
data that TIA
would use, the Report blithely dismisses the issue:  "TIA does not, in
and of
itself, raise any particular concerns about the accuracy of
individually
identifiable information."  R-32.  The Report\'s logic is that TIA is
"simply a
tool for more efficiently inquiring about data in the hands of
others," and this
concern about data quality "would exist regardless of the method
employed." 
R-32-33.  It\'s remarkable that the government can so easily ignore
the harm that
suspicion based on bad data might cause to people, given the problems
we already
see with "no-fly" and other watchlists.

</p>
</li>
<li><h4>Civil liberties and TIA</h4>
<p>
The Report defines civil liberties as "relat[ing] primarily to the
protection of
the individual\'s constitutional rights to, among others, freedom of
expression,
freedom of the press and assembly, freedom of religion, interstate
travel, equal
protection, and due process of law."  R-27.  But it says nothing
meaningful
about how implementing TIA might affect these civil liberties, even
though some
impacts are pretty obvious.
</p>

<p>
We noted above, for instance, that the Report recognized that TIA\'s
human
identification tools raised privacy issues.  But they raise obvious
civil
liberties issues as well.  ARM (Activity, Recognition, and Monitoring) is
intended to
improve the ability to interpret crowd behavior.  In conjunction with
NGFR
(Next-Generation Face Recognition) and HumanID, the ability to monitor
political
demonstrations, religious assemblies, and gatherings of all kinds will
be
enhanced.  We don\'t even have to add in the other tools -- the
potential for
chilling effects on protected expressional activity is clear.
</p>

<p>
Even without TIA, we\'ve had hints of the problems.  One example: an
FBI
database, the Violent Gang and Terrorist Organization File (VGTOF), is
expanding.  In 1995 VGTOF was mainly used to track violent urban
street gangs;
today, it includes categories like "anarchists," "militia," "white
supremacist,"
"black extremist," "animal rights extremist," "environmental
extremist,"
"radical Islamic extremist," and "European origin extremist."   And of
course,
data accuracy is a problem here.  The Denver police department had for
years
been keeping secret files on political activists such as the American
Friends
Service Committee, a Quaker peace-activist group, and the pro-gun
lobby.  Last
summer, when a man listed in the Denver files as a gun-rights group
member got
into a fender-bender, a police officer checking VGTOF found him
described as "a
member of a terrorist organization" and part of a "militia."
According to a
Denver police memo, the officer reported the stop to the FBI as a
"terrorist
contact."  The Denver police and the FBI decline to comment on how the
man ended
up in VGTOF.
</p>

<p>
We have no good information about how many mistakes are in these
databases; we
should be especially concerned by their reliance on inherently fuzzy
concepts
like "extremist."  And yet only recently the Justice Department
exempted the
FBI\'s National Crime Information Center (NCIC) database, which
provides over
80,000 law enforcement agencies with access to data on wanted persons,
missing
persons, gang members, stolen cars, boats, and other information, from
the
Privacy Act requirements of accuracy, relevance, timeliness, and
completeness.  
Why?   Because "it is impossible to determine in advance what
information is
accurate, relevant, timely and complete."
</p>
</li>
<li><h4>The Report ignores how deploying TIA might expand
surveillance</h4>
<p>
Finally, the Report is almost silent on how the very existence and use
of TIA
might cause mission creep, source creep, and so on.  There are hints
-- the
Report recognizes that human identification tools might be used "to
justify
longer retention of[] stored surveillance tapes of public places."
R-35.  Here
the Report implicitly recognizes that technology not only can make
surveillance
practices more efficient, but can also expand their range or scope.
</p>

<p>
Elsewhere, however, the Report is blind to this dynamic.  For
instance, the
Report finds that because TIA "take[s] the data as it finds it" in
private
databases, TIA does not pose the privacy concern that "parties whose
databases
would be queried [would] begin collecting data that they do not
already
collect."  R-32.  But it is just as plausible that once TIA begins
using private
databases, there will be political, legal or social pressure for
private parties
to collect more information or to store it longer.  The obvious
historical
precedent is the bank records retention requirements of the Bank
Secrecy Act
made infamous in California Bankers Association v. Shultz.  Even
without TIA,
there has been talk of requiring ISPs to retain records of their
subscribers\'
Internet use.
</p>
</li>
</ol>
</li>
<li><h3>Conclusion</h3>
<p>
EFF\'s criticisms may seem unfairly harsh.  Congress certainly did not
expect
DARPA to produce a rigorous dissertation on privacy and civil
liberties. 
Nevertheless, we are disappointed by the lack of concrete discussion.
In our
experience, researchers usually think a great deal about how their
work might be
used, and often have a better idea of their work\'s implications than
do
outsiders.  EFF hoped, perhaps vainly, that some of that concrete
thinking about
TIA\'s implications would be revealed in the Report.  Instead, the
Report is
largely content to speak in broad and vague terms about what TIA may
accomplish
and how the privacy and civil liberties concerns might be addressed if
everything works.  
</p>
</li>
</ol>
</div>
';

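// $REQUEST_URI is used as the Smarty cache id; fall back to $_SERVER when
// it has not been set (register_globals is unavailable in modern PHP).
if (!isset($REQUEST_URI)) {
    $REQUEST_URI = isset($_SERVER['REQUEST_URI']) ? $_SERVER['REQUEST_URI'] : '';
}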
$smarty->assign('content',$content);
$smarty->display('generic.tpl',$REQUEST_URI);

?>