Now that we know about data mining, let's put it in context. Data mining sits between two activities that can have serious policy implications: data gathering and decision making. I will use the concepts in this diagram, and the diagram itself, to organize the rest of the talk.

Data gathering provides the data to which data mining algorithms are applied. One of the most contentious recent issues in data gathering is database matching, in which multiple, previously separate databases are integrated into a single database for analysis. This process would use the database integration or data fusion technologies discussed on the previous slide. Matching can raise serious privacy concerns because keeping databases separate provides a type of de facto privacy, and database matching removes this protection. Another obvious issue is new data collection. Any new collection of data raises privacy issues, as it did in the OTA study I mentioned previously.
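
To make database matching concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the two record sets, the field names, the shared `person_id` key, and the `match_databases` helper are all hypothetical, and real integration or data fusion systems are far more involved. The point is only that each database on its own reveals little, while the matched record combines facts that were previously kept apart.

```python
# Hypothetical records; every name and field here is invented for illustration.
travel_db = [
    {"person_id": 1, "destination": "Lisbon"},
    {"person_id": 2, "destination": "Oslo"},
]
purchase_db = [
    {"person_id": 1, "item": "camera"},
    {"person_id": 3, "item": "laptop"},
]


def match_databases(left, right, key):
    """Join two lists of records on a shared key (a simple inner join)."""
    # Index the right-hand records by key so each lookup is a dict access.
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)

    matched = []
    for row in left:
        # Only people who appear in both databases produce a combined record.
        for other in index.get(row[key], []):
            matched.append({**row, **other})
    return matched


combined = match_databases(travel_db, purchase_db, "person_id")
print(combined)
# [{'person_id': 1, 'destination': 'Lisbon', 'item': 'camera'}]
```

Only person 1 appears in both sources, so only that person ends up with a combined profile. That combined record is exactly what the separation of the databases had been preventing.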

Decision making is the process that data mining hopes to inform by providing useful knowledge. Of course, decision making, particularly in counter-terrorism investigations, raises policy issues of its own. One of the most common is profiling: the potential for investigation and arrest decisions to be based on characteristics such as race, ethnicity, or gender, rather than on deeper, more meaningful indicators. Another important issue is the prevalence of false positives, where an entirely innocent individual or group is targeted for investigation because of poor decision making.

Clearly, the policy implications of how data gathering and decision making are carried out deserve serious attention. That said, the approaches taken to these two processes are largely separate from the use of data mining techniques. As I will discuss in a few minutes, the decision to use data mining techniques places relatively few constraints on the size or level of integration of data collection, and it does not require particular types of decision making.