Collection of internet traffic data at the gates of a large Enterprise necessarily involves data cleaning, integration, selection, and transformation, especially if data streaming strategies are employed. The huge quantity of packets that typically crosses the Enterprise gateway (some 1½ to 2½ gigabytes of IP traffic an hour here at George Mason University) makes multiple passes through the data cost-prohibitive. Data cleansing, customarily perceived as the removal of noise and inconsistent data, is seen here instead as a flagging and tagging procedure that facilitates detection of malformed or corrupted IP packets associated with malicious intrusion, or with subtle reconnaissance activity that serves as a precursor to a massive attack on the Enterprise computing infrastructure. Real-time or near real-time implementation of data analysis comprising such innovative concepts as data streaming or evolutionary graphics requires fast in-line data cleansing and preparation. This paper discusses and illustrates the strategies we have incorporated into our data collection and analysis effort to identify and correlate anomalous activities that point to underlying attack strategies against the Enterprise or to the theft of its knowledge base.
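The flag-and-tag view of cleansing described above can be illustrated with a minimal single-pass sketch. It is not the implementation used in this work: the function names `tag_packet` and `tag_stream`, the tag strings, and the three header checks are hypothetical choices made only for illustration, and the sketch assumes each packet arrives as the raw bytes of one IPv4 datagram. The point it shows is that suspect packets are kept and annotated rather than discarded, and that each packet is touched exactly once.

```python
import struct
from typing import Iterable, Iterator, List, Tuple

MIN_IPV4_HEADER = 20  # bytes in an IPv4 header that carries no options


def tag_packet(raw: bytes) -> List[str]:
    """Return anomaly tags for one raw IPv4 packet (empty list if none).

    The tag names are hypothetical labels chosen for this sketch.
    """
    if len(raw) < MIN_IPV4_HEADER:
        # Too short to hold a header at all; nothing else can be checked.
        return ["truncated_header"]

    tags: List[str] = []
    # First four header bytes: version/IHL, type of service, total length.
    version_ihl, _tos, total_length = struct.unpack("!BBH", raw[:4])
    version = version_ihl >> 4
    header_bytes = (version_ihl & 0x0F) * 4  # IHL counts 32-bit words

    if version != 4:
        tags.append("bad_version")
    if header_bytes < MIN_IPV4_HEADER:
        tags.append("bad_header_length")
    if total_length != len(raw):
        tags.append("length_mismatch")
    return tags


def tag_stream(packets: Iterable[bytes]) -> Iterator[Tuple[bytes, List[str]]]:
    """Single pass over the stream: every packet is kept and paired with its tags."""
    for raw in packets:
        yield raw, tag_packet(raw)
```

Because `tag_stream` is a generator, it can sit in-line between capture and analysis without buffering the traffic, and downstream stages can filter or correlate on the tags instead of re-reading the data.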