INVITED SESSIONS

Statistics and Information Technology
Organizer: Alan Karr (karr@niss.org)
National Institute of Statistical Sciences

Description:
The 1999 report of the President's Information Technology Advisory Committee (PITAC), available online at www.ccic.gov/ac/report/, pointedly notes that the future of our society depends heavily on information technology (IT). The report presents a compelling case that the economic and social benefits of IT cannot be realized without massive, continuing investment in IT research and development. The research agenda advanced by PITAC has four principal themes that address both the future information infrastructure and the people who use it:

  • Software that is reliable, adaptable, and predictable;
  • Data networks that are powerful, flexible, and scalable;
  • High-end computing systems for researchers and industry; and
  • Socio-economic impacts of IT, especially on commerce, learning, government and the workforce.

This agenda clearly calls for important research in computer science and social science. Both this research and the problems that underlie it are driven by data, models, and evaluation; that is, they are inherently statistical. However, the implications for the discipline of statistics are only beginning to be articulated. Some implications seem clear. New statistical methods that work in the staggeringly complex settings of the future must be created, in collaboration with the other disciplines. Scalable statistical techniques are needed to cope with vast amounts of data of disparate types (for example, streaming video) and qualities. Science and policy equally demand methodology that merges, combines, and assimilates data from laboratory experiments, observational studies, and numerical (computer) models. Comprehensible presentation and visualization of results to multiple audiences are crucial.

Format:
The session will consist of three presentations illustrating some of the research needs and opportunities for statistics related to information technology.

Participants:
Todd L. Graves (presentation, How Should We Publish Data Analyses in the Web Age?)
Todd is a Technical Staff Member of the Statistical Sciences Group at the Los Alamos National Laboratory. He holds a Ph.D. and an M.S. in Statistics from Stanford University and a B.S. in Statistics and Probability from Michigan State University. After finishing school, he served as a postdoctoral fellow of the National Institute of Statistical Sciences, performing research on software engineering and transportation. Additional current interests include design and analysis of computer experiments, massive data set analysis, and how data analytic papers and journals can take advantage of the capabilities of the web.

Ashish Sanil (presentation, Geographic Aggregation Procedures for Data Disclosure Limitation)
Ashish Sanil has B.Sc. and M.Sc. degrees in Mathematics from the Indian Institute of Technology, Kharagpur, India, as well as an M.S. in Social and Decision Sciences and an M.S. and Ph.D. in Statistics from Carnegie Mellon University, Pittsburgh, Pennsylvania. He has been a Junior Fellow at the National Institute of Statistical Sciences since 1998, where he has been designing prototype Web-based query systems that allow users to conduct statistical analyses on selected databases subject to confidentiality restrictions. His research interests involve computationally intensive applications of statistics, specifically confidentiality and data disclosure issues, nonparametric regression, and the analysis of large datasets.

Nandini Raghavan (presentation, Detecting Defection: Mining Massive Online Data to Model ISP Customer Churn)
Nandini Raghavan currently has a visiting appointment in the Statistics group at AT&T Labs-Research. She was a Visiting Fellow at the National Institute of Statistical Sciences (NISS) in North Carolina from Sept. '98 to Apr. '99 and was on the faculty of the Department of Statistics at Ohio State University from '93 to '99. Her current research interests are in scaling statistical methodologies for massive datasets, with applications to packet data modeling and analysis. Other interests include spatial statistics, Bayesian inference, and nonparametric function estimation.

 
