Wing herself never intended to study computer science. In the mid-1970s, she entered MIT to pursue electrical engineering, inspired by her father, a professor in that field. When she discovered her interest in computer science, she called him up to ask if it was a passing fad. After all, the field didn’t even have textbooks. He assured her that it wasn’t. Wing switched majors and never looked back.
Formerly corporate vice president of Microsoft Research and now executive vice president for research at Columbia University, Wing is a leader in promoting data science in multiple disciplines.
Anil Ananthaswamy recently asked Wing about her ambitious agenda to promote “trustworthy AI,” one of 10 research challenges she has identified in her effort to make AI systems fairer and less biased.
Q: Would you say that there’s a transformation afoot in the way computation is done?
A: Absolutely. Moore’s Law carried us a long way. We knew we were going to hit the ceiling for Moore’s Law, [so] parallel computing came into prominence. But the phase shift was cloud computing. Original distributed file systems were a kind of baby cloud computing, where your files weren’t local to your machine; they were somewhere else on the server. Cloud computing takes that and amplifies it even more, where the data is not near you; the compute is not near you.
The next shift is about data. For the longest time, we fixated on cycles, making things work faster—the processors, CPUs, GPUs, and more parallel servers. We ignored the data part. Now we have to fixate on data.
Q: That’s the domain of data science. How would you define it? What are the challenges of using the data?
A: I have a very succinct definition. Data science is the study of extracting value from data.
You can’t just give me a bunch of raw data and I push a button and the value comes out. It starts with collecting, processing, storing, managing, analyzing, and visualizing the data, and then interpreting the results. I call it the data life cycle. Every step in that cycle is a lot of work.
Q: When you’re using big data, concerns often crop up about privacy, security, fairness, and bias. How does one address these problems, especially in AI?
A: I have this new research agenda I’m promoting. I call it trustworthy AI, inspired by the decades of progress we made in trustworthy computing. By trustworthiness, we usually mean security, reliability, availability, privacy, and usability. Over the past two decades, we’ve made a lot of progress. We have formal methods that can assure the correctness of a piece of code; we have security protocols that increase the security of a particular system. And we have certain notions of privacy that are formalized.
Trustworthy AI ups the ante in two ways. All of a sudden, we’re talking about robustness and fairness—robustness meaning if you perturb the input, the output is not perturbed by very much. And we’re talking about interpretability. These are things we never used to talk about when we talked about computing.
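Wing's working definition of robustness (perturb the input a little and the output should not move much) can be made concrete with a small experiment. The sketch below is only an illustration under stated assumptions, not a method Wing describes: `toy_model`, its weights, and the helper `max_output_change` are hypothetical names invented here, and random sampling only estimates sensitivity rather than proving a bound.

```python
import random


def max_output_change(model, x, epsilon=0.01, trials=1000, seed=0):
    """Return the largest output change observed when every input
    feature is nudged by at most epsilon.

    This samples random perturbations, so it gives an empirical
    estimate of sensitivity, not a formal guarantee.
    """
    rng = random.Random(seed)
    baseline = model(x)
    worst = 0.0
    for _ in range(trials):
        perturbed = [xi + rng.uniform(-epsilon, epsilon) for xi in x]
        worst = max(worst, abs(model(perturbed) - baseline))
    return worst


def toy_model(x):
    # Hypothetical stand-in for a trained model: a fixed linear score.
    weights = [0.5, -1.2, 2.0]
    return sum(w * xi for w, xi in zip(weights, x))


# For this linear model the change can never exceed
# epsilon * sum(|w|) = 0.01 * 3.7 = 0.037, so it is robust in Wing's sense.
change = max_output_change(toy_model, [1.0, 2.0, 3.0], epsilon=0.01)
print(f"largest observed output change: {change:.4f}")
```

A model for which the observed change is large relative to epsilon would count as fragile under this informal test; for real systems the harder question is how to bound that change for all small perturbations rather than a random sample of them.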