First Person • DATA

The Dawn of the Age of Data: Science for Social Good at IBM

With a companywide emphasis on science and engineering, IBM’s labs and working groups have proven fertile ground for humanitarian and socially conscious projects.
Chris Mohney

Few companies survive for more than a hundred years, and even among those few, not many have changed as much over their lifetimes as IBM. With corporate history stretching back to the days of manufacturing time clocks and cash registers, up through the era of mainframes and personal computers, IBM is now a pure technology company. But perhaps just as important as the tech is the science that powers it, as well as the extremely deep reservoir of global talent that drives IBM’s research and development.

Like many large corporations, IBM has a tradition of philanthropy and cause-based community work. But lately, a new variety of much more direct engagement with the humanitarian world has taken root in the form of its Science for Social Good initiative. Cofounded by IBM researchers Saška Mojsilović and Kush Varshney, the program partners research fellows with nongovernmental organizations to do real science in support of addressing major societal problems. 

The Science for Social Good team has tackled a wide array of challenges so far, including creating algorithms that help hunt cognitive disease, incorporating natural structures into human designs, tracking the potential progression of online hate speech to offline violence, adding a moral dimension to digital agents in video games, and teaching a neural network to ignore skin color when interacting with humans. For their next annual collection of projects, Mojsilović and Varshney intend to focus on the themes of fairness, bias, and discrimination in artificial intelligence and machine learning. They spoke with Overture about the origins, progress, and future of Science for Social Good.

SAŠKA MOJSILOVIĆ: My PhD was in 1997, and it was in electrical engineering. I was actually in computer vision, so I was trying to do image recognition and all the things that people are talking about so much today. I came to Bell Labs in 1998 during the dot-com bubble, and it was glorious—Bell Labs, Lucent Technologies—totally the rock-star experience. But then the market crashed. I just happened to talk to folks from IBM at a conference, and they said we have a great thing and a great group that matches your background. So I came to IBM in 2000, and [I’ve] stayed ever since. The funny thing was, no one cared about computer vision back then. Who wants to teach computers to see? So I decided with my skills, I could go to IBM’s math department. They were doing some interesting work around helping the company make decisions, modeling, and stuff like that. And that’s essentially how we created the whole business analytics play for IBM, by working on internal problems—like predicting which clients will buy something, which clients might terminate their engagements, and a whole bunch of other things.

KUSH VARSHNEY: I did my PhD in electrical engineering and computer science at MIT in 2010, and at the time I was just applying to a lot of different research labs. Saška saw my résumé and invited me to come to an interview at IBM, and it just felt like a good place to be doing a mix of applied research, as well as theoretical machine learning research. It just felt like a place to do some groundbreaking stuff that interacts with society and can really make an impact on the world. At that point, I didn’t even know the word “analytics”—the group was known as the business analytics and mathematical sciences department. I was coming in blind and didn’t realize that machine learning tools and techniques were being used in all sorts of human domains. 

SM: Around the early 2000s, data became more available. As enterprises started to collect more data, and most of that was really structured or enterprise resource planning system–type data, we had the opportunity to use math and modeling and analytics and machine learning on that data. What IBM did, and where we largely played a role, was to illuminate the minds of people in the industry with respect to what is really possible when you collect that data. For example, we were able to predict which sales opportunities were more likely to happen, so that you could direct sellers to work on them, or which clients might change their minds so that you can efficiently direct your efforts. We’ve done a lot of very interesting work that today might not seem so innovative, but back then it was. 

KV: In my opinion, the tools actually have not gotten so much better as compared to 10 or 15 years ago. I think the data is what’s driving a lot of the progress that we’re seeing. Neural networks—which are the rage right now—have been known for more than 20 years. Of course, there have been improvements, but the real difference is that there’s so much more data. 

SM: In terms of neural nets and deep learning, as we acquired more computational power, that allowed people to create really complex models. Coupled with these massive data sets, it created the kind of accuracy that was in some applications superhuman. Algorithmically, though, most of these things we knew already. 

KV: They’ve gotten better, but it’s not like there’s been some completely new paradigm. We’ve made a lot of progress based on the massive data sets, the computation, and algorithms to do some things that are of superhuman capability. That doesn’t mean everything is now at a point where it’s really surpassing human abilities. There are many spectacular failures of machine learning and artificial intelligence. 

SM: It’s natural to see these evolutions of different applications. Typically you see the applications first in a commercial space or consumer space, like mobile phones, and apps, and things that help you shop or find a restaurant, and then the enterprise-level applications. Well, we’ve come up with all these great apps and solutions for somewhat trivial things, but we should also really be thinking about bigger problems.

KV: Through IBM, we could see that if a company puts its mind to doing something for humanity, then there’s a lot that can happen. We started to think about how we could utilize the fact that we have this whole building and these global labs full of really smart people to work on some projects that affect humanity in good ways. 

IBM RESEARCHERS SUPPORT SOCIAL-GOOD FELLOWS WORKING IN A WIDE VARIETY OF FIELDS AND CAUSES, INCLUDING ILLITERACY, HATE SPEECH, AND THE OPIOID CRISIS.

SM: It’s very hard to say, hey, I want to solve the problems of this world by asking people to volunteer a little bit of their time. It’s just not the way it should happen. We have access to probably three-thousand-plus scientists around the world, and also many other resources in IBM. Is there a way we can scale it better? And by doing these projects, we raise awareness that these kinds of problems really need to be solved. They are massively complex. In a way, they’re much more complex than building a product. 

KV: We’re conducting scientific research. We’re trying to do things that have never been done before, which is different than a lot of the other corporate social responsibility programs that use existing assets and technologies. When we partner with an NGO, we see it as a joint research arrangement. We have certain expertise, the NGO has certain expertise, and together we are doing research to solve problems that have never been solved before. I think that’s unique, and it’s not something you can point to as happening in other initiatives. 

SM: When you open your eyes to these problems, they inspire your scientific mind in a way that does not often happen. Because they’re multidisciplinary, you learn new things, and by forcing yourself to figure out where the challenges are and how they can be addressed, we come up with really phenomenal inspiration. The purpose of this program for us was to give back with science, but also to alert people that these kinds of problems need new solutions, and these solutions might not happen in the kind of state of the art when you’re working on something that is very well defined. 

KV: At IBM Research, we’ve always had this mode of operation where we work with clients and have a semiconsulting relationship to work across all sorts of industries. We have experience with encountering new problems that we’ve never worked on, and we know how to formulate problems that are not very well defined. If you are a tech company focused solely on your own products, then you don’t necessarily have the full scope of those sorts of experiences to draw on. 

SM: The humanitarian problems everyone is struggling with require attention from many different stakeholders. We need technology companies or skills like ours to create these solutions. But you first need to work with people who are fighting these battles on the front lines, like NGOs or public-sector agencies, to really understand what is needed. And we also need investments to create the solutions. Sometimes making an algorithm or creating a solution may not be that costly, but implementing it, and scaling it, and following through and operationalizing and supporting it—these are massive investments. And I don’t think that we have completely figured out what is the right business model to create and operationalize these kinds of technologies. 

KV: The way technology has evolved is to be more interactive and more integrated with society. If you looked at what IBM Research was doing 30 years ago, like physics research or designing computers, that didn’t necessarily need back-and-forth with people outside the company. As times have changed, data science has been the point at which the fully technical solutions and the societal things interact. It seems natural that where the societal aspect is coming in, we should also be looking at doing things for societal good.

SM: Artificial intelligence, data, and algorithms are really permeating everything around us. There are so many opportunities to apply them to good use. Chances are that if you have good data on a certain problem, you probably have lots of opportunities to address that problem in ways that have not been done before.

KV: When I first started working here, I didn’t realize that a lot of problems could be approached mathematically and in terms of data. If you asked someone 10 years ago, “What is data,” all it meant was a spreadsheet of numbers. But now people recognize that images are data, text is data … 

SM: Data culture. I’m a data junkie. I really believe that if we have data, you can do anything. It’s like my religion. We’ve been looking into using data to inform policy decisions. For example, should I, as a country, make this or that kind of investment? If I make this kind of an investment, will the healthcare or health outcomes of rural populations improve? Or we’ve been looking into the opioid problem—if we change how opioids are prescribed, or if we create benchmarks or guidelines, will it improve outcomes? Technologically and algorithmically, these are very difficult analyses. We have very beautiful solutions and algorithms for image classification. But for these kinds of causal models or inferring these kinds of relationships, we don’t yet have a very good algorithmic foundation. I think this involves next-generation data culture. 

KV: I’d never judge a data set just by looking at the data set itself. It’s all about understanding the context and the people behind it. We want to start with the problem: understanding the issues faced by the people in the NGO or the public sector. Then we ask, is there data available to try to answer these questions? I don’t view data as beautiful, unless….

SM: … unless the story is beautiful. Data tells stories. We try to pick projects that will somehow illuminate people’s minds with respect to the opportunity, or that will teach them something, that will expand someone’s horizons. Many of these problems are big, and we don’t want to claim that we are the ones solving them. But you maybe have a proof of concept that someone else will pick up. Or maybe just by doing this, you inspire somebody else to continue or do something similar, and then that thing propagates. This is the idea of really illustrating what’s possible—or what people cannot really imagine is possible until you tell them.