The Police Data and the Data Driven Justice Initiatives

The Police Data Initiative and the Data Driven Justice Initiative

In this episode I speak with Clarence Wardell and Kelly Jin about their mutual service as part of the White House’s Police Data Initiative and Data Driven Justice Initiative respectively.

The Police Data Initiative was organized to use open data to increase transparency and community trust as well as to help police agencies use data for internal accountability. The PDI emerged from recommendations made by the Task Force on 21st Century Policing.

The Data Driven Justice Initiative was organized to help city, county, and state governments use data-driven strategies to help low-level offenders with mental illness get directed to the right services rather than into the criminal justice system.

Clarence Wardell


Clarence Wardell III is a member of the U.S. Digital Service team at the White House, and was previously a Presidential Innovation Fellow. As a fellow he worked with the U.S. Department of Energy and the White House on open data initiatives. He co-organized and led the White House Police Data Initiative, which is the topic I invited him here to discuss today.

Kelly Jin


Kelly Jin serves as policy advisor to the U.S. Chief Technology Officer and the Chief Data Scientist in the White House Office of Science and Technology Policy. Previously, Kelly served as Citywide Analytics Manager at the City of Boston and helped build and co-led Boston's Analytics Team. In her current role, she's helping to scale the President's Data-Driven Justice (DDJ) initiative, which I've invited her here to discuss today.


The Police Data Initiative and the Data Driven Justice Initiative

INTRO: Data Skeptic features interviews with experts on topics related to data science, all through the eye of scientific skepticism.

HOST: In today's episode, we have two related interviews; first, with Clarence Wardell of the US Digital Services Team at the White House and secondly, Kelly Jin from the White House Office of Science and Technology Policy.

Clarence Wardell III is a member of the US Digital Services Team at the White House and was previously a Presidential Innovation Fellow. As a fellow, he worked with the US Department of Energy and the White House on open data initiatives. He co-organized and led the White House Police Data Initiative, which is the topic I invited him here to discuss today. Clarence, welcome to Data Skeptic.

CLARENCE: Hi Kyle. Thanks for having me.

HOST: So I don't want to take for granted that my listeners all know exactly what the Police Data Initiative is. Could you give me an overview of what the president's vision was when he put the program together?

CLARENCE: This is one of the President's efforts during the second term where he has really put a lot of energy behind trying to get more technologists, entrepreneurs, engineers, et cetera, to come into government to use their talents to serve. The Presidential Innovation Fellowship was one of the first programs started in that vein.

So, I was part of the third cohort of fellows. We started our tour of duty, if you will, in September 2014. It just so happened to coincide for me where that was almost a month to the day after Michael Brown was killed in Ferguson, Missouri.

So that whole fall, as I'm sure many of your listeners will remember, it was not just Ferguson but several other tragic events that led to a lot of protests and a lot of conversation nationally about the relationship between police and citizens and particularly minority communities.

As part of that conversation, folks began looking to the White House to really put forth some answers or pathways to how we address some of these issues. One of the things that the president did, to start to get a handle on what were some of the big issues that were out there and how we can move together as a community, right?

So both kinds of the activist community, community organizers, as well as the law enforcement community and government [can] start to move forward to solve some of these issues. So the president convened what was called his Task Force on 21st Century Policing, which was a body that really consisted, as I said, of law enforcement, activists, lawyers, civil rights folks who put together eventually a series of hearings around the country.

They heard from folks in the community and then put together a list of recommendations that basically said that these are the things that a modern 21st century police department needs to start doing to really kind of live up to what the president would hope to be a more trustful relationship between police and citizens.

On top of all of that effort that was going on, there was this national theme, national conversation. Again, I'm sure your listeners will remember. That started to focus in on this particular issue of data and the lack of data, particularly around officer-involved shootings. It's kind of one key component of it. It wasn't that data in and of itself was going to stop an unarmed man from being killed by police officers or through relationships by and large. But it was - people saw that as a really tangible starting point where it was like, hey, if we can't even understand the nature of the problem, how many people are being killed by police officers in this country on a yearly, weekly, monthly basis, we can't even really begin to have an informed dialogue around a lot of this.

So folks started looking to the White House to say, "Well, why can't you guys mandate that police departments release this type of data?" et cetera, et cetera... not really understanding that the White House doesn't have that power. We have a very highly federated law enforcement system in this country. The FBI does some of this data collection, but the White House isn't really in that business.

But what we could do and what the White House is good at doing is something that US CTO Megan Smith refers to as scouting and scaling. So it's this notion that if you're trying to solve a problem and you're trying to think through it on your own, you should do a scan of the environment first because it's likely that someone somewhere has either solved the problem or is starting to try to solve the problem.

So the Police Data Initiative was really born kind of out of that, right? So it's this confluence of this notion of scouting and scaling and also kind of this - you know, how do we start to chip away at this problem that has just become really a national issue. So it was myself, Denice Ross, Lynn Overmann - Denice was another Presidential Innovation Fellow - folks at Domestic Policy Council within the CTO, the Chief Data Scientist's office.

We started to have these conversations around these issues and what we saw as we looked around the country, we saw police departments - you know, the skepticism about whether or not police departments would release this type of data. But we saw departments already doing it, right?

So for instance, we saw the Dallas Police Department Chief Brown down there who had released in December of 2014 twelve years of officer-involved shooting data at a very disaggregated level, right? We looked and saw Montgomery County, Maryland doing something similar around traffic stops. The auto police department had recently held a hackathon working with members of the community, to try to figure out how they can read body cam data to better share that.

So what we thought in some sense was this is great that these departments are trying to move this forward on their own. But given a lot of the expertise that we have and that we have developed particularly through the President's push around open data at the federal level, as well as the burgeoning open data movement at the local level - that a lot of times police departments want a part of - we said, "How about we start a community of practice around this work?"

Could we get commitments from departments around the country to release data around police citizen interactions and so from my perspective, it was important not just that departments are releasing data around officer-involved shootings but that we start to create a more holistic picture of what does policing look like in our community. For us, that's not just data around officer-involved shootings, but it's use of force. It's traffic and pedestrian stops, uniformed citations issued. It's quite frankly meetings attended by officers in the community, assaults on officers, dispatch calls, 911, 311 calls, any and everything that speaks to what this interaction looks like in communities.

It just so happens that of the 59 recommendations of the task force report, 14 dealt in some way or another with this notion of increased transparency and better use of data and technology as a means to improve the relationship between police and citizens was viewed as critical by the task force in moving the ball forward. So those threads came together. Within that stew, the Police Data Initiative was born.

If any of your listeners have kind of a passive knowledge of law enforcement and data sharing previously, prior to this point in time, there wasn't really much that was happening in that space other than departments sharing crime rates. Even at that point, we still don't have a really great handle on crime in this country; in the data collection there.

So it was a little bit out of the box thinking and a little bit audacious to think that departments would start voluntarily sharing information that quite frankly exposes them a bit and speaks to accountability in some respects.

So again, things like complaints against officers. But what we saw at this moment was a cadre of departments who were willing to do this work and who believed this to be fundamental to the relationship and trust building with the constituents and the folks that they serve.

So we were hopeful that we would get maybe 10 departments committed upfront and as I talk to you today, we have 131 who are committed over the span of essentially a year and a half. It has been really exciting to work with a lot of departments and I can go on a little bit more. I'm sure this will come up in our conversation a little bit more, about exactly what the work of the Police Data Initiative looks like.

HOST: Yeah. So you had mentioned those 59 recommendations. I will have some links in the show notes for anyone that wants to go through the whole list. I don't know that we have time for each of them here. But if you could pick a few, maybe highlight some of the most important ones or most impactful ones.

CLARENCE: So there are some that actually have nothing to do with data and tech as you might expect. So quite a bit is focused on training; increasing training for officers, things looking at implicit bias training. So how do we make officers and really all of us cognizant of our own biases? How does that then translate to our actions in the day-to-day work environment?

It just so happens that in law enforcement, those decisions, and those biases can often have life or death consequences. So that was viewed as important by the task force. So there were explicit recommendations that were around kind of this data and tech piece. It was really just these broadest things like use the data to be more transparent and to start to give the community insight into your processes, your policies, things of that nature.

Obviously, from the technology piece, body-worn cameras were a big one, right? We've seen the president talk about that and we've seen DOJ grapple with a lot of the issues that come up around body-worn cameras. I think because video was such a strong component of a lot of the movement that we've seen since 2014, a lot of folks latched on to body-worn cameras, which in and of itself has a data component too, right? And a much trickier one quite frankly than some of the more kind of traditional data sets that we've been focusing on through the Police Data Initiative and a lot of obviously privacy issues that one would have to take into consideration there.

Those are just a few - a little bit of flavor of the recommendations. But as you said, I would encourage your listeners to go and look at some of those recommendations that were put forth by the task force before. And as I said, at the broad level, there were about 14 that kind of touched in different ways on this increased transparency through data and technology as a means to begin to build trust and strengthen those relationships.

We've really taken that to heart and really kind of used those recommendations as our guiding light if you will for the Police Data Initiative work in part because the power of those recommendations, they didn't come from any one person, right? They didn't come from the president. They didn't come from our office. They didn't come from DOJ. It really came from kind of a consensus-driven organization group which was this task force which then received input from a broad spectrum of other opinions, which I think makes it really powerful and we've seen them really move the ball forward across a whole host of issues since they became final in May of 2015.

HOST: There has been in my opinion a really wide acceptance in government of being more progressive about open data. But I could definitely sympathize where there might be some pushback at the local level. Did you see any of that? Are there concerns about privacy for the officer or people involved? How does that roll out when you have such - you know, varying departments?

CLARENCE: We haven't seen a ton of pushback I would say. There is certainly I would say a spectrum of skepticism from some departments and I say this - quite frankly, it has worked like any startup would work or any new program or initiative, right? You're going to have your early adopters, some of your true believers, your folks who either believe and love your product or just believe that things have to change and that we should experiment with some new ways around transparency and openness. Really, that's what we had early on when the president announced this initiative formally in May 2015. It actually coincided with the finalization of the task force recommendations.

Now at that point in time, we had 22 jurisdictions who had signed on, who had - and I should say for your listeners, signing on meant that we had buy-in from the chief, buy-in from the mayor's office to release at least three data sets around police instances and interactions in an open data format.

Then what we had is that they designated a point of contact or a quarterback who's going to be responsible for shepherding that work forward. So what we have done since then, since that time is we've held essentially weekly or biweekly standup phone calls, 30-minute phone calls with the jurisdictions or round-robin just as any kind of product team would do. You know, on successes, blockers, challenges. We found that to be a good not only just check-in mechanism from our perspective, just to make sure departments are moving forward and that there are places that we can continue to help.

We've really seen kind of that cross-sharing and the confidence building that comes from that and then those folks being able to kind of talk about this work with other departments who might be on the fence, right?

So there was definitely some - a bit of concern, right? There's always the chief's perspective about issues of officer privacy. That may vary depending on the chief and the local laws. As I said, Chief Brown in Dallas in December 2014 had released officer names and very detailed information about officer-involved shootings.

I would say as a general matter of practice, the more transparent you can be as a department or as an organization, period, the more that lends to trust-building over time. But then there are issues where privacy really matters quite a bit, and these are things that have come up as a result of the initiative and I would say some of the work that we've been most proud of because we felt that this voluntary, non-mandated effort can help in this initial community, help raise a lot of these issues which we think we're starting to find some solutions for.

So these are things like victim privacy, quite frankly, domestic violence, sexual assault. So when you think about this fine line of how do you balance the ability to be accountable to those folks and to represent their stories and their incidents in the data, while at the same time being cognizant of those privacy issues, it's a hard one to walk. One of the things that we did is we've helped organize in Orlando, Florida with the chief of police there, Chief Mina, as well as a local non-profit group there that was focused on victim advocacy as well as partners within the Office of the Vice President here who focus on violence against women as well as partners at DOJ.

We basically held a data dive to specifically focus on these issues of sexual assault and victim privacy around the data. So we've seen those types of events. We've had others around the country focusing on different issues and how do you build those relationships that have kind of broken that ice, right? And kind of push back and get past some of that skepticism.

Like yeah, we know that this is not necessarily easy work. But it's necessary work at this point in time and that what we really want are as - a coalition who's willing to do that work to help us sort through some of that messiness.

So it has been an evolving process - the 22 moving to 131. There were certainly a lot of chiefs that I talked to along the way who got it but were a little bit skeptical. The best thing for that is quite frankly not to talk to Clarence but I was able to put them in contact with other chiefs who have been doing the work to really kind of lessen some of their concern about this.

One of the things that we tried to do upfront was to try to get a broad representation of different police departments. In that 131, we had departments like New York City Police Department, Philadelphia Police Department, police departments with lots of officers that serve a ton of people and they have a ton of resources versus we have police departments that serve populations of 3000 people and they have three or four officers.

The point that we've been trying to get across is that if you want to do this work, it may look different in different places. One city may have an open data portal that they have a vendor that they pay to do that work and another city may just be posting an Excel spreadsheet as a CSV file that folks can go and download.

That's OK. Really, what matters is that transparency. You don't have to have all the bells and whistles. That's really the point that we tried to get across to help with some of that skepticism or at least doubt, quite frankly, in some departments' minds that they can do this work.

HOST: In any data set, you're going to have outliers and strange anomalies and they always need to be cleaned. Anyone who works on data knows that. In Los Angeles here, there's a lot of great open data sets, I'm a mentor at a little boot camp called NewMet Data Science, and they were looking at some of the police data in the last iteration of the boot camp. They found this weird anomaly. They had to track down actually the - it looked like there was a spike in crime right at noon. Like, there was some sort of crime spree at lunchtime or something like that.

Ultimately, what we have seemed to figure out is that must either be a rounding or a default value or something like that. How do in general people engage or police departments get out metadata like that? How can they do better community engagement just to say like the things that the observations and rows don't tell the whole story about.

CLARENCE: That's a great point you bring up there and that's something that we've been intentionally conscious about or try to be conscious about as we move this initiative forward. So there are a couple of pieces there.

So one that you talked about is this notion of just the quality of the data. You know, quite frankly, these are a lot of new data sets that departments might have been collecting in some form or fashion for a while. But they weren't sharing it with the public, right? They weren't necessarily being used for any type of internal analysis. So they were just kind of there and the internal auditors or the internal investigator to go back, to check on a particular officer, who might have gotten in trouble at some point in the future or some other case, might have used them.

So this is the classic argument around open data, right? It's that once you kind of put it out there, the act of actually having to put your data forward, to expose it to some level of that scrutiny, it's going to cause you to be a little bit more conscious around A, how you're taking the data and then how you're putting it out there. But then also once you get those community eyes on it, right? You can quickly find things that you might - that might have just been in your blind spot.

So that has been one of the big pushes that we've had for departments and it's quite frankly one of the cautions that I've had for a lot of folks who just want to jump in and do analysis on the data, right? You raised an excellent point by doing some of this cleaning is that we're still at a point in time where the data collection systems have not been that great.

So I often caution folks around just kind of doing the analysis without really understanding how this data ended up in this system. Who collected it? When did it get there? Was it transcribed from someone's paper notes and then put in the system? Was it directly entered there? Things like that.
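As a rough illustration of the kind of check discussed above (a spike of incidents at exactly noon that turns out to be a default or rounded value rather than a real crime wave), here is a minimal Python sketch. It is not from the episode: the function name `flag_suspicious_times`, the `ratio` threshold, and the sample records are all hypothetical, and it assumes the incident timestamps have already been parsed into `datetime` objects.

```python
from collections import Counter
from datetime import datetime
from statistics import median


def flag_suspicious_times(timestamps, ratio=5.0):
    """Return (HH:MM, count) pairs that occur far more often than is typical.

    A time of day is flagged when its count exceeds `ratio` times the median
    count across all observed times. The threshold is an assumption chosen
    for illustration, not a standard.
    """
    counts = Counter(ts.strftime("%H:%M") for ts in timestamps)
    if not counts:
        return []
    typical = median(counts.values())
    return sorted((t, n) for t, n in counts.items() if n > ratio * typical)


# Hypothetical sample: a few incidents spread through each hour of the day,
# plus a block of records stamped exactly 12:00 (a likely default value).
records = [datetime(2016, 1, 1, h, m) for h in range(24) for m in (7, 23, 41)]
records += [datetime(2016, 1, 1, 12, 0)] * 30

print(flag_suspicious_times(records))
```

A flagged time is only a prompt to ask the department how the field is populated; it does not by itself show whether the value is a default, a rounding convention, or a real pattern.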

So one of the things in there as you brought up, that we've been trying to push is this notion of kind of these data-dive events. My colleague Denice has coined this kind of four-hour data dive. We're not necessarily doing any like this big, sophisticated analysis on this. But how do you use data as a convener? Bring the departments, the chiefs into the room, the patrol officers into the room, the city CIO and CTO and chief data officer into the room, along with those community groups, right?

How do you have a conversation? How do you put the data on the screen? You know, let the kids or whoever it is or your civic hackers, your local brigade play around with it. Ask questions and really start this dialogue. So one of the first events that Denice actually helped catalyze from her town of - she actually moved up to DC from New Orleans. We still have great relationships there with some of the local civic tech community.

So it was actually a really - we wrote about this on a Medium post that I could share with you and your listeners should definitely check that out. We did a data dive event with the local - a group there called Operation Spark, which brought kids from the local community there that they were trying to teach new tech and data skills.

One of the events that they actually held was essentially kind of this hackathon or data dive on 311 and policing data from their community. So you had kids who were learning about data extraction and cleaning on data that represented essentially their community.

Alongside them, they had the chief of police there. They had the city CIO there who came from the same community, as well as several of the officers.

There's actually a really cool picture there of a young woman who was actually teaching the police chief how to code at this event. I know it sounds a little bit cheesy. But those are the relationships that really matter, right?

If you can kind of remove this guard of like it's us versus them or it's the police and we can't talk to them, then we're not in a good place. You can break down those barriers and silos and actually have people having a conversation around quite frankly data as a convener, right? I don't want to discount the role of anecdote and people have lived experiences. But marrying that with empirical evidence, I think you can start to have a rich dialogue.

That's really for us what this has always been about is like how do you A, have this culture shift where we can move police departments to a place where this level of transparency is just part of a modern 21st century police force. But then also move it to the point where we're having dialogue and we're having these feedback loops between citizens, community and the police officers in the departments to really move the ball forward here.

HOST: So you had mentioned the participating departments had to release a minimum of three data sets. Were those just any appropriate data set or are there some standards we can see across all the participating departments?

CLARENCE: What we ask for is departments to commit to releasing at least three data sets. So many of those departments have already executed on those commitments. When I say 131 jurisdictions, that has also been on a rolling basis. We've had some departments who have been in that from the beginning, who most likely have already released those data sets and then some who have just joined within the past couple of months or so.

What it is is it's really a signal from departments to say, "We want to learn how to do this work and/or we might have already been in a good place and we have a robust, open data infrastructure and we can do this tomorrow." We've seen some of that as well, departments who could almost flip the switch and do it.

We didn't require departments to do it on any particular timeline. What we said is that you should be working towards it, right? We want this to be a sustainable, repeatable process and not just a rush to get out a one-time data set.

Ultimately and the thing that we have stressed is that this is not a commitment to the White House. This is a commitment to your community, right? So it's not about meeting the White House deadline. It's about being - like once you make the commitment, it's going to be the community that holds you accountable, right? So it's not necessarily you need to get this done within the next couple months...

That was kind of the same piece for this issue of standards, right? Across data sets comes up. So what we left at is police-citizen interaction data sets. When we think of that officer-involved shootings, use of force, traffic and pedestrian stops, police complaints, citizen complaints about police, attacks on/ assaults on officers, things of that nature. We've also seen new data sets that we haven't even thought of - being created such as like community engagement data sets, right? The departments trying to basically represent the work that they do in the communities. That isn't necessarily issuing citations or arresting people, right?

So [we're] trying to have again this holistic picture. So we actually didn't prescribe any standards across that, also recognizing that depending on what city that you're in, you may be bound by certain local or state laws where you can only release a certain amount of data whereas another city or department can release a lot more.

So we didn't want to constrict what departments can release on one level. But we also didn't want to set up this initial barrier to say if you can't meet the standard, then we don't want your data, right? We thought it would be a much more useful process to have departments grappling with the issue of how do we get something out.

If it's not up to par, right? The community will let you know and then at that point, we can have this process of really trying to evolve to a place where there's something that's satisfiable from the community's perspective as well as the department's perspective.

Then hopefully - and we've seen this, right? We're starting to see this with laws being passed in California as well as other states where they're now starting to mandate reporting of this type of traffic data or traffic stop data, right? And things like use of force and officer-involved shootings. It's starting to be mandated in state and local levels as well.

So what we are hoping is that quite frankly that something like - that the Police Data Initiative in name doesn't exist four, five years from now. But that this work that we've started now can actually help seed a lot of these efforts and initiatives that will happen maybe at the federal level. But we've seen a lot more movement at the state and local level that we hope this work can influence.

HOST: How can I, in my own community wherever that may be and how the technologies may vary, [get involved.] What are the best steps for me to find out about what's available or maybe attend one of these data-dives you were mentioning?

CLARENCE: Yeah, I would encourage your listeners to check out - they could go to and/or the and that's - it's the same site but it's hosted by the Police Foundation, which has been one of the kind of early supporters of the Police Data Initiative. They have essentially this kind of aggregate clearinghouse if you will that A, will show your listeners who the participating jurisdictions are as well as what data sets have been released to-date. The White House does not collect any data as part of the Police Data Initiative. Neither does the Police Foundation. What the Police Foundation does is kind of collect that metadata, right? We will link out to the departments who have released that in the local open data portal, released that CSV file, right?

So that's a great place to get started there. One of the things that I probably didn't touch on enough, particularly for this - at your show is that this work was born in part out of - I've talked a lot about the police departments, right?

But one of the first convenings that we had around this and I should say like DJ Patil, US Chief Data Scientist, this is - a lot of this work has been coming out of his portfolio as well. We firmly believe that this is - this work has to happen in partnership with the technologists at the table. So one of the first convenings that we did, we made sure to bring the data scientists into the room, the local CTOs, CIOs, local brigade members, Code for America folks.

We've had the pleasure of working with folks like Bayes Impact, DataKind and others - University of Chicago's Data Science for Social Good folks. This work has happened with a lot of effort and support from the data science and technology community and that's going to be more critical in moving forward because departments quite frankly would love to have that talent A, in-house. Well, if they can't get it in-house, they're looking for other ways to partner.

We've seen that and that has been one of the biggest things that we've been able to do through the Police Data Initiative is actually to make some of those partnerships and to make some of this work happen through that. So one of the early engagements was a Code for America team working with the Indianapolis Police Department to build - what they're going to be releasing as open source kind of data extractor and publication tool.

We have to have those critical voices at the table to continue to push this forward. I would say for folks who are thinking about starting companies in this space, there will be a lot of opportunity in the years moving forward. One of the biggest complaints that we're hearing from departments is that software is never built for them. Software is built and then they're forced to conform to it and that they have a very antagonistic relationship with a lot of their vendors quite frankly. We've seen - we've started to see a few companies that have taken a different approach. It's not just about these companies taking a different approach.

What that then allows, if you can build software that makes it easier for the officer to input that data, we just get better data, right?

HOST: Yeah.

CLARENCE: We get better data all the time. So we're able to do more of this analysis and we're - and it's of higher quality later on. So I think that there's a ton of ways that folks like your listeners could contribute to the space.

Quite obviously, you know, folks I think are familiar with the Code for America Brigades, plugging in there at the local level. But quite frankly, the Police Foundation has been a great resource for this and has been a matchmaker if you will in helping do some of that work and I think they will be - moving forward, they have some support and some runway to continue to move this work forward.

So I know they're always looking for partnerships and support from that community as well. So I encourage your listeners to check out the Police Foundation's work, the Public Safety Data Portal that I mentioned, but also I think there will be a lot of opportunities to plug in across a lot of those groups that I mentioned beforehand. If there's not, start something new.

HOST: Absolutely. Well, to wind up, I think at least as far as I can tell, the real objective here is let's get the data available and it's all about openness and transparency and accountability. As you had warned earlier, you can't focus too much on anecdotes. But to wind up, maybe do you have a favorite anecdote of something that has happened as a result of the Police Data Initiative?

CLARENCE: The ones that I'm most proud of, again, I'd say, don't really have much to do with any type of kind of sophisticated analysis. It's those events that have - that are happening at the local level. You know, one of the questions that we get honestly is what's next for the Police Data Initiative. I say this work is fundamentally local. It's about the relationship between local police departments, their constituents, those citizens, the communities they're serving and then having an open dialogue.

So the one that I mentioned earlier in New Orleans, that's one of the ones that we're really proud of. I think the chief there has stepped up in a really big way to really have that dialogue and create a space for that community to really learn more about the department.

Like I said, we thought we might be able to get 10 to 15 departments to do this work and we've really seen this wave kind of take off. There's still a ton more work to do, but we've been encouraged by the progress thus far and the folks who have been willing to step up and like I said, do the hard work.

HOST: Yeah, absolutely. I think there has in my opinion been a lot of great progress and I'm excited to see that continue at the local level and see where we are in a few years with open data as a result of this initiative.

Well, Clarence, thank you so much for taking the time to come on the show and share a little bit about your work and your collaborators' work and all that has been achieved with the Police Data Initiative.

CLARENCE: Great. Thanks so much for having me, Kyle.


HOST: Kelly Jin serves as policy adviser to the US Chief Technology Officer and the Chief Data Scientist in the White House Office of Science and Technology Policy.

Previously, Kelly served as Citywide Analytics Manager at the City of Boston and helped build and co-led Boston's Analytics Team. In her current role, she's helping to scale the President's Data-Driven Justice Initiative, DDJ, which I've invited her to discuss today. Kelly, welcome to Data Skeptic.

KELLY: Thanks so much Kyle, really happy to be here.

HOST: Well, to begin with, maybe we should start with a high-level definition of what exactly is the data-driven justice initiative.

KELLY: Sure. So the Data-Driven Justice Initiative actually just kicked off this past June 2016 and it's really a way for different jurisdictions throughout the country to better think about how individuals should be diverted away from jail.

So for those who aren't very familiar with the criminal justice system and policing in America, there are huge numbers of individuals on low-level offenses who are cycling through our local jails. The numbers are over 10 million cycling through over 3,100 local jails and costing us at the very lowest end at least $20 billion a year. So this is something that's impacting not only human outcomes but also our wallets as taxpayers as well.

So I am very privileged to be a part of this broader team. We now have about 140 jurisdictions impacting 95 million Americans, so very, very excited to be here to talk more about it.

HOST: One solution to reduce those costs that taxpayers pay is just to raise the bar of what it takes to get thrown in the jail. You know, maybe if - I don't know. We change the laws. I can go punch people and I don't go to jail or something like that.

I don't think it's about savings or reducing the amount of people per se. It seems to be about shifting who goes and what services they get. Is that right?

KELLY: Yes, definitely. So a huge part of this is how we end up impacting those individuals with mental illness here throughout the US. That is one population that we've been working on initially. Thinking about individuals who really need to get the services that they need to treat the health issues that they have, instead of sending them to jail, or instead of sending them to an emergency room, thinking about how do we actually provide the services that they need.

HOST: So I'm not quite sure how you would measure this. But do you have any sense of what percentage of people going through the justice system are - let's call them actual career criminals versus people that might not have taken those actions if they had had the proper medical and maybe mental health sorts of facilities available?

KELLY: Yeah, so that's a really, really good question. The very short answer of it is that the data is very disparate across many different systems throughout our country. At a very, very high level, we do know that over 60 percent of individuals who cycle through our jails have some sort of mental illness.

HOST: Wow.

KELLY: So I can't pinpoint for across the nation. We know that at a very high level, these numbers are incredibly high and anyone - and I've talked to many people over the last six, eight months. Whenever you talk to someone, they say this is common sense. Why aren't we actually doing something about this?

So a lot of our work has been how do we take what really is common sense and scale that across the country?

HOST: So we could - we have a certain amount of money. It may be allocated to - we could throw someone in a jail cell or we could rehabilitate them. I'm not saying that they don't get rehabilitated in the cell, but sending them to a hospital or to a counselor is definitely a different option from jail. Are there any ways we can talk about the long-term benefits of that? I think I would much rather have someone rehabilitated and contributing to society than costing money for room and board in a center. Do you know anything about success rates for interventions?

KELLY: Yeah. So that's something we're actually working on right now. I will actually pinpoint a specific example, which is we worked with the University of Chicago and Johnson County, Kansas on identifying these individuals. So I talked a lot about these very high-level figures. So we've been working with jurisdictions and saying, "How do we actually pinpoint those individuals?" using the data and then saying, "If we go out and we proactively help connect them or reconnect them to the services that they need, what are the success rates?"

So my short answer to that is that we're still working on that. But we were able to work on basically a risk score for individuals in Johnson County over the last few months. But a lot of the work in the next six months, next year is really going to be on how do we then measure this. What are the success rates? A lot of this work, for anyone who knows anything about criminal justice, is really, really hard, and people have very long academic research cycles, and we're seeing how we can shorten a lot of that and think more operationally about what staff and what local governments need on the ground.

HOST: So I imagine there are a lot of issues of privacy and HIPAA compliance that come into play. Can you talk about the role of those things in this work?

KELLY: We've been fortunate enough to work very closely with Health and Human Services (HHS) and for those of you who don't know, at the federal level, they oversee HIPAA. HIPAA is a set of guidelines about how law enforcement, health - basically anybody - can share information about you when it comes to your health records.

For anyone who's thinking about the population that I just talked about, that is a huge question and actually a huge hurdle. So what we found when we went out to different communities is that they all raised their hands and they said, "HIPAA is a huge problem. We talked to our lawyers, but we're not exactly sure. What can we do and actually not do?"

So a lot of the work over the past few months has been working with HHS and actually with our communities as well. So the communities are able to say, in this specific scenario, what are we able to share and not share? The short of that is that law enforcement can share and push data to health providers and hospitals, and there are specific scenarios that we've worked out, saying, "When can law enforcement actually have access to information about an individual that they're trying to make a decision about?" They're at that spot where they're right there with an individual.

So a lot of the work being done there in the next year, we're also working on how do we simplify a lot of the same data integration challenges that communities have. Our smallest jurisdiction is Potter County, Pennsylvania. They're 16,000. Our largest is LA County, over 10 million.

In between there, there are so many different levels of analytics capacity. For anyone who has worked in local government, it's a little bit like Parks and Recreation sometimes. You got to be a little bit scrappy to think of what are the resources that I have.

So a lot of our work has been on how do we make these processes simpler. How do we unlock data from a lot of systems just like we're doing at the federal level through the past eight years? How do we actually help equip local communities with that skill set as well?

HOST: If I understand correctly, that sounds like a clever approach for the privacy and HIPAA issues rather than saying we're going to give out medical data. The police can push their data behind the HIPAA wall if you will. Is that right?

KELLY: Yes, it is. The other piece I didn't mention is that we are working on how to anonymize all this data as well. So this is all individual-level data. Once you anonymize that information, people can do analysis on the anonymized data. That can be pushed back to the jurisdictions who can then figure out who those individuals are that they provide treatment to.
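Editor's note: the round trip Kelly describes (anonymize individual-level records, let outside analysts work on them, then let the jurisdiction map results back to real people) can be sketched roughly as below. This is only an illustration under stated assumptions, not the initiative's actual method: the keyed-hash approach, the field names `person_id` and `event`, and every function name are hypothetical. Strictly speaking this is pseudonymization rather than full anonymization, since the jurisdiction keeps a key and a lookup table.

```python
import hashlib
import hmac
from collections import Counter


def pseudonymize(person_id: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed hash; only the key holder can recompute it."""
    return hmac.new(secret_key, person_id.encode("utf-8"), hashlib.sha256).hexdigest()


def tokenize_records(records, secret_key):
    """Jurisdiction side: build the shareable rows plus a private token-to-id lookup."""
    lookup = {}   # stays inside the jurisdiction
    shared = []   # safe to hand to outside analysts (hypothetical schema)
    for rec in records:
        token = pseudonymize(rec["person_id"], secret_key)
        lookup[token] = rec["person_id"]
        shared.append({"token": token, "event": rec["event"]})
    return shared, lookup


def frequent_tokens(shared, min_events):
    """Analyst side: find tokens with many events, without ever seeing identities."""
    counts = Counter(row["token"] for row in shared)
    return sorted(token for token, n in counts.items() if n >= min_events)


def reidentify(tokens, lookup):
    """Jurisdiction side: map the analyst's flagged tokens back to real identifiers."""
    return [lookup[token] for token in tokens]
```

The design point is that the analyst only ever handles tokens, while the key and the lookup table never leave the jurisdiction. A real deployment would also have to deal with quasi-identifiers such as dates and locations, which this sketch ignores.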

HOST: A winning story for me would be if let's say someone is out of town. They're picked up by an officer and somehow that officer is able to determine that they're from another area where they're being serviced by mental health and that helps adjust the way that that officer renders services. Is that a realistic scenario or are we going to only have that - will that officer only get access to the anonymized data? Can we make those sorts of human connections?

KELLY: So I think so. We aren't quite there right now. Something that I can talk about is we were just up in Boston last week and we're working very closely with Middlesex County which is right outside of Boston. It's where Cambridge and of course all of the great universities are located as well. It has got a population of 1.6 million and we have 22 of the police departments on board in addition to the sheriff's office.

So we were sitting there and of course, - me raising my hand - I said, "How much do you all talk to one another? How much do these departments across towns and across city lines talk to one another?" They all looked at me and they said, "Well, it's really hard," right? Just across - same thing across organizations, same things across governments. It's really, really hard.

So we are moving forward in the next few months, fingers crossed on this. How can we connect students? How can we connect data scientists to the Middlesex County community? Because you're exactly right. Individuals are moving through different systems and just because you live in one town and you happen to be in another town, let's say if you're picked up, our systems and our roles as government, we should be able to - be able to help regardless of where you are. So that's definitely something top of mind for us.

But first, we're still tackling within communities themselves. How do we connect those systems? And then as a next step, it's really going to be, "How do we connect those communities across one another as well?"

HOST: Yeah, I would love to talk about how things operate at the community level, especially given your prior background in the City of Boston. Can you talk about some of the differences between working at the national and the local level?

KELLY: It's very interesting. My quick background was I was just at the City of Boston for the last two years building up the data team. So the kind of laughing piece with that, I spent a lot of time helping individuals figure out where to fill potholes, where parking permits needed to be allocated to, where we were sending trash pick-up. I think that really on-the-ground, hands-on work is so, so critical at the local level that when I came up to the federal level, I realized one big disconnect was that at the local level, we're making a decision, right? Am I going to be doing analytics? Am I going to be thinking on this? Am I actually going to be going out and filling the pothole?

If you think about that for criminal justice and law enforcement, if you ask a police officer to help define the requirements about how data is flowing between systems, they might not spend as much time on that as opposed to actually being out in the community and building trust and working on delivering services to citizens.

So one thing that I really realized - and I was at the federal level before City of Boston, it was just this - oftentimes there's a disconnect, right? So how can we get the federal government speaking better with cities and counties and states? Because there's a lot of funding going in that direction, a lot of our work - and I really think beyond the Data-Driven Justice Initiative and you guys probably heard about the Police Data Initiative and many other things that we're doing. A lot of this is how do we move and shift the needle at the local community level, all the while knowing that resources are very, very constrained, all the while knowing that there's a ton of great work being done at the local level that just needs a little bit of oomph and help from the federal level.

HOST: So something that excites me, particularly when we talk about open data, which obviously is - some of that we're discussing can't be open data. But do you see opportunities for local, civically minded data scientists to contribute something to their community?

KELLY: Most definitely. So I will say that in the past week or two weeks, I've gone to a few speaking engagements and pretty much after every single one, a bunch of people line up and they say, "How can I help?" So there are a few different channels that you can pursue. One is really getting involved in your local community. I'm always surprised - I live here in DC. I used to live up in Boston. I'm always surprised by how limited knowledge people have of the inner workings of how your local community really works.

So there's this basic kind of civic engagement piece. The second piece is getting involved obviously in your local data and tech community. There are great groups like Code for America. You can join a local brigade as another piece.

Then I will say as the third piece, which is the one that I care about, the data-driven justice work is going to continue. So a lot of this criminal justice work, if you are at all interested in this work, we're going to be continuing this. I will be happy to share that information and get it out to the listeners as well. But we have a huge ecosystem and the piece, the missing link, the piece that we really need - are data scientists, integrators, people who understand all this work at the ground, or even just people really interested in this work, to roll up their sleeves and say, "Hey, I'm really willing to help because this is really going to improve outcomes for my local community."

HOST: So I could see that there might be - I don't know if I would call it a challenge or an educational barrier or something along those lines where you have departments that have operated in a particular way for quite some time before there was the availability of data like this. Essentially, when you come in and say, "Hey, we now have this assembled. It has been perhaps somewhat federated. It's time for you to use this in your everyday work," would it be fair to say there could be some resistance or someone wants to think that through? Can you talk about some of the challenges of engaging with the departments and getting sheriffs to really use data?

KELLY: A huge part of this, Kyle, is, one, how do we build trust. There's not just a community trust but there's also a, "I'm not parachuting in," "I'm not saying I can solve all of your problems." We're really coming into every situation and community and saying, "How can we help?" We know there are some challenges. We have some skills. You guys know what it's like on the ground, right? I will never, ever say I know what it's like running in the face of - running towards danger as opposed to running the other way.

So a lot of those challenges really, really can be mitigated by working very closely with the community starting out and I think the other piece that I always think of is building out reports or products or analysis. Always, always making that an iterative approach and never saying, "I'm going to sit in this backroom by myself for a few months and do the data analysis without an understanding of the real world."

So everything that I've seen the last six months, I will honestly say criminal justice and all of this work is completely new to me. All of my work was at a very local level working on data across many, many different areas. The last six months actually going into a jail and actually going for a ride along and talking to police officers and sheriffs about what are your challenges day to day.

One of the best stories that I heard was one group of students that didn't understand why the data quality was so poor in a particular community's system around arrests and interactions that police officers were having with their local communities.

They couldn't understand. They said, "why aren't things filled in," and the students went out for a ride-along and realized that these cops were just - whenever they had a moment in between driving around were filling out things kind of on the fly on a laptop. Suddenly it clicked in their minds. All of this frustration, they were like, "Oh, I think your quality is very, very poor. We can't do the write-up ... "

Suddenly it clicked in their minds. So there's a whole piece, not just the data analysis piece on the backend. They're also thinking about the delivery - whether that's delivery to the individual or delivery of digital services to the police officer. This whole cycle, this whole loop, understanding how that works from a real person's perspective that has to deal with this 24/7 instead of just having a very siloed data tech view I think always, always informs the work. I did a ton of work with firefighters in the City of Boston and just under - getting your hands dirty literally and like understanding how the processes work is a huge part of how do we better inform our thinking and how do we better inform our recommendations as well.

HOST: So I know the Data-Driven Justice Initiative is relatively young then. It rolled out about six months ago now. I suspect a lot of the wins to date are in infrastructure and record linking and this sort of thing. But I'm curious if you have any favorite stories of the ways it has touched actual people.

KELLY: So the Johnson County work which I touched on a little bit earlier really is one of the best examples, and I will say that the University of Chicago and the work that we did with them. How do we link up local universities with local jurisdictions, whether that's University of Chicago linking up with Kansas, the work that we're doing in Middlesex. I'm hoping to link them up with Harvard and MIT. Getting students on the ground and the University of Chicago students, fellows did their work in 14 weeks, which is remarkable if you take a moment and think about how long it typically takes to operationalize or even get a list of individuals.

Then really what I'm most excited about is that now we have that list and actually going out and saying, "We have a list of 200 individuals. Can we find out more about these individuals? Can we dig in as a community and bring everybody to the table?" That's a huge part of the work that we're doing. That's one piece. And then I will touch on a second piece, which is for us - I mentioned at the federal level, all of these different federal agencies are doing something related to data-driven justice, specifically on individuals with mental health issues, so whether that's the passing of the 21st Century Cures Act in the last week, or that it's just connecting across federal agencies - I will say, you think of any government system. The individuals that are cycling through and that we are focusing on as the Data-Driven Justice Initiative are connected in some way.

So think about homeless individuals, right? That has to do with housing. Think about veterans. That's the Veterans Affairs. I talked about Health and Human Services. There's a whole group and a broader community that's thinking about this in a very siloed way and so some of the biggest wins quite honestly for me have been just bringing these individuals together in a room and that is literally ER doctors and police officers and local students, chief data officers.

We have a superpower here at the White House, which unfortunately ends in January, of convening, and so our hope is that we continue to convene, continue to spark these conversations and catalyze things across communities and across our country as well.

HOST: Yeah. So that's maybe a good point to kind of wrap up on. There are uncertainties about what will happen in the next administration. I mean we could speculate. But we probably really just don't know what the plans are either way. What do you anticipate that the Data-Driven Justice Initiative will look like and how will it work locally and federally?

KELLY: So we're definitely continuing the work. I am very excited that I personally will be continuing the work. I'm joining the Laura and John Arnold Foundation come January to continue this. It will be a part of a broader ecosystem and that will also include - the National Association of Counties is going to convene all 140 plus jurisdictions - and continue to be just a really good community hub for a lot of that work. Our communities just really want to talk to one another and they're going to be a really good partner in that. We also have what I mentioned at the federal level. We've built up an inter-agency task force and our hope is that that work is going to continue to be pushed.

Then, the other piece I touched a little bit on. But at the local level, a lot of these connections are just starting to be made. So, whether that's universities and local communities and the police departments and the hospitals and all of these players, that to me is just beginning. So we haven't even seen really a crescendo of that work being done. So, I'm super excited to - come January and I think I'm only taking a week of vacation, so that should show how dedicated I am to this work - but we're really, really excited to continue pushing this forward. And also, I think to your listeners, really, really excited in explaining these as social good problems that we need more people to help run at.

Something I personally - now that I'm going into philanthropy - know that people are always thinking about how do I get involved in government. Does that mean I have to be there for 30 years? So for us thinking through, "How do people do tours of duty?"

So go into the private sector because those skills are going to be really, really helpful and then go into government or go into non-profit and go into philanthropy. Go into all of these different pieces because that is going to make you not only a more well-rounded person, but that's also going to help you make better decisions and be a better person at the table on these different challenges as you face them throughout your career. So anyway, very, very excited to be continuing on this work and also very excited that we kind of keep a lot of our team together as well. We will be scattered about but really, really looking forward to it.

HOST: Excellent. Well, I look forward to following the continuing story.

KELLY: Thanks so much Kyle.

HOST: Yeah, Kelly. Thanks so much for coming on this show and sharing a lot of your work and insights.

KELLY: Great. Thank you.

HOST: Take care.

A few quick notes before we sign off. In the first interview when I was speaking with Clarence, I briefly mentioned an anomaly around noontime crimes that could be seen in the Los Angeles open data set. This was observed by participants in NewMet Data Science Boot Camp. A blog post about the specific issue is up at the brand new blog that has just been launched called We Quant LA. You can find that and a few other interesting posts at I'm going to be posting over there myself as well hopefully with like a monthly column. If you would like to see a few details about that or learn about the Veil of Darkness hypothesis that the team was exploring, thanks to the availability of open data, head over to and read about the results as they unfold. Thanks again for joining me everybody. Until next week, keep thinking skeptically of and with data.

EXIT VOICE-OVER: For more on this episode, visit If you enjoyed the show, please give us a review on iTunes or Stitcher.