The Data Science and AI for Good landscape has exploded, with more organizations doing data for good work than ever before. At the same time, the social sector still faces many challenges when applying data science and artificial intelligence to social good initiatives. For example, significant challenges that limit social sector organizations’ ability to advance their data capabilities include:
- identifying data science problems,
- understanding the data,
- building proof-of-concept models,
- implementing models at scale,
- evaluating a project’s ethical implications.
Additionally, the need for using data responsibly and evaluating the project with an ethical review lens has gained recent attention through the important work of the Algorithmic Justice League and others. There are various training materials, such as We All Count’s data equity workshops, that offer organizations insight into what it means to be responsible in data use. Still, social sector practitioners often wonder how to practically embed ethical review into their data science project process end-to-end.
In an effort to accelerate data science initiatives in the social sector, DataKind has assembled learnings from its nearly decade-long experience of implementing and running Data Science and AI for Good projects in a newly released Playbook.
The Playbook demonstrates our project processes and best practices, which are told through the lens of global volunteers, domain experts, technologists, and partner organizations. It intends to enable the user to complete high quality and ethical data science projects in the service of humanity.
Doing Data Science and AI for Good well is all about careful and inclusive reflection and evaluation at every stage of the project process. Ethical data science is an ongoing process and can’t be done by simply checking in at the beginning, middle, and end. A high-quality project can only be completed through intentional thought by everyone at every stage of the project. By embedding an ethical review process into each piece of the project, from discovering a partnership to evaluating project impact, you give your Data Science and AI for Good project the best chance of making a positive impact on humanity.
Here’s a summary of those best practices and lessons learned within the framework of our six-step project process.
1. Discover: Will partnership be a good fit?
Many social sector organizations partner with data science experts to take advantage of their deep data science and AI knowledge, but the decision to form a partnership should weigh criteria beyond the technical expertise on offer.
The first step of the project process is to evaluate a potential partnership for mutual buy-in, bandwidth, and values alignment. An essential piece of this is drafting a project impact map through a simplified logic model summarized as a project statement. Through this, the team can identify whether they’re working towards goals that are aligned with both teams’ values and whether they have the right partnership in place for the project.
Finally, before fully designing the project, critique the project idea by evaluating the ethical implications and assessing possible transferable solutions. Taking on a critical lens to reflect on what might not work from the start is essential in embedding responsible data science practices throughout your project process, before even touching the data.
2. Design: Explore the data and goals
Next, complete a data audit to evaluate project feasibility, data quality, inclusion or exclusion concerns, and bias before committing to a project. Ensure accountable and high quality project design by including subject matter experts in the design process and defining accountability to communities impacted.
As it comes up often in the sector, involving stakeholders in a participatory design process is essential, but it’s also important to understand what’s reasonable from a responsible data science lens with the existing data as part of this design process. Do not assume that a participatory design process is sufficient for ethical data science project design.
An ethical review of the data and potential solution needs to be fully explored in the design process, before any work is done on the project. This includes conducting a project risk assessment, creating mitigation strategies, and identifying a clear pathway to success to ensure sustainability.
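As a rough illustration of the data audit described in this step, the sketch below counts missing values and checks how well each group is represented in a dataset. It is a minimal example under stated assumptions, not the Playbook’s own tooling: the function name `audit_records`, the field names, and the sample records are all hypothetical.

```python
from collections import Counter


def audit_records(records, required_fields, group_field):
    """Summarize missing values and group representation.

    `records` is a list of dicts. All names here are hypothetical and
    chosen only for illustration.
    """
    total = len(records)

    # Data quality: count records where a required field is absent or empty.
    missing = {
        field: sum(1 for r in records if r.get(field) in (None, ""))
        for field in required_fields
    }

    # Inclusion/exclusion: share of records per group, with blanks
    # reported as "unknown" so they are not silently dropped.
    group_counts = Counter(r.get(group_field) or "unknown" for r in records)
    group_share = (
        {group: n / total for group, n in group_counts.items()} if total else {}
    )

    return {"total": total, "missing": missing, "group_share": group_share}


# Hypothetical sample data, invented for this example.
sample = [
    {"id": 1, "region": "north", "income": 1200},
    {"id": 2, "region": "south", "income": None},
    {"id": 3, "region": "north", "income": 900},
    {"id": 4, "region": "", "income": 1500},
]

report = audit_records(
    sample, required_fields=["region", "income"], group_field="region"
)
print(report)
```

A summary like this is only a starting point for conversation: whether a group is underrepresented, or a missing field matters, is a judgment for subject matter experts and impacted communities to make.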
3. Prepare: Recruit a skilled team
Although many team members need to be on board for the project design process, it’s best to wait to recruit the full project team until after the project has been fully designed. That way, the project design determines which skills and experience are represented on the team.
Additionally, selecting a diverse team with antiracist practices embedded in the recruitment process enables you to produce the highest quality projects possible. Part of preparing the team is also onboarding project volunteers to ensure collaborative, inclusive teamwork.
A comprehensive onboarding process, to both the project and collaboration norms, is key to creating a positive work environment that will lead to stronger teams. This also includes setting up a project plan with agile sprints, right-sized for the team’s capacity and timeline.
4. Execute: Create a prototype and adjust
At step four, build a prototype of the data product and uncover the data’s insights by collaborating closely with the social sector actors throughout the development process. Think of executing on the project as a key part of knowledge sharing: the social sector actors should be intimately involved in the analysis so that they can be confident in implementing the tool in the long term.
Create a prototype for full team review and feedback from end users and communities impacted before committing to a final product deliverable, as feedback on the prototype can pivot a project to maximize positive impact and minimize any possible negative consequences.
Throughout this stage of the project process, evaluate bias and the ethics of end products by following through – and adjusting – the mitigation strategies outlined during project design. Balance getting creative and experimenting to find new insights, while staying within the project’s scope to deliver a project that will be actionable.
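One simple way to make the bias evaluation in this stage concrete is to compare how often a prototype model produces a positive prediction for each group. The sketch below is an assumption-laden example of that single check, not the method the Playbook prescribes: the function names `selection_rates` and `rate_ratio`, the group labels, and the predictions are all hypothetical.

```python
from collections import defaultdict


def selection_rates(predictions, groups):
    """Return the share of positive predictions for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        if pred:
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}


def rate_ratio(rates):
    """Ratio of the lowest to the highest group rate (1.0 means equal rates)."""
    highest = max(rates.values())
    return min(rates.values()) / highest if highest > 0 else 1.0


# Hypothetical prototype output, invented for this example.
preds = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

rates = selection_rates(preds, groups)
ratio = rate_ratio(rates)
print(rates, round(ratio, 2))
```

A low ratio does not prove a model is unfair, and a high one does not prove it is fair. It is a prompt to revisit the mitigation strategies from project design with the partner organization and the communities impacted.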
All the while, don’t overlook the importance of high quality coding and strong code management practices. Quality coding enables the creation of quality and ethical products. For example, readable code enables greater transparency, more complete code reviews, and points at which to evaluate how technical decisions have been made with an ethical lens.
You can build an impactful Data Science and AI for Good project with participatory methods, prototyping, ethical review, high quality coding, and experimentation.
5. Share: Lessons learned and reusable tools
It’s easy to rush through sharing what a project produced, but this is one of the most important steps in high-quality data science for social good work, because it ensures that:
- the project is set up for successful implementation and potential scale, and
- others can learn from your mistakes and successes.
Create practical documentation to enable sustainability and use: This includes “how to” guides with screenshots, video tutorials, an outline of project maintenance requirements, and a roadmap outlining clear next steps. At this point in the project process, the data science team should again work closely with users, social sector actors, and subject matter experts to decide on and implement system integration for the project deliverables.
Through close collaboration and clear documentation, you set the product up for success in future iterations. At the same time, share lessons learned externally for the greater community to benefit from what you learned.
6. Evaluate: What worked and what didn’t?
The final step in an ethical and high quality Data Science and AI for Good project is evaluating the project after enough time has passed for the team to learn from the initial implementation. Check in to see if the project maintenance plan was sustained, and re-engage with the team to adjust the strategy accordingly if not.
This is also an opportunity to measure project success and learn from how well the project outputs map to the desired outcomes. Was the impact map that you outlined when first discovering the project accurate? Additionally, reflect again on the project outputs and outcomes with an ethical lens.
Now that you understand how the project is actually being used post-implementation, does that use raise any additional ethical concerns? With all this in mind, consider your next project to continue the use of data science in the service of humanity. What have you learned from this project, and what should you do next?
By Rachel Wells, Senior Manager, Center of Excellence, DataKind