Using public data to measure diversity in computer science research communities: A critical data governance perspective

Rachelle Bosua*, Marc Cheong, Karin Clark, Damian Clifford, Simon Coghlan, Chris Culnane, Kobi Leins, Megan Richardson

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review


Encouraging and supporting diversity and inclusion in computer science research communities is a critical issue for many reasons, including the ethical and robust design, delivery and publication of research that addresses real-world situations ranging from the use of digital tools in health to predictive policing to workplace hiring practices, just to name a few. One way to measure diversity is to apply analytical research methods to data sourced from the public domain for use in research. However, attempts to measure diversity using public data may themselves raise legal and ethical questions about the provenance of the data, research methods adopted, and treatment of diversity in the publication of results. This article interrogates the challenges of measuring diversity using public data, examining an illustrative case study framed around an academic research project at an Australian university using a public data set to identify gender representation in computer science communities. Employing a critical data governance perspective, we point to a range of ethical and legal concerns and recommend greater regulatory guardrails to better balance public interests in research and the privacy, data protection and other ethical interests of research subjects.

Original language: English
Article number: 105655
Number of pages: 10
Journal: Computer Law and Security Review
Publication status: Published - Apr 2022


  • Critical data governance
  • Data protection
  • Diversity
  • Privacy
  • Public data

