Ethical and Socially-Aware Data Labels

  • Conference paper
Information Management and Big Data (SIMBig 2018)

Abstract

Many software systems today use large amounts of personal data to make recommendations or decisions that affect our daily lives. These systems generally operate without the guarantees of non-discriminatory practice that are often required of human decision-makers, and they are therefore attracting increasing scrutiny. Our research focuses on the specific problem of biased software-based decisions caused by biased input data. To address it, we propose a data labeling framework based on identifying measurable data characteristics that could lead to downstream discriminatory effects. We test the proposed framework on a real dataset, which allowed us to detect risks of discrimination in the case of population groups.

Notes

  1. https://bit.ly/2Hoa2q7.

  2. See https://bit.ly/1XMKh5R.

  3. See https://datanutrition.media.mit.edu.

  4. See https://en.wikipedia.org/wiki/Garbage_in,_garbage_out.

  5. For the definitions of inherent quality measures see [10].

  6. https://www.kaggle.com/uciml/default-of-credit-card-clients-dataset.

  7. See https://bit.ly/2NyNVPx.

  8. Details on how to compute each metric can be retrieved from [10].

References

  1. Barocas, S., Selbst, A.D.: Big data’s disparate impact. Calif. Law Rev. 104(3), 671–732 (2016)

  2. Corrales, D.C., Corrales, J.C., Ledezma, A.: How to address the data quality issues in regression models: a guided process for data cleaning. Symmetry 10(4), 99 (2018). https://doi.org/10.3390/sym10040099. https://bit.ly/2xOLVzN

  3. Doshi-Velez, F., et al.: Accountability of AI under the law: the role of explanation. Berkman Center Research Publication Forthcoming, Harvard Public Law Working Paper 18(07) (2017)

  4. Dwork, C., Hardt, M., Pitassi, T., Reingold, O., Zemel, R.: Fairness through awareness. In: Proceedings of the 3rd Innovations in Theoretical Computer Science Conference, pp. 214–226. ACM (2012)

  5. Friedler, S.A., Scheidegger, C., Venkatasubramanian, S.: On the (im)possibility of fairness. arXiv preprint arXiv:1609.07236 (2016)

  6. Gebru, T., et al.: Datasheets for datasets. arXiv:1803.09010 (2018)

  7. Hardt, M., Price, E., Srebro, N.: Equality of opportunity in supervised learning. In: Advances in Neural Information Processing Systems (2016)

  8. Hosni, H., Vulpiani, A.: Forecasting in light of big data. Philos. Technol. 13, 1–13 (2017)

  9. ISO-IEC: ISO/IEC 25012:2008 Software engineering - Software product Quality Requirements and Evaluation (SQuaRE) - Data quality model. Standard, International Organization for Standardization, Geneva, CH, December 2008

  10. ISO-IEC: ISO/IEC 25024:2015 - Systems and software engineering - Systems and software Quality Requirements and Evaluation (SQuaRE) - Measurement of data quality. Standard, International Organization for Standardization, Geneva, CH, October 2015

  11. Karim, N.S.A., Ammar, F.A., Aziz, R.: Ethical software: integrating code of ethics into software development life cycle. In: 2017 International Conference on Computer and Applications (ICCA), pp. 290–298, September 2017. https://doi.org/10.1109/COMAPP.2017.8079763

  12. Lepri, B., Staiano, J., Sangokoya, D., Letouzé, E., Oliver, N.: The tyranny of data? The bright and dark sides of data-driven decision-making for social good. In: Cerquitelli, T., Quercia, D., Pasquale, F. (eds.) Transparent Data Mining for Big and Small Data. SBD, vol. 11, pp. 3–24. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-54024-5_1

  13. O’Neil, C.: Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. Crown Publishing Group, New York (2016)

  14. Torchiano, M., Vetrò, A., Iuliano, F.: Preserving the benefits of open government data by measuring and improving their quality: an empirical study. In: 2017 IEEE 41st Annual Computer Software and Applications Conference (COMPSAC), vol. 1, pp. 144–153, July 2017. https://doi.org/10.1109/COMPSAC.2017.192

  15. Vetrò, A., Canova, L., Torchiano, M., Minotas, C.O., Iemma, R., Morando, F.: Open data quality measurement framework: definition and application to open government data. Gov. Inf. Q. 33(2), 325–337 (2016). https://doi.org/10.1016/j.giq.2016.02.001

Corresponding author

Correspondence to Elena Beretta.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Beretta, E., Vetrò, A., Lepri, B., De Martin, J.C. (2019). Ethical and Socially-Aware Data Labels. In: Lossio-Ventura, J., Muñante, D., Alatrista-Salas, H. (eds) Information Management and Big Data. SIMBig 2018. Communications in Computer and Information Science, vol 898. Springer, Cham. https://doi.org/10.1007/978-3-030-11680-4_30

  • DOI: https://doi.org/10.1007/978-3-030-11680-4_30

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-11679-8

  • Online ISBN: 978-3-030-11680-4

  • eBook Packages: Computer Science, Computer Science (R0)
