Procedure informatiche di tutela della trasparenza e riservatezza dei dati

  • Simone Marinai

This chapter first describes the possible types of anonymization and examines the document formats that must be handled. After reviewing the state of the art in automatic document anonymization, it describes in detail a prototype application for the semi-automatic anonymization of court judgments. Finally, it analyzes experimental results from the prototype's use within the Agile Justice project.

  • Keywords:
  • anonymisation,
  • prototype

Simone Marinai

University of Florence, Italy - ORCID: 0000-0002-6702-2277

PDF
  • Publication Year: 2023
  • Pages: 213-228
  • Content License: CC BY 4.0
  • © 2023 Author(s)

Chapter Information

  • Chapter Title: Procedure informatiche di tutela della trasparenza e riservatezza dei dati
  • Authors: Simone Marinai
  • Language: Italian
  • DOI: 10.36253/979-12-215-0316-6.14
  • Peer Reviewed
  • Publication Year: 2023
  • Copyright Information: © 2023 Author(s)
  • Content License: CC BY 4.0
  • Metadata License: CC0 1.0

Bibliographic Information

  • Book Title: Giustizia sostenibile
  • Book Subtitle: Sfide organizzative e tecnologiche per una nuova professionalità
  • Editors: Paola Lucarelli
  • Peer Reviewed
  • Number of Pages: 270
  • Publication Year: 2023
  • Copyright Information: © 2023 Author(s)
  • Content License: CC BY 4.0
  • Metadata License: CC0 1.0
  • Publisher Name: Firenze University Press
  • DOI: 10.36253/979-12-215-0316-6
  • ISBN Print: 979-12-215-0315-9
  • eISBN (pdf): 979-12-215-0316-6
  • Series Title: Studi e saggi
  • Series ISSN: 2704-6478
  • Series E-ISSN: 2704-5919
