Re-Identification in Court Decisions with Knowledge Graphs
This project is available as a Master's project.
Introduction
Swiss court decisions are anonymized to protect the privacy of the people involved (parties, victims, etc.). Previous research [1] has shown that, in certain cases, companies involved in court decisions can be re-identified by linking the rulings with external data. This project builds on that work and aims to develop an automated system for re-identifying the people involved in court rulings. Such a system can then serve as a test of the anonymization practice of Swiss courts. For more information regarding the overarching research project please go here.
We propose to approach the general problem of re-identification (not only for a specific domain) with knowledge graphs (ontologies). In the first step, you will construct knowledge graphs from both external data and court decisions. You will then perform ontology alignment [2] to connect the two knowledge graphs: a large one built from external data and a small one built from the court decision to be re-identified. Whenever an anonymized node in the court-decision graph is aligned to a node in the external-data graph, this counts as a re-identification. In the final step, you will evaluate your system by measuring the recall and precision of the re-identifications.
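To make the idea of aligning a small court-decision graph to a large external graph concrete, here is a minimal sketch. It is not VeeAlign [2] or any method prescribed by the project: it swaps in a naive Jaccard overlap of facts as the similarity measure, and the graph representation, the node identifiers, the toy facts, and the threshold of 0.5 are all assumptions made purely for illustration.

```python
from typing import Dict, Optional, Set, Tuple

Fact = Tuple[str, str]        # (relation, value), e.g. ("profession", "architect")
Graph = Dict[str, Set[Fact]]  # node id -> facts attached to that node


def jaccard(a: Set[Fact], b: Set[Fact]) -> float:
    """Overlap of two fact sets, between 0.0 and 1.0."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def align(decision: Graph, external: Graph,
          threshold: float = 0.5) -> Dict[str, Optional[str]]:
    """Map each anonymized node to its best-matching external node, or None."""
    result: Dict[str, Optional[str]] = {}
    for anon_id, facts in decision.items():
        best_id, best_score = None, 0.0
        for ext_id, ext_facts in external.items():
            score = jaccard(facts, ext_facts)
            if score > best_score:
                best_id, best_score = ext_id, score
        # Only accept a match that is similar enough (assumed threshold).
        result[anon_id] = best_id if best_score >= threshold else None
    return result


# Hypothetical toy data: a large external graph and a small decision graph.
external: Graph = {
    "ext:person_1": {("profession", "architect"), ("residence", "Bern"),
                     ("event", "building permit dispute")},
    "ext:person_2": {("profession", "teacher"), ("residence", "Bern"),
                     ("event", "traffic accident")},
}
decision: Graph = {
    "A.____": {("profession", "architect"), ("residence", "Bern"),
               ("event", "building permit dispute")},
    "B.____": {("profession", "farmer"), ("residence", "Chur")},
}

alignment = align(decision, external)
print(alignment)
```

In this toy setting, an anonymized node that ends up mapped to an external node is a candidate re-identification, while a value of None means no node was similar enough. A learned alignment model would replace the `jaccard` function.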
Research Questions
So far, to the best of our knowledge, no one has tried re-identifying individuals occurring in court decisions with ontology alignment.
RQ1: How many individuals can be re-identified using ontology alignment (recall)?
RQ2: What percentage of re-identifications are actually correct (precision)?
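The two measures behind RQ1 and RQ2 could be computed roughly as follows. This is a sketch under stated assumptions: the function name, the dictionary layout (anonymized node mapped to a proposed identity, or None when the system abstains), and the toy values are hypothetical, and a gold-standard mapping of true identities is assumed to be available for evaluation.

```python
from typing import Dict, Optional, Tuple


def precision_recall(predicted: Dict[str, Optional[str]],
                     gold: Dict[str, str]) -> Tuple[float, float]:
    """Return (precision, recall) of proposed re-identifications.

    predicted: anonymized node -> proposed identity, or None if no proposal
    gold:      anonymized node -> true identity (assumed to be known)
    """
    # Only nodes for which the system actually proposed an identity count.
    proposed = {node: ident for node, ident in predicted.items()
                if ident is not None}
    correct = sum(1 for node, ident in proposed.items()
                  if gold.get(node) == ident)
    precision = correct / len(proposed) if proposed else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy evaluation with four anonymized individuals.
gold = {"A": "p1", "B": "p2", "C": "p3", "D": "p4"}
predicted = {"A": "p1", "B": "p9", "C": None, "D": "p4"}

precision, recall = precision_recall(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here three identities are proposed and two of them are correct, so precision is 2/3, and two of the four individuals are correctly re-identified, so recall is 0.5.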
Steps
- Construct a knowledge graph from external data such as newspaper articles (the data is already available)
- Construct knowledge graphs from court decisions
- Perform ontology alignment between the knowledge graph from the court decision and the one from the external data
- Detect and verify re-identifications
- Analyze the experimental results
Activities
⬤⬤⬤◯◯ Programming
⬤⬤⬤⬤◯ Experimentation
⬤◯◯◯◯ Literature
Prerequisites
Good programming skills (preferably in Python)
Preferably experience in deep learning (transformers)
Contact
References
[1] Vokinger, K.N., Mühlematter, U.J., 2019. Re-Identifikation von Gerichtsurteilen durch «Linkage» von Daten(banken). Jusletter 27.
[2] Iyer, V., Agarwal, A., Kumar, H., 2020. VeeAlign: a supervised deep learning approach to ontology alignment. OM@ISWC.