Ilias Chalkidis Homepage

Research Interests

My research lies in Artificial Intelligence, with a focus on Natural Language Processing:

NLP for Politics (Political Bias / LLMs as Voting Assistants)
Legal Text Analytics (Legal NLP / NLU)
AI Alignment (Societal Issues)
Explainability / Interpretability of NLP models (XAI)
Trustworthy and Responsible AI (Fairness & Robustness of NLP models)
Large-scale Multi-label Text Classification (LMTC)

Publications

-2025

From Citations to Criticality: Predicting Legal Decision Influence in the Multilingual Swiss Jurisprudence

Ronja Stern, Ken Kawamura, Matthias Stürmer, Ilias Chalkidis, Joel Niklaus

In the Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025), Vienna, Austria, July 27 - August 1, 2025

Article Arxiv pre-print

-2024

Investigating LLMs as Voting Assistants via Contextual Augmentation: A Case Study on the European Parliament Elections 2024

Ilias Chalkidis

In the Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2024), Miami, FL, USA, November 12-16, 2024

Article Arxiv pre-print 🤗 Dataset
Hyperbolic Contrastive Learning for Document Representations - A Multi-View Approach with Paragraph-level Similarities

Jaeeun Nam, Ilias Chalkidis and Mina Rezaei

In the Proceedings of the European Conference for Artificial Intelligence (ECAI 2024), Santiago de Compostela, Spain, October 19-24, 2024

Article
Attention-Driven Dropout: A Simple Augmentation Method to Improve Self-supervised Contrastive Sentence Embeddings

Fabian Stermann, Ilias Chalkidis, Amirhossein Vahidi, Bernd Bischl, Mina Rezaei

In the Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2024), Vilnius, Lithuania, September 9-13, 2024

Article
MultiLegalPile: A 689GB Multilingual Legal Corpus (Outstanding Paper Award - ACL 2024 🏆)

Joel Niklaus, Veton Matoshi, Matthias Stürmer, Ilias Chalkidis and Daniel E. Ho

In the Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024), Bangkok, Thailand, August 11–16, 2024

Arxiv pre-print Article 🤗 Dataset 🤗 Models
On the Interplay between Fairness and Explainability

Stephanie Brandl, Emanuele Bugliarello and Ilias Chalkidis

In the Proceedings of the Workshop on Trustworthy Natural Language Processing (TrustNLP 2024), Mexico City, Mexico, June 21, 2024

Arxiv pre-print Article
Llama meets EU: Investigating the European Political Spectrum through the Lens of LLMs

Ilias Chalkidis* and Stephanie Brandl*

In the Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2024), Mexico City, Mexico, June 16–21, 2024

Arxiv pre-print Article 🤗 Dataset 🤗 Models Code
One Law, Many Languages: Benchmarking Multilingual Legal Reasoning for Judicial Support

Vishvaksenan Rasiah, Ronja Stern, Veton Matoshi, Matthias Stürmer, Ilias Chalkidis, Daniel E. Ho, Joel Niklaus

In the Proceedings of the Data-centric Machine Learning Research (DMLR) Workshop, Vienna, Austria, May 11, 2024

Arxiv pre-print

-2023

Rather a Nurse than a Physician - Contrastive Explanations under Investigation

Oliver Eberle*, Ilias Chalkidis*, Laura Cabello and Stephanie Brandl

In the Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2023), Singapore, December 6–10, 2023

Arxiv pre-print Article 🤗 Dataset Code
Regulation and NLP (RegNLP): Taming Large Language Models

Catalina Goanta, Nikolaos Aletras, Ilias Chalkidis, Sofia Ranchordas and Gerasimos Spanakis

In the Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2023), Singapore, December 6–10, 2023

Arxiv pre-print Article
LEXTREME: A Multi-Lingual and Multi-Task Benchmark for the Legal Domain

Joel Niklaus*, Veton Matoshi*, Pooja Rani, Andrea Galassi, Matthias Stürmer, and Ilias Chalkidis

In Findings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2023), 2023

Arxiv pre-print 🤗 Dataset Code
SCALE: Scaling up the Complexity for Advanced Language Model Evaluation

Vishvaksenan Rasiah, Ronja Stern, Veton Matoshi, Matthias Stürmer, Ilias Chalkidis, Daniel E. Ho and Joel Niklaus

In the Proceedings of the Workshop on Natural Legal Language Processing - co-located with EMNLP 2023, Singapore, December 6–10, 2023

Arxiv pre-print Article 🤗 Dataset
Retrieval-augmented Multi-label Text Classification

Ilias Chalkidis* and Yova Kementchedjhieva*

ArXiv Pre-print, 2023

Arxiv pre-print
LeXFiles and LegalLAMA: Facilitating English Multinational Legal Language Model Development

Ilias Chalkidis*, Nicolas Garneau*, Catalina Goanta, Daniel Katz and Anders Søgaard

In the Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023), Toronto, Canada, 9-14 July, 2023

Arxiv pre-print Article 🤗 Dataset Code 🤗 Models
An Exploration of Encoder-Decoder Approaches to Multi-Label Classification for Legal and Biomedical Text

Yova Kementchedjhieva* and Ilias Chalkidis*

In Findings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023), 2023

Arxiv pre-print Article Code
Efficient Document Embeddings via Self-Contrastive Bregman Divergence Learning

Daniel Saggau, Mina Rezaei, Bernd Bischl and Ilias Chalkidis

In Findings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023), 2023

Arxiv pre-print Code
PokemonChat: Auditing ChatGPT for Pokémon Universe Knowledge

Laura Cabello, Jiaang Li, Ilias Chalkidis

ArXiv Pre-print, 2023

Arxiv pre-print
Textual Information and IPO Underpricing: A Machine Learning Approach

Apostolos Katsafados, George Leledakis, Emmanouil Pyrgiotakis, Ion Androutsopoulos, Ilias Chalkidis and Manos Fergadiotis

In the Journal of Financial Data Science, Spring 2023.

Article
ChatGPT may Pass the Bar Exam soon, but has a Long Way to Go for the LexGLUE benchmark

Ilias Chalkidis

SSRN Pre-print, 2023

SSRN pre-print Code

-2022

Processing Long Legal Documents with Pre-trained Transformers: Modding LegalBERT and Longformer

Dimitris Mamakas*, Petros Tsotsi*, Ion Androutsopoulos, and Ilias Chalkidis

In the Proceedings of the Workshop on Natural Legal Language Processing - co-located with EMNLP 2022, Abu Dhabi, UAE, December 7–11, 2022

Article Arxiv pre-print
Legal-Tech Open Diaries: Lesson learned on how to develop and deploy light-weight models in the era of humongous Language Models

Stelios Maroudas*, Sotiris Legkas*, Prodromos Malakasiotis, and Ilias Chalkidis

In the Proceedings of the Workshop on Natural Legal Language Processing - co-located with EMNLP 2022, Abu Dhabi, UAE, December 7–11, 2022

Article Arxiv pre-print
An Exploration of Hierarchical Attention Transformers for Efficient Long Document Classification

Ilias Chalkidis, Xiang Dai, Manos Fergadiotis, Prodromos Malakasiotis, and Desmond Elliott

ArXiv Pre-print, 2022

Arxiv pre-print Code 🤗 Models
Revisiting Transformer-based Models for Long Document Classification

Xiang Dai, Ilias Chalkidis, Sune Darkner, and Desmond Elliott

In Findings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2022), 2022

Article Arxiv pre-print
An Empirical Study on Cross-X Transfer for Legal Judgment Prediction

Joel Niklaus*, Matthias Stürmer, and Ilias Chalkidis*

In the Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics (AACL-IJCNLP 2022), (Held online due to COVID-19), November 20-23, 2022

Article Arxiv pre-print 🤗 Dataset Code
Realistic Zero-Shot Cross-Lingual Transfer in Legal Topic Classification

Stratos Xenouleas, Alexia Tsoukara, Giannis Panagiotakis, Ilias Chalkidis, and Ion Androutsopoulos

In the Proceedings of the 12th Hellenic Conference on Artificial Intelligence (SETN 2022), Corfu, Greece, September 7-9, 2022

Article Arxiv pre-print 🤗 Dataset Code
Improved Multi-label Classification under Temporal Concept Drift: Rethinking Group-Robust Algorithms in a Label-Wise Setting

Ilias Chalkidis and Anders Søgaard

In Findings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022), 2022

Article Arxiv pre-print Code
FiNER: Financial Numeric Entity Recognition for XBRL Tagging

Lefteris Loukas, Manos Fergadiotis, Prodromos Malakasiotis, Ilias Chalkidis, Eirini Spyropoulou, Ion Androutsopoulos, and Georgios Paliouras

In the Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022), Dublin, Ireland, 23-25 May, 2022

Article Arxiv pre-print 🤗 Dataset Code 🤗 Models
FairLex: A Multilingual Benchmark for Evaluating Fairness in Legal Text Processing

Ilias Chalkidis, Tommaso Pasini, Sheng Zhang, Letizia Tomada, Sebastian Felix Schwemer, and Anders Søgaard

In the Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022), Dublin, Ireland, 23-25 May, 2022

Article Arxiv pre-print 🤗 Dataset Code 🤗 Models
Challenges and Strategies in Cross-Cultural NLP

Daniel Hershcovich, Stella Frank, Heather Lent, Miryam de Lhoneux, Mostafa Abdou, Stephanie Brandl, Emanuele Bugliarello, Laura Cabello Piqueras, Ilias Chalkidis, Ruixiang Cui, Constanza Fierro, Katerina Margatina, Phillip Rust, and Anders Søgaard

In the Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022), Dublin, Ireland, 23-25 May, 2022

Article Arxiv pre-print
LexGLUE: A Benchmark Dataset for Legal Language Understanding in English

Ilias Chalkidis, Abhik Jana, Dirk Hartung, Michael Bommarito, Ion Androutsopoulos, Daniel Martin Katz, and Nikolaos Aletras

In the Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022), Dublin, Ireland, 23-25 May, 2022

Article Arxiv pre-print 🤗 Dataset Code

-2021

Swiss-Judgment-Prediction: A Multilingual Legal Judgment Prediction Benchmark

Joel Niklaus, Ilias Chalkidis, and Matthias Stürmer

In the Proceedings of the Workshop on Natural Legal Language Processing - co-located with EMNLP 2021, Punta Cana, Dominican Republic, November 7–11, 2021

Article Arxiv pre-print 🤗 Dataset Code
Multi-granular Legal Topic Classification on Greek Legislation

Christos Papaloukas, Ilias Chalkidis, Konstantinos Athinaios, Despina-Athanasia Pantazi, and Manolis Koubarakis

In the Proceedings of the Workshop on Natural Legal Language Processing - co-located with EMNLP 2021, Punta Cana, Dominican Republic, November 7–11, 2021.

Article Arxiv pre-print 🤗 Dataset Code
MultiEURLEX - A multi-lingual and multi-label legal document classification dataset for zero-shot cross-lingual transfer

Ilias Chalkidis, Manos Fergadiotis, and Ion Androutsopoulos

In the Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2021), Punta Cana, Dominican Republic, November 7–11, 2021.

Article Arxiv pre-print 🤗 Dataset Code
Deep Neural Networks for Information Mining from Legal Texts

Ilias Chalkidis

PhD Thesis, Department of Informatics, Athens University of Economics and Business, 2021

Thesis
Paragraph-level Rationale Extraction through Regularization: A case study on European Court of Human Rights Cases

Ilias Chalkidis, Manos Fergadiotis, Dimitrios Tsarapatsanis, Nikolaos Aletras, Ion Androutsopoulos and Prodromos Malakasiotis

In the Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2021), (Held online due to COVID-19), June 6–11, 2021

Article Arxiv pre-print 🤗 Dataset
Neural Contract Element Extraction Revisited: Letters from Sesame Street

Ilias Chalkidis, Manos Fergadiotis, Prodromos Malakasiotis and Ion Androutsopoulos

Pre-print (Update of the article "Neural Contract Element Extraction Revisited"), February 23, 2021

Arxiv pre-print
Regulatory Compliance through Doc2Doc Information Retrieval: A case study in EU/UK legislation where text similarity has limitations

Ilias Chalkidis, Manos Fergadiotis, Nikolaos Manginas and Prodromos Malakasiotis

In the Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2021), (Held online due to COVID-19), April 21–23, 2021

Article Arxiv pre-print Dataset
Using Textual Analysis to Identify Merger Participants: Evidence from the U.S. Banking Industry

Apostolos Katsafados, Ion Androutsopoulos, Ilias Chalkidis, Manos Fergadiotis, George Leledakis and Emmanouil Pyrgiotakis

In Finance Research Letters, January 26, 2021.

Article SSRN pre-print

-2020

Layer-wise Guided Training for BERT: Learning Incrementally Refined Document Representations

Nikolaos Manginas, Ilias Chalkidis and Prodromos Malakasiotis

In the Proceedings of the Workshop on Structured Prediction for NLP (SPNLP 2020) - co-located with EMNLP 2020, (Held online due to COVID-19), November 16–20, 2020

Article Arxiv pre-print
An Empirical Study on Large-Scale Multi-Label Text Classification including Few and Zero-Shot Labels

Ilias Chalkidis, Manos Fergadiotis, Sotiris Kotitsas, Prodromos Malakasiotis, Nikolaos Aletras and Ion Androutsopoulos

In the Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2020), (Held online due to COVID-19), November 16–20, 2020.

Article Arxiv pre-print Code
LEGAL-BERT: "The Muppets straight out of Law School"

Ilias Chalkidis, Manos Fergadiotis, Prodromos Malakasiotis, Nikolaos Aletras and Ion Androutsopoulos

In Findings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2020), (Held online due to COVID-19), November 16–20, 2020.

Article Arxiv pre-print 🤗 Models
Greek-BERT: The Greeks Visiting Sesame Street

John Koutsikakis, Ilias Chalkidis, Prodromos Malakasiotis and Ion Androutsopoulos

In the Proceedings of the 11th Hellenic Conference on Artificial Intelligence (SETN 2020), Athens, Greece, September 2-4, 2020.

Article Arxiv pre-print 🤗 Models

-2019

Neural Contract Element Extraction Revisited

Ilias Chalkidis, Manos Fergadiotis, Prodromos Malakasiotis and Ion Androutsopoulos

In the Proceedings of the Document Intelligence Workshop - co-located with NeurIPS 2019, Vancouver, Canada, December 8-14, 2019.

Article Arxiv pre-print Poster
Large-Scale Multi-Label Text Classification on EU Legislation

Ilias Chalkidis, Manos Fergadiotis, Prodromos Malakasiotis and Ion Androutsopoulos

In the Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019) (Short Papers), Florence, Italy, July 28 - August 2, 2019.

Article Arxiv pre-print Poster Code Dataset
Neural Legal Judgment Prediction in English

Ilias Chalkidis, Ion Androutsopoulos and Nikolaos Aletras

In the Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019) (Short Papers), Florence, Italy, July 28 - August 2, 2019.

Article Arxiv pre-print Dataset
Extreme Multi-Label Legal Text Classification: A case study in EU Legislation

Ilias Chalkidis, Manos Fergadiotis, Prodromos Malakasiotis, Nikolaos Aletras and Ion Androutsopoulos

In the Proceedings of the Workshop on Natural Legal Language Processing - co-located with NAACL-HLT 2019, Minneapolis, USA, June 2-7, 2019.

Article Arxiv pre-print
Towards a Decentralized, Trusted, Intelligent and Linked Public Sector: A Report from the Greek Trenches

Themis Beris, Iosif Angelidis, Ilias Chalkidis, Charalampos Nikolaou, Christos Papaloukas, Panagiotis Soursos and Manolis Koubarakis

In the Proceedings of the Workshop on Linked Data on the Web and its Relationship with Distributed Ledgers (LDOW/LDDL) of the Web Conference 2019, San Francisco, USA, May 13-17, 2019.

Article Arxiv pre-print

-2018

Named Entity Recognition, Linking and Generation for Greek Legislation

Iosif Angelidis, Ilias Chalkidis and Manolis Koubarakis

In the Proceedings of the 31st International Conference on Legal Knowledge and Information Systems (JURIX 2018), Groningen, The Netherlands, December 12-14, 2018.

Article Arxiv pre-print
Deep Learning in Law: Early Adaptation and Legal Word Embeddings Trained on Large Corpora

Ilias Chalkidis and Dimitrios Kampas

In Artificial Intelligence and Law, Volume 27:2, Pages 171–198, Springer, December 11, 2018.

Article Dataset
Obligation and Prohibition Extraction Using Hierarchical RNNs

Ilias Chalkidis, Ion Androutsopoulos and Achilleas Michos

In the Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018) (Short Papers), Melbourne, Australia, July 15-20, 2018. Pages 254–259.

Article Arxiv pre-print Poster
The Predictions of Bank Merger Targets and Acquirers with Textual Analysis

George Leledakis, Apostolos Katsafados, Ion Androutsopoulos, Ilias Chalkidis, Emmanouel Fergadiotis

In the Proceedings of the 9th National Conference of the Financial Engineering and Banking Society (FEBS 2018), Athens, Greece, 2018

-2017

The Effect of Textual Information on IPO Underpricing

Ion Androutsopoulos, Ilias Chalkidis, Emmanouel Fergadiotis, Apostolos Katsafados, George Leledakis and Alexandra Ntetsika

In the Proceedings of the 8th National Conference of the Financial Engineering and Banking Society (FEBS 2017), Athens, Greece, December 18-19, 2017.
A Deep Learning Approach to Contract Element Extraction

Ilias Chalkidis and Ion Androutsopoulos

In the Proceedings of the 30th International Conference on Legal Knowledge and Information Systems (JURIX 2017), Luxembourg City, Luxembourg, December 13-15, 2017.

Article Arxiv pre-print
Extracting Contract Elements

Ilias Chalkidis, Ion Androutsopoulos and Achilleas Michos

In the Proceedings of the 16th International Conference on Artificial Intelligence and Law (ICAIL 2017), London, UK, June 12–16, 2017. Pages 19–28.

Article Arxiv pre-print
Modeling and Querying Greek Legislation using Semantic Web Technologies

Ilias Chalkidis, Charalampos Nikolaou, Panagiotis Soursos and Manolis Koubarakis

In the Proceedings of the 14th European Semantic Web Conference (ESWC 2017), Portorož, Slovenia, May 28 - June 1, 2017. Pages 591–606.

Article Arxiv pre-print

Service

Organizing Committee

NLLP: 2022, 2023, 2024

Program Committee

NLLP: 2020, 2021

WNUT: 2021

AI4LEGAL: 2020, 2021, 2022

DeeLIO: 2021, 2022

Reviewing

Conferences

ARR: 2021, 2022, 2023, 2024 (Action Editor / Area Chair)

EMNLP: 2020 (Outstanding Reviewer), 2021, 2022, 2023

ACL: 2020, 2021, 2022, 2023

COLM: 2024, 2025

EACL: 2023

NAACL: 2020, 2021

NeurIPS: 2023

Journals

AI & Law Journal (Member of the Editorial Board)

Advances in Data Science and Artificial Intelligence for Legal Research and Applications (Member of the Editorial Board)

Transactions of the Association for Computational Linguistics

Nature

Machine Learning

PeerJ

Computer Speech & Language

Philosophical Transactions of the Royal Society

Invited Talks / Guest Lectures

Investigate the EU political spectrum through the lens of LLMs.

Invited Talk @ JRC DISINFO Workshop, Joint Research Centre - European Commission, Ispra, Italy, 26 September 2024
Investigate the EU political spectrum through the lens of LLMs.

Invited Talk @ NLP Group Talk Series, Athens University of Economics and Business, Athens, Greece, 3 December 2023
Recurrent Neural Networks (RNNs) / Transformers

Guest Lectures @ Advanced Topics in Deep Learning, Department of Computer Science, University of Copenhagen, Denmark, 15-16 May 2023
Unleashing the potential of legal-oriented Language Models one step at a time…

Invited Talk @ AI Thomson Reuters Invited Speaker Series, Zug, Switzerland, 1 February 2023
Regulatory Compliance through Doc2Doc Matching: A case study on EU-UK transpositions

Invited Talk @ Bridging the gap between Logic and NLP-based Approaches for Automating Regulatory Compliance, Law & Tech Lab, Maastricht University, 13 September 2022
A short NLP story from bag-of-words to the Muppet Show (and desiderata for trustworthy assistive legal NLP technologies)

Guest Lecture @ Information Retrieval and Data Mining, University College London, 3 March 2022
Transforming law with augmented lawyering: Advances and challenges in legal text processing (Oriented to NLP audience)

Invited Talk @ Digital Humanities Seminar Series, University of Wolverhampton, 26 January 2022
Transforming law with augmented lawyering: Advances and challenges in legal text processing (Oriented to Law audience)

Invited Talk @ AI & Law Conference (co-hosted by recode.law, and Oxford Fintech & Legaltech Society), 20 January 2022
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding - NLP in the era of Muppets

Guest Lecture @ NLP Seminars (Universität Bern), 3 December 2021
Paragraph-level Rationale Extraction through Regularization: A case study on European Court of Human Rights Cases

Invited Talk @ NLLP Talk Series, 16 April 2021
Large-scale Multi-label Text Classification (LMTC): Labelling documents with hierarchically organized labels from taxonomies

Invited Talk @ AI4LEGAL Workshop, ISWC, 2 November 2020
Large-scale Multi-Label Text Classification on EU Legislation

Invited Talk @ Maastricht Law and Tech Lab Launching Event, 17 October 2019
Natural Language Processing in the Deep Learning Era: The story so far…

Invited Talk @ 9th PyData Athens Meetup, 23 April 2019

Press

Popular chatbot is a politically left-leaning EU supporter

Maria Hornbek, KU News (Denmark), 2024
Gluing together a benchmark for AI Natural Language Legal Understanding

Lance Eliot, The Daily Journal (USA), 2021
AI learns to predict the outcomes of human rights court cases

Donna Lu, New Scientist (UK), 2019