

A fast and sound tagging method for discontinuous named-entity recognition

Caio Corro

SaulLM-7B: A pioneering large language model for law

Pierre Colombo, Telmo Pessoa Pires, Malik Boudiaf, Dominic Culver, Rui Melo, Caio Corro, André F. T. Martins, Fabrizio Esposito, Vera Lúcia Raposo, Sofia Morgado, Michael Desa

CroissantLLM: A Truly Bilingual French-English Language Model

Manuel Faysse, Patrick Fernandes, Nuno Guerreiro, António Loison, Duarte Alves, Caio Corro, Nicolas Boizard, João Alves, Ricardo Rei, Pedro Martins, Antoni Bigata Casademunt, François Yvon, André Martins, Gautier Viaud, Céline Hudelot, Pierre Colombo

Discrete latent structure in neural networks

Vlad Niculae, Caio F. Corro, Nikita Nangia, Tsvetomila Mihaylova, André F. T. Martins

Publications & Technical Reports


Sparse logistic regression with high-order features for automatic grammar rule extraction from treebanks

Santiago Herrera, Caio Corro, Sylvain Kahane
LREC-COLING 2024 - Joint International Conference on Computational Linguistics, Language Resources and Evaluation
paper - code - extracted rules

Régression logistique parcimonieuse pour l'extraction automatique de règles de grammaire

Santiago Herrera, Caio Corro, Sylvain Kahane
TALN 2024 - Conférence sur le Traitement Automatique des Langues Naturelles

Actes de la journée d’étude sur le traitement automatique des langues frugal et la recherche d'information frugale

Caio Corro, Gaël Lejeune


Structural generalization in COGS: Supertagging is (almost) all you need

Alban Petit, Caio Corro, François Yvon
EMNLP 2023 - Conference on Empirical Methods in Natural Language Processing

A dynamic programming algorithm for span-based nested named-entity recognition in \(\mathcal O(n^2)\)

Caio Corro
ACL 2023 - Annual Meeting of the Association for Computational Linguistics

On graph-based reentrancy-free semantic parsing

Alban Petit, Caio Corro
TACL 2023 - Transactions of the Association for Computational Linguistics

On the inconsistency of separable losses for structured prediction

Caio Corro
EACL 2023 - Conference of the European Chapter of the Association for Computational Linguistics


Actes de la journée d’étude sur la robustesse des systèmes de TAL (Robustal 2022)

Caio Corro, Gaël Lejeune

Un algorithme d'analyse sémantique fondée sur les graphes via le problème de l'arborescence généralisée couvrante

Alban Petit, Caio Corro
TALN 2022 - Conférence sur le Traitement Automatique des Langues Naturelles

Ré-ordonnancement via programmation dynamique pour l'adaptation cross-lingue d'un analyseur en dépendances

Nicolas Devatine, Caio Corro, François Yvon
TALN 2022 - Conférence sur le Traitement Automatique des Langues Naturelles
paper - slides

GPU-Accelerated Forward-Backward algorithm with Application to Lattice-Free MMI

Lucas Ondel, Léa-Marie Lam-Yee-Mui, Martin Kocour, Caio Filippo Corro, Lukáš Burget
ICASSP 2022 - IEEE International Conference on Acoustics, Speech and Signal Processing


Preventing posterior collapse in variational autoencoders for text generation via decoder regularization

Alban Petit, Caio Corro
NeurIPS 2021 Workshop on Deep Generative Models and Downstream Applications

Auto-encodeurs variationnels : contrecarrer le problème de posterior collapse grâce à la régularisation du décodeur

Alban Petit, Caio Corro
TALN 2021 - Conférence sur le Traitement Automatique des Langues Naturelles


Span-based discontinuous constituency parsing: a family of exact chart-based algorithms with time complexities from \(\mathcal O(n^6)\) down to \(\mathcal O(n^3)\)

Caio Corro
EMNLP 2020 - Conference on Empirical Methods in Natural Language Processing

Sur l'impact des contraintes structurelles pour l'analyse en dépendances profondes fondée sur les graphes

Caio Corro
TALN 2020 - Conférence sur le Traitement Automatique des Langues Naturelles
paper - code


Learning Latent Trees with Stochastic Perturbations and Differentiable Dynamic Programming

Caio Corro, Ivan Titov
ACL 2019 - Annual Meeting of the Association for Computational Linguistics
paper - poster (landscape) - poster (portrait)

Differentiable Perturb-and-Parse: Semi-Supervised Parsing with a Structured Variational Autoencoder

Caio Corro, Ivan Titov
ICLR 2019 - Seventh International Conference on Learning Representations
paper - poster


Lagrangian Based Approaches for Lexicalized Tree Adjoining Grammar Parsing

Caio Corro
PhD thesis
pdf - slides


Efficient Discontinuous Phrase-Structure Parsing via the Generalized Maximum Spanning Arborescence

Caio Corro, Joseph Le Roux, Mathieu Lacroix
EMNLP 2017 - Conference on Empirical Methods in Natural Language Processing
paper - poster

Transforming Dependency Structures to LTAG Derivation Trees

Caio Corro, Joseph Le Roux
TAG+ 2017 - 13th International Workshop on Tree-Adjoining Grammar and Related Formalisms
paper - slides


Dependency Parsing with Bounded Block Degree and Well-nestedness via Lagrangian Relaxation and Branch-and-Bound

Caio Corro, Joseph Le Roux, Mathieu Lacroix, Antoine Rozenknop, Roberto Wolfler Calvo
ACL 2016 - Annual Meeting of the Association for Computational Linguistics
paper - slides

Méthode lagrangienne pour les arborescences couvrantes avec application en traitement automatique des langues

Caio Corro, Joseph Le Roux, Mathieu Lacroix, Antoine Rozenknop, Roberto Wolfler Calvo