Dr Theodorus Fransen

BA, MA, M.Phil., Ph.D.

Contact Details

Postdoctoral researcher
E: theodorus.fransen@nuigalway.ie
 
 

Biography

I hold an M.Phil. in Speech and Language Processing and a Ph.D. in Computational Linguistics from Trinity College Dublin, where I was affiliated with the Irish Speech and Language Technology Research Centre (School of Linguistic, Speech and Communication Sciences). My doctoral thesis (2019) is entitled “Past, present and future: Computational approaches to mapping Old and Modern Irish cognate verb forms”. Before that, I studied at Utrecht University, the Netherlands, where I obtained a BA in Linguistics and both a BA and an MA in Celtic Languages and Culture. My teaching experience includes undergraduate courses in statistics and computational morphology.

I am currently a Postdoctoral Researcher on the Irish Research Council-funded Cardamom project (Comparative deep models for minority and historical languages), led by Dr. John McCrae, in the Unit for Linguistic Data, Data Science Institute / Insight Centre for Data Analytics, National University of Ireland, Galway. While my focus so far has been on finite-state two-level morphology for Old Irish (c. 700–900 A.D.), I have a growing interest in the broader area of Natural Language Processing (NLP) for historical texts and languages, and in the interface between NLP, lexicography and Digital Humanities. I am also actively involved in creating language resources and tools for under-resourced historical and minority languages, particularly in the Irish context.

Google Scholar
GitHub: ThFransen84
ORCID: 0000-0001-5639-8626
Twitter: @ThFransen

Book Chapters

Fransen, Theodorus (2020) 'Automatic morphological analysis and interlinking of historical Irish cognate verb forms'. In: Morphosyntactic Variation in Medieval Celtic Languages: Corpus-based approaches. Berlin: De Gruyter.

Conference Publications

Rani, Priya; Suryawanshi, Shardul; Goswami, Koustava; Chakravarthi, Bharathi Raja; Fransen, Theodorus; McCrae, John Philip (2020) 'A Comparative Study of Different State-of-the-Art Hate Speech Detection Methods for Hindi-English Code-Mixed Data'. In: Language Resources and Evaluation (LREC) 2020, pp. 42-48.
Goswami, Koustava; Rani, Priya; Chakravarthi, Bharathi Raja; Fransen, Theodorus; McCrae, John Philip (2020) 'ULD@NUIG at SemEval-2020 Task 9: Generative Morphemes with an Attention Model for Sentiment Analysis in Code-Mixed Text'. In: The 28th International Conference on Computational Linguistics (COLING 2020), Barcelona, 12-13 December 2020.
McCrae, John Philip; Fransen, Theodorus (2019) 'Cardamom: Comparative Deep Models for Minority and Historical Languages'. In: International Conference Language Technologies for All (LT4All): Enabling Linguistic Diversity and Multilingualism Worldwide, Paris, France, 5-6 December 2019.

Magazine Article

Fransen, Theodorus (2020) 'Rekenen met taal: computationele taalkunde en historisch Iers' [Computing with language: computational linguistics and historical Irish]. Magazine article.