Thanks! + Final assignment guidance
Written on 25.02.2025 15:28 by Kate McCurdy
Firstly, let me say thank you to those of you who participated in last week's seminar sessions, for your deep attention to the materials and excellent presentations and discussions! I've now finished reviewing the reading summaries, which reflect the same high quality of engagement. It's a real pleasure to teach such sharp and motivated students.
Those of you taking the seminar for 4 credits should already have official grades available in LSF; I submitted them a few hours ago. If this is not the case, please email and let me know, and I'll try to address it.
The rest of you are taking the seminar for 7 credits, and will prepare a final project to be submitted by Friday, March 28. As promised, I will include some additional guidance.
Here is the project description from the intro session slides:
- Case study: pick a language from UniMorph and compare two different approaches to modeling its inflection system
- Can select a specific focus element (e.g. verbal morphology, past tense) or overall
- Can choose any type of modeling approach - most important: how you motivate and analyze the experiment
- Maximum 6 pages using the ACL paper template
I've added some more concrete guidance below. If you still have questions, feel free to drop me an email.
Expectations: While the computational element is important (see the point on methods below), your evaluation will also strongly depend upon the written component of the paper. I will be looking for elements such as:
- Motivation: why did you make the high-level choices you did? What is interesting about the morphology of the language that you selected, and why are the computational models that you selected appropriate to model these morphological phenomena?
- Experiment design + methods: your experiment should follow standard machine learning practice, i.e. separate data sets to train a model, validate any hyperparameter selection, and test for accuracy. As *reproducibility* is a core aspect of the scientific method, your description should thoroughly cover any steps that another researcher would need to know to reproduce your findings. I also expect that you will sanity-check your implementation to ensure there are no bugs; if your numbers look very implausible, I will have questions about your methods.
- Results: key results should be clearly communicated in tables or plots. Accuracy on the test data set should always be reported, and you can also report any other result of interest - for example, you may want to compare accuracy on different subgroups (e.g. nouns versus verbs, different inflection classes, or any category relevant to modeling the language you selected). You may also want to include a more detailed *error analysis* of the results, such as giving examples that your model predicts incorrectly, and discussing possible reasons.
- Optional - related literature: this paper is primarily about modeling, so I do not expect a detailed literature review. However, relevant sources should be cited, such as UniMorph and any models that you use. Similarly, if your approach is informed by other papers - either papers we read in the seminar, or outside literature - then citing those sources would be appropriate, and would likely positively contribute to my evaluation of your submission. In particular, if your language has appeared in any of the SIGMORPHON benchmarks, then it would be useful to compare your results to those reported with other modeling approaches.
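As a rough illustration of the results reporting described above, here is a minimal Python sketch that computes exact-match accuracy on a test set, breaks it down by subgroup, and collects incorrect predictions for error analysis. Everything in it is an assumption made for illustration: the function names, the (lemma, form, features) tuple layout, the invented toy examples, and the idea that the part of speech is the first `;`-separated tag in the feature string. Check these against your actual data before relying on them.

```python
from collections import defaultdict


def exact_match_accuracy(examples, predictions):
    """Share of predictions that match the gold inflected form exactly."""
    if len(examples) != len(predictions):
        raise ValueError("examples and predictions must have the same length")
    if not examples:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(gold == pred for (_, gold, _), pred in zip(examples, predictions))
    return correct / len(examples)


def accuracy_by_group(examples, predictions, group_of):
    """Accuracy per subgroup, where group_of maps a feature string to a label."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for (_, gold, feats), pred in zip(examples, predictions):
        group = group_of(feats)
        totals[group] += 1
        hits[group] += int(gold == pred)
    return {group: hits[group] / totals[group] for group in totals}


def collect_errors(examples, predictions):
    """List (lemma, features, gold, predicted) for every wrong prediction."""
    return [
        (lemma, feats, gold, pred)
        for (lemma, gold, feats), pred in zip(examples, predictions)
        if gold != pred
    ]


def part_of_speech(feats):
    """Assumption: the first ';'-separated tag is the part of speech."""
    return feats.split(";")[0]


# Toy English data, invented for illustration only (not real model output).
toy_test = [
    ("walk", "walked", "V;PST"),
    ("sing", "sang", "V;PST"),
    ("cat", "cats", "N;PL"),
    ("ox", "oxen", "N;PL"),
]
toy_predictions = ["walked", "singed", "cats", "oxes"]

print(exact_match_accuracy(toy_test, toy_predictions))
print(accuracy_by_group(toy_test, toy_predictions, part_of_speech))
for error in collect_errors(toy_test, toy_predictions):
    print(error)
```

Passing a different `group_of` function (for instance, one that maps each lemma's features to an inflection class) would give the other breakdowns mentioned above, and the error list is a natural starting point for discussing why a model fails on particular forms.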
Resources: Previous SIGMORPHON shared tasks will have many valuable resources that you can draw on.
- Data: The 2017 shared task released train, dev, and test splits for more than 40 languages.
- Example papers: System descriptions from shared tasks illustrate how you might go about writing up your model and analysis.
- Models: SIGMORPHON shared tasks also typically release baseline models (e.g. 2017, 2018) which may give a useful starting point. Another recent useful source is this code repo for Weissweiler et al. (2023), who compare a range of morphological baseline systems to ChatGPT.
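If your language is not covered by ready-made splits, one possible way to create reproducible train, dev, and test sets yourself is sketched below in Python. It assumes each data line holds three tab-separated fields (lemma, inflected form, feature string); please verify this against the files you actually download. The function names, the split fractions, the fixed seed, and the decision to keep all forms of a lemma in the same split are illustrative assumptions, not requirements. Whether to split by lemma or by individual form is a design choice worth motivating in your paper.

```python
import random


def parse_triples(lines):
    """Parse lines assumed to be 'lemma<TAB>form<TAB>features'."""
    triples = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        parts = line.split("\t")
        if len(parts) != 3:
            raise ValueError(f"expected 3 tab-separated fields: {line!r}")
        triples.append(tuple(parts))
    return triples


def split_by_lemma(triples, dev_fraction=0.1, test_fraction=0.1, seed=0):
    """Assign whole lemmas to train/dev/test, using a fixed seed."""
    lemmas = sorted({lemma for lemma, _, _ in triples})
    if len(lemmas) < 3:
        raise ValueError("need at least 3 distinct lemmas for three splits")
    rng = random.Random(seed)  # local generator, so the split is repeatable
    rng.shuffle(lemmas)
    n_test = max(1, round(len(lemmas) * test_fraction))
    n_dev = max(1, round(len(lemmas) * dev_fraction))
    test_lemmas = set(lemmas[:n_test])
    dev_lemmas = set(lemmas[n_test:n_test + n_dev])
    splits = {"train": [], "dev": [], "test": []}
    for triple in triples:
        lemma = triple[0]
        if lemma in test_lemmas:
            splits["test"].append(triple)
        elif lemma in dev_lemmas:
            splits["dev"].append(triple)
        else:
            splits["train"].append(triple)
    return splits


# Toy lines, invented for illustration only.
toy_lines = [
    "walk\twalked\tV;PST",
    "walk\twalking\tV;V.PTCP;PRS",
    "sing\tsang\tV;PST",
    "sing\tsinging\tV;V.PTCP;PRS",
    "cat\tcats\tN;PL",
    "ox\toxen\tN;PL",
    "dog\tdogs\tN;PL",
]
toy_splits = split_by_lemma(parse_triples(toy_lines), seed=0)
for name, items in toy_splits.items():
    print(name, len(items))
```

Reporting the seed and split sizes in the paper would let another researcher recreate the same partition.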
Best of luck with the paper, and I look forward to seeing your results!