
Dr Alistair Willis
Senior Lecturer in Computing
School of Computing & Communications
Biography
Professional biography
Alistair Willis is a Senior Lecturer in Computing at the Open University. He holds a BA in Physics and Philosophy from the University of Cambridge, an MSc in Artificial Intelligence from the University of Edinburgh, and a DPhil in Computer Science from the University of York. After his DPhil, Alistair worked in the software engineering group at Philips Research Laboratories on automatic software testing, before joining the Open University's Computing Department in 2003.
Alistair leads the Artificial Intelligence and Natural Language Processing research group in the School of Computing and Communications, and leads the Data Science theme of the Institute of Coding.
Research interests
Alistair's research focuses on Natural Language Processing (NLP), and in particular on how people interpret ambiguous text. At the Open University, he has investigated how to build computational models that predict when different people will disagree on the interpretation of ambiguous text. His work has looked at both the theoretical foundations of the phenomenon and its potential impact in the area of requirements engineering.
He is also interested in the problem of how to recognise semantic similarity between texts. His research in this area applies inductive logic programming to learn systems for automatically grading students' written work. This work is currently being extended to use deep learning models for grading, as well as traditional symbolic methods.
The methods underlying these tasks are generalisable beyond NLP. Alistair has collaborated with computational musicologists on machine learning techniques to support automatic music composition for games, and with social scientists on using social media to understand the audience for the Russian television network RT.
Teaching interests
Alistair currently chairs the final-year module Data Management and Analysis. This gives students a holistic view of the data lifecycle, developing a range of technical and discursive skills to enable them to use data to tell a story. By using open datasets, the module also aims to show students how open data can be used to effect change in the world.
Projects
The Cultural Value of Shakespeare Lives 2016
This project aims to evaluate and visualise the impact of the British Council’s social media and digital resources associated with its global programme of events and projects marking the 400th anniversary of Shakespeare's death. The project will: 1. assess the value of the Shakespeare anniversary programme and the impact it has had around the world, especially on how Britain is perceived (whether the UK is seen as creative, welcoming, diverse and innovative); 2. assess the extent to which the Shakespeare Lives (SL) programme encouraged overseas publics to visit, work, do business and study in the UK and to consume UK culture; 3. go beyond quantitative measures and assessments of reach to arrive at a deeper understanding of the quality of international interactions and intercultural dialogue generated by SL; 4. provide a range of big data analyses and forms of evidence, in attractive visual formats, about how and where users have engaged with SL.
Enhanced Technology for Open Intelligent Learning Environments. (XD-10-054-JJ)
Complex systems science is highly interdisciplinary. The silo-based education provided by most universities creates scientists who are expert in one discipline but ignorant of most others. Most PhD programmes require students to learn key ideas from other disciplines relevant to their topic, but knowledge across the community is patchy. This includes basic social science and even the core disciplines of mathematics, statistics, physics and ICT. The Complex Systems Society has identified this as a major hindrance to the development of science. We will develop and use a remarkable new way of automatically creating educational resources. It is scalable, so that the cost of educating large numbers of students is very low. It is adaptive to changes in the curriculum in fast-moving research fields, creating new learning resources as new topics emerge, and it is adaptive to students, creating learning resources that reflect their personal styles of learning, background, and language. Etoile is a big step towards personalised learning.
Virtual Biodiversity Research and Access Network for Taxonomy. (XC-09-072-DM)
Biodiversity science brings information science and technologies to bear on the data and information generated by the study of organisms, their genes, and their interactions. ViBRANT will help focus the collective output of biodiversity science, making it more transparent, accountable, and accessible. Mobilising these data will address global environmental challenges, contribute to sustainable development, and promote the conservation of biological diversity. Through a platform of web-based informatics tools and services we have built a successful data-publishing framework (Scratchpads) that allows distributed groups of scientists to create their own virtual research communities supporting biodiversity science. The infrastructure is highly user-oriented, focusing on the needs of research networks through a flexible and scalable system architecture, offering adaptable user interfaces for the development of various services. In just 28 months the Scratchpads have been adopted by over 120 communities in more than 60 countries, embracing over 1,500 users. ViBRANT will distribute the management, hardware infrastructure and software development of this system and connect with the broader landscape of biodiversity initiatives including PESI, Biodiversity Heritage Library (Europe), GBIF and EoL. The system will also inform the design of the LifeWatch Service Centre and is aligned with the ELIXIR and EMBRC objectives, all part of the ESFRI roadmap. ViBRANT will extend the user base, reaching out to new multidisciplinary communities, including citizen scientists, by offering an enhanced suite of services and functionality.
A data infrastructure to support agricultural scientific communities (XC-10-110-DM)
agINFRA is an Integrated Infrastructure Initiative (I3) project that will try to introduce the agricultural scientific communities into the vision of open and participatory data-intensive science. In particular, agINFRA aims to design and develop a scientific data infrastructure for agricultural sciences that will facilitate the development of policies and the deployment of services that will promote sharing of data among agricultural scientists and develop trust within and among their communities. agINFRA will try to remove existing obstacles concerning the open access to scientific information and data in agriculture, as well as improve the preparedness of agricultural scientific communities to face, manage and exploit the abundance of relevant data that is (or will be) available and can support agricultural research. Ultimately, agINFRA will demonstrate how a data infrastructure for agricultural scientific communities can be set up to facilitate data generation, provenance, quality assessment, certification, curation, annotation, navigation and management.
Publications
Book Chapter
Detecting dangerous coordination ambiguities using word distribution (2007)
The availability of partial scopings in an underspecified semantic representation (2002)
Journal Article
Seeing the smart city on Twitter: Colour and the affective territories of becoming smart (2019)
Developing and evaluating computational models of musical style (2016)
Mapping networks of influence: tracking Twitter conversations through time and space (2015)
The FuturICT education accelerator (2012)
A hybrid model for automatic emotion recognition in suicide notes (2012)
Towards the bibliography of life (2011)
Analysing anaphoric ambiguity in natural language requirements (2011)
Presentation / Conference
Bringing Timbral Shapes to Interactive Music Systems (2023)
Identifying Annotator Bias: A new IRT-based method for bias identification (2020)
Automatically calculating tonal tension (2020)
Towards a Cross-article Narrative Comparison of News (2020)
Developing Students' Written Communication Skills with Jupyter Notebooks (2020)
Agreement is overrated: A plea for correlation to assess human evaluation reliability (2019)
Evaluation methodologies in Automatic Question Generation 2013-2018 (2018)
Rethinking the Agreement in Human Evaluation Tasks (2018)
Search Personalization with Embeddings (2017)
Personalised Query Suggestion for Intranet Search with Temporal User Profiling (2017)
Adverse Drug Reaction Classification With Deep Neural Networks (2016)
Using NLP to support scalable assessment of short free text responses (2015)
Modelling time-aware search tasks for search personalisation (2015)
Temporal latent topic user profiles for search personalisation (2015)
Improving search personalisation with dynamic group formation (2014)
Methodological approaches to the evaluation of game music systems (2014)
Algorithmic music as intelligent game music (2014)
ComTax: community-driven curation for taxonomic databases (2013)
Literature-driven Curation for Taxonomic Name Databases (2013)
Curation tools for taxonomic databases (2013)
A generalised hybrid architecture for NLP (2012)
Speculative requirements: automatic detection of uncertainty in natural language requirements (2012)
Extending Nocuous Ambiguity Analysis for Anaphora in Natural Language Requirements (2010)
Automatic detection of nocuous coordination ambiguities in natural language requirements (2010)
A methodology for automatic identification of nocuous ambiguity (2010)
Using discovered, polyphonic patterns to filter computer-generated music (2010)
Improving search in scanned documents: Looking for OCR mismatches (2009)
On presuppositions in requirements (2009)
Making tacit requirements explicit (2009)
NP coordination in underspecified scope representations (2007)
Identifying nocuous ambiguities in natural language requirements (2006)
Disambiguating coordinations using word distribution information (2005)
Report
Literature Review on Patient-Friendly Documentation Systems (2006)
Can Online Learning Materials Improve Student Access to Digital Libraries? (2005)
Nocuous Ambiguities in Requirements Specifications (2005)
Using a Distributional Thesaurus to Resolve Coordination Ambiguities (2005)
Working Paper
Best Practices in using Technological Infrastructures (2020)