In April 2016 Manchester eScholar was replaced by the University of Manchester’s new Research Information Management System, Pure. In the autumn the University’s research outputs will be available to search and browse via a new Research Portal. Until then the University’s full publication record can be accessed via a temporary portal and the old eScholar content is available to search and browse via this archive.

Lexical Simplification: Optimising the Pipeline

Shardlow, Matthew

[Thesis]. Manchester, UK: The University of Manchester; 2015.

Access to files

Abstract

Introduction: This thesis was submitted by Matthew Shardlow to the University of Manchester for the degree of Doctor of Philosophy (PhD) in the year 2015. Lexical simplification is the practice of automatically increasing the readability and understandability of a text by identifying problematic vocabulary and substituting easy to understand synonyms. This work describes the research undertaken during the course of a 4-year PhD. We have focused on the pipeline of operations which string together to produce lexical simplifications. We have identified key areas for research and allowed our results to influence the direction of our research. We have suggested new methods and ideas where appropriate.Objectives: We seek to further the field of lexical simplification as an assistive technology. Although the concept of fully-automated error-free lexical simplification is some way off, we seek to bring this dream closer to reality. Technology is ubiquitous in our information-based society. Ever-increasingly we consume news, correspondence and literature through an electronic device. E-reading gives us the opportunity to intervene when a text is too difficult. Simplification can act as an augmentative communication tool for those who find a text is above their reading level. Texts which would otherwise go unread would become accessible via simplification.Contributions: This PhD has focused on the lexical simplification pipeline. We have identified common sources of errors as well as the detrimental effects of these errors. We have looked at techniques to mitigate the errors at each stage of the pipeline. We have created the CW Corpus, a resource for evaluating the task of identifying complex words. We have also compared machine learning strategies for identifying complex words. We propose a new preprocessing step which yields a significant increase in identification performance. We have also tackled the related fields of word sense disambiguation and substitution generation. We evaluate the current state of the field and make recommendations for best practice in lexical simplification. Finally, we focus our attention on evaluating the effect of lexical simplification on the reading ability of people with aphasia. We find that in our small-scale preliminary study, lexical simplification has a nega- tive effect, causing reading time to increase. We evaluate this result and use it to motivate further work into lexical simplification for people with aphasia.

Bibliographic metadata

Type of resource:
Content type:
Form of thesis:
Type of submission:
Degree type:
Doctor of Philosophy
Degree programme:
PhD Computer Science (CDT)
Publication date:
Location:
Manchester, UK
Total pages:
158
Abstract:
Introduction: This thesis was submitted by Matthew Shardlow to the University of Manchester for the degree of Doctor of Philosophy (PhD) in the year 2015. Lexical simplification is the practice of automatically increasing the readability and understandability of a text by identifying problematic vocabulary and substituting easy to understand synonyms. This work describes the research undertaken during the course of a 4-year PhD. We have focused on the pipeline of operations which string together to produce lexical simplifications. We have identified key areas for research and allowed our results to influence the direction of our research. We have suggested new methods and ideas where appropriate.Objectives: We seek to further the field of lexical simplification as an assistive technology. Although the concept of fully-automated error-free lexical simplification is some way off, we seek to bring this dream closer to reality. Technology is ubiquitous in our information-based society. Ever-increasingly we consume news, correspondence and literature through an electronic device. E-reading gives us the opportunity to intervene when a text is too difficult. Simplification can act as an augmentative communication tool for those who find a text is above their reading level. Texts which would otherwise go unread would become accessible via simplification.Contributions: This PhD has focused on the lexical simplification pipeline. We have identified common sources of errors as well as the detrimental effects of these errors. We have looked at techniques to mitigate the errors at each stage of the pipeline. We have created the CW Corpus, a resource for evaluating the task of identifying complex words. We have also compared machine learning strategies for identifying complex words. We propose a new preprocessing step which yields a significant increase in identification performance. We have also tackled the related fields of word sense disambiguation and substitution generation. We evaluate the current state of the field and make recommendations for best practice in lexical simplification. Finally, we focus our attention on evaluating the effect of lexical simplification on the reading ability of people with aphasia. We find that in our small-scale preliminary study, lexical simplification has a nega- tive effect, causing reading time to increase. We evaluate this result and use it to motivate further work into lexical simplification for people with aphasia.
Additional digital content not deposited electronically:
None
Non-digital content not deposited electronically:
None
Thesis main supervisor(s):
Thesis co-supervisor(s):
Language:
en

Institutional metadata

University researcher(s):

Record metadata

Manchester eScholar ID:
uk-ac-man-scw:278088
Created by:
Shardlow, Matthew
Created:
16th November, 2015, 12:14:14
Last modified by:
Shardlow, Matthew
Last modified:
16th November, 2017, 14:24:44

Can we help?

The library chat service will be available from 11am-3pm Monday to Friday (excluding Bank Holidays). You can also email your enquiry to us.