  ACL-IJCNLP 2009 Workshop:
  Language Generation and Summarisation (UCNLG+Sum)

Call for papers



Workshop aims

There are many branches of NLP research which involve the generation of language (summarisation, MT, human-computer dialogue, application front-ends, data-to-text generation, document authoring, etc.). However, it is not always easy to identify common ground among the generation components of these application areas, which has sometimes made it difficult for generic research in 'Natural Language Generation' (NLG) to engage with them effectively. Recent advances in corpus-based approaches (both manual and automatic) across many of these areas, and in particular in NLG itself, offer a new perspective on this problem and the opportunity to explore synergies and differences from the secure common grounding of corpus data.

This workshop is the third in an occasional series seeking to exploit this opportunity by providing a forum for discussing NLG and its links with these closely related fields from a corpus-oriented perspective. The workshops in the series share three general aims:

  • to provide a forum for reporting and discussing corpus-oriented methods for generating language;
  • to foster cross-fertilisation between NLG and related fields by looking for common ground through corpus-oriented approaches;
  • to promote the sharing of data and methods in all language generation research.

Each of these workshops has a special theme: at the first workshop (at Corpus Linguistics in 2005) it was the use of corpora in NLG, at the second (at MT Summit in 2007) it was Language Generation and Machine Translation. The special theme of the 2009 workshop is Language Generation and Summarisation.

There are two basic approaches to text summarisation: abstractive, where texts are analysed, the internal representations are pruned, and a more condensed version is regenerated, and extractive, where key passages of the input texts themselves are identified and then 'glued together' to form a shorter text. Extractive summarisation is less dependent on fragile 'deep' analysis and regeneration techniques, but tends to produce summaries that are not very coherent and whose referring expressions are not very clear (so, for example, they often score low on DUC human assessment criteria such as 'discourse coherence' and 'referential clarity').

The relevance of NLG techniques to abstractive summarisation is clear, but recently there has also been increasing interest in regeneration as a post-process for extractive summaries. Work by Otterbacher et al., Steinberger et al. and Nenkova et al., for example, shows how regeneration of (parts of) extractive summaries can increase their coherence, referential clarity or fluency. At the same time, NLG researchers are investigating techniques that could be used to improve extractive summaries by regenerating them (in particular in the subfield of referring expression generation; see for example the GREC Task papers at INLG 2008).

The core aim of this workshop is to provide a forum for NLG and summarisation researchers to examine the similarities and differences between their current approaches to generating language, and to explore the potential for cross-fertilisation. To this end, the workshop will include:

This will be supported by a programme of technical papers, on all aspects of using corpora in the generation of language, with a particular interest in relevance to text summarisation. Specific topics include, but are not limited to:

  • generation techniques in abstractive summarisation
  • regeneration/rewriting/post-processing techniques for extractive summarisation
  • generation of references to named entities in discourse context
  • annotating corpora for language generation and summarisation
  • uses of corpora in the evaluation of language generation and summarisation systems
  • reuse of corpus resources developed for NLU (e.g. treebanks) in language generation and summarisation
  • domain-specific vs. general-purpose corpora for language generation and summarisation
  • statistical approaches to language generation
  • machine learning methods for language generation

Key workshop facts

Invited speaker:

Kathy McKeown, Columbia University, USA


Panel members:

Regina Barzilay, MIT
Ed Hovy, ISI
Kathy McKeown, Columbia
Donia Scott, Open University


Organisers:

Anja Belz, University of Brighton, UK
Sebastian Varges, University of Trento, Italy
Roger Evans, University of Brighton, UK

Programme committee:

Enrique Alfonseca, Google Zurich, Switzerland
Srinivas Bangalore, AT&T, USA
Robert Dale, Macquarie University, Australia
Daniel Marcu, ISI, University of Southern California, USA
Chris Mellish, University of Aberdeen, UK
Ani Nenkova, University of Pennsylvania, USA
Amanda Stent, SUNY, USA
Michael Strube, EML Research, Germany
Stephen Wan, Macquarie University, Australia
Mike White, Ohio State University, USA
Jianguo Xiao, Peking University, China

Important dates

1 May 2009: Deadline for paper submissions
3 Jun 2009: Notification of acceptance
10 Jun 2009: Camera-ready copies due
30 Jun 2009: Early registration deadline
6 Aug 2009: UCNLG+Sum workshop in Singapore

Accepted papers

Keynote paper:

Kathy McKeown
Query-focused Summarization Using Text-to-Text Generation: When Information Comes from Multilingual Sources

Long papers:

Karolina Owczarzak and Hoa Trang Dang
Evaluation of automatic summaries: Metrics under varying data conditions

Horacio Saggion
A Classification Algorithm for Predicting the Structure of Summaries

Jackie Chi Kit Cheung, Giuseppe Carenini and Raymond Ng
Optimization-based Content Selection for Opinion Summarization

Wei Xu and Ralph Grishman
A Parse-and-Trim Approach with Information Significance for Chinese Sentence Compression

Hideki Tanaka, Akinori Kinoshita, Takeshi Kobayakawa, Tadashi Kumano and Naoto Katoh
Syntax-Driven Sentence Revision for Broadcast News Summarization

João Cordeiro, Gaël Dias and Pavel Brazdil
Unsupervised Induction of Sentence Compression Rules

Short papers:

Stephanie Schuldes, Michael Roth, Anette Frank and Michael Strube
Creating an Annotated Corpus for Generating Walking Directions

Iris Hendrickx, Walter Daelemans, Erwin Marsi and Emiel Krahmer
Reducing Redundancy in Multi-document Summarization Using Lexical Semantic Similarity

Maria Fernanda Caropreso, Diana Inkpen, Shahzad Khan and Fazel Keshtkar
Visual Development Process for Automatic Generation of Digital Games Narrative Content

Mohit Kumar, Dipanjan Das, Sachin Agarwal and Alexander Rudnicky
Non-textual Event Summarization by Applying Machine Learning to Template-based Language Generation

Site News

22 Jul 2009
Regina Barzilay confirmed as additional panel member.

13 Jul 2009
Full workshop programme now available on conference website.
Final draft proceedings now available.

17 Jun 2009
Call for Participation, details of keynote presentation and panel members added.

15 Jun 2009
On-line registration available - earlybird deadline 30 Jun 2009.

5 Jun 2009
List of Accepted papers published.

6 May 2009
Schedule for notification and camera-ready copy (CRC) deadlines revised; see Important dates.

14 Apr 2009
2nd call for papers

30 Jan 2009
Call for papers published

13 Jan 2009
Workshop date now confirmed as 6 August 2009

30 Sep 2008
Workshop initial announcement

UCNLG+Sum is organised by Anja Belz and Roger Evans, NLTG, University of Brighton, and Sebastian Varges, DIT, University of Trento, and is endorsed by SIGGEN, the special interest group on generation of the Association for Computational Linguistics (ACL).

Last modified: 22 Jul 2009