Citizen lexicography: Creating a ‘Word Zoo’ in Canberra
by Sarah Ogilvie

Every day for the past few months, the Sydney linguist Michael Walsh has been sitting in the Mitchell Library poring over old manuscripts. He is extracting old wordlists of Aboriginal languages from the library’s rich collection of early British settler diaries, missionary field notes, and unpublished historical documents for a project funded by the State Library of New South Wales and Rio Tinto. This week, Michael sent me twelve scanned pages of a leather-bound diary he discovered which belonged to Richard Tester, who had recorded his daily adventures in 1860, travelling overland from Kerkaraboo on the Wakefield River to Melbourne and the goldfields.

Tester’s diary describes life in the bush and the goldfields, his encounters with Aboriginal people (including claims of murder and cannibalism around Lake Victoria), and his experience of a corroboree. His handwriting is difficult to read, but the text is extremely rich in early Australian vocabulary, and those twelve pages alone provided over thirty new citations for the next edition of the Australian National Dictionary (AND2). There were earlier examples of words such as wombat (a slow or stupid person) and coon (Aboriginal person), antedated by forty-five years and thirty-nine years, respectively. These are significant antedatings when you consider that Australian English is only two hundred years old. The use of the word coon was particularly surprising, because it showed that it was not originally used in a derogatory way, but in fact was also used by whitefellas to refer to themselves.

There were also completely new words that we will now add to the dictionary such as yeller dust (gold) and thunder stick (a gun) – dated in the OED as 1918 but which can now be claimed as originally Australian from sixty years earlier. Even more exciting for us was the use of early pidgin and Aboriginal English in the diary: bacca (tobacco), black pella (black fella), butter (the fat on an animal), and moke (to smoke a cigarette).

We would not have discovered these words without Michael Walsh’s kindness in sending us scans of the diary. It highlights the subjective nature of citation collection and dictionary compilation. How many thousands of words lie undiscovered in archives and attics around Australia?

Scholars have access to excellent databases of digitised Australian newspapers such as Factiva and Trove (the National Library of Australia’s free archive of digitised Australian regional newspapers), but we have yet to create comparable digitised repositories of unpublished Australian materials. The main problem, of course, is technology: how can we refine a computer’s ability to search handwritten materials? We human readers at the Australian National Dictionary Centre (ANDC) found Tester’s handwriting difficult enough to decipher; a computer would have had a meltdown. Until this is solved technologically, we will have to rely on human effort. Searching for citations in handwritten manuscripts is therefore the perfect crowd-sourcing project: one that reaches out to the public via the Web for help with discrete scholarly tasks.

James Murray made his appeal for members of the public to read published sources for the OED in the nineteenth century. What if we had the capacity for the public to read scanned copies of manuscripts and other unpublished materials? Here at the ANDC we are developing possibilities for lexicographic crowd-sourcing, following on from similar models developed by scholars at Oxford in astrophysics. If Oxford’s Galaxy Zoo is anything to go by, there are hundreds of thousands of people willing to help scholarly projects in this way.

Galaxy Zoo was started in 2007 by Oxford astrophysicist Kevin Schawinski, who needed to classify millions of pictures of galaxies by eye. These were images taken by a robotic telescope, and no human eye had ever seen them before. Schawinski was classifying them himself – 50,000 galaxies a week – but he found the process mind-numbing. So he teamed up with colleagues at Yale, University of California, Berkeley, and elsewhere, and they invited the public to help them. No knowledge of astronomy was needed: volunteers went to the site and were given a ten-minute tutorial on classifying galaxies, and were then able to start. This process is now called ‘citizen science’.

Citizen science was hugely successful. In the first three weeks Galaxy Zoo had 80,000 volunteers, who classified ten million galaxies. One of these volunteers was a twenty-five-year-old teacher in the Netherlands called Hanny van Arkel. She observed a weird green object in one of the photos, so she alerted the experts, and sure enough she had discovered a new galaxy the size of the Milky Way. It is now known as Hanny’s Voorwerp, and she jointly authored a paper with the experts in the journal of the Royal Astronomical Society.

When I worked in the new words group at the OED, we admitted about one thousand new words per year. For every one word we accepted we could have included another ten that met the criteria for inclusion. The only thing that stopped us admitting more words was lack of editorial support. Now that we have the technological capacity to expand our lexicon almost infinitely, the only thing holding us back is person-power.

Our plan here at the ANDC is to create a ‘Word Zoo’ and give the public the chance to join us in our dictionary making. If we found over thirty new citations in twelve pages of Tester’s 1860 diary, just imagine how many citations we will collect when the task of reading for new words in Australian manuscripts is opened beyond the doors of the ANDC and the new field of ‘citizen lexicography’ is born. We look forward to signing you up for this exciting new project.