Arts, languages and social sciences

Collaboration with Stanford to digitise colonial newspaper

Department of Journalism
13 March 2026
  • Historical Digitisation: Stellenbosch and Stanford are collaborating to digitise The South African Commercial Advertiser, a 19th-century newspaper that was a landmark in the struggle for press freedom.
  • AI-Driven Accessibility: The project uses HistoryGenie AI to turn microfilm into searchable, machine-readable text for scholars and the public.
  • Educational Research: This partnership provides South African students with an authentic digital corpus to research their own historical past.

Prof Gawie Botma of the Department of Journalism at Stellenbosch University (SU) is collaborating with two academics from Stanford University in the United States on the HistoryGenie project to digitise a prominent Cape colonial newspaper, The South African Commercial Advertiser (SACA), for the period 1829 to 1864. SACA was the first independent newspaper in the Cape Colony and played a major role in the struggle for press freedom in the early 19th century.

HistoryGenie (https://history.genie.stanford.edu/) is a free, "history-from-below" AI-assisted archive designed in a university environment to increase access to historical sources and voices. Essentially, HistoryGenie allows any community, museum, scholar, or teacher to turn materials in their possession into a corpus of machine-readable, searchable, and queryable text. AI is used to provide reliable, fast transcription of text, to manage a semantic search engine, and to help students and researchers find and organise material.

In the case of SACA, Prof Grant Parker and Prof Trevor Getz of Stanford University are responding to requests from scholars for greater access to this source, and are also building a curriculum (and hence a corpus) for South African students to do authentic research into their own historical past. HistoryGenie already hosts fourteen West African newspapers, which are frequently used by a wide range of researchers and by instructors in Ghana and the US. Digital copies of SACA from 1824 to 1829 are also available, and the collaboration with SU will enhance the collection.

The first step in the transcription process is to turn the microfilm into .pdf or .tiff documents that can be "chunked", transcribed by the AI, and examined by humans for accuracy. Prof Botma applied for and received permission from the Stellenbosch University Library to release 25 tins of SACA microfilm temporarily for the project. Once the microfilm is processed, copies will be made available to Stanford, while Stellenbosch University also plans to host them in its digital repository.

The front page of SACA in 1854.


Microfilm of the newspaper will be digitised and turned into searchable texts.

