Southern African Genome Diversity Study


As defined by the United Nations geographical classification, southern Africa includes the countries of Botswana, Lesotho, Namibia, South Africa and Swaziland. The region is home to populations broadly defined as Khoesan, Bantu, or European. We introduce a new classification of populations that have emerged from significant migrations into the region.

Southern Africa, as defined by the United Nations geographical classification.

A Word on the Use of Population Identifiers

The publication of 'Complete Khoisan and Bantu genomes from southern Africa' (Nature 463, 943–947; 2010) led to a published response by C. Schlebusch (Nature 464, 487; 2010) to our use of the terms Khoisan, Bushmen and Bantu, stating that these terms 'are perceived by those populations as outdated and even derogatory'.

It should be noted that the classification of peoples, whether based on ethnic/cultural similarities or on linguistics, has historically been derived by non-indigenous academics. Many of these terms, for example Bantu and Khoe (or Khoi), simply mean 'people'. With a rich diversity in culture and languages, the indigenous peoples of southern Africa did not have a need for a collective term. One of the goals of our research is therefore to provide an identity to these individual groups.

Academic research does, however, dictate that a form of collective is used to describe a culture, language, region, or in the case of genomics, a human lineage. We are therefore forced to establish the best possible fit, while respecting how the community perceives these terms. The research team has made a concerted effort to allow the voice of the communities participating in each project to be heard via self-identification. This is the same procedure in place in the United States for research projects.

Population identifiers in South Africa. Until 1991, South African law divided the population into four official ethnic groups: 'Black', 'Coloured', 'Indian' and 'White'. Although many South Africans still use these population identifiers, for others these classifications have negative connotations. To address our use of the population identifier 'Coloured' in our studies, we performed a population classification survey of 521 'Coloured' blood donors residing across several suburbs within the Western Cape of South Africa. Unprompted population self-identification was as follows: 91.2% identified as "Coloured," "South African Coloured," or "Cape Coloured," while 8.8% referred to themselves as "Admixed/Mixed." Based on our survey, we conclude that Coloured is still the most widely recognized population-specific identifier within this community.

Description of Terms

Khoesan is an academic collective used to define indigenous southern Africans with a foraging mode of subsistence and click-using languages (excluding the Bantu-based languages that have incorporated clicks). The preferential use of the spelling Khoesan (over Khoisan) is based on the linguistic observation that the combination of o+i does not exist in the Khoekhoegowab language, from which the word was taken. Anthropologically, archeologically, linguistically (and very likely genomically), the Khoesan definition includes two distinct peoples, the Khoe herder-gatherers and the San hunter-gatherers, who once inhabited a broader region at the southernmost tip of Africa.

Bushmen (Bossiesman in Afrikaans), rather than the more academically accepted term 'San', is the community-preferred collective in our initial studies, which include only Namibian and relocated Angolan Bushmen. Populations in our study that fall under the Bushmen classification are the Ju/'hoan and !Xun (obsolete !Kung). The following symbols represent the dental (/), alveolar (!), palatal (‡), and lateral (//) clicks.

Bantu is a broad term used to describe around 500 sub-Saharan African populations defined by their linguistic use of distinct Niger-Congo B languages. The word Bantu means 'people'. Bantu languages are distinguishable from the Khoesan languages in that they do not use click consonants and are characterized by the extensive use of affixes. An exception to the non-clicking rule is found in the South African Nguni languages such as isiXhosa (see below) and isiZulu.

amaXhosa are a South African Bantu people from the Nguni linguistic classification consisting of several subgroups. The amaXhosa were likely among the first migrant Bantu to reach South Africa, traveling along the east coast up to 1,500 years ago and settling in the Eastern Cape Province of South Africa. Their encounters with indigenous Khoesan led to the incorporation of 'click' sounds in isiXhosa, a non-Khoesan language. It should be noted that the Southern Bantu groups are denoted with a prefix, which differs when referring to a language, namely isiXhosa, or to a people, amaXhosa. The English alternative is to omit the prefix and refer to the ethnolinguistic identifier as Xhosa.

Coloured of South Africa emerged as a direct result of colonization and the slave trade beginning in 1652 at the southernmost tip of Africa. These early migrants included European settlers (predominantly Dutch, German and French, and later British) and slaves from the East (including India, Indonesia, and Sri Lanka), Madagascar, and coastal Africa (including Mozambique, Angola and Guinea). The Coloured therefore represent a complex genomic ancestry including European, Asian, African and indigenous Khoesan contributions.

Basters of Namibia were originally part of the pool that gave rise to the Coloured at the Cape of South Africa. They subsequently traveled north to what was then South West Africa (now Namibia) and settled in the town of Rehoboth. In 1872 they established themselves as an independent population with a national flag. The use of the term Baster (or Rehoboth Baster) in Namibia is regarded with immense pride within the community.