
New Bikol Orthography – Part 2

In this part, I would like to show how to implement the new orthography within the Bikol macrolanguage. Let’s clarify first why Bikol is classified as a macrolanguage in ISO. Is it not a dialect? How about a language? What are the differences anyway?


ISO 639 [1] governs the classification of language varieties as dialects, individual languages, macrolanguages or collections of languages. It acknowledges that there is no uniform definition of language that is acceptable to all speakers and linguistic experts and suited to all purposes. It does, however, outline a set of classification criteria deemed fit for the intended range of applications of the standard. What that range of applications is, the website does not say.

As I understand ISO 639, the dominant criterion used to classify a language variety is ethnolinguistic identity: if two language varieties share a common, well-established ethnolinguistic identity, they are treated as varieties of the same language (dialects, in short) even if intelligibility between them is marginal. Two varieties are intelligible if speakers of each have an inherent understanding of the other at a functional level, without the need to learn the other variety. Conversely, if each variety has a distinct ethnolinguistic identity, the two are treated as distinct languages even if they are sufficiently intelligible to each other. The problem with this standard is that it does not define what an ethnolinguistic identity is. Although it is unstated, I would suppose that a shared ethnolinguistic identity means having (a) a common literature or culture and (b) a central language variety intelligible to the varieties in question. In ISO, an individual language encompasses a defined range in the spatial, temporal and social spectrum of its linguistic varieties, whether spoken or written, including its standardised variety.

ISO 639 also uses the terms “macrolanguage” and “collection of languages”. Individual languages are members of a macrolanguage if they all have (a) close linguistic relations and (b) a common linguistic identity (due to a single written form or a standard spoken language used in wider communication among the speakers of such closely related languages) in at least one domain. Language collections are groups of languages that are never deemed a single language in any context.

But I find that even with the term macrolanguage, we are still unable to distinguish language varieties that are (1) mutually intelligible yet ethnolinguistically distinct from those that are (2) marginally intelligible or unintelligible to each other yet ethnolinguistically integrated. I think it would be better to coin additional terms: ‘apolanguage’ (‘apo’: away from, detached) for (1) and ‘arthrolanguage’ (‘arthro’: joint) for (2), while using ‘isolanguage’ (‘iso’: equal, like) for an individual language whose varieties have matching boundaries in terms of intelligibility and ethnolinguistic identity. Each variety within an arthrolanguage or apolanguage could then be called an ‘ethnolect’. I believe that the ISO 639 definition of language is a good step forward, but there is still room to improve on the traditional definitions.
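The proposed terms can be read as a small two-way classification, sketched below in Python. This is only an illustration: the function name `classify_pair` is hypothetical, intelligibility is simplified to a yes/no value (so "marginally intelligible" counts as no), and the label for the fourth case, where the varieties share neither intelligibility nor identity, is my own filler rather than one of the proposed terms.

```python
def classify_pair(mutually_intelligible: bool, shared_identity: bool) -> str:
    """Label the relation between two varieties using the proposed terms.

    Simplification (an assumption of this sketch): both criteria are
    treated as plain booleans rather than matters of degree.
    """
    if mutually_intelligible and shared_identity:
        # Intelligibility and identity boundaries coincide.
        return "isolanguage"
    if mutually_intelligible:
        # Case (1): intelligible, but ethnolinguistically distinct.
        return "apolanguage"
    if shared_identity:
        # Case (2): not (or only marginally) intelligible, but integrated.
        return "arthrolanguage"
    # Not one of the proposed terms; placeholder label for the remaining case.
    return "distinct languages"


print(classify_pair(mutually_intelligible=False, shared_identity=True))
```

Under this reading, Bikol as described in this post falls in the third branch.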

The largest macrolanguage in terms of speakers is Chinese, with over 1 billion speakers and 13 individual languages under it. Next is Arabic, spoken by roughly 422 million and covering 30 languages. In terms of the number of individual member languages, Zapotec has the most with 57 languages, followed by Quechua with 44. In the Austronesian language family, there are only 3 macrolanguages: Malay, Malagasy and Bikol. Malay has 184 million speakers (41 million native, 143 million non-native) across its 13 languages; Malagasy has 17 million speakers across its 10 individual languages; and Bikol has 4.5 million speakers across its 5 languages.


When I use the word dialect here, I am referring to spatial/geographic language varieties only. I would use the term sociolect for language varieties that differ among hierarchically arranged social groups. So, how are the different ethnolects and dialects of this arthrolanguage Bikol interrelated? Ethnologue [2] has made a superb identification and classification of these language varieties, so the following table of dialects, languages and their subgroupings was taken from their website.

Ethnolect** | Language (or dialect group) | Dialects | Population (2000)
Coastal Bikol | Central Bikol | | 2,500,000
Coastal Bikol | Southern Catanduanes Bikol | | 85,000
Coastal Bikol | Mt Iraya Agta | | 150
Coastal Bikol | Mt Isarog Agta | | 5 to 6
Inland Bikol | Albay Bikol | | 1,900,907
Inland Bikol | Iriga Bikolano | Rinconada Bikol | 234,361
Inland Bikol | Mt Iriga Agta | | 1,500
Northern Catanduanes Bikol | Northern Catanduanes Bikol | | 122,035

**This is my own grouping of the language varieties into 3 groups.

Although in Ethnologue the 3 Agta languages are part of the Bikol language subgroup, they are not considered under ISO to be part of the macrolanguage, I suppose because they have a different ethnolinguistic identity, their speakers being nomadic and non-Christian. To me, Central Bikol, Southern Catanduanes Bikol, Mt Iraya Agta and Mt Isarog Agta are dialect groups of a single ethnolect, so I will depart from the Ethnologue treatment here and group them under Coastal Bikol, as indicated above. Albay Bikol, Iriga Bikolano and Mt Iriga Agta are likewise, for me, dialect groups of a single ethnolect that we can call Inland Bikol. And since Inland Bikol and Coastal Bikol are English terms, I would like to give them Bikol equivalents: Bikol Iraya and Bikol Ilahud, respectively. Let’s call the arthrolanguage simply Bikol.


Since Bikol is a macrolanguage, there should be at least one domain in which all its language varieties are viewed as a single language, or arthrolanguage. To facilitate the use of Bikol in that domain, I am proposing here the rules of orthography for this macrolanguage.

So here’s the principle: Cognate words with identical meanings across the individual Bikol ethnolects/dialects will be written in their reconstructed form, using comparative linguistics to arrive at a hypothetical proto-Bikol form. Cognate words that have shifted in meaning (false friends) will also be written in their reconstructed form, unless their pronunciation has diverged so much from the protoform that they should instead be written in the forms attested in the individual ethnolects/dialects. Cognate words within the same ethnolect/dialect (etymological twins) will also be written in their attested forms. Non-cognate words such as false cognates and loanwords (identified through contact linguistics) will be spelled as attested in the ethnolect/dialect.
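The principle can be sketched as a small decision rule. The Python below is a rough illustration only: the names `Kind`, `Word` and `arthrolanguage_spelling` are hypothetical, the `diverged` flag stands in for a judgment that a linguist would have to make, and falling back to the attested form when no reconstruction is available is my own assumption rather than part of the stated principle.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class Kind(Enum):
    COGNATE_SAME_MEANING = auto()  # cognates with identical meanings
    FALSE_FRIEND = auto()          # cognates whose meanings have shifted
    ETYMOLOGICAL_TWIN = auto()     # cognates within the same ethnolect/dialect
    NON_COGNATE = auto()           # false cognates and loanwords


@dataclass
class Word:
    attested: str                        # form as pronounced in the ethnolect/dialect
    kind: Kind
    reconstructed: Optional[str] = None  # tentative proto-Bikol form, if any
    diverged: bool = False               # pronunciation far from the protoform?


def arthrolanguage_spelling(word: Word) -> str:
    """Pick the spelling a word would get in the arthrolanguage."""
    if word.kind is Kind.COGNATE_SAME_MEANING:
        use_proto = True
    elif word.kind is Kind.FALSE_FRIEND:
        use_proto = not word.diverged
    else:
        # Etymological twins and non-cognates keep their attested forms.
        use_proto = False
    # Assumption of this sketch: with no reconstruction, keep the attested form.
    if use_proto and word.reconstructed is not None:
        return word.reconstructed
    return word.attested


# 'life': attested buay, tentative protoform bu•hay (forms from this post's table).
print(arthrolanguage_spelling(Word("buay", Kind.COGNATE_SAME_MEANING, "bu•hay")))
```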

The reasons:

  1. To highlight the fact that the individual Bikol languages were once dialects of one protolanguage.
  2. To show to speakers of each individual Bikol language how their languages are related, by formally showing the exact point of the divergence.
  3. To provide an alternative model on how to modernize the Bikol language for future consideration and teaching.
  4. To use this as a springboard to trial another idea: replacing Tagalog as the basis of Filipino with reconstructed proto-Austronesian forms, a true representative of all Philippine languages. These proto-Austronesian forms, once official, could even be shared with Indonesia, Malaysia, Singapore, Brunei, Timor Leste, Madagascar and the Pacific Island countries (Fiji, Tonga, Samoa, Kiribati, Marshall Islands, Micronesia, Nauru, Palau, Solomon Islands, Tuvalu, Vanuatu) as a common official language in an Organization of Austronesian Speaking Countries, much like how French (Organisation Internationale de la Francophonie), English (Commonwealth of Nations) and Portuguese (Comunidade dos Países de Língua Portuguesa) connect the various countries and colonies that speak those languages. Even countries with indigenous minority Austronesian speakers, such as Papua New Guinea, Taiwan, Vietnam, Thailand, New Zealand, USA (Hawaii, Guam) and France (French Polynesia), could join an organization promoting such a “world language”.

    To visualize how this will be implemented in the Bikol macrolanguage, I will give examples from Spanish [3], where consonants have changed the most: the ‘ll’ in caballo ‘horse’, the ‘j’ in jamón ‘ham’, and the ‘z’ in zapatos ‘shoes’ have variable pronunciations among the various Castellano dialects. Although /ʎ/ has merged with /ʝ/ in most dialects and has become /ʒ/ in Argentina and Uruguay, Castellano has not revised the spelling to *cabayo or *cabazho. Likewise, /x/ has merged with /h/ for j and /θ/ has merged with /s/ for z in many dialects, yet the standard Castellano spelling remains, and not *hamon or *sapatos. Even though /f/ is /ɸ/ in Ecuador and /tʃ/ is /ʃ/ in Panama, the spelling continues to be fácil and ocho to this day, and not *ɸasil or *oʃo. Because the original spellings are retained and taught, and people are exposed to the various dialects, speakers of Castellano dialects that no longer pronounce these sounds as in the original still know what the original Castellano pronunciation was, even without a parallel writing system in their own dialects. [Please refer here [4] for the significance of the symbols.]

    The same can be said of English words, where vowels have changed the most. Please refer to this website [5] for a comparison of vowels among the different native English pronunciations. Spanish and English spellings are called etymological spellings with respect to their dialects, since these dialects are not written in phonemic spelling [6].

    Going back to Bikol, what I am advocating is an etymological ethnophonemic spelling for the arthrolanguage and a purely phonemic spelling for the ethnolects and dialects. There will be no confusion at all, since the ethnolects/dialects will be written as pronounced today while the arthrolanguage will be written in reconstructed form, as pronounced back in time. We are representing the words here as they were pronounced long ago, say in the 7th to 13th century (I’m not really sure). Taking the examples I showed in Part 1, here are the comparative representations for each ethnolect/dialect and the arthrolanguage.

    English Gloss Bikol Bikol Ilahud Bikol Iraya
    Naga Virac Iriga Buhi
    life bu•hay bu•hay buay
    tree ka•hoy ka•hoy kaoy
    body ha•wak ha•wak чa•wak
    see hiliŋ hiliŋ чiliŋ
    youngest child ŋuhud ŋuhud ŋu•d
    study чa•daƥ чa•dal чa•daλ
    buy bakaƥ bakal bakaλ
    bring daƥah darah daλah
    run daƥa•gan dala•gan daλa•gan
    tall haƥaŋkaw halaŋkaw haλaŋkaw
    fight чi•waƥ чi•wal чi•waλ
    man laƥa•kih lala•kih laλa•kih
    walk ƥakaw lakaw λakaw
    one saƥɷч saroч saλoч
    talk taƥam taram taλam
    three tɷƥoh tuloh tuλoh
    cockpit buƥaŋan bulaŋan buλaŋan
    rice bɷgas bagas bɷgas
    blade tarɷm tarum tarɷm
    rice plant pa•rɷy paroy pa•rɷy
    black чitɷm чitum чitɷm
    itch gatɷƥ gatol gatɷƥ
    long duration haƥɷy haloy чaƥɷy
    repent sɷƥsɷƥ solsol sɷƥsɷƥ
    wait hɷƥat halat haλat чɷƥat
    sour чaƥsɷm чalsom чaλsom чaƥsɷm
    snake ha•ƥas halas чa•ƥas
    worry handaƥ handal чandaƥ
    floor saƥɷg salog salɷg saƥɷg
    house baƥɷy balay, haroŋ balɷy baƥɷy
    sour чaƥsɷm чalsom чalsɷm чaƥsɷm
    hear dɷŋɷg daŋog rɷŋɷg rɷŋɷg

    These data were taken from several websites, so I still need to confirm whether the rendering of Bikol Iraya is really phonemic. I am not a trained linguist, so please treat my data with caution. The reconstructed forms here are tentative; once I come across the proper reconstructed forms of these words, I will put them here. I arrived at the tentative protoforms above based on my assumptions that:

    1. The protophoneme *h was retained in Bikol Ilahud but was lost in Bikol Iraya.
    2. The protophoneme *ƥ was retained in Buhi, became λ in Virac, and l or r in Naga and Iriga.
    3. The protophoneme *ɷ (schwa) was retained in Iriga and Buhi but became a or o in Naga, depending on its position inside the word.
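Assumptions 2 and 3 can be turned into a rough consistency check between a tentative protoform and an attested form. The Python sketch below is illustrative only: the names `REFLEXES` and `is_consistent` are hypothetical, assumption 1 (*h) is left out to keep the sketch small, and any protophoneme not listed for a variety is treated as retained unchanged, which is stated above for some varieties but is otherwise just this sketch's default.

```python
import re

# Allowed reflexes of each protophoneme per variety, following assumptions
# 2 and 3. Anything not listed is treated as retained unchanged (for Virac
# and *ɷ this is only a default of the sketch, not a stated assumption).
REFLEXES = {
    "Naga": {"ƥ": "lr", "ɷ": "ao"},
    "Virac": {"ƥ": "λ"},
    "Iriga": {"ƥ": "lr"},
    "Buhi": {},
}


def is_consistent(proto: str, attested: str, variety: str) -> bool:
    """Check whether an attested form could derive from a protoform."""
    table = REFLEXES[variety]
    parts = []
    for ch in proto:
        if ch in table:
            # Any one of the listed reflexes is acceptable at this position.
            parts.append("[" + re.escape(table[ch]) + "]")
        else:
            # Segment must be retained exactly.
            parts.append(re.escape(ch))
    return re.fullmatch("".join(parts), attested) is not None


print(is_consistent("gatɷƥ", "gatol", "Naga"))     # True
print(is_consistent("чa•daƥ", "чa•daλ", "Virac"))  # True
print(is_consistent("tarɷm", "tarum", "Naga"))     # False
```

The last call reports a mismatch because assumption 3 as stated lists only a and o as Naga reflexes of *ɷ; if tarum is indeed the Naga form, the assumption would need a further reflex.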

    Although I have tried to show how the words are to be represented, I have not shown the arthrolanguage’s grammar. That will come in the future.


    [1] http://www.sil.org/ISO639-3/scope.asp. For a complete list of macrolanguages, go to http://www.sil.org/iso639-3/macrolanguages.asp.

    [2] http://www.ethnologue.com/show_family.asp?subid=92362

    [3] http://en.wikipedia.org/wiki/Spanish_orthography

    [4] http://en.wikipedia.org/wiki/International_Phonetic_Alphabet

    [5] http://wapedia.mobi/en/IPA_chart_for_English

    [6] http://www.hku.hk/linguist/program/contact10.html