New Bikol Orthography – Part 3

This page is under construction, a rough draft. Comments are disabled for the moment. Please do not quote yet as the phrasing will definitely change.

This part will discuss the handling of loan words, or words borrowed from any language. This is very important, as Bikol has substantial borrowings from Castellano and English, and more are to come from other influential languages.

Loanword phonology for the borrowing language has 2 issues: (1) what phonemes it will accept and adopt and (2) what sequences and combinations these phonemes must take.

Allowed Phonemes

As of now, the following phonemes are contrastive in the onset and coda of Bikol syllables: 16 consonants: m, n, ŋ, p, b, t, d, k, g, ч, s, h (not contrastive in coda), l, r, w, y (plus 1 more consonant for Buhinon and Viracnon: ƥ / λ) and 3 vowels: a, e/i, o/u (plus 1 more vowel for Central Bikol: ɷ). The vowel pairs e/i and o/u are contrastive in borrowed words only. Other possible donor languages contrast many more phonemes than these 21, and any one of them could find its way into Bikol through borrowings. In the coming globalized world, these donor languages could possibly include those languages with large populations and unique cultures, as well as the other indigenous languages of the Philippines (Tagalog, Sugbuhanon, etc.) and other Austronesian languages (Javanese, Malay, etc.). These could also include the dead languages Latin, Ancient Greek, Sanskrit and Pali. Please refer to the table below for the possible word donor languages to Bikol with large populations.

Apart from what Bikol already has, what other phonemes are we going to allow into the Bikol phoneme inventory? The answer of course would depend on a lot of factors and on personal preferences, or make that social preferences. I will hazard here my very own personal preferences. I want to identify first what should not be allowed. Most of my choices will be arbitrary of course.

In current phonetics, the airstream mechanism of any sound is either pulmonic egressive, glottalic egressive, lingual ingressive or glottalic ingressive. I will recognize only 2 airstream mechanisms: ingressive and egressive. In my opinion, glottalic egressive is a combination of the egressive airstream and one of the phonation types, closed phonation: since the glottis blocks the airstream from the chest, the glottis itself takes over to produce the airstream. Closed phonation nonsonorants are normally called ejectives and closed phonation sonorants are called glottalized consonants. Ejectives and glottalized consonants are mostly found in smallish languages in Western North America, with a few in Asia & Pacific (like Ubykh), South America and Africa. Only egressive phonemes will be admitted into Bikol, and ingressive ones will be converted to their egressive counterparts. Phonation will be limited to closed (“ч”), modal/voiced (“ɦ”) and voiceless (“h”), so breathy, slack, stiff and creaky phonation, plus the laryngeal states harsh and faucalized phonation, should not be imported into the language as phonemic modifications. Advanced tongue root and retracted tongue root, being laryngeal modifications, will not be imported into the Bikol sound system. Breathy and slack will be subsumed under voiceless (“h”) as allophones, and stiff and creaky under voiced (“ɦ”, originally a murmured symbol) in the same way. Closed phonation, either as ejectives or glottalized consonants, will be permitted, like that of Vietnamese.

Aspiration will be treated as a consonant cluster with ‘h’. This is found in Mandarin, Hindustani, English, Bengali and Hokkien. In German dialects, this is treated as a consonant cluster. Pre-aspirated stops ʰp, ʰt, ʰk are found in Icelandic and are treated as clusters as well. Susurration (the use of murmured phonation instead of open phonation in aspiration) occurs in Hindustani and Bengali and will also be treated as a consonant cluster.

For the complex or co-articulated consonants, my view is that any borrowing of these will be treated as a consonant cluster; this covers both secondary articulation and double articulation. Secondary articulation is co-articulation with a weaker consonant produced in a different manner.

  1. Labialization – treated as consonant cluster with “w”. Example: Scottish English “wh” or ʍ, as in “which”, will be written as “hw”. Eastern Arrernte or Ikngerripenhe has labialization at all places and manners of articulation, as explained in Wikipedia.
  2. Palatalization – treated as consonant cluster with “j”. Found in Russian (as soft consonants) and other Slavic languages. Palatalized alveolars will include ʑ > zj and ɕ > sj as well.
  3. Velarization – treated as consonant cluster with “ƥ” or “x”. Found in Russian as hard consonants. Irish makes distinctions between velarized and palatalized consonants. ɫ is treated as a velarized alveolar, so it is written as lƥ.
  4. Pharyngealization – treated as consonant cluster with ʕ. Found in Arabic as emphatic consonants. Ubykh distinguishes labialized and palatalized consonants as well as pharyngealized consonants.

Doubly articulated consonants will be treated as consonant clusters as well if borrowed. Other ways of releasing plosives are also treated as consonant clusters: lateral release (treat as consonant cluster with the corresponding voiced/voiceless lateral on the point of articulation), nasal release (treat as consonant cluster with voiced/voiceless nasal of similar place of articulation) and fricative release (treat as consonant cluster with voiced/voiceless fricative/sibilant of similar place of articulation). Affricates will be treated as consonant clusters as well. Rhoticity will be treated as a cluster with retroflex consonant, like English ɝ and ɚ. Rhoticity in vowels I think is a co-articulated vowel and retroflex consonant.
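The cluster treatments described so far can be sketched as a simple symbol rewrite. This is only an illustration under stated assumptions: the function name `to_cluster` and the two lookup tables are hypothetical names of my own, the velarization mark is mapped to “ƥ” (the text also allows “x”), and the input is assumed to be an IPA string that uses the standard superscript modifier letters.

```python
# Hypothetical sketch: rewrite co-articulated IPA symbols as consonant clusters.
# Each superscript modifier is replaced by a plain consonant in the same position.
MODIFIERS = {
    "\u02b7": "w",       # labialization mark -> w
    "\u02b2": "j",       # palatalization mark -> j
    "\u02e0": "\u01a5",  # velarization mark -> ƥ (assumption: ƥ chosen over x)
    "\u02e4": "\u0295",  # pharyngealization mark -> ʕ
    "\u02b0": "h",       # aspiration mark -> h (also covers pre-aspiration)
    "\u02b1": "\u0266",  # murmured aspiration mark -> ɦ
}

# Single symbols that the text spells out as clusters.
WHOLE_SYMBOLS = {
    "\u028d": "hw",       # ʍ -> hw
    "\u0291": "zj",       # ʑ -> zj
    "\u0255": "sj",       # ɕ -> sj
    "\u026b": "l\u01a5",  # ɫ -> lƥ
}


def to_cluster(ipa: str) -> str:
    """Return the cluster spelling of an IPA string, symbol by symbol."""
    table = {**MODIFIERS, **WHOLE_SYMBOLS}
    return "".join(table.get(symbol, symbol) for symbol in ipa)
```

Affricates written as two letters (tʃ, dʒ, ts) already look like clusters, so they pass through unchanged.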

Nasals such as the Burmese voiceless nasals may be imported. Taps will be subsumed under flaps since there is no language that distinguishes them, and trills will be subsumed and converted to flaps in each point of articulation (ʙ, ʋ/ѵ, ʀ, ɹ/r/ɾ). All trills will be treated as consonant clusters, like the Spanish trilled r as a consonant cluster of several tap or flap r’s (long r). All geminate consonants will be treated as consonant clusters. There will be no distinction between approximants and fricatives. Linguolabials will not be imported into Bikol. Lateral consonants will be distinguished from nonlateral consonants, with both having nasals, plosives, fricatives, flaps and glides. Lateral nasals, lateral glides, lateral plosives, lateral fricatives and lateral flaps may be imported, but only as a possibility in the future. All places of articulation may be distinguished for laterals, as with the nonlateral consonants.

Bilabial, alveolar, retroflex, palatal, velar, uvular can be used for nasal and oral plosives and fricatives. Labiodental, interdental and palato-alveolar will be restricted to fricatives/sibilants. All sounds produced at pharyngeal and epiglottal and epiglotto-pharyngeal areas will be combined and not distinguished at all. Bikol already has 2 glottal sounds which will be maintained.

So here is the list of the same languages with their individual consonant phonemes not found in Bikol and how to treat them (not exhaustive). Their consonant clusters will be treated as such but modified in accordance with the table below.

Language | Native Speakers | Single Phonemes | Phonemes Treated Here as Clusters
Mandarin | 843M | ɥ, f, x, ʂ, ɻ | ts, tʂ, tɕ, ph, th, kh, tʂh, tɕh
Hindustani | 366M | f, z, ʃ, ɦ, ʈ, ɖ | tʃ, dʒ, ph, th, ʈh, kh, tʃh, bɦ, dɦ, ɖɦ, gɦ, dʒɦ, lɦ, rɦ, mɦ, nɦ
Spanish | 358M | ɲ, ʎ, f, β, ð, ɣ, θ, x | tʃ, rr
English | 341M | f, v, θ, ð, z, ʃ, ʒ | tʃ, dʒ, ph, th, kh
Arabic | 206M | q, f, θ, ð, z, ʃ, x, ɣ, ħ, ʕ | tʕ, dʕ, sʕ, ðʕ, lʕ
Portuguese | 178M | ɲ, ʎ, f, v, z, ʃ, ʒ, x |
Bengali | 171M | f, z, ʃ, ʈ, ɖ, ɽ | tʃ, dʒ, ph, th, ʈh, kh, tʃh, bɦ, dɦ, ɖɦ, gɦ, dʒɦ
Russian | 170M | f, v, z, x, ʂ, ʐ | ts, tɕ, mj, nj, pj, bj, tj, dj, kj, gj, fj, vj, sj, zj, xj, rj, lj
Japanese | 122M | ɸ, f, z, ʃ, ɕ | ts, dz, tɕ, dɕ
German | 100M | f, v, z, ʃ, ʒ, ç, x, ʁ | pf, ts, tʃ, dʒ
French | 80M | ɲ, ɥ, f, v, z, ʃ, ʒ, ʁ |
Javanese | 76M | ɲ, ʈ, ɖ | tʃ, dʒ
Korean | 74M | | tɕ, ph, th, kh, tɕh
Vietnamese | 68M | c, ɲ, f, v, z, x, ɣ, b’, d’ | th
Tamil | 66M | ɳ, ʈ,, ɻ,, ɭ | ʈʃ
Italian | 62M | ɲ, ʎ, f, v, z, ʃ | ts, dz, tʃ, dʒ
Turkish | 61M | c, ɟ, ɫ, f, v, z, ʃ, ʒ, ɣ | tʃ, dʒ
Hokkien | 47M | ʑ, ɕ | ts, dz, tɕ, ph, th, kh, tsh, tɕh
Persian | 40M | ɢ, f, v, z, ʃ, ʒ, x, ɣ | tʃ, dʒ
Malay | 40M | ɲ, f, v, z, ʃ | tʃ, dʒ
Burmese | 32M | θ, z, ʃ, ɬ, voiceless m̥, n̥, ɲ & ŋ̊ | ph, th, sh, tʃh, kh, tʃ, dʒ
Hausa | 25M | |
Amharic | 18M | ɲ, f, z, ʃ, ʒ, p’, t’, k’ | ts’, tʃ’, tʃ, dʒ
Hebrew | 10M | f, v, z, ʃ, ʒ, χ, ʁ | ts, tʃ, dʒ

To summarize, this will be the list of additional phonemes: stops (voiceless m̥, n̥, ɲ, ŋ̊, voiced ɲ, c, ɟ, q, ɢ), glides (ɥ), laterals (ɬ, ʎ), fricatives (ɸ, β, f, v, θ, ð, z, ʃ, ʒ, ç, x, ɣ, χ, ʁ, ħ, ʕ, ɦ), retroflexes (ɳ, ʈ, ɖ, ʂ, ʐ, ɻ, ɽ, ɭ), ejectives (p’, t’, k’) and glottalized consonants (b’, d’). Other consonants are found only in smallish languages (e.g. ɴ, ʡ, ʝ, ɮ, ʟ), so it would take time for them to be influential in Bikol. Here is a table of all the simple consonants that will be allowed into Bikol:

<> will insert table here later <>

For the vowels, we can allow other vowels but not with too many distinctions: 3 distinctions in height (close, mid, open), 3 in backness (front, central, back) and 2 in labialization (spread and rounded, with compressed merged with rounded). Nasalization may be imported later. Phonation distinctions on vowels may also be imported but should be restricted to voiced, unvoiced and glottalized only. There is no need to include too many distinctions, so certain phonemes need to merge into 1 phoneme if borrowed, although they can remain as allophones. The following are all oral voiced vowels, indicating which vowels will be subsumed into which: i/ɪ › i , e › e , ɛ/æ › æ , y/ʏ › y , ø › ø , œ/ɶ › œ , u/ʊ › u , o › o , ɔ/ɒ › ɒ , ɯ › ɯ , ɤ › ɤ , ʌ/ɑ › ʌ , ɨ › ɨ , ɘ/ə › ə , ɜ/a/ɐ › a , ʉ › ʉ , ɵ › ɵ , ɞ › ɞ. Each of these vowels may have nasal voiced, oral glottalized and oral voiceless counterparts.

Vowel clusters will be limited to a succession of syllables without intervening consonants (long vowels, or vowels with a very slight transitional glide), similar to Squamish, which appears to have vowel clusters consisting of distinct vowels with apparently neither glottal insertion nor diphthongization of vowels to break up the hiatus. Semivowels in diphthongs and triphthongs are treated in Bikol as glide consonants, so these will not be considered vowel clusters. The following Castellano diphthongs and triphthongs will be converted to a vowel+consonant or consonant+vowel sequence: ai › ay, ei › ey, oi › oy, au › aw, eu › ew, ou › ow, ia › ya, ie › ye, io › yo, iu › yu or iw, ui › wi or uy, ua › wa, ue › we, uo › wo, iai › yay, iei › yey, uai › way and uei › wey. Centering diphthongs in English will be treated as vowel sequences, but both vowels need to be clearly articulated.
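A minimal sketch of the Castellano conversions listed above, assuming a plain left-to-right scan that tries triphthongs before diphthongs. The names `GLIDE_RULES` and `glide_spelling` are hypothetical, the first alternative is picked where the list gives two (iu › yu, ui › wi), and stress and hiatus are ignored, so this is not a complete respelling procedure.

```python
# Hypothetical sketch: respell Castellano diphthongs and triphthongs with glides.
# Triphthongs come first so that the longest sequence is matched.
GLIDE_RULES = [
    ("iai", "yay"), ("iei", "yey"), ("uai", "way"), ("uei", "wey"),
    ("ai", "ay"), ("ei", "ey"), ("oi", "oy"),
    ("au", "aw"), ("eu", "ew"), ("ou", "ow"),
    ("ia", "ya"), ("ie", "ye"), ("io", "yo"), ("iu", "yu"),
    ("ui", "wi"), ("ua", "wa"), ("ue", "we"), ("uo", "wo"),
]


def glide_spelling(word: str) -> str:
    """Scan the word once, replacing each listed vowel sequence as it is met."""
    out = []
    i = 0
    while i < len(word):
        for source, target in GLIDE_RULES:
            if word.startswith(source, i):
                out.append(target)
                i += len(source)
                break
        else:
            # No rule applies at this position: copy the letter unchanged.
            out.append(word[i])
            i += 1
    return "".join(out)
```

For example, `glide_spelling("piano")` gives `pyano`, which matches the first step of the piano › pyanoh example used later in this part.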

Here’s the list for each language of vowels after conversion to the various allowed vowels (not exhaustive):

Languages Single Phonemes
Mandarin i › i , e › e , ɛ › æ , y › y , œ › œ , u/ʊ › u , o › o , ɔ › ɒ , ɤ › ɤ , ɑ › ʌ , ə › ə , a › a
Hindustani i/ɪ › i , e › e , ɛ/æ › æ , u/ʊ › u , o › o , ɔ › ɒ , ɑ › ʌ , ə › ə
Spanish i › i , e › e , u › u , o › o , a › a
English i/ɪ › i , e › e , ɛ/æ › æ , u/ʊ › u , o › o , ɔ/ɒ › ɒ , ʌ/ɑ › ʌ , ə › ə , ɜ/a/ɐ › a
Arabic i › i , u › u , a › a
Portuguese i › i , e › e , ɛ › æ , u › u , o › o , ɔ › ɒ , ɯ › ɯ , a/ɐ › a , ĩ, ẽ, ũ, õ, nasal ɐ
Bengali i › i , e › e , æ › æ , u › u , o › o , ɔ › ɒ , a › a , ĩ, ẽ, ũ, õ, ã, nasal æ, nasal ɔ
Russian i › i , e › e , u › u , o › o , ɨ › ɨ , ə › ə , a › a
Japanese i › i , e › e , o › o , ɯ › ɯ , a › a
German i/i:/ɪ › i , e/e: › e , ɛ/ɛ: › æ , y/y:/ʏ › y , ø/ø: › ø , œ › œ , u/u:/ʊ › u , o/o: › o , ɔ › ɒ , ə › ə , a/a:/ɐ › a
French i › i , e › e , ɛ/ɛ: › æ , y › y , ø › ø , œ › œ , u › u , o › o , ɔ › ɒ , ɑ › ʌ , ə › ə , a › a , nasal ɛ, nasal œ, nasal ɔ, nasal ɑ
Javanese i › i , e › e , ɛ › æ , u › u , o › o , ɔ › ɒ , ə › ə , a › a
Korean i/i: › i , e/e: › e , ɛ/ɛ: › æ , ø/ø: › ø , u/u: › u , o/o: › o , ɯ/ɯ: › ɯ , ʌ/ʌ: › ʌ , a/a: › a
Vietnamese i › i , e › e , ɛ › æ , u › u , o › o , ɔ › ɒ , ɨ › ɨ , ə: › ə , ɜ/a/a: › a
Tamil i/i: › i , e/e: › e , u/u: › u , o/o: › o , a/a: › a
Italian i › i , e › e , ɛ › æ , u › u , o › o , ɔ › ɒ , a › a
Turkish i › i , e › e , y › y , ø › ø , u › u , o › o , ɯ › ɯ , a/ɐ › a
Hokkien/Minnan i › i , e › e , ɛ › æ , y › y , u › u , o › o , ɔ › ɒ , ɤ › ɤ , ɨ › ɨ , ə › ə , a/ɐ › a
Persian i › i , e › e , æ › æ , u › u , o › o , ɒ › ɒ
Malay i › i , e › e , ɛ › æ , u › u , o › o , ɔ › ɒ , ɑ › ʌ , ə › ə , a › a
Burmese i › i , e › e , ɛ › æ , u › u , o › o , ɔ › ɒ , ə › ə , a › a
Hausa i/i: › i , e/e: › e , u/u: › u , o/o: › o , a/a: › a
Amharic i/ɪ › i , e › e , ɛ › æ , u/ʊ › u , o › o , ɔ › ɒ , ɨ › ɨ , ə › ə , a › a
Hebrew i › i , e › e , u › u , o › o , a › a

The full vowel phoneme inventory could result in the following 18 phonemes (IPA symbols):
Front Unrounded : i, e, æ
Front-central Rounded : y, ø, œ
Back Rounded : u, o, ɒ
Back-central Unrounded : ɯ, ɤ, ʌ
Central Unrounded : ɨ, ə, a
Central Rounded : ʉ, ɵ, ɞ
Counting their nasal voiced, oral glottalized and oral voiceless counterparts, we have a total of 72 vowels.

Level tones, contour tones, registers and stress (whether primary or secondary) are not to be imported as phonemes. Bikol has no stress but has a chroneme or phonemic length, I suppose.


Since these phonemes do not exist in isolation but are combined in different ways to form syllables and words, we must also define permitted syllable structures, consonant clusters and vowel clusters. The syllabic structure of native Bikol base words, if disyllabic, is CVC’CVC. The values of the coda C of the first syllable can be consonants but can also be a chroneme and the coda C of the last syllable is obligatory but could evaluate to null if “h”. My view is that onset C is not optional for the initial syllable since the glottal stop is an obligatory default onset on base words that are traditionally written with vowel initials. I will explain my view further in a future post.

There are no consonant clusters, whether initial or final, within syllables in native Bikol words, only across syllable boundaries. Word-internally, all sorts of consonant combinations are possible, except geminates. The question would be, should we allow consonant clusters into Bikol? Before answering that question, note that there is a minor constraint against consonant clusters and a major one against vowel clusters in Bikol. There are no vowel clusters, as w and y are treated as consonants and not as semivowels or parts of a diphthongal vowel+semivowel sequence. If a glide is eliminated, a glottal consonant ‘ч’ or ‘h’ will appear, and the vowels are thus treated as separate syllables.

Bikol: uang › чuчaŋ ‘beetle’, not *waŋ
Bikol: abaana › чabaчanah ‘too much’, not *чaba:nah

As for consonant clusters, their presence is a source of irregularity for partial reduplication, at least in word initial position:

Castellano borrowing: planchar › plantʃah › nagplaplantʃah or nagpaplantʃah?
English borrowing: practice › praktis › nagprapraktis or nagpapraktis?

Either we standardize how to deal with partial reduplication or we avoid borrowing words with clusters and replace those already borrowed. If possible, I suggest the first recourse would be to borrow another word from another language with no consonant clusters. If we must borrow words with clusters, then my preferred reduplication would be of just the 1st consonant of the consonant cluster and not the entire cluster, that is, the 2nd alternative shown above. This is also after considering how -um- is infixed (see below).

In syllable final position, consonant clusters are not a problem at all, so we can borrow to our hearts’ content:

Castellano borrowing: extra › чekstrah › nagчeчekstrah
English borrowing: golf › golf › naggogolf , gogolfan
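The preferred reduplication pattern (copy only the 1st consonant plus the first vowel) can be sketched as follows. This is a rough illustration, not a full morphological rule: `VOWELS` and `cv_reduplicate` are hypothetical names, bases are assumed to be written one symbol per phoneme, and vowel-initial bases are assumed to be already written with their initial ч.

```python
# Hypothetical sketch: partial reduplication that copies the first consonant
# and the first vowel of the base, skipping the rest of any initial cluster.
VOWELS = set("aeiouɷ")


def cv_reduplicate(base: str) -> str:
    """Prefix the base with its first consonant plus its first vowel."""
    for index, symbol in enumerate(base):
        if symbol in VOWELS:
            # Assumption: a base with no written onset copies the vowel alone.
            onset = base[0] if index > 0 else ""
            return onset + symbol + base
    raise ValueError("base has no vowel to copy")
```

With this sketch, `"nag" + cv_reduplicate("praktis")` yields `nagpapraktis`, the second alternative above, and `"nag" + cv_reduplicate("golf")` yields `naggogolf`.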

Glide+vowel combination in the initial syllable would not be a problem even with an infix -Vr- since it will just copy the vowel and semivowels are treated as consonants. The same applies with the infix -in-.

Castellano borrowing: piano › pyanoh › pyaranohon and not *paryanohon nor *piryanohon
Castellano borrowing: toalla › twaʎah › twaraʎahan and not *tarwaʎahan nor *turwaʎahan
Castellano borrowing: toalla › twaʎah › twinaʎahan (twinatwaʎahan), not *tinwaʎahan (tinwatwaʎahan)
Castellano borrowing: piano › pyanoh › pyinanohan (pyinapyanohan), not *pinyanohan (pinyapyanohan)

But with infix -um-, there seems to be a different rule: the infix is inserted after the first consonant and not after the glide of the cluster.

Castellano borrowing: piano › pyanoh › pumyanoh, not *pyumanoh
Castellano borrowing: toalla › twaʎah › tumwaʎah, not *twumaʎah
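The two infixation patterns shown in these examples can be sketched side by side. The function names below are hypothetical, the vowel set is an assumption, and the sketch only covers bases that begin with at least one consonant, as in the examples.

```python
# Hypothetical sketch of the infix placement shown in the examples above.
VOWELS = set("aeiouɷ")


def first_vowel_index(base: str) -> int:
    """Return the position of the first vowel in the base."""
    for index, symbol in enumerate(base):
        if symbol in VOWELS:
            return index
    raise ValueError("base has no vowel")


def infix_in(base: str) -> str:
    """-in- goes after the whole onset, glide included."""
    i = first_vowel_index(base)
    return base[:i] + "in" + base[i:]


def infix_vr(base: str) -> str:
    """-Vr- copies the first vowel and also goes after the whole onset."""
    i = first_vowel_index(base)
    return base[:i] + base[i] + "r" + base[i:]


def infix_um(base: str) -> str:
    """-um- goes after the first consonant only, before any glide."""
    return base[:1] + "um" + base[1:]
```

For instance, `infix_vr("pyanoh") + "on"` gives `pyaranohon`, while `infix_um("pyanoh")` gives `pumyanoh`.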

It is not just Castellano, English, Nihonggo or Mandarin that have clusters, so words with clusters borrowed from other languages must be accepted as well, as I find no reason to exclude them. Initial two-consonant clusters that are very common in Austronesian languages are initial geminates (Chuukese, Pohnpeian, Dobel, Sa’ban, Taba, etc). Dobel has these geminates: /bb dd tt ɸɸ ss mm nn ŋŋ ll rr ww jj чч/. Taba allows 11 different geminates in initial position: /bb dd gg tt kk mm nn ŋŋ ll hh ww/, plus many other combinations. For a full list of possible consonant clusters in Taba, Leti and Roma, click here. Here are examples from Taba, an Austronesian language in Indonesia.

wwe ‘leg’
hhan ‘you (pl.) go’.
ddoba ‘earth’
rsuri ‘they pour’.

Geminates not in word initial positions would not be a problem so we can import them from Japanese, Italian, Arabic, Russian and even Ilokano and Bontok. There are also Austronesian languages that have preploded & postploded nasal clusters and prenasalized stops like mb, nd, ɲdʒ, ng. And some languages have prestopped nasals.

If we must borrow words with clusters, then should there be a maximum number of consonants in a cluster that we can take in? Georgian can have initial clusters of up to 8 consonants. Georgian brt’q’eli (flat) has 4 consonants word initially, and English glimpsed has 4 consonants word finally. Or take Russian zdravstvujtye ‘hello’ and vzglyat ‘opinion’ or the Polish initial consonant clusters here. Personally, I would like to limit consonant clusters to a series of 2 consonants only in syllable initial positions but 3-4 consonants in syllable final positions.

Words without vowels should not be allowed into Bikol, like Nuxalk xłp̓x̣ʷłtłpłłskʷc̓ ‘he had had a bunchberry plant’ or Tashlhiyt Berber tftktstt ‘you sprained it’. We should also not allow words with syllabic consonants, like Slovak žblnknutie. Vowels should be inserted into these vowelless consonant series if the words are to be borrowed. These example words are taken from Wikipedia.
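The limits proposed above (at most 2 consonants initially, at most 4 finally, and no vowelless words) can be checked mechanically. The sketch below is a simplification with hypothetical names: it looks only at the edges of the whole word rather than at every syllable, it counts each written symbol as one consonant, and it assumes a broad transcription that uses only the vowel letters in `VOWELS`.

```python
# Hypothetical sketch: test a candidate loanword against the proposed limits.
VOWELS = set("aeiouɷ")


def edge_clusters(word: str) -> tuple[int, int]:
    """Count the consonants before the first vowel and after the last vowel."""
    start = 0
    while start < len(word) and word[start] not in VOWELS:
        start += 1
    end = 0
    while end < len(word) - start and word[len(word) - 1 - end] not in VOWELS:
        end += 1
    return start, end


def within_limits(word: str, max_initial: int = 2, max_final: int = 4) -> bool:
    """Reject vowelless words and words whose edge clusters are too long."""
    if not any(symbol in VOWELS for symbol in word):
        return False
    initial, final = edge_clusters(word)
    return initial <= max_initial and final <= max_final
```

Under these assumptions `within_limits("praktis")` holds, while a vowelless form such as `tftktstt` is rejected until vowels are inserted.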

Form of Borrowed Word
Apart from the sounds of the words to be naturalized in the borrowing language, there is another issue at hand with borrowed words: whether the words to be borrowed are just base words, or base words with inflections. The disadvantage of borrowing a fully inflected word is that it will force changes in the grammar of the borrowing language: it can bring in new affixes (if there are a lot of borrowed words with such an affix), it can render old affixes obsolete (since they cannot be used together with a borrowed word that already carries an affix of the same meaning), and it can force changes in word order as well. Because of these, I am more inclined to borrow just the base words.


New Bikol Orthography – Part 2

In this part, I would like to show how to implement the new orthography within the Bikol macrolanguage. Let’s clarify first why Bikol is classified as a macrolanguage in ISO. Is it not a dialect? How about a language? What are the differences anyway?


ISO 639 [1] governs the classification of language varieties as dialects, individual languages, macrolanguages or collections of languages. It acknowledges that there is no uniform definition of language that is acceptable to all language speakers and linguistic experts and suited for all purposes. But it does outline a set of criteria for classifying languages, deemed fit for the intended range of application of that standard. What range of applications is contemplated is not mentioned on the website.

Based on how I understood ISO 639, the dominant criterion used to classify a language variety is ethnolinguistic identity: if two language varieties share a common, well-established ethnolinguistic identity, both are treated as varieties of the same language (dialects, in short) even if intelligibility is marginal. There is intelligibility if speakers of both varieties have an inherent understanding of each other at a functional level, without the need to learn the other variety. In ISO 639, if each linguistic variety has a distinct ethnolinguistic identity, the varieties are treated as distinct languages even if they are sufficiently intelligible to each other. The problem with this standard is that it does not define what an ethnolinguistic identity is. Although it is unstated, I would suppose that a shared ethnolinguistic identity means having (a) a common literature or culture and (b) a central language variety intelligible to the varieties in question. In ISO, an individual language encompasses a defined range in the spatial, temporal and social spectrum of its linguistic varieties, whether spoken or written, including its standardised variety.

It also uses the terms “macrolanguage” and “collections of languages”. Individual languages are members of a macrolanguage if they all have (a) close linguistic relations and (b) a common linguistic identity (due to a single written form or a standard spoken language used in wider communication among the speakers of such closely related languages) in at least one domain. Language collections are groups of languages which are never deemed a single language in any context.

But I find that even with the term macrolanguage, we are still incapable of distinguishing language varieties that are (1) mutually intelligible yet ethnolinguistically distinct, from those that are (2) mutually marginally intelligible or unintelligible yet ethnolinguistically integrated. I think it would be better if additional terms were invented, like ‘apolanguage’ (‘apo’ away from, detached) for (1) and ‘arthrolanguage’ (‘arthro’ joint) for (2), while at the same time using ‘isolanguage’ (‘iso’ equal, like) for an individual language whose varieties have matching boundaries in terms of intelligibility and ethnolinguistic identity. Each variety within an arthrolanguage or apolanguage could then be called an ‘ethnolect’. I believe that the ISO 639 definition of language is a good step forward but there is still room for improvement on the traditional definitions.

The largest macrolanguage in terms of speakers is Chinese, with over 1 billion speakers and 13 individual languages under it. Next is Arabic, spoken by roughly 422 million and covering 30 languages. In terms of number of individual member languages, Zapotec has the most members with 57 languages, followed by Quechua with 44 languages. In the Austronesian language family, there are only 3 macrolanguages: Malay, Malagasy and Bikol. Malay has 184 million speakers (41 million native, 143 million nonnative) for its 13 languages; Malagasy has 17 million speakers of its 10 individual languages; and Bikol has 4.5 million speakers of its 5 languages.


When I use the word dialect here, I am referring to spatial/geographic language varieties only. I would use the term sociolect if I am referring to language varieties that are different among hierarchically arranged social groups. So, how are the different ethnolects and dialects of this arthrolanguage Bikol interrelated? Ethnologue [2] has made a superb identification and classification of these language varieties, so the following table of dialects, languages and their subgroupings were taken from their website.

Ethnolect** Language (or dialect groups) Dialects Population (2000)
Coastal Bikol
Central Bikol 2,500,000
Southern Catanduanes Bikol 85,000
Mt Iraya Agta 150
Mt Isarog Agta 5 to 6
Inland Bikol
Albay Bikol 1,900,907
Iriga Bikolano 234,361
Rinconada Bikol
Mt Iriga Agta 1,500
Northern Catanduanes Bikol 122,035

**This is my own grouping of the language varieties into 3 groups.

Although in Ethnologue the 3 Agta languages are part of the Bikol language subgroup, they are not considered under ISO as part of the macrolanguage since they have a different ethnolinguistic identity, their speakers being nomadic and non-Christian, I suppose. Central Bikol, Southern Catanduanes, Mt Iraya Agta and Mt Isarog Agta are to me dialect groups of a single ethnolect, so I will depart from the Ethnologue treatment here and group them under Coastal Bikol as I have indicated above. Albay Bikol, Iriga Bikolano and Mt Iriga Agta are also for me dialect groups of a single ethnolect that we can call Inland Bikol. And since Inland Bikol and Coastal Bikol are English terms, I would like to give them Bikol equivalents, Bikol Iraya and Bikol Ilahud, respectively. Let’s call the arthrolanguage simply Bikol.


Being a macrolanguage, there should be 1 domain where all the language varieties are viewed as 1 single language or an arthrolanguage. In this regard, to facilitate usage of Bikol in that domain, I am here proposing the rules of orthography for this macrolanguage.

So here’s the principle: Cognate words with identical meanings across the individual Bikol ethnolects/dialects will be written in their reconstructed form, using comparative linguistics to arrive at a hypothetical proto-Bikol form. Cognate words that have shifted in meaning (false friends) will also be written in their reconstructed form, unless their pronunciations have diverged so much from the protoform that they should instead be written in the forms attested in the individual ethnolects/dialects. Cognate words within the same ethnolect/dialect (etymological twins) will also be written in the attested forms. Non-cognate words like false cognates and loanwords (identified through contact linguistics) will be spelled as attested in the ethnolect/dialect.
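The principle amounts to a small decision table, which can be sketched as below. The names `Kind` and `spelling_basis` are hypothetical, and whether a pronunciation has “diverged so much” is left as a yes/no judgment supplied by the caller, since no threshold is defined here.

```python
# Hypothetical sketch of the spelling principle as a decision table.
from enum import Enum


class Kind(Enum):
    COGNATE_SAME_MEANING = "cognates with identical meaning across varieties"
    COGNATE_SHIFTED_MEANING = "cognates with shifted meaning (false friends)"
    ETYMOLOGICAL_TWIN = "cognates within the same variety"
    NON_COGNATE = "false cognates and loanwords"


def spelling_basis(kind: Kind, diverged: bool = False) -> str:
    """Return 'reconstructed' or 'attested' for a word of the given kind."""
    if kind is Kind.COGNATE_SAME_MEANING:
        return "reconstructed"
    if kind is Kind.COGNATE_SHIFTED_MEANING:
        # Reconstructed unless the pronunciation has drifted too far.
        return "attested" if diverged else "reconstructed"
    # Etymological twins and non-cognates keep their attested spelling.
    return "attested"
```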

The reasons:

  1. To highlight the fact that the individual Bikol languages were once dialects of one protolanguage.
  2. To show to speakers of each individual Bikol language how their languages are related, by formally showing the exact point of the divergence.
  3. To provide an alternative model on how to modernize the Bikol language for future consideration and teaching.
  4. To use this as a springboard to trial another idea, that of replacing Tagalog as the basis of Filipino and instead using reconstructed proto-Austronesian forms, a true representative of all Philippine languages. These proto-Austronesian forms, once official, can even be shared with Indonesia, Malaysia, Singapore, Brunei, Timor Leste, Madagascar and the Pacific Island countries (Fiji, Tonga, Samoa, Kiribati, Marshall Islands, Micronesia, Nauru, Palau, Solomon Islands, Tuvalu, Vanuatu) as a common official language in an Organization of Austronesian Speaking Countries, much like how French (Organisation Internationale de la Francophonie), English (Commonwealth of Nations) & Portuguese (Comunidade dos Países de Língua Portuguesa) connect the various countries and colonies that speak such languages. And even countries with indigenous minority Austronesian speakers like Papua New Guinea, Taiwan, Vietnam, Thailand, New Zealand, USA (Hawaii, Guam) and France (French Polynesia) could join such an organization promoting such a “world language”.

    To visualize how this will be implemented in the Bikol macrolanguage, I will give examples from Spanish [3], where consonants have changed the most: the ‘ll’ in caballo ‘horse’, the ‘j’ in jamon ‘ham’, and the ‘z’ in zapatos ‘shoes’ have variable pronunciations among the various Castellano dialects. Although /ʎ/ has merged with /y/ in most dialects and has become /ʒ/ in Argentina and Uruguay, Castellano has not revised the spelling at all to *cabayo or *cabazho. Likewise, /x/ has merged with /h/ for j and /θ/ has merged with /s/ for z in many dialects, yet the standard Castellano spelling for them remains, and not *hamon or *sapatos. Even though /f/ is /ɸ/ in Ecuador and /tʃ/ is /ʃ/ in Panama, the spelling continues to be facil and ocho, for example, to this day, and not *ɸasil or *oʃo. Because the original spellings are retained and taught, and people are exposed to the various dialects, speakers of Castellano dialects who do not pronounce these words as in the original know what the original Castellano pronunciation was, even without a parallel writing system for their own dialects. [Please refer here [4] for the significance of the symbols.]

    The same can be said of English words, where vowels have changed the most. Please refer to this website [5] for a comparison of vowels among the different native English pronunciations. Spanish and English spellings are called etymological spellings for their dialects, as these dialects are not written in phonemic spellings [6].

    Going back to Bikol, what I am advocating for the arthrolanguage is an etymological ethnophonemic spelling, and for the ethnolects and dialects a pure phonemic spelling. There will be no confusion at all, since the ethnolects/dialects will be written as pronounced but the arthrolanguage in reconstructed form, as pronounced back in time. We are representing here the words as they were pronounced long ago, say in the 7th to 13th century (I’m not really sure). Taking the examples that I have shown in Part 1, here are the comparative representations for each ethnolect/dialect and the arthrolanguage.

    English Gloss Bikol Bikol Ilahud Bikol Iraya
    Naga Virac Iriga Buhi
    life bu•hay bu•hay buay
    tree ka•hoy ka•hoy kaoy
    body ha•wak ha•wak чa•wak
    see hiliŋ hiliŋ чiliŋ
    youngest child ŋuhud ŋuhud ŋu•d
    study чa•daƥ чa•dal чa•daλ
    buy bakaƥ bakal bakaλ
    bring daƥah darah daλah
    run daƥa•gan dala•gan daλa•gan
    tall haƥaŋkaw halaŋkaw haλaŋkaw
    fight чi•waƥ чi•wal чi•waλ
    man laƥa•kih lala•kih laλa•kih
    walk ƥakaw lakaw λakaw
    one saƥɷч saroч saλoч
    talk taƥam taram taλam
    three tɷƥoh tuloh tuλoh
    cockpit buƥaŋan bulaŋan buλaŋan
    rice bɷgas bagas bɷgas
    blade tarɷm tarum tarɷm
    rice plant pa•rɷy paroy pa•rɷy
    black чitɷm чitum чitɷm
    itch gatɷƥ gatol gatɷƥ
    long duration haƥɷy haloy чaƥɷy
    repent sɷƥsɷƥ solsol sɷƥsɷƥ
    wait hɷƥat halat haλat чɷƥat
    sour чaƥsɷm чalsom чaλsom чaƥsɷm
    snake ha•ƥas halas чa•ƥas
    worry handaƥ handal чandaƥ
    floor saƥɷg salog salɷg saƥɷg
    house baƥɷy balay, haroŋ balɷy baƥɷy
    sour чaƥsɷm чalsom чalsɷm чaƥsɷm
    hear dɷŋɷg daŋog rɷŋɷg rɷŋɷg

    These data were taken from several websites, so I still need to confirm whether the rendering of Bikol Iraya is really phonemic. I am not a trained linguist, so please take my data with caution. The reconstructed forms here are tentative; once I come across the proper reconstructed forms of these words, I will put them here. I arrived at the tentative protoforms above based on my assumptions that:

    1. The protophoneme *h was retained in Bikol Ilahud but was lost in Bikol Iraya.
    2. The protophoneme *ƥ was retained in Buhi, became λ in Virac, and l or r in Naga and Iriga.
    3. The protophoneme *ɷ (schwa) was retained in Iriga and Buhi but became a or o in Naga, depending on its position inside the word.
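    The unconditioned part of assumption 2 can be sketched as a lookup from protophoneme to reflex. This is a partial illustration with hypothetical names: only the Virac and Buhi reflexes of *ƥ are modeled, because the conditions that select l or r, or a or o, are not stated here, and the treatment of *h and *ɷ is left out entirely.

```python
# Hypothetical sketch: derive a variety's form from a tentative protoform,
# covering only the unconditioned reflexes of *ƥ from assumption 2.
REFLEXES = {
    "Virac": {"ƥ": "λ"},  # *ƥ became λ
    "Buhi": {},           # *ƥ was retained, so nothing changes
}


def reflex(protoform: str, variety: str) -> str:
    """Apply the variety's symbol-for-symbol changes to a protoform."""
    changes = REFLEXES[variety]
    return "".join(changes.get(symbol, symbol) for symbol in protoform)
```

    For example, `reflex("bakaƥ", "Virac")` gives `bakaλ`, the form shown for ‘buy’ in the table.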

    Although I have tried to show how the words were to be represented, I have not shown the arthrolanguage’s grammar. That would come in the future.


    [1] For a complete list of macrolanguages, go to