Author Archive: vagabonddrifter

Primary Branches of Proto-Austronesian Revised

I was reading Victoria Chen’s dissertation and came across her Table 6.3, which shows the sound correspondences among the Formosan languages. I reproduce the table below.


Chen (2017:206) described how the above table was assembled:

Table 6.3 is a revised tabulation of sound correspondences among higher-order Austronesian languages, based primarily on Blust (1999:43, 2013:583) with modifications based on Ting (1976:342–88), Ho (1978:604–77; 1998:163–66), Li (1977), Ross (2012: 1274–5), and the Austronesian Comparative Dictionary (ACD) (Blust & Trussel ongoing). The grey cells indicate important mergers that define major language groups.

This is a very good resource for attempting my own Austronesian subgrouping. The limitations I see are that (a) there are no vowels, (b) it doesn’t show the correspondences by position in the word, and (c) it is not annotated with exceptions.

Prior Art

Previous work defining the primary branches includes that of Ho (1998), illustrated in Ross (2009) like this:


The next was that of Blust (1999), built on Dahl’s (1973) and Blust’s (1977) “Malayo-Polynesian hypothesis”. Ross also provided a chart of the accepted primary branches, as follows.


Blust (2013:31) further noted that he was unable to reduce the number of Austronesian primary branches.

No convincing evidence has yet been found that would enable us to reduce this collection of languages to a smaller number of primary branches.

Ross (2008:166–167, 172–173, 174) said the same, as quoted below:

Thus Blust (1999) gives evidence that the fourteen Formosan languages form nine primary subgroups of Austronesian, coordinate with each other and with Malayo-Polynesian. This means that there is no ancestor which the Formosan languages share exclusively, as their most immediate shared ancestor is Proto Austronesian, which they also share with Proto Malayo-Polynesian. That is, there is no ‘Proto Formosan’. Sagart (2004; this volume) proposes an alternative subgrouping of Formosan languages, but this too recognizes no ‘Proto Formosan’.

Similarly, Figure 6.2 shows no ‘Proto Western Malayo-Polynesian’, as western Malayo-Polynesian groups have no exclusively shared ancestor (Ross 1995). Despite numerous references in the literature to a ‘Western Malayo-Polynesian’ group, the western Malayo-Polynesian languages consist of some 20–25 groups, each descended from Proto Malayo-Polynesian. A similar comment can be made about the central Malayo-Polynesian languages. I return to these matters in §7.

Above, I have presented Dahl’s (1973) and Blust’s (1977) Malayo-Polynesian hypothesis, accepted in its broad outlines by most historical linguists working on Austronesian. However, two linguists, Dyen (1995) and Wolff (1995), have put forward variants of what Pawley (2002) dubs the ‘Formosan-Philippine hypothesis’, and Peiros (1994; this volume) proposes what I will call a Formosan hypothesis. I shall not discuss alternative hypotheses about the migrations of early Austronesian speakers offered by archaeologists Meacham (1984) and Solheim (1996) or geneticists Oppenheimer and Richards (2001, 2002), as they employ no linguistic evidence.

The Formosan-Philippine hypothesis essentially says that the Formosan and Philippine languages look so similar that they must form a subgroup. The similarities are to be found both in vocabulary and in grammar. This position, however, neglects the methodological point made in §5 that a subgroup is defined by shared innovations. If the Formosan-Philippine hypothesis is to be taken seriously, then it needs to be shown that Formosan and Philippine languages reflect a set of innovations that other Austronesian languages do not share, i.e. that there is evidence for a shared Proto Formosan-Philippine node. Such evidence has not been offered. Instead, it is likely that the similarities among Formosan and Philippine languages are shared retentions of Proto Austronesian features, an inference which causes no difficulty under the Malayo-Polynesian hypothesis. Alternative explanations would also need to be offered for the Malayo-Polynesian innovations noted above, together with an alternative Austronesian subgrouping, but these have not been forthcoming.

… there are no convincing innovations supporting either the Formosan-Philippine or Formosan hypotheses, and the Malayo-Polynesian hypothesis stands effectively without a strong challenger.

The Formosan languages belong, according to Blust (1999), to nine primary Austronesian subgroups, coordinate with each other and with Malayo-Polynesian. Western Malayo-Polynesian languages belong to 20–25 Malayo-Polynesian groups, coordinate with each other and with Central/Eastern Malayo-Polynesian. The subgrouping of Formosan languages among themselves remains somewhat problematic. Three alternatives to Blust’s proposed nine primary Austronesian subgroups are offered in this volume, one by Peiros (in his Table 7.4), a second by Li, and a third by Sagart.

Then Ross (2009) advanced his Nuclear Austronesian Hypothesis, which created this subgrouping:

Chen (2017) showed that this Nuclear Austronesian Hypothesis is phonologically incompatible with the well-supported Tsouic group, and also showed that the lack of homonymy between nominalizations and verbs is an innovation in Tsou, Rukai and Puyuma.

Sagart also proposed a subgrouping based on numerals, but it was critiqued in Teng & Ross (2010) and Blust (2014).


My Subgrouping

The end result of my effort to find larger Formosan groups is this table of the exclusively shared innovations of each Formosan micro-group.


As you can see, Blust’s (1999) nine primary branches are still there. I have not changed the existing grouping; I have only demoted the nine from primary-branch status, so that there are now just two primary branches of Formosan languages. The tree showing the relationships is as follows.


Victoria Chen listed the phonological innovations that identify the nine primary branches in her Table 6.4, and I created two tables based on it. The first table, the one on top, shows the exclusively shared sound innovations per language, placed contiguously to make it easier to see the inter-relationships among the languages based on these sound changes. These are the sound changes that support the present subgroupings. The second table shows the other innovations that I have identified.


Question marks in the first table mark exceptions, such as the following:

  • Northwest Formosan: merger of *n and *ŋ also happened in Kulon.
  • East Formosan: merger of *n and *N also happened in Kavalan.
  • Tsouic: merger of *j and ∅ happened only in Tsou, not Tsouic as a group.
  • Tsouic: merger of *S and *s supposedly happened in Tsouic, but I don’t see this happening across the board. See the gaps below:
  • PMP: merger of *h and *S. Same here, there are gaps with no merger.

Chen said there is no exclusively shared innovation for Atayal. But going by Ho’s (1998) *ʃ > h sound change, Proto-Atayalic has a shared sound innovation with Northwest Formosan. Western Plains also shared *C > *s with Northwest Formosan.

We can also see that Bunun shared the merger of *C and *t with East Formosan, and the merger of *k and *g with Tsouic. Likewise, Tsouic shared the merger of *r and *R with Puyuma. Chen lists no shared innovations between Rukai and Paiwan.
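The pairwise comparison described in the last two paragraphs can be sketched in a few lines of Python. This is only an illustration: the dictionary holds just the innovations named in those paragraphs (not the full contents of Chen's Table 6.4), the merger labels are my own shorthand, and Rukai and Paiwan are given empty sets simply because nothing shared is listed for them.

```python
from itertools import combinations

# Innovations per group, limited to those named in the paragraphs above.
# The labels are shorthand, not a standard notation.
innovations = {
    "Northwest Formosan": {"*ʃ > h", "*C > s"},
    "Atayalic": {"*ʃ > h"},
    "Western Plains": {"*C > s"},
    "East Formosan": {"*C/*t merger"},
    "Bunun": {"*C/*t merger", "*k/*g merger"},
    "Tsouic": {"*k/*g merger", "*r/*R merger"},
    "Puyuma": {"*r/*R merger"},
    "Rukai": set(),   # nothing shared is listed for Rukai
    "Paiwan": set(),  # nothing shared is listed for Paiwan
}

def shared_pairs(table):
    """Map each pair of groups to the innovations they have in common."""
    result = {}
    for a, b in combinations(table, 2):
        common = table[a] & table[b]
        if common:
            result[(a, b)] = common
    return result

for (a, b), common in shared_pairs(innovations).items():
    print(f"{a} + {b}: {', '.join(sorted(common))}")
```

Each printed pair is only a candidate link. Whether a shared change counts as evidence for a subgroup still depends on ruling out parallel development and contact.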

Next I set about identifying larger groupings. The first thing I noticed was the retroflex consonants. Ross (2006:4) mentioned that these are not actually retroflex but flat laminal post-alveolars.

The use of the term ‘retroflex’ is also perhaps a little misleading. The so-called retroflexes of Formosan languages have similar articulations to the so-called retroflexes of Mandarin (Hui-chuan Huang, pers. comm.), and, although these are conventionally included under the rubric ‘retroflex’ (Hamann 2003:22-23, 45), they do not entail curling of the tongue-tip and are more appropriately described as flat laminal post-alveolars (Ladefoged and Wu 1984, Ladefoged and Maddieson 1996:154). It is reasonable to infer this articulation for PAn. I retain the hooked retroflex symbols here for the sake of readability.

Ross’s inference that this articulation goes back to PAn was based on the fact that no Chinese language in Taiwan has a retroflex.

I was tempted to infer that modern Formosan languages have adapted their articulation to Mandarin, in which their speakers are bilingual, but Hui-chuan Huang (pers. comm.) points out that neither Taiwan Mandarin nor Taiwanese Southern Min have a retroflex articulation, so this is not possible.

Ross (2006:11-12) discussed whether the retroflexes are an innovation or a retention, concluding that “There is a case for a distinction between *d₃ (my *D) on the one hand and *d₁ and *d₂ on the other, but it is less strong.”

Because of the importance in Formosan languages of the dental/alveolar vs retroflex distinction and the relative insignificance of palatal articulation, I inferred in Ross (1992) that the dental/alveolar vs retroflex distinction was a PAn feature. Blust (1999:35) objects this, suggesting that retroflexion (read ‘post-alveolarity’) may be an areal feature that has developed since (I infer) the departure of PMP from Taiwan’s shores. It now seems to me that the distinction between ‘palatal’ and ‘retroflex’ in PAn may be a chimera. Both were post-alveolar, and the allegedly palatal *z and allegedly retroflex *D and *l may well have had more or less the same point of articulation (cf Table 6 below).

In view of the desirability of notation without subscripts, from this point on PAn *Z becomes *z, following Blust’s (1999:34–35, footnote) revised notation, whilst *d₁ becomes *d and *d₃ becomes *D, following Li (1985).

If the distribution of correspondences does indeed reflect shared inheritance, then there are two possible classes of hypothesis. To the first class belong hypotheses to the effect that PAn indeed had just the voiced apical obstruent *d, which underwent an unconditioned split or splits into *d, *Z and *D in a language or languages ancestral to Favorlang-Babuza, Paiwan and Puyuma (this could have been a two-stage event: first a split into *d and *Z/*D, as reflected in Favorlang-Babuza and Puyuma, then a split of the *Z/*D proto phoneme into Paiwan *Z and *D). The second class consists of the hypothesis that *d, *Z and *D are indeed of PAn antiquity, but have merged in most daughter languages.

Hypotheses of the first class would place Favorlang-Babuza, Paiwan and Puyuma in a subgroup. This would be problematic, as Tsuchida (1982:9–11) shows that Favorlang-Babuza, Taokas, Papora and Hoanya probably formed a subgroup, which Blust (1999) labels ‘Central Western Plains’ and which, to my knowledge, no investigator has ever questioned. Furthermore, despite my temerity in suggesting otherwise Ross (1992:43), there is no evidence for anything other than a contact relationship between Paiwan and Puyuma, which are strikingly different in historical phonology, morphology and lexicon.

The difficulty with the hypothesis that *d and *Z are of PAn antiquity is that it implies that they have merged in most daughter languages, and, unless we wish to claim that all these languages are a subgroup, we must infer that the mergers have occurred several times over. This is in fact not difficult to believe. Proponents of the Malayo-Polynesian hypothesis (Blust 1977, 1999) claim that *t and *C merged several times over, in PMP and in Siraya, Amis, Bunun, Kavalan and Ketagalan. A similar claim is made with regard to *z and *d (in its conventional embodiment) in Saisiyat, Pazeh, Atayalic, Thao, Rukaic, Tsouic, Paiwan, Puyuma, Amis, Bunun and Kavalan. Against this background, it is reasonable to infer that *d and *Z, the putative voiced equivalents of *t and *C, have merged in Saisiyat, Pazeh, Atayalic, Hoanya, Thao, Rukaic, Tsou, Siraya, Amis, Bunun, Kavalan and PMP.

I have excluded *D from the discussion in the previous paragraph, as its status is obviously more doubtful. It is represented by only the three Paiwan-Puyuma examples in (4a) and the Favorlang-Thao-Kanakanavu-Paiwan example in (4b). There are, on the other hand, no cases of Paiwan ɖ with a Puyuma correspondent other than ɖ. Could these four examples and the three listed under *D in §2.7, and indeed all other etyma with ɖ listed by Ferrell (1982) and Egli (2002), be due to borrowing? The question is a difficult one, and requires much larger lexical databases in Formosan languages other than Paiwan for its solution. Thus Paiwan ke-ɖemel (4a-i) may well be a borrowing from Puyuma ke-zemer, the more so as the latter reflects putative PAn *ZemeR), and inherited Paiwan terms reflect *R as zero. Paiwan paŋuɖal ‘pineapple’ in (4a-iii) may be a borrowing of the Puyuma word with the same form, the more so as terms for plants are often borrowed (Wolff 1994). On the other hand, there is no evidence that Paiwan numerals are borrowed, and Paiwan ɖusa ‘two’ in (4a-ii) does not reflect a borrowing of Puyuma ɖua. For the time being, the safer assumption is that *D did occur in PAn, and that the merger of *Z and *D has apparently occurred everywhere except in Paiwan and perhaps Thao and Kanakanavu. However, I do not think this assumption is safe enough to use in subgrouping hypotheses.

Blust (2013:576, 577), however, still considers the retroflex *D a product of borrowing.

Following an earlier proposal by Ogawa and Asai, Dahl (1976:58ff) reconstructed *d1, *d2, and *d3. This distinction is based almost exclusively on data from Paiwan and Puyuma of southeast Taiwan, two neighbouring languages that have been in a borrowing relationship for many centuries. Although Dahl suggested that Paiwan and Puyuma are mutually supportive in distinguishing these three types of *d, the evidence is in fact contradictory (Blust 1999b:49ff). Moreover, the *d1-d3 distinction crosscuts Dempwolff’s *d/D, so that if both sets of distinctions are accepted the number of reconstructed *d phonemes must be greater than three. All-in-all, then, the puzzling apical stop correspondences in Paiwan and Puyuma are probably best explained as products of a long and complex history of borrowing.

Based on that, I took it that the retroflex consonants are innovations in Rukai, Puyuma and Paiwan, with their geographic contiguity a factor, as is evident from the Wikipedia map of the Formosan languages.

Next I looked at how the Formosan languages reflect the velars. Bunun and Tsouic have *g > *k, while East Formosan, Atayalic, Northwest Formosan and Western Plains have *g > ∅. Because the reflex of *g does not conflict with Blust’s subgrouping and actually groups his primary branches further, I take it that this should be used to group his nine primary branches, after the retroflex sound change. The end result was the table above showing the exclusively shared innovations for the groups. Based on the retroflex innovation and the *g reflexes, PMP is not to be subgrouped with any Formosan language.
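As a rough illustration of this two-step procedure (retroflex innovation first, then the reflex of *g), here is a small Python sketch. The function names and bucket labels are made up for the example, and PMP is deliberately left without an entry because the text above assigns it to neither class.

```python
from collections import defaultdict

# Step 1: groups treated above as sharing the retroflex innovation.
RETROFLEX = {"Rukai", "Puyuma", "Paiwan"}

# Step 2: reflex of *g for the remaining groups, as stated above.
G_REFLEX = {
    "Bunun": "k", "Tsouic": "k",
    "East Formosan": "∅", "Atayalic": "∅",
    "Northwest Formosan": "∅", "Western Plains": "∅",
}

NAMES = ["Rukai", "Puyuma", "Paiwan", "Bunun", "Tsouic", "East Formosan",
         "Atayalic", "Northwest Formosan", "Western Plains", "PMP"]

def classify(group):
    """Apply the retroflex test first, then fall back to the *g reflex."""
    if group in RETROFLEX:
        return "retroflex innovation"
    reflex = G_REFLEX.get(group)
    return f"*g > {reflex}" if reflex else "unclassified"

def build_groups(names):
    """Bucket the group names by the label that classify() assigns."""
    buckets = defaultdict(list)
    for name in names:
        buckets[classify(name)].append(name)
    return dict(buckets)

for label, members in build_groups(NAMES).items():
    print(f"{label}: {', '.join(members)}")
```

The order of the two tests matters: a group is checked for the retroflex innovation before its *g reflex is consulted, which mirrors the ordering argued for above.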

Norquest and Downey

Later on, I came across Norquest and Downey (2013), who present evidence that the Austronesian languages in Taiwan constitute a subgroup, and that there was an original retroflex series in PAN.

Overall, it appears that within Austronesian, evidence for the retroflex series has been preserved best at two geographic extremes – in the northwest in Taiwan and the Philippines, and in the southeast in WCMP. The cumulative reflexes for the retroflex series in the Formosan languages are given below:


The most conservative Formosan language appears to be Puyuma, which has maintained retroflex reflexes in all cases. It is notable that even though the distinction between *l and *ɭ is not maintained in Taiwan, the Formosan languages still provide indirect evidence for the latter (as noted in Ross (1992)) since in the majority of cases, *l and *ɭ seem to have merged as *ɭ, with modern languages showing a combination of lateral and rhotic reflexes.

Their conclusion is worth quoting in full below:

Evidence has been presented in this paper for three new phonemes (*f, *ɭ, and *g), as well as additional extra-Formosan evidence for *ʈ and an expanded domain for *c within PMP. The evidence comes from two subgroups on Borneo, as well as three of the four corners of the Austronesian-speaking world: Nias in the southwest, PWOc in the northeast, and WCMP in the southeast, with the conservative Formosan languages of Taiwan in the extreme northwest completing the picture.

According to the methodology of historical linguistics, whatever is reconstructed for PMP that is not the result of a conditioned split can be projected to the level of Proto Austronesian. The Out-of-Taiwan ‘express train’ hypothesis predicts that phonemic mergers should have occurred as the Austronesian expansion proceeded in time and space; the number of inherited phonemes for any node would be equal to or less than the number of those in the node above, and any secondary splits increasing the phoneme inventory which occurred in a lower node would be localized within that node with the conditioning factors likely remaining transparent.

As shown below in Table 15, however, this is not the pattern that appears. The Formosan languages are still unique in directly preserving evidence for the palatal phonemes *ç and *ʎ. Formosan evidence for *ʈ, however, is now found in three other locations, and evidence for several other phonemes can be found in several other groups as well:


In terms of sheer number of distinctions preserved, WCMP is actually the most conservative group, followed by Nias. If one assumes CMP and EMP (SWHNG + Oceanic; See Fig. 1) to be the two lowest nodes of the Austronesian phylogenetic tree, then it is perplexing that they are more conservative than most WMP languages, the exceptions being the West Barito and North Sarawak groups on Borneo and the Barrier Islands group (to the extent that other languages of that region can be shown to subgroup with Nias). How to interpret these data?

As mentioned above, the most conservative groups lie either on the periphery of the Austronesian-speaking world or on Borneo. Our present working hypothesis is that these languages represent an older layer of Austronesian languages that have been located in their present positions for some time. The WMP languages (excluding the Barrier Island languages), on the other hand were more recent expansions by various groups out of Borneo, possibly triggered by climate stress or other cataclysmic factors. The hypothesis that the Malayo-Chamic languages originated on Borneo (see for example Collins & Sariyan 2006) is well known; the South Sulawesi languages are related to the Tamanic group on Borneo, and the Philippine languages may subgroup with Sabahan (although this is still conjectural (Blust 1998)). If these examples are any indication, then it may be shown eventually that other WMP languages and subgroups originated on Borneo (cf. Blust 2010), and that immigration out of Borneo and into the surrounding islands has been occurring for quite some time, including quite possibly even the Philippine languages from Sabah in northern Borneo.

The phylogenetic tree in Figure 2 supports this conclusion. The tree was derived via a binary distance matrix based on phonological mergers and neighbor-joining. It suggests that the closest relationships between the easternmost Austronesian groups (WCMP and PWOc) are not to each other, but rather to discrete groups on Borneo, WCMP joining with Dohoi and PWOc joining with Proto North Sarawak. Although this phylogeny must remain tentative for now, we note that it is geographically consistent with two eastern migrations out of Borneo – one from southeast Borneo into the Nusa Tenggara region, and one from northern Borneo to the Bird’s Head region of New Guinea which then spread eastward.

With this in mind, we propose an alternative to the traditional Austronesian expansion hypothesis: that the Formosan languages do form a subgroup, and that this group represents the first migration away from the original Austronesian homeland (and therefore the first split in the Austronesian phylogenetic tree, similar to the place of Anatolian within Indo-European). This Formosan-Malayo-Polynesian sister-group hypothesis would predict that retentions and innovations would be found in both subgroups, and not necessarily be constrained to WMP, as the OoT hypothesis would imply.

If it can be shown convincingly that the Formosan languages form a discrete innovation-defined subgroup and that the Formosan group is effectively a sister of PMP, then the question of Formosan origins becomes open – did the early Formosans migrate from mainland southeast Asia, as is commonly supposed, or might they have migrated from somewhere further south, perhaps ultimately from Borneo itself? Figure 2 above confirms that the Formosan languages still maintain a unique position in the Austronesian family tree, as they are the only languages to preserve concrete evidence for the phonemes we interpret as *ç and *ʎ. The same cannot be said for Proto Philippines, which retains evidence for *ɭ with WCMP (and evidence for *ɳ — if this hypothesis is valid — with North Sarawak and Malayic). This question is ultimately outside the scope of this paper, but we hope to explore it in the future using both linguistic and non-linguistic evidence.
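The quoted conclusion says the tree was derived "via a binary distance matrix based on phonological mergers and neighbor-joining". To show roughly what that involves, here is a minimal Python sketch. The four languages A to D and their merger profiles are invented for illustration and are not N&D's data. I also assume a plain Hamming distance between profiles, since the paper's exact distance measure is not given in the passage.

```python
from itertools import combinations

# Hypothetical merger profiles: 1 = merger occurred, 0 = contrast retained.
# Labels and values are invented; they are not from Norquest and Downey.
profiles = {
    "A": (1, 1, 0, 0, 0),
    "B": (1, 1, 1, 0, 0),
    "C": (0, 0, 0, 1, 1),
    "D": (0, 0, 1, 1, 1),
}

def hamming(u, v):
    """Number of mergers on which two profiles disagree."""
    return sum(a != b for a, b in zip(u, v))

def distance_matrix(table):
    """Pairwise distances, keyed by unordered pairs of names."""
    return {frozenset((x, y)): hamming(table[x], table[y])
            for x, y in combinations(table, 2)}

def neighbor_joining(dist, taxa):
    """Return the pairs joined at each step of neighbor-joining, in order."""
    dist = dict(dist)
    nodes = list(taxa)
    joins = []

    def d(a, b):
        return 0 if a == b else dist[frozenset((a, b))]

    while len(nodes) > 2:
        n = len(nodes)
        total = {a: sum(d(a, b) for b in nodes) for a in nodes}
        # Q(i, j) = (n - 2) * d(i, j) - total_i - total_j; join the minimum.
        i, j = min(combinations(nodes, 2),
                   key=lambda p: (n - 2) * d(*p) - total[p[0]] - total[p[1]])
        new = f"({i},{j})"
        for k in nodes:
            if k not in (i, j):
                # Distance from the new internal node to every other node.
                dist[frozenset((new, k))] = (d(i, k) + d(j, k) - d(i, j)) / 2
        nodes = [k for k in nodes if k not in (i, j)] + [new]
        joins.append((i, j))
    joins.append(tuple(nodes))
    return joins

joins = neighbor_joining(distance_matrix(profiles), list(profiles))
print(joins)
```

With these toy profiles A pairs with B and C pairs with D. A distance-based tree of this kind counts shared retentions and shared innovations alike, which is one reason such a phylogeny has to be treated as tentative.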

There is a blog post here reviewing N&D (2013) that accepts the new sound correspondences but not the phonetic values assigned to them. Its author proposed alternative values, which I do not quote below.

One of the contrasts involved (*t versus *C) has previously been used as a part of the definition of Malayo-Polynesian; while a couple of them are only attested in MP. This means that a separate Formosan branch could now be defined, supported by about as many phonological isoglosses as Malayo-Polynesian is as well (as for lexical evidence, the question now becomes if Formosan-only items are innovations or retentions). S&D also locate an alternate option for the barycenter of the diversity of the family, in Borneo. And though they don’t say quite as much: this would moreover even seem to substantially weaken the idea that the vast MP group is a separate branch of Austronesian at all, and not simply an areal entity.

N&D’s phonetic assignments for their new segments I find far from unproblematic however. They propose splitting Proto-Austronesian *p into *p and *f; *k into *k and *g; and *l into *l and *ɭ. In support of the retroflex value of the latter, they also reanalyze the previously known *C and *j as *ʈ and *ɖ. Also, a previous Proto-Austronesian *g has already been reconstructed; this they propose to reinterpret as *ɢ. Pretty much all of this seems dubious.

To digress a bit, a retroflex *ɭ was supposedly reconstructed by Paz (1981) for Proto-Philippine. I wonder if this is the same as the interdental approximant in OMGPP (2010), which I have blogged about before as the interdental lateral approximant.

Back to N&D (2013), I can see that PMP as a subgroup is effectively finished, as the PAN *C and *t distinction is found in some PMP languages, and there are other distinctions that are retained in PMP but merged in Formosan, such as N&D’s (2013) *p vs. *f, *l vs. *ɭ, *k vs. *g and *s vs. *c. Whether a separate Formosan branch can be defined remains to be seen, since the other Philippine-type languages also need to be examined to see whether they share the same mergers as the Formosan languages. Some of them may and some may not. For example, we can’t use the retroflex consonants *T, *D or *L as a diagnostic for the Formosan languages, nor *N and *ç, since there is no exclusively shared innovation based on them among the Formosan languages as a whole, although the reflexes of *T were used to define lower-level subgroups in these languages. The exclusively shared innovations of the Formosan languages would come from N&D’s (2013) *p vs. *f, *k vs. *g, and *l vs. *ɭ. Philippine languages may retain the distinction *l vs. *ɭ, but that does not mean that the sound can be used to define a Philippine branch, as the lower-level subgroups will be defined on other criteria and not on the retention of this sound. So it is perfectly possible that the lower-level Formosan subgroups and the 12 Philippine subgroups are grouped in the same branch without any intermediate Formosan or Philippine branches, together with Chamorro, Palauan and even the two Sabahan subgroups, since the Greater North Borneo group is based on lexical innovations (Blust 2010) and not on exclusively shared sound innovations. Smith (2017:209, 340) mentioned that Blust’s evidence for subgrouping Sabahan with North Sarawak is the differential reflexes of intervocalic voiced obstruents after schwa and intervocalic voiced obstruents after any other vowel, which to me is rather minor phonological evidence that might just as well be due to contact.

Sabah presents certain difficulties for identifying lexical innovations that are less of an issue in other parts of Borneo. First, the languages of Northern Borneo, as represented in the wordlists in Lobel (2016), have borrowed from one another, from Greater Central Philippine languages, and from Malay. Second, the conservative phonologies, with few characteristic sound changes in any Sabahan language, make identifying loans problematic.

Blust (1974b:197, 2010:56-68) proposed a North Borneo subgroup containing Southwest Sabah, Northeast Sabah, and North Sarawak. The only piece of phonological evidence for a North Bornean subgroup is differential reflexes of intervocalic voiced obstruents after schwa and intervocalic voiced obstruents after any other vowel. It follows from this evidence that PNB automatically geminated stops after schwa, forming a series of voiced geminates *-bb-, *-dd-, *-jj-, *-gg-, which then underwent terminal devoicing as shown by their reconstructability to Proto-North Sarawak and Proto-Northeast Sabah. He also claims that there are “a number of lexical items” which “appear to be exclusively shared by North Sarawak languages and the languages of Sabah” (Blust 2010:68). He does not, however, list these apparent lexical innovations, arguing that enumerating those lexemes would be more of a distraction than a necessary part of his argument. It is apparent that much weight is placed on gemination after schwa with terminal devoicing in forming the North Borneo hypothesis.
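To picture the conditioning environment described in the quotation, the sketch below applies "geminate a voiced obstruent between schwa and a following vowel" to a few forms. The forms are invented strings, not reconstructions, and the vowel and obstruent inventories are simplifying assumptions. Only the gemination step is modelled; the subsequent terminal devoicing is left out.

```python
import re

# Simplified, assumed inventories for the illustration only.
VOICED_OBSTRUENTS = "bdjg"
VOWELS = "aiuəeo"

# A single voiced obstruent preceded by schwa and followed by a vowel.
AFTER_SCHWA = re.compile(rf"(?<=ə)([{VOICED_OBSTRUENTS}])(?=[{VOWELS}])")

def geminate(form):
    """Double an intervocalic voiced obstruent when the preceding vowel is schwa."""
    return AFTER_SCHWA.sub(r"\1\1", form)

# Invented test forms: the first two have schwa before the obstruent.
for form in ["təbu", "kədat", "tabu", "kidat"]:
    print(form, "->", geminate(form))
```

Obstruents after schwa come out doubled while those after any other vowel are untouched, which is the "differential reflex" the North Borneo argument rests on.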

So on balance, I believe the Sabahan languages subgroup with the northern Austronesian languages, based on syntactic and lexical evidence. I have named this provisional primary branch Northern Austro-Daic in the tree below.

Additionally, future linguists will need to re-evaluate Blust’s positions on and explanations of the following, among others, to see whether they can be overturned in the light of N&D (2013).

  • *t < *T, *t : in Javanese, Madurese, Balinese (Dempwolff 1934-38, Li 1985, Ho 1998)
  • *d < *D, *d : ten Western Indonesian languages (Iban, Malay, Batak, Balinese, etc.), Paiwan, Puyuma (Dahl 1976, Blust 2009)
  • *s < *t′, *θ : unconditioned split of PAn *s in Rukai dialects (Tsuchida 1976, Li 1985)
  • *S < *s, *ʃ : (Ho 1998)
  • *h < *h, *ɦ : h/∅ contrast in word-final position in Amis, Saisiyat, Pazeh, Atayal and Seediq (Tsuchida 1976, Ho 1998).
  • b1/b2 : Idahan languages (Prentice 1974, Norquest 2015).
  • phonemic stress (Zorc 1978)

Back to the Formosan micro-groups: N&D (2013) did not touch on how the Formosan micro-groups are inter-related, which means my work stands but has to be revised. The revised tree and table of sound correspondences are shown below.




In the above table, the dark blue cells show N&D’s (2013) *f > *p merger, which affects effectively all Formosan languages. The orange cells are the sound changes and mergers that I have identified. The other colors are the exclusively shared innovations for each of the nine Formosan groups identified by Blust.

Paiwanic or Southern Formosan retained *g but had *R > r, *r > ∅ and *ʃ > s. Puyumic or Central Formosan is a linkage, indicated by a dashed box around the group name. In this group, *g > ⌐∅ (*g changed into something other than zero). Tsouic and Bunun both had the merger *g, *k > k, while Puyuma had *g > h. Puyuma and Tsouic both had the merger *R, *r > r and *ʃ > ∅. This is what makes it a linkage: sound changes were shared with some but not other members of the group. Outer Formosan had the changes *g > ∅ and *r > ∅. Western Plains additionally had *ʃ > ∅. Kulunic or North Central Formosan has *ʃ > h. Note that Kulunic is also a linkage, where Atayalic and Northwest Formosan both had *ñ > l, Kulun and Northwest Formosan had *C > s, and Kulun and Atayalic had *S > s and *S > sh respectively.
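The notion of a linkage used here (changes shared by some members but not others, overlapping so that the whole group hangs together) can be checked mechanically. The sketch below does so for Puyumic, using only the changes listed in the paragraph above. The function names are my own, and the "chain" test is just one possible way of making the idea precise.

```python
# Innovation -> members showing it, taken from the Puyumic description above.
PUYUMIC_MEMBERS = {"Puyuma", "Tsouic", "Bunun"}
PUYUMIC_INNOVATIONS = {
    "*g, *k > k": {"Tsouic", "Bunun"},
    "*g > h": {"Puyuma"},
    "*R, *r > r": {"Puyuma", "Tsouic"},
    "*ʃ > ∅": {"Puyuma", "Tsouic"},
}

def partially_shared(members, innovations):
    """Innovations found in at least two members but not in all of them."""
    return {name: holders for name, holders in innovations.items()
            if 2 <= len(holders) < len(members)}

def chains_all_members(members, innovations):
    """True if partially shared innovations connect every member, chain-fashion."""
    links = list(partially_shared(members, innovations).values())
    reached = {sorted(members)[0]}  # start from an arbitrary member
    grew = True
    while grew:
        grew = False
        for holders in links:
            # Extend the reached set through any innovation that overlaps it.
            if holders & reached and not holders <= reached:
                reached |= holders
                grew = True
    return reached >= set(members)

print(sorted(partially_shared(PUYUMIC_MEMBERS, PUYUMIC_INNOVATIONS)))
print(chains_all_members(PUYUMIC_MEMBERS, PUYUMIC_INNOVATIONS))
```

For Puyumic the chain runs from Bunun to Tsouic via *g, *k > k and from Tsouic to Puyuma via *R, *r > r, so every member is reached even though no single one of these changes covers all three.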

Because N&D (2013) threw away the primary branches of Austronesian, it would now appear that the possible primary branches for PAD (Proto-Austro-Daic) are the following. I am not committed to this chart; I offer it merely as a theoretical possibility, and I am not very familiar with the groups other than Northern Austro-Daic.


The reason for the dashed lines is that these branches have to be re-evaluated to see if some of them can be promoted to primary branches. Klamer (2019:6) gives an extended summary of the state of what used to be PMP as a subgroup.

In an ideal world, the branches in the MP family tree should represent “interstage” languages that are reconstructible as single proto‐languages, with recursive branching matching the progress of MP languages through ISEA. For the MP tree in Figure 2, those branches applying to languages in ISEA are not supported in this way. This does not mean that we do not know anything about the affiliations of languages in this area. It has been perfectly possible to establish lower level subgroups, where the languages covering part of an island, or some adjacent islands, form a clearly motivated subgroup that derives from MP. For example, the MP subgroups discussed in Adelaar (2005) include six subgroups located on Borneo, three on Sumatra, seven in Sulawesi, and one spreading from Sumatra, via Java, Madura, Bali, and Lombok to Sumbawa (Adelaar 2005, 15–16). That is, there is good evidence that in the north‐western part of Indonesia, MP has more than a dozen first‐order branches. However, to date, there is no evidence of defining innovations that would connect these lower level subgroups to each other and allow the reconstruction of a higher level interstage language such as “Western MP.” It is very clear from the literature over the last decades (Ross, 1995; Tryon, 1995, 25) that Western MP is not considered a reconstructed language and the languages comprised by it do not form a single subgroup but many distinct ones.

Furthermore, the suggestion of the tree in Figure 2 that Central‐Eastern MP (CEMP) is an innovation‐defined subgroup has not generally been accepted among linguists. Blust (1982, 1983–1984, 1993) has argued for its unity, but Adelaar (2005) notes that one morphosyntactic piece of evidence for CEMP, the innovation of using proclitic subject markers on the verb, is not reflected in cognate forms so could also be the result of a convergent development. Moreover, as Blust admits, subject proclitics are also found in Sulawesi languages, as well as in Barrier Island languages, so that the phenomenon of proclitic subject marking is not unique to CEMP. The second innovation defining CEMP, the morphological distinction between alienable and inalienable possession, is considered as stronger evidence by Blust. However, this distinction is not unique to CEMP either, and there are significant differences in possessive forms and constructions (Adelaar 2005, 25–26). Donohue and Grimes (2008) present evidence indicating that proto‐CEMP did not exist as an interstage language because some of the phonological innovations on which it is based also occur in some western MP languages. Blust (2009) responded that unique innovations remain but also stresses that CEMP “poses some of the most complex and challenging subgrouping issues that are found in [Austronesian]” (Blust, 2009, 75)

Third, in Figure 2, the CEMP node splits into two groups: Central Malayo‐Polynesian (CMP) languages and Eastern Malayo‐Polynesian (EMP). For over 30 years, it has been known that for neither of these groups can a single ancestor language be established (Ross, 1995; Adelaar 2005; Ross, 2008). The partly overlapping distribution of various innovations weakens the argument for a CMP subgroup (Ross, 1995, 82). Instead, Blust (1993) suggests that CMP languages descended from a dialect chain or network that would necessarily have been hundreds of miles long and evolved when MP languages spread through eastern Indonesia very rapidly. However, the evaluation of this hypothesis requires more bottom‐up reconstruction work on a larger amount of geographically balanced lexical materials than has hitherto been used (on the scarcity of data, see also below).

The two remaining subgroups are South Halmahera‐West New Guinea (SHWNG) and Oceanic. Both of these subgroups are defined by a clear set of sound changes (for a recent proposal, see Kamholz, 2014). Oceanic is the most clearly defined of all Austronesian subgroups, having sound changes accompanied by other kinds of innovations (Ross, 1995). However, within the Oceanic subgroup, the structure of the tree is also very rake‐like, with nine first‐order branches (Lynch, Ross, & Crowley, 2002; Ross, 2017), but as Oceania is beyond the region of ISEA it will not be further discussed here.

Instead of the standard tree in Figure 2, the tree that is more commonly accepted in the field is the one in Figure 3, where a distinction is made between actually reconstructable proto‐languages and names for groups of languages that do not derive from a single proto‐language; the latter are shown in italics in Figure 3. For example, Pawley (Pawley, 2007, 21) refers to the “Western Malayo‐Polynesian language groups” and the “Central Malayo‐Polynesian linkage.” “Linkage” is used to refer to a grouping for which no proto‐language can be reconstructed. In Figure 3, the branches within the dotted circle are located in ISEA; the South Halmahera West New Guinea languages are partly located on the Papuan mainland.

In other words, the historical reconstruction data available at present do not allow us to say that proto‐MP branched out in a few daughter languages (such as “Western MP,” “Central Eastern MP” or “Central MP”) from which the lower subgroupings of languages in ISEA derived. What we know at the moment is that under the proto‐MP node there exist dozens of hierarchically unordered clades whose history cannot be modeled with this tree. This may be represented in a rake‐like family tree as in Figure 4 (see Ross, 1995, 2005; Adelaar, 2005; Donohue & Grimes, 2008).

The inclusion of Daic follows Ross’s (2008:171-172) acceptance of its relationship to Austronesian.

Benedict (1942, 1975) proposed a larger ‘Austro-Tai’ family, but this has been poorly received because of the liberties that Benedict took in his reconstructions. More recently, however, Ostapirat (2005) has shown that there are systematic sound correspondences between the basic vocabularies reconstructed for Proto Tai-Kadai and Proto Austronesian. He takes these as evidence that Tai-Kadai and Austronesian are related, but is agnostic as to whether the two together form an Austro-Tai family or whether Tai-Kadai is a high-order subgroup of Austronesian (Ostapirat shows that it does not reflect the defining innovations of Malayo-Polynesian). Building on Ostapirat’s work and his own work on Formosan subgrouping, Sagart (2004, 2005a; this volume) proposes that Tai-Kadai represents a branch of Austronesian that split from the rest of the family at a node perhaps just above Proto Malayo-Polynesian in Figure 6.2. If this is the case, then Tai-Kadai languages are not external witnesses for the purposes of reconstructing Proto Austronesian, although they may provide additional internal evidence.

Norquest’s (2013) findings also point to mutually corroborative evidence indicative of a genetic relationship between Austronesian and Kra-Dai.

Future Directions

I sense that future work on Proto-Austro-Daic (PAD) will vindicate Isidore Dyen, who advocated the Central Hypothesis and treated the languages of Taiwan as a Formosan subgroup. The only difference from Dyen (2006) is that the urheimat would be in Borneo instead of somewhere in the neighborhood of New Guinea, as he proposed.

This would also bring Dyen’s homomeric method some academic respectability. Dyen (2006:1,3) described it as follows:

My use of the so-called homomeric method has been criticized by Blust (1999.67) as based on circular reasoning. The homomeric method draws inferences leading to subgroups from collections of cognate sets with the same distribution over languages. Such a collection is called a homomery from ‘homo-’ = same, ‘-mer-’ = measure. A homomery is a collection of exclusively shared cognate sets.

These numbers are not a list of innovations, but a list containing innovations, a subtle difference. The inclusion of a very large number of innovations is indicated by the magnitude of the collection. It was further indicated that removals from the list were to be expected as research continues, but that additions likewise are not unlikely. The subtraction of one or even a few cognate sets by the finding of a nullifying cognate does not affect the value of a whole collection of sets.

The only modification to this method that I would make is for homomeries to contain only exclusively shared lexical innovations, which is not clear from Dyen above. Blust’s (2013:718) objection is to the mixing of retentions with innovations in the sets and to the use of lexicostatistics. I do not support using lexicostatistics as a method, however.

Dyen’s conclusions followed from the type of data he used: cognate percentages without regard to the innovation/retention distinction.

It could even be said that Blust is using a coarse homomeric method, since he usually cites lexical innovations shared between languages to prove their subgrouping, for example the Greater North Borneo group as explained by Smith (2017) above.
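
To make the grouping step concrete, here is a minimal sketch of how homomeries could be collected, assuming each cognate set is recorded with the languages that reflect it and a yes/no judgment on whether it is an innovation. The language labels A to D, the set names and the `innovations_only` switch are all placeholders for illustration; they are not Dyen's data or his procedure.

```python
from collections import defaultdict

# Hypothetical toy data, not real cognate sets.
# label -> (languages reflecting the set, judged to be an innovation?)
COGNATE_SETS = {
    "set1": ({"A", "B"}, True),
    "set2": ({"A", "B"}, True),
    "set3": ({"A", "B", "C"}, True),
    "set4": ({"C", "D"}, True),
    "set5": ({"A", "B"}, False),  # treated here as a shared retention
}


def homomeries(cognate_sets, innovations_only=False):
    """Group cognate sets that have exactly the same distribution over languages."""
    groups = defaultdict(list)
    for label, (languages, is_innovation) in cognate_sets.items():
        if innovations_only and not is_innovation:
            continue  # the modification proposed above: keep innovations only
        groups[frozenset(languages)].append(label)
    return dict(groups)


def rank_by_size(groups):
    """Order the collections from largest to smallest."""
    return sorted(groups.items(), key=lambda item: len(item[1]), reverse=True)


for languages, labels in rank_by_size(homomeries(COGNATE_SETS, innovations_only=True)):
    print(sorted(languages), len(labels), labels)
```

The size of each collection is what carries the argument in Dyen's description, so losing one set to a "nullifying cognate" only shrinks a group by one rather than dissolving it.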

Dyen (2006:2,5) describes the weaknesses of the comparative method below.

Blust’s newest argument in support of the Formosan hypothesis is based on the observation that many Formosan languages share no significant phonemic mergers (1999.37-55). He couples this with the proposal that only shared phonemic mergers should be used in subgrouping. Phonemic mergers have the advantage that they can not be reversed.

Shared phonemic mergers are highly regarded as evidence for subgrouping. Nevertheless their occurrence can not be guaranteed. The non-occurrence among immediate daughters of a proto-language can not be distinguished from the non-occurrence among the daughters of an immediate daughter. If many Formosan languages exhibit no significant shared mergers, some of them might yet be granddaughters that lacked shared mergers. If Proto-Austronesian could dissolve into daughters that showed no significant shared mergers, why should not a daughter dissolve in the same way?

A second objection arises from experience with the Indo-European languages. Proto-Iranian shares with Proto-Balto-Slavic the merger of voiced aspirate stops with plain voiced stops, but is nevertheless associated in Proto-Indo Iranian with Proto-Indic, which does not exhibit this merger. On the other hand the so-called centum languages are believed to have merged velar stops with palatal stops, but are not taken to form a subgroup. Similarly the satem-languages merge velar and labiovelar stops without being formed into a subgroup. In both cases the changes are interpreted as constituting isoglosses in Proto-Indo-European.

I would add as a disadvantage of the comparative method that reconstruction and subgrouping affect each other. On balance, I see a modified homomeric method, based on exclusively shared lexical innovations, supplementing the comparative method where the latter is weak or not definitive, and serving as a sense check on whether reconstructions and subgrouping hypotheses are sensible.


Blust, Robert. 2013. The Austronesian languages

Chen, Victoria. 2017. A Reexamination of the Philippine-Type Voice System and its Implications for Austronesian Primary-Level Subgrouping

Dyen, Isidore. 2006. Some Evidence Favoring the Central Hypothesis

Klamer, Marian. 2019. The dispersal of Austronesian languages in Island South East Asia: Current findings and debates

Norquest, Peter and Downey, Sean. 2013. Expanding the PAn consonant inventory

Norquest, Peter. 2013. A Revised Inventory of Proto Austronesian Consonants: Kra-Dai and Austroasiatic Evidence

Norquest, Peter. 2015. Revisiting the question of Austronesian implosives

Olson, Kenneth, Mielke, Jeff, Sanicas-Daguman, Josephine, Pebley, Carol Jean and Paterson, Hugh J. 2010. The phonetic status of the (inter)dental approximant

Paz, Consuelo J. 1981. A reconstruction of Proto-Philippine phonemes and morphemes. 

Pittayaporn, Pittayawat. 2006. When Words Erode: Moken Trisyllabic Syncopation and PAn Stress

Ross, Malcolm. 2006. Some Proto-Austronesian Coronals Re-examined.

Ross, Malcolm. 2008. The integrity of the Austronesian language family

Ross, Malcolm. 2009. In defense of Nuclear Austronesian (and against Tsouic)

Sagart, Laurent. 2014. In Defense of the Numeral-based Model of Austronesian Phylogeny, and of Tsouic

Smith, Alexander D. 2017. The Languages of Borneo: A comprehensive Classification.

Zorc, David Paul. 1978. Proto-Philippine Word Accent: Innovation or Proto-Hesperonesian Retention?

Casiguran Agta Verb Affixes and Particles

I came across “Grammatical Sketch of Dumagat (Casiguran)” by Thomas Headland and Alan Healey, which is about a language on the Pacific coast of Isabela and Aurora provinces in Luzon. According to them, this language has three dialects: (1) Casiguran Dumagat, (2) Pahanan Agta and (3) Paranan, but according to Jason Lobel, Paranan is not a dialect of this language. For the purposes of this article, let’s call it Casiguran Agta.

Note: I have re-spelled the words in the above work, by replacing é with ǝ, ë with ɛ, ö with ɔ and ng with ŋ.

There are two areas that interest me about this language:

  1. the function words, and
  2. the verb conjugation.

Noun-Marking Particles,  Personal Pronouns and Demonstrative Pronouns

What’s interesting about the noun-marking particles of Casiguran Agta is the distinction between “absent” and “present” in the non-personal forms, which manifests as ‹u and ‹i respectively. We could say that the “absent” marker is ‹u even though there are the no/to forms, since ‹o is derived from the sequence ‹a+u according to Reid in “On Reconstructing the Morphosyntax of Proto-Northern Luzon” [p68]:

Casiguran Dumagat genitive no and locative to appear to have developed from sequences of *na+u and *ta+u respectively, rather than from *nu and *tu with vowel lowering, since high vowel lowering only occurred when the vowel was stressed.

This “absent” vs. “present” distinction is also found in Isneg, as a “near” vs. “far” distinction [Reid, p.66], but the “far” formative is ‹tu and not ‹u. Thus, we have the following forms: nominative tu, genitive natu, locative kitu and, for all plurals, datu. There is another language on the eastern Luzon side with this distinction. Absent forms are also found in Rinconada Bikol, with su, and in other Bikol lects, with nu. The distinction is not present in Tagalog.

Casiguran Agta has 13 noun-marking particle forms: tu, i, no, na, to, ta, ti, ni, du, di, de, and the reduplicated forms of non-personal plural dudu and didi and these can be put in a paradigm like below:


From the table above, we can see that the forms are not one cohesive unit, but could have different origins.

  1. The plural forms all start with d›, and there are no case distinctions (topic, attributive and oblique), so that we have only 3 plural forms (excluding the reduplicated ones): du (plural non-personal absent), di (plural non-personal present), and de (plural personal). Because there is no single phoneme that distinguishes the plural from the singular in each case (topic, attributive and oblique), unlike in the personal pronouns, it is easy to think of the plural forms as coming from other words diachronically.

    In “A Brief Syntactic Typology of Philippine Languages”, Reid and Liao mentioned that personal plural determiners are the same as the enclitic third person nominative pronouns among Cordilleran languages [p473]. For common nouns: “By following the head noun with a free (non-enclitic) third person plural pronoun. Constructions of this type occur in most of the Cagayan Valley languages of Northern Luzon, such as Central Cagayan Agta (107), Itawis (108), Gaddang, Ibanag and Atta, but not in Yogad or Isnag. It is also found in Paranan (109), on the northeastern coast of Luzon, and in Isinai, a Central Cordilleran language.” [p475].

    We could relate the de to the personal pronoun. Jason Lobel’s “Philippine and North Bornean Languages: Issues in Description, Subgrouping, and Reconstruction” mentions Low Vowel Fronting (the raising of the vowel *a, usually after voiced stops /b d g/ and glides /w y/) among languages of the Pacific Coast, so the equivalent of e is a in Tagalog, Samarnon and Bikol. Therefore, its Tagalog, Samarnon or Bikol cognate would be da. This de is present in the personal plural topic pronoun, whose form is sidɛ in the pronoun table below, although in the text some forms are side (p27). Its cognates would be sila, sira and sinda in Tagalog, Samarnon and Bikol respectively. Low vowel fronting also happened in Dupaningan Agta, but there the personal plural form is di and the personal singular form is ni.

  2. Because Dumagat only has de (from da) in the third person plural attributive pronoun form, we still have to account for the two other forms, du and di. Reid, in “On Reconstructing the Morphosyntax of Proto-Northern Luzon” [p7], thought that they were derived from demonstratives:

     Since case-marking prepositional forms, and the nominal specifiers from which they developed are probably in all cases homophonous with forms that can unambiguously be reconstructed as demonstratives, I will claim that each of the forms historically descended from a demonstrative. This is a claim that has been challenged in the literature, in that the deictic features of a given demonstrative may differ from those of its homophonous nominal specifier. Nevertheless in many cases the deictic features are clearly relatable, as for example where a distal demonstrative (referring to a referent that is far from speaker and addressee) has become a nominal specifier marking a noun as having past reference, or as referring to a deceased person; or a medial demonstrative (referring to a referent that is close to the addressee) has become a nominal specifier that marks a noun as being RECOGNITIONAL,12 that is, within the recent common experience of speaker and addressee (“the one that you and I have just been talking about, or experienced”).

    Since the current demonstrative forms of Casiguran Dumagat don’t include du/di/da, these must be demonstrative forms of a proto-language or borrowings from a language with such proto-forms. The closest reconstructed proto-demonstratives made by Ross were those for Proto-Central Cordilleran (Prox tu, Med na and Dist di), but they do not match Casiguran Dumagat (present di, personal da and absent du), not only in the consonants of the first two forms but also in how the vowels map onto the proximate vs. remote axis. However, we can relate their formal similarity to the oblique case markers of Bikol Agta: di, dya, du as described by Lobel [p71]. Is there a connection between these forms?

  3. Moving on to the singular forms, it’s noticeable that the singular personal forms all have the ‹i vowel (just like Tagalog, Bikol or Samarnon), but this vowel is also present in the singular non-personal present topic form as well as in the plural non-personal present forms. But because the plural forms are not related in form to the singular forms, the ‹i vowel in the singular personal series would not be related to the plural non-personal present series.

  4. Additionally, the singular personal attributive ni form spilled from attributive case into the oblique case. This is similar to Itneg where there is no personal noun specifier for both Genitive and Dative [Reid, p46] and in Dupaningan Agta where ni is used on all grammatical cases.

  5. Another possible group of borrowed particles is na/no, ta/to. The form na in singular non-personal attributive (and no which is from na+u) is also used in Ga’dang, Isneg, Karao, Pangasinan and Bikol Agta. Bikol Agta has ni, na, nu [Lobel, p71] in the attributive just like Casiguran Dumagat, although the ni is non-personal unlike Casiguran Dumagat.

    The oblique forms are expected to start with d› instead of t›, taking a cue from the personal oblique pronoun forms. The singular non-personal oblique form ta (and to, since it is from ta+u) is similar in vowel to Dupaningan Agta ha, Karao cha, Tagalog sa, Central Cagayan Agta ta and possibly one of the forms in Bikol Agta, dya [Lobel, p71].

  6. One of the forms, the singular non-personal present topic form i, stands out in the singular forms’ vowel series. This might point to it not being the original form in this slot. This part of the paradigm is similar to Dupaningan Agta, which does not have any particle in the topic slot but has na in the attributive and ha in the oblique. This could possibly be the original situation in Casiguran Agta as well, until an i was borrowed from another language, possibly Paranan or Northern Alta. This is the exact situation in Karao, whose case markers for consonant-final words, i/na/cha [Reid, p58], form a series very similar to Casiguran Agta i/na/ta. Could one of these languages have borrowed from the other, or both from a common language?

    Another thing to consider is that had the value been a, the series would be similar to the forms in Tagalog, Bikol and Samarnon (, naŋ, sa/saŋ) excluding the final consonant.

  7. We are now left with two forms, the topic forms for singular personal ti and singular non-personal absent tu, which both start with t›. These forms compare with the Bikol topic forms si and su and mean exactly the same. Ivatan (Reid’s Problems in the Reconstruction of Proto-Philippine Construction Markers, pp.53,50) has the same distinction, ni nu and di du, between common nouns and personal nouns for the attributive/genitive and locative cases.
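
The paradigm discussed in the points above, and the two sound changes they rely on, can be put into a small sketch. The layout of `PARADIGM` is a reconstruction from the prose alone (the original table is an image), and the two functions are deliberately naive: `coalesce` only encodes Reid's *na+u > no and *ta+u > to, and `low_vowel_fronting` applies Lobel's fronting without the "usually" hedge. The function and variable names are assumptions made for this illustration.

```python
FRONTING_TRIGGERS = set("bdgwy")  # voiced stops /b d g/ and glides /w y/


def coalesce(base, formative):
    """Join a particle and a formative; final a + u is assumed to give o."""
    if base.endswith("a") and formative == "u":
        return base[:-1] + "o"
    return base + formative


def low_vowel_fronting(form):
    """Turn a into e when it directly follows b, d, g, w or y."""
    result = []
    for index, char in enumerate(form):
        follows_trigger = index > 0 and form[index - 1] in FRONTING_TRIGGERS
        result.append("e" if char == "a" and follows_trigger else char)
    return "".join(result)


# (number, class) -> case -> form, as read from the numbered points above.
PARADIGM = {
    ("singular", "non-personal absent"): {"topic": "tu", "attributive": "no", "oblique": "to"},
    ("singular", "non-personal present"): {"topic": "i", "attributive": "na", "oblique": "ta"},
    ("singular", "personal"): {"topic": "ti", "attributive": "ni", "oblique": "ni"},
    ("plural", "non-personal absent"): {"topic": "du", "attributive": "du", "oblique": "du"},
    ("plural", "non-personal present"): {"topic": "di", "attributive": "di", "oblique": "di"},
    ("plural", "personal"): {"topic": "de", "attributive": "de", "oblique": "de"},
}

distinct_forms = {form for row in PARADIGM.values() for form in row.values()}

print(coalesce("na", "u"), coalesce("ta", "u"))  # attributive and oblique absent forms
print(low_vowel_fronting("da"))                  # plural personal form
print(len(distinct_forms), "distinct forms before adding reduplicated dudu and didi")
```

Counting the distinct cells gives 11 forms, which together with the reduplicated dudu and didi matches the 13 forms listed earlier.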

In the personal pronoun forms, my curiosity is directed at the Set I and Set IV forms. All Set IV forms start with di›, with the same stems as those of Set I, which start with si› or sa›. The stem on which Set I and Set IV are based is Set II. The sa› form in Set I first person singular is identical with Remontado Dumagat sako [Lobel, p74]. I think this sa› and the Bikol sa plural personal noun marker came from the same source. The form sakǝn in Set I singular first person is most likely borrowed from a Northern Luzon language with a form “sa akǝn”.

It’s notable that all Set I forms start with s› instead of the expected t›. Does this mean this is a recent borrowing?


The Set III forms are all identical with Dupaningan Agta [p118].

The ‹ko in Set I siko and Set IV diko is from ‹kaw, so we can say that the Set II form ka is originally kaw through apocope or by clipping ‹w. We know it’s kaw because that is the form in its neighbor to the north, Dupaningan Agta.

The third person forms siya and sidɛ are the same for Set I and Set II. Which one is more original? It looks to me like both slots were filled at the same time, because both start with s›. If Set II were older, it would have started with t›.



Among demonstrative pronouns, it should first be noted that the Casiguran medial forms ina and sina are similar to Hiligaynon forms inâ and sinâ. It is also similar to Northern Alta medial na, with distal ya [Reid, p.64].

The Set I (topic) forms ǝye and ǝya compare with Dupaningan Agta forms aye and aya, except that aya in Dupaningan is medial and ǝya in Casiguran is distal, so they only share the proximal forms ǝye/aye.

The Set II (oblique) forms could be similar too, if we drop Dupaningan Agta i›. Casiguran se is formally like Dupaningan ihe without the i›. The same medial vs. distal difference can be seen in Casiguran sa and Dupaningan ihay without the i› and with clipped ‹y.


There is another article written by Thomas Headland (“Sentence Structure of Casiguran Dumagat”). Browsing through this paper, I noticed that Casiguran Agta also has the linker ǝy on page 11, which is cognate to Tagalog ay and Aklanon hay.

Another interesting particle is da “because”. This is almost identical to Bikol ta. Now this makes me wonder if there is any connection to Tagalog dahil.

Other particles: a (ligature), saka “and”.

Verb Affixes

The authors supplied 3 tables of verb affixes plus two special aspect tables which I have pasted below.









What I find interesting about these affixes are the following:

  1. In the Actor Orientation (Orientation 1), forms with mi› exist side by side with forms without mi›, and these forms have no difference in meaning.

    In those instances where there are two past tense forms side by side in Tables 8, 9 and 10 (such as minag- and nag- ) , there seems to be no difference in their meanings.  (p.27)

    These forms serve well in confirming the regularity of the ‹in› infix, by shedding light on the earlier forms of the counterpart affixes in other Philippine languages that lost the initial mi›.

  2. The infix ‹in› seems to mean “completed”, as in Ilokano, and not “begun”, as in Tagalog. This is similar to other Philippine languages except the Central Philippine languages (Tagalog, Bikol and the Bisayan languages). [Lawrence Reid, On the Development of the Aspect System in Some Philippine Languages, p71-75].
  3. Compared with Tagalog, Bikol or Samarnon, the infix ‹in› seems to be applied after instead of before the orientation affixes. This results in the actor past form ‹inum›, from ‹um› + ‹in› → ‹ч‹in›um›, and the conveyance past form ni›, a short form of ‹ini›, from ‹i› + ‹in› → ‹ч‹in›i›. This contrasts with Bikol ‹umin›, from ‹in› + ‹um› → ‹ч‹um›in›, and i›‹in›.

    The Casiguran Agta prefixes which start with min› also follow the rule above: the infix ‹in› was inserted after ‹um› was applied, and after truncation. Example for the minag› prefix: pag› + ‹um› + ‹in› → p‹um›ag› → (pu)mag› → m‹in›ag› → minag›. This is the opposite of Reid’s (p77) conclusion below.

    In a number of Northern Philippine languages, including Northern and Southern Alta, and Casiguran Dumagat (Reid 1988), affixation of the completed aspect of actor focus, pag- and pang- derived verbs, is minag-, or minang-, requiring reconstruction of Proto-Cordilleran *m (in)aR- and *m(in)aN-,26 each of which implies infixation of *(in) prior to voice affixation. In numerous other languages, these forms have now been reduced to n-initial prefixes, such as Ilokano nag- and nang-.

    This is because Reid thinks of it in terms of infixation of ‹in› into the affix mag›, with the resulting minag› affix subsequently applied to the stem, so that ‹in› was applied first. But if we think of the stem as already carrying mag› and then subsequently infixed with ‹in›, then we arrive at the opposite conclusion.

  4. The “Continuative” aspect shown in Table 11 is ‹Cǝ›|‹ǝ›, formed by reduplicating the initial consonant of the stem plus ‹ǝ›, except that if the initial consonant is a glottal stop, the glottal stop is not repeated. This is different from Bisayan ‹a›, Tagalog and Bikol ‹CV:›, Ilokano ‹CVC›/‹CV:› [Reid, p71], or Manide Agta ‹CVC› [Lobel, p266-267]. In consonant-initial stems, the reduplicated Cǝ› segment is similar to Tagalog and Bikol CV› in that the stem-initial consonant is included in the reduplicant, yet different since the vowel component is fixed as ‹ǝ›. The pattern for vowel-initial stems is similar to Bisayan, except that the Bisayan vowel is ‹a› instead of ‹ǝ›.
  5. However, the differences from the above languages’ Continuous aspect forms are the following:

    1. It is said to occur only in Actor Orientation (Orientation 1). In Tagalog or Bikol it occurs in all orientations.
    2. The example only shows mǝg› and no other Actor orientation forms; there are no mag›, maŋ› or maŋi› examples.
    3. There is an additional suffix ‹ǝn.
    4. Apart from the continual meaning, it also has repetitive and intensive meanings.
  6. The cooperative affixes magpag› and mǝgpǝg› function similarly to Tagalog and Bikol maŋag›. I wonder if the ‹ŋa› in maŋag› is a way to indicate that it’s derived from a doubled morpheme pag›, i.e. (p‹u)m›agpag›, or maybe that the medial consonant cluster ‹gp› was reduced to ‹ŋ›.

  7. The “accidental aspect” (meaning accidental or unintentional action) is described as using an inner prefix ke›, which is distinct from other inner prefixes ka› (stative) and kǝ›.
  8. The affixes maŋi›/mǝŋi›, maŋipa›/maŋipe›, mǝŋipa›/mǝŋipe›, mǝŋipag› and mǝŋipaŋ› are absent in Tagalog or Bikol. These are in conveyance orientation or benefactive voice.
  9. An even more interesting discovery is the presence of alternation on the vowels of some prefixes, like mag› and mǝg›:

    If one compares forms vertically in Tables 8 , 9 and 10, it will be seen that there are many pairs of forms that are the same except for the vowel, one containing a and one containing é.  In the non-past forms, such as mag- and még- , the a at times indicates future and imperative whereas é indicates present continuous and habitual …. However , in other instances , they seem to be the same tense and differ in transitivity or in other ways that are less clear. [p.27]…The forms containing pe- could be regarded as involving a fusion of pa- and i- . [p37]

    Without context, it would be difficult to state categorically what differentiates them. I think the difference between a and ǝ is just one of aspect and not tense (future vs. present), because the four given exceptions clearly indicate that either can occur in present and future. My hunch is that what is involved is not transitivity but habitual aspect, as the two mǝg› exceptions are not incompatible with a habitual reading: Mǝgdigus ǝk, I will bathe (myself), i.e. I will bathe (myself in the usual way, time, etc. that I do). Mǝglogbut ǝk, I will submerge (myself) in water, i.e. I will submerge (myself in the usual way) in water. The other two mag› examples are less prototypical of habitual actions, but not necessarily incompatible with habitual meaning. Context will greatly aid in disambiguating the actual meaning.

    The suggestion that pe› could come from pa› + ‹i› is not believable, since there are forms that alternate among a, ǝ and e, such as those below, and these could not all be derived from ‹i›. That these vowels combine in so many different ways would imply an ‹i› in every one of those combinations, which is just not believable.

    ka› ke› kǝ›
    pa› pe› pǝ›
    pag› peg› pǝg›
    paŋ› peŋ› pǝŋ›
    paŋi› peŋi› pǝŋi›
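
The ordering argument in point 3 can be checked mechanically. The sketch below assumes that an infix always goes right after the first segment of whatever form it is applied to, and that truncation drops one initial CV syllable; both are simplifications made for this illustration. The stem digus is taken from the mǝgdigus example above, but the infixed outputs are generated forms, not forms attested in Headland and Healey.

```python
def infix(form, affix):
    """Insert an affix after the first segment (assumed to be one consonant)."""
    return form[0] + affix + form[1:]


def drop_first_syllable(form):
    """Drop an initial CV syllable, e.g. pumag -> mag."""
    return form[2:]


# The minag› pathway proposed above: ‹um› first, truncation, then ‹in›.
with_um = infix("pag", "um")              # p‹um›ag
truncated = drop_first_syllable(with_um)  # (pu)mag
with_in = infix(truncated, "in")          # m‹in›ag
print(with_um, truncated, with_in)

# The order of application decides between ‹inum› and ‹umin›.
stem = "digus"
um_then_in = infix(infix(stem, "um"), "in")  # Casiguran-type order
in_then_um = infix(infix(stem, "in"), "um")  # Bikol-type order, shown on the same stem
print(um_then_in, in_then_um)
```

Applying ‹um› before ‹in› yields the ‹inum› sequence, while the reverse order yields ‹umin›, which is the contrast with Bikol described in point 3. Both orders can arrive at minag›, which is why the prefix alone does not settle the disagreement with Reid.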

Some of the other affixes described are similar to Tagalog, Bikol or Bisayan:

  1. abilitative aspect maka›
  2. purposive aspect mǝki›
  3. casual aspect affixes magR› and mǝgR› (and forms  minagR›, minǝgR›,nagR› and nǝgR›)
  4. playing aspect magCV› and mǝgCV› (and forms minagCV›, minǝgCV›, nagCV›, nǝgCV› for consonant initial stems and magV›, mǝgV›, minagV›, minǝgV›, nagV›, nǝgV› for glottal initial stems)
  5. deceptive aspect magCVCV(C)›‹an and mǝgCVCV(C)›‹an (and forms minagCVCV(C)›‹an, minǝgCVCV(C)›‹an, nagCVCV(C)›‹an, nǝgCVCV(C)›‹an for consonant initial stems and magVCV(C)›‹an, mǝgVCV(C)›‹an, minagVCV(C)›‹an, minǝgVCV(C)›‹an, nagVCV(C)›‹an, nǝgVCV(C)›‹an for glottal initial stems)
  6. cooperative purposive voice mǝkipag›
  7. external ability makipag›
  8. causative pa› and pe›
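
The two reduplication templates mentioned so far, the fixed-vowel Cǝ› of the continuative and the copying CV› of the playing aspect, differ only in where the vowel comes from. Here is a minimal sketch of that difference, assuming that glottal-initial stems are written vowel-initial and that every consonant-initial stem begins with a single consonant followed by a vowel. The stem abak is a made-up placeholder for a vowel-initial stem, and none of the outputs are attested forms.

```python
VOWELS = set("aeiouǝɛɔ")


def continuative_reduplicant(stem):
    """Cǝ› for consonant-initial stems, plain ǝ› when the stem is written vowel-initial."""
    first = stem[0]
    return "ǝ" if first in VOWELS else first + "ǝ"


def cv_reduplicant(stem):
    """CV› copied from the stem, or just V› when the stem is written vowel-initial."""
    return stem[0] if stem[0] in VOWELS else stem[:2]


for stem in ("digus", "abak"):  # "abak" is a hypothetical placeholder stem
    print(stem, continuative_reduplicant(stem), cv_reduplicant(stem))
```

The sketch returns only the reduplicant; how it combines with mǝg› and the suffix ‹ǝn in the continuative is left out because the source examples are not reproduced here.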

Different Organization of Verbal Affix Paradigm

I have re-organized the affixes in Tables 8, 9 and 10 into different tables below. The forms with * and in orange background are not in the original tables.

With basic affixes:

Orientation Non-Past Past
Actor ‹um› ‹inum›
Object ‹ǝn ‹in›
Location ‹an ‹in› ‹an
Conveyance i› ni›
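
As a reading aid for the basic table, the following sketch applies its eight affix slots to one stem. It assumes infixes go after the first consonant and that prefixes and suffixes attach without any sound change; the stem digus comes from the mǝgdigus example earlier, and the outputs are mechanically generated, not checked against Headland and Healey.

```python
def infix(stem, affix):
    """Insert an affix after the first segment (assumed to be one consonant)."""
    return stem[0] + affix + stem[1:]


# Orientation -> (non-past builder, past builder), following the basic table.
BASIC_AFFIXES = {
    "Actor": (lambda s: infix(s, "um"), lambda s: infix(s, "inum")),
    "Object": (lambda s: s + "ǝn", lambda s: infix(s, "in")),
    "Location": (lambda s: s + "an", lambda s: infix(s, "in") + "an"),
    "Conveyance": (lambda s: "i" + s, lambda s: "ni" + s),
}


def conjugate(stem):
    """Return {orientation: (non-past form, past form)} for one stem."""
    return {
        orientation: (non_past(stem), past(stem))
        for orientation, (non_past, past) in BASIC_AFFIXES.items()
    }


for orientation, (non_past_form, past_form) in conjugate("digus").items():
    print(orientation, non_past_form, past_form)
```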

With ka›, ke› or kǝ›:

Orientation Non-Past Past
Actor ka›
mina› / na›
mine› / ne›
Object ka›
ǝn in
Location ka›
ke› ‹an
‹in› ‹an
kine› ‹an
Conveyance ka›

With pa›, pe› or pǝ›:

Orientation Non-Past Past
Actor pa›
mina› / na›
Object pa›
Location pa›
Conveyance pa›

With pag› or pǝg›:

Orientation Non-Past Past
Actor pag›
minag›  / nag›
minǝg› / nǝg›
Object pag›
Location pag›
pinag› ‹an
pinǝg› ‹an
Conveyance pag›

With paŋ› or pǝŋ›:

Orientation Non-Past Past
Actor paŋ›
minaŋ› / naŋ›
minǝŋ› / nǝŋ›
Object paŋ›
Location paŋ›
Conveyance paŋ›

With paŋi› or pǝŋi›:

Orientation Non-Past Past
Actor paŋi›
minaŋi› / naŋi›
minǝŋi› / nǝŋi›
Object paŋi›
Location paŋi›
Conveyance paŋi›

With paka›, peka› or pǝka›:

Orientation Non-Past Past
Actor paka›
minaka› / naka›
minǝka› / nǝka›
Object paka›
Location paka›
Conveyance paka›

With paki›, peki› or pǝki›:

Orientation Non-Past Past
Actor *paki›
*minaki› / *naki›
minǝki› / nǝki›
Object *paki›
Location *paki›
Conveyance *paki›

With pǝke›:

Orientation Non-Past Past
Actor *pǝke› mǝke› minǝke› / nǝke›
Object *pǝke› *pǝke›‹ǝn *pinǝke›
Location *pǝke› *pǝke›‹an *pinǝke›‹an
Conveyance *pǝke› *ipǝke› *nipǝke›

With pǝgke›:

Orientation Non-Past Past
Actor *pǝgke› mǝgke› minǝgke› / nǝgke›
Object *pǝgke› *pǝgke›‹ǝn *pinǝgke›
Location *pǝgke› *pǝgke›‹an *pinǝgke›‹an
Conveyance *pǝgke› *ipǝgke› *nipǝgke›

With pǝgke› ‹an:

Orientation Non-Past Past
Actor *pǝgke›‹an mǝgke›‹an minǝgke›‹an/nǝgke›‹an
Object *pǝgke›‹an *pǝgke›‹anǝn *pinǝgke›‹an
Location *pǝgke›‹an *pǝgke›‹anan *pinǝgke›‹anan
Conveyance *pǝgke›‹an *ipǝgke›‹an *nipǝgke›‹an

With pǝŋike›:

Orientation Non-Past Past
Actor *pǝŋike› mǝŋike› minǝŋike›/nǝŋike›
Object *pǝŋike› *pǝŋike›‹ǝn *pinǝŋike›
Location *pǝŋike› *pǝŋike›‹an *pinǝŋike›‹an
Conveyance *pǝŋike› *ipǝŋike› *nipǝŋike›

With pa›, pe› or pǝ› ‹an:

Orientation Non-Past Past
Actor *pa›‹an
Object *pa›‹an
Location *pa›‹an
Conveyance *pa›‹an

With pag›, peg› or pǝg› ‹an:

Orientation Non-Past Past
Actor *pag›‹an
pǝg› ‹an
mag› ‹an
mǝg› ‹an
minag›‹an/nag›‹an minǝg›‹an/nǝg›‹an
Object *pag›‹an
pǝg› ‹an
Location *pag›‹an
pǝg› ‹an
Conveyance *pag›‹an
pǝg› ‹an

With papa› or pǝpa›:

Orientation Non-Past Past
Actor *papa›
minapa› / napa›
minǝpa› / nǝpa›
Object *papa›
Location *papa›
Conveyance *papa›

With pepa› or pepe›:

Orientation Non-Past Past
Actor *pepa›
minepa› / nepa›
minepe› / nepe›
Object *pepa›
Location *pepa›
Conveyance *pepa›

With pape› or pepe›:

Orientation Non-Past Past
Actor *pape›
minape› / nape›
*minepe› / *nepe›
Object *pape›
Location *pape› 
Conveyance *pape›

With papa›, pepa› or pǝpa› ‹an:

Orientation Non-Past Past
Actor *papa›‹an
minapa›/napa›‹an *minǝpa›/nǝpa›‹an
Object *papa›‹an
Location *papa›‹an
Conveyance *papa›‹an

With pagpa›, pegpa› or pǝgpa›:

Orientation Non-Past Past
Actor *pagpa›
minagpa› / nagpa› minǝgpa› / nǝgpa›
Object *pagpa›
Location *pagpa›
Conveyance *pagpa›

With pagpe›:

Orientation Non-Past Past
Actor *pagpe› magpe› minagpe›/nagpe›
Object *pagpe› *pagpe›‹ǝn *pinagpe›
Location *pagpe› *pagpe›‹an
Conveyance *pagpe› *ipagpe› *nipagpe›

With pagpa›, pegpa› or pǝgpa› ‹an:

Orientation Non-Past Past
Actor *pagpa›‹an
magpa› ‹an
mǝgpa› ‹an
minagpa›/nagpa›‹an minǝgpa›/nǝgpa›‹an
Object *pagpa›‹an
Location *pagpa›‹an
Conveyance *pagpa›‹an

With paŋpa›, peŋpa› or pǝŋpa›:

Orientation Non-Past Past
Actor *paŋpa›
Object *paŋpa›
*pǝŋpa› ‹ǝn
Location *paŋpa›
*paŋpa› ‹an
*pǝŋpa› ‹an
*pinaŋpa›  ‹an
*pinǝŋpa›  ‹an
Conveyance *paŋpa›

With papag› or pǝpǝg›:

Orientation Non-Past Past
Actor *papag›
minapag› / napag› minǝpǝg› / nǝpǝg›
Object *papag›
*pǝpag› ‹ǝn
Location *papag›
*papag› ‹an
*pǝpag› ‹an
*pinapag›  ‹an
*pinǝpag›  ‹an
Conveyance *papag›

With papag› or pǝpǝg› ‹an:

Orientation Non-Past Past
Actor *papag›‹an
minapag›‹an/napag›‹an *minǝpǝg›‹an/*nǝpǝg›‹an
Object *papag›‹an
*pinapag› ‹an
Location *papag›‹an
Conveyance *papag›‹an

With pepag›:

Orientation Non-Past Past
Actor *pepag› mepag› minepag›/*nepag›
Object *pepag› *pepag›‹ǝn
Location *pepag› *pepag›‹an
Conveyance *pepag› *ipepag› *nipepag›

With pagpag› or pǝgpǝg›:

Orientation Non-Past Past
Actor pagpag›
minagpag›/nagpag› minǝgpǝg›/nǝgpǝg›
Object pagpag›
Location pagpag›
Conveyance pagpag›

With pepaŋ› or pepǝŋ›:

Orientation Non-Past Past
Actor *pepaŋ›
minepaŋ›/*nepaŋ› minepǝŋ›/*nepǝŋ›
Object *pepaŋ›
Location *pepaŋ›
Conveyance *pepaŋ›

With pagpaŋ› or pǝgpǝŋ›:

Orientation Non-Past Past
Actor *pagpaŋ›
minagpaŋ›/nagpaŋ› *minǝgpǝŋ›/*nǝgpǝŋ›
Object *pagpaŋ›
Location *pagpaŋ›
Conveyance *pagpaŋ›

With pakapag› or pǝkapǝg›:

Orientation Non-Past Past
Actor pakapag› pǝkapǝg› makapag›
minakapag›/nakapag› minǝkapǝg›/nǝkapǝg›
Object pakapag› pǝkapǝg› *pakapag›‹ǝn
Location pakapag› pǝkapǝg› *pakapag›‹an
Conveyance pakapag› pǝkapǝg› *ipakapag›

With pǝkipag› or pǝkipǝg›:

Orientation Non-Past Past
Actor pǝkipag›
minǝkipag›/nǝkipag› minǝkipǝg›/nǝkipǝg›
Object pǝkipag›
Location pǝkipag›
Conveyance pǝkipag›

With pǝkipaŋ› or pǝkipǝŋ›:

Orientation Non-Past Past
Actor *pǝkipaŋ›
minǝkipaŋ›/nǝkipaŋ› minǝkipǝŋ›/nǝkipǝŋ›
Object *pǝkipaŋ›
Location *pǝkipaŋ›
Conveyance *pǝkipaŋ›

With paŋipa› or pǝŋipa›:

Orientation Non-Past Past
Actor *paŋipa›
Object *paŋipa›
Location *paŋipa›
Conveyance *paŋipa›

With paŋipe› or pǝŋipe›:

Orientation Non-Past Past
Actor *paŋipe›
*minaŋipe›/*naŋipe› minǝŋipe›/nǝŋipe›
Object *paŋipe›
Location *paŋipe›
Conveyance *paŋipe›

With paŋipag› or pǝŋipag›:

Orientation Non-Past Past
Actor *paŋipag›
*minaŋipag›/*naŋipag› minǝŋipag›/nǝŋipag›
Object *paŋipag›
Location *paŋipag›
Conveyance *paŋipag›

Reading Materials

There are other Casiguran Agta resources one can read, which I list individually below:

  1. Libru a Pegbasaan
  2. Ugali na Agta
  3. Memahal a Lagip
  4. Lagip ni Tariri
  5. Tu aso sakay tu Bakokol
  6. Pakodyan tam a Mangibut ta saket a tibi
  7. Lagip na Agta

Himmelmann’s Tagalog Potentive and Stative Verb Paradigms

This is more or less an evaluation of two works by Nikolaus Himmelmann:

  1. How to miss a paradigm or two: Multifunctional ma- in Tagalog (2006)
  2. On statives and potentives in western Austronesian (mostly Tagalog) (2004)

This post quotes or paraphrases heavily from the first work, with such quotes or paraphrases indicated using italicized text below. The terms on which I differ from Himmelmann are:

  1. imperfective. I will be using nonperfective.
  2. dynamic. This will be replaced with fientive, following Williams’ Hebrew Syntax, page 57 (2007).
  3. voice. This will be replaced with orientation, following Himmelmann’s The Philippine Challenge To Universal Grammar (1991). The case role realizations for each orientation, depending on verb type, are: [image]

Going back to Himmelmann’s work, he observed:

This chapter explores some of the problems created by these items for grammatical analysis and the structure of descriptive grammars, using the multifunctional prefix ma- as its primary example. As further illustrated in section 2, this prefix occurs in formations that have been termed adjectival, involuntary action, potential, abilitative, stative, etc. The major goal of this chapter is to propose a coherent systematics for the multiple uses and functions of this prefix.

To date the nature of this affix has been misunderstood because analysts have failed to notice that it participates in two different, but related paradigms. On the one hand, ma- serves as the marker for potentive dynamic verbs in undergoer voice (section 5). On the other hand, it marks basic statives (section 6). [Himmelmann #1, page 489]


The main goal of this contribution is to bring some basic order to the fairly broad and, on first sight at least, somewhat heterogeneous range of uses and meanings associated with these forms. I will argue that the different uses can be grouped into two semantically and morphosyntactically quite different construction types, which I will call STATIVE (proper) and POTENTIVE, respectively. [Himmelmann #2, page 1]

And Himmelmann’s primary goal is to devise a paradigmatic structure which will serve as evidence for such an order:

Here we will be concerned with evidence for and from paradigm structure, a kind of evidence which has always been applied without much discussion in the case of Indo-European and Afro-Asiatic languages but which is often not used outside this group of languages. In particular with regard to putatively agglutinating languages such as Tagalog, little use of paradigms is made, probably based on the assumption that straightforward compositionality on the formal side is also mirrored by straightforward compositionality on the content side.

Although Tagalog paradigms lack the generality characterizing inflectional paradigms in Indo-European languages, they are still paradigms in adhering to the principle of constant correlation or proportionality (x relates to x’ as y relates to y’, regardless of the formal details). The morphological and semantic parameters underlying these correlations are essential in uncovering the language-internal systematics of multifunctional affixes such as ma-. [Himmelmann #1, page 488-489]
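The principle of constant correlation quoted above (x relates to x’ as y relates to y’) can be made concrete with a small sketch. This is only an illustration and not Himmelmann’s own formalism: the feature labels, the `FORMS` dictionary and the helper names `contrast` and `proportional` are assumptions introduced here, and the six forms of bilí are the ones Himmelmann cites when he discusses proportionality.

```python
# Feature bundles (mood, aspect, voice) for six forms of Tagalog bilí 'buy'.
# The labels are illustrative assumptions, not Himmelmann's notation.
FORMS = {
    "bilhín":   ("non-realis", "perfective", "patient"),
    "binilí":   ("realis",     "perfective", "patient"),
    "bilhán":   ("non-realis", "perfective", "locative"),
    "binilhán": ("realis",     "perfective", "locative"),
    "magbilí":  ("non-realis", "perfective", "actor"),
    "nagbilí":  ("realis",     "perfective", "actor"),
}

def contrast(x, y):
    """Return the features in which x and y differ, as (index, x_value, y_value)."""
    return tuple(
        (i, a, b) for i, (a, b) in enumerate(zip(FORMS[x], FORMS[y])) if a != b
    )

def proportional(x, x2, y, y2):
    """True if x relates to x2 as y relates to y2 on the content side."""
    return contrast(x, x2) == contrast(y, y2)

# Both pairs differ only in mood.
print(proportional("binilí", "bilhín", "binilhán", "bilhán"))
# Both pairs differ only in voice.
print(proportional("magbilí", "bilhín", "nagbilí", "binilí"))
# A mood contrast against a voice contrast is not proportional.
print(proportional("binilí", "bilhín", "magbilí", "bilhín"))
```

The check looks only at the feature bundles, never at the shapes of the forms, which is the sense of “regardless of the formal details”.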


The main results of his work are listed below, with comments.

  1. A by-product of this exercise will be the recognition of the fact that next to aspect/mood and voice, dynamicity – the distinction between dynamic and stative predicates – is of fundamental importance to Tagalog grammar (section 7). [Himmelmann #1, page 489]

    I agree.

  2. Viewed crosslinguistically, Tagalog is somewhat remarkable in making a rather strict distinction between statives proper (eventualities which principally exclude the involvement of an agentive argument) and potentives (eventualities which principally include an agentive argument which however lacks control). In most languages in which a dynamicity distinction is grammaticized, there is a simple binary opposition between dynamic and stative formations, with most of the eventualities requiring potentive forms in Tagalog being expressed by stative forms.

    The forms in the two paradigms partially overlap. In particular, the two forms which only consist of the prefix ma-, i.e. patient voice potentive and basic stative voice, are very hard to distinguish semantically. The only way to distinguish them is syntactic, via the overall construction in which they occur and the voice alternations allowed for by this construction, as illustrated with example (40) above: “ang dahun ay nadàdalá ng tubig” ‘the leaf was being carried along by the water’. [Himmelmann #1, page 518]

    I agree with the description of Tagalog. I would guess that it can be extended to other Philippine-type Austronesian languages. I do not have the same intuition or knowledge for other languages, especially non-Philippine-type languages, apart from what I can read and research.

  3. The analyses advanced in the preceding sections imply an elaborate system of basic verbal affixations, summarized in Table 6. In addition to aspect/mood inflection, this system involves distinctions with regard to voice, dynamicity (dynamic vs. stative) and control (potentive vs. non-potentive). It is paradigmatically organized in that each form conveys a fixed set of morphosyntactic features, obligatorily choosing one feature from each of the basic dimensions aspect/mood, voice, dynamicity and control. That is, a form such as i-lakad ‘walk with, use in walking’ is not just the conveyance voice form of lakad, it is the dynamic, non-potentive, non-realis, perfective conveyance voice form of lakad.

    Table 6 basically collapses the information given in Tables 2, 3 and 5. For the sake of clarity, aspect/mood alternations have not been included (see Tables 2 and 5). That is, strictly speaking each formative in Table 6 represents a set of four derivations. For example, -an represents BASE-an, RDP1-BASE-an, -in-BASE-an and -in-RDP1-BASE-an. [Himmelmann #1, page 518]

    The details of this paradigm are incorrect, for reasons that will be mentioned below. I will also provide a replacement paradigm table below, with more detailed explanations and examples.
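Himmelmann’s remark that each formative in Table 6 stands for a set of four derivations (BASE-an, RDP1-BASE-an, -in-BASE-an and -in-RDP1-BASE-an) can be sketched as follows. This is a simplified illustration under stated assumptions: the base sulat ‘write’ is chosen here only because it takes -an without any stem change, the function names are hypothetical, accents are ignored, and RDP1 is modelled as plain copying of the first consonant-vowel pair.

```python
VOWELS = "aeiou"

def rdp1(base):
    """Copy the first CV (or initial V) of the base: sulat -> susulat."""
    first = base[0] if base[0] in VOWELS else base[:2]
    return first + base

def infix_in(stem):
    """Place -in- after the initial consonant (or before an initial vowel)."""
    if stem[0] in VOWELS:
        return "in" + stem
    return stem[0] + "in" + stem[1:]

def four_derivations(base, suffix="an"):
    """The four aspect/mood derivations that one Table 6 formative stands for."""
    return {
        "BASE-an": base + suffix,
        "RDP1-BASE-an": rdp1(base) + suffix,
        "-in-BASE-an": infix_in(base) + suffix,
        "-in-RDP1-BASE-an": infix_in(rdp1(base)) + suffix,
    }

for label, form in four_derivations("sulat").items():
    print(label, form)
```

Bases such as bilí or lakad would need extra rules for the stem changes seen in bilhán and lakaran, which this sketch deliberately leaves out.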


Furthermore, he summed up his investigation into whether paradigms exist in Tagalog as follows:

The aspect/mood and control alternations form inflectional paradigms, for two reasons. First, they are highly general, each formation implying the existence of the complementary one(s) (i.e. a non-potentive forms implies the existence of a potentive one, etc.). Second, there are clearly unmarked or basic forms for these alternations (non-realis perfective for aspect/mood, non-potentive for control). The voice and dynamicity alternations, on the other hand, show derivational features in that they are less general, less productive and exhibit quite a few formal and semantic idiosyncrasies. Still, they are also paradigmatically related to each other because of the constant correlations holding across all cells of Table 6. In section 3, the term derivational paradigm was introduced to capture both their derivational features and their paradigmatic relatedness.

The results that I am most directly interested in are his evidence for grouping the different uses of ma›, and his evidence for the way he organized his paradigm table (his Table 6), where statives, potentives and non-potentives are distinct and separate.


To start, he reviewed the semantic range of words marked with ma› or its variant mà› (with secondary stress/lengthened vowel) which I put in a table below. He made a distinction between variable ma› and invariable ma›, with variable ma› referring to those with aspect/mood alternations (realis, nonrealis), and invariable ma› to those limited to just either the realis mood or nonrealis mood only and no other prefix. Based on their semantic coherence and correspondence to grammatically relevant categories also found in other languages, he grouped those words using variable ma› into 5 sets of uses:

| No | Usage | Example |
|---|---|---|
| 1 | Bodily conditions or emotion states | matakot / natàtakot, magùgútom / nagùgútom, mapipe / napipe, mabuhay / nabuhay, mamatay / namatay, matulog / natulog |
| 2 | Some positional predicates (being in or getting into the position denoted by the base). | maupó / naupó, mahigá’ / nahigá’, tàtayó’ / tàtayó’ |
| 3 | Perception predicates or acts of perception. | màkita / nàkita |
| 4 | Involuntary actions (actions occurring without the full control of the actor, or actor is enabled to perform by virtue of outer circumstances). These include: a) spontaneous reactions/actions over which the actor has no control; b) accidental actions which the actor physically controls but did not intend to carry out; c) inanimate effectors not being in control of the action triggered by them. | màbigkas / nàbigkas, madalá / nadalá / nadàdalá / madàdalá |
| 5 | Ability or opportunity to perform/carry out an action. In realis mood, this conveys that an actor succeeded or managed to carry out an action despite a number of obstacles. | mabili / nabili / bibili / bibili, mapàpanoód / napàpanoód, mapunó / napunó |

And 2 sets for those using invariable ma›:

| No | Usage | Example |
|---|---|---|
| 1 | Qualities or properties when used as attributes or predicates (does not alternate with na- in realis perfective contexts) | maliít |
| 2 | Locational predicates: consist of na- plus a deictic element or a prepositional phrase introduced by the general locative preposition sa (restricted to realis form na-, only na- never ma-) | nàroon |


Before he created a paradigm for multifunctional ma›, Himmelmann first looked at the aspect/mood paradigm and revisited the reasons why it forms a paradigm. Then he moved on to prove that orientation also forms a paradigm. A summary of the evidence advanced in [Himmelmann #1], with my comments in blue, follows.

For the Aspect/Mood Paradigm

  1. The only obvious paradigm in Tagalog which is similar to inflectional paradigms in Indo-European languages is the aspect/mood paradigm already mentioned in the preceding section (see below) and illustrated for variable ma- in Table 1. Aspect/mood alternations are not restricted to words prefixed with variable ma- but occur in a large number of other morphologically complex formations, in particular words marked with voice affixes (also called focus affixes in the Philippinist literature). [page 497]

    This is an uncontroversial statement, but I need to include it to set the background of the study.

  2. [M]any words prefixed with ma- allow the aspect/mood alternations illustrated in Table 1 with the base takot ‘fear’. The formative ma-, which is the conventional citation form of the prefix, is the basic or non-realis form which contrasts with the realis formative na-. Accented reduplication (the vowel in the reduplicated syllable is distinctly long) signals imperfective aspect in either mood. As we will see in section 3 below, this aspect/mood alternation is found with many other affixed formations in Tagalog. In fact, it is so general that these alternations can be (and have been) called aspect/mood inflection. In ("baká ngá kayó ‘y matakot") matakot is a non-realis perfective form (also called the base form) while natàtakot in ("natàtakot silá") is a realis imperfective form.


    Strictly speaking, then, the formations to be investigated here may carry the prefixes ma- or na-, the changing nasal indicating a regular realis/non-realis alternation as it is also found in many other Tagalog prefixes (e.g. maki-/naki-, maka-/naka-, mag-/nag-). Thus, when speaking about ‘the prefix ma-’, reference does not pertain to a specific formative of the shape /ma/ but rather to the complete inflectional paradigm given in Table 1. In this section, any formative which belongs to this paradigm is glossed simply as MA, as in the preceding two examples. [page 490]

    This is what is referred to in #1 above as “already mentioned in the preceding section”.
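The aspect/mood alternation described in the two quotes above (ma- for non-realis, na- for realis, accented reduplication for imperfective) can be sketched in a few lines. This is a minimal illustration, not a full account of Tagalog phonology: the function name `ma_paradigm` is hypothetical, the base is assumed to begin with a consonant followed by a plain vowel, and the grave accent on the reduplicated vowel follows the spelling natàtakot used in the quote. Of the four outputs for takot, matakot and natàtakot are the forms quoted above; the other two are simply what the stated rule predicts.

```python
# Long (accented) counterparts of the plain vowels, as written in the quotes.
LONG = {"a": "à", "e": "è", "i": "ì", "o": "ò", "u": "ù"}

def ma_paradigm(base):
    """Four aspect/mood forms of a variable ma- word (consonant-initial base assumed)."""
    redup = base[0] + LONG[base[1]]  # accented copy of the first CV
    forms = {}
    for mood, prefix in (("non-realis", "ma"), ("realis", "na")):
        forms[(mood, "perfective")] = prefix + base
        forms[(mood, "imperfective")] = prefix + redup + base
    return forms

for features, form in ma_paradigm("takot").items():
    print(features, form)
```

Running the same function on dalá gives madalá, madàdalá, nadalá and nadàdalá, the four forms listed for that base in the table of variable ma› uses.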

For an Orientation Paradigm

  1. It is widely, though not unanimously, agreed that there are four basic voices in Tagalog, i.e. actor voice, patient voice, locative voice and conveyance voice. The latter three share a number of morphological and syntactic properties which makes it convenient to refer to them collectively as undergoer voices. While there is essentially only a single formative for each of the undergoer voices, there are a number of distinct formatives for actor voice. Table 2 lists the major affixes signaling these voices and the alternations marking aspect (perfective vs. imperfective) and mood (realis vs. non-realis). It illustrates only the two most important actor voice formatives, -um- and mag-. All other actor voice prefixes (e.g. maN-) follow the pattern of mag- (non-realis m alternating with realis n). [page 498]


    As mentioned at the start, I use the term orientation and not voice. It is true that there is more than one initiative orientation formative on the surface. But this is a superficial analysis, since these are surface forms only. There is actually only one formative, ‹um›, for initiative orientation. This is not apparent because of truncation. This is attested by the table below, using some of those forms. Notice that ‹um› is irregular in the nonrealis nonperfective, realis perfective and realis nonperfective.


  2. As can be gleaned from Table 2 (and also from Table 1), the marking of the aspectual distinction is completely general and transparent: Accented reduplication of the first CV unit of the stem marks imperfective aspect. Perfective aspect remains formally unmarked, regardless of voice.

    The formal manifestations of the realis/non-realis distinction are somewhat less transparent and, more importantly, closely linked with voice marking. Thus, while in the undergoer voices, realis is signaled by the infix -in-, there is no clear exponent for realis mood in -um-actor voice. In mag-actor voice, realis/non-realis is conveyed by the alternation between m and n already familiar from the aspect/mood paradigm for ma-. [page 498]

    As mentioned above, this is a superficial analysis that fails to mention what truncation has done to the eventual formatives. Had truncation been considered, it would be very clear that realis mood is indicated by the infix ‹in› and nonrealis mood by its absence, just as in the non-initiative orientations. Additionally, the m› and n› alternation in the other initiative orientation affixes is just the visible tip of the full affixes after truncation. If we put back the clipped segments, it is clear that mood and orientation are signaled not by this m› and n› alternation but by ‹in› and ‹um› respectively.

  3. Not all formations are formally compositional in that each morphosyntactic feature (aspect, mood, voice) is conveyed by a separate formative. Perfective aspect and non-realis mood are actually implicated by the absence of a particular formative. Furthermore, the realis patient voice forms (binilí, binìbilí) lack a separate voice formative as does the non-realis imperfective form (bìbilí) in the -um-paradigm.

    This lack of formal compositionality is a very important diagnostic for paradigmatic organization. One major principle for paradigms is the principle of constant correlation (Seiler 1966: 197) or proportionality (Uhlenbeck 1985): binilí relates to bilhín as does binilhán to bilhán, magbilí to bilhín as nagbilí to binilí, etc., regardless of the particular formatives involved. [page 498]

    I disagree with the following:

    1. Some of the formations are not fully compositional:
      1. Although perfective aspect and non-realis mood lack marking, they stand in opposition to the other pair of values (nonperfective ‹R› and realis ‹in›), which are compositional, and this opposition makes them compositional as well.
      2. Also, his analysis failed to consider truncation as a process. It failed to take note of the pattern of truncation throughout the paradigm, for example, the absence of ‹in in realis perfective terminative orientation (e.g. binilí).
      3. Also, although the lack of ‹um› in nonrealis nonperfective initiative orientation (e.g. bìbilí) is mentioned, he did not mention that the nonrealis nonperfective (e.g. bìbilí) and realis nonperfective (e.g. binìbilí) forms are irregular compared with all other orientation affixes in lacking ‹um›, the regular forms being *bumìbili and *buminìbili respectively. Although Tagalog does not have these forms, the earlier stage of a related language (Old Hiligaynon) does. There, the realis perfective (“preterito”) has ‹um› and ‹in› (e.g. binacal, binmuhat, binmulig, inmagui), the realis nonperfective (“presente”) has reduplication ‹R›, ‹in› and ‹um› (e.g. binmabacal, guinmiguican, binmubuñag, inmiinum) and the nonrealis nonperfective (“futuro”) has ‹um› and reduplication ‹R› (e.g. bumabacal, guimiguican, simisiling). (Note: ‹um› becomes ‹im› if the vowel of the first syllable is ‹i›.) This shows that this feature got degraded in modern Tagalog. The paradigm itself was originally regular and the regularity is recoverable.
    2. I agree that there is a paradigmatic relation among the orientation formatives. However, I think he contradicted himself a bit when he said that the lack of formal compositionality is a very important diagnostic for paradigmatic organization, yet asserted that a major principle for paradigms is constant correlation, which disregards the formatives involved.
  4. While such correlations may hold both semantically (on the content side) as well as formally (on the expression side), the correlations on the content side are the ones of central importance. They presuppose (or imply) a grid of morphosyntactic features which are always conveyed together: Any given form which is part of the paradigm always conveys the triplet of aspect, mood and voice. There is no way of creating a form which conveys only one of these features. Thus, formations which convey two or more morphosyntactic features and are non-compositional in their formal makeup by their very nature imply paradigmatic organization. Applied to our current example this means that because the aspect/mood alternations remain constant across the different voices and their formal exponence is inherently linked to voice marking, voice itself becomes part of the paradigm. And it is in this sense – and only in this sense – that voice is paradigmatically organized in Tagalog. [page 498]

    I think he should have written “formations which convey two or more morphosyntactic features regardless of the compositionality in their formal makeup by their very nature imply paradigmatic organization”. It is the semantics, and not the forms being fusional, that conveys paradigmaticity. Orientation is part of the paradigm because it participates in that grid of features which cannot be semantically separated from aspect and mood for each of the forms, although they may be separable formally. Orientation will not cease to be a part of the paradigm even if the formatives are uncovered to be fully compositional instead of fusional. That they are more fusional than compositional does not make them more paradigmatic.

  5. There are other diagnostic features of paradigmatic organization, most of which are not met by Tagalog voice alternations. They differ in this regard quite clearly from the aspect/mood alternations. Perhaps most importantly, voice alternations are not general in the same way as aspect/mood alternations. Aspect/mood formations are general, for example, in that for any given aspect/mood formation there are (almost) always three complementary ones (for a major exception, see the next section). Voice alternations are much less regular and predictable. Not many Tagalog lexical bases or derived stems are like bilí in that they co-occur with all five voice affixes illustrated in Table 2.9 Some bases typically occur only in two voice forms, others in three, etc., and while it is possible to make some generalizations about typical patterns based on the semantics of the base and the voice affix there are many exceptions to such patterns (cf. Himmelmann 1987: 129–145). Consequently, one voice form does not imply the existence of another voice form.

    Another basic characteristic of paradigms according to Bybee (1985: 50–58) is the existence of a formally and semantically basic, unmarked form. Such a basic form is easily identifiable for the aspect/mood alternations (i.e. non-realis perfective). In contrast, there is no evidence for a basic voice formation from which the other voices are derived.

    These differences point to the fact that the voice alternations have more characteristics of derivation than inflection. In particular because of their lack of generality, it is widely believed that they do not form a paradigm. In this view, the concept paradigm is limited to inflectional paradigms on the assumption that inflectional paradigms are always totally general (i.e. every base subcategorized for the paradigm occurs in all forms considered to be part of the paradigm). But, as Seiler (1966: 197) points out, this is not even true for the prototypical paradigms of Latin. Not every Latin verb has a supine form and not every Latin noun occurs in vocative case. Furthermore, as just stated, the fact that voice marking is formally intertwined with aspect/mood marking in such a way that all three morphosyntactic features always come in a package implies an extended aspect/mood and voice paradigm, even though the voice alternations are much less regular and general than the aspect/mood alternations. [page 499-500]

    There are two things incorrect here:

    1. These arguments are self-cancelling, in that they are contradictory. Tagalog, which has defective/missing formatives, is said not to meet this criterion of paradigmaticity, yet Latin, which also has missing forms, is said to have paradigms in spite of them. My position is that, as long as constant correlation and proportionality can be demonstrated, the forms make up a paradigm in spite of the missing ones. The absence of such forms is just something to be investigated, understood and explained; it does not stop the set from being a paradigm.
    2. There is a basic, unmarked form without orientation affixes. Just remove them and you have the base word, and those base words have neutral orientation. These are marked “Non-oriented” or unoriented in my paradigm, following mood (nonrealis/irrealis) and aspect (non-perfective/imperfective).
  6. This is not to deny that there are significant differences between the two types of alternations. In order to capture these differences, one could say that aspect/mood alternations form an inflectional paradigm while voice alternations form a derivational paradigm. As opposed to inflectional paradigms, derivational paradigms are characterized by a lack of generality which in turn implies a more important role for semantic and pragmatic factors in accounting for the actually occurring forms (for example, whether a given base occurs in conveyance voice depends very much on the compatibility of the meaning of the base with the meanings of conveyance voice formations). Importantly, not all derivational formations are paradigmatically organized. To the contrary, derivational formations typically do not involve paradigms.

    Derivational formations which are paradigmatically organized, on the other hand, involve two or more categories, obey the principle of constant correlation along different dimensions and consist of forms which are formally not fully compositional. In the most clear-cut cases, they are formally intertwined with alternations which are clearly inflectional, as just illustrated for the Tagalog voice alternations. [page 500-501]

    There is hesitation here on whether derivations can have paradigms or not. I think whether orientation is derivation or inflection is a totally separate issue and does not matter here. In the end, what matters is that the relations between the forms are paradigmatic.

  7. [A]lthough voice marking may be less general and regular than aspect/mood marking it is still surprisingly productive and widespread when looked at from the point of view of Standard Average European. For example, it is not an exception that an apparently semantically intransitive base such as lakad ‘walk, gait’ allows for all four basic voice formations: (15) matulin siyáng lumakad.  ‘He walks fast.’  (16) nilakad ng mga bata’ ang buóng sampúng milya. (17) huwág lakaran ang damó. ‘Don’t walk on the grass.’  (18) huwág mong ilakad ang bagong sapatos. ‘Don’t use the new shoes in walking.’. That is, although for most lexical bases only a subset of voice formations is conventionalized and frequently used, it would appear that almost all lexical bases have the potential to occur in all basic voice formations if the resulting formation “makes sense” in both semantic and pragmatic terms. [Himmelmann pages 501-02]

  8. [B]ecause of its intimate formal link to aspect/mood marking, voice marking is also paradigmatically organized, despite the fact that it is essentially derivational. This in turn will provide an important lead for the further systematization of ma-words. [page 497-498]

    This is the main take-away in this section: orientation marking is intimately connected with aspect/mood marking, although the jury is still out on whether it is derivation or inflection. Aspect, mood and orientation participate in a paradigm.

  9. I will follow here the basic assumptions of a WORD-AND-PARADIGM approach to morphology. Most importantly, rather than talking about morphemes as minimal units of form and meaning, I will speak of (bound) formatives – i.e. formal units attaching to lexical bases – which in a given morphosyntactic context may convey (or realize) such and such a bundle of semantic and/or syntactic features. It is only in this framework that the term multifunctional affix has a straightforward and consistent interpretation, i.e. a formative which occurs in a multitude of morphosyntactic contexts conveying a number of different bundles of semanto-syntactic features. In morpheme-based morphology, strictly speaking there cannot be a multifunctional morpheme since morphemes are units of meaning and form (so a multifunctional affix is either polysemous or “represents” two or more homonymous morphemes). [page 489]

    I would suppose this is one reason why the paradigm arrived at is incorrect. Tagalog affixes are both formatives and morphemic/polymorphemic. It is the combination of individual morphemes that creates each individual formative in the paradigm.

To summarize, I agree that there are four basic orientation affixes, that they form a paradigm, and that they are intertwined with the aspect/mood paradigm. Orientation affixes form a paradigm because (a) each form presupposes the existence of other forms from which it differs on some particular axis, like initiative, terminative, translative and locative, (b) this semantic difference is consistent and proportionate across the grid, and (c) it participates with other features that are confirmed paradigmatic, like aspect and mood. However, I disagree that there are several initiative orientation affixes. I think there is only one (‹um›), with the others being only its derivatives when combined with realis mood ‹in› and the other affixes. In my opinion the affixes are quite transparent, and orientation is fully compositional, provided one takes into account the truncation process happening in the background. I concede that Tagalog has some irregular forms, but an earlier stage of a related language (Old Hiligaynon) points to regular forms in that same part of the paradigm.
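The truncation argument summarized above can be sketched in code. This is one possible reading of it, offered as a hedged illustration rather than a tested analysis. It assumes that ‹in› and ‹um› are each inserted after the initial consonant (which yields the order ‹um›‹in›, as in the regular form *buminìbili cited earlier), that truncation then clips the initial pu› of a p-initial prefix, and that a further optional step clips mi›. The function names are hypothetical and accents are ignored.

```python
VOWELS = "aeiouǝ"

def infix(stem, affix):
    """Insert an infix after the initial consonant (prefix it to a vowel-initial stem)."""
    if stem[0] in VOWELS:
        return affix + stem
    return stem[0] + affix + stem[1:]

def initiative_forms(prefix):
    """Derive surface actor prefixes from a p-initial prefix such as pag›.

    Assumed steps (not confirmed rules): add ‹um› (and ‹in› for realis),
    then clip pu›, and optionally clip the following mi›.
    """
    if not prefix.startswith("p"):
        raise ValueError("this sketch only handles p-initial prefixes")
    nonrealis_full = infix(prefix, "um")             # pag -> pumag
    realis_full = infix(infix(prefix, "in"), "um")   # pag -> pinag -> puminag
    return {
        "non-realis": nonrealis_full[2:],            # clip pu: mag
        "realis (long)": realis_full[2:],            # clip pu: minag
        "realis (short)": realis_full[4:],           # clip pu and mi: nag
    }

print(initiative_forms("pag"))
print(infix("bibili", "um"))                 # bumibili
print(infix(infix("bibili", "in"), "um"))    # buminibili
```

Under these assumptions the sketch returns mag, minag and nag for pag, and mǝŋike, minǝŋike and nǝŋike for pǝŋike, which are the same m›, min› and n› shapes listed in the Actor rows of the Casiguran Agta tables earlier in this archive.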


Himmelmann split the uses of ma› into potentives and statives, noting where the term "potentive" came from:

The only problem that actually exists with regard to this category is the appropriate name for it. Potential, aptative and volitive are among the terms that have been used for this category, none of which is really satisfactory in characterizing exactly this set of uses. Here I follow Rubino (1997) in using the new term potentive to refer to formations which convey both involuntary action and ability readings. [Himmelmann #1, page 505]

How he split the uses of ma›, his evidence for splitting them into two separate paradigms, and their placement in their respective paradigms are listed below, with my comments in blue:

  1. There are five different uses of variable ma-, i.e. expressions conveying a  bodily condition or emotional state, position, perception, involuntary action, and the ability to perform an action. Only two of these, i.e. bodily condition or emotional state and position, have semantically related invariable uses. Again, while this is not strong evidence, it may contribute to an argument for combining the five different uses of variable ma- into two higher-level groupings, the first consisting of bodily condition/emotional state and position, the second of perception, involuntary action and ability. [page 494]

    In my opinion, this seeming semantic relation between variable ma› and invariable ma› for uses #1 & #2, vis-a-vis none for uses #3, #4 and #5, does not count as evidence and is purely accidental. Although use #1 is related between variable and invariable ma›, use #2 is not. This will be the subject of another post.

  2. The last two uses of ma-, the ability and the involuntary uses, share an important property: They occur with exactly the same set of bases. That is, in principle all of the preceding six examples are ambiguous between an ability and an involuntary reading. To illustrate, example ("nadalá ko ang libró" ‘I took the book by accident.’ ) also means ‘I was able to carry the book’ in addition to ‘I took the book by accident’. Conversely, example ("kung màbibili iyán." ‘If that can be sold.’) also means ‘if that happens to be sold’ or ‘if that is sold by accident’.[page 494]

    These two uses are linked by the fact that they overlap more or less completely. Any formation which allows an involuntary action reading (in any of the various senses distinguished) usually also allows an ability reading and vice versa. (Endnote: As noted above, in some instances the two readings are distinguished suprasegmentally, unaccented ma- typically conveying an ability meaning, accented mà- an involuntary one.)

    While it is rare crosslinguistically that involuntary and ability uses are conveyed by the same form and far from obvious how they are linked semantically, use of the same formative for both uses is extremely common and widespread in western Austronesian languages, regardless of the shape of the formative… There is a broad consensus in the literature that the involuntary action and ability uses form a single category. [page 505]

    This is the semantic evidence for grouping uses #4 & #5 together, separately from uses #1 & #2 of variable ma›. I agree with this grouping.

  3. Unlike perception predicates, the remaining two semantic classes of ma- marked expressions, i.e. expressions conveying a bodily condition or emotional state and positional predicates, do not fulfill the criteria for patient voice formations and also fail to show any of the [voice] correspondences in a regular and general fashion. Instead, they occur in a set of very different alternations.

    Tagalog is well known for allowing voice alternations in expressions for what would appear to be semantically intransitive activities, such as ‘run’, ‘dance’ and so on, as shown in examples (16)-(18) above. And there are in fact expressions for bodily conditions or emotional states which at least formally appear to be voice-marked. That is, next to matakot ‘afraid’ in (1) there is also ikatakot and katakutan: (32) "ang pagkalunod ng Kastila’ ay  ikinatakot ng tatlóng magkakaibigan." ‘The drowning of the Spaniard frightened the three friends.’  (33) "kinatàtakutan siyá ng mga tao dito." ‘People here are afraid of him.’ (34) "hindí nilá nàlàláman kung dapat katakutan ang aswáng." ‘they did not know whether a vampire was really to be feared’. These formations differ quite clearly in form and meaning from the potentive forms. In place of the prefix ma-,  which occurs in all potentive forms, there is another prefix, i.e. ka-. Conveyance voice (prefix i-) and locative voice (suffix -an) marking as well as aspect/mood marking (reduplication, realis undergoer infix -in-), however, are the same as in the other paradigms. [pages 508-510]

    This is the formal evidence for separating uses #1 & #2 from the other uses of variable ma›: each of the two groups shows up in a different set of orientation alternations. I agree that uses #1 & #2 do not have patient roles, simply because they are statives. However, I disagree that they do not show orientation formations or correspondences in a regular and general fashion:

    1. See my tables 5 and 7 below to see how regular and general stative ma› formations are, including terminative orientation.
    2. It’s actually his “potentive” paradigm that shows irregular orientation formations. See my tables 5, 6 and 7 below for how potentive ma› and maka› were split into two paradigms and what they really look like.

    Overall, the failure of stative ma› to show orientation correspondences in a regular and general fashion is an indication that the paradigm he devised is incorrect.

  4. All these [stative] formations principally exclude the involvement of an agent, i.e. an entity which is represented as intending to bring about a given state of affairs (and usually also controlling much of the action(s) required for bringing it about). Instead, causes for experiencing a given emotional or bodily state are typically inanimate things or abstract states of affairs. This constitutes the major difference to the potentives. In potentive formations, there always is a potential agent implied even if in the specific state of affairs referred to by a potentive form this agent is presented as not being in full control. Forms referring to states of affairs which principally exclude the involvement of an agent are called stative formations, those which principally allow the involvement of an agent, dynamic formations. [page 510]

    I disagree. Ma› statives in initiative orientation, even for “causes for experiencing a given emotional or bodily state”, can have causes that need not be inanimate things or abstract states of affairs. I am not referring here to maka› being stative ma›’s initiative orientation form and having stative uses. Take for example the following three sentences:

    1. ‘UST student, hinihinalang namatay sa hazing.’
    2. ‘Dalawang sundalo namatay sa bakbakan sa Sulu’
    3. ‘Tulong na pinansyal sa pamilya ng pulis na namatay sa pakikipagsagupa sa New People’s Army sa Mountain Province, ibinigay.’

    The experiencers of the verb namatay in these sentences suffered from animate and concrete, real-life causes with which the experiencers have agent-like active interaction. There is also a subtype of stative ma›, mapag›, where a seeming actor is involved:

    1. ‘Dito nagsisimula ang pagiging mapagtaka ng mga bata’
    2. ‘’Kumain ka ng marami’ sambit nya na ikinapagtaka ko.’

    Because the causes of any stative can take any form (inanimate or animate, abstract or concrete), they should not be stressed too much as a difference from potentives: the causes of a potentive’s involuntariness or inability can be of the very same types.

  5. In a stative formation such as na-galit siyá ‘she was/got angry’ the subject (siyá) is an experiencer. In a corresponding experiential potentive formation such as nà-kita siyá ‘she was/got seen’, the subject is the stimulus of a perception/visual experience, not the perceiver/experiencer. The perceiver, if overtly expressed, has to be coded as a genitive (nà-kita siyá ng aso ‘the dog saw her’). Statives with ma-/na-, on the other hand, generally do not allow genitive arguments. If one wanted to add the object of the anger to nagalit siyá this would have to be marked as a locative (e.g. nagalit siyá sa aso ‘she was angry with the dog’). [page 510-511]

    This is the syntactic evidence to distinguish the two groups. To add further:

    1. In an experiential “potentive” formation such as nà-kita siyá ‘she was/got seen’, the subject is not just a stimulus but also an (unintentional) theme / patient / recipient.
    2. Ma› statives might take a similar-looking, seemingly genitive-marked argument (e.g. nagalit siya sa akin ng ginising ko ‘she got angry with me when I woke her up.’), but this is not a real genitive marker. It is the homophonous marker nang, since the phrase can be moved to the start of the sentence, where it has to be re-spelled (Nang ginising ko, nagalit siya sa akin ‘when I woke her up, she got angry with me’).
    3. When he said that "statives with ma-/na-, on the other hand, generally do not allow genitive arguments", he should have clarified whether he was referring to one of the orientations of the ma› formative or to the entire stative ma› paradigm. Only the initiative orientation ma› is incapable of taking genitive arguments; the other orientations can take them:
      1. ‘Kinagalitan niya ang aso’ (Locative)
      2. ‘Ikinagalit niya ang pagngatngat ng aso sa tsinelas.’ (Translative)  
  6. It is also uncontroversial that potentive ma- formations are patient voice forms because they regularly alternate with potentive formations in other voices. Thus we find maka- for potentive actor voice, as in (25), ma- -an for potentive locative voice as in (26), and ma-i- for potentive conveyance voice as in (27): (25) "at hindí makabaríl sa kanyá" ‘(The man got bitten by the ants) and wasn’t able to shoot at him.’ (26) "kung inyóng mapagtiisán iyán" ‘if you (are able to) endure this …’ (27) "nailuto ko na" ‘(Good heavens, you will have to say it is not possible to return the rice, because) I already happened to cook it.’ .

    Viewed from the point of view of voice formation, this means that for a given voice, there are always two forms, a non-potentive one involving one of the affixations listed in Table 2 and a potentive one, which always includes ma (or na in realis mood). This is illustrated for conveyance voice by the following two examples: (28) (a) iniluto’ ko na ang manók  ‘I already cooked the chicken.’, (b) nailuto’ ko na ang manók ‘I already happened to cook the chicken.’.

    Note that the overall structure of the two preceding examples is absolutely identical. In particular, there is no change in the number or the marking of the core arguments. The correspondence shown in these two examples is absolutely regular and general: For every voice form denoting a controlled action there is a corresponding form which denotes the involuntary performance or the ability to perform this action. The basic correspondences are listed in Table 3. [pages 505-506]


    I do not agree with any of the above.

    1. The putative potentive forms are in the wrong orientation: ma› is not in terminative orientation but in initiative orientation, ma- -an is not in locative orientation but still in initiative orientation, and ma-i- is not in translative orientation but also still in initiative orientation. The reason is that all of them start with m›. Excluding the potentive forms under consideration, there is no verb affix in non-initiative orientations that starts with m› (see my table 2 below for examples).
    2. maka› can’t be the initiative orientation counterpart of ma› since, in every paradigm, the rest of the initiative orientation formative apart from the initial consonant m› is retained in the other orientations. As an example, in the paki› paradigm, while maki› is the initiative orientation form, the ‹aki› segment is retained in the terminative, translative and locative orientation forms (refer to my table 2). So for maka› to be in Himmelmann’s potentive paradigm, all the non-initiative orientation forms must have the ‹aka› segment. But that is not what we see in Himmelmann’s potentive paradigm, shown on the right side of my table 3.

      Also, he did not explain why there is a ‹ka› in maka› and why it is absent in the non-initiative orientations. Thus, his stative ma› non-initiative orientations do not directly correspond to any of his potentive ma› non-initiative orientations. I think the irregularity of the forms binili (realis perfective terminative orientation) and bibili (nonrealis nonperfective initiative orientation) made him believe that forms do not have to correspond in the same regular way as meanings. His earlier analysis of Ratahan, which shows a similar table on page 53, has possibly been carried over to his Tagalog analysis and retrofitted. I will show below that the correspondence in non-initiative orientations between stative ma› and his potentive ma› is only indirect. And lastly, stative causative maka› has non-initiative orientation forms (see my tables 6 & 7), namely terminative orientation paka› ‹in, translative orientation ipaka› and locative orientation paka› ‹an, that are distinct from his potentive ma› undergoer forms. These are distinct in turn from the potentive maka› forms (initiative-terminative orientation maka› ‹in, initiative-translative orientation maika› and initiative-locative orientation maka› ‹an), which are all in initiative orientation. Example sentences for galit:

      1. ‘Talagang nasasaktan din kami kapag kami ay napagalitan at dahil kung ano man ang naikagalit ng aming guro ay hindi namin sinasadya at hindi namin nais na sila’y magalit samin.’
      2. ‘Si Ricardo  ay hindi na makaalis sa bangko dahil sa malimit niyang pagliban ay nakagalitan siya ng kaniyang puno.’

      Overall, what Himmelmann proposed conflicts on two counts with the regularity of other paradigms. Therefore, Himmelmann’s potentive paradigm can be split into two, because its members do not belong in the same paradigm: (a) the initiative orientation maka›, belonging to the paka› paradigm and (b) the ma› potentives in undergoer roles (which are in fact in initiative orientation) belonging to the pa› paradigm. The same can be said of his stative ma› paradigm.

    3. From the point of view of orientation formation, there are 3 alternatives that we have uncovered: (a) a primitive paradigm, (b) a causative paradigm with two subtypes: non-potentive and potentive, and (c) a stative paradigm. More details of these will be looked at below, including subtypes of each.
  7. In all likelihood, uses of ma- which clearly are patient voice formations are also potentive formations. There are two major pieces of evidence for patient voice status: a) the ma- formation regularly corresponds to a patient voice formation with -in; and b) it allows for the overt expression of an undergoer subject and a non-subject actor marked as a possessive or genitive argument…… These criteria are fully met by perception predicates (‘see’, ‘hear’, ‘feel’, etc.). The following two examples show the base diníg ‘audible’ first in potentive patient voice meaning ‘hear’ and then in non-potentive patient voice meaning ‘listen to’:(29) "nang màriníg itò ng Kastila’" ‘When the Spaniard heard this, …’ (30) " dinggín mo ang maestra."  ‘Listen to the teacher.’. [page 507]

    1. I do not agree that ma› is clearly a terminative orientation formation. I accept the two observations as facts, but their value as evidence is only superficial. I explain this below as an instance of “stacking”, which works very much like case stacking.
    2. Causative ma› does not necessarily mean a potentive formation. I have given the reason and details in my comment on the previous item, where I showed that causative ma› has two subtypes (so far): non-potentive and potentive.
  8. A further morphosyntactic correlate pertains to actor voice forms. Potentives easily allow actor voice derivations with maka-, hence nakàkita siyá ng aso ‘she saw a dog’. No corresponding formations exist for statives. In fact, given that statives principally lack agent arguments one would predict that agent voice formations are impossible for statives. This prediction is true in that there is no general and regular actor voice formation for statives. However, it is false in that there are sporadic formations from statives which formally can be classified as actor voice formations because they also involve maka-. The base galit ‘anger’ is one of the bases which allow a clearly stative maka- derivation, makagalit meaning ‘to be the cause of anger, to give offence, to irritate’. In contrast to potentive actor voice formations, the subject of a stative maka- formation has to be an inanimate cause (some state of affairs or a thing) : "lahát ng kanyáng sabihi’y nakagàgálit sa akin". [page 511]

    I agree that there is no stative maka› because that would have to be derived from kaka› and that would be redundant: marking something as stative when it is already marked stative. Another reason why there is no stative initiative orientation in his paradigm is that stative ma› is actually in initiative orientation, for reasons already discussed above. The terminative orientation of ma› is ka› ‹in. Additionally, I do not agree on five points:

    1. Stative ma› was put in “patient voice” when it should be in the initiative orientation (“actor voice”) row. The label “actor voice” is misleading since both stative and fientive verb paradigms are being displayed in one table.
    2. Statives can have agent-like arguments marked as subjects, like in one subtype of stative, the causative stative kapa›, although their paradigm formation is not as regular. Examples: “Napaalis ng MMDA sa mga lugar sa Metro Manila ang 36,000 na mga vendors.” “Ikinapaalis sa kanya sa bahay ni Juan ang kanyang pangungumit.” napa› is realis perfective initiative orientation and ikinapa› is realis perfective translative orientation of kapa›.
    3. I have already mentioned above that maka› and potentive ma› do not belong in the same paradigm. The difference between maka› and potentive ma› is not one of orientation but the presence of a dedicated stative marker ka›. maka› is stative causative from paka› while potentive ma› is from the base causative form pa› which can include stative and non-stative (e.g. potentive) meanings depending on the base word.
    4. maka› is analyzed as being in two paradigms when it should just be in one separate paradigm by itself as a subtype of causative pa›. Because his stative maka› formally belongs in the same paradigm as his potentive maka›, it again indicates some flaw in his potentive vs. stative paradigm. I will provide an analysis below of how maka› can belong in a single paradigm yet seem to have these two meanings.
    5. He seems to emphasize that subjects of "actor voice" verbs need to be in actor role for potentives as well as statives. This orientation is actually about the origin of the action; that’s why I use the term "initiative orientation". It just so happens that it is the actor or agent in fientive verbs. The originator of the action in stative verbs need not be in an actor role, and is normally in an experiencer role.
  9. Semantically, actor voice statives are difficult to distinguish from conveyance voice statives in that both refer to the cause for a given state. However, they differ syntactically. In conveyance voice, the experiencer is expressed by a genitive phrase, not by a locative phrase. Compare the preceding example (35) "lahát ng kanyáng sabihi’y nakagàgálit sa akin" ‘Everything he says irritates me.’ with: (36) “ikinagalit niyá akó.” ‘She got angry at me (I was the reason for her being angry).’  [pages 511-512]

    1. Again, the use of “actor voice” for statives is misleading.
    2. Distinguishing ikinagalit vs nakagagalit is not too difficult if we follow the formatives used. Nakagagalit is about the origin of what is causing such a given state, while ikinagagalit is what conveyed him to be in the given state. Although mentally they can both indirectly refer to a cause, the distinction being made between the two words is the originator of the cause versus the conveyer of the cause.
    3. ikagalit and nakagalit are not in the same orientation paradigms: ikagalit is in stative ka› and nakagalit is in stative causative paka›.
  10. With regard to productivity, the stative actor voice forms are the least common of all stative formations and whenever they occur they often take on somewhat specialized meanings. Thus, for example, makagalit is ‘irritate, antagonize, give offense’ rather than plain ‘make angry’. Furthermore, the stative actor voice derivations are often conventionalized in one of the four aspect/mood forms, for example, naka-àawa’ ‘arousing pity, pitiable’ (< awa’ ‘mercy, compassion’) or nakàka-litó (or naka-lìlitó) ‘confusing’ (< litó ‘confused, at a loss’)

    The ma- prefix is considered the basic form which is simply glossed as ST(ATIVE). The actor voice prefix maka- occurs in parentheses to indicate its lack of productivity and the frequent occurrence of “defective” formations which do not allow aspect/mood alternations. In this regard it should be noted that all stative voice alternations – like all voice alternations in Tagalog – are not fully general in that they are not conventional with every stative base, with the exception that ma- occurs on every stative base. In addition to the basic ma-form, the conveyance voice forms are the most productive and widespread, occurring, for example, with all bases denoting emotions. Locative voice is distinctly less common. [page 512]

    Four points are incorrect here:

    1. As mentioned above, maka› is not part of the stative paradigm but part of the stative causative paradigm, a subtype of the causative pa›.
    2. maka› is not unproductive or defective in its aspect/mood paradigm, with just one form in realis nonperfective (naka-àawa’, naka-lìlitó); all four aspect/mood forms can be found in the wild. The two forms not shown yet (apart from the realis nonperfective and nonrealis perfective forms) are the nonrealis nonperfective forms (maka-àawa’, maka-lìlitó) and the realis perfective forms (naka-àwa’, naka-litó), which are found in sentences such as ‘Ang panukalang ito ay nakalito sa lahat halos ng naroroon’ and ‘Iwasan ang mga kilos na makalilito sa atensyon ng mga tagapakinig.’
    3. I don’t think the meanings of maka› are specialized. I think it has more to do with translational difficulty, because English and Tagalog do not have the same meaning building blocks and expressions, so, to avoid verbose yet exact meanings, a conventionalized translation is used.
    4. The stative paradigm initiative orientation form is the stative ma›. It is not in terminative orientation as displayed in Himmelmann’s paradigm.
  11. The voice alternation test also provides important language-internal evidence for the problem of invariable ma- formations. Despite the fact that ma- here is invariable (does not alternate for aspect/mood), these formations partake in some of the stative voice alternations listed in table 5. In particular, most “quality” ma- formations allow derivations with ika- which occur in all four aspect/moods (they are rarely attested in texts, though). For example, there is ikaliít ‘get small on account of’, ikabuti ‘improve/get better on account of’, and ikagandá ‘be(come) beautiful on account of’. The following example illustrates the use of ikagandá as a main predicate: (41) "ikinagandá ko ang pagtina’ ng buhók ko." ‘I became beautiful because I dyed my hair (on account of dying my hair).’

    Stative actor voice derivations with maka- are also possible with “quality”- denoting bases, as in makaliít ‘(inanimate) cause for someone or something to become small(er)’, makagandá ‘(inanimate) cause for someone or something to become beautiful’, or: (42) "Nakabuti sa kanyá ang gamót." ‘The medicine benefited him (did him good).’ Locative stative voice derivations with ka–an do not occur with these bases, probably because there is a very productive homophonous derivation denoting abstract qualities (e.g. kaliitán ‘smallness’, kabutihan ‘goodness, kindness’, kagandahan ‘beauty’), which does not belong to the stative voice paradigm.


    The fact that invariable ma- formations partake in stative voice alternations lends further support to the analysis of invariable ma- as a defective member of a single stative ma- paradigm rather than considering it a homonymous formation totally unrelated to variable ma- formations. [page 515-516]

    [O]ne could acknowledge the obvious semantic communality that both ma- formations denote states and consider invariable ma- formations defective members of a single class of ma- marked words. This is implied in Bloomfield’s solution (1917: 288f) who suggests that invariable ma- formations form a subclass of “special static words” within the larger class of ma- formations. [page 503]

    I agree that use #1 of invariable ma› belongs to the stative paradigm. Two points:

    1. Again, maka› is not the initiative orientation of stative ma›. Rather, ma› itself is the initiative orientation.
    2. Stative locative orientation forms do occur; you just have to know what you’re looking for. For example, the realis perfective kinaliitan and nonrealis nonperfective kakaliitan are used in these sentences:
      1. ‘Yung mga kinaliitan kong damit tinatabi ko pa, kasi baka pumayat pa ako eh.’
      2. ‘Kakapalit lang nya ng school shoes nya at ito yung kinaliitan nya masakit na daw.’
      3. ‘Yung bath tub bili na nang pang matagalan. Wag yung mga mamahalin o kaya kakaliitan agad kasi matagal na gagamitin.’

      I can think of kaliitan as a verb, but finding examples in the wild is more difficult, as it is orthographically identical to kaliitan ‘smallness’; the verb, however, has stress on the last syllable.

  12. variable ma- formations are voice-marked and allow for voice alternations. More specifically, ma- formations partake in two different voice paradigms, being either patient voice potentives or basic statives. Potentives and statives differ not only in terms of their semantics – potentives denote dynamic eventualities, statives states – but also with regard to argument structure: Potentives, at least underlyingly, involve an agent or effector, statives don’t. [Himmelmann #1, page 516]

    I agree that the formative ma› partakes in two different orientation paradigms, the causative pa› (with subtypes potentive and non-potentive) and the stative ka›. However, I disagree on two points, in that the ma› paradigm has orientation alternations in a different sense than what Himmelmann thinks:

    1. potentive maka› has separate orientation alternations from both potentive ma› and stative ma›.
    2. The ma› in the potentive ma› and stative ma› paradigms is in initiative orientation in both. The terminative orientation for the stative paradigm is ka› ‹in, for the non-potentive causative it’s pa› ‹in, and for the potentive causative it’s ma› ‹in.
  13. The distinction between statives and potentives …… also has a number of morphosyntactic correlates which can be most easily shown by contrasting stative expressions with corresponding potentive ones. A particularly clear illustration of the difference is provided by potentive perception expressions since these involve an experiencer rather than an agent in the strict sense and thus are rather similar to expressions for emotional or bodily states which also  involve experiencers…. That is, the basic alignments of semantic roles and syntactic functions is very different in potentives and statives even though both formations may denote experiences. Table 4 summarizes the differences. [Himmelmann #1, page 510-511]

    When [ma-prefixed words denoting acts of perception] are used as predicates, thing perceived appears in subject position (marked by the specific article ang or one of its alternates), while the perceiver appears in a genitive or possessor phrase.…The fourth set of ma-words denotes involuntary actions. If the action is semantically transitive, the undergoer occurs in subject function, while the actor appears in a genitive or possessor phrase, just as in the case of perception predicates.… Again, words affixed with ma- in this sense [ma- denoting the ability or opportunity to carry out an action] are undergoer-oriented in that the undergoer occurs in subject function. [Himmelmann #1, page 492-493]


    It does look like those basic alignments apply, but the scope of the potentive verb class is actually larger than just ma› and maka› and also includes other verbs with pa› derived bases, such as makapag›, makapagpa›, mapa›, mapang›, etc. Therefore, it is difficult to say definitively whether the alignments in his table 4 hold across all of them without an exhaustive study.

  14. This system allows us to systematize the different uses of variable ma- …. These can now be seen to fall into two higher level categories, potentive and stative, as detailed in Table 7. Note that this higher-level distinction is based primarily on language-internal evidence (different voice alternations, different argument structure, etc.) and is not simply an instantiation of a putatively universal scheme. [Himmelmann #1, page 518]


    The place of positionals and locationals in this distinction is not quite straightforward. Their placement here in the stative column is tentative. [Himmelmann #1, page 522, Note 17]

To summarize, (1) I think potentives exist as a group, but the exact details of Himmelmann’s potentive paradigm are incorrect: (a) The scope of potentives covers not just ma› and maka› but also other verbs with pa› derived bases, such as makapag›, makapagpa›, mapa›, mapang›, etc. (b) The potentive paradigm is productive only in, and exists only in, initiative orientation. (c) Potentives are a subtype of causatives; the other subtypes can be grouped together as non-potentives. (2) The concept of statives is correct, but the paradigm details are also incorrect: (a) The scope of statives covers not just ma› but also other verbs with ka› derived bases, such as kapag›, kapagpa›, kapa›, kapang›, etc. (b) maka› is not part of the stative paradigm. (c) The stative paradigm is productive, and ma› is in initiative orientation. I will show below what I think the paradigms should look like.
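The segment-retention argument from my comments on item 6 can be written out as a small check. This is only a minimal Python sketch: the formatives are the ones quoted above (written without the › and ‹ markers), Himmelmann's voice labels are mapped onto my orientation terms, and the function and variable names are hypothetical, chosen purely for illustration.

```python
# Formatives are (prefix, suffix) pairs, written without the › and ‹ markers.
# The forms come from the discussion above; all names here are illustrative.

PAKA_PARADIGM = {  # stative causative paka› forms as listed under item 6
    "initiative": ("maka", ""),
    "terminative": ("paka", "in"),
    "translative": ("ipaka", ""),
    "locative": ("paka", "an"),
}

HIMMELMANN_POTENTIVE = {  # his actor/patient/conveyance/locative voice forms
    "initiative": ("maka", ""),
    "terminative": ("ma", ""),
    "translative": ("mai", ""),
    "locative": ("ma", "an"),
}


def retained_segment(paradigm):
    """Initiative prefix minus its initial m, e.g. maka -> aka."""
    prefix, _ = paradigm["initiative"]
    if not prefix.startswith("m"):
        raise ValueError("initiative formative is expected to start with m")
    return prefix[1:]


def retains_segment(paradigm):
    """True if every non-initiative prefix contains the retained segment."""
    segment = retained_segment(paradigm)
    return all(
        segment in prefix
        for name, (prefix, _) in paradigm.items()
        if name != "initiative"
    )


def m_initial_outside_initiative(paradigm):
    """Names of non-initiative orientations whose prefix starts with m."""
    return [
        name
        for name, (prefix, _) in paradigm.items()
        if name != "initiative" and prefix.startswith("m")
    ]


print(retains_segment(PAKA_PARADIGM))
print(retains_segment(HIMMELMANN_POTENTIVE))
print(m_initial_outside_initiative(HIMMELMANN_POTENTIVE))
```

Under these assumptions, the paka› paradigm keeps ‹aka› in every orientation, while Himmelmann's potentive paradigm does not, and all three of its non-initiative slots are m-initial. That is the pair of irregularities I point to in my first two comments on item 6.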

Potentive vs Non-Potentive Paradigm Correspondence/Contrast:

Himmelmann asserted that there is ample evidence to support this. I have listed below the evidence that he mentioned:

  1. There is a constant  semantic ratio in that -um- relates to maka- as -in to ma-, etc., independent of the formal make-up of these forms. [page 506-507]

    Yes, but this has another, different explanation.

  2. The correspondence between potentive and non-potentive forms is very general. For almost all potentive forms there is a corresponding non-potentive one and vice versa. (See Table 2 above.) [page 507]

    This has another, different explanation.

  3. The alternation between the two formations is also syntactically and semantically absolutely regular: The number and coding of arguments in both constructions is identical and the meaning difference always pertains to ability or lack of control. [page 507]

    Again, this has another, different explanation.

  4. The potentive/non-potentive distinction constitutes an obligatory choice in Tagalog grammar. That is, there is no neutral way to say ‘I broke a glass’ in Tagalog. Either I did it on purpose, in which case a non-potentive form has to be used. Or it was an accident, in which case it is necessary to use the potentive form (see also Wolff et al. 1991: 305f). This (part of the) paradigm is thus a truly inflectional paradigm in the sense established in section 3 (Wolff et al. 1991: 284, in fact, speak of potential inflection)…… Depending on the meaning of the base, it is of course also to be expected that a potentive patient voice formation regularly alternates with other potentive voice formations. The following example shows diníg in potentive actor voice: (31) "at nakàriníg siyá ng mga huni ng ibon" ‘… and then he heard some birds chirping.’ [page 507-508]

    I would agree if potentive verbs are expanded to include other verbs with pa› derived bases, such as makapag›, makapagpa›, mapa›, mapang›, etc. If not, Himmelmann’s potentive/non-potentive obligatory choice is a direct result of an arbitrary paradigm, since other paradigms can be created that fit the data better and do not have this binary choice anymore. See below for more details.

  5. Perception predicates semantically fit the non-potentive – potentive distinction: The potentive forms refer to unplanned, casual, non-directed perceptions, the non-potentive forms to perceptions which are controlled in the sense that attention is consciously directed towards a given input. The major difference between the action predicates such as ‘shoot’ and ‘cook’ and a perception predicate such as diníg is that the latter usually occurs with potentive affixation, while the former are more frequently found with non-potentive affixation.

    The fact that perception predicates appear in two different formations which vary with regard to intentionality or control will not come as a surprise to typologically informed readers. Similar differences are found in languages which have grammaticized the distinction between dynamic (or active) and stative eventualities (see, for example, Mithun 1991). Two points are to be noted here. First, many existing descriptions of Tagalog set up a special verb class for perception predicates, assuming that there is a special maka-/ma- inflection for these verbs (e.g. Schachter and Otanes 1972: 288, 296) and thus missing the generalization that there is a highly general potentive/non-potentive alternation for predicates of nearly all semantic classes. Second, as will be seen shortly, potentives including nondirected perceptions are strictly to be distinguished from “truly” stative eventualities in Tagalog.[page 508]

    Agreed, but the potentive category must be expanded beyond what Himmelmann conceived.

To summarize, I think there is a better paradigm than the potentive paradigm that Himmelmann created. I explain it in more detail below.

For Stative vs. Non-Potentive Paradigm Correspondence/Contrast:

  1. As in the case of both non-potentive and potentive dynamic voice alternations (see Table 3), the inherent link with aspect/mood inflection provides the major formal evidence for the view that the formations listed in Table 5 form a derivational paradigm. [The] productive derivational relations between stative and dynamic formations. That is, lexical bases are not limited to occurring in either dynamic or stative formations. In principle, i.e. inasmuch as the resulting formation makes sense semantically and is useful pragmatically, all lexical bases can occur in either paradigm. Hence, there are dynamic derivations from bases typically denoting states and vice versa. For example, takot ‘fear’ – which usually occurs with stative affixations – also allows for dynamic derivations as in : (37) "Huwág mong takutin ang bata’." ‘Don’t frighten/scare the child!’ (38) "Sino ang tumakot sa iyó?"  ‘Who frightened you?’. And conversely, usually dynamic putol ‘cut’ also allows for stative derivations as in : (39) "Ikapùputol ng mga sangá ng kahoy ang malakás na hanging itó."  ‘This strong wind will cause many branches of trees to break off.’. Consequently, there are constant correlations across dynamic and stative formations (e.g. tumakot relates to ikatakot as pumutol to ikaputol, etc.), which in turn suggests that the dynamic and stative paradigms themselves are also in a paradigmatic relationship (see below, Table 6). [page 513-514]

    I agree.

  2. Positional predicates with ma- do not only mean being in a given position but also getting oneself into a given position. For example, maupó’ does not only mean ‘be seated’ but also ‘seat oneself, sit down’. This second meaning is somewhat unexpected in that ‘sit down’ clearly involves an agentive argument. Not surprisingly then, there is also a dynamic form umupó’ ‘sit down, sit on’. It is hard to tell whether there is a real semantic difference between dynamic umupó’ and stative maupó’ in the reading ‘sit down’. In fact, the two forms are interchangeable in many contexts. Thus, both maupó’ kayó and umupó’ kayó are used for ‘sit down!’ (imperative). Note that the two forms differ in their other readings. Only maupó’, but not umupó’, also means ‘be seated’. And in contrast to stative maupó’, dynamic umupó’ also means ‘sit up (as when getting up from bed)’. [page 517]

    Translating maupó’ kayó as ‘sit down’ is rough and quick, more a paraphrase than a translation, and it distorts the meaning a bit. maupó’ kayó and umupó’ kayó are translatable only as ‘be seated (down)’ and ‘sit (down/up/on)’ respectively, so the difference is easy to see, as is which one is stative and which is fientive.

To summarize, there is a stative paradigm that is distinct from the non-potentive paradigm as illustrated in Himmelmann’s Table 2.

Diagnostic for Distinguishing Potentives vs Statives

Since any lexical base can appear with either stative or fientive formatives and consequently “stative basic voice and stative actor voice are formally identical to potentive patient voice and potentive actor voice, respectively”, Himmelmann provides a means to distinguish them:

  1. In context, if an agent is overtly expressed or its participation is clearly implied, the form is unambiguously to be interpreted as a dynamic one. Otherwise, a stative interpretation is the default interpretation. [page 514]

    Not entirely. There are stative subtypes, like causative statives, that allow an agent-like interpretation of subjects, so this is not an absolute rule. And because it is not an absolute rule, it should never be used as a diagnostic for distinguishing dynamicity and control paradigms.

  2. For ma- with inanimate effectors,  there are two pieces of formal evidence for analyzing a given ma-formation as potentive patient voice rather than stative.

    1. First, only potentives allow the overt expression of an argument marked by the genitive marker ng. Basic statives do not allow genitive-marked arguments but only locative marked ones.
    2. Second, voice alternation test … provides important language-internal evidence for the problem of invariable ma- formations.

    [page 514-515]

    I think this needs further clarification.

    1. Statives, basic or derived, allow a genitive marker in non-initiative orientations.
    2. I think it’s mainly the orientation and its argument marking that distinguish the dynamicity and control paradigms and their subtypes.


It is evident from my comments that I think Himmelmann’s paradigm, reproduced below as Table 2 from [Himmelmann #2], is incorrect.


To summarize the reasons:

  1. All of Himmelmann’s potentive ma› undergoer roles are in the wrong orientation. They are in initiative orientation, since all of them start with m›, including all the putative potentive non-initiative orientation formatives. Outside the paradigm under consideration, no non-initiative orientation verb affix starts with m›, yet all the non-initiative orientation affixes in his potentives do. See My Table 3.
  2. maka› can’t be the initiative orientation counterpart of ma›, since all verbs with formatives in the non-initiative orientation retain the rest of the formative except for the initial consonant m›. See the examples in My Table 2. Only the initiative orientation has m›, and all non-initiative orientations retain the ‹ag› part of the initiative orientation formative. Potentive non-initiative orientations should therefore have ‹aka›.
  3. No explanation was provided why there is a ‹ka› in maka› that is absent in the undergoer voices.
  4. Potentive ma› and maka› do not belong in the same orientation paradigm, but they do belong in the wider potentive paradigm that includes other verbs with pa› derived bases, such as makapag›, makapagpa›, mapa›, mapang›, etc.
  5. Statives can have agent arguments, especially if they are causative statives and stative causatives, which are different from each other and also have a regular paradigm. In fact, maka› is stative causative initiative orientation.
  6. Maka› cannot be in two paradigms: it does not belong in the stative paradigm and should be in the causative paradigm only. We can use orientation alternation and form as a guide to prove this.
  7. All four aspect/mood forms of maka› can be found in the wild. Apart from maka-litó and naka-lìlitó, the two other forms exist. The nonrealis nonperfective forms (maka-àawa’, maka-lìlitó) and the realis perfective forms (naka-àwa’, naka-litó) can be found in such sentences as ‘Ang panukalang ito ay nakalito sa lahat halos ng naroroon’ and ‘Iwasan ang mga kilos na makalilito sa atensyon ng mga tagapakinig.’
  8. Although a potentive paradigm does exist, the similarity and correspondence in distinction with potentives can be explained differently without involving a Himmelmann-type potentive paradigm. The correspondence in orientation between the paradigms is not formally direct, only indirect through semantics.
  9. Stative ma› does show orientation correspondence in a regular and general fashion, contrary to his analysis. Stative locative orientation forms do occur; you just have to know what you’re looking for. For example, the realis perfective kinaliitan is used here: ‘Yung mga kinaliitan kong damit tinatabi ko pa, kasi baka pumayat pa ako eh.’ I can think of kaliitan as a verb, but finding examples in the wild is more difficult, as it is orthographically identical with kaliitan ‘smallness’. I have already indicated above that it is the potentive ma› that does not show orientation correspondence in a regular and general fashion. This is the reverse of his position.
  10. Stative ma› does not exclude implied agents and is not limited to only inanimate things or abstract states of affairs.   
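
The four aspect/mood forms of maka› listed in item 7 can be generated mechanically. The following is only a minimal sketch under my own simplifying assumptions: it uses plain orthography without stress or glottal-stop marks, it treats realis as a swap of the initial m for n, and it treats nonperfective as reduplication of the first (C)V of the base. The function names are made up for illustration and are not taken from Himmelmann.

```python
VOWELS = set("aeiou")


def reduplicant(base: str) -> str:
    """Return the first (C)V of the base: 'lito' -> 'li', 'awa' -> 'a'."""
    for i, ch in enumerate(base):
        if ch in VOWELS:
            return base[: i + 1]
    raise ValueError(f"no vowel found in base {base!r}")


def maka_forms(base: str) -> dict:
    """Build the four maka› forms, keyed by (mood, aspect).

    Assumptions (illustrative only): nonrealis starts with m, realis with n;
    nonperfective reduplicates the first (C)V of the base.
    """
    forms = {}
    for mood, initial in (("nonrealis", "m"), ("realis", "n")):
        for aspect in ("perfective", "nonperfective"):
            stem = base
            if aspect == "nonperfective":
                stem = reduplicant(base) + base
            forms[(mood, aspect)] = initial + "aka" + stem
    return forms


if __name__ == "__main__":
    for key, form in maka_forms("lito").items():
        print(key, form)
```

With the base lito this yields makalito, makalilito, nakalito and nakalilito, the last three of which match the attested forms cited in item 7 once the stress marks are ignored.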


Let’s work out what the correct paradigm should look like. I will be creating a series of tables to show how to arrive at the correct paradigms. To interpret the tables, bear in mind the following:

  1. We know that the orientation affixes are applied to the verb base, and the verb base can be a root word or a derived word. If it’s a derived word, then there are layers of affixes and root, shown by layers of {   }.
  2. Affixes are shown using guillemets: ‹infix›, prefix› and ‹suffix. An affix can also show layers, in that an infix applied later will be inside another infix, like ‹‹um›in›. This is a better representation than ‹um›‹in› because it rules out other interpretations, such as that they were infixed at the same time, or that ‹in› was infixed after ‹um› but not after the initial consonant.
  3. Truncated segments are shown by (  ).
  4. The table shows three stages of the formative, separated by →:
    1. the relationship between affixes and the verb base, showing the layers of affixation and bases,
    2. the formative showing what was truncated, and
    3. the remaining formative after truncation.
  5. I have included only orientation and mood in the paradigm, and not aspect, as different Philippine languages have different ways of marking nonperfective, and pluractionality is only marked in Bikol.
  6. As is well known, the realis mood infix ‹in› is applied first, and the orientation affixes are applied afterwards. (This might not be the case with Ilokano.)
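
The guillemet notation can be read off mechanically from where the marks sit. Below is a small sketch of that reading; the function names are my own and purely illustrative, and it handles only single affixes, not two-part formatives such as ma› ‹an.

```python
OPEN, CLOSE = "‹", "›"


def affix_kind(affix: str) -> str:
    """Classify an affix written in the guillemet notation.

    ‹x›  -> infix   (marks on both sides)
    x›   -> prefix  (mark on the right only)
    ‹x   -> suffix  (mark on the left only)
    """
    opens = affix.startswith(OPEN)
    closes = affix.endswith(CLOSE)
    if opens and closes:
        return "infix"
    if closes:
        return "prefix"
    if opens:
        return "suffix"
    raise ValueError(f"{affix!r} carries no guillemet and is not an affix")


def segments(affix: str) -> str:
    """Strip the guillemets and return only the segments of the affix."""
    return affix.replace(OPEN, "").replace(CLOSE, "")


if __name__ == "__main__":
    for a in ("‹in›", "pa›", "‹an", "‹‹um›in›"):
        print(a, affix_kind(a), segments(a))
```

A layered infix such as ‹‹um›in› is still classified as an infix, since the outermost marks are what the sketch looks at.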

Let’s start from the basics. This is the core paradigm, which applies to all verbs:


The above table indicates that for each word, realis mood is applied first before any orientation affix is added. This is the regular affixation order for Tagalog and Bikol. Even Bisayan verbs from derived bases follow this order. However, Bisayan languages have the reverse order for verbs from simple bases. Terminative suffix for Bikol, Sugbuhanon, Hiligaynon, and Samarnon is ‹on. In other languages and dialects in Bikol (Rinconada, Miraya) and Visayas (Kiniray-a, dialects of Sugbuhanon and Samarnon) it is ‹ǝn instead of ‹in.


The core orientation affixes are applied to simple bases, derived bases or phrasal bases. Derived bases can be categorized into 5 types, four of which are illustrated in the table below.


The fifth, “everything else”, is a catch-all for those that can’t be included in any of the four above, so it is a heterogeneous group. All these basic derived bases have subtypes as well. I will not be showing here the subtypes for pag› and paŋ›.

However, the only ones involved in Himmelmann’s works are ka› and pa› derived bases, the paradigms of which are numbered #1 and #2 above. As you can see, the initiative nonrealis formative is ma› for both ka› and pa›.

Below are further paradigms showing how both ka› and pa› can be applied on derived bases. Ka› has 4 subtypes and pa› has 8 subtypes. How productive these combinations are remains to be investigated. Also, I need to provide example sentences to really prove that these forms can exist in normal conversation, as some of the forms with ? marks are confusing in that the forms by themselves may exist in other paradigms.




One subtype of ka› is ka› on pa› derived bases, as in ka›{pa›  }. Vice versa, one subtype of pa› is pa› on ka› derived bases, as in pa›{ka›  }. Presented side by side, the ka› and pa› formatives and the two combinations look like the table below, which shows causative and non-causative types of statives, and stative and non-stative types of causatives.


Maka› being the stative causative counterpart of causative ma› is the reason why it behaves similarly to causative/potentive ma›. Maka› is in the initiative orientation in Himmelmann’s potentives and statives because that is its correct orientation, while all the ma› formatives (ma›, ma› ‹an, mai›) in both Himmelmann’s potentives and statives are in the wrong orientation. This also explains why there is a ka› in maka›, which Himmelmann does not explain. Furthermore, this is the reason why stative ma› has no maka› counterpart: adding a stative on top of an already stative meaning is redundant and does not make sense, although adding a causative over another causative does.

The differences between the above stative paradigms and Himmelmann’s statives are:

  1. Statives have four types, with causative statives just one of them.
  2. The primitive stative paradigm is the base from which all the other stative types are derived.
  3. Ma› is in initiative orientation instead of in the terminative orientation.
  4. Maka› is not part of the stative paradigm at all, but that of the causative, specifically, its subtype stative causative.
  5. Maka› is part of just one paradigm, the stative causative, and never straddles several paradigms.

As for the differences between Himmelmann’s potentives and my paradigms:

  1. There is no single potentive verb class. The two types of verbs in this class belong to two different paradigms:
    1. maka› is part of the stative causative verb paradigm.
    2. ma› is part of the primitive causative verb paradigm.
  2. The primitive causative paradigm is the base from which all the other causative types are derived.
  3. All of the non-initiative orientation formatives in Himmelmann’s potentives are in the primitive causative initiative orientation. This is not shown above, but will be shown in the next table.
  4. Potentives as a group of verbs or as a unitary paradigm do not exist.


I mentioned that potentives are a result of orientation stacking, which is no different from case stacking. The table below shows the paradigm for involuntary action and ability verbs. These verbs are located in the green cells in My Table 7. As indicated, all these verbs come from pa› derived bases, which themselves come from bases already inflected with orientation affixes.


This is just the tip of the iceberg, so to speak. There are many more potentive paradigms where orientation stacking happens, with 5 subtypes, some of which have further subtypes of their own, for a total of 17. This is not exhaustive, as I’m sure I can still extend them with a few more:





This stacking is not limited to pa›. It also happens with pag›:


Orientation stacking is the reason why potentives, or involuntary action and ability verbs, mean and behave as they do. The derived base already carries an orientation affix and then receives an initiative orientation affix on top of it. Contrast the following three pairs of sentences:


Except for the verbs, the two sentences in each pair have exactly the same structure. The second sentence in each pair is a causative in terminative, translative and locative orientation respectively. These verbs are formed differently. Let’s first look at the non-initiative orientations, which are formed in the following ordinary way on pa› derived bases:


Take note that the bases are simple roots. Potentives, the first sentence in each pair above, are however formed differently:


In involuntary action and ability verbs, the base is derived. The derived base contains an orientation affix which already indicates the case role of the verb. So, the formatives for these ma› potentives then literally mean the following in the a/b sentence pairs:


Potentive verbs mean that the subject will also play undergoer roles. The involuntary aspect in the meaning stems from the use of pa› when compared with pag›.

Potentive verbs, since they all carry initiative orientation, cannot have derived bases that already carry initiative orientation, as shown by My Table 15, at least in Tagalog and Bikol.


The above also explains why potentive verbs appear in initiative orientation. They cannot appear in non-initiative orientation outside of the derived bases in either Tagalog or Bikol, as shown by the following tables. Whether they can appear in non-initiative orientation elsewhere can only be determined by looking at other related languages.


Missing Pai›

Two of the forms in My Table 10 seem not to exist in Tagalog: *pa›ibili and *p‹in›aibili, which come from the combinations *pa›i and *p‹in›ai. They do not exist in Bikol either. A quick look at Rubino’s Ilokano grammar, Wolfenden’s Hiligaynon grammar, Wolff’s Sugbuhanon grammar and Romualdez’ Samarnon grammar turned up nothing either. I will check their Spanish grammars later.

However, these forms exist in Sugbuhanon: Wolff’s Cebuano dictionary has a pahi- entry for the words balu, ígù, matngun. Example for pahibalu:


In Edgie Polistico’s list of Cebuano Affixes, he has the forms mahi› and nahi› for these:



Beato de la Cruz and David Zorc have this comment in their A Study of the Aklanon Dialect, Volume One, page 67:

3,3. THE ACCIDENTAL MOOD (p-) states that an action takes place completely by happenstance. It has come down by usage generally unmarked by an aspect morpheme, though on some occasions (mostly of deep or archaic use) it can occur with either na- or ma- respectively. Most commonly, however, some other element in the sentence or clause expresses the time of the action. The general forms, then, are:

     hi–      [nahi—]


And in Volume Two, those prefixes have these definitions on pages 252 and 267:


[Happenstance verb prefix denoting future or unreal accidental action.] [G. 67-68, 95] [D. 19, 22] Indi nakon ikaw mahilipatan. / I’ll never forget you. /

nahi— (pfx) [Oak] [Happenstance verb prefix denoting present or real accidental action.] [G. 67-68, 95] [D. 19, 22]

These forms, pahi›, mahi› and nahi›, are the same forms as pai›, mai› and nai›, the difference being the loss of h, as I have mentioned in another post.

Why does maka› behave like it’s in two paradigms?
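
The correspondence can be written as a one-step rewrite. This is only an illustrative sketch of the h-loss I am proposing, with a made-up function name; it says nothing about where or when the change happened.

```python
def drop_h(formative: str) -> str:
    """Map an h-ful formative to its h-less counterpart, e.g. pahi› -> pai›.

    Illustrative only: deletes the h between a and i, once.
    """
    return formative.replace("ahi", "ai", 1)


if __name__ == "__main__":
    for f in ("pahi›", "mahi›", "nahi›"):
        print(f, "->", drop_h(f))
```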

Himmelmann included maka› in both his potentive and stative paradigms. Quoting Himmelmann:

Potentives easily allow actor voice derivations with maka-, hence nakàkita siyá ng aso ‘she saw a dog’… The base galit ‘anger’ is one of the bases which allow a clearly stative maka- derivation, makagalit meaning ‘to be the cause of anger, to give offence, to irritate’. In contrast to potentive actor voice formations, the subject of a stative maka- formation has to be an inanimate cause (some state of affairs or a thing):
[Himmelmann #1, page 511]

I said above that maka› is in the stative causative paradigm only. So how is it possible for maka› to have an actor in some constructions and no actor in others? It’s because it does not actually refer to an actor. Being a pa› formation derived from bases with ka›, it just means “to cause to be in a state”. So the cause may have volition (an actor) or may not (a natural force or process), and doesn’t even have to be a force (it can be a reason).

Below are the possible derivations of maka› verbs. As expected, only the initiative orientation forms make sense.


This is further evidence that maka› is not the “actor voice” of undergoer potentive ma› forms, because maka› has those forms as well: makabaril, maikabaril and makabarilan.


In Bikol, the same formatives are present, with very few differences. The first noticeable difference is that there is no vowel lengthening in potentive ma›. This is because ma:› is the nonrealis imperfective initiative formative of pag›. A few of the derived bases are less common, especially those involving pag› forms.

Among Bisayan languages, Tagalog ma:› is present as maha›, plus there are a few more pa› derived bases, and ka› seems much more common. I will expand this section in the coming days.


I learned a lot from Himmelmann about paradigms, and have recast his paradigms to suit the way I understand how Tagalog, Bikol and Bisayan languages work. One thing I learned is that paying attention to the forms first is rewarded with better explanations than starting from the meaning of the forms. Correspondence in paradigms does not guarantee correct paradigms. One must pay attention to the forms first, especially in agglutinating languages.