Wiktionary:Beer parlour/2022/December

Archived revision by Mervat (WMF) (talk | contribs) as of 13:00, 12 December 2022.

Japanese kyujitai

Previous discussion: Wiktionary:Beer_parlour/2020/November#Move_kyujitai_to_t:ja-kanjitab

Currently in Japanese entries, both t:ja-kanjitab and the headword template are capable of displaying kyujitai. But obviously we want only one of them to do so. So I suppose the community should decide which should stay and which should go.

Considering that kyujitai is specific to the details of how a word is spelled in kanji, and that is the entire purpose of {{ja-kanjitab}}, it makes more sense to me to have all of the spelling, script, and reading-type information consolidated into that template. The headword is already crowded with other information, which was (I think) a big part of the impetus in the creation of {{ja-kanjitab}} in the first place. ‑‑ Eiríkr Útlendi │Tala við mig 22:58, 2 December 2022 (UTC)[reply]
To me 旧字体 (kyūjítai) and 歴史的仮名遣い (rekishiteki-kanazúkai) should both be at the end of the "Alternative spellings" box, with their own "Historical spellings" caption and kanji/kana labels. No need to add either to the headword, it's just visual noise. — Sartma 𒁾𒁉𒊭 𒌑𒊑𒀉𒁲 12:42, 3 December 2022 (UTC)[reply]
Adding a bit more: separating "normal" alternative spellings from historical spelling would allow us to deal with cases like 掴む (tsukámu) and 我が儘 (wagamáma) in a clearer way.
I would put kyūjítai and rekishiteki-kanazúkai together under "historic" with kanji and kana labels since they were both the standard before the post-war reforms that gave us Modern Japanese orthography, so they do belong together like modern kanji and kana belong together. — Sartma 𒁾𒁉𒊭 𒌑𒊑𒀉𒁲 12:52, 3 December 2022 (UTC)[reply]

(Notifying Eirikr, TAKASUGI Shinji, Atitarev, Fish bowl, Poketalker, Cnilep, Marlin Setia1, 荒巻モロゾフ, 片割れ靴下, Onionbar, Shen233, Alves9, Cpt.Guapo, Sartma, Lugria): I think this is worth a vote if no conclusion can be drawn here. -- Huhu9001 (talk) 08:59, 3 December 2022 (UTC)[reply]

@Huhu9001 As a bot owner but not a Japanese editor, I think we should do what's right irrespective of how many pages need to be changed. Changing 6000 pages by bot is really not a big deal (I did one change a few years ago that hit about 1.4 million pages ...); the only question is how much can be automated vs. how much needs to be done manually. Any idea about that? For example, if 100 pages can't be handled automatically, that's fairly easy to do manually; doing 1,000 pages manually is more difficult and would best be handled by the "push-manual-changes" method I use for such situations (where you load all the pages into a text file, do all the edits there and then push the results using a bot). Benwing2 (talk) 19:44, 3 December 2022 (UTC)[reply]
  • I agree with Eirikr and Sartma that ja-kanjitab is the more sensible place for kyujitai, as it is a matter of written form. I would have no objection to a 'historical' section that is somehow separated from other alternative written forms. Cnilep (talk) 01:20, 6 December 2022 (UTC)[reply]

Position of box templates

While working on Umbrian, it was pointed out to me that my usage of {{normalized}} was going against what's written in its documentation, that is, to place it at the end of the entry, whereas I place it at the beginning (see avif, persklom, etc.).

This is also common practice with {{LDL}}, and I find it odd, since all the other box templates I can think of are placed at the topmost of the entry: most notably {{reconstructed}} (which actually comes before the L2 header) and {{phrasebook}}. {{hot word}}, although not a box, might also be worth mentioning.

With our current positioning (1) the box is theoretically inside the last header, usually References or Further reading, (2) important information is at the bottom of the page, which is not ideal since due to the bright green everyone is going to look there first thing anyways, so placing it at the top would make the reader follow the normal top-to-bottom order, and (3) it looks worse: I mean... look at the spacing (eg: nyelingur, суъптаъ), it looks like it ended up there by mistake. Cast your votes.

Catonif (talk) 20:15, 3 December 2022 (UTC)[reply]

Top. Vininn126 (talk) 21:21, 3 December 2022 (UTC)[reply]
Top, after L2 header for this template and all other box templates that apply to the entire L2 entry. JeffDoozan (talk) 17:25, 4 December 2022 (UTC)[reply]
At the very Top of L2 section. DCDuring (talk) 17:49, 4 December 2022 (UTC)[reply]
Does that mean above or below the L2 itself? Vininn126 (talk) 17:50, 4 December 2022 (UTC)[reply]
Below the L2. Theknightwho (talk) 19:27, 4 December 2022 (UTC)[reply]
Top makes more sense IMO. —Al-Muqanna المقنع (talk) 21:08, 4 December 2022 (UTC)[reply]
Bottom, as these boxes do not contain any important information at all. MuDavid 栘𩿠 (talk) 01:30, 6 December 2022 (UTC)[reply]
Does this imply that box templates should not be used if they do not apply to all the homographs in the language section? --RichardW57 (talk) 01:51, 6 December 2022 (UTC)[reply]
As appropriate. I think nyelingur, which uses {{LDL}}, looks better as it is. Seeming to be in a References section is actually appropriate for that template. The box for that template will be disruptively thick if it appears at the start of the language section. By contrast, boxes for {{rfv}} indicate something that needs attention; if one is looking up a word found in a durable medium, the user may be able to help. --RichardW57 (talk) 02:04, 6 December 2022 (UTC)[reply]
I suppose I could see moving the LDL template up a la {{hot word}}. Moving {{normalized}} up just seems like clutter, if we're comprehensively normalizing all the words from a reference work in one script/orthography to another script, so my personal aesthetic preference would be to leave that box at the bottom. - -sche (discuss) 02:27, 6 December 2022 (UTC)[reply]
I'm happy to see this matter is gaining so much input.
I would like to underline that we should not have this floating homelessly in the entry, that is, not being technically under any of our headers, defying the structure. If we really want it to be in the References (even though it's not a reference), so be it, as long as in EL we clearly state "the References L3 is for references... and green boxes", and that in the presence of (e.g.) Anagrams, the latter will be placed below. Now this doesn't sound so great, but it's the only way to make the boxes not defy our tree-like structure while still staying at the bottom. Either this, or top.
@MuDavid: while I agree that it is subjective whether the information is important or not, the point is that the bright green is going to attract the eye nonetheless, and in that case, better reading top-to-bottom than top-jump-to-bottom-go-to-top-and-read-to-bottom. @RichardW57: not sure how homographs should be dealt with, but that sounds an exceptionally good reason to have the box at the top (right under the ===Etymology N=== header). For it at the bottom, see at saman#Azerbaijani. @-sche: are you suggesting we move {{LDL}} to the top, but not {{normalized}}? They work and look very similarly, it would be weird to have them with separate positioning. Imagine ката#Udi. About the clutter, I need to point out that (in Umbrian, which I presume is what you're talking about) not all words are normalized, some being lemmatized in the same spelling in which they are actually attested.
Catonif (talk) 13:37, 6 December 2022 (UTC)[reply]
(Currently it's 4-2, so it'll likely be top.) Vininn126 (talk) 13:45, 6 December 2022 (UTC)[reply]

Ok, I waited so as not to make a premature decision. Seeing that the discussion had the result of top (5-2, counting myself), I can change the documentation, but on the other hand, I can't manually move all instances of the templates. Can this be automated by a bot? Catonif (talk) 15:29, 9 December 2022 (UTC)[reply]

I can have my bot enforce this, which templates should always be at the top of the language entry after the L2 header? Just {{normalized}}, {{reconstructed}} and {{hot word}} or are there others? JeffDoozan (talk) 01:54, 10 December 2022 (UTC)[reply]
Thank you! They should be {{normalized}} and {{LDL}}. {{hot word}} should technically already be there, and {{reconstructed}} is actually before the L2, and I think everyone's fine with that. Catonif (talk) 07:08, 10 December 2022 (UTC)[reply]
Thanks Jeff! Vininn126 (talk) 11:47, 10 December 2022 (UTC)[reply]
The 'top' position is not immediately after the L2 header. It is after the L2 header or Etymology N header. --RichardW57 (talk) 10:37, 11 December 2022 (UTC)[reply]
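The move discussed above could be sketched roughly as follows. This is only an illustrative sketch, not JeffDoozan's actual bot code: the function name `move_boxes_to_top` and the list `BOX_TEMPLATES` are hypothetical, it assumes each box template sits alone on its own line, and it only handles the plain "after the L2 header" case. Entries where the box belongs under an ===Etymology N=== header, as RichardW57 notes, would still need separate handling.

```python
import re

# Templates the thread agreed should sit directly under the L2 header
# (hypothetical constant name; the list follows Catonif's reply above).
BOX_TEMPLATES = ("normalized", "LDL")

# An L2 header has exactly two equals signs on each side, e.g. "==Umbrian==".
L2_HEADER = re.compile(r"^==[^=].*?==\s*$")

# A box template alone on its line, with optional parameters.
BOX_LINE = re.compile(
    r"^\{\{(?:" + "|".join(re.escape(t) for t in BOX_TEMPLATES) + r")(?:\|[^{}]*)?\}\}\s*$"
)


def move_boxes_to_top(section: str) -> str:
    """Move box templates to just below the L2 header of one language section.

    `section` is the wikitext of a single language section, starting with
    its L2 header line. Anything else is returned unchanged.
    """
    lines = section.split("\n")
    if not L2_HEADER.match(lines[0]):
        return section
    boxes = [line.strip() for line in lines[1:] if BOX_LINE.match(line)]
    if not boxes:
        return section
    rest = [line for line in lines[1:] if not BOX_LINE.match(line)]
    # Drop the blank lines left behind at the end of the section.
    while rest and rest[-1] == "":
        rest.pop()
    return "\n".join([lines[0]] + boxes + rest)
```

Running the function a second time on its own output leaves the text unchanged, which is convenient for a bot that may revisit pages.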

{{defdate}} vs {{etydate}}

Would anyone mind if I changed etydate to be placed in the etymology line? Vininn126 (talk) 09:23, 5 December 2022 (UTC)[reply]

Support. Hopefully also a bot to do the cleanup. Catonif (talk) 19:03, 5 December 2022 (UTC)[reply]
@Vininn126: I object to your proposal on the grounds of unintelligibility. What change are you proposing? --RichardW57 (talk) 23:42, 5 December 2022 (UTC)[reply]
Object RichardW57 (talk) 23:43, 5 December 2022 (UTC)[reply]
That is... an odd reason to object? Currently etydate is supposed to be on the definition line like defdate. It is overlapping with defdate in that area if you were to put both. Plus it's ETYdate. Vininn126 (talk) 07:49, 6 December 2022 (UTC)[reply]
You mean in the etymology section like ampersand#Polish, or on the definition line like {{defdate}}? When used on the definition line it does seem to overlap with defdate, though I concede it's not redundant because it automates "first attested in" and some other things. If we're putting it in the etymology section, IMO it should be reformatted, because I see no reason for it to be in brackets and at a small font size if it's in the etymology section, though it could still be helpful as a time-/keystroke-saving templatization of our current 'handwritten' etymologies like "First attested in 1644; engineering sense first attested in 1793". - -sche (discuss) 01:58, 6 December 2022 (UTC)[reply]
I could really get behind that. If we increased the font we'd want to increase the reference size as well. Vininn126 (talk) 07:49, 6 December 2022 (UTC)[reply]
I agree that it would need reformatting to be moved into the etymology section. Graham11 (talk) 08:03, 6 December 2022 (UTC)[reply]
We can also discuss if it should be at the beginning or the end. Vininn126 (talk) 08:21, 6 December 2022 (UTC)[reply]
Does that need to be determined? I don't think it really matters if an etymology section says "first attested in 1900, from X + Y" or "from X + Y, first attested in 1900", though my personal preference is for the latter. Agree with -sche about removing the brackets and size formatting in any case. —Al-Muqanna المقنع (talk) 13:06, 6 December 2022 (UTC)[reply]
It's been discussed on the discord, plus there's an argument to be made about consistency and uniformity of entries making them easier to read. Vininn126 (talk) 13:13, 6 December 2022 (UTC)[reply]

Wugniu tone notation

We’ve generally come to a conclusion as to how the Shanghainese Wugniu rollout will work. For a refresher on the romanisation scheme, see User:ND381/Wu Expansion, which also has notes on what will be done for the Wugniu display integration. However, as you can see, we do not yet have a consensus as to how to display tones. Here are a few ideas for you, please leave a comment as to what you prefer. (I’m working on the assumption that we can all agree that left-prominent sandhi is to be notated with a dash, but if you disagree, let me know)

1. Diacritics
Wugniu, as the website displays it, does not have diacritics. However, due to the “two phonemic tones” analysis of Shanghainese, many have opted to simply notate the dark level (陰平) tone with a diacritic - usually acute or grave accent.
non tsén mo-ve
麻煩
An important advantage of this is that it makes the transcription a lot cleaner. Note, though, that this will not be possible for lects such as Suzhounese, where no analyses have all non-first-syllable tones lose phonemic tone.
2. Numbers
What we currently do reflects the practice of many romanisations; however, Wugniu prioritises historical tone distribution, and thus tones 2-5 will be renumbered 5-8. This is, frankly, all that people who use number notation can agree on. Whether to use super/subscript numbers, and whether to put them before or after the syllable, are all points of contention. Unfortunately, due to how the old module is programmed, there is no way to re-implement tones for syllables after the first.
a. all behind syllable
non⁶ tsen¹ mo⁶-ve
麻煩
b. all behind syllable, except for sandhi chains
non⁶ tsen¹ ⁶mo-ve
麻煩
c. all in front of syllable
⁶non ¹tsen ⁶mo-ve
麻煩
3. Right prominent sandhi
It is also of note that many people don't actually notate the right-prominent sandhi in Shanghainese. However, this can lead to changes of tone. I'm not sure whether we should notate it as well (the current module already forces use of +), and if we do decide to, how we ought to do it.

If there are any further thoughts, let me know as well. (yoinking from justin's message from last time: @Atitarev, Thedarkknightli, ChromeGames, Mteechan) — 義順 (talk) 19:15, 5 December 2022 (UTC)[reply]

My two cents is that "non⁶ tsen¹ ⁶mo-ve" is confusing; I would think that the tone a syllable has should be notated either consistently after or consistently before that syllable, but not in different places. (If the issue is that in this case tsen itself is pronounced with a tone that goes from 1 to 6, I would still think that the indication of this should be attached to tsen, not half to tsen and half to mo.) Of those options (after vs before a syllable), it seems like languages in general and Chinese languages in particular usually notate tones after the relevant syllable (tsen¹), rather than before (¹tsen), so notating tone after the syllable here too would be consistent. - -sche (discuss) 02:24, 6 December 2022 (UTC)[reply]
Upon asking several Shanghainese people (most of whom have an understanding of Wugniu and/or linguistics), the general consensus seems to be that right-prominent sandhi is too variable to be practical to include (i.e. the 1 + 6 should not be written). The overwhelming majority agree that 2c looks the best (including those that know of systems such as Jyutping), with one supporting sticking to 1. I personally also agree that 2c looks the best, but we may want more Wiktionarians to reply first. — 義順 (talk) 23:06, 7 December 2022 (UTC)[reply]
2b should not be used, IMO, because of the ambiguity that -sche mentioned. a and c both seem fine to me. —Al-Muqanna المقنع (talk) 11:46, 8 December 2022 (UTC)[reply]
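For comparing options 2a and 2c side by side, here is a small sketch of the renumbering and placement. It is not the existing pronunciation module: the names `RENUMBER` and `render` are hypothetical, the input format (one tone per left-prominent sandhi chain, given in the old numbering) is an assumption, and the old-numbering tones used in the usage note are obtained simply by inverting the 2-5 to 5-8 renumbering stated above.

```python
# Map ASCII digits to their superscript forms.
SUPERSCRIPT = str.maketrans("0123456789", "⁰¹²³⁴⁵⁶⁷⁸⁹")

# Old numbering -> Wugniu numbering: tone 1 stays, tones 2-5 become 5-8.
RENUMBER = {1: 1, 2: 5, 3: 6, 4: 7, 5: 8}


def render(words, position="before"):
    """Render a phrase with Wugniu tone numbers.

    `words` is a list of (syllables, old_tone) pairs, where `syllables`
    is the list of syllables in one left-prominent sandhi chain and
    `old_tone` is the tone of its first syllable in the old numbering.
    `position` is "before" (option 2c) or "after" (option 2a).
    """
    if position not in ("before", "after"):
        raise ValueError("position must be 'before' or 'after'")
    out = []
    for syllables, old_tone in words:
        tone = str(RENUMBER[old_tone]).translate(SUPERSCRIPT)
        first, rest = syllables[0], syllables[1:]
        if position == "before":
            head = tone + first
        else:
            head = first + tone
        # Left-prominent sandhi chains are joined with a dash.
        out.append("-".join([head] + rest))
    return " ".join(out)
```

With the example phrase given as `[(["non"], 3), (["tsen"], 1), (["mo", "ve"], 3)]`, `render(..., "after")` gives the 2a form "non⁶ tsen¹ mo⁶-ve" and `render(..., "before")` gives the 2c form "⁶non ¹tsen ⁶mo-ve". Option 2b is deliberately left out, since it mixes both placements.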

Location of Footnotes for Etymologies

The problematic text is in Wiktionary:Etymology#References. What does "Etymologies should be referenced if possible, ideally by footnotes within the “Etymology” section" mean? Web pages don't naturally do literal footnotes. When talking of Wikimedia pages, I suggest that 'footnote' should normally mean the display of the content implied by the domain of a <ref> tag; such content is typically displayed as the expansion of a <references/> tag. --RichardW57 (talk) 01:17, 6 December 2022 (UTC)[reply]

If it means what I think it means, I propose that "footnotes within the "Etymology" section" be replaced by "inline references", and that "inline references" be added to the glossary. Otherwise, I will defend adherence to the currently proposed policy. Silence is consent. --RichardW57 (talk) 01:17, 6 December 2022 (UTC)[reply]

Should they be below or above further reading? Vininn126 (talk) 14:15, 6 December 2022 (UTC)[reply]
I would expect them to be in a 'References' section. --RichardW57 (talk) 23:52, 6 December 2022 (UTC)[reply]
Yes, but I mean the references section itself. Vininn126 (talk) 08:14, 7 December 2022 (UTC)[reply]
@Vininn126: If they be within the Etymology section, then the order is unspecified, but I feel they would be better outside the etymology section even if we have the bodies of the references within the etymology section. As sisters within the same section of another type, I would expect 'References' to come before 'Further Reading'. RichardW57m (talk) 14:39, 7 December 2022 (UTC)[reply]
I ask because on many pages, particularly Proto Slavic pages, References is under, but I would expect it to be above Further Reading as well. Vininn126 (talk) 09:38, 8 December 2022 (UTC)[reply]

This has relevance to the layout of the etymology section of พริก (prík). Potential edit warrer: @This, that and the other. --RichardW57 (talk) 01:17, 6 December 2022 (UTC)[reply]

Footnotes should go at the bottom. As you can read in WT:EL, references go below. MuDavid 栘𩿠 (talk) 01:26, 6 December 2022 (UTC)[reply]
I'll jump in here since I think my bot edits provoked this. I would read that as meaning that Etymologies are simply encouraged to use the <ref> tag and that consequently those references would be displayed in the References section at the end of the entry. I see no good reason for any entry to have more than one References section and certainly not one stuck inside the Etymology. JeffDoozan (talk) 01:46, 6 December 2022 (UTC)[reply]
I agree with Jeff, I'd take this to mean etymologies should have <ref>s, not that the <references/> also needs to be directly in the Etymology section (above the POS section and definitions); I would support rewording the guidance to be clearer. On the last point: sometimes, if an entry has two different etymology sections, it may have ====References==== sections at the end of each overall Etymology division, i.e. after the POS, etc. That seems OK. But yeah, don't put <references/> directly inside the ===Etymology=== section i.e. above the POS and definitions. In the exceptional circumstance that it's needed for the exact quote from a reference to be directly adjacent to the etymology, just quote the reference... - -sche (discuss) 02:10, 6 December 2022 (UTC)[reply]
What do you mean by 'entry'? Do you perhaps mean 'language section' or 'language section or numbered etymology section'? You can't mean 'lemma' or 'form', because there may be multiple lemmas for a single etymology, especially in languages where Europeans readily confound verbs with prepositions, or absolute neuter adjectives with abstract nouns. RichardW57 (talk) 14:07, 6 December 2022 (UTC)[reply]
Apologies for the ambiguity, by entry I mean the entire language section. JeffDoozan (talk) 14:20, 6 December 2022 (UTC)[reply]
I'm in complete concurrence with the three users above. If nobody objects, I'll make the change to EL as suggested by Richard, on the basis of this consensus. This, that and the other (talk) 11:28, 6 December 2022 (UTC)[reply]
Do you mean "WT:Etymology"? --RichardW57 (talk) 14:11, 6 December 2022 (UTC)[reply]
@RichardW57 Yes, I do. I thought you were referring to EL, which is a protected policy page, but as this is WT:E, which is not protected, feel free to make the change yourself. This, that and the other (talk) 09:36, 8 December 2022 (UTC)[reply]
Done. --RichardW57 (talk) 12:38, 11 December 2022 (UTC)[reply]

Syllable breaks in English pronunciations

User:Kwamikagami seems to be on a one-person crusade to expunge syllable-break markings from English pronunciation transcriptions (e.g., here, here, here, here, here, here), claiming that English syllabification is theory-dependent, when, in fact, English words naturally fall apart cleanly into separate syllables, something that's inconsistent with syllabification being theory-dependent (as this would require the actual pronunciation of the word to change depending on which theory one subscribes to, which is obviously ludicrous). Other people's thoughts? Whoop whoop pull up Bitching Betty ⚧️ Averted crashes 08:04, 6 December 2022 (UTC)[reply]

Whoop thinks it's "obvious" that Vashti is syllabified /ˈvæ.ʃti/. To me, it's obviously /ˈvæʃ.ti/. But to Wells it's clearly /ˈvæʃt.i/. If we're going to mark syllable boundaries by default, then we need consensus on an algorithm as to where they are. For example, do we agree that in GA girl is disyllabic? And then there's the question of how to handle ambisyllabicity.
[Or maybe Ladefoged. I forget: who is it that analyses nitrate as /ˈnaɪtr.eɪt/?] kwami (talk) 08:07, 6 December 2022 (UTC)[reply]
User:Kwamikagami Personally I think you should avoid unilaterally removing syllable boundaries until this has been discussed here and there is consensus to make these changes. Benwing2 (talk) 08:24, 6 December 2022 (UTC)[reply]
@Benwing2: Should we go and undo Kwami's syllable-break purges until there's a consensus here one way or the other? Whoop whoop pull up Bitching Betty ⚧️ Averted crashes 08:46, 6 December 2022 (UTC)[reply]
(ec) Well, Wells-or-Ladefoged-or-whoever apparently has some... interesting ideas about what kinds of consonant clusters can serve as an English syllable coda, ideas that seem to not always correspond perfectly with reality (at least if "/ˈnaɪtɹ.eɪt/" is anything to go by). girl can be either disyllabic (/ˈɡɚ.əl/) or monosyllabic (/ɡɚl/) in GA; this isn't a notational difference, but an actual variation in pronunciation (the GA dialects haven't developed a consensus as to the number of syllables in girl). As for Vashti, I strongly suspect that the difference in what various speakers consider the "obvious" syllabification might well reflect actual differences in pronunciation among GA speakers, similarly to the situation with the number of syllables in girl - in which case this isn't a question of what syllabification theory one subscribes to, but, rather, a question of multiple actual coexisting pronunciations that each need to be included. Whoop whoop pull up Bitching Betty ⚧️ Averted crashes 08:31, 6 December 2022 (UTC)[reply]
What is there to stop /ɡɚəl/ being monosyllabic, like British English /bɪəd/ beard? --RichardW57 (talk) 14:38, 6 December 2022 (UTC)[reply]
The R-coloring of the first schwa. Whoop whoop pull up Bitching Betty ⚧️ Averted crashes 01:00, 7 December 2022 (UTC)[reply]
How is that a problem? It is quite possible for only the second half of a vowel to be rhotacised. --RichardW57m (talk) 14:48, 7 December 2022 (UTC)[reply]
As regards ambisyllabicity, the natural way to notate that seems to be to include the consonant in question twice, first as the coda of the first syllable and then as the onset of the second syllable, which also seems to correlate the best with how the words in question actually sound. Whoop whoop pull up Bitching Betty ⚧️ Averted crashes 08:35, 6 December 2022 (UTC)[reply]
No, that's not natural, because it implies that the consonant is held longer than others in the word, which is generally not true. Consonants are not geminate simply by virtue of landing on syllable boundaries. Andrew Sheedy (talk) 08:37, 6 December 2022 (UTC)[reply]
It's not geminate; the first syllable has a half-length coda and the second has a half-length onset, with the syllable break coming in the middle of the sound lying across the syllable boundary. How would you go about notating ambisyllabic pronunciations (and don't try to avoid the problem by omitting syllable boundaries altogether, since that wouldn't help in cases where the presence of stress on at least the second syllable requires the syllable break to be marked)? Whoop whoop pull up Bitching Betty ⚧️ Averted crashes 08:44, 6 December 2022 (UTC)[reply]
The fact that you don't understand something doesn't mean that it doesn't correspond to reality. If you think you know better than internationally recognized experts, then it could be that you understand less than you think you do. kwami (talk) 10:03, 6 December 2022 (UTC)[reply]

Reconfirmed, it is Wells, author of the English Pronouncing Dictionary and Longman Pronunciation Dictionary. Woop, I'm curious how you would syllabify the following words, compared to one of the main RS's for English pronunciation. (Just add periods or hyphens if you like:)

petrol, selfish, feature, dolphin, hamper, brandish, carpeting, crisis, banker, attestation, apex, freedom, mattress, squadron, paltry.

Without concordance, and considering that dictionaries contradict each other, I'm wondering how we would be able to decide on syllabification. kwami (talk) 10:31, 6 December 2022 (UTC)[reply]

@Kwamikagami: /ˈpɛ.tɹəl/, /ˈsɛl.fɪʃ/, /ˈfi.t͡ʃɚ/, /ˈdɔl.fɪn/, /ˈhæm.pɚ/, /ˈbɹæn.dɪʃ/, /ˈkɑɹ.p(ɪ/ə).ɾɪŋ/, /ˈkɹɑi.sɪs/, /ˈbæiŋ.kɚ/, /ˌæ.ɾəˈsɾɛi.ʃ(ɪ/ə)n/, /ˈɛi.pɛks/, /ˈfɹi.dəm/, /ˈmæ.tɹ(ɪ/ə)s/, /ˈskwɔ.dɹ(ɪ/ə)n/, /ˈpɔl.tɹi/. Whoop whoop pull up Bitching Betty ⚧️ Averted crashes 00:58, 7 December 2022 (UTC)[reply]
Okay, Wells disagrees with you on every one of those. E.g. for 'selfish', Wells argues it's self.ish, forming a near-minimal pair with 'shellfish', which is syllabified shell.fish. Other dictionaries agree with some of yours but not others. E.g., you have short/lax vowels in open syllables in pe.trol and ma.ttress, which most treatments argue is not allowed in English. So it's not obvious how we should approach this. kwami (talk) 01:06, 7 December 2022 (UTC)[reply]
That's confusing to me, because I interpret the aspiration of a stop in petrol, mattress, paltry as an indication that it comes at the beginning of a syllable, so they would have a syllable onset with /tɹ/ or /t͡ʃɹ/. Similarly with Wisconsin, some people pronounce the c as aspirated [kʰ] and others don't; that means to me that consonant cluster is either split across the syllable boundary /s.k/ or not /sk/. But I don't know how to harmonize this with the lax vowel rule. — Eru·tuon 14:02, 8 December 2022 (UTC)[reply]
Wells argues that /tr/ acts like an affricate in the syllabification of words like /ˈmætr.əs/, but it is not a popular solution. (In accents that don't affricate /tr/, is there any more aspiration here than in words like happy, apple or heckle?) Other alternatives are ambisyllabicity (not as unpopular, but there's far from a consensus in its favor) or concluding that English allows word-medial syllables to end in ways that word-final syllables cannot (this is not so implausible if we view the ban on words like */ˈmæ/ or */səˈmæ/ as having to do with minimal length requirements for feet, rather than restrictions on syllables).--Urszag (talk) 00:16, 9 December 2022 (UTC)[reply]
I often don't affricate the t in mattress and the r is still usually devoiced. But even when it's an affricate I think I'd still aspirate it. — Eru·tuon 03:45, 9 December 2022 (UTC)[reply]
Another possibility would be that the lax-vowel rule isn't an actual rule of English phonology, but merely a coincidental lack of words that violate the so-called rule; a point in favor of this theory would be that some (mostly-onomatopoeically-derived) words do exist which end in lax vowels, like eh and baa. Whoop whoop pull up Bitching Betty ⚧️ Averted crashes 06:32, 9 December 2022 (UTC)[reply]
That's not an accidental gap. Interjections frequently have their own phonotactics. E.g. you wouldn't say English is a click language because of tsk! tsk! or tchick! So yes, in lexical vocabulary, English words (and perhaps syllables) do not end in 'lax' vowels. kwami (talk) 08:39, 9 December 2022 (UTC)[reply]
Re "include the consonant in question twice", I'd say that'd be bad for a different reason than Andrew: if I'm understanding correctly, you're suggesting to write something like /-d.d-/, for a case where the word has a /-d-/ sound which is hard to pin down to one syllable or the other? But it's still one /d/; /-d.d-/ would wrongly say there are two consonants, the way there actually are in some words like bookkeeper, or wholely and solely /-l.l-/ when contrasted with holy, soul-y /-l-/. - -sche (discuss) 10:34, 6 December 2022 (UTC)[reply]
Umm, wholly and solely aren't contrasted with holy and souly; they're homophonous with the two latter words. Whoop whoop pull up Bitching Betty ⚧️ Averted crashes 01:04, 7 December 2022 (UTC)[reply]
They're homophones for me as well, but contrastive according to Longman. That may be an RP/GA difference, I don't know. kwami (talk) 01:07, 7 December 2022 (UTC)[reply]
Markedly different for me as a BrE-speaker, both in terms of vowel sound (cf. goat split) and gemination, so probably. —Al-Muqanna المقنع (talk) 01:09, 7 December 2022 (UTC)[reply]
Likewise - different for me. The 'l' is clearly lengthened in wholly and solely, but not in holy and souly. Theknightwho (talk) 02:13, 7 December 2022 (UTC)[reply]
It isn't even an RP/GA difference, as American dictionaries also acknowledge the double l in solely, vs single l in holy. Some speakers don't distinguish them is about as much as can be said, and merger seems to be more common for some words (like wholly, where the original morphemic division whole+-ly has become obscured) than others (like solely and bookkeeper where the fact that they're composed of different parts, one of which ends with /l/ or /k/ and the other of which begins with it, is transparent). Checking Cambridge, the old Century, Collins, Dictionary.com, Longman, MacMillan, Merriam-Webster, the old OED, and Oxford Learner's, all of them have double k as the only option for bookkeeper or bookkeeping (none allow single /k/), and all of them have double l as the only option for solely except MW which allows either double or single /l/. For wholly, Cambridge, Collins, Longman and Oxford Learner's have only double l, Dictionary.com and MW and the OED allow either double or single /l/, Century allows only single /l/ and MacMillan has single /l/ for the US and double for the UK. - -sche (discuss) 06:32, 7 December 2022 (UTC)[reply]
The problem of consonants being ambisyllabic / hard to pin down to one syllable or another is a known/longstanding problem, but it's a problem we face regardless (when we have to insert stress markers), and I don't think we should start removing syllable breaks as a result. (I also don't think "notate syllable breaks like other dictionaries generally do" requires "also list every alternative syllable-breaking scheme any phonologist anywhere has devised".) I will say, in the specific case at hand, /ˈvæʃ.ti/ seems to be a better analysis than /ˈvæ.ʃti/; it's my understanding that English speakers prefer to avoid ending syllables with checked vowels like /æ/ whenever possible (as is readily possible here by ending the syllable in /æʃ/ instead), and Dictionary.com also breaks it as /ˈvæʃ.ti/, and although Collins just has /ˈvæʃti/ without a syllable break marked, they list an alternate pronunciation /ˈvæʃˌtaɪ/ where they do mark the break. - -sche (discuss) 10:34, 6 December 2022 (UTC)[reply]
Can you give an example of where stress marking would require us to decide on ambisyllabicity?
It's not just ambisyllabicity, but that Whoop's idea of "obvious" contradicts mine, that Wells contradicts what is obvious to all three of us (e.g. he has /ˈvæʃt.aɪ/), and that respected dictionaries contradict each other. Given that, how are we to decide how to syllabify words consistently? kwami (talk) 10:48, 6 December 2022 (UTC)[reply]
Also, it's important to remember that we're not transcribing pronunciations here, but rather the phonemic abstractions that underlie the pronunciations. Phonemic analysis may produce something different than what we'd see in a spectrogram. E.g. ambisyllabic consonants might be necessarily codas or onsets phonemically, regardless of how they're realized phonetically. kwami (talk) 11:04, 6 December 2022 (UTC)[reply]
Re "Can you give an example of where stress marking would require us to decide on ambisyllabicity?": well, since you're arguing syllable divisions are inherently or widely ambiguous and hard to decide on, the answer is that any word where the stress isn't on the first syllable will require deciding where, exactly, relative to the word's various consonants and vowels, to insert the stress marker, just as we decide where to insert the /./. - -sche (discuss) 06:32, 7 December 2022 (UTC)[reply]
I don't think it's that bad. Most analyses agree pretty unanimously on syllabifying a consonant that comes between a reduced fully unstressed vowel and a stressed unreduced vowel with the following vowel, e.g. I think hardly anyone would argue for ambisyllabicity of /l/ in a word like political /pəˈlɪtɪkəl/ or of [m] in a word like information /ˌɪn.fɚˈmeɪ.ʃən/. The only type of word where I can imagine it being argued that the consonant at the start of the stressed syllable is ambisyllabic are certain words with an unreduced vowel before the stressed syllable, especially if it is a short/"lax" vowel, such as tattoo, elasticity, plasticity.--Urszag (talk) 08:06, 7 December 2022 (UTC)[reply]

"English words naturally fall apart cleanly into separate syllables" is certainly false. If this were true, there would be no disagreement among theoreticians about how to syllabify English words, but there is, as kwami observes. The perception of syllables by lay speakers is also variable in a number of cases and can be influenced by spelling (see e.g. David Eddington, Rebecca Treiman & Dirk Elzinga (2013) Syllabification of American English: Evidence from a Large-scale Experiment. Part I, Journal of Quantitative Linguistics, 20:1, 45-67, DOI: 10.1080/09296174.2012.754601). Many aspects of syllabification are entirely predictable, and so not that helpful to display; however, there are some small contrasts in pronunciation that in some systems like that of Wells constitute examples of contrastive syllabification, which we ideally would be able to display somehow (either by means of marking syllable boundaries, or in some other way). These contrasts usually involve one of the items having an "unpredictable" syllable division due to an intervening morpheme boundary: hopefully, the placement of those will not be controversial, since the position of morpheme boundaries is generally clear. Of the linked examples, cupola, Vashti, Monty Python, vindaloo don't seem to benefit from showing syllable divisions. But for t-girl and understudy, I think the transcriptions /ˈtiɡɝl/ and /ˈʌndɚstʌdi/ leave some useful information out: they would benefit from either showing a syllable division marker as /ˈti.ɡɝl/ and /ˈʌndɚ.stʌdi/ or a secondary/tertiary stress marker as /ˈtiˌɡɝl/ and /ˈʌndɚˌstʌdi/. I think I would perceive a slight difference between the rhymes found in these and in hypothetical words "league-earl" and "underce-tuddy" or "underst-uddy". These examples show that transcription of a secondary/tertiary stress after the main stressed syllable in a word is often an alternative possibility to the hypothesis of contrastive syllable divisions.
Another example, from Wells, where the distinction that Wells reports making could be explained either in terms of contrastive syllable division or contrastive secondary/tertiary stress is "selfish" (with default syllable division, whatever you think that is, and definitely no stress on the second syllable) vs. "shellfish" (per Wells, /ˈʃɛl.fɪʃ/; per our current transcription, /ˈʃɛlˌfɪʃ/).--Urszag (talk) 13:33, 6 December 2022 (UTC)[reply]

It will indeed be useful to mark syllable boundaries in some cases. Another possibility might be to write compounds with a space between the elements in the IPA. We shouldn't use the stress marker as a syllable marker, though: that should only be for stress, and none of your examples have secondary stress. We might want to have a guideline something like "the syllable break should only be used to separate vowels and at morpheme boundaries." Currently we say that it needs to be used for a vowel sequence that would otherwise be ambiguous. kwami (talk) 01:13, 7 December 2022 (UTC)[reply]
With "We shouldn't use the stress marker as a syllable marker" do you mean we should use both the syllable break and stress marker before a non-initial stressed syllable? We try not to do that on Wiktionary; that's regarded as an error and tracked in Category:IPA for English using .ˈ or .ˌ. I get the impression it's avoided on Wikipedia as well. — Eru·tuon 14:08, 8 December 2022 (UTC)[reply]
<.ˈ> and <.ˌ> are correct IPA, but no, that's not what I meant. I meant that we should not use <ˌ> as a substitute for <.> on a non-stressed syllable. kwami (talk) 04:28, 9 December 2022 (UTC)[reply]
Good, I think everybody would agree with that. — Eru·tuon 14:29, 9 December 2022 (UTC)[reply]
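As an aside, the kind of check behind that tracking category can be sketched in a few lines. This is only a minimal Python illustration, assuming transcriptions are plain strings; it is not the code Wiktionary's modules actually use, and the function name has_redundant_break is hypothetical.

```python
import re

# A syllable break "." immediately followed by a primary (ˈ) or
# secondary (ˌ) stress mark, i.e. the sequences ".ˈ" and ".ˌ".
REDUNDANT_BREAK = re.compile(r"\.[ˈˌ]")


def has_redundant_break(ipa: str) -> bool:
    """Return True if the transcription contains '.ˈ' or '.ˌ' (hypothetical helper)."""
    return REDUNDANT_BREAK.search(ipa) is not None


print(has_redundant_break("/ˌɪn.fɚ.ˈmeɪ.ʃən/"))  # break doubled up with the stress mark
print(has_redundant_break("/ˌɪn.fɚˈmeɪ.ʃən/"))   # stress mark alone marks the boundary
```

The check says nothing about whether a secondary stress mark is being used in place of a syllable break on an unstressed syllable; that still needs a human judgment about stress.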

At some point recently this was changed to be self-contradictory, but as far as I can tell the note-to-the-note is redundant to the usually=1 option. I guess another question is, if a Latin word is exclusively used as a taxonomic epithet and never inflected, shouldn't it just be listed as Translingual? —Al-Muqanna المقنع (talk) 13:00, 6 December 2022 (UTC)[reply]

Right, the current note looks pretty bad. The issue as I think of it so far is that these words are hypothetically supposed to have certain forms built according to Latin rules, but that doesn't mean that they are ever used in any other Latin context, and in practice I'm not sure taxonomic nomenclature should even be categorized as Latin anymore, given how little most coiners of new names actually are involved in a community of Latin speakers or writers. (I'm not sure whether the idea that these names are in Latin has been officially abandoned, or whether that varies depending on the codes according to which different types of organisms are named.) Listing as Translingual is OK; but as the note points out, it's an overgeneralization to say that taxonomic epithets are "not inflected except in the nominative singular"; plural forms are sometimes found and the genitive singular is not infrequently found in the formation of parasite names. So it is useful to provide some further information about inflected forms (if that information is available). There is no fixed pronunciation of these names, but if coined from Latin or Greek roots, the original vowel lengths may also be helpful information as in theory the stress should probably follow the Latin stress rule.--Urszag (talk) 14:16, 6 December 2022 (UTC)[reply]
The note-to-the-note isn't redundant to that parameter. Even if an epithet wasn't used in Latin, there can still be inflection as can be seen by ruderalis (German example, taxonomics, inflected in Dat./Abl. Sg.) and Homo neanderthalensis together with Citations:Homines neanderthalenses (various examples, with Pl.).
Indeed, the note without parameter (i.e. with the text: "Used exclusively as a taxonomic epithet and thus not inflected except in the nominative singular") makes no sense in Latin entries. It's not that taxonomic terms stay uninflected in Latin (like Gen./Dat./Acc./Abl. Sg. Homo sapiens). They are inflected the Latin way in Latin (as ruderalis shows). But some terms simply aren't (attested in) Latin (for which maybe see also Category:Pseudo-loans from Latin by language). --14:29, 6 December 2022 (UTC)
Yes, I see your point, taxonomic epithets can be inflected. In that case I think the template should be reworded—as it stands it just looks like two different editors arguing. —Al-Muqanna المقنع (talk) 14:33, 6 December 2022 (UTC)[reply]
Well, there are some similar issues even with non-taxonomic Latin names being displayed as "singular only". The proper name of an individual person is by its nature not pluralizable as such, but there are semi-productive ways to semantically coerce the meaning of plural proper names by giving them a meaning like "someone named X", "a person like X" or "a version/account of X", and in that case there is often no greater obstacle in Latin than in English to using a morphologically predictable plural form. E.g. consider the form Oedipōrum (currently marked with an RFV since we display Oedipus as "singular only"); I would say this is in reality simply no more or less possible in Latin than "Oedipuses" is in English (found in various contexts, e.g. "the Oedipuses of Harold Bloom and Gilles Deleuze"). Perhaps one could argue that we should have explicit sub-senses for names that have attested uses of that kind (and only for names with attested uses), but that seems a bit impractical and also not that valuable.--Urszag (talk) 14:46, 6 December 2022 (UTC)[reply]
I think another case worth considering is that taxonomic epithets were originally coined and discussed in Latin prose, and continued to be at least into the late 19th century. Epithets would naturally be used and declined in Latin in that context, e.g. here ("in sched. foliis ut in G. Burmauni Cass. et G. natalensi et abyssinica opacis"), where G[erbera] natalensis and G[erbera] abyssinica are in the ablative. —Al-Muqanna المقنع (talk) 15:08, 6 December 2022 (UTC)[reply]
Proper nouns: That's (IMHO) another topic. Proper nouns can be set in plural as pointed out above. For Hercules and Oedipus a plural is also mentioned in dictionaries. Though sometimes it's not the plural of a proper noun (with a meaning like multiple persons named X), but instead the proper noun turned into a common noun (person like X, with characteristics of X) and then for the common noun there is a plural. Example: There're Krösus (proper noun, a certain rich king) and Krösus (common noun, rich person, has a plural). --22:05, 6 December 2022 (UTC)

Per the above I've adjusted the template to remove the note-to-the-note and change the wording in the template's relevant forms to indicate that other inflections may be theoretical/rarely found as appropriate, rather than that they are theoretical. —Al-Muqanna المقنع (talk) 18:15, 10 December 2022 (UTC)[reply]

Let's add Jisho.org to the abuse filter

There have been edits that are based on copying from jisho.org. The problem with Jisho.org is that it is a tertiary source. Here is an example and its reversion. Here is another example. The content and tone of these edits tend to be a bit more informal than usual, and these have become slightly more frequent lately (the remaining ones that weren't reverted).

If this isn't WP:COPYVIO, this has a potential risk of being WP:CIRCULAR. Jisho.org is a site that aggregates information from different dictionaries to present a user-friendly display. It's like trying to cite Google.com. Perhaps one day it might even source information from Wiktionary itself, which would have us copying from our own mirror. Don't get me wrong, I use Jisho.org a lot to help me study Japanese. The thing is that it, like any tertiary non-expert source, needs to be cross-referenced and I do that with AnkiWeb and Google Translate. Before submitting to Wiktionary I go further and cross-reference against Yahoo Chiebukuro, DeepL, HiNative, and the underlying dictionaries that Jisho.org displays from. This at least resolves the copyright concerns especially with the restrictive EDRDG license.

If one does not want to look through all those sources, then one could at least cite the dictionaries that Jisho.org uses. That site states that it uses "the JMdict, Kanjidic2, JMnedict and Radkfile dictionary files". Those appear professional and are primary/secondary expert sources which are acceptable. Nippon Jisho is a different source that's probably fine, but Jisho.org isn't citable. The users adding these seem to be well-intentioned although beginner or intermediate Japanese students. Advanced students know how to consult a wide variety of sources like Japanese-Japanese dictionaries. If there is a warning before entering Jisho.org in the article bodies or edit summary, sources will be more critically examined and the quality of edits should improve. Therefore, I propose adding Jisho.org to the abuse filter. Daniel.z.tg (talk) 12:00, 10 December 2022 (UTC)[reply]

Narrow IPA norms for English

Let's try to make a list/table of how things should be represented in narrow IPA for GenAm (and British if possible). Appendix:English pronunciation already has a few notes, e.g. that word-initial /p t tʃ k/ are aspirated [pʰ tʰ tʃʰ kʰ], but we should try to cover as much as possible: "narrow IPA for morpheme-final /e/ (day, gayly) should be [___] while narrow IPA for /e/ before same-morpheme /l/ (gale-y) should be [___], [___]", etc, etc; then we could make an effort to add (consistent) narrow IPA to entries more routinely.
I figure, if we routinely have narrow IPA covering flapping, aspiration, dark L, vowel allophones, etc, it'll address some of the concern that we make it seem like certain things have the same vowels or consonants when they actually (allophonically) differ, while the broad IPA stays phonemic. But I figure we should establish agreed-on notations, not just encourage everyone to add whatever narrow IPA seems right to them, because recent discussions amply demonstrate that people are often both confident and mistaken in their assessments of what the typical GenAm (etc) pronunciation of something is. So, what norms can you think of for narrow IPA notations of GenAm, British, etc; e.g., in what situations is /u/ one thing and when is it another? - -sche (discuss) 20:57, 10 December 2022 (UTC)[reply]
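To illustrate what an agreed-on norm could look like once written down, here is a minimal Python sketch of the one rule already noted above (word-initial /p t tʃ k/ become [pʰ tʰ tʃʰ kʰ]). It assumes the input is a single word's broad transcription without the surrounding slashes; the function name aspirate_word_initial is hypothetical, and the sketch deliberately ignores every other allophonic rule.

```python
# Onsets that are aspirated word-initially; "tʃ" is listed first so that
# it is matched before the bare "t".
ASPIRATED_ONSETS = ("tʃ", "p", "t", "k")
STRESS_MARKS = "ˈˌ"


def aspirate_word_initial(broad: str) -> str:
    """Add ʰ after a word-initial /p t tʃ k/ (hypothetical helper, one rule only)."""
    # Skip any leading stress mark so that "ˈtɪp" is treated as starting with "t".
    i = 0
    while i < len(broad) and broad[i] in STRESS_MARKS:
        i += 1
    for onset in ASPIRATED_ONSETS:
        if broad.startswith(onset, i):
            j = i + len(onset)
            return broad[:j] + "ʰ" + broad[j:]
    return broad


print(aspirate_word_initial("ˈtɪp"))
print(aspirate_word_initial("ˈtʃɪp"))
print(aspirate_word_initial("ˈspɪt"))
```

A full table of norms would amount to an ordered list of such rules, and the ordering itself would be part of what needs agreeing on.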

What a mess. @Useigor Why have you created such a profusion of row/column templates when we already have {{col}} and variants such as {{col2}}, {{col3}}, etc. as well as {{top2}}, {{top3}}, etc.? Can you please explain what they accomplish that the existing templates don't? We need to clean this up, and I am going to undo all your changes unless there is a good reason for them and a clear plan to clean them up. Thanks. Benwing2 (talk) 01:59, 12 December 2022 (UTC)[reply]

Happy birthday Wiktionary!

Apparently it's our 20th birthday today. We should rename ourselves "Wikintionary" for the day, or week... This, that and the other (talk) 06:48, 12 December 2022 (UTC)[reply]

Congrats! That's a major milestone in a lifetime! And that joke deserves a round of applause too!   Noé 08:43, 12 December 2022 (UTC)[reply]
Here's to another 20 years! Vininn126 (talk) 10:23, 12 December 2022 (UTC)[reply]


Reminder to provide feedback on the Movement Charter content 

Hi all, 

We are in the middle of the community consultation period on the three draft sections of the Movement Charter: Preamble, Values & Principles, and Roles & Responsibilities (statement of intent). The community consultation period will last until December 18, 2022. The Movement Charter Drafting Committee (MCDC) encourages everyone who is interested in the governance of the Wikimedia movement to share their thoughts and opinions on the draft content of the Charter. 

How to share your feedback? 

Interested people can share their feedback via different channels provided below: 

If you want to help include your community in the consultation period, you are encouraged to become a Movement Charter Ambassador. Please find out more about it here

Thank you for your participation! 

On behalf of the Movement Charter Drafting Committee Mervat (WMF) (talk) 13:00, 12 December 2022 (UTC)[reply]