Wiktionary:Beer parlour/2024/August: difference between revisions

:::Does your belief apply only to rivers, or also to countries and regions (see above)? If the latter, my concern is that these usage notes would need to be added to every country and region, and would be more compactly conveyed in the headword (following the example of English and German, among others). [[User:Benwing2|Benwing2]] ([[User talk:Benwing2|talk]]) 21:35, 8 August 2024 (UTC)
::::Personally I'd only add usage notes when something deviates from the pattern. Of the countries mentioned so far that's just {{m|fr|Israël}}. [[User:Nicodene|Nicodene]] ([[User talk:Nicodene|talk]]) 22:41, 8 August 2024 (UTC)
:::::I'd just find it easier to include the definite article in the headword for learners, since it's not like it's particularly common to find them with the indefinite article. And then for the prepositions used, we definitely have to include usage notes or usexes (like fr.wikt) after having seen [https://www.btb.termiumplus.gc.ca/tpv2guides/guides/clefsfp/index-fra.html?lang=fra&lettr=indx_catlog_l&page=9iXDUHKL12ns.html#zz9iXDUHKL12ns this page for countries] and [https://vitrinelinguistique.oqlf.gouv.qc.ca/24891/la-syntaxe/les-prepositions/preposition-devant-un-nom/les-prepositions-devant-un-nom-detat-americain this page for U.S. states], which is what I've done at pages like {{m|fr|Alabama}} and {{m|fr|Barbade}}. [[User:AG202|AG202]] ([[User talk:AG202|talk]]) 04:37, 9 August 2024 (UTC)


== Bot rights ==

Revision as of 04:37, 9 August 2024


Language code for Baltic German

I would like to request adding a language code for Baltic German on this platform. A lot of Estonian terms (and Latvian terms) are derived from Baltic German and there's currently no real way of displaying that, other than Baltic {{der|et|de|<term>}}, which not only looks ugly, but is also wrong. It categorizes the term to [[CAT:Estonian terms derived from German]], which is incorrect, as there is a clear distinction (at least in Estonian) between terms derived from (High) German and the Baltic German dialect spoken here. As such, the code could also be etymology-only. In essence, the Baltic German dialect is a vernacular dialectal form of a mixture of High and Low German with a clearly recognisable regional flavour (Estonian and Latvian dialects) in pronunciation, morphology, syntax and vocabulary. EKI (Institute of the Estonian Language) has an online dictionary of Baltic German, with a myriad of sources for various terms: https://arhiiv.eki.ee/dict/bss/. I feel like having a language code for Baltic German is justified. Joonas07 (talk) 19:31, 1 August 2024 (UTC)[reply]

My comprehension of German is pretty rudimentary, but from my understanding, there is a distinction between Baltic German and central European German varieties, so I agree that this is justified. Joonas, do you know to what extent this is also true for Latvian or even Lithuanian? Danke. —Justin (koavf)TCM 19:37, 1 August 2024 (UTC)
Lithuanian barely has any influences from Baltic German, if at all. Are you asking whether this distinction exists the same way in Latvian? I'm not extremely familiar with Latvian, but I believe both of these languages have been influenced the same way by Baltic German, as the history is the same. A quick look at [[Category:Latvian terms derived from German]] as well makes me believe that is the case. Joonas07 (talk) 20:21, 1 August 2024 (UTC)[reply]
Ja, that was my question. I figured that the influence wouldn't be as strong due to the Polish–Lithuanian Commonwealth. —Justin (koavf)TCM 20:23, 1 August 2024 (UTC)[reply]
vernacular dialectal form of a mixture of High and Low German with a clearly recognisable regional flavour – this is fiction. It is either Standard High German with Baltic characteristics in vocabulary (e.g. Burkane, but admittedly they suffice for whole dictionaries which maintain borrowings from Low German and the local Baltic language) or it is Low German, with Baltic-influenced accent. I also speak German with Slavic twang due to speaking Russian, doesn’t mean I have created a new dialect or creole. Most commonly it is High German like “Austrian German” is High German, or just German. w:de:Baltisches Deutsch knows that even around 1600 High German “setzte sich durch” prevailed over Middle Low German – which seems exaggerated to me, but perhaps only by half a century, and Middle Low German ends in 1650 precisely at the point of being supplanted by High German for cultivated and literary purposes –, and then around the first half of the 19th century the academic upper class was oriented towards “a trim Standard German”. But Baltic Germans were only upper class, so there is no third language for even diglossia to fit in.
Baltic German should be no more than a label of German and occasionally Low German for any traces of it remaining; a distinction between (High) German and the Baltic German dialect spoken [in Estonia, Latvia, in St. Petersburg or the Baltics in general] is incorrect, as it was not present for speakers. It was like Euro-English in Brussels: behind it in Brussels there is Flemish and French, and in Reval (now Tallinn) and Dorpat (now Tartu) there is Estonian and, on another level, Russian. Fay Freak (talk) 20:35, 1 August 2024 (UTC)
I would argue the Baltic German varieties developed enough from their High or Low German origins to warrant an etymology-only code. There is a noticeable difference between you speaking German in a Russian accent, and German settlers in Estonia and Latvia speaking a variety of their language for hundreds of years. I definitely disagree that the distinction wasn't present for the speakers. Besides, that isn't even that important, as the distinction is present in target languages. Joonas07 (talk) 20:55, 1 August 2024 (UTC)
“German settlers … variety of their language”: there you have it, their language of the mainland.
The distinction is not present in the remaining sources either. It must be in use and not merely by declaration in target languages. Its texts look like Standard German texts with peculiar words we of course seek. E.g. the sentences quoted in Wörterschatz der deutschen Sprache Livlands – there aren’t actual dialect dictionaries. And all quoted on w:de:Baltisches Deutsch.
Of course a single word suffices for a Baltic German's speech to be marked and ridiculed further west, which they themselves did not expect, since they only knew one German, Standard German, not a diglossic situation of local Standard German plus dialect as is now known from Switzerland and Arabic countries. Hence the quoted Harry Siegmund from Liepāja writes about his stay in Königsberg, because of the sensitive nationalist climate in Germany: “Ich schwieg auch, weil ich fürchtete, mit meiner baltischen Sprechweise als Fremder aufzufallen und ihnen in jeder Hinsicht unterlegen zu sein.” (“I was silent for fear of attracting attention as a foreigner due to my Baltic mode of speech and being outgunned in every way.”) It was a mode of speech. This is the situation in its last 150–250 years. Going further back, the variance is within the standard variance of all Early New High German and Middle Low German. For a 16th-century text it is highly problematic to e.g. claim it as specifically Swabian or Category:Alemannic German language instead of Early New High German, which was just developing as a standard. And then in the Baltics you don’t even have a solid basis of untarnished dialect speakers, because the peasants spoke Estonian and Latvian, and “German” were those who worked in administration and churches and their language (occasionally also a Russian, an Englishman, or a Swede; your target language sources may generalize it). This is quite different also e.g. from the Volga German situation, which were homogenous German societies with little if any Russian or Turkic etc. encroachment until Sovietization.
This does not mean, though, that we can’t have “Baltic German” as an etymology-only language. As I implied with Austrian German, we can have a code; I probably would have added it myself if I had cared enough about the variety. For your purposes you should know it is still German, however. It reminds me a bit of the pedants amongst Hungarian editors who liked to be sure whether a Hungarian word is borrowed “from German”, “from Austrian German” or “from Bavarian”. Linguistic works vary in the declaration. There is no actual idea behind such questions. Fay Freak (talk) 22:28, 1 August 2024 (UTC)
Yeah, I didn't mean it in the sense that it has developed into a language of its own right, rather that the variety of Standard German that is Baltic German has developed far enough to be notable. Didn't quite understand what you're getting at in the second half of your third paragraph. Re: Hungarian, it doesn't hurt to be exact. I don't know what you mean by "there is no idea behind such questions". For Estonian, it is often significant whether a word was borrowed from German or Baltic German. Joonas07 (talk) 23:30, 1 August 2024 (UTC)[reply]
This significance I understand. It might be from the historical German there or it might have intruded into the standard from present Germany, or even its predecessor Reich. Similarly one judges whether a word entered Ethiosemitic from Egyptian Arabic or Yemenite or Ḥijāzi usage, but all under one Dachsprache. I feared that you tried to introduce a distinction that is impossible to make out, exaggerating diglossia. The Hungarian etymology statements are more fanciful than reliable in this respect. Fay Freak (talk) 23:52, 1 August 2024 (UTC)[reply]
What is your preferred code for Baltic German then? de-BAT? de-BLT? There seem to be no codes for geographic regions comparable to ISO 3166. Region code means something different. Fay Freak (talk) 22:37, 1 August 2024 (UTC)[reply]
That's a good question. Does it have to be in the format <language code>-<REGION CODE> to be an etymology-only code? Joonas07 (talk) 23:32, 1 August 2024 (UTC)[reply]
@Joonas07: There is no rule, compare the list of etymology-only languages in WT:LOL/E, so I can only inquire about preferences. Fay Freak (talk) 23:52, 1 August 2024 (UTC)[reply]
I don't really know. de-bal maybe? ger-bal? The list you linked does indeed seem to have various formats (btw, I really enjoy the abbreviation WT:LOL): some are just three-letter codes, but I don't know if there's an intuitive one for Baltic German that's not already in use. Some start with gsw-, which I gather is for High German varieties? So there seem to be some conventions. I'm open to suggestions. Joonas07 (talk) 00:23, 2 August 2024 (UTC)
@Joonas07: de-cle For Curonia, Livonia and Estonia, because they called their storage-chambers by Proto-Slavic *klětь and we store information here. I will go to sleep now, before implementing it. Fay Freak (talk) 00:43, 2 August 2024 (UTC)[reply]
Let's just do de-bal. That's analogous with other language varieties not from a specific country. Joonas07 (talk) 00:57, 2 August 2024 (UTC)[reply]
Done, @Joonas07. Fay Freak (talk) 11:25, 2 August 2024 (UTC)
@Joonas07: I have added the online dictionary of Baltic German for you as a reference template, {{R:de:BSS}}. Most of the dictionaries and whole sentences quoted from Baltic German therein are in Standard German. Schiller-Lübben is for Middle Low German. Fay Freak (talk) 23:04, 1 August 2024 (UTC)
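
For illustration, the etymology markup from the original request might change as follows now that a code exists. This is a sketch only: `de-bal` is the code agreed on in this thread, `<term>` is the placeholder from the request and is left unfilled, and the remark about categorisation is an assumption about how etymology-only codes usually behave rather than something confirmed here.

```wikitext
<!-- before: plain German code with a manual "Baltic" prefix; categorises the entry under German -->
Baltic {{der|et|de|<term>}}

<!-- after: the etymology-only code agreed on above; assumed to categorise under Baltic German instead -->
{{der|et|de-bal|<term>}}
```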

AWB access

Hello, I would like to request access to the AutoWikiBrowser tool. I have been contributing significantly by adding entries in Old Tupi and Guaraní, and I often need to correct some inaccuracies in the entries of these languages. Furthermore, the creation of Old Tupi entries only really started to take off last year; we are in a somewhat unstable phase where some quotation templates are occasionally renamed. For what it's worth, I already have access to AutoWikiBrowser on enwiki. Thank you, RodRabelo7 (talk) 04:48, 2 August 2024 (UTC)[reply]

This seems uncontroversial, based on edits such as this. I'm not familiar with the language, but your work seems reasonable to me. Please ping me if no one else grants access in a week. I'll try to check in on this thread to see if there are any other comments. —Justin (koavf)TCM 04:53, 2 August 2024 (UTC)[reply]
Just in case, a decent and modern Old Tupi grammar in English is Ferraz Gerardi's A Role and Reference Grammar Description of Tupinambá. RodRabelo7 (talk) 04:56, 2 August 2024 (UTC)[reply]
Obrigado. —Justin (koavf)TCM 05:04, 2 August 2024 (UTC)[reply]

Hyphenation for Row-Splitting versus a Word that Might or Might Not Normally have Hyphenation

Dear Wiktionary: If a word could normatively be interpreted as either needing hyphenation or not needing hyphenation, and it is hyphenated by a row-splitting hyphenation, how do I take a verbatim quote of that sentence for a Wiktionary citation? This actually comes up A LOT for me, because formal Wade-Giles includes hyphenation, while informal Wade-Giles and postal romanization do not include hyphenation, so many words "could go either way". What I did in this case: diff on the Zichang page was make a context-based decision (i.e. this sentence did not fall out of a coconut tree; in the context of the book and the other usage of the word in a different entry of the dictionary, it appears that the authors might likely have intended that this hyphen is more than just a row-splitting hyphenation). But I also want to imagine what could be unburdened by what has been before (that is, the author may have intended non-hyphenation for this specific instance, even if the publisher did hyphenate for the row-split, and even if the same word was hyphenated elsewhere, and even if other similarly situated words in the book are hyphenated). Thanks for any guidance. Yours Truly, --Geographyinitiative (talk) 11:26, 2 August 2024 (UTC) (Modified)

I'm not sure how familiar you are with CSS and HTML, but have you by chance seen these web design solutions?
I think these kinds of solutions will work for what you're going for, which may involve inserting raw HTML/CSS rather than a template or other wikitext. —Justin (koavf)TCM 11:41, 2 August 2024 (UTC)
Thanks-- Okay, I'm looking at this, but does this coding allow me to signal to the reader that, within the context of the published book, there is an ambiguity as to whether the hyphen is merely a row-splitting hyphen or actually a part of the word proper (i.e. the hyphen would have been included if the word were not on the edge of a row)???--Geographyinitiative (talk) 11:47, 2 August 2024 (UTC)[reply]
I think your solution of an HTML comment is probably the best you can do. —Justin (koavf)TCM 11:52, 2 August 2024 (UTC)[reply]
If the surrounding context makes the intended usage clear (for example, if the same document has examples within a single line of the same word/proper noun spelled with or without a hyphen, or of analogously formed words or names) it seems fine to follow that. In cases where that can't be determined, I would say it should be considered whether these specific quotations are really essential.--Urszag (talk) 12:20, 2 August 2024 (UTC)[reply]
If it’s significant (for example, because a term has both hyphenated and non-hyphenated forms), I indicate this as “roly[-]poly” in a quotation. You can also use the template {{quote-gloss}}, which results in “roly[-]poly”. — Sgconlaw (talk) 15:26, 2 August 2024 (UTC)[reply]
As to Sgconlaw's comment, I am not sure if the hyphen is a gloss on the quote, and I don't want to misuse quote-gloss, though I see how this could be good.
Concerning Urszag's comment (that I have usually agreed with) that "In cases where that can't be determined, I would say it should be considered whether these specific quotations are really essential." I have to admit that I have followed that line of thinking before. However, I have later come to feel that I really don't want to cause a bias in my citations by just blatantly ignoring a category of ambiguous situations in English. So I really want to embrace the citations as I come to them. There should be a normative way to deal with this category of scenario besides "fuck it". This is not a lowly or vulgar usage of English; this is a category of ambiguity that is baked into English, and I believe Wiktionary should have a way to confront the situation head-on and properly cite them as what they are. In the above quote the word "An-ting" is not the essential word, but instead the rare word "Tzu-ch'ang" which is super rare because it is a Wade-Giles name derived from a communist-only Chinese original (Taiwan did not use it), so Tzu-ch'ang is pretty rare, and the book is pretty authoritative. So I want to deal with "An-ting" in the "right" way that fully acknowledges the ambiguity rather than do my grab ass horseshit of writing something in the html. I came up with that shit ages ago as a work-around; now, I want to fucking do it beautifully and make it clear to the reader of the quote what the fuck is happening, and unambiguously tell the reader that there is a potential ambiguity. --Geographyinitiative (talk) 22:35, 2 August 2024 (UTC)
@Geographyinitiative, Sgconlaw: The purpose of {{quote-gloss}} is to contain text not present in the original text, so An{{quote-gloss|-}}ting would mean "there was no hyphen in the original, but there was supposed to be" — probably not your goal. The OED does something like this: An-ting [variant reading Anting]. Although I prefer something more explicit like this: An-ting [or Anting, if the hyphen is a line-breaking hyphen] Ioaxxere (talk) 01:53, 8 August 2024 (UTC)[reply]
@Ioaxxere: ah, true. In that case I’d go with the first option I suggested which is to indicate the hyphen as “[-]”. — Sgconlaw (talk) 05:08, 8 August 2024 (UTC)[reply]
Following Ioaxxere's comment, from now on I plan to explore the possibility of making these kinds of edits: diff. It's so murky. --Geographyinitiative (talk) 07:07, 8 August 2024 (UTC)

Unless someone comes up with a better solution, I'm going to leave this quote (diff) as is. I'm going to eventually take this topic to Grease Pit to see if a real solution can be created for this kind of ambiguous situation. However, right now, for this quote, I don't think this quote is a good "model case" for the larger problem because I really feel that the context of the book itself more heavily favors the hyphenated form of "An-ting" than unhyphenated "Anting". But I'll keep this case in mind and come back to it later; please ping me if you have more help/input/advice on the topic generally. --Geographyinitiative (talk) 23:56, 2 August 2024 (UTC)[reply]

@Ioaxxere: I thought OED indicates variant readings when there are multiple versions of the same work, and some use one form of a term and some use another form. That’s how I interpreted it anyway. — Sgconlaw (talk) 11:21, 8 August 2024 (UTC)[reply]
I do agree that it would be nice to have a standardized solution. I've come across this issue more than once and it can be annoying when it's a rare word and I'm trying to figure out whether the hyphenated form is more common or not. Andrew Sheedy (talk) 05:00, 3 August 2024 (UTC)[reply]
Yeah guys, please keep me in mind if you come up with a good solution for this. I will keep Sgconlaw's use of quote-gloss in mind. But I really want to give readers of a quote the full picture on the quote and not either (a) ignore the potential ambiguity, (b) just opt not to use the quote, or (c) use the quote anyway without fully acknowledging potential ambiguity in some way that the reader can see without misusing quote-gloss (in my opinion) or using a work-around or similar, or relying on my personal assessment of what the author meant to pick one over the other. --Geographyinitiative (talk) 10:22, 3 August 2024 (UTC)[reply]
Bit late here, but I follow OED in using [-] in these kinds of cases. @Geographyinitiative This, that and the other (talk) 10:14, 8 August 2024 (UTC)[reply]
@This, that and the other Is that right? Is there a paper about this? I'd like to learn about the finesse behind when they use [-] and use it the same way they use it. It's very bizarre looking to me, so I want to be 100% clear what I'm doing if I follow that method- (1) EXACTLY what specific situations is it used in? (2) EXACTLY how is it formatted? (3) Do other dictionaries deal with this issue in a similar manner? (4) Is there any clear policy-level guidance on this issue anywhere in Wiktionary? If not, why not? Should it be created? --Geographyinitiative (talk) 11:02, 8 August 2024 (UTC)[reply]
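
For reference, the notations floated in this thread can be laid side by side in wikitext. This is only a summary sketch: the first line applies Sgconlaw's “roly[-]poly” convention to the “An-ting” case, which is an extrapolation, and none of these forms is established policy.

```wikitext
<!-- bracketed hyphen: the hyphen may or may not belong to the word (Sgconlaw; This, that and the other) -->
An[-]ting

<!-- OED-style variant reading (Ioaxxere) -->
An-ting [variant reading Anting]

<!-- more explicit editorial note (Ioaxxere) -->
An-ting [or Anting, if the hyphen is a line-breaking hyphen]

<!-- probably not the goal: per Ioaxxere, quote-gloss marks text NOT present in the original -->
An{{quote-gloss|-}}ting
```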

List and topic categories again (how many types, and how to name them)

I notice CAT:en:Waterfalls says it's for "names of specific waterfalls, not merely terms related to waterfalls, [nor] types of waterfalls." But even before I added to it, almost all its contents were related/type terms (and I could add more: byfall, catadupe, maybe spray bow, foambow, plunge pool, stickle, huck).
I could solve this by changing "Waterfalls" to a "related-to" category; in this case, that wouldn't even cause other languages much hassle, as other languages barely use it. However... I think it is reasonable to have a category for specific Falls too, like we have for cities. But what could it be called?
We use "CAT:en:NAME" for both set categories ("terms for seasons, not merely terms related to seasons. It may contain [...] types of seasons [or...] names of specific seasons"), related-to categories ("This is a "related-to" category. It should contain terms directly related to winter"), and name lists. In our schema, the category for terms related to waterfalls or which are types of waterfalls, and the category for names of specific waterfalls, should both be "CAT:en:Waterfalls" AFAICT.
And because type isn't predictable from name, some people (reasonably!) think e.g. Category:en:Cities is named like a set category and put "capital city", eperopolis et al in it (and where else should these go?), while other people think it's named like a related-to category, or (yes) a name category... so, like many categories, its contents are a mix.
A solution would be to specify the purpose in the name: ":en:set:Seasons", ":en:topic:Winter", ":en:names:Cities"... but this highlights another issue: does it make sense that :en:Winter can include wintery, but :en:Seasons says it shouldn't contain seasonal? Maybe not! (And is it unmaintainable, anyway? It seems like in practice the more fine-grained distinctions we assert, the less well people maintain them.)
Should we merge "sets" into "related-to" categories, so "CAT:en:Seasons" could contain summer and seasonal? (In theory, set categories could just be ====Hyponyms==== sections of entries like [[season]], not needing to be categories at all.) - -sche (discuss) 21:24, 3 August 2024 (UTC)[reply]

@-sche: I would be in favour of merging the two types of categories, as I don't really think the distinction is easy to maintain. Alternatively, if it is felt that in some cases it is appropriate to have a "name" category, maybe the default should be a related-to category (for example, "Category:Cities") and the "name" category should be a subcategory called "Category:Names of cities". — Sgconlaw (talk) 23:57, 7 August 2024 (UTC)[reply]
@-sche: I prefer a naming scheme that makes the purpose clear, so you might have "types of waterfalls", "waterfalls", and "particular waterfalls" for the three kinds of categories. Ioaxxere (talk) 01:36, 8 August 2024 (UTC)[reply]
@Ioaxxere: I don’t feel it’s necessary to distinguish between “Category:Waterfalls” and “Category:Types of waterfalls”. I’m somewhat concerned that if we have distinctions which are too fine we are just going to get editors dumping everything in “Category:Waterfalls”. — Sgconlaw (talk) 11:23, 8 August 2024 (UTC)[reply]

Something I found neat in our PIE entries is the feature in WT:AINE allowing the splitting of reconstructed PIE terms by morpheme with hyphens in the alt parameter of links in Derived and Related terms. Not only does it allow more derivation transparency, but also you can square-bracket link the individual morphemes involved so less familiar visitors can be taken to the compositional morphemes to learn more about them.

I would like to amend WT:Reconstructed terms to allow this practice to be used on other proto-language pages, not just PIE (and not on non-proto-language entries).

The amendments to WT:Reconstructed terms#Entries would be something like this, derived from language at WT:AINE:

Separating hyphens can be used in the displayed form of links in Derived terms and Related terms sections of proto-language pages to clarify the formation, as long as it is not used in the page name itself.

Ceso femmuin mbolgaig mbung, mellohi! (投稿) 08:54, 4 August 2024 (UTC)[reply]
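
As an illustration of the proposed wording, a hyphenated display form would go in the alt (third positional) parameter of a link, while the page name itself stays unhyphenated. This is only a sketch: the Proto-Indo-European term and its segmentation below are assumptions chosen for illustration and do not come from this discussion.

```wikitext
====Derived terms====
<!-- illustrative only: links to the unhyphenated page name, displays the morpheme-split form -->
* {{l|ine-pro|*bʰéreti|*bʰér-e-ti}}
```

Under the amendment, the same pattern would be available on other proto-language pages by swapping in that language's code.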

Strong support, have been doing this for non-proto reconstructions in, e.g., Prakrit and Ashokan Prakrit. It will be nice as a frequent reader of Proto-Indo-European entries as well, though I understand that this is obviously not always possible when there are factors like sandhi in play. Svartava (talk) 09:24, 4 August 2024 (UTC)[reply]
Please don't do this for lower-branched Uralic Proto-languages. I think this is not helpful for agglutinative languages overall.
Not sure I like it for PIE, either, but it is kind of a tradition in IE linguistics, so I guess. For languages where this isn't done in literature - not sure it's helpful. Thadh (talk) 09:51, 4 August 2024 (UTC)[reply]
As Thadh points out, this is something that needs to be decided on a language-by-language basis. If Proto-North Caucasian feels this works best for them, godspeed. What I am opposed to is doing so on Proto-Italic and Proto-Celtic entries, as you have been doing. Those need to go to a vote, because the status quo is not to have hyphens. --{{victar|talk}} 17:23, 4 August 2024 (UTC)
...which is why I posted here in the first place, to narrow down the boundaries of such a vote? — Ceso femmuin mbolgaig mbung, mellohi! (投稿) 18:29, 4 August 2024 (UTC)[reply]
Since when have we needed votes for a content issue like this? Theknightwho (talk) 19:21, 4 August 2024 (UTC)[reply]
I don't see any policy prohibiting morpheme hyphens elsewhere... — Ceso femmuin mbolgaig mbung, mellohi! (投稿) 19:29, 4 August 2024 (UTC)[reply]

Vote has been drafted

@Svartava, Thadh I have started a vote at Wiktionary:Votes/2024-08/Allow hyphens in link displays for Indo-European proto-languages. Feel free to discuss or ask for amendments. — Ceso femmuin mbolgaig mbung, mellohi! (投稿) 19:20, 4 August 2024 (UTC)[reply]

This is a silly vote. As you pointed out, there is no policy prohibiting hyphens in entry links, let alone alternatives to links. Again, it is up to communities to decide what conventions they use. If you want to change status quo conventions for Proto-Italic, start a vote on that, like at Wiktionary:Requests_for_deletion/Reconstruction#Proto-Italic_terms_with_only_one_descendant. --{{victar|talk}} 21:55, 4 August 2024 (UTC)
@Victar You are the one who suggested a vote. Pick a lane. Theknightwho (talk) 01:56, 6 August 2024 (UTC)[reply]
I suggested a vote specific to Proto-Italic or Proto-Celtic. --{{victar|talk}} 03:08, 6 August 2024 (UTC)[reply]

Latin months: nouns or proper nouns? Capitalized or uncapitalized?

Another Latin "proper noun" question. Currently, there seems to be no standardization in how we format entries for Latin month names. Aprīlis (April) only has a capitalized entry, and is marked as an Adjective and Noun. Maius is marked as an Adjective and Proper noun; there is a stub at maius noting it is an Alternative letter-case form. On the other hand, iānuārius is used as the main entry (Adjective and Noun) while Iānuārius is marked as an Alternative letter-case form. Contributing further to the mess, Category:la:Months includes multiple variants of some names such as Jānuārius.

What should the main entries be, what POS should be used, and how much information should be included in the alternative case form entries? In English, the POS of months is treated as "Proper noun". Urszag (talk) 11:10, 4 August 2024 (UTC)[reply]

Do any Latin dictionaries indicate when something is a proper noun? (In English, one hurdle to consulting other dictionaries about whether some class of word is a common noun or proper noun has been that many lazily have just one 'noun' category into which everything goes.) I seem to recall the fact that Russian month names are listed as uncapitalized common nouns being the result of a discussion where Russian editors argued for that based on how Russian references/speakers treated them.
Do you have a sense of whether modern editions of Latin texts usually capitalize month names, the way they usually capitalize personal and place names? Poking around Google Books, it looks to me like "modern" Latin texts (actually, everything that turns up, from texts written in Latin in the 1500s and 1600s to recent editions of ancient Roman works) almost always capitalize month names, which suggests the capitalized forms should be the main entries. - -sche (discuss) 15:52, 4 August 2024 (UTC)
I'm not familiar with any Latin dictionary that indicates proper nouns. Typically they just mark nouns or proper nouns by providing the gender (m, f, n); DMLBS also makes some use of the label "sb." (substantive) for both nouns and proper nouns. In my experience, capitalization is the usual editorial convention.--Urszag (talk) 17:26, 4 August 2024 (UTC)[reply]
Since it seems (by capitalization) that Latin dictionaries treat months as capitalized proper nouns, I would argue we should do the same. Likewise the adjectives should be capitalized. Benwing2 (talk) 18:39, 4 August 2024 (UTC)[reply]
Capitalisation doesn't mark proper nouns: Several dictionaries also capitalise other adjectives like Homēricus (Homeric), Rōmānus (Roman) and common nouns like Rōmānus m (Roman (person)).
As for months and spellings: It's also a matter of attestation. Are both forms, like Februārius and februārius, always attested?
Likewise for months and POS: Are both mēnsis Februārius/februārius (or something like: Kalendae Februariae/februariae, Nonae Februariae/februariae, Idus Februariae/februariae) and simply Februārius/februārius m always attested? --16:18, 6 August 2024 (UTC)
I don't really understand the second part of this comment. Ancient texts don't use capitalization, so there is no relevant ancient attestation distinguishing the two. Pretty much every modern edition I've seen (or modern Latin works, such as "Lingua Latina Per Se Illustrata") follows the convention of capitalizing the names of Latin months. This isn't restricted to English editors either: you can see "Augustus" capitalized in French texts such as the Gaffiot dictionary. I did see some lowercase examples of Latin month names on Google Books (e.g. "mensis augustus") so they are also attested, but I'm confident that uppercase is currently the more usual convention.--Urszag (talk) 15:02, 7 August 2024 (UTC)[reply]
Another example of capitalization: "Datum Romae, apud S. Petrum, die XIX mensis Martii, in sollemnitate Sancti Ioseph, anno MMXVIII, Pontificatus Nostri sexto" in Pope Francis's Gaudete et exsultate (2018).--Urszag (talk) 15:25, 7 August 2024 (UTC)[reply]
I found an older discussion from when month names were moved to lowercase versions: Wiktionary:Tea_room/2015/June#Latin_month_names. It looks like EncycloPetey based this on (some edition of?) "Josip Lučić Spisi Dubrovačke Kancelarije, a series of legal documents in Latin from Ragusa in the late 13th century". I'm not convinced yet that the cited text is representative of medieval usage as a whole, or that medieval usage should be relevant compared to the typical usage of more recent centuries, but I wanted to link to that discussion for greater context. I have already started moving the names (back) to capitalized versions based on the input from -sche and Benwing2.--Urszag (talk) 15:38, 7 August 2024 (UTC)[reply]
The publication in question was the source of citations, used because it was the easiest at hand, and because the text had both capital and lowercase lettering. A search of other medieval records containing dates should be able to furnish additional citations, as long as the scribe wrote out month names rather than numbers. At the time of the earlier discussion, the Latin months were treated as adjectives because the available citations in both classical and medieval Latin demonstrated use as adjectives. Modern dictionaries and Modern Latin do use capitalized forms, but Augustus is not a good example, since it specifically derives from the name of a person. Capitalization of months like october and november would be stronger evidence for capitalization, but as I say, evidence at the time suggested the practice of capitalizing month words was a modern practice. --EncycloPetey (talk) 16:30, 7 August 2024 (UTC)[reply]
Thank you for the clarification; so this is a compilation which is being cited as showing multiple independent examples of medieval usage? I guess it seems to me that the first question to be resolved (before getting into the question of what typical medieval usage was) would be whether capitalization on Wiktionary should be based on modern capitalization practices (e.g. "Datum Romae, Laterani, die XV mensis Octobris, in memoria sanctae Teresiae a Iesu, anno MMXXIII, Pontificatus Nostri undecimo", Est utique fiducia/C'Est La Confiance, 2023) or on medieval capitalization practices. I think that in general, we follow modern practices for spelling Latin words in entry titles; e.g. the use of "ae" and "oe" rather than æ, ę, œ, although I guess it is often difficult to distinguish between Classical conventions and modern conventions.--Urszag (talk) 17:09, 7 August 2024 (UTC)[reply]

Our treatment of MIA reconstructions

@Pulimaiyi, Kutchkutch, Svartava (feel free to ping others, no idea who is interested in this stuff these days): There are many terms that are only attested across several New Indo-Aryan languages but not at any earlier stages of Indo-Aryan. Sources like Turner's {{R:CDIAL}} reconstruct ancestral forms for such cognate sets, but due to phonological degradation (e.g. consonant cluster assimilation) the reconstructions can only go back to Proto-Middle Indo-Aryan rather than a language we clearly know how to deal with like Proto-Indo-Aryan or Proto-Sanskrit.

For the past couple years our strategy has been to call these reconstructions Proto-Ashokan Prakrit, which is a language we made up and not a label that is really used in any literature (0 hits on Google). We settled on Ashokan Prakrit since it is likely the ancestor of all New Indo-Aryan languages (including "Dardic") and we didn't have a later node that unifies NIA subfamilies, since e.g. we used to treat Prakrit and Apabhramsha as collections of languages.

Now that we have codes for unified Prakrit and unified Apabhramsha, I think we should move any Proto-Ashokan Prakrit terms without "Dardic" descendants (e.g. *𑀟𑀼𑀓𑁆𑀓𑀭 (*ḍukkara, pig)) to Proto-Prakrit. Proto-Prakrit is a term used in scholarly literature on IA historical linguistics, including by Turner. Also, this way we are not overclaiming the age of the word.

One edge case to consider is that often, a term may be constrained to non-Dardic NIA but also happen to have a descendant in Kashmiri; an example is *𑀝𑁄𑀓𑁆𑀓 (*ṭokka, basket). Kashmiri is the "Dardic" IA language that is most in-contact with plains Indo-Aryan (particularly Punjabi). I think this should also be called Proto-Prakrit but we can debate this. Ideally, we reserve Proto-Ashokan Prakrit for any NIA terms with non-Kashmiri Dardic cognates. —AryamanA (मुझसे बात करेंयोगदान) 20:40, 4 August 2024 (UTC)

I agree. Some related followup Qs:
1. How do we want to handle cases like *dākka, which Turner reconstructs here [1] with both a long vowel and consonant cluster (which is generally considered invalid in Middle Indo-Aryan)? It appears that Turner is reconstructing Old Indo-Aryan. In this case, do we want to say that the descendant is Sanskrit *डाक्क (ḍākka), [unrendered term, language code "inc-pra"], Ashokan Prakrit *𑀟𑀸𑀓𑁆𑀓 (*ḍākka), or [unrendered term, language code "inc-pra"]?
2. Is Proto-Prakrit a separate language or just a shorthand for referring to reconstructed Prakrit? I haven't seen any Proto-Ashokan Prakrit language in Wiktionary, so I'm guessing what you're referring to is reconstructed Ashokan Prakrit, right? Dragonoid76 (talk) 18:57, 5 August 2024 (UTC)[reply]
One more question—what are the cases where it makes sense to reconstruct "Sanskrit", as opposed to "Proto-Prakrit" or "Proto-Ashokan Prakrit"? Can we make (or does it already exist?) a clear decision on these cases? For example:
Dragonoid76 (talk) 20:00, 5 August 2024 (UTC)[reply]
I also agree. For Dardic descendants, and also Pali descendants of Turner reconstructions, we might want a Proto-Middle Indo-Aryan, but that way the age of any word will obviously be implied to be greater than it would be if it were called "Proto-Prakrit". I'm also open to Sanskrit reconstructions, which do seem better suited in some cases like *ध्वजदण्ड (dhvajadaṇḍa), *तिथिवार (tithivāra), etc., and which reconstruction fits better can easily be decided on a case-by-case basis (due to the low number of MIA editors). I would also like to point out that, despite being less frequent, early MIA like Pali does show both a long vowel and consonant cluster, and even some Prakrit words do that, so I don't think it would be very problematic to have Proto-Prakrit reconstructions with both a long vowel and consonant cluster. Svartava (talk) 04:55, 6 August 2024 (UTC)
@AryamanA: Hello! It's great to see you active again. As a matter of an incredible coincidence, @Svartava and I have been, for the past few weeks, discussing on Discord about having a Proto-Prakrit code. Having a Proto-Prakrit code is surely less problematic than taking Turner reconstructions (which were intended by Turner to be Sanskrit) and showing them as Ashokan Prakrit, a practice unique to Wiktionary. Moreover, in Ashokan reconstructions, we spell out the geminated stops (case in point: *𑀝𑁄𑀓𑁆𑀓 (*ṭokka)) but we know that gemination was not reflected in spelling in the edicts of Ashoka. So we have to either change these reconstructions to Proto Prakrit or render them in the Latin script. Also, Ashokan needs to be set as the ancestor of Dardic (it's not, for now).
@Dragonoid76: To address your queries: a long vowel followed by a geminated consonant cluster is uncommon, but not invalid in MIA, as cases like dātta definitely exist. As for Prakrit entries in the reconstruction namespace vs Proto-Prakrit as a separate code, I am of the opinion that since Prakrit has been merged, we might as well use Prakrit reconstructions. As for your next question of how to decide between Ashokan vs Pkt reconstruction vs Sanskrit, as Aryaman said, if it has non Kashmiri Dardic reflexes, it will be an Ashokan reconstruction. As of now, inc-ash is not set to be the ancestor of inc-dar-pro but that can be fixed. I believe it should be, because Shahbazgarhi Ashokan shows many features which can be said to be the ancestor of the corresponding features in Dardic. Deciding between Sanskrit and Ashokan can be much more challenging, given Ashokan contains sounds like /ṣ/, /ś/ and non simplified consonant clusters. So ciṣṭa might well be early MIA. One rule of thumb I'd use is, compounds where the components are discernable as Sanskrit words are Sanskrit, such as *bhaginī-putra -- 𝘗𝘶𝘭𝘪𝘮𝘢𝘪𝘺𝘪(𝘵𝘢𝘭𝘬) 05:18, 6 August 2024 (UTC)[reply]
@AryamanA: Moving a few entries from reconstructed early MIA Ashokan Prakrit to reconstructed middle MIA Proto-Prakrit seems to be uncontroversial since that was the original proposal.
@Dragonoid76, Pulimaiyi: Regarding, Is Proto-Prakrit a separate language or just a shorthand for referring to reconstructed Prakrit?
I agree with
since Prakrit has been merged, we might as well use Prakrit reconstructions
rather than creating a new code for Proto-Prakrit. This is because creating a new code for Proto-Prakrit would mean that we would have to decide whether it is an ancestor, descendant or contemporaneous with the merged Prakrit language. Furthermore, Prakrit reconstructions are usually one-off for special cases unlike protolanguages such as Proto-Indo-Iranian. Protolanguages such as Proto-Indo-Iranian are entirely reconstructed while Middle Indo-Aryan is a mixture of attested and reconstructed terms.
Would the script continue to be Brahmi? … we have to either change these reconstructions to Proto Prakrit or render them in the Latin script
When Proto-Indo-Aryan reconstructions were moved to Ashokan Prakrit reconstructions, it was a delight to see them in Brahmi script instead of Latin script. Then, Victar started this discussion WT:Beer_parlour/2021/March#Reconstructions_in_Latin_script
Victar: I'd like get a discussion going about adding a guideline to WT:PROTO that states that all reconstructions should be in Latin script. Most already are, but here's a list of the ones that buck that standard…Ashokan Prakrit … Sanskrit
Mahāgaja: Devanagari seems perfectly natural to me
Victar: If we're going by academia, reconstructions will always usually be in Latin script, which does also go for Sanskrit and Avestan. Seeing RC:Sanskrit/लुट्टति is rather weird to my eyes
I agree with Fay Freak’s comment. However, I also see what Victar meant. Academia in the English language will probably not consider Wiktionary’s reconstructions seriously if they are not in the Latin script.
At Talk:बद्ध, AryamanA said
It is not useful to reconstruct with the idiosyncracies of Ashokan Brahmi being applied, in comparative linguistics we care about the phonology not orthography.
If the idiosyncracies of Brahmi are not to be applied to reconstructed Brahmi, and if we care about the phonology not orthography, then that might suggest that the Latin script might be used for reconstructions if the Latin script better represents the phonology. However, it could be argued that even the Latin script has idiosyncrasies of its own.
Question 1: @Pulimaiyi, Svartava: If middle MIA reconstructions continue to be in Brahmi, would the anusvara be used for homorganic nasal consonants, or would they be written as the Brahmi equivalents of ङ् ञ् ण् न् म्? The middle MIA convention is to use the anusvara. RC:Ashokan Prakrit/𑀟𑀗𑁆𑀓 uses ङ्, while RC:Ashokan Prakrit/𑀫𑀡𑀺𑀕𑀁𑀞𑀺 uses the anusvara.
As for … how to decide between Ashokan vs Prakrit reconstruction vs Sanskrit, … if it has non Kashmiri Dardic reflexes, it will be an Ashokan reconstruction … given Ashokan contains sounds like /ṣ/, /ś/ and non simplified consonant clusters … ciṣṭa might well be early MIA
What this means is that there will be reconstructions at three stages:
OIA (Sanskrit)
Early MIA (Ashokan Prakrit)
and Middle MIA (Prakrit)
By analogy with RC:Sanskrit/चिष्ट, RC:Ashokan Prakrit/𑀧𑀝𑁆𑀞𑀸𑀦 might be moved to RC:Ashokan Prakrit/𑀧𑀱𑁆𑀝𑀸𑀦 especially since there is a Kashmiri descendant K. paṭhān m. (see Reconstruction_talk:Ashokan_Prakrit/𑀧𑀝𑁆𑀞𑀸𑀦#*paṣṭāna?). However, RC:Ashokan Prakrit/𑀙𑁄𑀝𑁆𑀝 has a Kashmiri descendant, but it does not resemble early MIA.
Question 2: @Pulimaiyi, Svartava: With such a scheme shouldn’t we use the ===Reconstruction notes=== section to explain why a particular stage was chosen for a reconstruction rather than another stage (in addition to other details)?
For example, when I look at
RC:Sanskrit/ध्वजदण्ड
RC:Sanskrit/तिथिवार
RC:Sanskrit/उन्नग्न
RC:Sanskrit/स्यालभार्या
it always takes me a few minutes to justify why these are being reconstructed as OIA (Sanskrit) rather than middle MIA because of
Special:Permalink/65062470#बुभुक्ष्
Pulimaiyi: Sanskrit reconstructions are very rare in wiktionary and are generally not favoured by wiktionary's convention … Sanskrit reconstructions are not favoured by wiktionary's convention because of the lack of reliable reconstruction sources to base it on.
See also:
Reconstruction talk:Sanskrit/ध्वजदण्ड
We already have RC:Sanskrit/तिथिवार, which is why I even thought of creating this reconstruction. Or else, I'd have simply added {{inh|hi|sa||*ध्वजदण्ड}}, without linking it.
[[User_talk:Inqilābī#Status_of_{{R:CDIAL}}_reconstructions]]
Kutchkutch: Do you have a opinion on whether RC:Sanskrit/उन्नग्न should be modified to a Prakrit form or remain as [it] appear[s] in {{R:CDIAL}}?
CDIAL Introduction:
Many of the headwords, like so much of classical Sanskrit vocabulary, are in reality Middle Indo-Aryan clothed, for the convenience of presentation, in an earlier phonetic dress
Inqilābī: No idea, but it might be the case that Turner reconstructs both OIA and MIA terms.
Talk:सलहज
PUC: Wow, the phonetic erosion was rather strong in there! No?
AryamanA: Yep. It's syālabhāryā > sālahāyya > sallahayya > salhaj
At one point I was deciding whether RC:Sanskrit/तिथिवार should be moved to Ashokan Prakrit and then decided not to. If the ===Reconstruction notes=== section explicitly explains why that particular stage was chosen (in addition to other details), then that would clear up the confusion.
At RC:Sanskrit/युट् despite saying,
Turner posits that all forms of this root may have originated from *युट्ट which was a MIA replacement for युक्त
it seems that the justification for having RC:Sanskrit/युट् as Sanskrit rather than middle MIA is that we agreed not to have middle MIA roots at Talk:घोट#𑀖𑀼𑀝𑁆𑀝𑁆-_(ghuṭṭ-). However, early MIA CAT:Ashokan Prakrit roots are permissible according to the following statement in that discussion:
Ashokan Prakrit roots are tolerated because *we* consider the unattested terms in Turner's dictionary to be Ashokan Prakrit
One rule of thumb I'd use is, compounds where the components are discernable as Sanskrit words are Sanskrit, such as *bhaginī-putra
The components of RC:Ashokan Prakrit/𑀫𑀡𑀺𑀕𑀁𑀞𑀺 are discernable as Sanskrit, but I placed it in MIA rather than OIA (Sanskrit). *bhaginī-putra differs from RC:Ashokan Prakrit/𑀫𑀡𑀺𑀕𑀁𑀞𑀺 because it has the Kashmiri descendant K. bĕnathᵃr m..
The relationship between reconstructed MIA and Dardic languages has been discussed several times such as at
Reconstruction talk:Ashokan Prakrit/𑀕𑀼𑀧𑁆𑀨𑀸
The existence of a Dardic cognate could suggest that this word existed in late Old Indo-Aryan/early MIA: this is precisely why initially a code for "Proto MIA" was proposed so that Pali and Dardic could be included; but that idea did not garner much support and we had to settle for Ashokan Prakrit instead, which albeit quite pervasive, unfortunately does not extend to Pali and Dardic.
Special:Diff/73057977 at RC:Ashokan Prakrit/𑀕𑀸𑀟𑁆𑀟
Any other way to deal with Dardic terms cognate with Ashokan prakrit without having to reconstruct Sanskrit?
Special:Diff/73407835 at گاڑے#Torwali
Apparently there are more Dardic terms than just Kashmiri corresponding to CDIAL 4116 *gāḍḍa 'cart'
Kashmiri is the "Dardic" IA language that is most in-contact with plains Indo-Aryan (particularly Punjabi)
Although Kashmiri is the most spoken Dardic language, the other Dardic languages are also in contact with “plains Indo-Aryan”, which might explain گاڑے#Torwali. RC:Sanskrit/चिष्ट has the Shina descendant چٹھ#Shina. Also, the CDIAL Introduction derives the Khowar term ātΛpik from reconstructed MIA:
Khowar ātΛpik `to have high fever' must rest either upon a late MIA. *ātapp- (newly formed compound with ā from tappaï) or upon MIA. *āttapp- with analogical -tt- (after type ā-tt- < ā-tr-, etc.). The head-word ātapyatē under which the Khowar word appears is thus in reality a Middle Indo-Aryan word in Old Indo-Aryan form.
What is probably meant by “Punjabi” here is “Punjabic languages” such as Pahari-Potwari and Hindko in addition to the standardised Majhi Punjabi.
Urdu as a lingua franca is also in contact with Dardic languages to a significant extent.
Pashto is another lingua franca that is in contact with Dardic languages in Khyber-Pakhtunkhwa and Afghanistan. Although Pashto is an Iranian language, Pashto borrows from Urdu and Punjabic languages including Lahnda/Saraiki. Perhaps it is too much of a stretch for a Dardic language in Khyber-Pakhtunkhwa or Afghanistan to have a “plains Indo-Aryan” term through Pashto. For example,
RC:Ashokan Prakrit/𑀕𑀸𑀟𑁆𑀟 → ګاډی#Pashto → گاڑے#Torwali
(See CAT:Pashto borrowed terms) Perhaps there is a possibility with RC:Ashokan Prakrit/𑀧𑀝𑁆𑀞𑀸𑀦 that a Dardic language acquired the term first and then it spread to “plains Indo-Aryan”.
Kutchkutch (talk) 16:00, 6 August 2024 (UTC)[reply]

etymology sections and a lack of standardization on detail

We have basically zero standards on the level of detail one should put in an etymology section. Some entries list only the direct ancestor, regardless of whether it's derived from another language or not (DeJulio); others list the ancestors of a word all the way to, say, Latin (like dictionary or this Malay term for June); and others still go all the way back to PIE or similar. That's not even getting into entries like admiral, orange or pizza, which start to look like run-on paragraphs with stuff like cognates and miscellaneous etymological detail.

I do recognize a pattern of more common or popular words having the larger etymology sections, but that's not really the "problem" here, and the longer sections are all pretty much on topic even if they get rambly. For one, we aren't Wikipedia, and these long paragraphs are a bit unwieldy for the average reader (read: eyesore). I probably wouldn't have broached this topic this time last year, since having the full etymology on the same page is quite useful and was probably the original intent; however, with the introduction of the etymon template among other technical revolutions on the site this year, there are now much better ways (imo) to present the info to the average reader. Another argument for reducing these large sections is synchronization, as I've encountered plenty of cases where one entry is missing details provided by another, or one has an error that the other had fixed.


Now, it might sound like I'm advocating for said "only list the direct ancestor" situation, but honestly my main gripe with how things are is mostly just the presentation of the info. I've brought up on the Discord the suggestion that, if entries are going to go the distance of providing an exhaustive etymology, it doesn't need to be presented in paragraph form, particularly given that it's mostly just three- to five-word statements. It could instead be presented like this:

- Word A from Language A
- Word B from Language A
- Word C from Language B
- Word D from Proto Language

Akaibu (talk) 06:39, 6 August 2024 (UTC)

I mentioned this on Discord: with the {{etymon}} template, I don't think it'll hit widespread usage until it's easier to use than the basic etymology templates like {{der}}, {{bor+}}, {{inh}}, etc. etc. Having to learn/use IDs and the whole system is daunting for the average editor. I do agree though that our etymologies do need cleanup in terms of what to display. A lot of times I'll just show the initial borrowing and put "ultimately from" for entries like Hawaiian ʻApekanikana or Yoruba Alibéníà. AG202 (talk) 17:08, 6 August 2024 (UTC)[reply]
@AG202: The goal of {{etymon}} is to connect entries like puzzle pieces, so I think the main problem currently is that very few entries are using it. In the future it will hopefully be saving massive amounts of time on stuff like categorization, finding derived terms, and writing out long etymological chains by hand. Ioaxxere (talk) 19:56, 6 August 2024 (UTC)[reply]
I'm in favor of increased usage of etymon to reduce the problem of different etymology sections not being in sync with each other, but I agree that in its current form the template is not simple enough to be easily used (e.g. the ID system is cumbersome, and the conditions for when not to use "from" are not intuitive).--Urszag (talk) 20:25, 6 August 2024 (UTC)[reply]
We've previously discussed - and seemingly agreed upon - how to make the syntax of {{etymon}} more intuitive. Due to the unfortunate choice of title I cannot link the thread directly, so here is the URL in plaintext: https://en.wiktionary.org/wiki/Wiktionary:Beer_parlour/2024/June#{{etymon}}
Incorporating Benwing's last suggestion, we'd have something like:
{{ety|en#X|clever#Y|-ly#Z}} “[cleverly is] from clever + -ly”
{{ety|en#X|enm:charitee#Y}} “[charity is] from Middle English charitee”
{{ety|ru#X|de:montieren#Y|-овать#Z}} “[монтировать is] from German montieren + Russian -овать”
The X, Y, Z following the hashtags are IDs, and various additional parameters can be added like |inh=1 |bor=1 |blend=1 |backformation=1
If the syntax were like that, I'd actually be happy to use it as a general-purpose etymology template. Nicodene (talk) 23:48, 6 August 2024 (UTC)
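To make the proposed syntax concrete, here is a minimal Python sketch of how the three examples above could be read and displayed. It is only an illustration, not how {{etymon}} or any Wiktionary module actually works: the function names `parse_ety` and `render`, the small `LANGUAGE_NAMES` table, and the rule "show language names only when some component is in a different language from the entry" are all assumptions inferred from the three examples.

```python
import re

# Hypothetical lookup table for this sketch only; the site keeps its
# language names in its own data modules.
LANGUAGE_NAMES = {
    "en": "English",
    "enm": "Middle English",
    "de": "German",
    "ru": "Russian",
}


def parse_ety(wikitext):
    """Split a proposed {{ety|...}} call into entry language, components and flags."""
    match = re.fullmatch(r"\{\{ety\|(.+)\}\}", wikitext.strip())
    if match is None:
        raise ValueError("not an {{ety}} call: " + wikitext)
    params = match.group(1).split("|")
    # First parameter: language code of the entry, then "#" and the entry's ID.
    entry_lang, _, entry_id = params[0].partition("#")
    components = []
    flags = {}
    for param in params[1:]:
        if "=" in param:
            # Named parameter such as inh=1 or bor=1.
            key, _, value = param.partition("=")
            flags[key] = value
            continue
        body, _, term_id = param.partition("#")
        lang, sep, term = body.partition(":")
        if not sep:
            # No "lang:" prefix: assume the term is in the entry's own language.
            lang, term = entry_lang, body
        components.append({"lang": lang, "term": term, "id": term_id})
    return {"lang": entry_lang, "id": entry_id, "components": components, "flags": flags}


def render(parsed):
    """Build the 'from X + Y' text; language names appear only in mixed-language chains."""
    mixed = any(c["lang"] != parsed["lang"] for c in parsed["components"])
    parts = []
    for c in parsed["components"]:
        name = LANGUAGE_NAMES.get(c["lang"], c["lang"])
        parts.append(name + " " + c["term"] if mixed else c["term"])
    return "from " + " + ".join(parts)


for call in (
    "{{ety|en#X|clever#Y|-ly#Z}}",
    "{{ety|en#X|enm:charitee#Y}}",
    "{{ety|ru#X|de:montieren#Y|-овать#Z}}",
):
    print(render(parse_ety(call)))
```

The sketch splits on "|" naively, so it would not cope with nested templates or terms containing "=" or ":"; a real implementation would need proper template parsing.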

Reminder! Vote closing soon to fill vacancies of the first U4C

You can find this message translated into additional languages on Meta-wiki. Please help translate to your language

Dear all,

The voting period for the Universal Code of Conduct Coordinating Committee (U4C) is closing soon. It is open through 10 August 2024. Read the information on the voting page on Meta-wiki to learn more about voting and voter eligibility. If you are eligible to vote and have not voted in this special election, it is important that you vote now.

Why should you vote? The U4C is a global group dedicated to providing an equitable and consistent implementation of the UCoC. Community input into the committee membership is critical to the success of the UCoC.

Please share this message with members of your community so they can participate as well.

In cooperation with the U4C,

-- Keegan (WMF) (talk) 15:30, 6 August 2024 (UTC)[reply]

Micronations inclusion criteria

Micronations are not explicitly mentioned in Wiktionary:Criteria for inclusion#Place names, yet we have at least seven pages for micronations on Wikt. Seeing as they do count as place-names, I am asking here for input on whether or not micronations should be allowed to have their own entries/just be subject to the same criteria as any entry. FWIW, I am of the opinion that they should be allowed to have entries, and, for clarity, be added to the aforementioned policy link as legal scholars tend to classify them as political entities, which are already allowed entries on Wikt. Would appreciate any feedback or comments, including any opposition to this proposal! Kindest regards, LunaEatsTuna (talk) 03:30, 7 August 2024 (UTC)[reply]

I've discussed on the Discord that they should be counted, because they are names of places, and could be seen as already being included in CFI.
My reasoning for this is as follows:
1. We include "[h]uman settlements: cities, towns, villages, etc."
2. Micronations are human settlements, in the sense that they have/had people who live in them. (We also list ghost towns with 0 people in them, so people actually living in them isn't a concern).
3. As such, micronations are implicitly included in CFI.
Regardless of whether you think my reasoning is sound, I do feel that they should be included, as they can achieve the same level of being talked-about as the towns in Arizona, or even more, in some cases (such as Sealand). CitationsFreak (talk) 04:00, 7 August 2024 (UTC)
I firmly believe that they are not currently included under our current criteria for Wiktionary:Criteria for inclusion#Place names. Looking at the list found at w:List of micronations, I would be hard-pressed to say that our policy states that we should include all of them. Most of them have no people living in them, and some don't even have an actual territory. A resort, a farm, a bank, two sculptures, straight-up fraud, and more should not be included by default as purported micronations. While I don't necessarily support our current policy that includes ghost towns & unincorporated communities with no people living in them, those at least receive recognition from an actual state and can be found on official government documents.
A lot of micronations are essentially "I made this up". Some should fall under WT:COMPANY. For example, I simply do not think that the Principality of Snake Hill, from a "family in New South Wales who were unable to afford their taxes seceded from Australia", should be included by default here. Some micronations are just online communities, and I don't think we'd want to open the floodgates to the name of just any online community that declares itself a micronation. A number of them claim territory that they don't even live on. It just rings as unserious, frankly, and our place names policy is broad enough as is. They don't rise to the level that an actual unrecognized state like Somaliland does.
That being said, I would support a policy to explicitly include notable micronations such as Sealand, but I'm not yet sure what the notability criteria should be. But for now, I'd say that they fall under this policy: "Most manmade structures, including buildings, airports, ports, bridges, canals, dams, tunnels, individual roads and streets, as well as gardens, parks, and beaches may only be attested through figurative use.", if even that. Or they could go into the Appendix. AG202 (talk) 04:28, 7 August 2024 (UTC)[reply]
I would like to point out this line from the rationale of the CFI place names vote: "[T]he categories are left open-ended to allow more of our existing entries." This means that if a specific type of place is not explicitly spelled out, it does not mean that it falls outside the criteria.
Also, the regular CFI criterion protects us from having to deal with every little obscure micronation made up by a ten-year-old in their bedroom. I would say that any micronation that is mentioned in three+ independent sources over a period of one year should be included. If enough people talk about, say, Melchizedek, then I'd say it's notable enough for us.
(Plus, is a fake nation really more similar to an airport or street or anything else mentioned in that sentence than a nation?) CitationsFreak (talk) 04:47, 7 August 2024 (UTC)[reply]
"Plus, is a fake nation really more similar to an airport or street or anything else mentioned in that sentence than a nation?" Yes? I'm almost certain major and even some minor airports have more notability and usage than the vast majority of the micronations listed. Let alone actual nations and sovereign states. Like I said some of them are literally a singular building. Also, honestly, our CFI criterion doesn't protect us, considering all we need are 3 Usenet comments, or at this point simply 3 tweets. AG202 (talk) 05:02, 7 August 2024 (UTC)[reply]
I meant that in terms of function. A micronation acts like a nation, with its own government and rulers and flag and so on. This is unlike an airport or a street, which doesn't.
Also, like I said, I am using the CFI standard, "use in durably archived media, conveying meaning, in at least three independent instances spanning at least a year". If these conditions are met when people are talking about a building, why shouldn't it be in Wikt? CitationsFreak (talk) 05:08, 7 August 2024 (UTC)
Because they are explicitly excluded by WT:CFI#Place names, unless they have figurative usage, which is exactly my point. If we included buildings and such by default, I wouldn't be replying here, but that's not the case. We can't simply have someone redress a building or company or farm or something similar as a "micronation", get 3 independent usages, and then bam, we include it by default. That just does not align with how I'd expect our policies to be read. And looking at the list from WP, based on the references they have, I would expect the vast majority, if not all, of them to pass if we include them by default. AG202 (talk) 05:15, 7 August 2024 (UTC)[reply]
I would totally expect a reader to look up Sealand, but not the name of any other sea fort, since it is famous. (However, I wouldn't expect a reader to look up Bob's Principality of North-East Main Street.)
(In the Discord, I had also suggested "a new rule for microstates, that says something like "Ignore all references to the founding of the state[, since they are not independent]"? What do y'all think?) CitationsFreak (talk) 05:33, 7 August 2024 (UTC)[reply]
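As a rough illustration of the test being argued over here ("at least three independent instances spanning at least a year", plus the suggested option of ignoring references to a state's founding), the check could be sketched in Python as below. Everything in it is an assumption made for the sketch: the `Citation` class, treating "independent" as "different author", treating "a year" as 365 days, and the `about_founding` flag. The actual CFI wording on independence, durable archiving and conveying meaning is richer than this.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Citation:
    author: str                   # stand-in for independence: different authors count as independent
    published: date
    about_founding: bool = False  # hypothetical flag for the suggested "ignore the founding" rule


def meets_attestation(citations, ignore_founding=False):
    """Return True if three independent citations spanning at least 365 days exist."""
    usable = [c for c in citations if not (ignore_founding and c.about_founding)]
    authors = {c.author for c in usable}
    if len(authors) < 3:
        return False
    # With three or more authors available, it is enough to find two citations
    # by different authors at least a year apart; any third author completes the set.
    for first in usable:
        for second in usable:
            if first.author != second.author and (second.published - first.published).days >= 365:
                return True
    return False


cites = [
    Citation("A", date(2020, 1, 1)),
    Citation("B", date(2020, 6, 1)),
    Citation("C", date(2021, 3, 1), about_founding=True),
]
print(meets_attestation(cites))
print(meets_attestation(cites, ignore_founding=True))
```

Under these assumptions the second call is stricter: once the founding report is set aside, only two independent authors remain, which shows how such a rule would raise the bar for newly declared micronations.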
Well, a micronation is a territory around a building plus a government. So we include them by comparison to, or their partial identity with human settlements, neighbourhoods and countries, as we even include fictional countries. This does not exclude that some shall not be included also according to our inclusion criteria because they are more similar to constructed languages, for instance.
We should be more concerned with violation of WT:BRAND by their artificialities. Some are organized like a cult, a club or criminal organization, though we include 'ndrangheta, Hamas, Islamic Revolutionary Guard Corps and the Unification Church, or what: I think about Reichsbürger here, whose constructs aren’t considered micronations however. Somewhere it does go too far. We won’t agree on their being noted in references per se supporting their inclusion, though notability is important, since just for clarity and not being confused with Wikipedia Wiktionary editors will avoid mentioning notability in the CFI, which they fear not to even understand in the same way as you if they don’t edit Wikipedia.
We can make RFDs for any reason later if the current inclusion situation gets out of hand; I don’t see a benefit in a theoretical community agreement on inclusion criteria specific to micronations. It is right, necessary and sufficient that we have discussed it; this will help us later to find out what goes too far. Fay Freak (talk) 13:15, 7 August 2024 (UTC)
Micronations aren't all one kind (Liberland denotes a specific area, Obsidia is a movable rock; some are oft-mentioned, some scarcely-mentioned), so IMO we shouldn't add blanket acceptance of all micronations to CFI. But if enough people use a term like Liberland to refer to a given area, I don't see an obvious dividing line between that and other coinages for specific (or nebulous!) regions—which may not have administrative significance or population—we don't bat an eye at: the Triangle, the Golden Strip, Trójmiasto, Mariana Trench, not to mention terms where sovereignty is disputed, like Northern Cyprus, Judea and Samaria, or Donetsk People's Republic. If there are cites to support it, I don't see a reason not to include Liberland: but that doesn't mean we should define these as real nations; I might lead with Liberland being a name for a particular area (used by people who claim it's a nation), and likewise might merge the first two senses of Seborga and just mention that the town is claimed to be a micronation.
It's true anyone can make up a micronation and we could be flooded, but we can RFD things where needed (if we don't blanket-include them), and AFAICT people could already coin and flood us with coinages for non-micronation regions: if people start calling an arbitrary U-shaped snake of land from Hamburg down to Hannover and east to Berlin and north up to Waren "HaHaBeWare" (or something as self-promotional as some micronations, like "Rachel's Backyard"), not asserting it to be a micronation but just saying "this is a name for this region, a la the Golden Strip", I don't currently see on what basis we wouldn't include that... (Also, while Obsidia, which I mentioned above, is less a placename and more like Ishango bone or Einang Stone, we seem to be deciding at RFD to keep such "names of specific individual stones and bones", so maybe Obsidia is fine too? I don't know; I'm more sceptical of it, and we don't currently have other stone-names I checked like Stone of Scone, but I'm curious why Ishango bone would get a pass and not Obsidia... maybe we want to reconsider including Ishango bone?) I am, as always, liable to change my mind as I hear more arguments... - -sche (discuss) 21:36, 7 August 2024 (UTC)

Billion: a thousand millions or a million millions

Garner still uses the regular plural in these types of definition. Thus, for trillion he states that in Great Britain, it traditionally means a million million millions.

When it comes to defining nominal meanings as different from numerals, should the wording not reflect this? JMGN (talk) 16:13, 8 August 2024 (UTC)[reply]

(Notifying PUC, Jberkel, Nicodene, AG202, Benwing2): There has been a conflict on what to do with the headword line (pinging the article creator @Olybrius). My understanding is that the article "la" seems to be always used with the name of this river, and it is not capitalised; but I don't think we should change the headword line. What should we do? Are there other names in the same situation? This is not like the situation of La Défense where "La" is lexicalised as part of the name and is always capitalised (however there are also some websites that perhaps by mistake have left it uncapitalised.) --kc_kennylau (talk) 19:29, 8 August 2024 (UTC)[reply]

@Kc kennylau This is very common with French rivers as well as other entities, e.g. most countries (les États-Unis, la France, but just Israël). We don't have a general policy on how to handle this; in English, there is now a param |the=1 for cases like this, which displays "the" in the headword (e.g. the White House; but not always used, cf. the Castro, a well-known district in San Francisco). In German, {{de-noun}} also has special support for this. I'm not sure about other languages. Benwing2 (talk) 19:42, 8 August 2024 (UTC)[reply]
For that matter, most (all?) rivers in English use the as well. Benwing2 (talk) 19:42, 8 August 2024 (UTC)[reply]
I do feel that we should maybe add articles in the headword for things like la Barbade or l’Alabama. It makes it clearer for learners, especially since not every region or country uses an article. It's brought up quite often in French-learning spaces. AG202 (talk) 20:58, 8 August 2024 (UTC)[reply]
I agree. I don't know if it's necessary for rivers because AFAIK all rivers take an article, but for countries and regions it varies from term to term and is very useful to include. That's why it's included in English and German, for example. Benwing2 (talk) 21:17, 8 August 2024 (UTC)[reply]
I also agree. CitationsFreak (talk) 03:55, 9 August 2024 (UTC)[reply]
I don't see the point of adding the article; knowing the gender is enough. See Nil, Rhône, Meuse, Rhin, Danube, etc. PUC 19:46, 8 August 2024 (UTC)[reply]
This is my view as well.
For Luynes, if the concern is that a reader may not know to use la (as opposed to *les), that can be clarified in a usage note. Nicodene (talk) 21:32, 8 August 2024 (UTC)[reply]
Does your belief apply only to rivers, or also to countries and regions (see above)? If the latter, my concern is that these usage notes would need to be added to every country and region, and would be more compactly conveyed in the headword (following the example of English and German, among others). Benwing2 (talk) 21:35, 8 August 2024 (UTC)[reply]
Personally I'd only add usage notes when something deviates from the pattern. Of the countries mentioned so far that's just Israël. Nicodene (talk) 22:41, 8 August 2024 (UTC)[reply]
I'd just find it easier to include the definite article in the headword for learners, since it's not like it's particularly common to find them with the indefinite article. And then for the prepositions used, we definitely have to include usage notes or usexes (like fr.wikt) after having seen this page for countries and this page for U.S. states, which is what I've done at pages like Alabama and Barbade. AG202 (talk) 04:37, 9 August 2024 (UTC)[reply]

Bot rights

We really should have a policy for removal of bot rights from accounts that have become inactive for a reasonable period. We could say a bot is temporarily inactive after 2 years and permanently after 3 years. For example, NanshuBot and Websterbot have not edited since 2003, and TheCheatBot has made no contributions since 2008. There must be a notice to the bot owner prior to removal of rights. Any bot removed due to temporary inactivity must be restorable at the request of the owner. However, if the rights are permanently taken away after a longer period, it would require another vote for their reinstatement. Let me know what you think. — Fenakhay (حيطي · مساهماتي) 02:45, 9 August 2024 (UTC)[reply]

@Fenakhay Sounds good to me. 2 years sounds good for a temporary revocation but for a permanent revocation maybe 5 years; 3 years seems maybe too close to 2 years. I would add that if a bot owner requests that the bot rights be restored, this doesn't reset the clock; if they ask for a restoration but don't do anything with their bot, then the bot is still subject to permanent revocation after the relevant period from the last edit performed by the bot. Benwing2 (talk) 04:14, 9 August 2024 (UTC)[reply]
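
To make the proposed thresholds concrete, here is a minimal sketch of how the rule discussed above might be checked. Nothing here is an existing tool or agreed policy: the function names and status labels are hypothetical, the 2-year figure is from the original proposal, and the 5-year figure is the alternative suggested in the reply (the original proposal said 3). The clock runs only from the bot's last edit, so a restoration request does not appear as an input.

```python
from datetime import date

# Assumed thresholds taken from the discussion, not from any adopted policy.
TEMPORARY_YEARS = 2  # proposed period before temporary revocation
PERMANENT_YEARS = 5  # suggested alternative to the originally proposed 3


def years_between(earlier: date, later: date) -> int:
    """Return the number of whole years elapsed from earlier to later."""
    years = later.year - earlier.year
    # Subtract one if the anniversary has not yet been reached this year.
    if (later.month, later.day) < (earlier.month, earlier.day):
        years -= 1
    return years


def bot_status(last_bot_edit: date, today: date) -> str:
    """Classify a bot by time since its last edit (hypothetical labels).

    Only the last edit counts; a request to restore rights without any
    subsequent edit does not reset the clock.
    """
    idle = years_between(last_bot_edit, today)
    if idle >= PERMANENT_YEARS:
        return "permanent"
    if idle >= TEMPORARY_YEARS:
        return "temporary"
    return "active"
```

Under these assumed numbers, a bot whose last edit was in 2008 would fall under permanent revocation, while one that last edited a little over two years ago would only be temporarily inactive.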