User talk:Kiril kovachev
Welcome
Please read this before making any edits.
Welcome!
Hello, welcome to Wiktionary, and thank you for your contributions so far.
If you are unfamiliar with wiki editing, take a look at Help:How to edit a page. It is a concise list of technical guidelines to the wiki format we use here: how to, for example, make text boldfaced or create hyperlinks. Feel free to practice in the sandbox. If you would like a slower introduction we have a short tutorial.
These links may help you familiarize yourself with Wiktionary:
- Entry layout (EL) is a detailed policy documenting how Wiktionary pages should be formatted. All entries should conform to this standard. The easiest way to start off is to copy the contents of an existing page for a similar word, and then adapt it to fit the entry you are creating.
- Our Criteria for inclusion (CFI) define exactly which words can be added to Wiktionary, though it may be a bit technical and longwinded. The most important part is that Wiktionary only accepts words that have been in somewhat widespread use over the course of at least a year, and citations that demonstrate usage can be asked for when there is doubt.
- If you already have some experience with editing our sister project Wikipedia, then you may find our guide for Wikipedia users useful.
- The FAQ aims to answer most of your remaining questions, and there are several help pages that you can browse for more information.
- A glossary of our technical jargon, and some hints for dealing with the more common communication issues.
- If you have anything to ask about or suggest, we have several discussion rooms. Feel free to ask any other editors in person if you have any problems or questions, by posting a message on their talk page.
You are encouraged to add a BabelBox to your userpage. This shows which languages you know, so other editors know which languages you'll be working on, and what they can ask you for help with.
I hope you enjoy editing here and being a Wiktionarian! If you have any questions, bring them to the Wiktionary:Information desk, or ask me on my talk page. If you do so, please sign your posts with four tildes: ~~~~ which automatically produces your username and the current date and time.
Again, welcome! —Aryaman (मुझसे बात करो) 19:14, 10 October 2017 (UTC)
Templates
Hello, Kiril kovachev!
I noticed that you have recently started adding words to the Bulgarian dictionary on Wiktionary, and that at Bulgarian мартеница (martenica) you explained that you do not yet know which templates to use when giving an etymology. The most commonly used ones are:
- Origin:
{{inh|daughter language|parent language|proto-form}}
(inh = inherited), when the word is a direct descendant of a reconstructed proto-word, or
{{der|daughter language|source language|proto-form}}
(der = derived), when the word is derived from a proto-word but is not a direct descendant of it:
{{inh|bg|sla-pro|*slovo}}
for the origin of the word слово
- The accepted ancestors of Bulgarian are cu (Old Church Slavonic) < sla-pro (Proto-Slavic) < ine-bsl-pro (Proto-Balto-Slavic) < ine-pro (Proto-Indo-European). inh does not work if you give it any other language (e.g. Oghur or Bulghar = прабългарски). The der template works with any language:
{{der|bg|trk-pro|*bulga-||to stir, to diffuse}}
for the origin of the word Българин
- If you are not sure about the origin of a particular word, simply leave out this kind of etymology.
- Borrowing:
{{bor|borrowing language|source language|borrowed word}}
(bor = borrowed) for the origin of loanwords:
{{bor|bg|en|bit}}
for the origin of the word бит
- Formation:
{{suffix|language|stem|-suffix}}
or
{{prefix|language|prefix-|stem}}
for suffixes and prefixes. The translation of the affix or the stem is given with t1=, t2=, etc., depending on the position in which it appears. The general template affix is often used instead of suffix/prefix, but it sometimes fails to recognise which part is the stem:
{{suffix|bg|март|t1=March|-еница}}
for the formation of the word мартеница
- Compounding:
{{compound|language|word1|word2|...}}
for forming compound words:
{{compound|bg|женски|t1=feminine|род|t2=gender, sort}}
for the formation of the compound женски род
PS The qualifiers t1, t2, ... can be placed anywhere, but the order of the affixes and the stem matters.
Another useful template is {{ux|language|phrase|translation}}, for giving an example of a particular sense of the word. The ux template shows the example and its translation on separate lines. If the example is short, the uxi template is usually used instead. It shows the phrase and its translation on a single line:
{{uxi|bg|това е пример|this is an example}}
Also, for convenience, you can try the template {{bg-IPA|Bulgarian word}}. Of course, there is nothing wrong with using {{IPA|bg|/IPA pronunciation/}}, but it is more laborious, since it requires Special characters.
{{bg-IPA|прѝмер}}
I hope I have been of help. Good luck with your work, and do not worry even if you make a formatting mistake. The Bulgarian part of Wiktionary is not watched that closely, and it is unlikely that the watchdog admins will pounce on you (there are some, but they usually work on other projects). 5.150.99.73 11:55, 20 January 2020 (UTC)
Your Bulgarian edits
Hi. Thank you for your contributions. We are now working specifically on improving Bulgarian coverage and adding correctly stressed inflected forms and inflections to Bulgarian entries. You will learn with time. I'm asking you to be more accurate and pay attention to details, as your edits had inconsistencies and obvious errors. As a native speaker, you should be able to spot those, like incorrect inflected forms in a table, wrong stress or a completely different word when you copy/paste from another entry. Please don't leave such entries behind, review your own edits :) We all make mistakes. If you have any questions or have trouble creating new entries or using the new templates, you can ask me or User:Benwing2, who is very experienced and is creating the new modules and templates, but he knows Bulgarian even less than me - we don't speak it, we just use resources we can find and ask native speakers for help. Your assistance will be very valuable. --Anatoli T. (обсудить/вклад) 22:57, 17 April 2020 (UTC)
Don't Panic
I blocked your bot because that's standard procedure for any new bot. Please read our bot policy for information on what you need to do to get your bot approved. Good luck! Chuck Entz (talk) 05:17, 19 June 2021 (UTC)
- @Chuck Entz Thanks for letting me know, all good. I'm now in the process of finishing my work insofar as the code, and I'll be seeking approval once I'm sure everything is working as intended. Have a good day :) Kiril kovachev (talk) 10:12, 19 June 2021 (UTC)
подробност
Hello, I believe that the inflection table for подробност (podrobnost), which you have created, is incorrect. It appears to treat the word as a masculine noun, whence *подробностът for подробността. Martin123xyz (talk) 06:47, 26 July 2021 (UTC)
- @Martin123xyz Ah, thanks very much, that is indeed correct. I believe I created that entry in a hurry and neglected to check the declension table. I'll make sure to inspect the declension table more thoroughly in future; I'm sorry for my inaccuracies. Especially for words ending in -ост, I should have known to mark them as feminine. The table should now be corrected. Kiril kovachev (talk) 07:31, 26 July 2021 (UTC)
bot speed
Hi, I notice your bot seems to be delaying 10 seconds between edits. It's probably not necessary to do that; the MediaWiki servers themselves impose a minimum one-second gap between page saves, which in my experience is plenty slow enough to avoid overloading the servers. Benwing2 (talk) 03:16, 25 June 2023 (UTC)
- @Benwing2 Hi, thanks for this. Maybe this is a configuration problem on my end, but it does this automatically on the Pywikibot side; I didn't write any code to slow it down. Since I first used Pywikibot it's always output messages like this, saying "Sleeping for 8.8 seconds, 2023-06-25 03:46:17", so I thought this was the mechanism for reducing spam. Having done some inspection of the config file, I'm guessing this is managed by the put_throttle parameter, which like you said was set by default to 10. Should I completely zero this, or just put it to e.g. 1s? If it were truly zero, and I sent numerous requests to the servers at once, would they make sure they go through, or just drop them if more than 1 is sent per second? Thanks, anyway, for your advice! Kiril kovachev (talk) 09:37, 25 June 2023 (UTC)
- I think it's fine to set it to 0; you'll get a one-second delay coming from the server itself so it won't overload the server and it won't drop any requests as they're sent one at a time. Benwing2 (talk) 10:32, 25 June 2023 (UTC)
- Great, I've updated it now. Kiril kovachev (talk) 10:47, 25 June 2023 (UTC)
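For reference, the setting discussed in this thread can be sketched as a fragment of the Pywikibot user configuration file. This is a minimal illustration that assumes a standard user-config.py; the value is the one agreed on above, and the comments are explanatory additions rather than anything taken from the bot's actual configuration.

```python
# user-config.py (Pywikibot user configuration) - illustrative fragment only.
# Minimum delay, in seconds, that Pywikibot waits between page saves.
# The thread above reports a value of 10 in the config file; setting it to 0
# leaves pacing to the server side, as suggested in the reply.
put_throttle = 0
```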
Hi Kiril, the audio for the Bulgarian pronunciation of вино is very quiet and seems to be cut short. SimonWikt (talk) 16:40, 12 July 2023 (UTC)
- @SimonWikt Thanks for letting me know. I also noticed the quietness of some of my uploads from that batch as well, which I tried to address when I was recording today; could you please tell me whether you think алкохол (from today's recordings) is still too quiet? In the meantime, I will try to re-record my old audios. Unfortunately my mic is very poor, so I'm having to make do...
- With this particular word, I can hear the final vowel just fine; perhaps if you turn it up, you'll be able to hear? Kiril kovachev (talk) 16:48, 12 July 2023 (UTC)
- алкохол seems to be fine. I have my volume on maximum! Definitely can't hear the "о" in вино. SimonWikt (talk) 17:00, 12 July 2023 (UTC)
- @SimonWikt Hmm, how very strange...! I'll definitely make sure to re-record that one, then. On my end I definitely hear it: it might also be noteworthy that it's pronounced almost like an у (/u/), so it may almost sound as though this phoneme merges with the /n/ to sound more nasalised? Unfortunately I cannot hear what it sounds like on your end, so I'm very sorry it's not playing correctly. This conversation reminded me, though, that I previously experienced a glitch just like what you describe, in which the audio playback on the Wiktionary site's audio player just seemingly cuts off short, for whatever reason. Could you also try downloading the file directly from its source, here, and then play that back locally? If it's still the same then the problem must of course lie in my recording quality, but if there's a chance you might be experiencing the same bug, then it's worth trying this sanity check... Kiril kovachev (talk) 17:40, 12 July 2023 (UTC)
- Now I have listened to the source link and know what to listen for I can just about make out the sound, but it is drowned out by a background hissing noise. For some reason the audio sounds truncated on my laptop (Windows 10, Edge Browser) but not on my mobile! SimonWikt (talk) 06:39, 13 July 2023 (UTC)
- P.S. the noun section for вино indicates that the stress can be put on either vowel, so if this is correct, then maybe we need audio for both versions 😀 The first reference shows the stress on both vowels and the second only shows it on the first. Keep up the good work! 👍 SimonWikt (talk) 06:56, 13 July 2023 (UTC)
- @SimonWikt I see, apologies that the mic quality is not the best. It's good(?) that the audio's only truncated on one end, and I presume due to that same bug I was referring to, but still a bit unfortunate, it's been around a while with no resolution... You're of course right in saying there are two possible stresses, and I do mean to record the other one as well. The only reason I didn't do so already is because the workflow on the LinguaLibre site, where I do the recordings, generated a bunch of words, but since it's just basing their generation off of normal orthography (where the accent isn't differentiated), it only provisioned a single recording option for вино. I'll go in and do it separately in a moment, I think. Thanks for the feedback, Kiril kovachev (talk) 12:05, 13 July 2023 (UTC)
- The truncation is often just some kind of random interaction between the player app and other processes. I sometimes get the same audio both with and without the truncation when I click to play multiple times in a row. Chuck Entz (talk) 14:00, 13 July 2023 (UTC)
Test cases for bg-hyph
Hi! You can find some test cases for {{bg-hyph}} in my sandbox. Things look good - there was only one case which provided a wrong hyphenation. I might add a few more, e.g. for compound words separated by spaces rather than a hyphen, e.g. интернет страница. Chernorizets (talk) 21:27, 25 July 2023 (UTC)
- @Chernorizets Hey thanks for this! Would you mind if we swiped some of these onto the main test cases page?
- I also see you noticed the . workaround; unfortunately, I think it's impossible to know whether дж is meant to be kept separate or not, so the hyphenation can't ever be 100% correct without human assistance, if I'm not mistaken. That said, good point... I forgot to include that notice in the documentation. I went ahead and added it now, so thanks for the tip-off. Also—as long as you have the time for it, please feel free to make the changes yourself if you feel that they're necessary—don't feel obliged to wait for me. Kiril kovachev (talk) 07:41, 26 July 2023 (UTC)
- @Kiril kovachev feel free to steal as many of these test cases as you'd like ;) Chernorizets (talk) 20:24, 26 July 2023 (UTC)
- Thanks, it's already been done! Kiril kovachev (talk) 08:38, 27 July 2023 (UTC)
Future expansions of the anagram bot's word list
Hi,
Just wanted to make sure you were aware that the Bulgarian Academy of Sciences makes the dictionaries of the Bulgarian National Corpus available for download online: https://dcl.bas.bg/bulnc/en/dostap/retchnitsi/. As with any example of corpus linguistics, the dictionaries are derived from a large body of Bulgarian text spanning several decades. Some of the downloadable resources include per-style and all-inclusive wordlists ordered by decreasing frequency, and they end up containing both lemma and non-lemma forms (although not necessarily exhaustively in the case of non-lemma forms). Of course, care must be taken when using such wordlists, because they will "remember" commonly misspelled words, foreign scripts, etc. One way to control for that is to only consider a subset of the full wordlist where the number of occurrences of a word is above some threshold, and where the words are made up of an allowable set of characters.
Here's a sample from the top 300 terms by frequency in GENERAL_lemma__byFreqDes.txt:
всеки 1175903
при 1136745
си 1083023
през 1069297
след 1031026
трябва 996209
има 985691
кажа 970027
бъда 960766
ли 896523
година 890084
* 889809
ако 884788
бе 872103
когато 832338
само 825752
със 815285
два 803170
много 797271
време 741035
още 729926
така 719594
страна 685246
във 669156
какво 638578
1 637916
нещо 601619
както 584887
знам 582589
Independently, in a future iteration, we should probably make some choices w.r.t. how non-alphabetic characters are handled for the purpose of anagram formation. For example, the bot determined that an anagram of таз (taz) is таз-, which looks a little weird at first, but it sort of depends on whether the wordlist contains e.g. prefixes, suffixes and/or abbreviations. That answers questions like whether something like пран is an anagram of the shortening напр. for например.
None of those are urgent questions, and I'm fine with the status quo, but it once again reinforces the importance of how the wordlist is selected - what's in, and what's out. The code layer should be able to handle whatever is in our chosen wordlist, per whatever criteria we put in place.
Cheers,
Chernorizets (talk) 05:03, 4 August 2023 (UTC)
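The filtering idea described in this post (keep only words above a frequency threshold that consist of an allowable set of characters) could look roughly like the sketch below. It assumes one "word count" pair per line, as in the sample above; the function name, the threshold value, and the exact character set are hypothetical choices made for illustration.

```python
import re

# Assumed allowable characters: lowercase Cyrillic letters, hyphen and space.
ALLOWED = re.compile(r"^[а-яѝ\- ]+$")

def filter_wordlist(lines, min_count=1000):
    """Keep words seen at least min_count times that use only allowed characters."""
    kept = []
    for line in lines:
        # Split "word count" on the last run of whitespace.
        parts = line.rsplit(None, 1)
        if len(parts) != 2 or not parts[1].isdigit():
            continue  # skip malformed lines
        word, count = parts[0].lower(), int(parts[1])
        if count >= min_count and ALLOWED.match(word):
            kept.append(word)
    return kept
```

Applied to the sample, entries such as "*" and "1" are dropped because they fall outside the allowed character set, while ordinary words like всеки and знам are kept.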
- @Chernorizets Thanks. This looks like a valuable expansion for future runs. Also, I reckon the addition of таз- as an anagram of таз is mistaken, as we later (after the bot finished running) discussed that we should exclude words from being anagrams if they normalise to the same (as those two do). I'll figure out a way to revert all of those for now, and put them as see-alsos instead.
- With regard to the wordlist we choose to use, you're right in that there are going to be a lot of cases to consider, which we're now trying to discuss for English; with the sheer size of these data sets, it's basically impossible to proof the entire thing, so we need to make sure it's truly very sound before expanding into more complex data sets. We were fortunate that Chitanka features basically exclusively alphabetic characters + "-" and " " at most. I suggest in this case pruning all correct, Cyrillic-script entries in the list, which would leave only the edge cases behind, which would allow us to see what we're dealing with.
- Anyway, thanks for this development, Kiril kovachev (talk・contribs) 11:46, 4 August 2023 (UTC)
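The rule mentioned in the reply above (words that normalise to the same form should not count as anagrams of each other) can be sketched as follows. This is not the bot's actual code; the normalisation used here (lowercasing and dropping non-letters) and the function names are assumptions made for illustration.

```python
from collections import defaultdict

def normalise(word):
    """Lowercase and drop everything that is not a letter (hyphens, dots, spaces)."""
    return "".join(ch for ch in word.lower() if ch.isalpha())

def anagram_groups(words):
    """Group words by their sorted letters, skipping words with the same normalised form."""
    groups = defaultdict(dict)
    for word in words:
        norm = normalise(word)
        key = "".join(sorted(norm))
        # Keep only the first word seen for each normalised form, so that
        # "таз" and "таз-" are never listed as anagrams of each other.
        groups[key].setdefault(norm, word)
    return [sorted(forms.values()) for forms in groups.values() if len(forms) > 1]
```

Under this particular normalisation, пран and напр. would still be grouped together, so whether abbreviations belong in the wordlist remains a separate policy decision.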
Hey, would you mind providing an audio recording for this word? I created the entry and nominated it for Foreign Word of the Day - it has IPA but it would be even better with an audio track. My recording situation is... meh right now. Thanks! Chernorizets (talk) 06:51, 4 August 2023 (UTC)
- You got it :) I'm just now going to sit down to record so no problem! (But it's not like my setup is anything to write home about, either...) Kiril kovachev (talk・contribs) 11:47, 4 August 2023 (UTC)
- @Kiril kovachev tenkju tenkju :) Chernorizets (talk) 11:57, 4 August 2023 (UTC)
- @Chernorizets It's been done now :p I put it on the page as well. Also, a heads-up about the usage of {{bg-hyph}}: if you don't give any arguments, it'll just use the page title as the argument, so that'll save you a bit of typing if you want to omit it. Unless you need to do a respelling, which ought to be quite rare I reckon. Kiril kovachev (talk・contribs) 13:31, 4 August 2023 (UTC)
- BTW, I feel like I can't stop saying the -ка at the end of гадулка almost ejectively, could you confirm that the audio's adequate for that IPA? Kiril kovachev (talk・contribs) 13:33, 4 August 2023 (UTC)
- @Kiril kovachev I don't hear an issue with the -ка. What I do notice is that you, just like me, sometimes pronounce "л" like /w/ :-) That was the main reason I didn't want to try and record this. If I've misheard it and you said it with an L - apologies, and no further action needed. Otherwise, if you could rerecord it with a clearer L (without sounding strained), that would be ideal. Thanks! Chernorizets (talk) 20:19, 4 August 2023 (UTC)
- @Chernorizets Sure thing, I can re-record no problem ^^ it's just I was kind of unsure about the difference between /w/ vs /ɫ/, and I think I literally always say /w/... I don't think I know what standard /ɫ/ actually sounds like. I thought it was just an allophone of /w/ pretty much. I don't know what to do with the other recordings I've done, because I know there're loads of others where I make the same deviation, and I don't know whether I should be labelling the audios in some way to show it if it's somehow atypical... Kiril kovachev (talk・contribs) 20:24, 4 August 2023 (UTC)
- In fact, I think I often say /w/ instead of /ɫ/ in English as well... what a bother xD Kiril kovachev (talk・contribs) 20:25, 4 August 2023 (UTC)
- @Kiril kovachev many people have that allophone of /ɫ/. As the Bulgarian Phonology article explains:
- Furthermore, in the speech of many young people the more common and arguably velarized allophone of /l/ is often realized as a labiovelar approximant [w]. This phenomenon, sometimes colloquially referred to as мързеливо л ('lazy l') in Bulgaria, was first registered in the 1970s and isn't connected to original dialects.
- If this is part of your speech, I wouldn't go rerecord a bunch of words, because you probably wouldn't end up with "natural-sounding" recordings (I know, from experience, how hard it can be to say L). As a former stage actor, I did have to go to a логопед to fix this, but she gave up on me after several weeks haha There are few of us around here with the ability and desire to record Bulgarian words, so unless we get someone on board who doesn't have this variant pronunciation, we should be good with what we have. Chernorizets (talk) 20:35, 4 August 2023 (UTC)
- Ahaha, that's interesting, I'm glad to share your pain at least... Were you still able to act, in the end? I didn't think it was so necessary to use standard speech.
- Even if I can't change it everywhere, I'll do my best for the FWOTD at least :') As for the /w/ allophony trend, although I haven't paid the greatest attention, I'm positive my parents both say it that way in quick speech, as do my cousins and even my grandmas, if I'm not mistaken... I seem to internally "hear" it as /w/, too—I guess from my lack of consciousness about the difference. Interesting stuff, very interesting! Thanks for informing me, Kiril kovachev (talk・contribs) 20:46, 4 August 2023 (UTC)
Reconstructions in БЕР
Hello, I have just removed your comment at лодка (lodka) about the reconstruction *aldьji given in БЕР. In essence, it is equivalent to the one that exists on Wiktionary. The difference is that БЕР reconstructs according to the phonetic rules of Early Proto-Slavic, whereas Wiktionary (following ЕССЯ) gives it according to the rules of Late Proto-Slavic. In particular, early *a, *ā correspond to late *o, *a. The reason for this discrepancy is that, in the course of their evolution, the Slavic languages turned quantitative distinctions between vowels into qualitative ones. Безименен (talk) 11:02, 4 August 2023 (UTC)
- @Bezimenen Thank you for this addition. I wasn't sure whether the reconstruction was simply a mistake or a variant form of the "official" one, and since I don't know Proto-Slavic very well, I had no way of confirming it. I'm sorry that I wrote that it was an "aberration"; evidently I still need to learn more about the subject. But thanks for the correction, Kiril kovachev (talk・contribs) 11:52, 4 August 2023 (UTC)
Our first rhymes page
Rhymes:Bulgarian/uɫkɐ ;-) I figured you'd find this interesting :)
Chernorizets (talk) 05:58, 6 August 2023 (UTC)
- @Chernorizets Absolutely! :') Thanks for making the page, and may it be the first of a long series...! Kiril kovachev (talk・contribs) 17:48, 6 August 2023 (UTC)
Hi Kiril, the indentation of stock exchange on the борса page looks a bit odd. Cheers, Simon. SimonWikt (talk) 15:58, 8 August 2023 (UTC)
- @SimonWikt Thanks for letting me know. I guess I wrote it a little unclearly; it was supposed to show that a stock exchange is meant to be included under the general "exchange" meaning, but with the example sentence there too, it made it kind of confusing. I did some general improvements now, and also moved the stock-exchange sense to the top-level definition; is that better? Kiril kovachev (talk・contribs) 16:11, 8 August 2023 (UTC)
- Cool! Looks good now :) SimonWikt (talk) 16:23, 8 August 2023 (UTC)
template editor
Hi, you should probably have template editor status, so you can edit protected Lua modules. If you are interested, you can post in the Beer Parlour requesting this; I will support you. Benwing2 (talk) 21:53, 9 August 2023 (UTC)
- I'm interested—I'll go ahead and make a post for that. Thanks for your support :) Kiril kovachev (talk・contribs) 22:05, 9 August 2023 (UTC)
Lemme know if you find it useful :-) Chernorizets (talk) 22:18, 11 August 2023 (UTC)
- @Chernorizets Ah, thanks, that's a valuable one for OCS-related terms. And much bigger than the online histdict they have from the University of Sofia! Kiril kovachev (talk・contribs) 22:20, 11 August 2023 (UTC)
This Romanian word comes from Bulgarian, but the accent of the original etymon is missing, could you add it whenever you can? Rodrigo5260 (talk) 23:13, 16 August 2023 (UTC)
- @Rodrigo5260 Sure, done. Kiril kovachev (talk・contribs) 13:50, 17 August 2023 (UTC)
Hi Kiril, regarding your recent edit, I'm curious as to why a monosyllabic word with one vowel requires a stress indicator. Is this a feature of the pronunciation module? Should I be doing this for all such words? I believe I have added some monosyllabic Bulgarian entries, I will go back and check. Thanks, Simon SimonWikt (talk) 06:20, 19 August 2023 (UTC)
- @SimonWikt sorry to step in, but I assume @Kiril kovachev made the change because this is a multi-word term. Single-word monosyllabic terms don't require a stress mark, but multi-word terms (regardless of syllable count) now do, after some recent changes. Kiril can provide the details. Chernorizets (talk) 09:55, 19 August 2023 (UTC)
- @Chernorizets Ah, thanks for letting me know. SimonWikt (talk) 10:07, 19 August 2023 (UTC)
- @SimonWikt That's right, we made a bug fix plus a general change to the pronunciation template so that it now reduces single words in a multiword expression if they don't have an explicit stress. If you now tried removing the stress, it would try to reduce the vowel in хляб (hljab), which is not right. So don't worry about doing this for all monosyllabic entries - it's only required for monosyllabic words that are part of a multiword term. I apologise if this is inconvenient - currently thinking of whether this reduction can be simplified to only cover certain words: it's really most applicable to short function words such as да (da), which mustn't be stressed, even though they're monosyllabic, whereas хляб would obviously never be reduced. I'll keep you updated if there's a change again, and sorry for the confusion, Kiril kovachev (talk・contribs) 10:26, 19 August 2023 (UTC)
- @Kiril kovachev Thanks for the information, I'll try to remember this in future 😀 SimonWikt (talk) 10:35, 19 August 2023 (UTC)
- P.S. @Kiril kovachev No need to apologise! I'm a newbie and I'd rather not make edits that other people have to correct! That said, I'm glad that someone is checking my edits, I know that I have made mistakes! :) Thank you and @Chernorizets both, I appreciate your support. SimonWikt (talk) 10:58, 19 August 2023 (UTC)
- @SimonWikt Thanks, I very much appreciate your edits. We all make mistakes, so you especially have nothing to apologise for here! In this case it's not your fault at all, it's our module change that caused this breakage, so I do feel sorry for suddenly bending the rules whilst you're still getting used to them. Anyway, I'm glad we cleared it up, and have a great day, Kiril kovachev (talk・contribs) 11:41, 19 August 2023 (UTC)
- @Chernorizets Thanks for explaining in my place :) Kiril kovachev (talk・contribs) 10:27, 19 August 2023 (UTC)
Hi Kiril, sorry to trouble you, I am now confused! I put the stress on 'за' in the IPA definition as a result of our conversation above, about monosyllabic words that are part of a multiword in the IPA entry. I obviously misunderstood! Could you please enlighten me? Thanks, SimonWikt (talk) 08:55, 21 August 2023 (UTC)
- @SimonWikt Hi, sorry, I was far too vague above. What I meant up there was that all words without an accent are automatically reduced by the template, so if a word should specifically not be pronounced reduced (i.e. if it must be pronounced with its full vowel quality), only then do you need to put an accent on it. Thus, for words like хляб (hljab) in пълнозърнест хляб (pǎlnozǎrnest hljab) above, you would want to put an accent on it, or else it would end up as [pɐɫnoˈzɤrnɛst xlʲɐp] instead of [... ˈxlʲa̟p]. Do you see the difference?
- Basically, for all content words, like nouns, verbs, and adjectives, you almost always want the stress, because the word is meant to stand alone; but words like я, му (short forms of the pronouns), за, със, от, etc. (monosyllabic prepositions) aren't independent, but rather rely on the surrounding words for meaning, and so are not stressed (because their meaning isn't what's important, but the things that they relate in the sentence). The only time за, със, etc. would be stressed is if the speaker is trying to emphasize their meaning, but usually that isn't the case. In кърпа за лице, за's function is to relate the purpose of the towel to the face, so in this case the word за itself isn't being emphasized and shouldn't be stressed. If you're familiar with English's weak forms, then I'd suggest these words in Bulgarian are a nearly perfect mapping onto those. E.g. in English, "the" is pronounced as [ðə] most of the time, but when said with emphasis, it's /ˈðiː/. So here in Bulgarian, we're effectively faced with the same situation. If you want, I can try to compile a subset of words with this weak property so that you don't wonder which ones are meant to be included — fortunately there aren't that many, though. I hope that's more understandable, Kiril kovachev (talk・contribs) 13:12, 21 August 2023 (UTC)
- Thanks @Kiril kovachev, I think I now have a better understanding! To be honest, the nuances are a bit beyond me, so if in doubt I'll put an rfp in the Pronunciation section :) SimonWikt (talk) 18:23, 21 August 2023 (UTC)
- @SimonWikt Alright, I'm glad if that helped in any way :) I'll be happy to see your rfps, just @ me in the edit summary and I can check it if you need. Kiril kovachev (talk・contribs) 18:28, 21 August 2023 (UTC)
- P.S. I have also removed the stress in the IPA from 'за' in четка за зъби and паста за зъби. I hope that this is correct and that I am not creating more problems than solutions! SimonWikt (talk) 18:29, 21 August 2023 (UTC)
- Quite right, those are the same situation as кърпа за лице, so good move! @SimonWikt Kiril kovachev (talk・contribs) 18:30, 21 August 2023 (UTC)
Bulgarian pronunciation module
I see you're developing a Bulgarian pronunciation module! This is great. I should note that rhymes are typically the stressed syllable to the end. Is this tradition different in Bulgarian? I'm asking because the examples in the sandbox do not follow this. Vininn126 (talk) 13:59, 21 August 2023 (UTC)
- @Vininn126 Rather, I'm just building on Benwing's existing work :) if the conversation interests you, we're talking actively on Module talk:bg-pronunciation. Regarding rhymes, what do you mean? It was my understanding that the "rhyme" (what we pass to the rhyme template) is the stressed syllable, minus its onset (so get rid of the consonants preceding the vowel), plus the entire rest of the word after that. For the first IPA, [t͡ʃo̟ˈvɛk], this would be the ɛk, and for the second (fictive, that stress is wrong but I put it there to test the rhyme generation), it should be [ˈt͡ʃɔvɛk] -> ɔvɛk. As far as I understand, you need to remove the consonants preceding the stressed vowel, e.g. map and trap would both rhyme as -æp, which makes sense because we know these words rhyme. Please correct me if I'm misunderstanding what you mean. Kiril kovachev (talk・contribs) 14:31, 21 August 2023 (UTC)
- That seems correct then. It seemed that I saw the accent on the first syllable but the rhyme was from the second. All seems on the up and up, keep up the good work! Vininn126 (talk) 15:01, 21 August 2023 (UTC)
- @Vininn126 Right, nice, thanks for keeping an eye out! If I do mess something up, please keep on criticizing it so that we get it right :) Kiril kovachev (talk・contribs) 16:48, 21 August 2023 (UTC)
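The rhyme rule described in this thread (stressed syllable minus its onset, plus the rest of the word) can be sketched as below. This is only an illustration, not the code in Module:bg-pronunciation: the function name `rhyme` and the small `VOWELS` set are assumptions made for the example, and a real implementation would need a fuller vowel inventory.

```python
# Hypothetical helper, not taken from Module:bg-pronunciation.
# VOWELS is a deliberately small, assumed set of base vowel symbols.
VOWELS = set("aeiouɛɔɤɐæəɪʊ")

def rhyme(ipa: str) -> str:
    """Return everything from the stressed vowel to the end of the word."""
    s = ipa.strip("[]/")              # drop surrounding IPA brackets or slashes
    # Take the part after the last primary stress mark; if there is no
    # stress mark, rpartition leaves the whole string in `tail`.
    _, _, tail = s.rpartition("ˈ")
    for i, ch in enumerate(tail):
        if ch in VOWELS:              # skip the onset consonants
            return tail[i:]
    return ""                         # no vowel found

print(rhyme("[t͡ʃo̟ˈvɛk]"))   # ɛk
print(rhyme("[ˈt͡ʃɔvɛk]"))    # ɔvɛk
```

With the English examples from the thread, `rhyme("ˈmæp")` and `rhyme("ˈtɹæp")` both give `æp`.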
Quick FYI
Just FYI, I'll be traveling internationally the first half of September, so I probably won't be active on here during that time. Keep holding down the fort ;-) Chernorizets (talk) 04:59, 2 September 2023 (UTC)
- @Chernorizets Safe travels, and do enjoy! I shall do my best in your absence :) I myself will be heading to university starting the second third of September, so who knows what will happen to my regularity after that... here's hoping it won't affect all that much. We've done some good work this summer, at least :) again, hope you have a good travel, Kiril kovachev (talk・contribs) 12:16, 2 September 2023 (UTC)
- @Kiril kovachev thanks! :) Chernorizets (talk) 21:34, 2 September 2023 (UTC)
Discord
In case you're interested, a number of Wiktionary editors communicate regularly over Discord. Details on how to join the Wiktionary server can be found on Wiktionary:Discord server. As it happens, the #balto-slavic channel is one of the most active (if not the most active) channels on the server. There is also a #japonic channel. Just FYI :-) Chernorizets (talk) 07:20, 18 November 2023 (UTC)
- @Chernorizets Ah, that's nice, I hadn't really paid this much attention before - I now created an account. Looking forward to seeing what discussion goes on! Kiril kovachev (talk・contribs) 13:26, 18 November 2023 (UTC)
Hi @Kiril kovachev, would you mind checking the audio on мравка? I can't hear the "r", is that how it is said? Thanks SimonWikt (talk) 20:05, 17 December 2023 (UTC)
- @SimonWikt Hi Simon, it should definitely be pronounced, but you can't hear it so clearly on that recording. I'll try to get that re-recorded. My "r" in this case is overlapping a little with the "m", so you can't necessarily distinguish it so clearly. Thanks for identifying this hole — I'll be sure to put a better recording on. Kiril kovachev (talk・contribs) 20:34, 17 December 2023 (UTC)
- OK, thanks Kiril. SimonWikt (talk) 20:43, 17 December 2023 (UTC)
Mini-project: improving verbal noun entries
Hi! If you look at Category:Bulgarian verbal nouns - which now has 400+ items instead of the 11 it had before your bot run - a lot of them are missing declension tables, additional senses, references, etc. This is sort of to be expected due to the way they were being created from verb conjugation tables. It's not an immediate thing, and maybe it just falls under the BLIP umbrella, but at some point we should give them some love. I think a divide-and-conquer approach would be saner than one person trying to go thru all of them. Again, not a pressing thing - just FYI. Cheers! Chernorizets (talk) 06:43, 18 December 2023 (UTC)
- @Chernorizets Indeed, we might want to organize that somehow, e.g. you and I take some 200 each or something if you like. I don't insist on making it immediate either — maybe after we run through any more pronunciation things in the immediate future. Good idea. Kiril kovachev (talk・contribs) 20:52, 18 December 2023 (UTC)
alphabetical order
Thanks for this note. I appreciate it a lot. I'm not gonna change my ways though, as bots can do this shit. Fond of sanddunes (talk) 19:39, 2 January 2024 (UTC)
- @Fond of sanddunes Haha, quite right. Thanks for your thanks. Have a great day, Kiril kovachev (talk・contribs) 19:40, 2 January 2024 (UTC)
Bulgarian audio
здрасти @Kiril kovachev, Thanks for your recent audio additions 👍 Are you working through https://en.wiktionary.org/wiki/Category:Requests_for_audio_pronunciation_in_Bulgarian_entries? SimonWikt (talk) 16:00, 13 April 2024 (UTC)
- @SimonWikt Hey, sorry, I actually didn't know about that! Or maybe I did, but totally forgot. In principle I was just going on LinguaLibre and letting it generate entries from , but I'm sorry, it would make much more sense to prioritize those first! The way I saw your requests was actually through total coincidence, and it didn't occur to me to look for any others there might be...
- I'm actually in Bulgaria right now, but I'll be back on Wednesday and from then on I'll look in RFP first in case there's more stuff to do :) Hopefully I'll have some measure of time in the next few weeks for all that. In fact I'm now also remembering I forgot to re-record мравка (mravka) — so sorry about that...!
- Thanks for the reminder, and thanks for all your additions recently. It's great to have fellow Bulgarian editors around :) Kiril kovachev (talk・contribs) 21:55, 13 April 2024 (UTC)
- Здрасти @Kiril kovachev I hope you enjoyed your trip to Bulgaria, you'll have to come and visit next time!
- Could you check the audio for те please, sounds like it is intended for a different word!
- Благодаря, SimonWikt (talk) 21:39, 25 April 2024 (UTC)
- Hi @SimonWikt, I did check it out, but it sounds fine to me - what word does it sound like to you? Also, thanks for the kind words, I did indeed enjoy my time, and I plan now to record some more words tomorrow morning :) Kiril kovachev (talk・contribs) 22:33, 25 April 2024 (UTC)
- Здрасти @Kiril kovachev. On my devices the 't' sounds like a 'd' and there is some artefact of the recording at the end of the 'e' that sounds like the beginning of another letter, when I first heard it I thought it was a pronunciation of 'де' or 'ден'.
- Maybe it's just the clarity of the audio, your more recent recordings have more volume and are much clearer than the ones from last summer.
- Keep up the good work!
- SimonWikt (talk) 06:49, 26 April 2024 (UTC)
- @SimonWikt Haha, alright, I'll re-record this one then, in that case. I can't personally hear anything after the 'e', but my audio might not be very well set up. But, since you did have a setup-related audio issue before, maybe try it again in a different environment?
- My other comment is that the reason for the similarity to де (de) is probably caused by the lack of aspiration, whereas in English 't' is strongly aspirated and 'd' not. I don't know if that totally accounts for the discrepancy though, so I'll go ahead and make it clearer :)
- Also, concerning the volume, I'm thinking (probably once I finish Category:Bulgarian lemmas) of trying to run an audio cleanup bot, which can normalize the volumes of all the recordings, check for artefacts maybe, etc., since it's a shame there are quite a few with not-so-great quality, lots of different volumes over all the recording sessions, and so on. Thanks for bearing with me till then :') Kiril kovachev (talk・contribs) 08:56, 26 April 2024 (UTC)
- @SimonWikt I've now recorded most of the requests for audio, although due to the way I'm adding them there will sadly have to be some delay before I can update the pages - I haven't yet configured the bot to remove the RFA when it's adding audio, so it'll end up pasting that on top of/before/after the request without replacing it; but you can still find all the words if you'd like at [1]. Also, I record them in batches, but there were a few that I wasn't happy with, so I'm now gonna re-record those ones, so they'll come out soon. For similar technical reasons I can't re-record те/мравка till after the rest have been added... Kiril kovachev (talk・contribs) 09:31, 26 April 2024 (UTC)
- Now they're all done but разг., for which I don't know if it's right to pronounce it as [rask], as I've never heard this pronounced TBH. Just as how 'n.' is pronounced 'noun', I imagine others usually pronounce the whole word too (which I do in my head too), but I don't know. Kiril kovachev (talk・contribs) 09:40, 26 April 2024 (UTC)
- @Kiril kovachev I probably shouldn't have bothered adding a pronunciation section for разг. I notice that most abbreviations don't have them although some do, like г. which has the full word as the pronunciation.
- It probably makes sense to adopt a standard approach, should we leave out the pronunciation section for abbreviations, put a pronunciation section with the whole word, or something else? What do you think?
- Let's get some feedback from @Chernorizets as well :)
- SimonWikt (talk) 18:16, 27 April 2024 (UTC)
- @SimonWikt My preference would be to make the pronunciation section read something like "Same as разговорен.", potentially linking the same audio as appears on the full word, but I can't say what's best. Kiril kovachev (talk・contribs) 20:09, 27 April 2024 (UTC)
- Hi @SimonWikt, I would refrain from adding pronunciation sections to abbreviated words, unless there is a notable/common pronunciation other than the full word.
- As for the specific example of "разг." and others like it, please be sure to use the correct POS header. I forget if we have one for abbreviations specifically, but if not - the actual full form you've provided is the adverb (= "colloquially") rather than the adjective (= "colloquial"). Chernorizets (talk) 00:48, 29 April 2024 (UTC)
- Hi @Chernorizets, thanks for your input. I have removed the pronunciation section from разг.
- I am confused by your POS comment, according to Georgiev, Vladimir I., editor (1971), “разг.”, in Български етимологичен речник [Bulgarian Etymological Dictionary] (in Bulgarian), volume 1 (А – З), Sofia: Bulgarian Academy of Sciences Pubg. House, →ISBN, column 1, page X разг. is an abbreviation of разговорно, which according to “разговорно”, in Речник на българския език [Dictionary of the Bulgarian Language] (in Bulgarian), Chitanka, 2010 and * “разговорно”, in Речник на българския език [Dictionary of the Bulgarian Language] (in Bulgarian), Sofia: Bulgarian Academy of Sciences, 2014 is the third person singular of the Adjective разговорен. Where am I going wrong?
- Thanks, SimonWikt (talk) 07:31, 29 April 2024 (UTC)
- @SimonWikt разговорно (razgovorno) is both the neuter singular form of the adjective разговорен (razgovoren), as well as the base form of the adverb derived from that adjective. I believe the intent of this and other similar abbreviations is to convey adverbs, so that when it's applied to word sense definitions, it's equivalent to saying that the word in that sense is "(used) colloquially".
- Things like разг. (razg.) in Bulgarian dictionaries correspond to labels on Wiktionary, like {{lb|colloquial}}. As you can see in Module:labels/data, this is aliased to {{lb|colloquially}} - the adverb form of the label. So you have a choice for the full form of the abbreviation - if you want to treat it as an adjective, the full form should be the adjective base form, which is always masculine singular - разговорен (razgovoren). If you want to treat it as an adverb, the full form should be the adverb base form - in this case разговорно (razgovorno). I think it should be the latter.
- You may want to add a "Usage notes" section to indicate that this particular abbreviation is used in dictionaries to label colloquial words or word senses. Not a requirement, but just a suggestion. Chernorizets (talk) 11:01, 29 April 2024 (UTC)
- @Kiril kovachev Hi Kiril, great work, thanks!
- SimonWikt (talk) 18:06, 27 April 2024 (UTC)
- My pleasure :) Kiril kovachev (talk・contribs) 20:07, 27 April 2024 (UTC)
- @SimonWikt Hey Simon, I think it's sorted now. It turned out not so big of a deal, I just had to remove the bits where there was an {{rfap|bg}}, but it's now been handled. There are a few entries where the bot couldn't automatically insert the audio, but I will now go through them quickly. Please post more if you need them, Kiril kovachev (talk・contribs) 15:14, 8 July 2024 (UTC)
- Alright, the ones requiring manual attention are done, but there are some whose variant audios I haven't recorded yet. I am still thinking about how to do this in general in my current setup, which is only optimal for words that have a single possible stress, but I can try to concretely focus those words in RFAP at least. Kiril kovachev (talk・contribs) 15:23, 8 July 2024 (UTC)
- Hi @Kiril kovachev, that's great! I'll check them out. I have done a few more rfap's since we last spoke, but not many.
- Thanks SimonWikt (talk) 05:48, 9 July 2024 (UTC)
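For illustration, removing the audio request while adding a recording could look roughly like the sketch below. It is not KovachevBot's actual code: the function name `strip_audio_request` is hypothetical, and it assumes the request appears on its own line in exactly the form `{{rfap|bg}}`, optionally preceded by list markers. Requests written with extra parameters would not be matched.

```python
import re

def strip_audio_request(wikitext: str, lang: str = "bg") -> str:
    """Remove lines consisting only of an {{rfap|<lang>}} request.

    Assumption: the request has no extra parameters and sits on its own
    line, optionally after list markers such as "*" or ":".
    """
    pattern = re.compile(
        r"^[*:]*[ \t]*\{\{rfap\|" + re.escape(lang) + r"\}\}[ \t]*\n?",
        re.MULTILINE,
    )
    return pattern.sub("", wikitext)

# Placeholder page text; the second list item stands in for whatever
# audio line the bot adds.
page = "===Pronunciation===\n* {{rfap|bg}}\n* (audio line added by the bot)\n"
print(strip_audio_request(page))
```

Requests for other languages are left untouched, since the language code is part of the pattern.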
Tagalog Anagrams
@Kiril kovachev @KovachevBot If you don't mind, can you also generate anagrams automatically with your bot for the Tagalog language? Thank you! 𝄽 ysrael214 (talk) 13:06, 30 June 2024 (UTC)
- @Ysrael214 Hi, I can try! :) Is there anything I should look out for in Tagalog, or can I basically treat it like English when generating the anagrams? Kiril kovachev (talk・contribs) 13:55, 30 June 2024 (UTC)
- @Kiril kovachev I think just treat it like English. If I see exceptions, I'll edit them and let you know :) Thanks! 𝄽 ysrael214 (talk) 13:59, 30 June 2024 (UTC)
- @Ysrael214 Alright then, I'll let you know when it's started running! ^^ Kiril kovachev (talk・contribs) 14:15, 30 June 2024 (UTC)
- @Ysrael214 One problem though, I can't read Baybayin — I saw you made several entries in it, but should this be an issue at all? The bot will still sort all the Baybayin letters like English ones for the purpose of anagrams, but if there are any special symbols (e.g. there appear to be certain diacritics, like on ᜐ᜔, which I think I don't currently handle), it will probably think that two things aren't anagrams when they actually should be. This isn't catastrophic, as at least it won't generate any false anagrams, but what should I do about it? Kiril kovachev (talk・contribs) 14:24, 30 June 2024 (UTC)
- @Kiril kovachev Oh can you exclude the Baybayin for now? Just the ones with Latin letters. 𝄽 ysrael214 (talk) 14:25, 30 June 2024 (UTC)
- @Ysrael214 no problem then :) Kiril kovachev (talk・contribs) 14:37, 30 June 2024 (UTC)
- @Ysrael214 Okay, it has run a little bit, looking fine so far, you can see its results on sumagot, pistola, paya, Pisis, kawing. Further considerations: it says on the Wikipedia page for Filipino alphabet (which AFAICT is the alphabet used in Tagalog now?) that 'ñ' and 'ng' are distinct letters, so I figure words shouldn't be considered anagrams if they break up 'ng'. I've also made a manual modification to re-join n + tilde into ñ, so that the diacritic doesn't get stripped when normalizing anagrams. I.e. basically ñ will not be turned into n by the script when considering anagrams. But I will need to work harder to figure out how to treat 'ng' correctly — am I correct in the first place to treat these letters like this, or should I be applying the previous English logic to them still? Kiril kovachev (talk・contribs) 15:16, 30 June 2024 (UTC)
- Also g with tilde as well, as in bong̃a. Kiril kovachev (talk・contribs) 15:17, 30 June 2024 (UTC)
- For ñ that is correct, they're a separate letter, for ng, please treat them like they're separate letters n and g. Reasoning for that is ng can be pronounced either /ŋ/ or /ŋg/, which makes it not considered a letter. For this algorithm at least. So sanga can be an anagram for gansa. If you like, you can send me a .txt of the words with "ng" processed. 𝄽 ysrael214 (talk) 15:23, 30 June 2024 (UTC)
- @Kiril kovachev Forgot tag. 𝄽 ysrael214 (talk) 15:23, 30 June 2024 (UTC)
- @Kiril kovachev Yes please treat "ng" like English as if "sing" has an anagram "nigs". 𝄽 ysrael214 (talk) 15:27, 30 June 2024 (UTC)
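The normalization agreed on above (ñ stays a distinct letter, "ng" counts as plain n + g, and words are otherwise treated as in English) can be sketched as follows. This is an assumed reconstruction, not the bot's real code: the name `anagram_key` is hypothetical, and dropping spaces, hyphens, apostrophes and all diacritics other than the tilde on n is a guess based on the pairs listed in this thread.

```python
import unicodedata

def anagram_key(word: str) -> str:
    """Sorted-letter key: two words are anagrams if their keys are equal."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    letters = []
    for ch in decomposed:
        if ch == "\u0303" and letters and letters[-1] == "n":
            letters[-1] = "ñ"       # re-join n + combining tilde into ñ
        elif ch.isalpha():
            letters.append(ch)      # keep letters; "ng" stays as n and g
        # spaces, hyphens, apostrophes and other combining marks are dropped
    return "".join(sorted(letters))

print(anagram_key("sanga") == anagram_key("gansa"))   # True
print(anagram_key("ñ") == anagram_key("n"))           # False
```

Under these assumptions, spelling variants such as "walang-bayag" and "walang bayag" share a key, which matches how they are grouped in the list in this thread.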
- @Ysrael214 OK, here is a list of all the anagrams that might be affected by "ng", but no problem, I will treat it like separate letters anyway.
- kawing, wangki
- Gabon, bong̃a, bango, bagon, Agbon
- sungko, sukong
- panutog, patungo
- kalingaan, kailangan
- nanaginip, panginain
- yanga, yaang
- banig, Abing, bagin
- Kumintang, kumintang
- walang-bayag, walang bayag
- hilagang-silangan, hilagang silangan
- Pang., pang-, Pang
- na'ng, nang
- walang-pakundangan, walang pakundangan
- sangko, sakong
- dingding, ngidngid
- hangad, hagdan
- bangsi, bangis
- lingo, ilong, Liong, ngilo
- miting de-abanse, miting de abanse
- lagyan, Ylagan, ngalay
- sagingan, sangagin, sinangag
- dalampasigan, samaing-palad
- panukalang-batas, panukalang batas
- linggo, gilong, Linggo
- gatang, tangga
- gapang, pagang, agpang
- lingap, laping, paling, pingal, pangil
- gising, singgi, siging
- sunugin, suungin
- piing, ginip
- manginsulto, mang-insulto
- sagansan, sasangan
- tangina, ginatan
- sumulong, Sumulong
- magkataling puso, magkataling-puso
- bangga, bangag, bag-ang, bagang
- walang-kaluluwa, walang kaluluwa
- anghang, hanggan
- tingni, tingin
- tinig, iting, tingi, ingit, ngiti
- singli, singil, Lising
- usngal, sungal, sung-al, lugnas, lungas
- karunungan, Karunungan
- yangyang, ngayngay
- gansal, asngal, sangla, sangal, salang
- Panginoon, panginoon
- pagitan, pag-itan, patagin, pangati
- tangis, tignas
- sungi, ungis
- lanog, lango, galon
- saing, angis, sinag, sangi, Isang, singa
- panaog, angpao
- ningas kugon, ningas-kugon
- mangaliwa, maliwanag
- sangkal, langkas, saklang
- Gaston, tangos, gantso
- dagang puti, dagang-puti
- masinggan, magningas
- lusong, sulong, lungos
- kutong-aso, kutong aso
- Kongreso, kongreso
- kulalaying, Kulalaying
- walang bayad, walang-bayad
- tingga, tanggi, gitang, tigang
- Bangkal, bangkal
- pulong, pungol, punglo
- bantog, Tangob
- hingal, haling
- galong, lagong
- walang-sala, walang sala
- bunga, buang, Bunag
- kamanga, mag-anak
- anga, naga, gana, gaan
- laang, angal
- bugtong, Bugtong
- tabang, tab-ang, batang
- langkaan, Langkaan, anaklang
- dulang, lungad, lundag
- ingaw, gawin, ginaw, gaw-in, iwang
- hangal, halang
- gatlang, tanggal
- ingay, yanig
- Langgam, langgam
- tagain, tag-ani, tainga
- Gascon, Sangco
- hibang, bahing, gabhin
- ika nga, ikanga, 'ika nga
- alang-alang, ngalangala
- pusong mamon, pusong-mamon
- Emong, Monge
- Parang, parang
- awanggan, gawangan
- tugno, tungo, ungot, untog, gunot, tugon, tunog, utong
- sangandaan, Sangandaan
- Angeles, Senegal
- Ong, Ngo
- galang-galangan, galanggalangan
- -ng-, 'ng, NG, -ng, n.g., ng̃, Ng, ng, n͠g
- Sangalang, sangalang
- langka, angkla, kalang
- gitling, linggit, tinggil
- babang-luksa, babang luksa
- taping, patnig, pangit, pantig, Pantig, tap-ing, pating
- kandungan, kundangan
- labing-, labing, baling, bilang, langib, libang
- gayang, yagang, yangga
- hinang, hangin, hingan
- Ebanghelyo, ebanghelyo
- ungi, guni
- muang, umang
- magaling, Magaling
- kasing-, sangki, sikang
- banting, bintang
- walang-kain, walang kain, kalawangin
- tangay, Tanyag, tanyag
- lutong, tulong
- ping pong, pingpong
- bagalan, balanga, Balanga, Alabang
- abang, banga, gaban, baang
- gansa, angas, agnas, sanga
- galang, Galang
- tangka, angkat, katang
- ateng, tange, tenga
- ganap, panga, pag- -an, panag-, Gapan
- makiling, mangikil
- lansangan, Lansangan
- Sangley, sangley
- buteteng-laot, buteteng laot
- luwang, lungaw
- walang-hanggan, walang hanggan
- tumangis, masungit
- singga, sanggi, saging, sangig, sigang
- talibong, balingot
- ugong, Ugong, unggo
- Gining, nginig, ngingi, gining, ing-ing
- panget, ngetpa
- walang-bisa, walang bisa
- salubong, balungos
- litang, langit, taling
- busong, bungso, usbong
- tangkad, takdang
- tunggali, lunggati, taguling, tulingag
- dilang, Lingad, dingal
- Gina, angi, inag
- kulong, lukong
- gitna, ganti, ating, tangi, antig, tinga, ingat, tinag
- tong, Tong
- tango, angot, ganto, taong
- lingas, langis, islang, silang, Silang, lasing
- yangit, tiyang
- mang-udyok, mangudyok
- lungib, buling
- Manggahan, manggahan, maanghang
- suminga, sigmuan
- siping, pisngi
- abangan, banagan
- tungki, kuting
- gawan, awang, angaw, ngawa, waang
- sunog, nguso, ungos, sungo, usong, suong
- timbangan, Timbangan
- mamay-awang, mamayawang
- sungay, sugnay
- bungal, lubang
- mang-abat, magtaban, matabang, magbanta, matab-ang
- katulong, kulangot
- bulung-ita, tubig ulan, tubig-ulan
- sang-, sang
- mangayaw, Mangaway, may-angaw
- pambungad, Pambungad
- Gan, nag-, Ang, ang, nga
- pingga, paging-
- alingasngas, alisangsang
- bugnit, Buting
- dalang, dangal
- ango, naog
- tibagan, baitang
- lanting, Lanting
- ganta, tanga, angat, Gatan, Angat, tagan, atang, ngata
- nganga, ang-ang
- walang hiya, walang-hiya
- kusing, sungki
- wangis, singaw, siwang, sawing
- malunggay, maygulang
- dagang bahay, dagang-bahay
- kahinaang loob, kahinaang-loob
- yabangan, bangayan
- dayang, Dayang
- layanglayang, layang-layang
- sakang, angkas
- sumangayon, mag-anunsyo, sumang-ayon
- ngalngal, langlang
- lakdang, dangkal
- yungyong, nguyngoy
- palakas-tinig, isip-talangka
- Kang, kang
- kamuning, Kamuning
- kutong, tukong, tungko
- matang-lawin, matanglawin
- dunong, dungon
- lumbang, Lumbang
- nguya, ugnay
- sabangan, Sabangan
- hukbong panghimpapawid, hukbong-panghimpapawid
- Loleng, lelong
- pagong, pag-ong, gapong
- ganoon, Angono
- Pangan, pang- -an, pag- -nan
- ligwan, lingaw, waling, lawing
- buwang, bungaw
- bangko, bakong
- ngayon, ngay-on
- atungal, gulanta, tag-ulan
- tubig tabang, tubig-tabang
- Dalangin, dalangin
- uang, unga
- Magno, maong, among
- linga, angil, Linga, laing, Aling, ilang, aling
- siling labuyo, siling-labuyo
- tungag, tungga
- upang, punga
- yabong, bayong, bay-ong
- sahang, hasang
- walang-kuwenta, walang kuwenta
- maangal, alamang
- langgonisa, longganisa
- magna, manga, maang, mag- -an, mang̃a
- putangina, putang ina
- nangka, angkan
- saligan, isangla, anislag
- panagko, pangako
- Manong, manong
- bansag, bangas, sangab
- tulingan, lungtian
- taning, tignan
- patong, pantog
- maligayang pasko, maligayang Pasko
- mangilin, maningil
- bungot, bugnot
- uwang, gunaw
- daeng, Deang
- Mababangloob, mababang-loob
- salagubang, Salagubang
- humigit-kumulang, humigit kumulang
- ginang, Ginang, naging
- upong, pugon
- anong malay ko, ano'ng malay ko
- anak ng puta, anak ng tupa
- Silangan, silangan
- iling, ingil, liing
- katungkulan, kalungkutan
- balisakang, sa kabila ng
- ngitngit, tingting
- kalangay, langayak
- malagong, Magalong, maglango
- lungga, gulang
- hinaing, hininga
- kawayan-kiling, kiling-kawayan
- ganid, daing
- kambing, Kambing
- Isabang, bigasan, basagin
- ano'ng problema, anong problema
- Hongkong, Hong Kong
- sumilang, Sumilang
- mang-, Mang, mang
- yagban, yabang
- inang bayan, Inang Bayan
- pangko, angkop
- hanga, ahang
- kantog, takong
- ingrata, Trangia
- tagulamin, tumingala
- Asiong, gasino
- nagwas, aswang, sagwan
- lingil, liling
- bug-ong, bugong, bunggo
- inggo, Inggo
- lamang, Lagman
- kasing-kahulugan, kasingkahulugan
- bay-awang, bayawang
- tuwang, tugnaw, tungaw
- basang sisiw, basang-sisiw
- matinga, maingat
- ketongan, engkanto
- bintag, tabing
- tingid, tindig
- saligang-batas, saligang batas, Saligang Batas
- manghihipo, maghihipon
- mangahas, Mangahas
- banggit, bagting
- inang, ngani, ninag
- mangupit, pumangit
- luningning, Luningning
- Tatlong Hari, Tatlonghari
- Mangoba, magbaon, mabango
- suang, ungas, Sagun, usang
- tanggo, gatong
- tigdas-hangin, tigdas hangin
- mag- -han, mangha
- yanggi, giyang
- kalawang, kawalang-
- lungkot, tungkol, tuklong
- kawang gawa, kawang-gawa, kawanggawa
- pangat, tapang
- pang-uri, pagurin
- ping-il, piling
- yasang, sangay, sayang
- capangyarihan, Capangyarihan
- tabingi, bigatin
- salitang kalye, salitang-kalye
- berdeng dugo, dugong berde
- bingot, tibong, bintog
- hilagang-kanluran, hilagang kanluran
- dagang parang, dagang-parang
- walang-awa, walang awa
- krung krung, krung-krung
- tong-its, tongits
- takdang aralin, takdang-aralin
- duong, udong
- langpas, paslang
- dagang bukid, dagang-bukid
- kinang, angkin
- salitang hiram, salitang-hiram
- pangitain, pag-initan
- panig, ipang-, pinag-, pinga
- pulong-bituin, Pulong-Bituin
- tingka, taking, tikang
- bulingbuling, Buling-Buling
- umingay, yumanig
- Tiaong, antigo, ganito
- muyangit, munyagit
- bangaw, bang-aw, bawang
- wating, tawing, ngawit
- panauhing pandangal, panauhing-pandangal
- apitong, ipatong
- igting, giting, inggit
- ising, ngisi
- sungsong, Sungsong
- walang-dila, walang dila
- Maningas, magisnan
- yangot, tayong
- sanglay, Sanglay
- dagang kosta, dagang-kosta
- linggi, giling
- ayungin, niyugan
- daong-daungan, Daong-daungan
- taong-grasa, taong grasa
- alipunga, paliguan
- Magtanggol, magtanggol
- yumabang, Maybunga
- ungkat, tukang
- gaong, gango, agong, anggo
- walang habas, walang-habas
- duyong, udyong
- dugtong, dunggot
- mang-inis, masining
- pisang, singap
- lalang, langal
- gibang, baging
- matsing, gintsam
- Manang, manang
- dungis, dusing
- kamoteng kahoy, kamoteng-kahoy
- luyang dilaw, luyang-dilaw
- Paskong mahaba, paskong mahaba
- Pangilinan, pangilinan
- salitang ugat, salitang-ugat
- buntonghininga, buntong-hininga
- langgas, lagsang, gangsal
- walang-humpay, walang humpay
- lagnat, talang, ngatal
- libingan, bilangin
- gulung-gulungan, gulunggulungan
- gangsa, sangag
- magpatunay, mapangutya
- Mga Bilang, maglangib
- kanang kamay, kanang-kamay
- ungal, ulang
- hukbong-dagat, hukbong dagat
- alimango, Alimango, Maglinao
- Alangilan, alangilan
- ihing kidlat, ihi ng kidlat
- ano'ng oras na, anong oras na
- pakinabang, Pakinabang
- daong, Gonda
- mangako, magkano
- ampang, mapang-
- barong Tagalog, barong-tagalog
- Magtanong, magtanong
- pagbabago ng klima, pagbabagong-klima
- maango, Manaog
- mangaral, marangal
- P.S. Even if you forget the tag it's okay, I always get a notification if it's on my talk page :) Kiril kovachev (talk・contribs) 15:38, 30 June 2024 (UTC)
- @Ysrael214 It's basically ready to run now BTW, so if you let me know it's OK to do, I will set it off. Kiril kovachev (talk・contribs) 15:39, 30 June 2024 (UTC)
- @Kiril kovachev Cool! Didn't even know some words are anagrams like samaing palad and dalampasigan. Please set it on. 𝄽 ysrael214 (talk) 15:43, 30 June 2024 (UTC)
- @Ysrael214 Alright, sorry for the delay, but it's now going! There are up to 5858 edits it will do, so it should take less than 8 hours hopefully, although it is getting slowed down because it's running the English ones at the same time :O
- Thanks for your help with all this. Please tell me if there are any problems, and also if there are, you can type in "halt" into User:KovachevBot/halt in order to make it stop itself. Kiril kovachev (talk・contribs) 16:54, 30 June 2024 (UTC)
- @Kiril kovachev I was wondering why the anagrams aren't ordered alphabetically. Is that intentional?
- example in Balatik
- The anagrams were
- Shouldn't it be baklita, balikat and talikba? 𝄽 ysrael214 (talk) 17:22, 30 June 2024 (UTC)
- @Kiril kovachev Though, I'm thinking maybe the sorting can be done in Module:anagrams instead. 𝄽 ysrael214 (talk) 17:24, 30 June 2024 (UTC)
- @Ysrael214 Hi, I didn't consider the sorting of anagrams. I didn't realize they should be sorted, my bad. I can go through afterwards and sort them again in a second bot run... ideally, they can just be sorted once in the wikicode, so that the anagrams module doesn't have to do anything extra, but I suppose it'd be fine either way. Apologies for the issue. Kiril kovachev (talk・contribs) 18:45, 30 June 2024 (UTC)
- FTR, this is not intentional, but basically random: it just depends on the order in which the anagrams are added to the data set when the program is initialized, so until you mentioned this I did not realize there was any need to correct this before placing the anagrams into the page. Oh well :') Kiril kovachev (talk・contribs) 18:51, 30 June 2024 (UTC)
- @Kiril kovachev Not sure if all anagrams were already added but thanks! 𝄽 ysrael214 (talk) 22:31, 30 June 2024 (UTC)
- @Ysrael214 I think so, I was just about to message you! It just finished a moment ago apparently. Kiril kovachev (talk・contribs) 22:36, 30 June 2024 (UTC)
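Sorting the anagrams once before they are written into the wikicode, as suggested above, could be as simple as the sketch below. The helper name `sort_anagrams` is hypothetical, and the choice of a case-insensitive order with the original spelling as tie-breaker is an assumption rather than an established Wiktionary convention.

```python
def sort_anagrams(anagrams):
    """Return the anagrams in a stable, case-insensitive alphabetical order."""
    # casefold() gives case-insensitive ordering; the original string is a
    # tie-breaker so the result does not depend on the input order.
    return sorted(anagrams, key=lambda w: (w.casefold(), w))

print(sort_anagrams(["talikba", "balikat", "baklita"]))
# ['baklita', 'balikat', 'talikba']
```

Doing this in the bot keeps the output deterministic regardless of the order in which words were loaded into the data set.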
- @Kiril kovachev Not sure if all anagrams were already added but thanks! 𝄽 ysrael214 (talk) 22:31, 30 June 2024 (UTC)
- FTR, this is not intentional, but basically random: it just depends on the order in which the anagrams are added to the data set when the program is initialized, so until you mentioned this I did not realize there was any need to correct this before placing the anagrams into the page. Oh well :') Kiril kovachev (talk・contribs) 18:51, 30 June 2024 (UTC)
- @Ysrael214 Hi, I didn't consider the sorting of anagrams. I didn't realize they should be sorted, my bad. I can go through afterwards and sort them again in a second bot run... ideally, they can just be sorted once in the wikicode, so that the anagrams module doesn't have to do anything extra, but I suppose it'd be fine either way. Apologies for the issue. Kiril kovachev (talk・contribs) 18:45, 30 June 2024 (UTC)
- @Kiril kovachev Though, I'm thinking maybe the sorting can be done in Module:anagrams instead. 𝄽 ysrael214 (talk) 17:24, 30 June 2024 (UTC)
- @Kiril kovachev Cool! Didnt even know some words are anagrams like samaing palad and dalampasigan. Please set it on. 𝄽 ysrael214 (talk) 15:43, 30 June 2024 (UTC)
- @Ysrael214 It's basically ready to run now BTW, so if you let me know it's OK to do, I will set it off. Kiril kovachev (talk・contribs) 15:39, 30 June 2024 (UTC)
- @Ysrael214 Okay, it has run a little bit, looking fine so far, you can see its results on sumagot, pistola, paya, Pisis, kawing. Further considerations: it says on the Wikipedia page for Filipino alphabet (which AFAICT is the alphabet used in Tagalog now?) that 'ñ' and 'ng' are distinct letters, so I figure words shouldn't be considered anagrams if they break up 'ng'. I've also made a manual modification to re-join n + tilde into ñ, so that the diacritic doesn't get stripped when normalizing anagrams. I.e. basically ñ will not be turned into n by the script when considering anagrams. But I will need to work harder to figure out how to treat 'ng' correctly — am I correct in the first place to treat these letters like this, or should I be applying the previous English logic to them still? Kiril kovachev (talk・contribs) 15:16, 30 June 2024 (UTC)
- @Ysrael214 no problem then :) Kiril kovachev (talk・contribs) 14:37, 30 June 2024 (UTC)
- @Kiril kovachev Oh can you exclude the Baybayin for now? Just the ones with Latin letters. 𝄽 ysrael214 (talk) 14:25, 30 June 2024 (UTC)
- @Ysrael214 One problem though, I can't read Baybayin — I saw you made several entries in it, but should this be an issue at all? The bot will still sort all the Baybayin letters like English ones for the purpose of anagrams, but if there are any special symbols (e.g. there appear to be certain diacritics, like on ᜐ᜔, which I think I don't currently handle), it will probably think that two things aren't anagrams when they actually should be. This isn't catastrophic, as at least it won't generate any false anagrams, but what should I do about it? Kiril kovachev (talk・contribs) 14:24, 30 June 2024 (UTC)
- @Ysrael214 Alright then, I'll let you know when it's started running! ^^ Kiril kovachev (talk・contribs) 14:15, 30 June 2024 (UTC)
- @Kiril kovachev I think just treat it like English. If I see exceptions, I'll edit them and let you know :) Thanks! 𝄽 ysrael214 (talk) 13:59, 30 June 2024 (UTC)
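The anagram handling discussed in this thread can be sketched roughly as follows. This is a minimal Python sketch under stated assumptions, not the bot's actual code: the function names `anagram_key`, `is_latin_only` and `find_anagrams` are hypothetical, and plain code-point sorting stands in for whatever order Module:anagrams expects. It strips diacritics but re-joins n + tilde into ñ, skips titles containing non-Latin letters such as Baybayin, and sorts each anagram list before it would be written to a page. In line with the "treat it like English" decision, 'ng' is not handled as a single letter.

```python
import unicodedata
from collections import defaultdict

COMBINING_TILDE = "\u0303"


def anagram_key(word: str) -> str:
    """Return the word's letters in sorted order; equal keys mean anagrams."""
    letters = []
    for ch in unicodedata.normalize("NFD", word.casefold()):
        if ch == COMBINING_TILDE and letters and letters[-1] == "n":
            # Re-join n + tilde so that ñ stays distinct from n.
            letters[-1] = "ñ"
        elif ch.isalpha():
            # Spaces, hyphens and other combining marks are dropped.
            letters.append(ch)
    return "".join(sorted(letters))


def is_latin_only(title: str) -> bool:
    """True if every letter in the title belongs to the Latin script."""
    return all(
        unicodedata.name(ch, "").startswith("LATIN")
        for ch in title
        if ch.isalpha()
    )


def find_anagrams(titles):
    """Map each Latin-script title to a sorted list of its anagrams."""
    groups = defaultdict(list)
    for title in titles:
        if is_latin_only(title):
            groups[anagram_key(title)].append(title)
    return {
        title: sorted(other for other in group if other != title)
        for group in groups.values()
        if len(group) > 1
        for title in group
    }


if __name__ == "__main__":
    sample = ["dalampasigan", "samaing palad", "niño", "nino", "ᜐ᜔"]
    for title, anagrams in find_anagrams(sample).items():
        print(title, "->", anagrams)
```

Sorting inside `find_anagrams` means the list is ordered once, before it reaches the wikicode, which is the option preferred above over sorting in the module at render time.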
Could you do something to get this out of Category:Pages with ParserFunction errors? For some reason the system is trying to execute the embedded template code, and there's a bug that keeps it from going in Category:Pages with ParserFunction errors/hidden. The simplest fix would be template-syntax elements in js comments before and after, such as //<nowiki> and //</nowiki>, that would tell the parser not to execute it.
Thanks! Chuck Entz (talk) 14:14, 3 July 2024 (UTC)
- @Chuck Entz Dear Chuck, I'm not sure I quite understand why there is an error, but could you please check whether it's gone now? Was it the case that the {{}} notation was making it try to apply some sort of parser function there? Kiril kovachev (talk・contribs) 18:56, 3 July 2024 (UTC)
- Yes, it's fine now. As far as the system is concerned, anything in almost any namespace that's formatted like a template invocation is parsed as a template invocation. I've seen all kinds of things happen in userspace js pages over the years, but nowadays most people just import code from certain other users' js pages, and those other users know how to prevent problems like this. Thanks! Chuck Entz (talk) 05:25, 4 July 2024 (UTC)
- @Chuck Entz Ah, okay. Thanks for letting me know, glad it was fixed :) Kiril kovachev (talk・contribs) 13:05, 4 July 2024 (UTC)
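The fix suggested above, wrapping a userspace js page in `//<nowiki>` and `//</nowiki>` comments, could be applied by a script before saving the page. The following is only an illustrative Python sketch; `wrap_in_nowiki` is a hypothetical helper and not part of any existing tool, and it assumes the whole page should be wrapped rather than just the part containing `{{...}}`.

```python
NOWIKI_OPEN = "//<nowiki>"
NOWIKI_CLOSE = "//</nowiki>"


def wrap_in_nowiki(js_source: str) -> str:
    """Wrap JavaScript source in nowiki comments so that template-like
    text such as {{...}} is not parsed as a template invocation."""
    lines = js_source.strip("\n").split("\n")
    already_wrapped = (
        lines[0].strip() == NOWIKI_OPEN and lines[-1].strip() == NOWIKI_CLOSE
    )
    if already_wrapped:
        # Leave the source untouched so the helper can be run repeatedly.
        return js_source
    return "\n".join([NOWIKI_OPEN, *lines, NOWIKI_CLOSE]) + "\n"


if __name__ == "__main__":
    print(wrap_in_nowiki("var link = '{{l|tl|sumagot}}';"))
```

Because the markers are ordinary line comments, the JavaScript itself runs unchanged; only the wiki parser treats the enclosed text differently.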