User:Azylber/RaBOTnik/Task0/Question1

Hi guys,

I've been working on RaBOTnik's task zero. I've run into a huge number of problems; for example, people call the lang-ru template with more crazy and invalid content than you could imagine.

I've been filtering all the non-Russian stuff I've found inside calls to lang-ru. For example I've been filtering Latin letters, numbers, symbols, wrongly encoded characters, references (yes, some people cite sources inside calls to lang-ru...), links (yes, some people put wikilinks inside calls to lang-ru...) and so many other things...
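To give an idea of what this filtering involves, here is a minimal Python sketch. It is not RaBOTnik's actual code: the function name `russian_words` and the regular expressions are assumptions made for illustration, and they cover only references, wikilinks, and non-Cyrillic tokens.

```python
import re

# Illustrative patterns only; a real bot would need many more cases.
# Self-closing <ref .../> is tried first so it is not mistaken for an opening tag.
REF_RE = re.compile(r"<ref[^>]*/>|<ref[^>]*>.*?</ref>", re.DOTALL | re.IGNORECASE)
# [[target|label]] -> label, [[target]] -> target
WIKILINK_RE = re.compile(r"\[\[(?:[^|\]]*\|)?([^\]]*)\]\]")
# A token is a run of letters (any script) and combining acute accents (U+0301).
TOKEN_RE = re.compile(r"(?:[^\W\d_]|\u0301)+")
# Russian letters plus the combining acute accent used as a stress mark.
RUSSIAN_RE = re.compile(r"[А-Яа-яЁё\u0301]+")


def russian_words(template_arg):
    """Return the purely Cyrillic tokens found in one lang-ru argument."""
    text = REF_RE.sub(" ", template_arg)      # drop cited sources
    text = WIKILINK_RE.sub(r"\1", text)       # keep only the visible link text
    # Tokens that mix scripts (or are entirely Latin) are skipped here.
    return [t for t in TOKEN_RE.findall(text) if RUSSIAN_RE.fullmatch(t)]
```

Tokens that mix Latin and Cyrillic letters are deliberately left out by this sketch, since they need separate handling.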

But now, after about 1,000 lines of code and 5,000 hairs pulled out, I think I've managed to filter out everything that is not a Russian name and to create a proper list containing each and every Russian name used in calls to lang-ru.

I've further cleaned up that list, and I'm ignoring words where the stress is obvious, such as those that have only one vowel and those that contain ё.
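The "obvious stress" rule can be sketched as follows. The helper name `stress_is_obvious` is hypothetical, and treating words with no vowel at all as obvious too is an assumption of this sketch rather than something stated above.

```python
# The ten Russian vowel letters, in both cases.
RUSSIAN_VOWELS = set("аеёиоуыэюяАЕЁИОУЫЭЮЯ")


def stress_is_obvious(word):
    """True if the word needs no stress mark under the two rules above."""
    # ё is always stressed in Russian, so the stress position is known.
    if "ё" in word.lower():
        return True
    # With a single vowel there is only one place the stress can fall.
    # Assumption: words with no vowel at all are also treated as obvious.
    return sum(ch in RUSSIAN_VOWELS for ch in word) <= 1
```

For example, `stress_is_obvious("Пётр")` and `stress_is_obvious("Влад")` are both true, while `stress_is_obvious("Иванов")` is false.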

I've also found many errors in the existing calls to lang-ru, for example:

1) I've detected 600 calls to lang-ru that contain Russian words where some of the letters are not actually Cyrillic letters - they're Latin letters that look like Cyrillic letters. For example, Барковa. If you look closely, you will notice that the last a is not a Cyrillic a, it's a Latin one. I think it would be nice if RaBOTnik fixed those entries. I might incorporate it into task 1, or separate it as task 2.

2) When analysing the data, I found what I call "incompatible stress marks". That is, certain Russian words appear in different articles with the stress marks placed on different vowels. I think some of them might be errors, and some of them might be different words.
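Both checks can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the bot's real implementation: the function names are hypothetical, the look-alike table is a small hand-picked subset, and stress marks are assumed to be encoded as the combining acute accent U+0301.

```python
import unicodedata
from collections import defaultdict

STRESS = "\u0301"  # combining acute accent, assumed to be the stress mark

# Illustrative subset of Latin letters and their Cyrillic look-alikes.
LATIN_TO_CYRILLIC = str.maketrans({
    "a": "\u0430", "c": "\u0441", "e": "\u0435", "o": "\u043e",
    "p": "\u0440", "x": "\u0445", "y": "\u0443",
    "A": "\u0410", "B": "\u0412", "C": "\u0421", "E": "\u0415",
    "H": "\u041d", "K": "\u041a", "M": "\u041c", "O": "\u041e",
    "P": "\u0420", "T": "\u0422", "X": "\u0425",
})


def scripts(word):
    """Return the set of scripts ("Latin", "Cyrillic") used by the letters."""
    found = set()
    for ch in word:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name.startswith("LATIN"):
                found.add("Latin")
            elif name.startswith("CYRILLIC"):
                found.add("Cyrillic")
    return found


def has_latin_lookalikes(word):
    """True if a word mixes Cyrillic and Latin letters."""
    return scripts(word) == {"Latin", "Cyrillic"}


def fix_lookalikes(word):
    """Replace known Latin look-alikes, but only in mixed-script words."""
    return word.translate(LATIN_TO_CYRILLIC) if has_latin_lookalikes(word) else word


def incompatible_stress(words):
    """Group stressed words by their unstressed spelling; keep conflicts only."""
    variants = defaultdict(set)
    for w in words:
        if STRESS in w:
            variants[w.replace(STRESS, "")].add(w)
    return {base: sorted(forms) for base, forms in variants.items() if len(forms) > 1}
```

Words flagged by `has_latin_lookalikes` would still deserve a manual look before any automatic fix, because a mixed-script token is not always a typing mistake.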

So now I'm going to show you the list of all the "incompatible stress marks" that I've found in the entire English Wikipedia. There are 49 pairs.

What I need your help with:

Please go through this list, and tell me, for each pair, which one is right and which one is wrong. Please put "--" next to the one that is correct. If both are valid, please put "++" next to both members of the pair.

Azylber (talk) 08:39, 25 September 2013 (UTC)

0	Ива́нов++	1	Ивано́в++
2	Сергее́вич	3	Серге́евич--
4	Абрамо́вич++	5	Абра́мович++
6	Каме́нский++	7	Ка́менский++
8	Александро́вич++	9	Алекса́ндрович++
10	Само́йлович--	11	Самойло́вич
12	Антоно́вич	13	Анто́нович--
14	Павло́вский++	15	Па́вловский++
16	Наумо́вич++	17	Нау́мович++
18	Кушни́р	19	Ку́шнир
20	Адольфо́вич	21	Адо́льфович++
22	Лазаре́вич++	23	Ла́заревич++
24	Фо́мич	25	Фоми́ч--
26	Владими́р	27	Влади́мир++
28	Ва́димович	29	Вади́мович--
30	Жу́ковский++	31	Жуко́вский++
32	Ду́бовский++	33	Дубо́вский++
34	О́стровский++	35	Остро́вский++
36	Воскре́сенский	37	Воскресе́нский--
38	Соко́льский--	39	Со́кольский
40	Ме́нделевич--	41	Менделе́вич--
42	Кото́вский++	43	Ко́товский--
44	Кога́н++	45	Ко́ган++
46	Новико́в	47	Но́виков--
48	Сусли́н++	49	Су́слин++
50	Лаби́нск	51	Ла́бинск
52	авто́номная	53	автоно́мная--
54	Се́рги--	55	Серги́
56	Вы́готский	57	Выго́тский
58	Быко́вский++	59	Бы́ковский++
60	Ароно́вич++	61	Аро́нович++
62	У́да	63	Уда́
64	Ко́рсаков++	65	Корса́ков++
66	Ни́колай	67	Никола́й--
68	Максими́лиан	69	Максимилиа́н--
70	Гурко́	71	Гу́рко
72	Кере́нский	73	Ке́ренский--
74	Кара́	75	Ка́ра
76	Оре́ст--	77	О́рест
78	О́ла	79	Ола́
80	И́льич	81	Ильи́ч--
82	Максимо́вич++	83	Макси́мович++
84	Ахту́ба	85	А́хтуба ++
86	Гу́ставович--	87	Густа́вович++
88	Ру́бин++	89	Руби́н++
90	Быко́во	91	Бы́ково--
92	Ра́кетный	93	Раке́тный--
94	Чу́па	95	Чупа́ ++
96	Алексее́вич	97	Алексе́евич--