Sunday, February 27, 2022

Behind the scenes: reporting, development of trees and more

 [#DuolingoForumGems originally posted on 2020-05-23 on the Duolingo Hungarian for English speakers forum by  ]


Hi there!

If you are reading this post, you probably belong to the small minority of Duolingo users that contributors and other learners can actually reach. This is amazing: you guys provide the most feedback to the courses while having a much better Duolingo experience thanks to all the knowledge that's available on the forum! Still, there may be things that you haven't come across yet.
This post is meant for "naive" Duolingo users, like I was a couple of months ago, not quite knowing how Duolingo is programmed and what happens behind the scenes. Now that I have been a contributor for almost two months, I thought it would be good to share a couple of things that I have learned. This includes dos and don'ts, general information about what you can expect in certain scenarios and, of course, my own personal opinion sometimes.

Reverse course

Something that may be unexpected is that a lot of resources are shared between the EN-HU and the HU-EN courses. This isn't unique to us, it's just how Duolingo works. X for Y speakers and Y for X speakers always complement each other. This means you can't have the same sentence in both courses as distinct entries. In the same way, you can't have the same words in both courses as separate entities either. If you change a common sentence, it will change for the other course as well. This is a huge constraint on acceptable translations and also part of the reason why hints are so problematic. Their reverse translation is our translation and vice versa. This means certain translation issues can only be resolved with the help of the EN-HU team. (I find this design choice quite weird. It seems fairly obvious that the people who want to contribute to an English course are not the same people who want to contribute to a Hungarian course - it's also obvious these courses have very different needs.)


Duo's feedback on your solution ("Another correct solution", "You have a typo" etc.)

First of all, as far as I know, we don't have any control over the accent checker feature. With typos, it's a bit better - we can add new entries that we consider typos of a certain word - most of the time, it's Duo's magic, though. (That is, there are mistakes that Duo will always consider a "typo" and we can't override this decision to "plain wrong".)

I never really cared about the solutions Duolingo showed me as "another correct solution" but it seems to me there are people who care. Some people seem to think the solution Duolingo shows you as "another correct solution" is superior to your own solution - and even more people think that the translation that appears on the discussion page must be superior.

Now, let me introduce you to Duo's system of best translations (BTs) and accepted translations (ATs). Contributors can create sentences and decide on their counterparts in the other language, used for reverse tasks (and, as you now know, possibly used in the reverse tree). This pair of sentences is supposed to be didactic - but that still doesn't mean it always is. The pairs are only as good as the contributors' work overall. So, what about BTs and ATs? Well, each sentence of the pair is always meant to be a BT when its counterpart is the sentence to translate. (As far as I know, there can be other BTs for the same exercise but most of the time, it's better to keep only one BT.) The importance of BTs is that they are shown to the user as "another correct solution" whenever the user comes up with a different solution. As the main principle, we can say only BTs are supposed to be shown to the user. Still, if someone gives a BT as their solution, it's possible that Duolingo will show an AT as "another correct solution". This is odd behaviour that it's good to be aware of.
So, what are ATs, after all? They are correct solutions that are supposedly less didactic than the BTs. Solutions that we don't aim to teach but that are valid nevertheless. Since ATs generally aren't meant to be shown to the user, they can be messy (wrong casing, no punctuation etc. - they shouldn't be but it's completely up to the contributors' work) and they can contain stuff you don't have to know. The main purpose of ATs is to provide a better experience for users - you don't have to guess what the contributors had in mind if you have another valid translation. ATs are always accepted for a good reason, they have to be valid and they aren't necessarily inferior to the BT(s) - they can be too advanced to teach, less common or less didactic, not to mention that there is no guarantee the BTs were always chosen well.

Don't forget that contributors are volunteers with different experiences and technical skills and although they have the best intentions, they can make debatable design decisions and even mistakes. Now, that's why reporting exists. :)

Your feedback on the task, advice on translation & reporting

This part is mostly about what you guys can do when you don't just move on to the next exercise.

First, you can look around on the discussion page. This can be a good thing if you are unsure about something. Please, always look for comments related to your question before asking a new question. Fortunately, we have a couple of active members who usually answer a lot of questions but still, it's better for everyone to just find the solution in minutes, instead of waiting for days or even months while polluting the page. (Something related - please keep the discussion relevant to the examples by focusing on their grammar and related properties. The purpose of the discussion pages is to help users understand sentences, not so much to discuss their content. I'm thinking of sentences like "Férfiakat akarok" - "why isn't it férfikot?" is a more relevant comment than "me too, man")

Now, turning to reporting. What should you report and what should you make sure of before reporting? I want you to know what happens when you report something about an exercise. It shows up for us contributors in the exercise editor. So mind you, when you write a report, you are effectively contacting us and not the developers. From our perspective, reports are a mixed blessing - of course we are happy to improve the course with new translations but at the same time, dealing with useless reports is a pain in the ass and it can keep any course in beta because the ratio of reports matters.
The creators of the Latin course wrote a very elaborate post about their requests and recommendations about reporting. A lot of things match so I thought it would be worth linking: https://duolingo.hobune.stream/comment/33853120.
Remember: please double or triple check your solution before proposing it as correct. This may sound over-sentimental but really, think of the contributors who find that over 80% of the "solutions to add" they have to go through are wrong solutions with banal mistakes.
Now I want to give you a couple of examples about what I wouldn't advise regarding translation and reports.

Let's start with typos. Please, don't report if you realize your sentence had a typo - it's quite pointless as a sentence with a typo can't get approved. If you think you only had a typo but Duo took it as wrong, it may be a better idea to write a custom report about the case.

Now, let's talk about English. Every Duolingo course turns into an English course at some point, right? :) As you might know, there are issues with English in the course. Does this mean you should propose sentences with problematic English? No. We want to get rid of lame English sentences eventually, not to embrace them. In theory, Duolingo courses target the native speakers of the base language so we are discouraged from accepting bad English. This in itself wouldn't convince me personally but think about it - how could we know you understand the Hungarian sentence if your English sentence isn't correct at all? We should assume you know English at least as much as Hungarian so if you submit bad English, it suggests the Hungarian sentence didn't quite make sense for you. (Also, ATs are handwritten so it puts extra mental pressure on you to accept a sentence that you find wrong. :D)

As ATs are handwritten, I think it's worth talking about word order. This is a topic that causes headaches for many learners - I could even say most learners. Still, I wouldn't recommend using a "trial and error" method on sentences that consist of more than 3 parts of speech. What do I mean? Don't try to figure out word order by intentionally moving words all around. If you got to a certain solution naturally, that's okay. Permuting words just for the sake of it is not. No one wants to write dozens of ATs that are unlikely to ever be said anyway. So the only things that could happen are:
1. your solution wouldn't be accepted and therefore you couldn't learn anything from it
2. contributors would have to face tough dilemmas about whether a certain word order is still okay or too awkward
3. contributors would have to add a ridiculous amount of new ATs no one would use
A special case of this is: word order when listing things (like "tall and beautiful"). Even if one could argue "X and Y" means basically the same as "Y and X", there are two problems with this. First, it's suboptimal because we cannot assume you didn't accidentally mix up the meanings of the words. Second, for every similar enumeration, it would multiply the number of ATs instantly. Instead of working on something useful, we would be struggling with carefully duplicating all possible translations. I think we can agree this wouldn't be a great trade-off.
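
To put rough numbers on this, here is a minimal Python sketch of how quickly the rows pile up. It only counts orderings; the placeholder chunks "A" to "D" and the figure of 12 existing ATs are made up for illustration, and nothing here reflects how the Duolingo editor actually stores translations.

```python
import itertools
import math


def word_orders(chunks):
    """Return every possible ordering of the given sentence chunks."""
    return [" ".join(order) for order in itertools.permutations(chunks)]


def enumeration_variants(items, joiner="and"):
    """Return every ordering of a listed set of items, e.g. 'tall and beautiful'."""
    return [f" {joiner} ".join(order) for order in itertools.permutations(items)]


# Four movable chunks already give 4! = 24 candidate word orders,
# most of which nobody would ever say.
print(len(word_orders(["A", "B", "C", "D"])))

# Accepting both orders of a two-item enumeration doubles the rows.
print(enumeration_variants(["tall", "beautiful"]))

# Hypothetical exercise that already has 12 handwritten ATs:
existing_ats = 12
print(existing_ats * math.factorial(2))
```

Each extra enumeration in the same sentence multiplies the count again, which is why these variants are not added by default.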

What can happen to your report? Unless it's a new solution proposal, the only thing that can happen to it is that we read it and note it, trying to learn from it. If it's a new solution proposal, three things can happen:
1. it can be discarded silently
2. it can be discarded silently with notes about it. This is like noting that a well-known wrong solution was given. In theory, when people type in a solution like this, they should see the notes taken so that they would be discouraged from reporting it again. This feature is broken but still, reports about such sentences won't reach us again, they will be discarded automatically.
3. it can be approved - in this case, you get an email notification about your recommendation. This does nothing in itself but it implies the contributors created new ATs that cover your solution. The important thing here is that technically, you can get a "false alarm" and vice versa: your solution can become accepted without any notification.

Audio

Regarding audio, there are two kinds of remarks. About slow mode: it's not completely up to us. This course has actual recorded audio which, to my knowledge, doesn't even offer slow mode. (Even if it did, I'm not sure anyone would manually manipulate the audio of each and every recording to provide a slow mode.) From what I heard, maybe we could combine text-to-speech and voice recordings? This seems to be a work in progress. I think actual recordings are much more valuable and they are worth it even without slow mode but I'm not sure how many people share this opinion.
About reports: sure, go ahead. Unlike text-to-speech, we can change the audio. At least in theory. I gotta be honest, I never succeeded with it but it's supposed to work so I guess the contributors can do something about it, even if not me personally. (Again, I would like to add that I find 95% of complaints about the audio irrelevant. People can be picky sometimes, also, not blaming anyone but many people report the audio without knowing Hungarian phonetics well so they end up reporting the correct pronunciation asking for some kind of hypercorrection.)

Old tree, new tree, changes

Since the current tree has been in beta for like 5 years (...check, 3 years and 11 months :P), you may be curious about how Duolingo trees evolve. Released trees cannot be reorganized on the fly. Adding and removing sentences in an existing skill, using the same words, is technically possible - but even this is beyond the usual tasks of average contributors, not something I personally could do on my own. On the other hand, I think we can add "Tips & notes" articles and manage translations, including ATs and BTs. That's all. Besides, some sentences get removed automatically because of the amount of reports. It's no loss most of the time but I think it's good to know.

A new tree is being developed. (A couple of posts about it: https://duolingo.hobune.stream/comment/31715523 and https://duolingo.hobune.stream/comment/37159839)
Now, this is not an easy task, really not, as you can guess from the amount of time that has passed since the work started. Obviously, no one wants to repeat the mistakes of the current tree. I don't think it will be finished soon; it's likely to still take several months and even that might be optimistic. As I'm not among the main designers of the new tree, it's not really up to me either but don't worry, serious work is going on and so far, my impression is that the majority of sentences will feel much more realistic.
On the other hand, as it's still far from being complete, please still be patient with the current tree. Looking at the diagrams in the incubator, it's getting more bearable so we can get away with it until the new tree is finished.

That's all, folks! I hope you found it helpful and your comments are welcome!




Comments:


https://www.duolingo.com/profile/Wyrg14

Many thanks for such an extremely interesting post! I've been wondering about how things happen behind the scenes since I started (e.g., how do they enter multiple acceptable answers? Do they list all possibilities independently or do they have some sort of "programming language" that lets them code for word substitutions, different word orders, etc.? I think your post answered that question of mine quite nicely!). Your peek behind the curtain is really great!


https://www.duolingo.com/profile/MrtonPolgr

I'm happy that some of you guys found it useful. :)


https://www.duolingo.com/profile/jzsuzsi
MOD

There is a compact form to make it shorter. For example, I [go/am going] to the [shop/store].

But this does not help with word order; we write different word orders in separate rows.
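
For the curious, the compact form can be pictured with a small Python sketch. The bracket-and-slash syntax is taken from the example above; the `expand_compact` function is a hypothetical illustration, not Duolingo's actual implementation, and it assumes brackets are never nested.

```python
import itertools
import re


def expand_compact(template: str) -> list[str]:
    """Expand every [a/b/...] group into all plain-sentence combinations."""
    # re.split with a capturing group keeps the bracket contents at odd indices.
    parts = re.split(r"\[([^\]]*)\]", template)
    choices = [
        part.split("/") if index % 2 else [part]
        for index, part in enumerate(parts)
    ]
    return ["".join(combo) for combo in itertools.product(*choices)]


for sentence in expand_compact("I [go/am going] to the [shop/store]."):
    print(sentence)
```

One compact row with two groups of two options stands for four plain sentences. A different word order cannot be expressed this way, so it still needs its own row.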
