Jim

Generic Contest Discussion

Contest Setup  

312 members have voted

  2. Publish result list including...?

  3. Preferred building period?

  4. Preferred voting period?

  5. Favorite voting scheme? (multiple answers allowed)

    • 20 points (distribute all, max 10 per entry)
    • 10 points (distribute all, max 5 per entry)
    • Old Formula One style (distribute 10, 6, 4, 3, 2 and 1 points)
    • New Formula One style (distribute 25, 18, 15, 12, 10, 8, 6, 4, 2 and 1 points)
    • Eurovision Songfestival style (distribute 12, 10, 8, 7, 6, 5, 4, 3, 2 and 1 points)

  6. Public or private voting?

  7. Should we allow digital entries?



Recommended Posts

5 hours ago, gyenesvi said:

I think that had more to do with general interest in the topic of the contest than with engagement. With this contest, interest seemed much higher, so I'd expect voting participation would also have been higher. But I don't mind jury voting or 50/50 voting, actually. I do believe that the jury can be better at enforcing the spirit of the contest and avoiding "bigger is cooler" voting.

However, I do agree with a previous comment that it could be both easier / better / more transparent to score each entry individually, and then derive the ranking from the individual scores, instead of directly ranking them against each other. For example, each entry could get a score on a 1 to 10 scale for each criterion, according to how well it satisfies each criterion associated with the spirit of the contest, and the scores per criterion would be summed up to arrive at a final score for a model. Has this ever been tried?

That would be tricky; for this past contest, a few entries would have scored 10 out of 10 on every criterion, or maybe an equal mix of 9s and 10s, so there would be no way to differentiate between them. These are contests, especially when there are real prizes involved, and you can't avoid having someone in first, someone else in last, and everyone else somewhere in between. Whether the people judging are a jury or the wider forum, you've just got to make the best model you can based on feedback from the WIP topics. But this time, with the jury, we actually got full paragraphs of explanation for the top 3, explaining exactly why they were the winners, which is more than we got before.

2 minutes ago, allanp said:

That would be tricky; for this past contest, a few entries would have scored 10 out of 10 on every criterion, or maybe an equal mix of 9s and 10s, so there would be no way to differentiate between them.

Of course that is quite possible, but it is easy to solve: after scoring, take all the entries that are candidates for the podium and do another (community) voting round just on those. For example, if two entries share the highest score and three share the second highest, then all that needs to be decided is which of the two highest comes first and which second, and which of the three second highest takes third. The second round could be done with simple ranking.
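As a rough illustration of that two-round idea, here is a minimal sketch of how the first pass could be tallied and the podium candidates isolated. The entry names and score totals are made up for the example, not taken from any real contest:

```python
from collections import defaultdict

# Hypothetical first-pass totals (sum of per-criterion scores); not real data.
first_pass = {
    "Entry A": 38, "Entry B": 38,                  # tied for the highest total
    "Entry C": 37, "Entry D": 37, "Entry E": 37,   # tied for the second highest
    "Entry F": 30,
}

# Group entries by score and collect bands from the top until the podium is covered.
bands = defaultdict(list)
for entry, score in first_pass.items():
    bands[score].append(entry)

podium_candidates = []
for score in sorted(bands, reverse=True):
    podium_candidates.extend(bands[score])
    if len(podium_candidates) >= 3:   # enough candidates to cover the podium
        break

print(podium_candidates)  # only these go to the second, ranking-only round
```

Everything outside the collected bands keeps its first-pass position; only the tied top entries need the extra ranking round.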

Also, I think that if many entries came out as 10/10 on many criteria, then there'd be something biased about the scoring. For example, our criteria are often related to functions, but just because something works somehow, it does not mean a 10/10 for that function. It's much more shaded than that; things such as reliability, ease of use, solidity, etc. should be taken into account. In fact, what I'd expect with sound scoring is that scores have a normal (Gaussian) distribution: most entries score around average, a few are well above average (exceptional), and a few are well below average. Then equal scores would be more probable among average entries, which does not matter for the podium places, and the podium places would be distinct with higher probability. That could actually be turned into a scoring guideline: if something works okay but is nothing special, give it a score around the average, and only give it a high score if it is exceptional.


A potential problem with actually doing a 1-10 score on categories in a competition is that you will end up with models having the same overall score, as mathematically there are multiple ways of getting the same number. So do you then weight those categories, so that, say, “functions” has more weighting than “looks”?

The problem with this is the scoring becomes more and more complex, especially if you're then asking people to do a second pass over the scores to rescore entries because they have the same score.

The more complex you make the scoring, the more effort it takes to organise and the harder it is to get engagement for public votes … and it just takes a lot longer to finally find out who won a competition! Imagine having to wait a whole month to find out who won, just because of multiple rounds of voting and a complex scoring system!

 

1 minute ago, Seasider said:

A potential problem with actually doing a 1-10 score on categories in a competition is that you will end up with models having the same overall score, as mathematically there are multiple ways of getting the same number. So do you then weight those categories, so that, say, “functions” has more weighting than “looks”?

The problem with this is the scoring becomes more and more complex, especially if you're then asking people to do a second pass over the scores to rescore entries because they have the same score.

The more complex you make the scoring, the more effort it takes to organise and the harder it is to get engagement for public votes … and it just takes a lot longer to finally find out who won a competition! Imagine having to wait a whole month to find out who won, just because of multiple rounds of voting and a complex scoring system!


Exactly how I feel about it. We can make it as difficult and specific as we want, but we will always run into problems and have debates.

31 minutes ago, Seasider said:

A potential problem with actually doing a 1-10 score on categories in a competition is that you will end up with models having the same overall score, as mathematically there are multiple ways of getting the same number. So do you then weight those categories, so that, say, “functions” has more weighting than “looks”?

For two entries of similar quality it can happen that way if they alternately land on similar spots; for three entries that alternately land on similar spots it can also happen, and so on. It's easier to get into that situation if the scoring criteria are not that clear, or when we have community voting and people don't want to be too harsh on overall good entries.

35 minutes ago, Seasider said:

The problem with this is the scoring becomes more and more complex, especially if you're then asking people to do a second pass over the scores to rescore entries because they have the same score.

The more specific the penalty criteria are, the faster the first pass will be. With a clear outline of what points should be subtracted for each criterion, the first pass could be like running a checklist for each entry (see the sketch below). Actually having to decide which entry is better or worse against multiple others is really hard until you have listed specific quality scores for yourself.
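To make the checklist idea concrete, here is a minimal sketch; the penalty names and point values are invented purely for illustration, not proposed contest rules:

```python
# Invented penalty checklist; each issue found subtracts a fixed number of points.
PENALTIES = {
    "function dropped": 2,
    "function left manual instead of motorised": 1,
    "colour mismatch with the original": 1,
}

def first_pass_score(issues, max_score=10):
    """Run the checklist for one entry: start from the maximum and subtract fixed penalties."""
    return max_score - sum(PENALTIES[issue] for issue in issues)

print(first_pass_score(["function dropped", "colour mismatch with the original"]))  # -> 7
print(first_pass_score([]))  # an entry with no issues keeps the full 10
```

The point is that each entry is judged against the list, not against the other entries.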

If you already do that in the jury vote or community vote, meaning everyone scores each criterion and the scores are then translated into positions, there is room for a gerrymandering-like translation of results in this approach, and some of the entries will be misrepresented.

If you don't do this criterion scoring internally before translating it into positions, then jurors are just picking what they like, I guess? So it's similar to a community vote, but maybe more organised and quicker. It is still hard to choose the order without preparing clear data.

If it's clear what the penalties are going to be for, what is supposed to be judged and how, and we still end up with multiple entries taking podium spots because all of them are close to perfection and have the same scores, only then does it become a matter of a second pass, rating just those that are fighting for the podium. At that point the jury could start adding penalty rules, raising the quality bar that way.

Depending on how clear the criteria and penalties are at the start of the contest, resolving the issue of entries having the same score will be easier or harder. The vaguer the criteria are and the more room they leave for interpretation, the more room there is for high-quality entries scoring the same maximum number of points. It's the same when you have to rank entries in order rather than just pick what you like more, but adhere to criteria that are not clear enough for you to easily penalise entries.

1 hour ago, Seasider said:

A potential problem with actually doing a 1-10 score on categories in a competition is that you will end up with models having the same overall score, as mathematically there are multiple ways of getting the same number. So do you then weight those categories?

I explained that above. I believe if done right, there would be only a small chance of getting the same score for podium entries. For the rest, it does not matter.

Quote

So that, say, “functions” has more weighting than “looks”?

Exactly :) That's one key point. You have to realize that even if you don't specify the weighting explicitly, implicitly there is always a weighting. Either because everything is weighted the same, or because, while ranking, everybody does the weighting in their head implicitly (with different weights, which is not ideal). I believe it would be helpful to specify it explicitly, to better understand what to focus on. One simple way of weighting in the above scoring scheme would be to say, for example, that 'functions' can get a max of 10 points, while 'looks' can get a max of 5 points. That's a 2:1 weighting in favour of functions. Easy to grasp and score.
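A minimal sketch of what that could look like in practice; the criterion names, caps and example scores are only illustrations of the idea, not proposed contest rules:

```python
# Per-criterion caps act as implicit weights: functions can earn twice as much as looks.
MAX_POINTS = {"functions": 10, "looks": 5}

def total_score(scores):
    for criterion, value in scores.items():
        cap = MAX_POINTS[criterion]
        if not 0 <= value <= cap:
            raise ValueError(f"{criterion} score must be between 0 and {cap}")
    return sum(scores.values())

print(total_score({"functions": 8, "looks": 4}))  # -> 12 out of a possible 15
```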

Quote

The problem with this is the scoring becomes more and more complex

I see it the other way round. Many people say it is hard for them to rank entries or to choose only the 6 best ones. With such a scoring, it would become simpler. For one, we'd have to focus on only one entry at a time, as scores would become independent of other entries. Second, with some information on the max score for each criterion and the rough meaning of score values (as I explained in my previous comment), people would have guidance for assigning scores, so it would become easier than straight-off ranking. I have also seen people report that when they need to rank entries, what they do beforehand is score them according to their own scoring system. Which sounds understandable; otherwise how can you rank so many entries? It could be better if everybody did so according to a unified scoring system instead of their own.

Quote

especially if you're then asking people to do a second pass over the scores to rescore entries because they have the same score.

Even if there is a second pass, ranking 3 entries is much easier than ranking 30.

1 hour ago, Jim said:

Exactly how I feel about it. We can make it as difficult and specific as we want, but we will always run into problems and have debates.

Sure, I guess there are always going to be further debates, most of which could be cut off as you wish so it doesn't get too complicated. For example, I would not go too far with the weighting of criteria, just a few max-score categories. I don't see this scheme as complicated; it's just like an exam or a school competition. You get points for different subtasks and then sum them up to get a final score. Sure, the teacher / jury needs to weight the different subtasks at the beginning (a little more preparation required from them), but it does not seem too difficult to me.

Anyway, just wanted to share my idea of how I would do it because this sounds more intuitive to me.

Edited by gyenesvi


@gyenesvi I think people might then argue over the categories and voting criteria, kind of like how I do with Sariel's scoring system, which appears to focus on the looks more than on the mechanical aspects.

For example, what if the criteria for the shrinking contest were "how much have they shrunk it?" and "how much does it look like the original?", but there were no criteria for "how many functions have they kept" and "how many functions are still mechanised as opposed to just being manual?" There would rightly be complaints that the categories are insufficient and lead to a bias towards looks over functions. So if it matters what the categories are, there are bound to be complaints about them.

Some would say there should be no category for looks, as Technic is about functions. Others will say that looks are just as important, so there should be just as many categories for looks as there are for functions. The problem here is that both opinions are valid, and I don't think it should be up to us to decide how you should be casting your votes. It's up to you what is important just like it's up to each individual voter or jury member.

What benefit are you looking to achieve? Maybe there's another way to get it?

Edited by allanp


I agree @allanp, the discussion would just shift to complaints over the categories we are judging models on.

@gyenesvi I totally disagree. If a 1-10 scoring system had been used for TC25 with a few obvious categories … shrink, likeness to the original model, interpretation and implementation of functions (to make up my fictitious categories) … then I'm sure, with the very high quality of models we had for this competition, I could easily end up with quite a few having very high and identical scores even though I'd scored them differently in each category. If I discount a perfect score, because no model is truly perfect, there are 4 ways of getting 39 points, 10 ways of getting 38 points, etc., so I'd suggest you could easily have everyone having to re-rank 10 models (if they all scored 38), which means you're going to have to rescore 10 models.
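For what it's worth, the counting above checks out. A quick brute-force enumeration, assuming four categories each scored from 1 to 10 (which is what the 4-ways and 10-ways figures imply), gives the same numbers:

```python
from itertools import product

# Count score combinations reaching a given total with four 1-10 categories (max 40).
def ways_to_reach(total, categories=4, low=1, high=10):
    return sum(1 for combo in product(range(low, high + 1), repeat=categories)
               if sum(combo) == total)

print(ways_to_reach(40), ways_to_reach(39), ways_to_reach(38))  # -> 1 4 10
```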

Can't we just appreciate that @Jim and @Milan have been running competitions here for over 10 years now, that they've worked well so far, and that the enthusiasm around this recent one has been great?


To have no more discussion about the results, who voted, for what reasons, ... Just let an AI system choose the winner. If not happy with the result, go talk to the AI and discuss with him why a model should have been ranked higher. No more human errors/preferences, win. Just be cautious that the AI doesn't submit an entry himself and wins every single contest.

5 minutes ago, Mr Jos said:

To have no more discussion about the results, who voted, for what reasons, ... Just let an AI system choose the winner. If not happy with the result, go talk to the AI and discuss with him why a model should have been ranked higher. No more human errors/preferences, win. Just be cautious that the AI doesn't submit an entry himself and wins every single contest.

Sounds like an excellent idea :laugh: :thumbup:

While we're at it, can we get some AI Admins and Mods as well :sweet:


Having participated in some contests since the first one, the only real problem I see with community voting is letting recently joined members vote. It's easier to join and vote than to support a project on Ideas. There was a contest with dozens of one-post users voting on the same entry, I guess because of a good Facebook campaign.

Edited by Lipko

1 minute ago, Lipko said:

Having participated in some contests since the first one, the only real problem I see with community voting is letting recently joined members vote. It's easier to join and vote than to support a project on Ideas. There was a contest with dozens of one-post users voting on the same entry, I guess because of a good Facebook campaign.

That's why there is now a 50-post rule in most contests; I think it has been in place for some time already.

1 minute ago, Mr Jos said:

That's why there is now a 50-post rule in most contests; I think it has been in place for some time already.

Oh, then all is OK. I haven't been following contests closely recently.

3 minutes ago, Lipko said:

Having participated in some contests since the first one, the only real problem I see with community voting is letting recently joined members vote. It's easier to join and vote than to support a project on Ideas. There was a contest with dozens of one-post users voting on the same entry, I guess because of a good Facebook campaign.

That's why we have a threshold of 50 posts before a new member can vote.

3 minutes ago, Mr Jos said:

That's why there is now a 50-post rule in most contests; I think it has been in place for some time already.

Exactly this. Has been in place for quite some time now.


If we limited voting to members who primarily post in the Technic forum, I think our results would be very different.  Those of us who build with Technic on a regular basis understand the complexity and amount of thought that goes into a Technic MOC or entry.  People who don’t build with Technic are more likely to vote based on appearance rather than technical complexity.   Personally, I have always voted in favor of complexity and functions  rather than appearance, but I tend to think I’m in the minority there.  

Quote

For example, what if the criteria for the shrinking contest were "how much have they shrunk it?" and "how much does it look like the original?", but there were no criteria for "how many functions have they kept" and "how many functions are still mechanised as opposed to just being manual?" There would rightly be complaints that the categories are insufficient and lead to a bias towards looks over functions. So if it matters what the categories are, there are bound to be complaints about them.

That's exactly why I was lost in the beginning, because things were vaguely specified. I did not know on what basis to select a model, and what to focus on. Should I go for more shrinkage at the cost of a more basic implementation / dropping of functions, or should I go for less shrinkage and a better representation of functions? Should I focus on functions or on the looks (for example color match, which influences model choice)? Should I go for a function-rich model, or should I pick what I like more even if it does not have so many functions? Does that put me at a disadvantage? After reading a few questions in the discussion topic I realized that the contest is kind of underspecified in this respect, and that it was not going to get specified any better, so I just let it go and picked the model I liked, knowing that it probably does not have much of a chance of getting to the podium even if I nail it. I did think about it as a build challenge, as someone proposed the wording, because even though it has voting and prizes, the criteria are somewhat vague, so I can't really use them to guide my choices.

Quote

It's up to you what is important just like it's up to each individual voter or jury member.

Hmm, that does sound a bit weird to me. What I realize now is that I think differently about community-voting and jury-voting contests; in the latter case I sort of expect more objective voting, which also requires more spelled-out criteria. I personally can easily accept that in the case of community voting, people will interpret the rules however they want, and in the end it's very subjective and the coolness factor has a big weight. In that case I would not have bothered asking the above questions; I would have known that it does not matter, as everybody will interpret them however they want. However, in the case of jury voting, I thought the point of the jury is to enforce the spirit of the contest by sticking more to some well-defined voting criteria. That's why it makes sense to ask what exactly those criteria are (and how they are weighted).

Quote

What benefit are you looking to achieve? Maybe there's another way to get it?

As I explained in the previous comment, one benefit could be guidance in the case of community voting: making it easier for people to make decisions by giving them a tool to actually rank entries, by spelling out more what the spirit of the contest means. Maybe we don't want that and just let people interpret the spirit however they want. It could also be used as a tool for encouraging more informed community votes; if you have to score each entry on each criterion, then you're more likely to actually look at them / think about them in more detail, I guess.

At the same time, it would give builders guidance for aiming their builds, especially in the case of jury voting.

Don't get me wrong, I am also okay with how it is now. I'll just continue thinking about them as building challenges, in which I might get lucky and please the crowds and end up getting a prize.

1 hour ago, Seasider said:

If I discount a perfect score

I guess that's where we differ in our thinking. I don't find that useful, because it leaves little room for differentiation. Rather, I'd start from an average score and increase it if some implementation is excellent as opposed to just being 'checked'. That way, I'd probably only end up giving a near-perfect score to very few entries.

3 hours ago, allanp said:

there were no criteria for "how many functions have they kept"

I wish this were a criterion, because my Claas Xerion retains the most functions :cry_sad:

5 minutes ago, aminnich said:

What pneumatic-ify type set are you working on now :laugh_hard:

I've pretty much settled on shrunken models :tongue: Got like 6 shrunken models in the works.

On 8/29/2023 at 6:34 PM, gyenesvi said:

Many people say it is hard for them to rank entries or to choose only the 6 best ones.

True, in my case at least.


With all this discussion about voting and point systems, I wonder: has there been a case where there was consensus that a contest had "the wrong winners"? As a rather long-time reader on this forum, I can't remember any such case. I could be wrong, but I believe the current system works in the sense that it generates good results.

So I would be wary of making the system more complicated. The harder it is to vote, the fewer people will do so.

On 8/29/2023 at 6:34 PM, gyenesvi said:

I have also seen people report that when they need to rank entries, what they do beforehand is score them according to their own scoring system. [...] It could be better if everybody did so according to a unified scoring system instead of their own.

That would give all the power to the person deciding on the scoring system. Right now we have a variety of people voting, with a wide variety of opinions and values and preferences, which means the models with the most points will be those that do well on a variety of aspects. I think that's a favorable result: the winners are models that are generally good.

Also, I definitely trust members to do their best to vote in the spirit of the contest as they see it. If we want this to be a community thing (and I believe that's what the idea is), I think we should give power to the community. This includes the power of the vote. We're all adults and we all play well. If it happens that someone does not play well, it seems best to handle that case by itself, I would say. I think it's fine that one person votes using their own system X and someone else votes using their own system Y.

For example,

On 8/29/2023 at 6:34 PM, gyenesvi said:

Many people say it is hard for them to rank entries or to choose only the 6 best ones.

That's why I'm still (very much) in favor of a "distribute 25 points among all entries as you wish" vote system (or whatever number, but 25 is very close to 10+6+4+3+2+1). Then if I have trouble picking and ranking 6 entries, but I see 8 entries I really like, I can give them 3 points each and 1 extra for some entry that has something special. I find it somewhat strange that I have to pick a personal best when voting. There can be a community winner, even if I as a single voter don't select a winner, but just award points to entries I like.
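For illustration, tallying such a free-distribution vote is straightforward. Here is a minimal sketch with made-up voters and entries; only the 25-point budget is checked here, and per-entry caps (discussed further below) are left out:

```python
from collections import Counter

BUDGET = 25

# Made-up ballots for a "distribute 25 points as you wish" vote.
ballots = {
    "voter_1": {"Entry A": 10, "Entry B": 8, "Entry C": 7},
    "voter_2": {"Entry A": 4, "Entry B": 3, "Entry C": 3, "Entry D": 3,
                "Entry E": 3, "Entry F": 3, "Entry G": 3, "Entry H": 3},
}

totals = Counter()
for voter, ballot in ballots.items():
    if sum(ballot.values()) != BUDGET:
        raise ValueError(f"{voter} did not distribute exactly {BUDGET} points")
    totals.update(ballot)

print(totals.most_common())  # community ranking from the freely distributed points
```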

1 hour ago, Erik Leppen said:

With all this discussion about voting and point systems, I wonder: has there been a case where there was consensus that a contest had "the wrong winners"? As a rather long-time reader on this forum, I can't remember any such case. I could be wrong, but I believe the current system works in the sense that it generates good results.

Exactly how I feel about it.

1 hour ago, Erik Leppen said:

That's why I'm still (very much) in favor of a "distribute 25 points among all entries as you wish" vote system (or whatever number, but 25 is very close to 10+6+4+3+2+1). Then if I have trouble picking and ranking 6 entries, but I see 8 entries I really like, I can give them 3 points each and 1 extra for some entry that has something special. I find it somewhat strange that I have to pick a personal best when voting. There can be a community winner, even if I as a single voter don't select a winner, but just award points to entries I like.

I am not opposed to trying this out next contest. This is something which has been suggested by other members as well.


The only problem with the “distribute points” scheme is that you'd have to define the maximum number of points you could give one entry. You can't give free choice on this: if someone gave all 25 points to one entry, it would heavily bias the result. You'd need to define a maximum number of points per entry and also a minimum number of entries awarded points by each voter.

 

 

3 minutes ago, Seasider said:

The only problem with the “distribute points” scheme is that you'd have to define the maximum number of points you could give one entry. You can't give free choice on this: if someone gave all 25 points to one entry, it would heavily bias the result. You'd need to define a maximum number of points per entry and also a minimum number of entries awarded points by each voter.

Agreed. Max 10 sounds logical.

Not sure about the minimum number of entries; we could do 6 like usual. Otherwise a 10, 10, 5 split would be an option.
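A sketch of how such ballot constraints could be checked. The 25-point budget and the max of 10 per entry come from the discussion above; the minimum of 3 entries is just a placeholder, since that number is still open:

```python
# Placeholder validation of a single ballot under the suggested constraints.
def ballot_is_valid(ballot, budget=25, max_per_entry=10, min_entries=3):
    if sum(ballot.values()) != budget:
        return False                       # must hand out exactly the full budget
    if any(not 0 < points <= max_per_entry for points in ballot.values()):
        return False                       # no single entry may exceed the cap
    return len(ballot) >= min_entries      # points must be spread over enough entries

print(ballot_is_valid({"Entry A": 10, "Entry B": 10, "Entry C": 5}))  # True  (the 10, 10, 5 option)
print(ballot_is_valid({"Entry A": 25}))                               # False (everything on one entry)
```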

