linusbohman

Eurobricks Vassals
  • Content Count

    23

About linusbohman

  • Rank
    Indexed
  • Birthday 10/31/1984

Spam Prevention

  • What is favorite LEGO theme? (we need this info to prevent spam)
    Blacktron I

Contact Methods

  • Website URL
    https://brickinsights.com

Profile Information

  • Gender
    Male
  • Location
    Malmö, Sweden
  • Interests
    Web, LEGO, photography, my family

Extra

  • Country
    Sweden

555 profile views
  1. Hey everyone! I've done another deep dive into LEGO set data and would like to share some fun findings. (Side note: I wanted to embed images of some graphs in this post, but I'm only allowed to upload files up to 4 kB, so you'll have to visit the links below for live updated data.)

    TL;DR: It was pointed out to me that it's difficult to compare price per part across different LEGO categories, so I grouped the categories in order to get better comparisons. Here's the result: Price Per Part over the years.

    Some fun findings:

      • Over the last couple of years, System sets have averaged $0.11 - $0.14 per part.
      • They have averaged $13.88 - $17.47 per minifig, but there are cheaper categories if you're just interested in the 'figs!
      • Technic parts are roughly the same price as System parts.

    Longer story: For the last couple of years I've been doing deep dives into LEGO set data. It started with me wanting to figure out which sets I should buy, which led me down a small rabbit hole where I began indexing set reviews and tried to figure out the average rating. As part of that I also began calculating Price Per Part and comparing it between sets. However, this is about as tricky as you would expect: what happens when you compare a Duplo set to a System set? The System set is, of course, cheaper per part.

    Jared from Rebrickable pointed this out, which led me to do another deep dive. I mapped the average price per part of all set categories (as categorised by Brickset), grouped the different categories roughly by part type, and then tracked the average over all years. If you like graphs it's pretty interesting! I created a page detailing the results, and an additional one with the results for each category (as well as the average price per minifig).

    This is all just for fun and done by a happy amateur, so feel free to point out ways I can improve the data, or ask anything if you want clarifications. It will all be automatically updated whenever new data is found (with imports being done daily).
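The grouping described in this post can be sketched roughly as follows. This is only an illustration, not the site's actual code: the `CATEGORY_GROUPS` mapping, the field names and the sample sets are made up, and it assumes the yearly figure is the mean of each set's own price-per-part ratio (the post does not say whether that or total price divided by total parts is used).

```python
from collections import defaultdict

# Hypothetical mapping from a Brickset-style category to a broader part-type group.
CATEGORY_GROUPS = {
    "City": "System",
    "Castle": "System",
    "Technic": "Technic",
    "Duplo": "Duplo",
}


def average_price_per_part(sets):
    """Return {(group, year): mean price per part} for sets with usable data."""
    buckets = defaultdict(list)
    for s in sets:
        group = CATEGORY_GROUPS.get(s["category"])
        # Skip sets without a known group, a price, or a part count.
        if group is None or not s.get("price") or not s.get("parts"):
            continue
        buckets[(group, s["year"])].append(s["price"] / s["parts"])
    return {key: sum(values) / len(values) for key, values in buckets.items()}


# Made-up sample data, purely for illustration.
sample_sets = [
    {"category": "City", "year": 2018, "price": 20.0, "parts": 200},
    {"category": "Castle", "year": 2018, "price": 30.0, "parts": 200},
    {"category": "Duplo", "year": 2018, "price": 30.0, "parts": 60},
]
averages = average_price_per_part(sample_sets)
```

Keeping Duplo in its own group is what stops its large, expensive bricks from dragging the System average upwards.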
  2. linusbohman

    What set to buy?

    As always, it probably depends on what you're looking for in a set :) There are reviews for both 71043 and 10261 here on Eurobricks by VBBN. His thoughts seem to be that the castle is fantastic - especially if you're into Harry Potter, and the Roller Coaster is also awesome, if a bit repetitive in its build. Both sets are out of my budget at the moment so I can't speak from personal experience, but I would personally lean towards Hogwarts - if nothing else because the pieces seem more interesting and useful in builds of my own. If you're interested in reading more reviews of both sets I'd encourage you to check out Brick Insights. It's the largest collection of LEGO reviews from different sites and might help you with your decision. Here's 71043 Hogwarts Castle and 10261 Roller Coaster. In general the castle seems to be rated slightly higher, but both sets are well liked.
  3. linusbohman

    [MOC] Micro M-Tron Mining Operation

    I particularly adore the landing pad mosaic and the landscape sculpting. Great work!
  4. My pleasure. Thanks for all the reviewing and moderation you do. Have you played around with the exports? Anything you want me to change?

    Glad you like the site, @Clone OPatra! I understand that people are a bit hesitant to give scores. Since a score will always be personal, it'll never reflect an objective truth. In my view, that makes scores more valuable - if I can find a reviewer with whom I generally agree, then I know I can trust that person's judgement more. This project has helped me personally do just that - hopefully it can do the same for others. If you do go back and add scores, please let me know (here or at linus@linusbohman.se) so I can go back and update the reviews.

    I would just like to take the time to say that I'm overwhelmed by the positive feedback everyone has given. I really, really appreciate the suggestions, positivity and criticism. If this small project can be of use to the community, I'm happy to continue building it.

    Right now I'm figuring out a way to identify RA reviews - not easy for me by my lonesome, but if someone would be interested in helping me we could tackle it this way:

      • We identify all reviewers in the RA (if there is such a list).
      • I flag reviews from these people in the system.
      • I build a separate little interface, showing the review on top and two buttons below (Is this an RA review? Yes/No), that someone from the RA uses to confirm or negate that flag.
      • Once that is done, I create a separate profile ("Eurobricks Reviewers Academy") and move those reviews to that profile.

    Would that work? Anyone interested in committing to doing that work? Maybe @Pandora, @WhiteFang or @makoy?
  5. @Pandora I've finally found the errors. In total, they kept 46 Eurobricks reviews out of the review queue. There were, in fact, two bugs: some reviews were too large, which made my system reject them, and another bug made the system believe the tags before the titles were the link to the review, leading to missing posts. Both are solved now, and I'm working my way through the missing reviews to add snippets, scores and the rest of the information. I ought to get them done over the coming days. Thanks for pointing it out to me - I would have missed it for sure if you hadn't.
  6. Derp - sorry, I had accidentally linked to the development copy on my own machine. I updated the link in that post to the correct one, but glad you found it despite my error :) @Jim Glad you find it intriguing! It is a fun project to work on.
  7. Glad you like it! If you find that you don't like the look of the export, that something is missing from it, or that you need greater control over the selected sets, just let me know and I'll tweak it. The end goal is for you not to have to mess around with the generated output at all :)
  8. Alright, I've had time to do some work! @Peppermint_M, this is about your request to export reviews. I've built the base for an exporter now. It can be used in two ways.

    First, to export a single set, just navigate to the desired set and click on the export button. Before, you could only select JSON, but now you can also choose BBCode.

    The other way is the big one. By heading to the export page you can select a bunch of options. They're a bit sparse right now, but you can choose to export by category and which reviewers you want to include.

    Here's what the export for an individual set currently looks like:

    6941-1 Battrax
    94 / 100 Brickset
    100 / 100 Lugnet
    91 / 100 Eurobricks by RangerBob
    90 / 100 Eurobricks by ZeeK

    If you use the export page to export multiple sets they'll all look like this, with spacing between them. No matter what you use to export, you probably need to ensure you only paste clean text so as not to confuse Eurobricks' CKEditor. I do it by copy/pasting into Notepad before copy/pasting the code here.

    Thoughts? Something you want me to change or improve? Options you need? Meanwhile, I'll focus on researching @Pandora's missing reviews.
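For readers curious what generating such an export might involve, here is a minimal sketch. It is not the site's exporter: the function name, the dictionary fields, the `example.com` URLs and the choice of `[b]` and `[url]` tags are all assumptions made for illustration. Only the set name and scores come from the post above.

```python
def export_set_bbcode(set_id, name, reviews):
    """Build a BBCode snippet: a bold title line, then one line per review."""
    lines = [f"[b]{set_id} {name}[/b]"]
    for review in reviews:
        source = review["source"]
        if review.get("author"):
            source += f" by {review['author']}"
        # Unscored reviews are still listed, just without a number.
        if review.get("score") is not None:
            label = f"{review['score']} / 100"
        else:
            label = "Unscored"
        lines.append(f"{label} [url={review['url']}]{source}[/url]")
    return "\n".join(lines)


# Scores taken from the Battrax example; the URLs are placeholders.
battrax_reviews = [
    {"source": "Brickset", "score": 94, "url": "https://example.com/r1"},
    {"source": "Lugnet", "score": 100, "url": "https://example.com/r2"},
    {"source": "Eurobricks", "author": "RangerBob", "score": 91, "url": "https://example.com/r3"},
    {"source": "Eurobricks", "author": "ZeeK", "score": 90, "url": "https://example.com/r4"},
]
snippet = export_set_bbcode("6941-1", "Battrax", battrax_reviews)
```

Because the output is plain text, it can be pasted through a plain-text editor first, as suggested in the post, without losing anything.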
  9. This is fantastic stuff, Chris! Thanks for the explanation (and very cool MOC, haha). I use a web programming language (PHP) to do the math, and there seem to be a few extensions that deal with skewed datasets. I just didn't know it was something you ought to compensate for. I'll be researching this as soon as I can, and will implement it as soon as I feel I understand it. Thank you so much for the input! Mind if I run the finished result by you when I'm done?
  10. I'm not a statistician (or even a mathematician!), so it's really interesting to hear this. I have to admit I don't fully understand what you mean. Instead of using the raw values (0-100), I should convert them beforehand? Why is that important? Do you have any links for me to read up on? I'm really interested in improving. Here's a link describing the formula I implemented (unless I did something wrong :) ): https://www.mathsisfun.com/data/standard-deviation-formulas.html
  11. That's interesting, but if this forum is built the same way most forums are, it wouldn't be 100% correct to use the sidebar to identify whether a post is part of the RA or not. I've been checking all posts the search gave me (since 2005, if I recall correctly) and it is my understanding that the sidebar is static for the user. It's not a snapshot of when the post was initially made, but rather a representation of what the user is right now. This could lead to me categorizing reviews written before the reviewer joined the RA as RA reviews. I've been looking at the post content, and many differentiate RA reviews with an image, but not all. Or am I incorrect? Reading this review by JackJonesPaw makes me think it's part of the RA. Is it? If so, I might need to check the post content for either RA images or the words "Reviewer's Academy", methinks. See any caveats or problems with that?
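The content check floated at the end of this post could look something like the sketch below. It is a heuristic under stated assumptions: the image filename hints in `RA_IMAGE_HINTS` are placeholders (the real RA image names are not given anywhere in the thread), and a match only marks a review as a candidate for manual confirmation.

```python
import re

# Matches "Reviewer's Academy", "Reviewers Academy" and "Reviewers' Academy".
RA_PHRASE = re.compile(r"reviewer['’]?s['’]?\s+academy", re.IGNORECASE)

# Placeholder substrings; the actual RA image filenames are unknown.
RA_IMAGE_HINTS = ("ra_logo", "reviewers_academy")


def looks_like_ra_review(post_html, image_hints=RA_IMAGE_HINTS):
    """Return True if the post body mentions the RA or embeds an RA-like image."""
    if RA_PHRASE.search(post_html):
        return True
    image_sources = re.findall(r'<img[^>]+src="([^"]+)"', post_html, flags=re.IGNORECASE)
    return any(hint in src.lower() for src in image_sources for hint in image_hints)
```

Unlike the sidebar rank, this looks only at what the post itself contains, so it does not depend on the author's current membership.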
  12. @makoy thanks for the input! I really want the statistics to be as accurate as possible, so I'm especially interested in figuring out where my math is incorrect. However, I'm not sure I can see where the numbers from LEGO Shop and Amazon are wrong. I collect the average from each of those sites (not every individual review), and it seems to be working, unless I'm missing something.

    Here's a set with an Amazon review: https://brickinsights.com/sets/7094-1 The score the system identified is 4.1 (and we normalised that to 82). That's the average from 29 reviewers, which seems to be correct when I go to Amazon: https://www.amazon.com/LEGO-Castle-Kings-Siege/dp/B000NOB9Z8#customerReviews The same is true for S&H: https://brickinsights.com/sets/60204-1 and https://shop.lego.com/en-US/LEGO-City-Hospital-60204#product-reviews Both of those numbers are floats, as they should be. There are a lot of integers, especially for S&H, when there is just one review from that site. Is that what you're referring to, or am I misunderstanding something? I really appreciate the input, so if you or @WhiteFang can help me see where I'm wrong I'd love to fix it. (Thought: I'm experimenting with displaying a snippet from one of the reviews, even if the score is an aggregate of several. I mention this in the footnote at the bottom of the review, but perhaps this is just confusing?)

    When it comes to retroactively editing the score, that's not something the system picks up automatically. Since all reviews from Eurobricks are formatted differently I have to manually enter the score. If you do notice I have the wrong score, however, just let me know and I'll fix it. I plan to build a "report errors in this review" feature to make it easier, but that comes when/if people actually use the site. (To geek out though, some sources are updated automatically. They are the ones with programmatically readable structures, like Amazon, LEGO Shop, Brickpicker, Brickset and a few others.)
    Action plan on all the great feedback you've given so far: You've all given me excellent feedback - more than I could ever dream of. Besides the potential data error, you've commented on missing reviews, segregating Reviewer's Academy reviews and having an exporter for Eurobricks. I'm following it all up in this Trello card, so you can see the current status there - I intend to see how I can solve it all as well as possible. Here's the current progress:

    Missing reviews: Thank you so much for the list, @Pandora! I'll look into it further. I haven't manually excluded them, so they have either been omitted from the Eurobricks search page or been misidentified by my script. I'll explore further, but this helps a lot. To make it slightly easier to troubleshoot I added the number of sets a particular author has reviewed on the Eurobricks stats page. It's important to note that this is not the same as the number of reviews written, since each review can cover multiple sets. It's probably most noticeable for WhiteFang, since each collectible minifig review (which are awesome, by the way) contains 10-20 "sets", as Brickset considers each minifig a different set.

    Exporting reviews: I've started building a tool where you can select categories and a few other options, and have the site generate a list of sets and reviews. I can't do it by theme since that data doesn't exist in Brickset's database, but hopefully this will be good enough - you could generate your own theme list this way. My goal is to generate output that looks like this: https://www.eurobricks.com/forum/index.php?/forums/topic/30357-lego-action-themes-pictoral-reviews-index/ that you can copy/paste, @Peppermint_M. I'll get back to you when I have something you can test. It'll be a few days (or, if it's trickier than it seems, a week or two) due to family, work and such. (By the way, if it's useful you can already embed the score for a particular set.)
    Segregating RA reviews: I can, like you say, figure out a way to consider the RA its own entity. The tricky part is that technically there's no difference between an RA post and an "ordinary" Eurobricks post - but perhaps you could help me a bit. Do all RA reviews have the RA review logo? I seem to recall that there are multiple images symbolising the Reviewer's Academy. Could you give me examples of those? This is by far the most time-consuming task for me to fix since it will require a lot of manual work, so while I really want to make it happen, I'll prioritize the above tasks before this. I hope you understand :)

    Thanks again for all of the feedback - I really appreciate it! I hope this site can become a useful tool to find fantastic reviews and help people figure out if a set is good or not. Still a ways to go, but this is exactly the kind of input I need.
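The normalisation mentioned in this post (an Amazon average of 4.1 becoming 82) is consistent with a simple linear rescale to a 100-point scale. A minimal sketch, assuming the source scale starts at 0 and that the result is rounded to a whole number; `normalise_score` is a made-up name:

```python
def normalise_score(value, scale_max, scale_min=0.0):
    """Linearly rescale a score from [scale_min, scale_max] to 0-100."""
    if not scale_min <= value <= scale_max:
        raise ValueError(f"score {value} is outside {scale_min}-{scale_max}")
    return round((value - scale_min) / (scale_max - scale_min) * 100)


amazon_score = normalise_score(4.1, 5)  # 82, matching the example in the post
```

A site-level average such as 4.1 is already an aggregate, so it is rescaled once rather than per individual customer review.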
  13. @Pandora no, I haven’t done any differentiation between RA reviews and others. I wanted to create as generalized a structure as possible so that it could work for multiple review sources (for that fun inter-site comparison). Would be a really interesting thing to explore, though! I’ll add it to my feature wishlist. Reading all of these reviews made me appreciate the work you guys in the RA do even more - thanks for everything you and your peers do :) Could you point me to the reviews you’re missing so I could try to figure out where the error lies? I built a small script that scrapes the search page, so the error could be either on the search page’s end, my script’s end or my own manual curation’s end. I really want the dataset to be as complete as possible - thanks for letting me know!
  14. Sure, we could do that pretty easily! At the moment I have a JSON export, but I could build an exporter for Eurobricks as well. A few questions, then:

      • Do you want an exporter where you select categories and get all sets in that category?
      • Or would it be better for you to select individual set IDs manually?

    I’ll research this more in the coming days, but I’d love to put something together that makes it as easy as possible for you.
  15. Hey guys! As part of an experiment I've indexed all reviews from Eurobricks. I wanted to share some fun statistics! Here's what I've done:

      • Gather all reviews.
      • For each review*:
          - Extract the set ID and map it to Brickset's database
          - Extract a short summary (if possible)
          - Extract the author
          - If there's an overall score, extract that and normalize it to 1-100

    This lets us figure out some fun stuff:

      • I've found a total of 3275 reviews from Eurobricks (whoa!).
      • 74% of these are scored in one way or another (that's 2422 scored and 853 unscored).
      • The average score is 81.83. Taking one standard deviation on either side of the average, a good score for a Eurobricks review is better than 93.11, and a bad score is below 70.55. Everything between those numbers is average, statistically speaking.
      • 627 individual authors have produced these reviews.
      • I've been doing the same for a lot of other reviews as well, and the global average score is 79.67. Eurobricks is pretty darn spot on - no mean feat!

    This has been a fun experiment. If you're interested in seeing more you can see the stats for Eurobricks here. That page also lets you find all reviews written by a specific author. The entire experiment can be found on brickinsights.com - it's a work in progress, but any feedback you have is appreciated. I'd be happy to try to answer any questions you have about this, or provide the data for you to do your own analysis. Hope you enjoyed it as much as I did!

    * Some caveats: I found the reviews by making a site-wide search for titles containing the word "review". If a review hasn't followed this convention it is not included. Even then, not all reviews were included. There were a few that mostly consisted of broken images or just links to other sites. These were left out. I thought of it this way: if a random person looking at the review would find it useful, it should be included in the dataset. If not, then it's out. Not always an easy call to make.

    Cheers! Linus
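The good/bad thresholds quoted in this post sit the same distance (11.28) above and below the 81.83 average, which is what you get from the mean plus or minus one standard deviation. A small sketch of that calculation, assuming the population standard deviation is the one used (the post does not say whether it is the population or sample formula) and using made-up scores:

```python
from statistics import mean, pstdev


def score_bands(scores):
    """Return (low, average, high) where low/high are one std dev from the mean."""
    mu = mean(scores)
    sigma = pstdev(scores)  # population standard deviation
    return mu - sigma, mu, mu + sigma


def classify(score, low, high):
    """Label a score relative to the one-standard-deviation band."""
    if score > high:
        return "good"
    if score < low:
        return "bad"
    return "average"


# Made-up normalised scores, for illustration only.
scores = [60, 70, 80, 90, 100]
low, mu, high = score_bands(scores)
labels = [classify(s, low, high) for s in scores]
```

Swapping `pstdev` for `stdev` gives the sample version, which yields a slightly wider band on small datasets.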