The Margin of Error of Projections

  • The Margin of Error of Projections

    The comments on the HQ Players Statistics and Projections page provide helpful information on how the projections were put together.

    However, what about the reliability of the projections as each baseball season comes to a close? Do we have a statistical measure of, say, how accurate the 2006 HQ projections were overall? How far off were they when the books closed on the final 2006 stats?

    And what were the greatest sources of discrepancy outside of playing time issues (which I assume are probably the most unpredictable)?

    By the same token, which stats proved to be the most reliable, by order of reliability?

    Were the same sources of discrepancy present in each of the previous years or have these varied from year to year? For instance, were the greatest discrepancies in terms of counting stats or percentage stats overall? In the counting stats, which were the greatest sources of variance for both pitchers and hitters? How about in the percentage stats?

    Just to be clear, I am not looking at leading indicators or any of the hidden stats this site uses to arrive at its projections -- just looking at the actual stats themselves that are relied upon in most leagues as scoring categories.

    Thanking you in advance.
    RIP Paco de Lucia.

  • #2
    I can't find the link, but some blog had accuracy numbers, and HQ was first by a decent amount.



    • #3
      Thanks. I don't doubt HQ's would be amongst the very best, if not THE most reliable versus what's out there.

      I was looking to peel the onion and see where, within HQ's layers of stats, the greatest sources of reliability and error lie from year to year.
      Last edited by JVR; 12-20-2006, 04:00 PM.
      RIP Paco de Lucia.



      • #4
        Originally posted by adichiara
        I can't find the link but some blog had the numbers of accuracy and HQ was 1st by a decent amount.
        I recall a blog where PECOTA was regarded as the most accurate set of predictions for 2006. Ron Shandler's BaseballHQ might have been in first place before the guy added in PECOTA. Hat tip to BaseballThinkFactory.org for pointing it out.

        In general, I think Ron's position has been that he doesn't like this after-the-fact judging, because the criterion ought to be which tout most positively impacts its customers' ability to win, not a narrow mathematical question of who has the lowest variance from the final numbers (even if we could agree on how to measure that precisely).

        I may eventually get back to trying to evaluate tout projections myself. I collected the data a couple of years ago, but it became too big a task for me. My sense is that there isn't much persistent difference in accuracy among the top tier of projections.
        "If you torture data long enough, they will confess." -- Ronald Coase



        • #5
          You know what would be interesting... if every day we named an AL and NL hitter and pitcher, and everyone on the board wrote down their projections. If we tally it up and find the mean, I wonder how accurate that would be. This is piggybacking off my post about the request for greater participation in polls. What do you think?
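A minimal sketch of the tally-and-average idea, assuming each board member submits one number per stat. The guesses and the "actual" total below are hypothetical, made up purely for illustration. One property worth noting: the error of the averaged guess can never be larger than the average of the individual errors, though any single poster can still beat the consensus.

```python
from statistics import mean


def crowd_projection(guesses):
    """Average a list of individual projections for one stat."""
    if not guesses:
        raise ValueError("need at least one guess")
    return mean(guesses)


def abs_error(projected, actual):
    """Absolute miss between a projection and the final number."""
    return abs(projected - actual)


# Hypothetical home-run guesses from five board members (not real data).
board_guesses = [22, 30, 25, 18, 35]
actual_hr = 27  # hypothetical final total

consensus = crowd_projection(board_guesses)
individual_errors = [abs_error(g, actual_hr) for g in board_guesses]
consensus_error = abs_error(consensus, actual_hr)

print("consensus:", consensus)
print("consensus error:", consensus_error)
print("average individual error:", mean(individual_errors))
```

In this made-up case two of the five posters miss by 3 or less, but the consensus misses by less than any of them.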
          "I mean, look at you. You don't even have a name tag. You've got no chance. Why don't you just fall down?"

          -Nigel Powers (Goldmember)


          5*5 HTH 10 teams $380 cap
          Use OPS instead of AVG and Saves +.5Holds.
          28 man rosters 20 man Farms (C,1b,2b,3b,SS, IF, Of, OF, OF, OF, U, U, 8 SP, 4RP)
          Daily Lineup changes, Weekly FAAB. 60/40 hitter/pitcher split.



          • #6
            Got what I was looking for, straight from the Forecaster, page 47:

            "Fact: The most advanced prognosticating systems are only going to be 'on target' 70% of the time. That leaves a huge variability in player value."

            Now, this depicts the built-in discrepancy of projected stats as a function of player value, rather than statistical data per se. But it does provide perspective.
            RIP Paco de Lucia.



            • #7
              I was just invited again to participate in another comparative analysis, and I declined. When the inviter became more insistent about my participation, I shot back with this single response:

              "With a stat like OPS being the most popular comparison gauge, I could have projected Jose Reyes' 2006 stat line, ended up with Richie Sexson, and be deemed a success (both OPS ~ .840). I have no interest in these types of exercises. Thanks."

              Of course, BP will be trumpeting that PECOTA beat Shandler, which Peter Gammons will pick up on, and then all the world will see us as also-rans. If you want to judge my success based on how well I can project a Jose Reyes and end up with Richie Sexson, be my guest.
              "Inside every cynical person, there is a disappointed idealist." -- George Carlin



              • #8
                Well, PECOTA seems interesting to me, but I'm not sure it can really tell the whole story. I believe PECOTA finds a group of players from the history of baseball who are comparable to a given player and uses that (plus some other factors) to project performance. To me their projections always seem a little negative, so it seems they wouldn't project too many breakout performances but would catch collapse performances more often.

                Honestly, it's not the projections that I come to HQ for (though they are fun), but the analysis and thought-provoking ideas. In general, any article I read on baseball tends to be about analysis or a new study, or trying to put a value on something previously difficult to value, like baserunning, fielding, etc. I'm not concerned with whether Reyes' OPS was projected within .10 points (although it makes for fun debate).



                • #9
                  Just to echo Ron's point, the other aspect of this where I think HQ outshines everyone else is in projecting playing time. Not to toot our own horn, but I think we put more effort into that side of the projections than anyone, and it shows in the output. Again, that's something that is lost in OPS comparisons, but is perhaps even more critical to a "good" projection.



                  • #10
                    Originally posted by RAY@HQ
                    Just to echo Ron's point, the other aspect of this where I think HQ outshines everyone else is in projecting playing time. Not to toot our own horn, but I think we put more effort into that side of the projections than anyone, and it shows in the output. Again, that's something that is lost in OPS comparisons, but is perhaps even more critical to a "good" projection.
                    Frankly, what HQ taught me is to buy skills and that skills lead to playing time. The breakout candidates become more obvious when viewed this way. I get what they are doing with PECOTA, but I've never fully warmed up to the concept since it's just a more thorough version of the method of looking at the past to guess the future trends. Yeah, that can work for the general case, but it tends to miss the outliers (who tend to be outliers for a variety of reasons that BPIs can sniff out).

                    If I want projections for the general case, any ol' book or method will get you the idea that Jeter will have a high batting average and score a lot of runs, and that Big Papi will hit a bunch of dingers. I don't care about those cases. I want to catch the up-and-coming guys who don't have enough of a track record for PECOTA to be as relevant, but from the moment they picked up a bat in low-A have been laying down BPIs to be deciphered to a fair degree of accuracy.

                    YMMV, but for me the methods HQ uses match what feels right to my analytical mind. I don't care if a $24 projection winds up being $20 or $29. What I care about is making the 70% percentage play which, in the long run, will pay off whether you are playing baseball or cards in Vegas.
                    MiLBAnalysis.com / @NickRichardsHQ



                    • #11
                      Originally posted by Nick
                      Frankly, what HQ taught me is to buy skills and that skills lead to playing time. The breakout candidates become more obvious when viewed this way.
                      You have learned your lessons well, young Jedi warrior. :-)

                      In all seriousness, this is a good point that shouldn't be lost. I was just trying to point out that another shortcoming of OPS-based comparisons is that there's no playing time component. My 840 OPS over 400 AB is a different projection than your 840 OPS over 250 AB, or 550 AB.
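To put numbers on the playing-time point, here is a rough sketch. The `Projection` class, the OBP/SLG split of the .840 OPS, and the stat lines are all hypothetical, and total bases (SLG times AB) stands in for counting stats in general. An OPS-only comparison scores both projections as perfect, while a counting-stat comparison separates them.

```python
from dataclasses import dataclass


@dataclass
class Projection:
    """Hypothetical stat line: at-bats plus the two halves of OPS."""
    ab: int
    obp: float
    slg: float

    @property
    def ops(self):
        return self.obp + self.slg

    @property
    def total_bases(self):
        # SLG is total bases per at-bat, so this recovers the counting stat.
        return self.slg * self.ab


def ops_error(proj, actual):
    """Miss measured on the rate stat only (ignores playing time)."""
    return abs(proj.ops - actual.ops)


def total_bases_error(proj, actual):
    """Miss measured on a counting stat (playing time matters)."""
    return abs(proj.total_bases - actual.total_bases)


# Made-up season and two made-up projections, all with an .840 OPS.
actual = Projection(ab=550, obp=0.340, slg=0.500)
proj_a = Projection(ab=550, obp=0.340, slg=0.500)
proj_b = Projection(ab=250, obp=0.340, slg=0.500)

for name, proj in (("A", proj_a), ("B", proj_b)):
    print(name, ops_error(proj, actual), total_bases_error(proj, actual))
```

Projection B looks exactly as good as A on OPS, yet it is 150 total bases short of what the player delivered.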



                      • #12
                        I don't know how this thread spiralled into a discussion of us versus them -- I am fully sold on HQ, and hence don't see the value of comparing how some other site moves its decimals around.

                        My only intent was to help me get a better gauge of HQ's methods within itself, so as to help me more effectively navigate across the overwhelming data HQ has to offer. Hence, my questions about stat reliability versus past years.

                        However, I have to admit that reading the gaming part of the Forecaster gave me the perspective I sought. I am still a novice at this, and it does take a lot of time for the concepts to fully sink in -- especially when it comes to viewing several factors together in order to fully appreciate prevalent trends.

                        R$ values are fine and dandy -- and we all use them to a certain extent to drive points across. But the other stuff, the determination and monitoring of skills -- that's just priceless, the more you delve.
                        RIP Paco de Lucia.



                        • #13
                          Why Should It Cost You?

                          To me, that 70% pronouncement really hits the nail on the head.

                          For those of us who play auction, where a dollar value has to be assigned to each and every player rostered, this means that even the players who fall within the 70% of correct stat projections will have their dollar values adjusted somewhat by the relative changes to the 30% that were wrong. So even though their projections were dead on, we will have paid an inappropriate amount for them, based on the corrections required from the incorrect segment.
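The knock-on effect described above can be illustrated with a deliberately simplified valuation model. It assumes dollar values are just each player's share of the pool's total production times a fixed budget; real valuation systems (replacement level, category weighting) are more involved. The player names, point totals, and budget are hypothetical.

```python
def dollar_values(points, budget):
    """Split a fixed budget across players in proportion to their points."""
    total = sum(points.values())
    if total <= 0:
        raise ValueError("pool must have positive total points")
    return {name: budget * p / total for name, p in points.items()}


BUDGET = 100  # hypothetical league-wide dollars for this tiny pool

# Player A's projection turns out exactly right; B and C miss.
projected = {"A": 50, "B": 30, "C": 20}
actual = {"A": 50, "B": 15, "C": 60}

paid = dollar_values(projected, BUDGET)
earned = dollar_values(actual, BUDGET)

for name in projected:
    print(name, "paid", paid[name], "earned", earned[name])
```

Player A produced exactly the 50 points projected, yet his value drops from $50 to $40 because the rest of the pool moved around him.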

                          It's like what Bruce Lee says in "Enter The Dragon":
                          It is like a finger pointing away to the moon; focus on the finger and you will miss all that heavenly glory.
                          Use the force, Luke.
                          "Well, in all my years I ain't never heard, seen nor smelled an issue that was so dangerous it couldn't be talked about. Hell yeah! I'm for debating anything. Rhode Island says yea!"
                          - Stephen Hopkins, Delegate from RI in the film "1776"



                          • #14
                            Originally posted by Nick
                            Frankly, what HQ taught me is to buy skills and that skills lead to playing time. The breakout candidates become more obvious when viewed this way.
                            I agree with this 835.9%. I have a pretty good feel for what the Konerkos and Pujolses of the world are going to do. It's HQ's view of the underlying skills that points you towards the difference makers.

                            In a 16-team mixed league (with 25 rounds), HQ's tools and discussion boards led me to these as my last six picks.

                            20) Juan Rivera
                            21) Scott Linebrink
                            22) Jose Lopez
                            23) Josh Willingham
                            24) Justin Verlander
                            25) Fernando Rodney

                            Every provider will give you the same main course. It's the little things at the end, like seasoning and sauce that make the difference.
                            "The problem with quotes found on the internet is that it is nearly impossible to verify the source." - Abraham Lincoln



                            • #15
                              Analysis Would be Good and Fun - Why Not?

                              I just joined BaseballHQ and came across this thread which is my main interest right now. The reason I joined was hoping to get last year's projections as I am doing an analysis right now on the various commercial projections and my own amateurish attempts from last year, starting with ERA.

                              After seeing Mr. Shandler's post above, I see that I am going to have to do it without BaseballHQ's 2006 projections, which is disappointing. I understand Mr. Shandler's objections to such a thing, though not their forcefulness, if the purpose is only to determine which projectionist did best last year. That is likely to change from year to year and would depend on the methodology of the analysis (you can make numbers say anything you want if you try hard enough). But there are other things to be learned. My purpose is to weed out the fakes from the professionals and to see if different pros were better in different areas.

                              What I have learned so far: my own ERA projections from last year, based on a 3-year weighted average of DIPS adjusted for park/league changes, age, etc., actually held up really well against the pros when I had enough data (number of years in the majors and innings pitched). For pitchers who pitched very little or not at all in the major leagues the previous year, the pros really shine against an amateur like myself, and in my opinion that is why they are valuable, if indeed they are pros.
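For anyone curious what a 3-year weighted average looks like in practice, here is a rough sketch. The 3/2/1 weights, the function name, the single lump-sum `adjustment` term (standing in for park, league, and age corrections), and the sample DIPS figures are all assumptions for illustration; the post does not say which weights or adjustments were actually used.

```python
def weighted_era_projection(dips_by_year, weights=(3, 2, 1), adjustment=0.0):
    """Project ERA from recent DIPS values.

    dips_by_year: DIPS ERA per season, most recent season first.
    weights: hypothetical recency weights, most recent season first.
    adjustment: hypothetical lump-sum park/league/age correction.
    """
    if len(dips_by_year) != len(weights):
        raise ValueError("need exactly one weight per season")
    weighted_sum = sum(w * d for w, d in zip(weights, dips_by_year))
    return weighted_sum / sum(weights) + adjustment


# Made-up pitcher: DIPS of 3.60 last year, 4.20 and 3.90 before that.
projection = weighted_era_projection([3.60, 4.20, 3.90])
print(round(projection, 2))
```

A fuller version would probably also weight each season by innings pitched, which would help explain why the method holds up only when there is enough data.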

                              The most interesting thing I have learned so far, though, is how totally useless a pitcher's previous year's ERA was in projecting this year's ERA. My guess is a lot of people already know that, but I was astounded at how worthless it was: the absolute worst of about a dozen different ERA projections/actuals I tested. It goes to show how worthless ERA is as a stat in and of itself in evaluating the performance of a pitcher.
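A comparison like the one described could be scored along these lines. This is only a sketch: the post does not say which error measure was used, so mean absolute error and root mean squared error are assumptions here, and the system names and ERA figures are invented.

```python
import math


def mae(projected, actual):
    """Mean absolute error across pitchers."""
    return sum(abs(p - a) for p, a in zip(projected, actual)) / len(actual)


def rmse(projected, actual):
    """Root mean squared error; punishes big misses more than MAE."""
    squared = sum((p - a) ** 2 for p, a in zip(projected, actual))
    return math.sqrt(squared / len(actual))


def rank_systems(systems, actual):
    """Return system names ordered from lowest MAE to highest."""
    return sorted(systems, key=lambda name: mae(systems[name], actual))


# Invented final ERAs for three pitchers and two invented projection sets.
actual_era = [3.50, 4.00, 4.50]
systems = {
    "last_year_era": [2.50, 5.00, 3.50],
    "skills_based": [3.70, 4.10, 4.30],
}

for name in rank_systems(systems, actual_era):
    print(name, round(mae(systems[name], actual_era), 3),
          round(rmse(systems[name], actual_era), 3))
```

Running the same scoring separately per stat category would show whether different systems are better in different areas.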

                              Now to the subject that I imagine Mr. Shandler wishes I do not address here, but I will. So far, I have only been able to obtain 2006 projections from Baseball Information Systems (Bill James), Baseball Prospectus and Baseball Think Factory. Baseball Prospectus was the clear winner, not even close, in particular their PERA (Peripheral ERA). While I suspect that Mr. Shandler's projections would compare favorably to those of Baseball Prospectus, I guess I will never know, or at least I won't know until next year.
                              Last edited by jackvdo; 01-03-2007, 10:19 PM.

