Brian Enos's Forums... Maku mozo!

Percentage of Shooters in Each Class by Division


Tanders


I saw a post on here a while back (linked below) in which several people were discussing what percentage of shooters make A class and above. Someone broke down the number of shooters in each class for all divisions using data from a page that no longer exists on the USPSA site. I'm kind of interested to see how the percentages have changed over the past decade, but I can't find the data listed anywhere. I've been looking into scraping the classification data from all of the "Classifier Lookup" pages, but I don't have any experience with Python and I'm not sure I want to invest the time to figure it out. Does anyone have better coding skills than me, or access to more current classification percentages they would care to share?

 

 



I've seen @lstange post some interesting stats based on pulled data in a thread or two. 

 

An interesting variable (in addition to the time elapsed since the quoted post) would be the high hit factor change that happened a year or so ago. There was a thread somewhere with a table of the change between the old HHF and new HHF.


12 hours ago, Tanders said:

I've been looking into scraping the classification data from all of the "Classifier Lookup" pages

I don't see a clean way to do it based on publicly available data. Classifier lookup only works for current valid USPSA IDs, so you'll need to somehow correct for survivorship bias. It might be possible to look at classifications as of today, then make some inferences based on the ID itself (I suspect they are assigned sequentially) or the join date, but this becomes tricky with people shooting multiple divisions and taking time off from shooting (sometimes decades).


21 minutes ago, Rez805 said:

An interesting variable (in addition to the time elapsed since the quoted post) would be the high hit factor change that happened a year or so ago. There was a thread somewhere with a table of the change between the old HHF and new HHF.

There were two recent HHF updates, one on June 28th, 2018 (sizeable increase) and another on May 1st, 2019 (mostly just corrections). Don't know what happened before that. It should be possible to reverse out old HHF by looking at hit factors and percent on classification lookup pages, but I don't see any practical application of this knowledge beyond curiosity or nostalgia.
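
For anyone who does want to try it, here is a rough sketch of that reversal. It assumes the percent on the lookup page is simply hit factor divided by HHF, times 100. The function name and the sample score are made up for illustration.

```python
def implied_hhf(hit_factor: float, percent: float) -> float:
    """Back out the high hit factor (HHF) in force when a score was recorded.

    Assumes percent = 100 * hit_factor / HHF, so HHF = 100 * hit_factor / percent.
    """
    if percent <= 0:
        raise ValueError("percent must be positive to back out an HHF")
    return 100.0 * hit_factor / percent


# Hypothetical score: a 6.5 hit factor that the page lists as 65.0 percent.
print(round(implied_hhf(6.5, 65.0), 4))
```

Since the page rounds both numbers, the result from any single score will only be approximate; averaging over several scores on the same classifier should tighten it up.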


I was looking at running a for loop that pulls the classification entry for each division for every possible member number, with error handling that discards "X" entries for expired accounts and ignores nonexistent member numbers. Does this sound feasible? Apologies if it is obvious that I don't know what I'm talking about.
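
A minimal sketch of the tallying half of that loop, assuming each lookup has already been reduced to (division, class letter) pairs. The "X" marker for expired accounts is taken from the description above; treating "U" as unclassified, the function name, and the sample records are assumptions for illustration only.

```python
from collections import Counter, defaultdict


def class_percentages(records, skip=("X",)):
    """Return {division: {class_letter: percent of that division's shooters}}.

    records is an iterable of (division, class_letter) pairs.
    Entries whose class letter is in skip (expired accounts) are discarded.
    """
    counts = defaultdict(Counter)
    for division, letter in records:
        if letter in skip:
            continue
        counts[division][letter] += 1
    result = {}
    for division, per_class in counts.items():
        total = sum(per_class.values())
        result[division] = {
            letter: 100.0 * n / total for letter, n in per_class.items()
        }
    return result


# Made-up sample, not real USPSA data.
sample = [
    ("Production", "A"), ("Production", "B"), ("Production", "B"),
    ("Production", "X"), ("Limited", "C"), ("Limited", "U"),
]
for division, shares in class_percentages(sample).items():
    print(division, shares)
```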


The way to do it would be to loop over all possible member numbers, try to access uspsa.org/classification/{memberNumber}, and scrape the HTML in the Classification table.


 

There are Python libraries for scraping HTML, and Pandas can convert an HTML table directly into a DataFrame. Here's a tutorial: https://towardsdatascience.com/web-scraping-html-tables-with-python-c9baba21059

 

Shouldn't be too difficult to figure out, just make sure you handle bad requests from invalid member numbers and put in a time.sleep() call so you don't crash the USPSA website. 

 

If you/no one else does it before this weekend, I'll give it a try. 
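
The steps above could be sketched roughly like this. To keep it dependency-free, this version swaps Pandas for the standard library's html.parser and urllib. The URL pattern is the one given above; the page layout, the class and function names, and the sample HTML are assumptions, so treat it as a starting point rather than a working scraper.

```python
import time
import urllib.request
from html.parser import HTMLParser

BASE_URL = "https://uspsa.org/classification/{}"  # pattern from the post above


class TableRows(HTMLParser):
    """Collect the text of every <tr> as a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_rows(html):
    """Return every table row found in html as a list of cell strings."""
    parser = TableRows()
    parser.feed(html)
    return parser.rows


def scrape(member_numbers, delay=1.0):
    """Yield (member_number, rows) pairs, skipping numbers that fail to load."""
    for number in member_numbers:
        time.sleep(delay)  # pause before every request so the site is not hammered
        try:
            with urllib.request.urlopen(BASE_URL.format(number), timeout=10) as resp:
                html = resp.read().decode("utf-8", errors="replace")
        except OSError:
            continue  # HTTP errors, timeouts, and bad member numbers all land here
        yield number, parse_rows(html)


# Made-up HTML standing in for a Classification table; the real layout may differ.
demo = ("<table><tr><th>Division</th><th>Class</th></tr>"
        "<tr><td>Production</td><td> B </td></tr></table>")
print(parse_rows(demo))
```

Nothing hits the network until you iterate over something like `scrape(range(1, 4))`. Real member numbers may carry a letter prefix, so the range would need adjusting.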


1 hour ago, regor said:

The way to do it would be to loop over all possible member numbers, try to access uspsa.org/classification/{memberNumber}, and scrape the html in the Classification table.


Thanks for taking a look at it, Jordan!  That's exactly how I was planning to do it, but I can only code in Matlab and not even very well in that.  I'll check out that tutorial you linked.


  • 4 weeks later...
11 hours ago, obsessiveshooter said:

I would be far more curious to see how many people are in each class based on their current actual percentage, not on something they achieved potentially 10 years ago.

Here's the empirical cumulative distribution function of classification percent for Production:

[Figure: empirical CDF of classification percent for Production]


4 hours ago, obsessiveshooter said:

Can you explain the y axis

The Y axis is the number of people with a classification percent less than or equal to the value on the X axis, divided by the total number of people.

 

In other words, about 1/4 are unclassified, and the classification percent of the remaining 3/4 is approximately normally distributed with mean 57% and standard deviation 18%.
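
To make both descriptions concrete, here is a small sketch: `ecdf` computes the Y value as defined above, and `model_cdf` is the smooth approximation (1/4 unclassified plus a normal bulk). Counting unclassified shooters as 0%, the function names, and the sample list are assumptions for illustration; none of the numbers in `sample` are real data.

```python
import math


def ecdf(values, x):
    """Fraction of values less than or equal to x."""
    values = list(values)
    return sum(1 for v in values if v <= x) / len(values)


def normal_cdf(x, mean, sd):
    """CDF of a normal distribution, written with the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def model_cdf(x, unclassified=0.25, mean=57.0, sd=18.0):
    """Smooth stand-in for the plot: a spike at zero plus a normal bulk."""
    if x < 0:
        return 0.0
    return unclassified + (1.0 - unclassified) * normal_cdf(x, mean, sd)


# Made-up percents; 0.0 stands for an unclassified shooter.
sample = [0.0, 0.0, 42.5, 55.0, 61.3, 70.2, 78.9, 88.0]
print(ecdf(sample, 60.0))     # share of the sample at or below 60%
print(1.0 - model_cdf(75.0))  # modelled share of everyone above 75%
```

Under that model, roughly 12% of all Production shooters would currently sit above 75%, the A class cutoff. That is based on current percent only, not on the class someone actually holds.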

