Brian Enos's Forums... Maku mozo!
Tanders

Percentage of Shooters in Each Class by Division


I saw a post on here a while back (linked below) in which several people were discussing what percentage of shooters make A class and above. Someone broke down the number of shooters in each class for all divisions using data from a page that no longer exists on the USPSA site. I'm kind of interested to see how the percentages have changed over the past decade, but I can't find the data listed anywhere. I've been looking into scraping the classification data from all of the "Classifier Lookup" pages, but I don't have any experience with Python and I'm not sure I want to invest the time to figure it out. Does anyone have better coding skills than me, or access to more current classification percentages they would care to share?

 

 

15 minutes ago, rowdyb said:

Not publicly published any more.

 

Any rationale?

12 hours ago, Tanders said:

I saw a post on here a while back (linked below) in which several people were discussing what percentage of shooters make A class and above. Someone broke down the number of shooters in each class for all divisions using data from a page that no longer exists on the USPSA site. I'm kind of interested to see how the percentages have changed over the past decade, but I can't find the data listed anywhere. I've been looking into scraping the classification data from all of the "Classifier Lookup" pages, but I don't have any experience with Python and I'm not sure I want to invest the time to figure it out. Does anyone have better coding skills than me, or access to more current classification percentages they would care to share?

 

 

I've seen @lstange post some interesting stats based on pulled data in a thread or two. 

 

An interesting variable (in addition to the time elapsed since the quoted post) would be the high hit factor change that happened a year or so ago. There was a thread somewhere with a table of the change between the old HHF and new HHF.

12 hours ago, Tanders said:

I've been looking into scraping the classification data from all of the "Classifier Lookup" pages

I don't see a clean way to do it based on publicly available data. Classifier lookup only works for current valid USPSA IDs, so you'll need to somehow correct for survivorship bias. It might be possible to look at classifications as of today, then make some inferences based on the ID itself (I suspect they are assigned sequentially) or the join date, but this becomes tricky with people shooting multiple divisions and taking time off from shooting (sometimes decades).

21 minutes ago, Rez805 said:

An interesting variable (in addition to the time elapsed since the quoted post) would be the high hit factor change that happened a year or so ago. There was a thread somewhere with a table of the change between the old HHF and new HHF.

There were two recent HHF updates, one on June 28th, 2018 (sizeable increase) and another on May 1st, 2019 (mostly just corrections). Don't know what happened before that. It should be possible to reverse out old HHF by looking at hit factors and percent on classification lookup pages, but I don't see any practical application of this knowledge beyond curiosity or nostalgia.
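
A minimal sketch of backing out an HHF from a single score, assuming the percent shown on the lookup page is simply 100 × hit factor / HHF. That formula, the function name, and the sample numbers are assumptions for illustration, not anything confirmed by USPSA:

```python
def implied_hhf(hit_factor: float, percent: float) -> float:
    """Back out the high hit factor implied by one classifier score.

    Assumes percent = 100 * hit_factor / HHF (an assumption, see above).
    """
    if percent <= 0:
        raise ValueError("percent must be positive")
    return 100.0 * hit_factor / percent


# Hypothetical score: a 6.0 hit factor recorded as 75% implies an HHF of 8.0.
print(implied_hhf(6.0, 75.0))
```

Scores shown as exactly 100% would be of no use here if percents are capped, and rounding on the page limits precision, so averaging over several scores from the same classifier and date range would probably give a steadier estimate.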


I was looking at running a for loop that pulls out the classification entry for each division for every possible member number, with error handling that discards "X" entries for expired accounts and ignores nonexistent member numbers. Does this sound feasible? Apologies if it is obvious that I don't know what I'm talking about.
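
Once the entries are pulled, the tallying step might look roughly like this. It is only a sketch: the (division, class) record format, the function name, and the sample data are all made up for illustration, and the only filtering it does is dropping "X" entries as described above.

```python
from collections import Counter


def class_percentages(records):
    """records: iterable of (division, class_letter) pairs.

    Returns {division: {class_letter: percent of that division's kept entries}}.
    """
    counts = {}
    for division, cls in records:
        if cls == "X":  # expired account, discard
            continue
        counts.setdefault(division, Counter())[cls] += 1
    return {
        division: {cls: 100.0 * n / sum(c.values()) for cls, n in c.items()}
        for division, c in counts.items()
    }


# Hypothetical sample data, not real USPSA records.
sample = [
    ("Production", "A"),
    ("Production", "B"),
    ("Production", "B"),
    ("Production", "X"),
    ("Open", "M"),
]
result = class_percentages(sample)
print(result)
```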


The way to do it would be to loop over all possible member numbers, try to access uspsa.org/classification/{memberNumber}, and scrape the HTML in the Classification table.

[Image: image.png]

There are Python libraries for scraping HTML, and pandas can convert directly from an HTML table to a DataFrame. Here's a tutorial: https://towardsdatascience.com/web-scraping-html-tables-with-python-c9baba21059

Shouldn't be too difficult to figure out. Just make sure you handle bad requests from invalid member numbers and put in a time.sleep() call so you don't crash the USPSA website.

If neither you nor anyone else does it before this weekend, I'll give it a try.
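
A rough sketch of that loop using only the standard library. The URL pattern is the one quoted above; the page layout, the member number format, and every name in the code (TableRows, fetch_html, scrape) are assumptions for illustration. It collects every table row on the page rather than picking out the Classification table specifically, since the real markup isn't known here.

```python
import time
import urllib.request
from html.parser import HTMLParser

BASE_URL = "https://uspsa.org/classification/{}"  # pattern quoted in the post


class TableRows(HTMLParser):
    """Collects the text of every <td>/<th> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell and self._row:
            self._row[-1] += data.strip()


def fetch_html(member_number, timeout=10):
    """Return the page HTML, or None on a bad request or network error."""
    try:
        url = BASE_URL.format(member_number)
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except OSError:  # covers HTTPError, URLError, and timeouts
        return None


def scrape(member_numbers, fetch=fetch_html, delay=1.0):
    """Map each member number that returned a page to its table rows."""
    results = {}
    for number in member_numbers:
        html = fetch(number)
        if html is not None:
            parser = TableRows()
            parser.feed(html)
            results[number] = parser.rows
        time.sleep(delay)  # be polite to the server
    return results
```

Something like scrape(["A12345", "A12346"]) would then return a dict of rows per member number (those IDs are placeholders). Passing fetch as a parameter makes it easy to try the parsing on saved pages first, and pandas.read_html could stand in for the TableRows class if pandas and an HTML parser backend are installed.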

1 hour ago, regor said:

The way to do it would be to loop over all possible member numbers, try to access uspsa.org/classification/{memberNumber}, and scrape the HTML in the Classification table.

[Image: image.png]

There are Python libraries for scraping HTML, and pandas can convert directly from an HTML table to a DataFrame. Here's a tutorial: https://towardsdatascience.com/web-scraping-html-tables-with-python-c9baba21059

Shouldn't be too difficult to figure out. Just make sure you handle bad requests from invalid member numbers and put in a time.sleep() call so you don't crash the USPSA website.

If neither you nor anyone else does it before this weekend, I'll give it a try.

Thanks for taking a look at it, Jordan!  That's exactly how I was planning to do it, but I can only code in Matlab and not even very well in that.  I'll check out that tutorial you linked.


I would be far more curious to see how many people are in each class based on their current actual percentage, not on something they achieved potentially 10 years ago.

11 hours ago, obsessiveshooter said:

I would be far more curious to see how many people are in each class based on their current actual percentage, not on something they achieved potentially 10 years ago.

Here's the empirical cumulative distribution function of classification percent for Production:

[Image: index.png]

4 hours ago, obsessiveshooter said:

Can you explain the y axis?

The Y axis is the number of people with a classification percent less than or equal to the value on the X axis, divided by the total number of people.
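
As a small sketch of that definition (the function name and the sample values are invented, and treating unclassified shooters as 0% is an assumption):

```python
def ecdf(percents, x):
    """Fraction of shooters whose classification percent is <= x."""
    return sum(1 for p in percents if p <= x) / len(percents)


# Hypothetical data: two unclassified shooters entered as 0.0.
sample_percents = [0.0, 0.0, 45.0, 58.0, 62.0, 71.0, 80.0, 96.0]
for x in (0, 60, 100):
    print(x, ecdf(sample_percents, x))
```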

 

In other words, about 1/4 are unclassified, and the classification percent of the remaining 3/4 is approximately normally distributed with a mean of 57% and a standard deviation of 18%.
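
Taking that rough model at face value, the share of all Production shooters at or above a given class can be estimated as below. This is a sketch under the stated approximation only: the 25% unclassified share, mean, and standard deviation are the eyeballed figures from this post, the class floors used are 75% for A, 85% for M, and 95% for GM, and the names in the code are made up.

```python
from statistics import NormalDist

UNCLASSIFIED = 0.25                  # approximate share from the plot
MODEL = NormalDist(mu=57, sigma=18)  # approximate fit for classified shooters


def fraction_at_or_above(percent):
    """Estimated share of all shooters classified at or above `percent`."""
    return (1 - UNCLASSIFIED) * (1 - MODEL.cdf(percent))


for label, floor in [("A", 75), ("M", 85), ("GM", 95)]:
    print(f"{label} class and above: {fraction_at_or_above(floor):.1%}")
```

Since 75% sits exactly one standard deviation above the mean, the A-and-above estimate comes out to about 12% of all Production shooters. The normal curve ignores the cap at 100%, so the tail figures for M and GM should be read loosely.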
