- Scraping PGA Tour Stats: A script (
scrapePGAStats.js
) using Puppeteer to scrape statistics about the toughest holes on the PGA Tour from the official PGA Tour website. The script collects data such as hole number, yards, average, and scores for various scenarios (e.g., eagle, birdie, par, bogey, double bogey). - Scraping ESPN Golf Leaderboard: A script (
scrapeGolfSite.js
) using Puppeteer to scrape golf tournament data from ESPN's golf leaderboard pages. The script extracts player names, scores, and hole-by-hole details from specified tournament leaderboard URLs.
- PGA Tour Stats Excel File: An Excel file containing statistics about the toughest holes on the PGA Tour. The data includes information such as hole number, yards, average, and scores for different scenarios.
- Tournament Golf Scores Excel Files: Excel files containing golf tournament data scraped from ESPN's leaderboard pages. The data includes player names, scores, and hole-by-hole details for each tournament.
- The Python script (
data_processing.py
) demonstrates feature engineering and data cleaning steps for Draftkings Golf DFS analysis. - It merges multiple raw data files, handles missing values, filters data, and performs feature engineering tasks such as calculating summary statistics, creating binary columns based on specific conditions, and calculating cumulative scores.
- Fantasy points are computed based on various scoring metrics, and the script provides basic data analysis and summary statistics.
- To use the scraping scripts, make sure to have Node.js installed. Then, install the required dependencies (
puppeteer
,puppeteer-extra
,puppeteer-extra-plugin-stealth
,exceljs
) using npm. - Run the scripts (
scrapePGAStats.js
,scrapeGolfSite.js
) to scrape the desired golf data from the respective sources. - Modify the scripts as needed to customize data scraping or output file names/locations.
# Install dependencies for scraping scripts
npm install puppeteer puppeteer-extra puppeteer-extra-plugin-stealth exceljs
# Run PGA Tour Stats scraping script
node scrapePGAStats.js
# Run ESPN Golf Leaderboard scraping script
node scrapeGolfSite.js
- Ensure proper network connectivity and permissions to access the websites being scraped.
- Be mindful of scraping etiquette and website terms of service to avoid any legal issues or IP bans.
- Customize the scripts and data processing logic as needed for specific use cases or data requirements.
- Utilize the provided Python script for feature engineering and data cleaning tasks related to Draftkings Golf DFS analysis. Adapt the script according to your specific data and analysis needs.