Web Scrape

PGA Tour Stats Scraper

This project centres on building a fully automated web scraper that collects tournament-level statistics from the official PGA Tour website, covering data from 2004 to the present. It overcomes the platform’s manual, one-stat-at-a-time download limitation by letting users extract structured, high-quality .csv datasets for any year or date range, with support for all available stat codes. The tool is available as both a Python script and an interactive Jupyter Notebook, and the repository also includes a complete pre-scraped dataset (2004–2025) for immediate use.

Understanding Trump's Tariff Formula

On April 2, 2025, President Donald Trump declared ‘Liberation Day’, unveiling aggressive new tariffs designed to correct trade imbalances via a controversial formula from the U.S. Trade Representative (USTR). This tariff equation, which aims to achieve a bilateral trade balance of zero, adjusts rates based on export-import disparities, elasticity of demand, and tariff passthrough. This article explains the Trump Tariff formula for a general audience through worked examples, and offers Python tools to replicate and validate the data.
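
As a rough illustration of how the three ingredients named above combine, here is a minimal Python sketch. It assumes the formula takes the form: tariff change = (imports − exports) / (elasticity × passthrough × imports). The function name `tariff_rate_change`, the default parameter values (elasticity of 4, passthrough of 0.25) and the trade figures are assumptions made for this example only; they are not taken from the article's own tools, and any further adjustments applied to the announced rates are left out.

```python
def tariff_rate_change(exports, imports, elasticity=4.0, passthrough=0.25):
    """Sketch of the tariff change intended to close a bilateral trade gap.

    exports, imports: bilateral goods trade with one partner, in the same units.
    elasticity:  magnitude of the import demand elasticity (assumed default).
    passthrough: share of the tariff passed through to import prices (assumed default).
    """
    if imports <= 0:
        raise ValueError("imports must be positive")
    deficit = imports - exports  # positive when the country runs a deficit
    return deficit / (elasticity * passthrough * imports)


# Hypothetical figures, chosen only to make the arithmetic easy to follow.
rate = tariff_rate_change(exports=100.0, imports=400.0)
print(f"{rate:.1%}")  # 75.0%
```

With these assumed defaults the product of elasticity and passthrough equals 1, so the result reduces to the trade deficit divided by imports: (400 − 100) / 400 = 75%.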

Evaluating Environment & Climate Truthfulness in Social Media using Deep Learning & Large Language Models (LLMs)

Awarded Best Dissertation in Cohort, this MSc project explores the detection of climate and environmental misinformation on social media using a comparative framework of traditional natural language processing techniques, deep learning, and Large Language Models (LLMs). Leveraging a web-scraped dataset from PolitiFact, the study highlights the superiority of CNNs trained on ordinal truthfulness data, with accuracy boosted from 80.1% to 84.0% through GPT-4o-driven feature augmentation. While LLMs enhanced contextual understanding and sentiment analysis, their time complexity posed practical limitations. The project contributes novel insights into model performance trade-offs, evaluation metrics tailored to ordinal classification, and the practical integration of LLMs for misinformation mitigation in climate discourse.