How I combined web scraping, agricultural science, and the NOAA API to create a tool that optimizes cattle feed costs based on environmental data

Ranching is a business of razor-thin margins. Two of the biggest variables that can make or break a year are the cost of feed, by far the largest operational expense, and the weather. A sudden heatwave can drastically impact a herd's health and productivity.

As a developer with a background in agriculture, I saw an opportunity to connect these two domains. What if I could build a tool that not only finds the most cost-effective feed but also starts to lay the groundwork for adjusting recommendations based on real-world environmental data?

So, I built a prototype. It's a collection of Python scripts that function as a data-driven assistant for a rancher. It has two core modules: an Economic Engine to optimize feed costs and an Environmental Monitor to pull critical weather data. Here’s how it works.

Module 1: The Economic Engine - Scraping for Savings

The first task was to answer a fundamental question: "Given my herd's nutritional needs, what is the cheapest way to feed them?"

This required two parts: a scientific calculator and a web scraper.

Part A: Translating Agricultural Science into Python

First, I needed to calculate the "Dry Matter Intake" (DMI)—a scientific measure of how much food a cow needs. This isn't a simple number; it depends on the cow's weight, stage of life (lactating, growing, pregnant), milk production, and more. I found peer-reviewed formulas and translated them directly into Python.

cows_2.py - A Glimpse into the DMI Formulas

# Formula for lactating cow DMI (lbs/day), translated from the literature.
# lac_par = parity (number of lactations), lac_weight = body weight (lbs),
# lac_day = days in milk; the remaining constants come from the source paper.
dmi_lac = ((3.7 + int(lac_par) * 5.7) + 0.305 * 624 + 0.022 * int(lac_weight) + \
          (-0.689 - 1.87 * int(lac_par)) * 3) * \
          (1 - (2.12 + int(lac_par) * 0.136)**(-0.053 * int(lac_day)))

# Total feed needed is the per-group DMI summed, multiplied by the feeding window
all_feed_needed = (dmi_lac_needed + dmi_grow_needed + final_preg) * int(days_of_feed)

This script asks the user for details about their herd and calculates the total pounds of feed required. Now I had the demand. Next, I needed the supply.
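As an aside, the raw expression above is easier to test and reuse when wrapped in a function. Here is a sketch of that refactor; the literal constants (5.7, 0.305, 624, and so on) are carried over unchanged from the original formula, and the parameter names are my own:

```python
def lactating_dmi(parity: int, weight_lbs: float, days_in_milk: int) -> float:
    """Daily dry matter intake (lbs) for one lactating cow.

    Direct refactor of the dmi_lac expression above; the numeric
    constants are copied from the original formula as-is.
    """
    base = ((3.7 + parity * 5.7) + 0.305 * 624 + 0.022 * weight_lbs
            + (-0.689 - 1.87 * parity) * 3)
    # Ramp term: intake climbs toward its plateau as days in milk increase
    ramp = 1 - (2.12 + parity * 0.136) ** (-0.053 * days_in_milk)
    return base * ramp

def total_feed_lbs(dmi_per_cow: float, head: int, days: int) -> float:
    """Herd-level demand: per-cow DMI times head count times feeding window."""
    return dmi_per_cow * head * days
```

Named parameters make it obvious which herd details drive the number, and the function can be unit-tested against the paper's worked examples.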

Part B: The Web Scraper for Real-Time Prices

Feed prices change. The only way to get current data is to go to the source. I wrote a web scraper using requests and BeautifulSoup to pull product names and prices directly from a local feed store's website.

cow_save_scrape.py - The Scraper Logic

from bs4 import BeautifulSoup as soup
from urllib.request import Request, urlopen

websiteurl = 'https://shop.berendbros.com/departments/calf-|70|P35|P351.html'
# Some shops block the default Python user agent, so send a browser-like one
req = Request(websiteurl, headers={"User-Agent": 'Mozilla/5.0'})
webpage = urlopen(req).read()
soups = soup(webpage, 'html.parser')

calf_name = []
calf_price = []

for link in soups.find_all('div', class_='card-body'):
    for product in link.find_all('div', class_='product_link'):
        calf_name.append(product.text)
    for price in link.find_all('div', class_='product_price'):
        # Clean the price string (e.g., "$24.99\n" -> 24.99)
        price_text = price.text.strip().strip('$')
        calf_price.append(float(price_text))

The script calculates the total cost to meet the herd's DMI for each feed product and then sorts the list to find the cheapest and most expensive options. This provides an immediate, actionable financial insight.
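The ranking step can be sketched like this. The bag size and product names here are illustrative assumptions (the real script would parse bag weight from the product listing):

```python
import math

def rank_feeds(total_lbs_needed, names, prices, bag_lbs=50):
    """Estimate the cost of meeting the herd's demand with each product,
    cheapest first. Assumes every product comes in bag_lbs bags."""
    options = []
    for name, price in zip(names, prices):
        bags = math.ceil(total_lbs_needed / bag_lbs)  # can't buy a partial bag
        options.append((name, bags * price))
    return sorted(options, key=lambda t: t[1])

# Hypothetical scraped data: 3,000 lbs of demand, two products
ranked = rank_feeds(3000, ['Calf Starter', 'Calf Grower'], [18.99, 15.49])
print(ranked[0])   # cheapest option first
```

Sorting on total cost rather than unit price matters because the herd's demand, not the sticker price, is what the rancher actually pays for.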

Module 2: The Environmental Monitor - Tapping into the NOAA API

Feed cost is only half the equation. Environmental stress, especially heat, has a massive impact on cattle. A cow suffering from heat stress will eat less, produce less milk, and have lower fertility.

To quantify this, I needed data. I turned to the National Oceanic and Atmospheric Administration (NOAA), which offers a fantastic, free API for historical and current weather data from thousands of stations.

My script, weather_1.py, is designed to pull key data points for a list of specific weather stations in my area of interest (College Station, TX).

weather_1.py - Fetching Key Climate Data

import requests
import json

token = 'YOUR_NOAA_API_TOKEN' # Get this from the NOAA website
base_url = 'https://www.ncei.noaa.gov/cdo-web/api/v2/data'
start_date = '2024-04-01'
end_date = '2024-04-03'

# List of data types we want to fetch
data_types = [
    'TMAX', # Maximum Temperature
    'TMIN', # Minimum Temperature
    'RH_AVG', # Average Relative Humidity
    'WIND_SPEED_AVG', # Average Wind Speed
]

us1tx_codes = []  # populate with the USCRN station IDs for your area

for station_id in us1tx_codes:
    print(f"--- Processing station: {station_id} ---")
    params = {
        'datasetid': 'USCRNS', # A specific, high-quality dataset
        'stationid': f'USCRNS:{station_id}',
        'startdate': start_date,
        'enddate': end_date,
        'limit': 1000,
        'datatypeid': data_types
    }
    # The CDO v2 API authenticates with a 'token' request header
    response = requests.get(base_url, headers={'token': token}, params=params)
    response.raise_for_status()
    with open(f'{station_id}.json', 'w') as f:
        json.dump(response.json(), f)

The script systematically queries the API for each station and saves the results into JSON files, creating a local database of recent environmental conditions.
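Reading those files back out of the local "database" is straightforward. This sketch assumes the CDO v2 response shape: a top-level "results" list whose items carry "date", "datatype", and "value" fields (the filename is illustrative):

```python
import json
from collections import defaultdict

def daily_tmax(path):
    """Return {date: max temperature} for one saved station file."""
    with open(path) as f:
        payload = json.load(f)
    by_day = defaultdict(list)
    for row in payload.get('results', []):
        if row['datatype'] == 'TMAX':
            by_day[row['date'][:10]].append(row['value'])  # keep the date part only
    # One TMAX reading per day is typical, but max() is safe either way
    return {day: max(values) for day, values in by_day.items()}
```

The same pattern extracts TMIN or RH_AVG, which is exactly what the heat stress calculation in the next section will need.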

The Next Step: Connecting the Dots

Right now, these two modules are separate. But the power lies in connecting them. The next evolution of this project is to use the weather data as a dynamic input for the DMI calculator.

You can calculate the Temperature-Humidity Index (THI) from the NOAA data—a standard metric for measuring heat stress in cattle. As the THI rises above a certain threshold (around 72 for dairy cows), DMI begins to drop.

The next version of the DMI formula would look something like this:
adjusted_dmi = calculated_dmi * get_heat_stress_factor(THI)

This would allow the tool to make smarter, more realistic recommendations. For example, it could advise a rancher that during a predicted heatwave, their herd's intake will likely decrease by 10%, allowing them to adjust feed purchases and avoid waste.
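A minimal sketch of that adjustment, using the classic THI formula (temperature in degrees C, relative humidity in percent); note the 2%-per-THI-point slope in the stress factor is an illustrative assumption, not a published coefficient:

```python
def thi(temp_c: float, rel_humidity_pct: float) -> float:
    """Temperature-Humidity Index (classic NRC-style formula)."""
    tf = 1.8 * temp_c + 32  # convert to Fahrenheit
    return tf - (0.55 - 0.0055 * rel_humidity_pct) * (tf - 58)

def get_heat_stress_factor(thi_value: float, threshold: float = 72.0,
                           drop_per_point: float = 0.02) -> float:
    """Fraction of normal DMI retained above the heat stress threshold.
    The 2% per THI point slope is a placeholder to be calibrated."""
    if thi_value <= threshold:
        return 1.0
    return max(0.0, 1.0 - drop_per_point * (thi_value - threshold))

# A 35 degC, 60% RH afternoon in College Station:
index = thi(35, 60)          # well above the 72 threshold
adjusted_dmi = 55.0 * get_heat_stress_factor(index)
```

With the NOAA module feeding temperature and humidity into thi(), the Economic Engine would automatically scale its feed purchase recommendation down during a heatwave.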

What I Learned

This project is a starting point, but it demonstrates the immense potential for developers to build tools that bring data science and automation to traditional industries, creating real value and solving tangible problems.