Earnings Surprises: Do They Move Stock Prices?
- Nikhil Adithyan
- Jul 9
- 8 min read
A comprehensive analysis with Python and APIs

Earnings are a company’s profits, usually reported for a specific quarter. They are one of the most important pieces of information an investor can use to understand the present and future of a company. But this information does not come out of the blue.
Before the official announcement, investors can find the forecasted earnings on most major financial websites, such as Yahoo Finance. These forecasts are produced by analysts who examine past financial results, industry trends, and management guidance, and their individual predictions are then combined into a consensus estimate.
But what happens when analysts get the forecast wrong? These surprises matter because they often trigger immediate and significant price movements as investors react to the new information. The scope of this article is to examine how earnings surprises influence stock prices and to investigate whether they create trading opportunities.
What to expect from this article
In this article, with the help of FinancialModelingPrep’s API endpoints and Python, we will:
Check the price movements and compare them to the surprise percentage for 1, 5, 10, and 20 days after the announcement, over the last 5 years
Backtest a simple strategy using the above data
Let’s code!
First, let’s do the boring imports.
import requests
import pandas as pd
import requests_cache
import json
import datetime
import time
from tqdm import tqdm
import matplotlib.pyplot as plt
import numpy as np
# Setup cache to avoid repeated API calls
requests_cache.install_cache('cache')
token = '<YOUR FMP API KEY>'
Now, using FMP’s S&P 500 Constituents endpoint, we will get all the stocks and store them in a dataframe:
url = f'https://financialmodelingprep.com/api/v3/sp500_constituent'
querystring = {"apikey": token}
resp = requests.get(url, querystring).json()
# Convert to DataFrame
df_constituents = pd.DataFrame(resp)
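Before moving on, it’s worth a quick peek at the result. The only columns we rely on later are symbol and sector; the exact set of fields returned can vary, so treat this as a sanity check rather than a spec:
# Quick sanity check: confirm the columns we need later are present
print(df_constituents.shape)
print(df_constituents[['symbol', 'sector']].head())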
We will need the historical earnings and the EOD prices for each ticker listed above. Therefore, let’s define two functions to retrieve that data. First, for the earnings data, we’ll be using FMP’s Earnings Historical endpoint:
def get_historical_earnings(ticker):
    url = f'https://financialmodelingprep.com/api/v3/historical/earning_calendar/{ticker}'
    querystring = {"apikey": token}
    response = requests.get(url, querystring)
    data = response.json()
    df = pd.DataFrame(data)
    # Guard against tickers with no earnings data in the response
    if df.empty:
        return df
    # Convert date to datetime
    df['date'] = pd.to_datetime(df['date'])
    # Filter for dates since July 1, 2020
    start_date = datetime.datetime(2020, 7, 1)
    df = df[df['date'] >= start_date]
    df['symbol'] = ticker
    return df
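To make sure the function behaves as expected, we can call it for a single ticker. AAPL is used purely as an example here; the columns we care about later are date, symbol, eps, and epsEstimated:
# Example call for a single ticker (AAPL is just a placeholder)
sample_earnings = get_historical_earnings('AAPL')
print(sample_earnings[['date', 'symbol', 'eps', 'epsEstimated']].head())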
And then for the price data, we’ll use FMP’s Daily Chart EOD endpoint:
def get_historical_prices(ticker):
    url = f'https://financialmodelingprep.com/api/v3/historical-price-full/{ticker}'
    querystring = {"apikey": token, "from": "2020-07-01"}
    response = requests.get(url, querystring)
    data = response.json()
    # Guard against tickers with no price history in the response
    df = pd.DataFrame(data.get('historical', []))
    if df.empty:
        return df
    # Convert date to datetime
    df['date'] = pd.to_datetime(df['date'])
    # Add ticker column
    df['symbol'] = ticker
    return df
Now, we will need to save all the earnings in a single dataframe named df_earnings:
tickers = df_constituents['symbol'].tolist()
all_earnings = []
print("Fetching earnings data...")
for ticker in tqdm(tickers):
    earnings_df = get_historical_earnings(ticker)
    if not earnings_df.empty:
        all_earnings.append(earnings_df)
    # Add a small delay to avoid hitting API rate limits
    time.sleep(0.1)
# Combine all earnings data
df_earnings = pd.concat(all_earnings, ignore_index=True)
# Drop rows where eps is NaN
df_earnings = df_earnings.dropna(subset=['eps'])

As you can see, we end up with a large number of entries, well over 10,000. Now we should add the element of surprise to the dataframe as a percentage, and also include each symbol’s sector. Both will help us analyze the results later on.
df_earnings = df_earnings.merge(df_constituents[['symbol', 'sector']], on='symbol', how='left')
df_earnings['surprise'] = ((df_earnings['eps'] - df_earnings['epsEstimated']) / df_earnings['epsEstimated']) * 100
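As a quick sanity check of the surprise formula, here is a toy example with made-up numbers: an EPS of 2.10 against a consensus of 2.00 is a +5% surprise, while 1.90 against the same consensus is a -5% surprise.
# Toy example of the surprise formula (made-up numbers)
for eps, estimate in [(2.10, 2.00), (1.90, 2.00)]:
    surprise = ((eps - estimate) / estimate) * 100
    print(f"eps={eps}, estimate={estimate} -> surprise={surprise:+.1f}%")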
Now that we have the necessary information for the earnings, let’s gather the historical prices. We will store the open, close, and adjusted_close data in three separate dataframes.
A common mistake is to calculate the price difference between the previous day’s close and the next day’s close after an announcement, which isn’t realistic since earnings are usually announced when the market is closed.
For example, if a stock closed at 100, announced results after hours, then opened at 120 and closed at 130 the next day, you can’t buy at 100 on the surprise day unless you have a time machine or illegal insider info. Therefore, we should measure the price spike from 120 to 130, not from 100.
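To make that concrete, here is the toy example above in code: measuring from the prior close of 100 shows a 30% move, while the realistic open-to-close measurement captures only about 8.3%.
# Toy example: close before the announcement = 100, next open = 120, next close = 130
prev_close, next_open, next_close = 100, 120, 130
unrealistic = (next_close - prev_close) / prev_close * 100  # 30.0%, assumes buying before the news
realistic = (next_close - next_open) / next_open * 100      # ~8.3%, what could actually be captured
print(f"close-to-close: {unrealistic:.1f}%, open-to-close: {realistic:.1f}%")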
all_prices = []
for ticker in tqdm(tickers):
    prices_df = get_historical_prices(ticker)
    if not prices_df.empty:
        # Keep necessary columns including open and close for price change calculations
        prices_df = prices_df[['date', 'symbol', 'open', 'close', 'adjClose']]
        all_prices.append(prices_df)
# Combine all price data and pivot
df_prices_combined = pd.concat(all_prices, ignore_index=True)
# Create pivot tables for different price types
df_prices_pivot_adjClose = df_prices_combined.pivot(index='date', columns='symbol', values='adjClose')
df_prices_pivot_open = df_prices_combined.pivot(index='date', columns='symbol', values='open')
df_prices_pivot_close = df_prices_combined.pivot(index='date', columns='symbol', values='close')
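Each pivot table is indexed by date with one column per ticker, which is exactly the shape the price change function below relies on. A quick peek confirms this:
# Each pivot table: rows = trading dates, columns = tickers
print(df_prices_pivot_open.shape)
print(df_prices_pivot_open.iloc[:3, :5])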
With all the required data in hand, we’ll now compute the percentage changes as promised. We’ll define a function to calculate the spike and then add the price changes over 1, 5, 10, and 20 days to the earnings dataframe.
def calculate_price_change(ticker, from_date, days):
    # Convert from_date to pandas Timestamp if it's not already
    if not isinstance(from_date, pd.Timestamp):
        from_date = pd.Timestamp(from_date)
    # Skip tickers for which no price data was downloaded
    if ticker not in df_prices_pivot_open.columns:
        return None
    # Get all available dates for the ticker
    available_dates = df_prices_pivot_open.index[df_prices_pivot_open[ticker].notna()].tolist()
    # Find the next available date on or after from_date
    next_dates = [d for d in available_dates if d >= from_date]
    if not next_dates:
        return None  # No data available after from_date
    from_date_actual = next_dates[0]
    # Calculate the target to_date by adding calendar days to from_date;
    # this is an approximation, as we still need to snap to an actual trading date
    target_to_date = from_date + pd.Timedelta(days=days)
    # Find the previous available date on or before target_to_date
    prev_dates = [d for d in available_dates if d <= target_to_date]
    if not prev_dates:
        return None  # No data available before target_to_date
    to_date_actual = prev_dates[-1]
    # Get open price on from_date and close price on to_date
    open_price = df_prices_pivot_open.loc[from_date_actual, ticker]
    close_price = df_prices_pivot_close.loc[to_date_actual, ticker]
    # Calculate price change percentage
    price_change_pct = ((close_price - open_price) / open_price) * 100
    return price_change_pct
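A single call looks like this (the ticker and date are just placeholders; any symbol and date covered by the price data will work):
# Example: open-to-close change over roughly 5 calendar days after a given date
chg = calculate_price_change('AAPL', '2024-05-03', 5)
print(f"5-day change: {chg:.2f}%" if chg is not None else "No price data for that window")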
# Create new columns for price changes
df_earnings['price_change_1d'] = None
df_earnings['price_change_5d'] = None
df_earnings['price_change_10d'] = None
df_earnings['price_change_20d'] = None
# Calculate price changes for each earnings record
for idx, row in tqdm(df_earnings.iterrows(), total=len(df_earnings)):
    ticker = row['symbol']
    date = row['date']
    # Calculate price changes for different periods
    df_earnings.at[idx, 'price_change_1d'] = calculate_price_change(ticker, date, 1)
    df_earnings.at[idx, 'price_change_5d'] = calculate_price_change(ticker, date, 5)
    df_earnings.at[idx, 'price_change_10d'] = calculate_price_change(ticker, date, 10)
    df_earnings.at[idx, 'price_change_20d'] = calculate_price_change(ticker, date, 20)
Analyze the spikes
The first step is to check whether this idea makes sense at all. To do that, we’ll plot the one-day price change (the day after the announcement) against the surprise percentage and draw a simple linear best-fit line to identify any trend. We’ll implement this as a function for easy reuse.
Note that the function can optionally remove outliers, which makes the chart much easier to read.
def calculate_best_fit_angle(df, x_col, y_col, plot=False, title=None, filter_outliers=False):
    # Make a copy of the dataframe to avoid modifying the original
    df_work = df.copy()
    # Filter outliers if requested
    if filter_outliers:
        # Calculate Q1, Q3 and IQR for y column
        Q1 = df_work[y_col].quantile(0.25)
        Q3 = df_work[y_col].quantile(0.75)
        IQR = Q3 - Q1
        # Define bounds for outliers
        lower_bound = Q1 - 1.5 * IQR
        upper_bound = Q3 + 1.5 * IQR
        # Filter out outliers
        df_work = df_work[(df_work[y_col] >= lower_bound) &
                          (df_work[y_col] <= upper_bound)]
    # Drop rows with NaN values in either column
    df_work = df_work.dropna(subset=[x_col, y_col])
    # Ensure numeric data types and handle any non-numeric values
    df_work[x_col] = pd.to_numeric(df_work[x_col], errors='coerce')
    df_work[y_col] = pd.to_numeric(df_work[y_col], errors='coerce')
    # Drop any rows where conversion resulted in NaN
    df_work = df_work.dropna(subset=[x_col, y_col])
    # Now perform the polynomial fitting
    coefficients = np.polyfit(df_work[x_col].values, df_work[y_col].values, 1)
    polynomial = np.poly1d(coefficients)
    best_fit_line = polynomial(df_work[x_col])
    # Calculate angle in degrees
    angle = np.degrees(np.arctan(coefficients[0]))
    # Create plot if requested
    if plot:
        plt.figure(figsize=(10, 6))
        plt.scatter(df_work[x_col], df_work[y_col], alpha=0.5)
        plt.plot(df_work[x_col], best_fit_line, color='red', linestyle='--')
        plt.xlabel(f'{x_col}')
        plt.ylabel(f'{y_col}')
        # Set title
        if title is None:
            title = f'{y_col} vs {x_col}'
        plt.title(title)
        plt.text(0.7, 0.95, f'Angle: {angle:.1f}°', transform=plt.gca().transAxes,
                 bbox=dict(facecolor='white', alpha=0.8))
        plt.grid(True)
        plt.show()
    return angle
pr_ch = 'price_change_1d'
# Test the function with the existing dataframe
angle = calculate_best_fit_angle(
    df_earnings,
    x_col=pr_ch,
    y_col='surprise',
    plot=True,
    title=f'Price Change ({pr_ch}) vs Earnings Surprise',
    filter_outliers=True
)

That was expected. The upward-sloping best-fit line indicates that the bigger the positive surprise, the larger the price spike tends to be on the following day.
Now let’s review all the different days we calculated.
days_pc = ['price_change_1d', 'price_change_5d', 'price_change_10d', 'price_change_20d']
for day_pc in days_pc:
    angle = calculate_best_fit_angle(
        df_earnings,
        x_col=day_pc,
        y_col='surprise',
        plot=False,
        title=f'{day_pc} vs Earnings Surprise',
        filter_outliers=True
    )
    print(day_pc, angle)
Output:
price_change_1d 12.696605382862431
price_change_5d 10.078316450314189
price_change_10d 8.387462317162376
price_change_20d 5.938370177937775
This is also expected. The angle decreases as more days pass after the announcement, which indicates that the impact of an earnings surprise on the price is most pronounced on the first day and that the momentum fades over time.
Let’s also see if we can identify any patterns based on the sector.
days = [1, 5, 10, 20]
sectors = df_earnings['sector'].dropna().unique()
# Dictionary to store angles by sector and day
sector_angles = {}
# Calculate angles for each sector and day
for sector in sectors:
    sector_angles[sector] = {}
    sector_data = df_earnings[df_earnings['sector'] == sector]
    for day in days:
        pr_ch = f'price_change_{day}d'
        try:
            angle = calculate_best_fit_angle(
                sector_data,
                x_col=pr_ch,
                y_col='surprise',
                plot=False,  # Don't plot individual charts
                filter_outliers=True
            )
            sector_angles[sector][day] = angle
            print(f"Angle for {sector} - {day} day(s): {angle:.2f}°")
        except Exception as e:
            print(f"Error calculating angle for {sector} - {day} day(s): {str(e)}")
            sector_angles[sector][day] = None
# Convert the nested dictionary to a DataFrame for easier plotting
sector_angles_df = pd.DataFrame.from_dict({(sector, day): sector_angles[sector][day]
                                           for sector in sector_angles
                                           for day in sector_angles[sector]},
                                          orient='index')
sector_angles_df = sector_angles_df.reset_index()
sector_angles_df = pd.DataFrame({
    'sector': [x[0] for x in sector_angles_df['index']],
    'day': [x[1] for x in sector_angles_df['index']],
    'angle': sector_angles_df[0]
})
# Pivot the DataFrame for plotting
pivot_df = sector_angles_df.pivot(index='sector', columns='day', values='angle')
# Create a grouped bar chart
plt.figure(figsize=(14, 8))
bar_width = 0.2
index = np.arange(len(pivot_df.index))
for i, day in enumerate(days):
    plt.bar(index + i*bar_width, pivot_df[day], bar_width,
            label=f'{day}d', alpha=0.7)
# Add labels, title, and legend
plt.xlabel('Sector')
plt.ylabel('Best Fit Angle (degrees)')
plt.title('Best Fit Angles by Sector for Different Time Periods')
plt.xticks(index + bar_width * (len(days)-1)/2, pivot_df.index, rotation=45, ha='right')
plt.legend()
plt.grid(axis='y', linestyle='--', alpha=0.7)
plt.tight_layout()
plt.show()

Most sectors are sensitive to earnings announcements. Energy, Financial Services, Consumer Cyclical, and Basic Materials show the strongest relationship between surprises and the subsequent price moves. Interestingly, the only sector that doesn’t behave intuitively is Utilities, where the relationship points the other way.
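If you prefer numbers to bars, the same information can be read directly from the pivot table built above, sorted here by the 1-day angle (the column labels are the day counts):
# Sector angles sorted by the 1-day column
print(pivot_df.sort_values(by=1, ascending=False).round(2))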
Let’s Backtest!
Of course, all these results are interesting, but they were also expected. It’s always helpful to confirm your intuition with data around a specific event, but that is still far from a tradable strategy.
So, let’s define a simple strategy: we will invest, for the single day after the announcement, in stocks that reported a positive surprise of more than 10%.
For that reason, we will:
Create a dataframe named `df_backtest` where we store each day’s price change, calculated as the difference between that day’s open and close prices.
Add a column named `invested_assets` where we store, as a list, the tickers that had a more than 10% positive surprise on that day.
Calculate the average daily change of those tickers for that day in a column named `strategy_return`.
Based on those returns, calculate the equity curve of our strategy and plot it.
df_backtest = ((df_prices_pivot_close - df_prices_pivot_open) / df_prices_pivot_open)
df_backtest['invested_assets'] = [[] for _ in range(len(df_backtest))]
for idx, row in df_earnings.iterrows():
    if row['surprise'] > 10:
        date = row['date']
        symbol = row['symbol']
        if date in df_backtest.index:
            df_backtest.at[date, 'invested_assets'].append(symbol)
df_backtest['strategy_return'] = 0.0
for idx, row in df_backtest.iterrows():
    if len(row['invested_assets']) > 0:
        # Get the returns of all invested assets for this day
        asset_returns = df_backtest.loc[idx, row['invested_assets']]
        # Calculate average return and store in strategy_return
        df_backtest.at[idx, 'strategy_return'] = asset_returns.mean()
initial_capital = 1000
df_backtest['strategy_equity'] = initial_capital * (1 + df_backtest['strategy_return']).cumprod()
plt.figure(figsize=(12, 6))
plt.plot(df_backtest['strategy_equity'])
plt.title('Strategy Equity Curve')
plt.xlabel('Date')
plt.ylabel('Equity Value ($)')
plt.grid(True)
plt.show()

The result looks incredible! With this strategy, over a five-year period, we would achieve an 800% profit. However (see the quick check after this list):
It does not account for broker fees.
It has a maximum drawdown of more than 30%.
For the first two years, it stayed mostly in negative territory (the COVID era).
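The headline numbers above can be verified directly from the equity curve; here is a minimal sketch using the strategy_equity column computed earlier:
# Total return and maximum drawdown of the strategy equity curve
total_return = df_backtest['strategy_equity'].iloc[-1] / initial_capital - 1
running_max = df_backtest['strategy_equity'].cummax()
max_drawdown = ((df_backtest['strategy_equity'] - running_max) / running_max).min()
print(f"Total return: {total_return:.0%}, maximum drawdown: {max_drawdown:.0%}")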
Conclusions
Earnings surprises clearly move stock prices immediately after the announcement, and the effect fades in the days that follow.
Sectors like Energy, Financial Services, Consumer Cyclical, and Basic Materials are more sensitive to earnings surprises, while Utilities behave in the opposite way.
A simple backtested strategy, investing for one day in stocks with a more than 10% positive earnings surprise, shows extraordinary returns, albeit with a significant drawdown.
Even though you should be sceptical of such “too good to be true” results, it is safe to say the idea is worth investigating further to see whether it offers a real edge in your trading.
With that being said, you’ve reached the end of the article. Hope you learned something new and useful. Thank you very much for your time.