Alternative Data for Alpha Generation: Satellite Imagery, NLP, and Beyond

Marcus Chen
March 1, 2026

Alternative Data for Alpha Generation

Traditional financial data (prices, volumes, fundamentals) is widely available and largely priced in. Much of the remaining edge comes from alternative data: non-traditional datasets that can reveal information before it shows up in prices or filings.

The Alternative Data Landscape

Satellite Imagery

  • Retail parking lots — Count cars at Walmart to predict revenue before earnings
  • Oil storage tanks — Estimate crude inventory from shadow analysis
  • Crop health — NDVI satellite data predicts agricultural commodity prices
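
The crop-health item relies on NDVI (normalized difference vegetation index), which is computed per pixel as (NIR - Red) / (NIR + Red). Below is a minimal sketch in plain Python, assuming the near-infrared and red reflectance bands have already been extracted as equally shaped nested lists. The function name `ndvi` and the sample values are illustrative and do not come from any particular imagery provider.

```python
def ndvi(nir, red):
    """Per-pixel NDVI = (NIR - Red) / (NIR + Red) for two equally shaped grids."""
    grid = []
    for nir_row, red_row in zip(nir, red):
        row = []
        for n, r in zip(nir_row, red_row):
            total = n + r
            # Guard against division by zero on empty or masked pixels.
            row.append((n - r) / total if total != 0 else 0.0)
        grid.append(row)
    return grid


# Illustrative reflectance values (assumed, not real imagery).
nir_band = [[0.50, 0.60], [0.40, 0.00]]
red_band = [[0.10, 0.20], [0.40, 0.00]]

field_ndvi = ndvi(nir_band, red_band)
pixels = [v for row in field_ndvi for v in row]
mean_ndvi = sum(pixels) / len(pixels)
```

Values near 1 suggest dense, healthy vegetation. Values near 0 or below suggest bare soil, water, or stressed crops. A real pipeline would also need cloud masking and calibration, which this sketch omits.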

Natural Language Processing

  • Earnings call transcripts — Sentiment analysis predicts post-earnings drift
  • SEC filings — Detect changes in risk language between quarterly filings
  • Social media — Reddit/Twitter sentiment as a contrarian indicator
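
One simple way to flag changes in risk language between two filings is to compare their word-count vectors and list the terms that newly appear. The sketch below uses only the standard library. The helper names (`tokenize`, `cosine_similarity`, `new_terms`) and the two sample sentences are assumptions for illustration, not excerpts from real filings.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic words only."""
    return re.findall(r"[a-z']+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the word-count vectors of two texts."""
    a, b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def new_terms(previous, current):
    """Words present in the current text but absent from the previous one."""
    return sorted(set(tokenize(current)) - set(tokenize(previous)))


# Hypothetical risk-factor sentences from two consecutive quarters.
prev_q = "We face risks from competition and currency fluctuations."
curr_q = "We face risks from competition, currency fluctuations, and pending litigation."

similarity = cosine_similarity(prev_q, curr_q)
added = new_terms(prev_q, curr_q)
```

A drop in similarity, or the appearance of terms such as "litigation", could then be reviewed by hand or fed into a model as a feature.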

Web Data

  • Job postings — Hiring surges predict revenue growth 2-3 quarters ahead
  • App download rankings — Mobile app traction forecasts tech stock performance
  • Price tracking — Web-scraped product prices indicate inflation trends
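
A lead-lag claim such as the job-postings item can be checked by correlating the signal at quarter t with the target at quarter t + k for several values of k. The sketch below is a minimal standard-library version. The function names and the synthetic series are assumptions: the target is constructed to follow the signal by exactly two quarters, so it only demonstrates the mechanics.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def lagged_correlation(signal, target, lag):
    """Correlate signal[t] with target[t + lag] for a positive lag."""
    if lag <= 0 or lag >= len(signal) - 1:
        raise ValueError("lag must leave at least two overlapping points")
    return pearson(signal[:-lag], target[lag:])


def best_lag(signal, target, max_lag):
    """Return the lag in 1..max_lag with the highest correlation."""
    return max(range(1, max_lag + 1),
               key=lambda k: lagged_correlation(signal, target, k))


# Synthetic quarterly series (assumed): revenue growth follows postings by 2 quarters.
postings_growth = [1, 3, 2, 5, 4, 6, 3, 7, 5, 8]
revenue_growth = [0.0, 0.0] + [0.5 * v for v in postings_growth[:-2]]

lead = best_lag(postings_growth, revenue_growth, max_lag=4)
```

On real data the peak is rarely this clean, and the lag should be chosen on one sample and validated on another to avoid overfitting.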

Building an NLP Signal

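```python
from transformers import pipeline

# Financial sentiment model
sentiment = pipeline("sentiment-analysis", model="ProsusAI/finbert")

# FinBERT labels each chunk positive, negative, or neutral
SIGN = {"positive": 1.0, "negative": -1.0, "neutral": 0.0}

def score_earnings_call(transcript: str) -> float:
    """Score earnings call transcript sentiment in [-1, 1]."""
    # Split into 500-character chunks to stay under the model's 512-token limit
    chunks = [transcript[i:i+500] for i in range(0, len(transcript), 500)]
    if not chunks:
        return 0.0  # empty transcript: no sentiment to report
    scores = []
    for chunk in chunks:
        result = sentiment(chunk)[0]
        # Neutral chunks contribute 0 rather than being counted as negative
        scores.append(SIGN.get(result["label"], 0.0) * result["score"])
    return sum(scores) / len(scores)
```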
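
A raw sentiment score is usually more useful once it is compared across companies. One possible next step is to standardize the scores into cross-sectional z-scores. This is a sketch under assumptions: the function name `cross_sectional_zscores`, the tickers, and the score values are all hypothetical.

```python
import statistics


def cross_sectional_zscores(raw_scores):
    """Map {ticker: raw sentiment} to {ticker: z-score} across the universe."""
    values = list(raw_scores.values())
    if len(values) < 2:
        return {ticker: 0.0 for ticker in raw_scores}
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        # All companies scored the same, so there is no relative signal.
        return {ticker: 0.0 for ticker in raw_scores}
    return {ticker: (score - mean) / stdev for ticker, score in raw_scores.items()}


# Hypothetical tickers and raw sentiment scores.
raw = {"AAA": 0.30, "BBB": 0.10, "CCC": -0.10}
z = cross_sectional_zscores(raw)
```

Standardizing this way removes the overall tone of a given earnings season, so the signal reflects which companies sounded better or worse than their peers.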

The Challenge

Alternative data is expensive, noisy, and decays fast. A satellite imagery signal that worked in 2024 may be arbitraged away by 2026 as more funds adopt it.
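
Decay can be made concrete by tracking a signal's information coefficient (the correlation between the signal and subsequent returns) over time and fitting an exponential decay to it. The sketch below assumes the yearly coefficients are already computed and positive. The function name `decay_half_life` and the sample values are illustrative, not measured results.

```python
import math


def decay_half_life(ics):
    """Fit log(IC_t) = a - lam * t by least squares and return ln(2) / lam."""
    if len(ics) < 2 or any(ic <= 0 for ic in ics):
        raise ValueError("need at least two positive information coefficients")
    t = list(range(len(ics)))
    y = [math.log(ic) for ic in ics]
    mean_t, mean_y = sum(t) / len(t), sum(y) / len(y)
    slope = (sum((a - mean_t) * (b - mean_y) for a, b in zip(t, y))
             / sum((a - mean_t) ** 2 for a in t))
    if slope >= 0:
        return math.inf  # the signal is not decaying over this window
    return math.log(2) / -slope


# Illustrative yearly coefficients (assumed): the edge halves every year.
yearly_ic = [0.08, 0.04, 0.02]
half_life_years = decay_half_life(yearly_ic)
```

A short half-life relative to the cost of the dataset is one way to decide that a signal is no longer worth paying for.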

On AlphaNova

While our competitions use obfuscated data (preventing external data edges), the feature-engineering mindset from alternative data carries over directly. The skill of extracting signal from noise applies to any dataset.
