The Emotional Geometry of Playlists: From Thirteen Features to Three Scores

Matt / April 28, 2026

Analyzing playlists through 45 years of music psychology

The last post ended with thirteen audio features computed from a 30-second preview clip. Energy. Danceability. Valence. Acousticness. Tempo. Key. Mode. And six more.

That’s a lot. Too many for the Scoring Engine to reason about directly, in fact.

If the goal is to decide whether a submitted track fits a playlist, I can’t ask the Scoring Engine to reason across thirteen independent dimensions simultaneously. Two tracks can agree on nine features and diverge on four, and there’s no principled way to say whether that’s a good match or a bad one without knowing which features actually define that playlist’s character.

The features need to be collapsed into something smaller and more meaningful. The question is: collapsed into what?


The Search for a Framework

I didn’t want to invent a taxonomy of musical dimensions from scratch. That road leads to arbitrary choices, post-hoc rationalizations, and a scoring system that reflects my intuitions rather than any defensible structure.

So I went looking for existing frameworks. What I found was richer than I expected.

Russell’s Circumplex Model of Affect (1980) is the most widely cited framework in music psychology for how we emotionally respond to music. The core insight is deceptively simple: all emotional states – happy, anxious, content, bored, excited, melancholic – can be placed on a two-dimensional plane.

The horizontal axis is valence: pleasant versus unpleasant. Euphoric, joyful, content on the right; sad, tense, angry on the left.

The vertical axis is arousal: activated versus deactivated. Alert, excited, stimulated at the top; calm, sleepy, subdued at the bottom.

Every emotional experience occupies a position on this plane. And crucially – playlists do too.

A “dark workout” playlist lives in the upper-left quadrant: high arousal, low valence. Intense and driving, but not happy. A “happy chill” playlist lives in the lower-right: high valence, low arousal. Positive and bright, but unhurried. A “sad late night” sits bottom-left. A “party energy” playlist: top-right.
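
To make the geometry concrete, here’s a minimal sketch. The 0.5 midpoint is my illustrative threshold, not a canonical one, and the quadrant labels echo the descriptions used later in this post:

```python
def circumplex_quadrant(valence: float, arousal: float) -> str:
    """Place a track or playlist on Russell's plane.

    Both inputs are assumed normalized to 0-1; the 0.5 midpoint
    is an illustrative threshold, not part of Russell's model.
    """
    if arousal >= 0.5:
        return "euphoric/driving" if valence >= 0.5 else "dark/aggressive"
    return "warm/peaceful" if valence >= 0.5 else "melancholic/still"

print(circumplex_quadrant(valence=0.2, arousal=0.9))  # dark workout territory
```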

When a curator assembles a playlist, they’re building something that occupies a region of this plane – consciously or not. Tracks that fit belong in the same region. This was the architecture I needed.


Why Two Dimensions Aren’t Enough

Russell’s model gives us the emotional plane. But it doesn’t capture everything about how music is experienced, because it says nothing about rhythm.

Robert Thayer’s Two-Dimensional Model of Mood (1989) refines the picture. Thayer proposed two independent axes: energy (how activated you feel) and tension (how stressed versus calm you feel). What his model makes explicit is that physical, rhythmic drive is genuinely independent from emotional tone. You can feel energized by something joyful or by something menacing. The energy axis is real and distinct from valence.

But neither Thayer nor Russell addresses rhythmic structure specifically – whether a track has a strong, consistent beat that compels physical movement. That’s a different quality than general arousal or emotional character.

Lerdahl and Jackendoff’s Generative Theory of Tonal Music (1983) is where rhythm lives, academically. Their framework formalizes how listeners perceive hierarchical metric structure – the way strong beats, weak beats, and rhythmic groupings create the sensation of groove or drive. The computational music information retrieval field has since turned this theoretical work into practical algorithms; the beat tracker in librosa, for example, uses Ellis’s dynamic-programming method to estimate the tempo and beat positions that best explain a track’s onset pattern – a working descendant of the metric hierarchy Lerdahl and Jackendoff formalized.
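
Here’s what that looks like in practice with librosa – standard library usage on a hypothetical preview file, not my pipeline’s exact code:

```python
import librosa

# Load the 30-second preview clip (librosa resamples to 22,050 Hz by default).
y, sr = librosa.load("preview.mp3", duration=30.0)

# Dynamic-programming beat tracking (Ellis 2007): estimate a global tempo,
# then choose the beat frames that best balance onset strength against
# a consistent inter-beat interval.
tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
beat_times = librosa.frames_to_time(beat_frames, sr=sr)

print("Estimated tempo:", tempo, "BPM;", len(beat_times), "beats detected")
```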

What this framework told me is that rhythmic salience – the degree to which a beat clearly stands out from the surrounding musical activity – is a distinct perceptual dimension, not reducible to energy or mood. A slow, heavily syncopated track and a fast track with a weak, cluttered kick drum can score identically on arousal and valence. Their rhythmic character is completely different.

Three dimensions, then. Not two.


The Scoring Engine: Three Composite Scores

Each composite score maps onto one of these theoretical axes. All three are normalized to a 0–1 scale – which is what makes them directly comparable across tracks and playlists.

Energy Adjusted – the arousal axis. This measures the perceived physical intensity of a track: how dense, produced, and driven it sounds. The dominant input is the spectral analysis of the waveform – how bright, noisy, and active the frequency content is – with a secondary contribution from loudness. The two signals reinforce each other, but they’re not the same thing. Commercial mastering normalizes tracks to roughly similar peak levels. A track can be loud but sparse. A track can be sonically intense at a moderate volume. The spectral component is what separates them.

High Energy Adjusted: hard-hitting electronic production, distorted guitars, dense layered arrangements. Low: sparse acoustic recordings, quiet ambient textures, solo voice.
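
In spirit, the composite looks something like the sketch below. The 70/30 split and the input names are illustrative assumptions; the real weights live in the formula fields described later in this post.

```python
def energy_adjusted(spectral_energy: float, loudness_norm: float) -> float:
    """Perceived physical intensity on a 0-1 scale.

    spectral_energy: how bright, noisy, and active the frequency content is (0-1).
    loudness_norm:   loudness normalized to 0-1.
    The 70/30 weighting is an illustrative assumption, not the production value.
    """
    score = 0.7 * spectral_energy + 0.3 * loudness_norm
    return max(0.0, min(1.0, score))
```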

Mood Score – the valence axis, modulated by arousal. The primary input is the valence feature from the analysis layer – the one that reflects key, tempo, and spectral brightness. But mood and energy interact in a way pure valence can’t capture alone.

This is the Russell Circumplex in practice. Two tracks with identical valence scores but different energy levels land in different emotional quadrants. High valence, high energy: euphoric and driving. High valence, low energy: warm and peaceful. The score needs to reflect that difference. Adding an energy component to Mood Score is what separates the quadrants.

High Mood Score: upbeat pop, bright acoustic, feel-good summer records. Low: dark electronic, minor-key ballads, tense and brooding.
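
A sketch of that interaction, again with illustrative weights – the point is only that valence dominates and the energy term separates the quadrants:

```python
def mood_score(valence: float, energy: float) -> float:
    """Valence modulated by arousal, 0-1. Weights are illustrative.

    Valence dominates; the smaller energy term is what pushes two
    equally-positive tracks into different emotional quadrants.
    """
    score = 0.8 * valence + 0.2 * energy
    return max(0.0, min(1.0, score))

# Same valence, different quadrants:
print(mood_score(valence=0.8, energy=0.9))  # 0.82 - euphoric and driving
print(mood_score(valence=0.8, energy=0.2))  # 0.68 - warm and peaceful
```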

Rhythm Score – the rhythmic drive axis, grounded in the Lerdahl/Jackendoff beat hierarchy. The dominant input is danceability – specifically, beat salience: whether metrically strong beats stand out clearly above the rest of the signal. This is the quality of the rhythm. The secondary input is tempo, which provides the pace dimension. A powerful, locked-in beat at 80 BPM is a groove. The same beat strength at 140 BPM is a different experience entirely.

Beat salience leads because quality matters more than speed. A slow track with a compelling, consistent beat outscores a fast track with an indistinct, cluttered rhythm. Tempo refines the score; it doesn’t override the core measurement.

High Rhythm Score: club records, hip-hop with heavy drum programming, driving punk with a locked kick-snare pattern. Low: ambient, classical, acoustic ballads without a defined beat.
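
Sketched the same way, with the weights and the 60–180 BPM normalization range as my illustrative assumptions:

```python
def rhythm_score(beat_salience: float, tempo_bpm: float) -> float:
    """Rhythmic drive, 0-1: beat salience leads, tempo refines.

    beat_salience: how clearly strong beats stand out (0-1).
    tempo_bpm:     estimated tempo, mapped onto 0-1 across an
                   assumed 60-180 BPM range. All values illustrative.
    """
    tempo_norm = max(0.0, min(1.0, (tempo_bpm - 60.0) / 120.0))
    score = 0.75 * beat_salience + 0.25 * tempo_norm
    return max(0.0, min(1.0, score))

# A locked-in groove at 80 BPM outscores a cluttered rhythm at 140 BPM:
print(rhythm_score(beat_salience=0.9, tempo_bpm=80))   # ~0.72
print(rhythm_score(beat_salience=0.3, tempo_bpm=140))  # ~0.39
```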


What Makes Them Comparable

The design constraint I worked hardest to satisfy in building the Scoring Engine is that all three scores live on the same scale for both tracks and playlists – computed the same way on both sides.

The same composite formulas that run on a submitted track also run on every snapshot record in every playlist profile. When the playlist snapshot pipeline finishes processing a playlist, that playlist has an average Energy Adjusted, an average Mood Score, and an average Rhythm Score derived from all the tracks currently on it.

Comparing a track to a playlist is then a direct numerical comparison: three numbers against three numbers, all on the same 0–1 scale. The distance across those three dimensions is the similarity signal. A track with Energy Adjusted of 0.8 submitted to a playlist that averages 0.3 has an energy problem, regardless of how well it scores on the other two dimensions.
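
In code terms, that comparison is nothing more exotic than a distance in three dimensions – Euclidean here as an illustrative choice:

```python
import math

def score_distance(track: tuple[float, float, float],
                   playlist: tuple[float, float, float]) -> float:
    """Euclidean distance across the three composite scores.

    Euclidean is an illustrative choice; any monotone distance
    over the three 0-1 axes carries the same signal.
    """
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(track, playlist)))

# (Energy Adjusted, Mood Score, Rhythm Score)
track = (0.8, 0.6, 0.7)
playlist_avg = (0.3, 0.6, 0.7)
print(score_distance(track, playlist_avg))  # 0.5 - an energy problem
```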

In practice, all three scores are formula fields in Salesforce – computed automatically from the raw audio feature values whenever a record is saved. The same formulas exist on both the track record and the playlist snapshot record. That’s not an implementation detail; it’s the design constraint that makes the comparison valid. Same scale, same formula, both sides.

Whether a given distance is small enough to call a fit is what the Fit Score decides – and that’s the subject of a later post. The short version: four signals, four weights, one number between 0 and 1. The interesting part is which signals made the cut and which ones didn’t.


What This Looks Like in Practice

The first sanity check was to run the scores against music I know.

The high-energy punk and hard rock submissions cluster at the top of Energy Adjusted. The acoustic singer-songwriter recordings sit at the bottom. Melancholic electronic tracks score low on Mood Score; upbeat pop scores high. The club records and hip-hop with heavy drum programming dominate Rhythm Score; the ambient and classical submissions sit near zero.

The quadrant clustering holds too. High energy, low mood: dark and aggressive. High energy, high mood: euphoric and driving. Low energy, high mood: warm and acoustic. Low energy, low mood: melancholic and still. The theory maps onto the catalog.

There are limits. A minor-key track that sounds upbeat – a common pattern in pop and electronic – can score lower on Mood Score than a human listener would place it. Key is a strong predictor of perceived valence, but it’s not infallible. These aren’t bugs; they’re the known boundaries of signal-based analysis. The scores are a systematic, reproducible signal derived entirely from audio. They’re not a substitute for a human ear. They’re a foundation.


What’s Next

The three composite scores are live on every track in the catalog. The playlist snapshot pipeline is running daily, building profiles for roughly 3,500 playlists. As it fills in, each playlist develops a fingerprint: where it lives on the arousal axis, the valence axis, the rhythmic drive axis.

The next post covers that pipeline – how it works at scale across thousands of playlists with tracks changing week to week, how it handles playlists that disappear and come back, and the scheduling strategy that makes the full catalog crawlable without hitting Spotify’s rate limits. It’s less theoretical than this one and more about the engineering realities of keeping data current at a scale I honestly didn’t anticipate when I started.


Matt McGuire is an independent punk artist and Salesforce architect. He’s presenting “The Music Intelligence Engine: AI-Powered Promotion on Salesforce” at True North Dreamin’ in May 2026.
