Do LLMs have good music taste?
08-28
TL;DR: No.
I made frontier models rank Kanye West albums. I asked each model to compare his first ten solo albums pairwise, which comes to 45 questions (10 choose 2). Here is the prompt format I used for each album pair:
> Pick your favorite Kanye West Album between "X" and "Y". You have to pick one. Respond with just their name.
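
For anyone who wants to reproduce the setup, here is a minimal sketch of how the 45 prompts could be generated. The album list is my assumption of what "the first ten solo albums" covers (the list is not spelled out above), and `build_prompts` is a hypothetical helper name, not the script actually used.

```python
from itertools import combinations

# Assumed album list: the ten solo studio albums in release order.
# Adjust it if the experiment used a different set.
ALBUMS = [
    "The College Dropout",
    "Late Registration",
    "Graduation",
    "808s & Heartbreak",
    "My Beautiful Dark Twisted Fantasy",
    "Yeezus",
    "The Life of Pablo",
    "ye",
    "Jesus Is King",
    "Donda",
]

# Prompt template copied from the post, with X and Y turned into placeholders.
PROMPT = (
    'Pick your favorite Kanye West Album between "{x}" and "{y}". '
    "You have to pick one. Respond with just their name."
)


def build_prompts(albums):
    """Return one (album_x, album_y, prompt) tuple per unordered album pair."""
    return [(x, y, PROMPT.format(x=x, y=y)) for x, y in combinations(albums, 2)]


prompts = build_prompts(ALBUMS)
print(len(prompts))  # 45
```

Each pair appears once, in a single order, which matches the count of 45. Asking every pair in both orders would double the cost but could help check whether a model simply favors the first or second option.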
I turned these binary preferences into per-album scores by fitting a Bradley-Terry model. As for the reference ranking, I went with Rate Your Music, the canonical website for music ratings. Here's what the models think:

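The Bradley-Terry step can be sketched roughly as follows. This is not necessarily the fitting code behind the scores above: it uses the classic iterative (minorization-maximization) update, and the `smoothing` pseudo-count is my own assumption, added so that an album that wins or loses every comparison still gets a finite score. The function name `fit_bradley_terry` is hypothetical.

```python
from collections import defaultdict


def fit_bradley_terry(items, outcomes, iters=500, smoothing=0.1):
    """Fit Bradley-Terry strengths from (winner, loser) pairs.

    Returns a dict mapping each item to a positive strength; strengths sum to 1.
    """
    wins = defaultdict(float)  # wins[(i, j)] = number of times i beat j
    for winner, loser in outcomes:
        wins[(winner, loser)] += 1.0

    # Assumption: add a small pseudo-count to every ordered pair so that no
    # strength collapses to zero or diverges when an item is unbeaten.
    for i in items:
        for j in items:
            if i != j:
                wins[(i, j)] += smoothing

    strength = {i: 1.0 / len(items) for i in items}
    for _ in range(iters):
        updated = {}
        for i in items:
            total_wins = sum(wins[(i, j)] for j in items if j != i)
            denom = sum(
                (wins[(i, j)] + wins[(j, i)]) / (strength[i] + strength[j])
                for j in items
                if j != i
            )
            updated[i] = total_wins / denom
        norm = sum(updated.values())
        strength = {i: s / norm for i, s in updated.items()}
    return strength


# Toy example with three placeholder items: A beats B and C, B beats C.
toy = fit_bradley_terry(["A", "B", "C"], [("A", "B"), ("A", "C"), ("B", "C")])
ranking = sorted(toy, key=toy.get, reverse=True)
print(ranking)  # ['A', 'B', 'C']
```

Sorting the albums by fitted strength gives the per-model ranking that gets compared against the reference.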
I used the Kendall tau distance to measure how close each model's ranking was to the reference (a lower distance means a more similar ranking). Opus 4.1 comes out on top:

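For reference, the Kendall tau distance between two rankings is the number of item pairs that the two rankings put in opposite order. Here is a minimal sketch; the unnormalized count is an assumption on my part, since a normalized variant (dividing by the number of pairs) would work just as well, and `kendall_tau_distance` is a hypothetical helper name.

```python
from itertools import combinations


def kendall_tau_distance(ranking_a, ranking_b):
    """Count the item pairs that the two rankings order differently."""
    pos_a = {item: rank for rank, item in enumerate(ranking_a)}
    pos_b = {item: rank for rank, item in enumerate(ranking_b)}
    if pos_a.keys() != pos_b.keys():
        raise ValueError("both rankings must contain the same items")
    return sum(
        1
        for x, y in combinations(ranking_a, 2)
        # A negative product means x and y appear in opposite order.
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )


print(kendall_tau_distance(["A", "B", "C"], ["A", "B", "C"]))  # 0
print(kendall_tau_distance(["A", "B", "C"], ["A", "C", "B"]))  # 1
print(kendall_tau_distance(list(range(10)), list(range(9, -1, -1))))  # 45
```

With ten albums the distance ranges from 0 (identical rankings) to 45 (exactly reversed), so a model that ranked at random would land around the middle of that range.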
This was a fun experiment, though I realize this method generalizes to many more than 10 albums and far beyond music taste. If you want to give me some API credits or compute to run experiments on a grander scale, reach out via email!