Friday, August 23, 2013

Week 10: Making filters apply consistently

With the previous way of passing filter settings from DlgSelector to SelectorLibraryTableModel, a filter update and select() were executed for every change, whether user-initiated or programmatic. Since the default behavior is to load filter settings programmatically when the seed track changes, and each control triggered its own update, there were a number of redundant calls to select().

In order to fix this, I changed the API of SelectorLibraryTableModel: each filter now has a corresponding "SelectorLibraryTableModel::setXfilter" function that changes the appropriate variable but does not trigger the filter update. SelectorLibraryTableModel::applyFilters must then be called manually once all changes have been made. The "DlgSelector::filterByX" functions, which are meant to be connected to the appropriate signals from the UI, each call applyFilters immediately, whereas loadStoredFilterSettings calls it only once, after all the filters have been changed.
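
In outline, the new flow looks something like this (a minimal sketch; the member names and exact signatures here are illustrative, not the actual code):

    void SelectorLibraryTableModel::setKeyFilter(bool enabled) {
        // record the new setting only; no query is issued here
        m_bKeyFilter = enabled;
    }

    void SelectorLibraryTableModel::applyFilters() {
        updateFilterText(); // rebuild the WHERE clause from all stored settings
        select();           // one query, however many setters were called
    }

    // UI slots apply immediately in response to a single user action...
    void DlgSelector::filterByKey(bool enabled) {
        m_pSelectorTableModel->setKeyFilter(enabled);
        m_pSelectorTableModel->applyFilters();
    }

    // ...while programmatic loading batches every change into one select().
    void DlgSelector::loadStoredFilterSettings() {
        m_pSelectorTableModel->setKeyFilter(m_bFilterKey);
        m_pSelectorTableModel->setBpmFilter(m_bFilterBpm);
        m_pSelectorTableModel->applyFilters();
    }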

This change fixes the issue where filters were not applied after the first track was played unless they were manually toggled, and it cuts down on unnecessary database queries.

There is still an issue wherein, once a track has played, it always appears in the filtered set whether or not it matches the filters. Since I drop the temporary table (that stores the preview and score columns) whenever a new seed track is set, I'm not sure how or why the old track is being kept; this requires more investigation.

Friday, August 9, 2013

Week 8: Isolating similarity calculations

This week, I placed the similarity calculations in a separate class so that they can more easily be called from outside the SelectorLibraryTableModel. Initially, this change seemed to cause an unusually large performance hit, but I eventually realized that, due to merging with master, the tracks were being dirtied in TrackDAO::getTrackFromDb and thus each one was being re-saved upon access. Resolving this brought the average time for calculating similarity against 1458 tracks back down from 2800 ms to about 1100 ms.

Further improvement should still be possible; in my search for places to optimize I tried callgrind, but it was not particularly revealing (the biggest bottleneck seemed to be the database queries). I will try the Google profiling tools to see if I can uncover anything else.

Right now, adding a new similarity function requires changes in the following places (the last step is sketched after the list):

  • dlgprefselector.ui: add a new slider
  • dlgprefselector.cpp: hook up the slider signals/slots, and add a function to show the description
  • library/selector/selector_preferences.h: add a preference key for said slider
  • library/selector/selectorsimilarity.cpp: implement the actual comparison in the foreach loop of calculateSimilarities
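
An illustrative sketch of that last step (m_similarityContributions, compareEnergy, and the surrounding variable names are hypothetical, not existing code):

    // inside SelectorSimilarity::calculateSimilarities
    foreach (TrackPointer pTrack, tracks) {
        double score = 0.0;
        // ... existing timbre, beat spectrum, and tag comparisons ...
        double weight = m_similarityContributions.value("energy", 0.0);
        if (weight > 0.0) {
            // hypothetical new comparison returning a value in [0, 1]
            score += weight * compareEnergy(m_pSeedTrack, pTrack);
        }
        scores.append(qMakePair(pTrack->getId(), score));
    }
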
I'll continue looking for ways to speed up the similarity calculation so that it is more responsive, and see if it might be worthwhile to further abstract out comparison functions from SelectorSimilarity to facilitate the addition of new ones.

I also plan to add a function to SelectorSimilarity for fetching the top (or top N) matching tracks according to the current preferences; this could then be fed into the Auto-DJ queue or a MIDI mapping with ease. This requires essentially the same filters as are used in SelectorLibraryTableModel, so now is a good time to think about cleaning up the generation of queries for key/BPM filter matching so that the same functions can be used in both classes.
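
One possible shape for that function (a sketch only; the name, the signature, the candidate-pool argument, and the form of calculateSimilarities are all assumptions):

    #include <QList>
    #include <QPair>
    #include <algorithm>

    namespace {
    bool scoreGreaterThan(const QPair<int, double>& a,
                          const QPair<int, double>& b) {
        return a.second > b.second;
    }
    }

    QList<QPair<int, double> > SelectorSimilarity::getTopFollowUps(
            TrackPointer pSeedTrack, const QList<int>& candidateIds, int n) {
        // score the (already key/BPM-filtered) candidate pool against the seed
        QList<QPair<int, double> > scores =
                calculateSimilarities(pSeedTrack, candidateIds);
        // sort descending by score and keep the n best track ids
        std::sort(scores.begin(), scores.end(), scoreGreaterThan);
        return scores.mid(0, n);
    }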

Friday, August 2, 2013

Week 7: Midterm overview

This post is an extended version of my e-mail to mixxx-devel.

As part of the Google Summer of Code, I have been working on a branch to facilitate the automated selection of follow-up tracks. Since the project has reached the halfway point, I thought it would be good to get some feedback on the direction I've taken and to find out how this feature could be most useful.

The track suggestions are available through a "Selector" item on the library sidebar. This view is based on Keith's earlier work, allowing one to filter by matching key, genre, or BPM; it also adds another layer, predicated on the calculation of several similarity functions. At present, these functions compare the timbre, beat spectra, and Last.fm/MusicBrainz tags of the tracks, assigning a score from 0 (no match) to 1 (exact match); these are more expensive than the key/BPM match, and so are only calculated on demand.

One can specify the "seed track" (the basis for comparison) in three ways: dragging a track to the Selector icon, right-clicking on a track and selecting the "Get follow-up track" context menu item, or just playing a track. The filters (checkboxes across the top) are all-or-nothing, allowing you to quickly pare down the library to a set of possible matches in key/BPM; then, the similarity functions (accessed by clicking "Calculate similarity") estimate the degree of closeness of those potential follow-ups. Eventually I intend to have the similarity scores calculated automatically, perhaps when the pool of potential follow-ups is reduced below a critical value such as 100 tracks.

The "Selector" preference pane allows you to adjust the weight of the different similarity functions. I have found that the Last.fm tags are not very complete/helpful, so have generally used 50% timbre and 50% beat spectrum as the weights.

The similarity calculations currently achieve 81.7% accuracy on a "genre classification" task, compared with 82.8% in the automated playlist generator Mirage. However, I'm not sure that this test is a good assessment for how appropriate the suggested follow-ups are. Thus I would welcome any subjective assessments of how reasonable you find the top-scoring tracks to be, and also if adjusting the weights helps to improve matters.

High priorities for me are to establish an API for new similarity functions (basically, they must provide a scoreTrack function that takes two TrackPointers as input and gives a score from 0 to 1) and to abstract the scoring functions out of library/selector/SelectorLibraryTableModel.cpp so that they can be used to add to the Auto-DJ queue or elsewhere in Mixxx.
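
The contract could look something like this (a sketch, not a settled API):

    // Each similarity function scores a candidate against the seed track.
    class SimilarityFunction {
      public:
        virtual ~SimilarityFunction() {}
        // returns 0.0 for no match up to 1.0 for an exact match
        virtual double scoreTrack(TrackPointer pSeed, TrackPointer pOther) = 0;
    };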

The branch is up at <https://github.com/chrisjr/mixxx/tree/features_key_selector>. Note, the database schema will be updated if you run it, so please don't use this on your primary library without a backup!

Comments on any aspect of this effort, including the UI, types of similarity function included, and the code itself, are all welcome.

Friday, July 26, 2013

Week 6: Rethinking the Last.fm tags approach

Measuring track similarity with the Jaccard index has proven problematic, as some tracks have far more tags than others. In such cases the similarity is close to zero even if one track's tags are a proper subset of the other's. This clearly will not work for comparison.
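
To make the failure concrete, the Jaccard index of two tag sets A and B is

\[ J(A,B)=\frac{\left|A\cap B\right|}{\left|A\cup B\right|} \]

so if track A has 3 tags, all of which appear among track B's 50 tags, then $J = 3/50 = 0.06$, even though A's tags are entirely contained in B's.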

Thus, a new approach is required. Exaile does this by querying artists similar to the current one, then selecting tracks at random from these other artists. This is a simple but promising approach, and one that can be adapted for Mixxx.

For now, I've changed the Last.fm fetcher to obtain similar artists as "tags," so songs can be compared that way. With my tiny sample library, it seems not to work terribly well, as none of the similar artists are also found in my library, but perhaps this is just a fluke. I'll try to incorporate more of my library to see whether it can be made to work.

As for testing the timbral similarity quantitatively, the test using the ISMIR '04 data is described on p. 55 of Dominik Schnitzer's thesis about Mirage. As he states:
"1. All pieces in a music collection are assigned an adequate genre label [ed note: these are given as a CSV file in the ISMIR data].
2. The files in the collection are analyzed and the full similarity matrix is computed for all pieces in the collection.
3. Then a genre classification confusion matrix is computed so that each song is assigned the genre of its most similar song.
4. In the confusion matrix the predicted genre of a song is plotted against its actual membership.
5. The confusion matrix diagonal shows the classification accuracies for each genre, which is the number of correctly classified songs divided by the total number of songs in the class."

I spent some time looking at the Google Test framework, but I'm not sure this process can be completely automated with it. At the very least, I can write up clear instructions or a shell script for running the test, and automate at least part of it in the TimbreUtils class.

Next week is the midterm, so I'll want to hammer out the rest of this artist similarity function and write out instructions for the selector UI so that others can give it a spin.

Friday, July 19, 2013

Week 5: Last.fm integration

This week, I worked on implementing the Last.fm tag fetcher, to add another data source for determining song similarity. Since the official liblastfm is covered under GPL v3 and Mixxx uses GPL v2, I switched from the liblastfm library to calling the REST API with Qt's built-in networking classes.
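
A request ends up looking roughly like this (a sketch only; the class, the slot, and API_KEY are placeholders, and the query-building follows Qt 4's QUrl API):

    #include <QNetworkAccessManager>
    #include <QNetworkRequest>
    #include <QNetworkReply>
    #include <QUrl>

    void LastFmFetcher::fetchTags(const QString& artist, const QString& title) {
        QUrl url("http://ws.audioscrobbler.com/2.0/");
        url.addQueryItem("method", "track.gettoptags");
        url.addQueryItem("artist", artist);
        url.addQueryItem("track", title);
        url.addQueryItem("api_key", API_KEY); // placeholder for a real key
        // the reply arrives asynchronously; parse the tags in the slot
        QNetworkReply* reply = m_network.get(QNetworkRequest(url));
        connect(reply, SIGNAL(finished()), this, SLOT(slotTagsFetched()));
    }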

In order to fetch tags, you can currently use the "Fetch tags..." item in the context menu -- it will later be something that happens during the analysis, if it's enabled in the Selector preferences (not yet implemented). In addition, I added a menu item to "Get follow-up tracks" which will set the currently selected track as a seed for the selector and switch over to Selector view.

The fetcher obtains results from Last.fm without trouble, but I want to make sure that if any tracks have no tags or could not be found (as was the case for several tracks in my test), we store that information too, so that Last.fm will not be repeatedly queried for tracks that were checked recently.

Max notes that upon opening Mixxx with a newly created database, the Selector view is empty. I can confirm this is happening, but I'm still not sure why; this is something important to look into further.

As for my use of the Hellinger distance for timbral similarity, Max has suggested that I implement a test to compare how well different distance functions perform. For this purpose, we can use the ISMIR '04 dataset, which is freely available online, to assess how well each metric groups similar tracks together.

Next week I'll work on putting such a test in place, as well as implementing the tag similarity functionality and isolating a "get best follow-up" function so it can be called from, e.g., the Auto-DJ.

Friday, July 12, 2013

Week 4: Calculus woes (and eventual solution)

I was less productive this week than I had hoped, in large part because of the math difficulties detailed below. Therefore I plan to put in some work this weekend to make sure things remain on track.

The Jensen-Shannon divergence between probability distributions P and Q is defined as:
\[ D_{\mathrm{JS}}(P\|Q)=\frac{1}{2}\left(D_{\mathrm{KL}}(P\|M)+D_{\mathrm{KL}}(Q\|M)\right) \]

where M is the mixture distribution that gives half the weight to P and half to Q, i.e. $M = \frac{1}{2}(P+Q)$. For Gaussians, this mixture is not itself Gaussian and is quite different from the sum of the two distributions; see this StackExchange answer for more detail.

To find an analytic solution for $D_{\mathrm{KL}}(P\|M)$, one ends up with an integral that looks like:

\[ \int_{-\infty}^{\infty}\ln\left(\frac{2p(x)}{p(x)+q(x)}\right)p(x)\,{\rm d}x \]

where $p(x)$ and $q(x)$ are the probability density functions of P and Q, and $m(x) = \frac{1}{2}\left(p(x)+q(x)\right)$.

This ultimately requires integrating the log of the sum of the two density functions, which doesn't seem to have a closed-form solution, even though the covariance matrices are diagonal.

The J-S divergence can also be expressed as a difference of entropies, i.e. $\mathrm{JS}(P,Q)=H\left(m(x)\right)-\frac{H(p(x))+H(q(x))}{2}$. There are ways to approximate the entropy of a Gaussian mixture without resorting to numerical methods -- see On Entropy Approximation for Gaussian Mixture Random Vectors for an approach using the Taylor-series expansion.

The zeroth-order expansion gives the estimate

\[ H(m(x))=-\frac{1}{2}(\log m(\mu_{P})+\log m(\mu_{Q})) \]

which might work well enough, but still ends up rather elaborate when expanded.

To my great fortune, however, there are other metrics bounded on [0,1] for which the closed-form solution for two multivariate Gaussians has already been worked out, freeing me from attempting any more calculus! In particular, I went with the Hellinger distance, which has the following form:

\[ H^{2}(P,Q)=1-\frac{\left|\Sigma_{P}\right|^{1/4}\left|\Sigma_{Q}\right|^{1/4}}{\left|\frac{1}{2}\Sigma_{P}+\frac{1}{2}\Sigma_{Q}\right|^{1/2}}\exp\left\{ -\frac{1}{8}(\mu_{P}-\mu_{Q})^{T}\left(\frac{1}{2}\Sigma_{P}+\frac{1}{2}\Sigma_{Q}\right)^{-1}(\mu_{P}-\mu_{Q})\right\} \]

The article "Comprehensive Survey on Distance/Similarity Measures between Probability Density Functions" (PDF) notes that the Hellinger distance is relatively "close" to the J-S divergence as a metric, so it may well be good enough.
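
Since the timbre models use diagonal covariance matrices, the determinants and the matrix inverse all reduce to per-dimension operations. A minimal sketch of the computation under that assumption (not the Mixxx implementation; plain std::vector inputs assumed):

    #include <cmath>
    #include <vector>

    // Squared Hellinger distance between two Gaussians with diagonal
    // covariances, given per-dimension means and variances.
    double hellingerSquared(const std::vector<double>& meanP,
                            const std::vector<double>& varP,
                            const std::vector<double>& meanQ,
                            const std::vector<double>& varQ) {
        double logDet = 0.0;   // log of |Sp|^(1/4) |Sq|^(1/4) / |(Sp+Sq)/2|^(1/2)
        double exponent = 0.0; // -(1/8) (mP-mQ)^T ((Sp+Sq)/2)^{-1} (mP-mQ)
        for (size_t i = 0; i < meanP.size(); ++i) {
            double varBar = 0.5 * (varP[i] + varQ[i]);
            logDet += 0.25 * std::log(varP[i]) + 0.25 * std::log(varQ[i])
                    - 0.5 * std::log(varBar);
            double d = meanP[i] - meanQ[i];
            exponent -= d * d / (8.0 * varBar);
        }
        return 1.0 - std::exp(logDet + exponent); // lies in [0, 1]
    }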

I also added the cosine similarity for the beat spectra, so the similarity scores are starting to look more reasonable.
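
For reference, the score used here is the standard cosine similarity of the two beat-spectrum vectors $a$ and $b$:

\[ \mathrm{sim}(a,b)=\frac{a\cdot b}{\left\Vert a\right\Vert \left\Vert b\right\Vert } \]

which stays in $[0,1]$ for non-negative spectra and so combines cleanly with the other scores.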

Upon looking at the BBC Vamp plugin more carefully, I realize that its notion of "energy" really just reflects the level of the sound output, not the intuitive sense of energy that would be useful in putting together a set. Thus I think it's best to put that idea aside for now and focus on the genre comparison.

This weekend I'll polish off these distance calculations, and also investigate an occasional crash that happens when the distance functions try to access the timbre model.

For next week, I aim to get the UI into a better state (especially the currently non-functional sliders) and to lay the groundwork for the Last.fm tags fetcher (in particular, creating the database tables where it will store its information, and the SimilarityDAO class to access that information).

Friday, July 5, 2013

Week 3: Adding timbral similarity

This week, I incorporated the timbral modeling plugin from Queen Mary and created some additional classes to deal with timbral similarity calculations.

I had to modify the Queen Mary SimilarityPlugin to output a vector<float> containing the Gaussian timbral model and beat spectrum. The vector's first three elements give the length of the mean, variance, and beat spectrum vectors respectively, followed by the concatenation of those three vectors.
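
Unpacking that layout looks roughly like this (a sketch following the stated layout; the struct and function names are illustrative):

    #include <vector>

    struct TimbreModel {
        std::vector<float> mean;
        std::vector<float> variance;
        std::vector<float> beatSpectrum;
    };

    TimbreModel unpack(const std::vector<float>& data) {
        // the first three elements give the lengths of the three vectors
        size_t nMean = static_cast<size_t>(data[0]);
        size_t nVar  = static_cast<size_t>(data[1]);
        size_t nBeat = static_cast<size_t>(data[2]);
        TimbreModel m;
        size_t offset = 3;
        m.mean.assign(data.begin() + offset, data.begin() + offset + nMean);
        offset += nMean;
        m.variance.assign(data.begin() + offset, data.begin() + offset + nVar);
        offset += nVar;
        m.beatSpectrum.assign(data.begin() + offset,
                              data.begin() + offset + nBeat);
        return m;
    }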

It turns out that the symmetrised KL divergence doesn't have a finite upper bound, so it's not suitable on its own for combining with other metrics. However, the related Jensen-Shannon divergence is bounded between 0 and 1 (using the base-2 logarithm), and might be a better fit if it can be efficiently calculated.

As an aside, it seems there is some opportunity for refactoring among the Analyser classes, since AnalyserBeats, AnalyserKey, and AnalyserTimbre all have similar tasks related to serializing protobuf messages to the database and retrieving them. Today I ran into a problem where the TimbreModel was not actually being serialized as I had thought, because I bound the data to the query in TrackDAO::bindTrackToLibraryInsert but not in updateTrack or addTracksPrepare (each of which requires almost identical code). If this binding were factored out into a separate function, it could easily be applied to the queries in all three places.
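
Such a helper might look something like this (purely illustrative; the name and signature are assumptions):

    #include <string>
    #include <QByteArray>
    #include <QSqlQuery>
    #include <QString>

    // Bind one serialized protobuf message to a prepared query as a BLOB,
    // so bindTrackToLibraryInsert, updateTrack, and addTracksPrepare can
    // all share the same code path.
    void bindSerializedModel(QSqlQuery& query, const QString& placeholder,
                             const std::string& serializedProto) {
        query.bindValue(placeholder,
                        QByteArray(serializedProto.data(),
                                   static_cast<int>(serializedProto.size())));
    }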

For next week, I'll investigate the possibilities for such a refactoring, populate the new preference panes with options for ranking the different similarity functions, implement the J-S divergence, and add in the track energy measure as another source of data.