Using Evolving Agents to Critique Subjective Music Compositions
Chuen-Tsai Sun, Ji-Lung Hsieh
Department of Computer Science
National Chiao Tung University
1001 TaHsueh Road, Hsinchu, Taiwan, China
{ctsun, gis91572}@cis.nctu.edu.tw
Chung-Yuan Huang
Department of Computer Science and Information Engineering
Chang Gung University
259 Wen Hwa 1st Road, Taoyuan, Taiwan, China
Abstract
The authors describe a recommender model that uses intermediate agents to evaluate a large body of subjective data according to a set of rules and make recommendations to users. After users score the recommended items, agents adapt their own selection rules via interactive evolutionary computing to fit user tastes, even when user preferences change rapidly. The model can be applied to such tasks as critiquing large numbers of music or written compositions. In this paper we use musical selections to illustrate how agents make recommendations and report the results of several experiments designed to test the model's ability to adapt to rapidly changing conditions yet still make appropriate decisions and recommendations.
1. Introduction
Since the birth of the Netscape web browser in 1994, millions of Internet surfers have spent countless hours searching for current news, research data, and entertainment, especially music. Users of Apple's Music Store can choose from 2,000,000 songs for downloading. Having to deal with so many choices can feel like a daunting task to Internet users, who could benefit from efficient recommender systems that filter out low-interest items [1-3].
Some of the most popular Internet services present statistical data to point users to items that they might be interested in. News websites place stories that attract the broadest interest on their main pages, and commercial product stores such as amazon.com use billboards to list current book sales figures and to make recommendations that match collected data on user behaviors. However, these statistical methods are less useful for making music, image, or other artistic product recommendations to users whose subjective preferences can cross many genres. Music selections are often made based on mood or time of day [4, 5].
Two classical approaches to personalized recommender systems are content-based filtering and collaborative filtering. Content-based filtering methods analyze item content and recommend items similar to those the user showed interest in previously [1, 6], while collaborative filtering methods let groups of users with common interests share the information they have accessed [7-9]. Common design challenges of previous approaches include:
1. When recommended items are far from the user's preferences, the user can still only access or select the system-recommended items and cannot reach potentially good items that never appear in the recommended set. This problem can possibly be solved with an appropriate feedback mechanism [7].
2. In a collaborative filtering approach, new items may not be selected due to sparse rating histories.
3. User preferences may change over time or according to the moment, situation, or mood [4, 5].
4. Because the body of subjective compositions is large, the time required to form suitable recommendations needs to be reduced [4, 5].
In light of these challenges, we have created a music recommender system model designed to reduce agent training time through user feedback. Model design consists of three steps: a) content-based filtering methods are used to extract item features, b) a group of agents makes item recommendations, and c) an evolution mechanism is used to make adjustments according to the subjective emotions and changing tastes of users.
2. Related Research
2.1. Recommender Systems
The two major components of recommender systems are items and users. Many current systems use algorithms to make recommendations regarding music [3, 9, 10], images, books [11], movies [12, 13], news, and homepages [7, 14, 15]. Depending on the system, the algorithm uses a pre-defined profile or user rating history to make its choices. Most user-based recommender systems focus on grouping users with similar interests [7-9], although some do try to match the preferences of single users according to their rating histories [1, 6].
Recommender systems use multiple mapping techniques to connect the item and user layers, which requires accurate and appropriate pre-processing and presentation of items for comparison and matching. Item representations can consist of keyword-based profiles provided by content providers or formatted feature descriptions extracted by information retrieval techniques. Accordingly, item feature descriptions in recommender systems can be keyword- or content-based. Features for items such as movies or books are hard to extract because movies are composed of various kinds of media [6] and content analysis of books runs into the problems of natural language processing. Their keyword-based profiles are often determined by content providers. However, current image and audio processing techniques now allow for programmed extraction of content-based features represented by factors that include tempo and pitch distribution for music and chroma and luminance distribution for images.
Previous recommender systems can be classified in terms of content-based filtering versus collaborative filtering. Standard content-based filtering focuses on classifying and comparing item content without sharing recommendations with others identified as having the same preferences. Collaborative filtering focuses on how users are clustered into groups according to their preferences. To avoid drawbacks associated with keyword-based searching (commonly used for online movie or book store databases), other designers emphasize content-based filtering focusing on such features as energy level, volume, tempo, rhythm, chords, and average pitch differences. Many music recommender system designers acknowledge drawbacks in standard collaborative filtering approaches; for instance, they cannot recommend two similar items if one of them is unrated. To address the shortcomings of both approaches, some systems use content features for user classification and other systems identify groups of users with similar tastes [7, 16].
To address challenges tied to human emotion or mood and to solve the sparsity problem of collaborative filtering, some music and image retrieval system designers use interactive evolutionary computing (IEC) to evaluate item fitness according to user parameters [4, 5]. We adopted IEC for our proposed model, which uses agent evolutionary training for item recommendations. The results of our system tests indicate that trained agents are capable of choosing songs that match both user taste and emotion.
2.2. Interactive Evolutionary Computing
A genetic algorithm (GA) is an artificial intelligence technique for searching for solutions to optimization problems. According to GA construction rules, the structure of an individual's chromosome is designed according to the specific problem, and genes are randomly generated once the system is initialized. The subsequent GA procedure includes 1) using a fitness function to evaluate the performance of various problem solutions, 2) selecting multiple individuals from the current population, 3) modifying the selected individuals with mutation and crossover operators, and 4) deciding which individuals should be preserved or discarded for the next run (discarded solutions are replaced by new ones whose genes are preserved). A GA repeats this evolutionary procedure until a sufficiently good solution emerges. The challenge in music recommendation is defining a fitness function that accurately represents subjective human judgment. Only then can such a system be used to make judgments in art, engineering, and education [4, 5].
Interactive Evolutionary Computing (IEC), an optimization method, meets the need to define a fitness function by involving human preferences. IEC is a GA technique in which chromosome fitness is measured by a human user [18]. The main factors affecting IEC evaluation are human emotion and fatigue. Since users cannot make perfectly consistent judgments when evaluating each run, results will change on different occasions according to the user's emotional state at any particular moment. Furthermore, since users may fail to adequately process large populations due to fatigue, searching for goals with smaller population sizes within fewer generations is an important factor. Finally, the potential for fluctuating human evaluations can result in inconsistencies across different generations [19].
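The four GA steps and the IEC idea of a human-supplied fitness value can be sketched together as follows. This is a minimal illustration, not the implementation used in the paper: the function name `iec_generation`, the keep-the-better-half selection rule, and the Gaussian mutation are all assumptions, and the `rate` callback stands in for a human user entering a score.

```python
import random
from typing import Callable, List

Chromosome = List[float]


def iec_generation(
    population: List[Chromosome],
    rate: Callable[[Chromosome], float],
    rng: random.Random,
    mutation_scale: float = 1.0,
) -> List[Chromosome]:
    """Run one GA generation in which `rate` plays the role of the human user."""
    # 1) Fitness: every individual is scored exactly once by the user.
    scores = [rate(individual) for individual in population]
    ranked = [
        individual
        for _, individual in sorted(
            zip(scores, population), key=lambda pair: pair[0], reverse=True
        )
    ]
    # 2) Selection (assumed rule): the better half become parents and are preserved.
    parents = ranked[: max(2, len(ranked) // 2)]
    # 3) Variation: one-point crossover, then Gaussian mutation of one gene.
    children: List[Chromosome] = []
    while len(parents) + len(children) < len(population):
        a, b = rng.sample(parents, 2)
        cut = rng.randrange(1, len(a))
        child = a[:cut] + b[cut:]
        i = rng.randrange(len(child))
        child[i] += rng.gauss(0.0, mutation_scale)
        children.append(child)
    # 4) Replacement: discarded individuals give way to the new children.
    return parents + children


# Simulated user (assumption): prefers a first gene close to 80.
def simulated_user(chromosome: Chromosome) -> float:
    return -abs(chromosome[0] - 80.0)
```

Because each call to `rate` costs the user real effort, the sketch scores each individual only once per generation and works with a small population, in line with the fatigue concern described above.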
3. Using Evolutionary Agents for a Music Recommender System
3.1. Model Description
In our model, intermediate agents select music compositions according to their chromosomes and recommend them to the user. The system's six function blocks (track selector, feature extractor, recommendation agent module, evolution manager, user interface, and database) are shown in Figure 1.
Figure 1. Six model components including track selector, feature extractor, database, recommendation agent module, evolution manager, and user interface
A representation component consists of the track selector, feature extractor, and database function blocks, all of which are responsible for forming item feature profiles. This component translates the conceptual properties of music items into useful information with specific values and stores it in a database for later use. In other words, this is a pre-processing component. Previous recommender systems established direct connections between user tastes and item features. In contrast, we use trainable agents to automatically make this connection based on a detailed item analysis. The track selector is responsible for translating each music composition into a textual file, while the feature extractor is responsible for calculating several statistical feature measurements (such as pitch entropy, pitch density, and mean pitch value for all tracks, described in Section 4). Finally, the database function block stores these statistical features for further use.
An evolution component includes a recommendation agent module and an evolution manager. The former is responsible for building agent selection rules according to music features extracted by the representation component, while the latter constructs an evolution model based on IEC and applies a GA model to train the evolutionary agents. In our proposed model, user evaluations serve as the engine for agent adaptation (Fig. 2).
A central part of this component is the recommendation agent module, which consists of the agent design and the algorithm for selecting items. The first step for standard GAs is chromosome encoding, that is, designing an agent's chromosomal structure based on item feature representations. In our proposed model, each agent has one chromosome in which each gene represents one feature value. The gene value represents an item feature preference, and the number of item features determines chromosome length.
Each feature needs two genes to express its mean and range values. Take the three agents' chromosomes listed in Figure 3 as an example: f1_mean and f1_range represent the first agent's preference for the tempo feature, meaning that the first agent prefers a tempo between 30 and 40 beats per minute. The first agent will select songs that have a tempo of 35 ± 5 beats per minute and velocities of 60 ± 10. A gene value can also be "Don't care". We also perform real-number mutation on each mean and range value, and one-point crossover on selected pairs of agents' chromosomes.
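The mean/range encoding and the two operators can be sketched as follows, using the chromosome values of agents 1 and 2 from Figure 3. This is an illustration under stated assumptions rather than the paper's code: the song database is hypothetical, "Don't care" is represented by `None`, mutation is taken to be a Gaussian perturbation, and the crossover point is assumed to fall on a feature boundary so that a mean is never separated from its range.

```python
import random
from typing import Dict, List, Optional, Tuple

Gene = Optional[Tuple[float, float]]  # (mean, range), or None for "Don't care"
Chromosome = List[Gene]


def matches(chromosome: Chromosome, features: List[float]) -> bool:
    """True if every feature value lies within mean +/- range of its gene pair."""
    for gene, value in zip(chromosome, features):
        if gene is None:  # "Don't care": any value is acceptable
            continue
        mean, width = gene
        if abs(value - mean) > width:
            return False
    return True


def select_items(chromosome: Chromosome, database: Dict[str, List[float]]) -> List[str]:
    """Return the titles of all items whose features satisfy the chromosome."""
    return [title for title, feats in database.items() if matches(chromosome, feats)]


def mutate(chromosome: Chromosome, rng: random.Random, sigma: float = 1.0) -> Chromosome:
    """Real-number mutation (assumed Gaussian) of one gene pair; returns a copy."""
    mutated = list(chromosome)
    candidates = [i for i, gene in enumerate(mutated) if gene is not None]
    if candidates:
        i = rng.choice(candidates)
        mean, width = mutated[i]
        new_width = max(0.0, width + rng.gauss(0.0, sigma))  # a range cannot be negative
        mutated[i] = (mean + rng.gauss(0.0, sigma), new_width)
    return mutated


def one_point_crossover(
    a: Chromosome, b: Chromosome, rng: random.Random
) -> Tuple[Chromosome, Chromosome]:
    """Swap the tails of two chromosomes at one random feature boundary."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]


# Agents 1 and 2 from Figure 3: f1 is tempo, f2 is velocity.
agent_1: Chromosome = [(35.0, 5.0), (60.0, 10.0)]
agent_2: Chromosome = [(60.0, 3.0), (95.0, 4.0)]
# Hypothetical database: [tempo, velocity] per song.
songs = {"song_a": [38.0, 65.0], "song_b": [80.0, 65.0], "song_c": [33.0, 75.0]}
```

With these values, agent 1 accepts only `song_a`; replacing its velocity gene with "Don't care" would also admit `song_c`, whose tempo fits but whose velocity lies outside 60 ± 10.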
[Figure 2 flowchart: initialization; each agent selects items from the music database by matching its genes; the user grades the music items (Good! / Bad!); GA selection; GA generates new agents by crossover and mutation; next generation]
Figure 2. Evolution component, including agent recommendation module and evolution manager
The evolution manager in our model is responsible for the selection mechanism that preserves valuable genes for generating more effective offspring. The common procedure is to select good agents to serve as the parent population, create new individuals by mixing parental genes, and replace eliminated agents. However, when dealing with subjective evaluations, changes in human preferences can result in a lack of stability across runs. Accordingly, the best agents in previous rounds may get low grades because the user's preferences have changed, and therefore be discarded prematurely. As a solution, we propose the idea of agent fame values that are established according to previous behaviors. The higher the value, the greater the possibility that an agent will survive. The system's selection method determines which agents are discarded or recombined according to weighted fame values and local grades in each round, with total scores being summed with an agent's fame value in subsequent rounds.
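One possible reading of this selection rule is sketched below. The paper does not give the weights or the exact update formula, so everything numeric here is an assumption: the total score is taken to be a weighted sum of fame and the current round's local grade, and that total is carried forward as the agent's fame for the next round.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Agent:
    name: str
    fame: float = 0.0


def select_by_fame(
    agents: List[Agent],
    local_grades: Dict[str, float],
    n_discard: int,
    fame_weight: float = 0.5,   # assumed weight, not given in the paper
    local_weight: float = 0.5,  # assumed weight, not given in the paper
) -> Tuple[List[Agent], List[Agent]]:
    """Rank agents by weighted fame plus local grade; return (survivors, discarded)."""
    totals = {
        agent.name: fame_weight * agent.fame + local_weight * local_grades[agent.name]
        for agent in agents
    }
    ranked = sorted(agents, key=lambda agent: totals[agent.name], reverse=True)
    # Assumed update: the round total becomes the fame used in the next round.
    for agent in agents:
        agent.fame = totals[agent.name]
    keep = len(ranked) - n_discard
    return ranked[:keep], ranked[keep:]


# A well-established agent has one bad round; a newcomer has one good round.
agents = [Agent("veteran", 8.0), Agent("middle", 4.0), Agent("newcomer", 0.0)]
grades = {"veteran": 2.0, "middle": 5.0, "newcomer": 6.0}
survivors, discarded = select_by_fame(agents, grades, n_discard=1)
```

In this example the veteran survives a single low grade because of its accumulated fame, which is the behavior the fame value is meant to provide when user preferences fluctuate.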
Another important GA design issue is deciding when to stop agent evolution. System convergence is generally determined via learning curves, but in a subjective system this task (that is, deciding when an agent's training is complete) is especially difficult in light of potential changes in user preference and emotion. Our solution is based on the observation that the stature of judges in a music or art competition depends on the decisions they make in previous competitions. In our system, agent fame value varies in each round. The system monitors agent values to determine which ones reach a pre-defined threshold; those agents are placed in a "V.I.P. pool." Pool agents cannot be replaced, but they can share their genes with other agents. Once a sufficient number of stable V.I.P. agents are established, the system terminates the evolution process. For example, if an agent reaches a fame value of six points and the system's pre-defined threshold is six points, the agent is placed in the V.I.P. pool. This mechanism exists solely to preserve potentially good agents.
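The V.I.P. pool and the stopping rule can be sketched in a few lines. The threshold of six points comes from the example above and the pool size of four from the experiments in Section 4; the function names and the use of integer agent IDs are assumptions made for illustration.

```python
from typing import Dict, Set


def update_vip_pool(
    fame_values: Dict[int, float], vip_pool: Set[int], threshold: float = 6.0
) -> Set[int]:
    """Add every agent whose fame reaches the threshold; members are never removed."""
    promoted = {agent_id for agent_id, fame in fame_values.items() if fame >= threshold}
    return vip_pool | promoted


def evolution_finished(vip_pool: Set[int], required: int = 4) -> bool:
    """Stop evolving once the pool holds enough agents."""
    return len(vip_pool) >= required


pool_round_1 = update_vip_pool({1: 6.0, 2: 5.9, 3: 7.5}, set())
# Agent 1 stays in the pool in round 2 even though its fame has dropped.
pool_round_2 = update_vip_pool({1: 1.0, 2: 6.1, 4: 9.0}, pool_round_1)
```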
CHROMOSOME
Agent ID   f1_mean   f1_range   f2_mean   f2_range   ...
1          35        5          60        10         ...
2          60        3          95        4          ...
3          83        5          120       10         ...
Figure 3. Agent chromosome. Each gene represents a mean or range value of a music feature. Whole chromosomes represent selection rules for agents to follow when choosing favorite items. The chromosome in this figure encodes two music features.
A user component consists of an interface for evaluating agent recommendations based on standards such as technicality, melody, style, and originality. The user interface is also responsible for arranging agents according to specific application purposes. For example, to find joint preferences between two different users, the user interface component initializes and arranges two sets of agents, one for each user.
An agent selects items of interest from the database according to its selection rules and makes appropriate recommendations to the user, who evaluates items via the interface. Evaluations are immediately dispatched to the agent, whose evolution is controlled according to performance and GA operations (e.g., crossover, mutation, and selection). The evolution manager is responsible for a convergence test whose results are used to halt evolution according to agent performance.
3.2. Applications
We designed our model so that the chromosomes of surviving agents contain selection rules that are able to represent user profiles. Concurrently, user profiles formed by agent chromosomes can be compared among multiple users. Combined, distributed agents can be utilized for three kinds of applications:
1. Users can train sample groups of agents. The agent evaluation function can be altered to reflect a sum of several user profiles, thus representing the tastes of multiple users. However, true system convergence will be difficult to achieve due to disagreements among user opinions. As in the case of scoring entries in art or music competitions, extremely high and low scores can result in total scoring bias.
2. Users can train their own agents and share profiles. According to this method (which is similar to collaborative filtering), the system compares user profiles formed by the agents' chromosomes and identifies those that are most similar. Collaborative recommendations can be implemented via partial exchanges among agents.
3. Users can train their own agents while verifying the items selected by other users' agents. In the art or music competition scenario, users can train their own agents before verifying the agents of other users to achieve partial agreement. Pools of agents from all users will therefore represent a consensus. If one user's choice is rejected by the majority of other users following verification, that user will be encouraged to perform some agent re-training or face the possibility that the agent in question will be eliminated from the pool. For this usage, the user interface is responsible for arranging and exchanging the agents between different users.
4. Experiments
Our experimental procedures can be divided into two phases:
1. Training phase. Each user was allotted six agents for the purpose of selecting music items: two songs per agent per generation (12 songs per generation). Since subjective distinctions such as "good or bad music" are hard to make according to a single grading standard, users gave multiple scores to each song according to different standards. Each agent received two sets of scores from the user, with three scores in each set representing melody, style, and originality. The chromosome of any agent receiving high grades from a user six times in a row was placed in the system's V.I.P. pool; the chromosome was used to produce a new chromosome in the next generation. This procedure was repeated until the system determined that evolutionary convergence had occurred. The system stopped at the user's request or when the V.I.P. pool contained four agents, whichever came first.
2. Validation phase. This phase consisted of a demonstration test for verifying that system-recommended songs matched the user's tastes. Experimental groups consisted of 20 songs chosen by 6 trained agents; control groups consisted of 20 songs chosen by 6 random agents. User evaluations confirmed or refuted agent capabilities. Users were not told which selections belonged to the respective groups.
4.1. Model Implementations
Musical items were stored and played in polyphonic MIDI format in our system because note data in MIDI files can be extracted more easily than data in audio wave format [1]. The track selector translates each MIDI file into a textual format; the beginning of one textual feature file is listed in Table 1 as an example. Polyphonic items consist of one track for melody and additional tracks for accompanying instruments or vocals. The melody track (considered the representative track) contains the most semantics. Since the main melody track contains more distinct notes with different pitches than the other tracks, it was identified for feature extraction based on pitch density analysis. According to previous research [3], this method is capable of achieving an 83 percent correctness rate. Track pitch density is defined as Pitch density = NP / AP, where NP is the number of distinct pitches on the track and AP is the number of all possible distinct pitches in the MIDI standard. After computing the pitch densities of all tracks of a targeted music object, the track with the highest density was identified as the representative polyphonic track.
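The pitch density rule can be sketched directly from the definition Pitch density = NP / AP. The MIDI standard defines note numbers 0 through 127, so AP is 128; the track names and pitch lists below are hypothetical, and each track is assumed to have already been reduced to a list of MIDI note numbers.

```python
from typing import Dict, List

MIDI_PITCH_COUNT = 128  # AP: MIDI note numbers run from 0 to 127


def pitch_density(pitches: List[int]) -> float:
    """NP / AP, where NP is the number of distinct pitches on the track."""
    return len(set(pitches)) / MIDI_PITCH_COUNT


def representative_track(tracks: Dict[str, List[int]]) -> str:
    """Return the name of the track with the highest pitch density."""
    return max(tracks, key=lambda name: pitch_density(tracks[name]))


# Hypothetical tracks: a repetitive accompaniment, a melody, and a bass line.
tracks = {
    "T1": [60, 60, 60, 62],
    "T3": [60, 62, 64, 65, 67, 69],
    "T4": [36, 36, 43],
}
main_track = representative_track(tracks)
```

Because AP is the same constant for every track, ranking by density is equivalent to ranking by the number of distinct pitches; the division only matters when densities are stored or compared across feature scales.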
Table 1. Part of a textual MIDI feature file
Unit   Length   At       Time    Track   Channel   Note   Velocity
314    53       1162ms   197ms   T4      C4        d2     68
319    50       1181ms   185ms   T3      C3        d4     71
321    48       1188ms   178ms   T3      C3        b3     74
The purpose of the feature extractor is to extract features from the perceptual properties of musical items and transform them into distinct data. We focused on seven features for our proposed system; new item features can also be added when possible.
1. Tempo, defined as the average note length value derived from MIDI files.
2. Volume, defined as the average value of note velocities derived from MIDI files.
3. Pitch entropy, defined as $\text{PitchEntropy} = -\sum_{j} P_j \log P_j$, where $P_j = N_j / T$, $N_j$ is the total number of notes with the corresponding pitch $j$ on the main track, and $T$ is the total number of main track notes.
4. Pitch density, as defined earlier in this section.
5. Mean pitch value for all tracks.
6. Pitch value standard deviation. Large standard deviations indicate a user preference for musical complexity.
7. Number of channels, reflecting a preference for solo performers, small ensembles, or large bands/orchestras.
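Several of these features can be computed with the standard library, as in the sketch below. It is an illustration only: the logarithm base is not stated in the text and base 2 is assumed, the population standard deviation is assumed, a single note list stands in for both the main track and "all tracks", and the MIDI pitch numbers are hypothetical (the lengths and velocities are the three rows of Table 1).

```python
import math
from collections import Counter
from statistics import mean, pstdev
from typing import Dict, List, Tuple

Note = Tuple[int, float, int]  # (MIDI pitch number, note length, velocity)


def pitch_entropy(pitches: List[int]) -> float:
    """-sum_j P_j * log2(P_j), with P_j = N_j / T (base 2 is an assumption)."""
    total = len(pitches)
    counts = Counter(pitches)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def extract_features(notes: List[Note]) -> Dict[str, float]:
    """Compute tempo, volume, pitch entropy, mean pitch, and pitch std. deviation."""
    pitches = [pitch for pitch, _, _ in notes]
    return {
        "tempo": mean([length for _, length, _ in notes]),       # average note length
        "volume": mean([velocity for _, _, velocity in notes]),  # average velocity
        "pitch_entropy": pitch_entropy(pitches),
        "mean_pitch": mean(pitches),
        "pitch_std": pstdev(pitches),
    }


# Lengths and velocities from Table 1; pitch numbers are assumed values.
table_1_notes: List[Note] = [(38, 53.0, 68), (62, 50.0, 71), (59, 48.0, 74)]
features = extract_features(table_1_notes)
```

A track that alternates evenly between two pitches has an entropy of exactly one bit under this definition, while a track that repeats a single pitch has an entropy of zero.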
Genes in standard GA systems are initialized randomly. However, in our proposed system randomly initialized agents will probably fail to find items that match their genetic information because the distribution of extracted features is unbalanced. We therefore suggest pre-analyzing the feature value distribution and using the data to initialize agent chromosomes. Doing so avoids initial agent preferences that are so unusual that they cannot possibly locate preferred items. Furthermore, this procedure prevents noise and speeds up agent evolution. Here we use tempo as an example of music feature pre-analysis. Since the average tempo for all songs in our database was approximately 80 beats per minute (Fig. 4), a random choice of tempo between 35 and 40 beats per minute resulted in eventual agent replacement or elimination and a longer convergence time for the entire system. For this reason, average values in our system were limited: 60 percent of all initial tempo ranges deviated between -1 and 1 and 80 percent between -2 and 2. This sped up the agent evolution process.
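One simple way to make initial genes follow the observed feature distribution is to draw each initial mean from the values actually present in the database. This is a substitute technique shown for illustration, not the percentage-based limiting procedure described above; the function name, the range bounds, and the tempo values are assumptions.

```python
import random
from typing import List, Tuple


def init_gene_from_data(
    observed: List[float],
    rng: random.Random,
    min_range: float = 1.0,  # assumed lower bound on the range gene
    max_range: float = 5.0,  # assumed upper bound on the range gene
) -> Tuple[float, float]:
    """Draw a (mean, range) gene pair whose mean is an observed feature value."""
    # Sampling an observed value makes dense regions of the distribution
    # proportionally more likely and guarantees at least one matching item.
    mean = rng.choice(observed)
    width = rng.uniform(min_range, max_range)
    return mean, width


# Hypothetical tempo values clustered around 80 beats per minute.
tempos = [62.0, 75.0, 78.0, 80.0, 80.0, 83.0, 85.0, 96.0]
rng = random.Random(0)
genes = [init_gene_from_data(tempos, rng) for _ in range(20)]
```

An agent initialized this way can never start with a tempo preference such as 35 to 40 beats per minute unless songs in that region actually exist in the database.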
4.2. Recommendation Quality
Recommendation quality is measured in terms of precision rate and weighted grade. Precision rate is defined as Precision rate = Ns / N, where Ns is the number of successful samples and N is the total number of music items. Weighted grade is defined as Weighted grade = $\sum_{i} M_i / N$, where $M_i$ represents the grade of music item $i$ and N is the total number of music items. Users were given six levels to choose from for evaluating chosen items.
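Both measures reduce to a few lines of arithmetic, sketched below. The text does not state what makes a sample "successful", so the cut-off grade used here is an assumption, as are the example grades.

```python
from typing import List


def precision_rate(grades: List[float], success_threshold: float) -> float:
    """Ns / N, counting a sample as successful when its grade reaches the threshold."""
    successes = sum(1 for grade in grades if grade >= success_threshold)
    return successes / len(grades)


def weighted_grade(grades: List[float]) -> float:
    """Sum of all item grades M_i divided by the number of items N."""
    return sum(grades) / len(grades)


# Hypothetical grades for five recommended songs.
example_grades = [8.0, 9.0, 4.0, 7.0, 10.0]
example_precision = precision_rate(example_grades, success_threshold=6.0)
example_weighted = weighted_grade(example_grades)
```

For these five grades, four reach the assumed threshold, giving a precision rate of 0.8 and a weighted grade of 7.6.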
Users were asked to evaluate experimental and control group selections. In the experimental group, users evaluated songs recommended by agents that they had trained; in the control group, they evaluated songs recommended by random agents. After users completed their tests, the system calculated precision rates and weighted grades. The songs recommended by the trained agents had an average precision rate of 84 percent and an average weighted grade of 7.38, compared to 58.33 percent and 5.54 for songs recommended by the random agents.
[Figure 4 plot: legend "Accumulated number of tempo (Beats per minute)"; horizontal axis: Beats per minute, 30 to 110]
Figure 4. Statistical curve for tempo distribution in our sample of 1,036 MIDI files
4.3. Convergence Test
GA-based models commonly perform large numbers of iterations before arriving at convergence. In order to trace learning progress, we let users perform one demonstration test after every round; results are shown in Figure 5. Curve A reflects a steady increase in effectiveness and convergence after eight rounds. Curve B reflects a lack of progress for agents that make random selections without training.
[Figure 5 plot: Curve A: Experimental group (trained agents); Curve B: Control group (random agents); horizontal axis: Generation, 1 to 10]
Figure 5. Convergence test and evolution generation of 10 users. Curve A represents the average fitness value of 60 agents belonging to 10 users
In addition to recommendation quality and convergence tests, we attempted to identify clear differences between experimental and control group music selections by extracting their respective features. As shown in Figure 6, obvious differences were noted in terms of tempo and entropy, indicating that the trained agents converged on distinct preferences and did not blindly select items. Taking one user's experimental result as an example, the user's preferred tempo is quite different from the average tempo in the control group.
Figure 6. Example results for one user
5. Conclusion
Our proposed recommendation model can evaluate a large body of subjective data via a cooperative process involving both system agents and human users. Those users train groups of agents to find items that match their preferences, and then provide ongoing feedback on agent selections for purposes of further training. Agent training entails IEC methods and agent fame values to address the issue of changes in human emotions. The agent fame value concept is also used as a convergence condition to promote agent population diversity and to propagate useful genes. Model flexibility was expressed in terms of replacing or altering functional blocks such as the user interface, which allows for use by multiple users. We suggest that with refinement and modifications, our model has potential for use by referees to critique large numbers of subjective compositions (in such areas as art, music, and engineering) and to make recommendations for images by extracting features (e.g., brightness, contrast, or RGB value) and encoding the information into agent chromosomes.
References
[1] Kazuhiro, I., Yoshinori, H., Shogo, N.: Content-Based Filtering System for Music Data. 2004 Symposium on Applications and the Internet-Workshops. Tokyo, Japan. (2004) 480
[2] Ben Schafer, J., Konstan, J.A., Riedl, J.: E-Commerce Recommendation Applications. Data Mining and Knowledge Discovery, Vol. 5. (2001) 115-153
[3] Chen, H.C., Chen, A.L.P.: A Music Recommendation System Based on Music and User Grouping. Journal of Intelligent Information Systems, Vol. 24. (2005) 113-132
[4] Cho, S.B.: Emotional Image and Musical Information Retrieval with Interactive Genetic Algorithm, Proceedings of the IEEE, Vol. 92. (2004) 702-711
[5] Cho, S.B., Lee, J.Y.: A Human-Oriented Image Retrieval System using Interactive Genetic Algorithm, IEEE Transactions on Systems, Man and Cybernetics, Part A, Vol. 32. (2002) 452-458
[6] Li, Q., Myaeng, S.H., Guan, D.H., Kim, B.M.: A Probabilistic Model for Music Recommendation Considering Audio Features, in Information Retrieval Technology, Vol. 3689. (2005) 72-83
[7] Balabanovic, M., Shoham, Y.: Fab: Content-based, Collaborative Recommendation, Communications of the ACM, Vol. 40. (1997) 66-72
[8] Konstan, J.A., Miller, B.N., Maltz, D., Herlocker, J.L., Gordon, L.R., Riedl, J.: GroupLens: Applying Collaborative Filtering to Usenet News, Communications of the ACM, Vol. 40. (1997) 77-87
[9] Shardanand, U., Maes, P.: Social Information Filtering: Algorithms for Automating "Word of Mouth", in Katz, L.R., Mack, R., Marks, L., Rosson, M.B., Nielsen, J. (eds.), Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Denver, Colorado, United States. (1995) 210-217
[10] Kuo, F.F., Shan, M.K.: A Personalized Music Filtering System Based on Melody Style Classification, in Proceedings of Second IEEE International Conference on Data Mining, Maebashi City, Gumma Prefecture, Japan. (2002) 649-652
[11] Mooney, R.J., Roy, L.: Content-Based Book Recommending using Learning for Text Categorization, in Nurnberg, P.J., Hicks, D.L., Furuta, R. (eds.), Proceedings of the Fifth ACM Conference on Digital Libraries, San Antonio, Texas, United States. (2000) 195-204
[12] Fisk, D.: An Application of Social Filtering to Movie Recommendation, BT Technology Journal, Vol. 14. (1996) 124-132
[13] Mukherjee, R., Sajja, E., Sen, S.: A Movie Recommendation System - An Application of Voting Theory in User Modeling, User Modeling and User-Adapted Interaction, Vol. 13. (2003) 5-33
[14] Chaffee, J., Gauch, S.: Personal Ontologies for Web Navigation, in Proceedings of the Ninth International Conference on Information and Knowledge Management. McLean, Virginia, United States. (2000) 227-234
[15] Chiang, J.H., Chen, Y.C.: An Intelligent News Recommender Agent for Filtering and Categorizing Large Volumes of Text Corpus, International Journal of Intelligent Systems, Vol. 19. (2004) 201-216
[16] Pazzani, M.J.: A Framework for Collaborative, Content-Based and Demographic Filtering, Artificial Intelligence Review, Vol. 13. (1999) 393-408
[17] Holland, J.H.: Adaptation in Natural and Artificial Systems, Ann Arbor: University of Michigan Press. (1975)
[18] Takagi, H.: Interactive Evolutionary Computation: Fusion of the Capabilities of EC Optimization and Human Evaluation, in Proceedings of the IEEE, Vol. 89. (2001) 1275-1296
[19] Maes, P.: Agents that Reduce Work and Information Overload, Communications of the ACM, Vol. 37. (1994) 31-40