THE PHANTOM ORCHESTRA
In the early 1960s the American Federation of Musicians made an unprecedented statement regarding the invention of a new instrument:
"Rules are made in response to problems."
"Our problem is job loss due to automation in music."
"We recognise the use of tape instruments as long as they are not utilised in order to displace a musician."
Quite what they would have made of today's endless discourse around musical AI can only be imagined - because in comparison with current music creation tools, the instrument they were referring to was a relatively benign invention. 'The Chamberlin' used a prerecorded set of tape loops to emulate a small palette of orchestral sounds that could be played using a traditional keyboard interface. As such, it was only ever a tangential threat to music creation, since it still required humans to compose for it and to perform it. With the advent of 'constructed music' using sequencers and digital music tools, we removed the need for performers, but most subsequent advances in music creation, even today's cutting-edge sound generation AIs, still require the imagination of humans for composition.
However, there is now an exception - one which requires us to narrow down the broad and nebulous terms 'AI' and 'Generative AI' to one particular application: 'Automatic Music Creation'. It can be described as follows - training an AI model on a library of pre-existing music to create complete pieces of new music with little or no human input. This process is not just unique in removing the need for a human composer - it also differs vastly from every preceding technology in its ability to make complete works of 'original' recorded music on an almost infinite scale, at almost infinite speed. It is when we examine this unprecedented aspect of its capabilities that a central question emerges:
What happens to our experience of music and music creation now that we have a system for automatically creating recorded music at infinite scale?
Answering this question entails making speculative predictions about the future, so instead of attempting to answer it precisely, we can look to examples from the past. These examples approach the question from different perspectives in the hope that they can provide a clearer understanding of how things may progress, whilst avoiding many of the existing arguments around IP and 'computational vs human creativity' which have already been exhaustively covered elsewhere.
EXPERIMENTS IN MUSICAL INTELLIGENCE
The first story illustrates the nature of context. In the mid-1980s, David Cope began work on EMI - what many believe to be the first (albeit primitive) musical AI. EMI was able to interpret musical scores and then automatically create novel scores in the style of the music it had 'learnt' from. Unlike many of the systems we now use for deep learning, the underlying architecture of these models did not attempt to simulate the patterns of information flow through a human brain - instead it used Augmented Transition Networks to predict patterns and sequences of notes. Since EMI worked with a symbolic representation of music, rather than recorded sound, its output was a musical score, and the resulting piece still required significant human interpretation in its performance by musicians. But although (by today's standards) it was rudimentary in technical terms, it was very successful in its ability to mimic the compositions of famous composers, to the degree that even those well versed in composition were fooled into thinking that these scores were the work of the composers it had been trained on.
So what were the seismic responses to a computer that had passed the "Turing test for music"? Well, beyond the initial questions about 'machine creativity' and 'intellectual property', today (outside of academia) we see very little interest in either recordings of EMI's works, or a desire to hear them performed. It appears that although Cope's achievements were an almost unparalleled step forward in automatic music creation, they somehow failed to permeate into the culture and canon of western classical music. How could this be? Musically, many of EMI's 11,000 works were as compelling as the pieces they had been trained on, and compositionally they were refined. But we had underestimated the profound human desire for a coherent narrative around the piece itself, and forgotten that our relationship to and appreciation of a musical work has never been about the music in isolation.
All human created music emerges from and belongs to a broader context, one which includes its creator, the performers and the prevailing cultural dynamics of the time. If EMI's creations and the current swathe of automatically created music are anything to go on, we are yet to see any evidence that music born in a computational and contextual vacuum is compelling enough to displace creations that possess a human narrative. In other domains, it appears that our response is similar - we see very little interest in watching computers play one another at chess, or compete against each other at sports. Furthermore, this problem of engagement cannot simply be solved by brute force compute, or another innovative step change in underlying AI architecture - it may well prove to be an immovable aspect of the human condition - one which is subject to the slower pace of human evolution.
THE INTERLACE
The second story is one of visibility. In 1996, with the Internet in its infancy (when Google was still called 'BackRub') David Foster Wallace described something he called 'The Interlace'. The purpose of this fictional entity of the future was to filter through the 'trillions of bits coming at you, 99% of which are shit' to help orient us towards the things we find meaningful, because 'it's too much work to do triage to decide'.
Often when this quote is discussed, it is cited as a profound prediction about technological progress - and in that regard it was extremely accurate. But Foster Wallace's great insight was not in predicting the course of technological change itself, but in anticipating how human psychology would play out when confronted by a vast barrage of user generated content. Today, it is obvious to anyone that the largest, most powerful technologies in the software domain are the gatekeepers - those that can guide us towards the things we want to find, alongside others which allow us to structure, organise and make sense of them.
The implications of Foster Wallace's quote are clear - as the current sea of human creation deepens, the great works of our imagination do not inevitably rise to the top. When automatic content creation turns this sea into a predominantly artificial ocean, we risk obscuring human creations from view entirely, or at best empowering a whole new category of technological gatekeepers to decide what deserves visibility according to their own economic incentives. Crucially, this insight also tells us that this may still happen even if we create a more coherent framework for consent around training models on existing creations. Consequently it is myopic to think that we can simply legislate away all of the ills of automatic music creation by enforcing an opt-in for training data or a fairer system for remuneration.
CHORLEYWOOD
The final story is one of over-abundance. It emerges from the village of Chorleywood, about 20 miles from London, where in 1961 a manufacturing method was developed which changed the trajectory of food production. Through the use of additives and high strength mechanical kneading, bakeries were able to vastly increase the speed of bread production, whilst simultaneously removing manual labour almost entirely. This process now produces 80% of the bread sold in the UK - around 9 million loaves per day - a scale almost unimaginable at its inception. It signalled the beginning of a much wider shift towards food produced at ever increasing scale and speed, at a substantially decreased cost - and, subsequently, the emergence of ultra-processed foods, which now make up 60-70% of the Western diet.
So what was our collective human response to this unprecedented new abundance of food? Well, between 1975 and 2016, the prevalence of obesity worldwide tripled. Even in the face of evidence emerging in the 1990s that little of the ultra-processed food we eat is nourishing for our bodies, and may in fact be extremely damaging to health, our consumption continues to rise. In short, we have become addicted to artificial foods created at almost infinite scale and speed - even when we know they are damaging to our health.
So how does all this relate to automatic music creation? I believe we have to take very seriously the notion that exposure to an almost infinite volume of artificial 'content' may do to our minds what an almost infinite volume of artificial food has done to our bodies. We are already experiencing the dramatic effects of exponential growth in 'user generated content' on our attention, wellbeing and orientation towards the world. Furthermore, it took us around thirty years to discover the effects of ultra-processed and artificial foods on our physical health - so our future exposure to an ocean of artificial content needs to be urgently studied, anticipated and addressed.
Despite all of this, our current situation also reminds us that progress can be made. Over the past 20 years there has been a significant rise in awareness around our physical health, and we have also (miraculously) managed to reintroduce an economy for slower, manual processes of food cultivation and production. If we are able to do this with something as fundamental as food production, surely it is not beyond the realms of possibility that we could do the same for works of human imagination?
PARTING THE INFINITE FOG
To conclude with some necessary distance - if AI proves to play a role in minimising illness, protecting nature, or even broadening the sphere of human imagination, it will be one of the most miraculous inventions in human history - consequently, this is not a polemic against AI. Instead it is a criticism of a particularly pernicious commodification of AI whose purpose is to leverage others' intellectual property (whether licensed or unlicensed) for additional financial gain. In automatic music creation, we have designed a system which simultaneously denies us the pleasures we receive from the process of making music, whilst creating an almost infinite fog of 'content', one which threatens to obscure from view the creations of those who persevere in cherishing the human craft. Those in charge of developing these systems fundamentally misunderstand the nature of creativity, treating it like any other mechanical process - just another 'problem to be solved'. They appear to believe that ideas can somehow be wholly formed in our minds prior to manifestation, requiring only the shortest and quickest route to completion - even if taking that route entails our creative agency being reduced to mere text prompts, rudimentary parameters, or removed entirely.
By encouraging 'users' to claim authorship over works that they have been given no agency in creating, the makers of these systems veil their true motivations behind altruistic statements such as 'democratising music creation'. However, in a world where everyone has access to free music creation software, much of which has already been designed for ease of use - a world where almost all instruments can be acquired for free - this argument is at best incoherent, and more likely disingenuous. Appeals to 'education' are equally weak - there can be little educational value in a 'black box' system which obfuscates all of the underlying musical structure from the user, whilst providing them with little or no agency in shaping or auditioning it - since learning an instrument, or arranging a piece of music, teaches us as much about listening as it does about refining our ability to manipulate the tools themselves.
Clearly these flaws are not inherent properties of 'AI' itself, but rather a failure on our part to apply it to the appropriate tasks. If guided in the right ways, and applied to the right purpose, it has the capacity to profoundly reshape our interactions with computers, enabling interfaces for music creation which are more humane, more nuanced, more expressive and less rooted in the current paradigm of parameters and abstraction - something which is desperately needed. It may also inspire us to create differently - but only if we give ourselves appropriate agency in shaping it to unveil the 'inter-concept spaces' and tend towards the unexpected, rather than endlessly regressing to the mean. It can also allow us to deliver richer experiences of recorded music to the listener, ones which can be more holistic and animate representations of a musical work. Recorded music can, for the first time, manifest as improvisational, malleable structures which encourage a deeper attention, exploration and discovery on the part of the listener.
However, building all of this takes time, care, attention and, most of all, a reverence towards human creations - and it will only provide positive outcomes if we choose to steward it wisely. We don't need to get lost in arguments about what constitutes authorship or creativity in order to defend the position that there may be many applications of AI that we simply do not want. We must also confront the fact that although economic incentives and legislation are certainly necessary, they may not be sufficient to prevent the most insidious applications of musical AI. To collectively defend the things we care about in music may require us to turn our backs on some of the AI tools we are already making, because our current relationship with them feels very much like we have just invented fire, and our first thought is to burn our shared musical culture to the ground.