
I don't have a well-informed opinion on AI. I'm led to believe that what we refer to as 'AI' is just 'machine learning.'

I'm not a computer scientist, but my intuition is that there's a difference between 'intelligence' and 'machine learning.'

Throughout my twenties, I was an armchair intellectual. I binged podcasts by Joe Rogan and Sam Harris and subsequently regurgitated the opinions of experts as if they were my own.

I read pop academia books like Utopia for Realists by Rutger Bregman and Sapiens by Yuval Noah Harari. This is where I began to hear grave concerns about the relentless progress of what, at the time, was being referred to as 'automation', and to see scary graphs like the one below predicting when certain jobs would be automated:

Above: A graph from a 2017 study into the likelihood of jobs being replaced by machines, conducted by Oxford University’s Future of Humanity Institute, Yale University and AI Impacts.


The graph predicts that a machine could "write a high school essay" by 2026. ChatGPT had managed it by 2022. So given that we're ahead of schedule, we can confidently expect everyone to be unemployed within 115 years.

Only a fool attempts to predict the future. I'm more interested in the imminent period where we learn how to coexist and work with our new machine-learning tools, rather than against or without them.

I host a podcast called Having a GAS. Chris, who films the discussions, told me that the audio for one of our episodes wasn't ideal. You could hear us talking, but it was in a very echoey room. We tried all sorts of tools and techniques to clean it up, but 90% of the problems remained.

A few weeks ago, Chris said, "Alright guys, I ran the first minute of your audio mix through an AI speech enhancer I used as an experiment. Files in here with and without AI, let me know which you prefer."

The enhancer in question was the beta rollout of Adobe's new podcast clean-up service. It fixed 120% of the problem, and I say that deliberately.

Not only had it removed all the echo, but it added depth and quality to our voices that weren't there in the first place. Rich bass tones gave the impression that we'd recorded our discussion in a real studio, rather than in an echoey room with a £250 Zoom recorder. It floored me.

An entire branch of audio post-production had disappeared completely.

Above: The legendary debate that laid down US political lines on race, justice and history.

One weekend, YouTube fed me this debate [above] from The Cambridge Union in 1965 on whether The American Dream had been achieved at the expense of "The American Negro" (forgive what may seem like a slur, but I must quote the motion accurately).

The debate between James Baldwin and William F. Buckley, Jr. is electrifying. I became slightly obsessed with it and re-watched it a few more times, partly because the oratory from everyone involved is stellar compared with modern standards. Baldwin is the clear winner, not just by virtue of his quiet charisma but because of his facility with language.

While I was watching, I remembered that speech enhancement tool and realized that you can upload anything into it. Not just things we'd recorded.
So I took the debate from YouTube and put the audio into the podcast clean-up tool.
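For the curious, the first step looks something like the sketch below: pull the audio off YouTube as an uncompressed file, then upload the result to the enhancer by hand (Adobe's tool is a web service, so that part isn't scripted). The yt_dlp package, the filename and the placeholder URL are illustrative assumptions, not a record of our exact workflow.

```python
# Rough sketch: grab the debate's audio from YouTube as a WAV file,
# ready to upload to a speech-enhancement tool such as Adobe's.
# Assumes the yt_dlp package and ffmpeg are installed; the URL is a placeholder.
import yt_dlp

DEBATE_URL = "https://www.youtube.com/watch?v=EXAMPLE_ID"  # hypothetical placeholder

options = {
    "format": "bestaudio/best",                # take the best available audio stream
    "outtmpl": "baldwin_buckley_1965.%(ext)s",  # output filename template
    "postprocessors": [{
        "key": "FFmpegExtractAudio",           # strip the video, keep audio only
        "preferredcodec": "wav",               # uncompressed, safest for re-processing
    }],
}

with yt_dlp.YoutubeDL(options) as ydl:
    ydl.download([DEBATE_URL])
```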

Within ten minutes of processing, we were hearing James Baldwin speak as if he had been recorded yesterday. The clarity of the audio brings him firmly into the present. In the same way that things are easier to objectify if they're shot in black and white, they seem unrelated and irrelevant if the audio is plagued by the limitations of its time.

But, of course, it sounded too perfect. Too processed. So we used our post-production techniques to put it back into its context and make it sound as if we were there in the room with them in 1965.
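I won't pretend our exact chain is worth documenting blow by blow, but the principle is easy to sketch: take the too-clean enhanced file, roll off the frequency extremes and add a couple of short reflections so the voices sit back in the hall. The ffmpeg filters and values below are illustrative guesses at that kind of processing, not our actual settings, and the filenames are placeholders.

```python
# Illustrative only: band-limit the over-clean enhanced audio and add a short
# room echo so it sounds like 1965 again. Assumes ffmpeg is installed;
# filter values are guesses, not the settings we actually used.
import subprocess

subprocess.run([
    "ffmpeg", "-y",
    "-i", "baldwin_buckley_enhanced.wav",      # hypothetical output from the enhancer
    "-af",
    "highpass=f=100,lowpass=f=7500,"            # roll off the extremes to age the tone
    "aecho=0.8:0.6:40|70:0.25|0.15",            # two short reflections for room feel
    "baldwin_buckley_remaster.wav",
], check=True)
```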

And now we have our first case study of 'remastered audio': a collaboration between man and machine.

Above: A side-by-side comparison of the AI cleanup.


A doom-monger would point out that the machine will soon be able to do the things that humans did in this process: apply the appropriate reverbs based on room dimensions derived from image modelling; separate one voice from another and treat them accordingly; know the difference between people speaking in the room and people speaking into a broadcast dynamic microphone.

And when that day comes, which it likely will, the burden will, once again, be on us to adapt.

Adaptation is a defining characteristic of humanity. When the water rises, we learn to swim. It is difficult, stressful, and painful. And often we lose things on the way. But that's as it should be because making sacrifices to adapt is how the heroic survive.

We never really know someone until they're tested. And we never really know ourselves until we are. We're all about to be put to the test. And in this tiny moment of collaboration between man and machine, I've seen for the first time a reason to be optimistic about our future relationship with machines.

They could be wonderful servants, instead of tyrannical masters.
