How AI Will Change Your Future Media Workflow
The two big product updates from Adobe at NAB were text-based editing in Premiere and still photo support in Camera to Cloud (you can read our coverage on Frame.io updates here). But in chatting with Michael Cioni, Adobe’s Senior Director of Global Innovation, things took a fascinating turn when discussing Firefly, Adobe’s generative AI model, and the future of AI in media production.
Our conversation explored the integration of AI tools in Adobe’s products, as well as the broader implications of AI-driven technologies on the creative process. Michael shared his insights on how the shift towards generative AI solutions could significantly reduce traditional media production, alter the landscape of creative roles and reshape the way we consume content.
His thought-provoking perspective on the inevitability of these changes and the importance of adapting to them prompted us to ponder the role of human creativity in this rapidly evolving landscape. And while GPT-4 wrote most of this article’s introduction based on the context of the transcript, the interview happened IRL (you can watch a video of it below).
The transcript has also been slightly edited and condensed for clarity.
Filmmaker: Talking about AI, do you see future developments with the Frame.io platform of tools to help organize footage? I know Adobe Premiere has some stuff going out now, but even being able to search footage or tagging or identifying objects, like using it as a media library.
Cioni: Absolutely. Look, AI is going to be the most polarizing bloodbath of the last 20 years. It’s going to be really incredible. It’s going to happen really fast. And there’s going to be people on all sides of these issues. Adobe is really building a system, Firefly, that is a generative AI solution. And that’s part of our ability to make sure that there are copyrights considered with that. These are things that are really important to the Adobe brand to make sure that those things are considered. But you still have to recognize that this is an inevitability, right?
My advice to cinematographers and photographers, creatives in general, is to really recognize that the percentage of images that are photographed or recorded in the world for production is going to go down enormously. Now news and sports are different, because you can’t generate news and sports. But when it comes to narrative, and elements like that, there’s going to be a big change. You want to position yourself at businesses, companies and technologies that really support a world where we’re going to be able to generate so much more than we could possibly imagine, right? It’s going to even change things like actors’ contracts. [There will be] the legal ability to buy someone’s likeness and generate that instead of paying them to shoot it. This is going to be really, really different.
So I’m excited about that opportunity. I don’t know exactly how it’s going to play out, but I’m pretty certain it is going to [result in] a significant reduction in overall production because it’ll be cheaper to generate it than to shoot it. Productions that are looking for that kind of savings are just going to go right there. If we fight this technological change, I think it’ll be a miss. If you expend your calories on the wrong issues, you’re going to end up wasting energy and still find yourself in a tricky situation.
So lean into it. Much like 10, 15, 20 years ago with digital cinema, people fought it. There were people that said, “I’m not shooting anything unless it’s on film.” What a silly argument, right? People said, “We’re not shooting 4K, we don’t want 4K.” Some people still argue we shouldn’t have HDR. They’re wasting energy on small issues.
The big issue is gonna be, what are we gonna generate, what are we gonna shoot, how’s that gonna change? And when we edit, are we gonna edit it, or is it gonna edit itself? Computational editing is gonna start to replace a lot of the creative editing at some level. I’m not trying to say it’s good or bad, I’m just reporting the weather here.
I don’t actually know how it’ll play out, but I’m trying to position myself from a posture of learning and listening, and I think Adobe’s doing the same thing, to try to learn and listen and position ourselves to be considerate of the situation, but it’s a lot of unknowns, and if we just sit back and see how it goes, I think a lot of creatives are gonna put themselves in a compromising situation. So my advice is to be proactive, learn, read, try it, test it and figure out where you can fit to make sure that on the other side of this transition, much like digital or 4K or files or tape or whatever, you are on the side that has growth and opportunity still affiliated with it.
Filmmaker: When you say you think we’re going to actually film less and produce overall less, are you thinking that production will turn into generating backgrounds and filming people and compositing them, and then that gradually shifts to maybe people being just generated too?
Cioni: A lot of people have seen camera tests where they have camera A and camera B. They post it and they go, “We shot a test of camera A and camera B. Can you tell me which one’s better? Can you tell me the difference?” People love doing that. What you’re going to see in less than three years is, “Is this human a human? Or is this just generated?”
Filmmaker: Is the pope really wearing a white puffy jacket?
Cioni: Right. When you can’t tell the difference and we take this beautiful generative AI and start to apply it at 24 images per second, which will inevitably happen, then all of a sudden we can start to generate actors. We can generate images. I mean, advertising agencies and fashion won’t pay for supermodels. They’ll just generate a supermodel wearing the items and they won’t have to paint out wrinkles. They just won’t generate wrinkles, right? It’ll be easier and faster to do that than to pay a model $1 million and a photographer $100,000 to shoot [the spot].
I know that disrupts the market, but this is what it’s going to do. Again, I’m not saying I’m necessarily in favor of this, I’m just saying it’s inevitable. When humans are actually generated in motion, you can’t tell the difference and it’s easy and fast, then would something like a sitcom rather pay seven actors $1 million an episode? Or just generate them and pay them nothing an episode, right?
Filmmaker: The one thing with the generative models, it’s based on stuff that humans have already made. So do you still feel like there’s a spot where human creativity comes in if the algorithms are possibly editing the videos in the future and generating the people, but it’s all based on things we created and fed it? Where does the human creativity come in?
Cioni: We generate what gets generated, I guess, right? And we draw from all these experiences. That’s part of what Adobe Firefly is trying to do: make sure it’s cognizant of where these images come from, where that stuff comes from.
Look, I don’t have the answers. I’m just saying that this is an opportunity where the right creatives will find a way to make sure that they’re relevant in this change. They always do. When digital still cameras came out, people said, “Oh, that’s the end of photography.” Not at all, right? It just changed. Same with digital cameras. “Oh, this is going to plummet the quality of cinema.” No, it actually made it better quality. The same thing will happen.
We just have to be open-minded to what this is going to unlock and figure out how to make sure that our creative impressions, our creative opportunities and our creative outlets are still there, but they’re going to be in a different envelope. That’s the opportunity, and that’s gonna ruffle some feathers. It’s going to create super new, amazing things we never thought of before.
Even the feed that you scroll is given to you via algorithms. What if the same show presents what Person A likes versus what Person B likes? If Person A likes a certain type of drink and shops at Ross, that’ll be in that version, and if Person B likes a different drink and shops at TJ Maxx, it’ll be that version. And it’s the same show. Those little nuances become possible and they become automatic. It just knows what zip code you’re in and what’s there, and if you love Taco Bell, your credit card knows that. I mean, all this stuff gets factored in. All of a sudden they’re eating Taco Bell on a show, and they’re eating at Chili’s in the same episode in a different region.
People gotta get behind that and make sure it works, make sure it looks real, make sure it suspends disbelief and you have great stories to tell. I think all that still plays out, but the way in which we deliver it is gonna be a little bit less hitting record and saying, “lights, camera, action.”
I think we’ve learned a little bit from the video game community, because they generate these images, videos, edits and stuff like that. These are all early versions of this. And things like Unreal Engine and Unity, they’re going to be really, really powerful things. I’m excited to see where it goes.
Filmmaker: Yeah, it’s really cool. This will be a good conversation to look back on 10 years from now and see what the deal is.
Joey Daoud is a media producer and founder of the agency New Territory Media. He also runs the free newsletter VP Land, covering virtual production and new video tech.