Tuning-Free Personalized Image Generation (meta.com)
82 points by LarsDu88 3 months ago | 45 comments



I know exactly why Facebook / Meta are researching this.

Just imagine the possibilities for advertisers: Instead of telling someone how happy they would be if only they bought your expensive car, let's just spam them with AI pictures of themselves sitting in said expensive car, ideally next to some very attractive other people that match their dating preferences.

Facebook has all the data they need to create very pleasant dream scenarios for you. And they have the connections to monetize those dreams. Didn't The Expanse have a scene with someone addicted to living in a fantasy world? I thought it was meant as a warning, but this wouldn't be the first time that an elaborate warning was misunderstood as an instruction manual.


This is quite thought-provoking. I can totally see ads for, say, Disney World, where they put you in the picture instead of an actor. I mean, the whole goal of these ads is already to have you imagine yourself there. Putting you in the picture makes it that much easier.


lol, if it is a good enough ad

Just add it to my Instagram timeline and I can skip the trip and the cost. Everyone else (including me in 30 years) thinks I went.


We sell text-to-image model finetuning (aka "Dreambooth") as a service and yes, this is one of the use cases.

Recently a travel agency used our platform to generate images of people in the destinations they were advertising.


Ew


This is a much bigger thing than the Llama 3.1 release. Llama 3.1 doesn't really help Meta's bottom line.

But content creation and ads are Meta's killer app. By having a model that doesn't require finetuning, they just changed the whole game.


Where do I sign up for my personalized AI content filter bot which can reliably detect ads and remove them from my browser?


I think the future will be a web browser running inside a VM, where the final DOM, including all referenced resources, goes through a filter before being rendered. That way, it's impossible for the website to detect whether you display the ads or just load all the necessary resources for rendering but mask them out.
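
A rough sketch of that filter-the-final-DOM idea, assuming a headless browser (Playwright here) and a hard-coded, made-up list of ad selectors; a real filter would obviously need something far smarter than fixed CSS selectors:

```python
# Sketch: render a page headlessly, strip suspected ad elements from the
# final DOM, then hand the cleaned HTML to whatever actually displays it.
# The selector list below is a made-up placeholder, not a real blocklist.
from playwright.sync_api import sync_playwright

AD_SELECTORS = ["iframe[src*='doubleclick']", ".ad-banner", "[id^='google_ads']"]

def fetch_filtered(url: str) -> str:
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto(url, wait_until="networkidle")
        # Remove matching elements only after the page (and its ads) has
        # loaded, so the site still sees all resource requests go out.
        for selector in AD_SELECTORS:
            page.evaluate(
                "sel => document.querySelectorAll(sel).forEach(e => e.remove())",
                selector,
            )
        html = page.content()
        browser.close()
    return html

if __name__ == "__main__":
    print(fetch_filtered("https://example.com")[:500])
```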


The future will be “AI PCs” with a powerful on-device chip that can filter out on-screen ads, but enabled only by subscription.


Hmmm, that'd be an interesting startup idea!

How do ad blockers work exactly?


They're entirely manual. A whole bunch of volunteers write filter rules to block known ads. There's a big GitHub repo where people can post issues about ads they've found, and volunteers will write filter rules to block them.

See https://github.com/easylist/easylist/issues
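
For a sense of what those filter rules look like, here is a toy matcher for a couple of EasyList-style network rules (the `||domain^` anchor syntax). Real ad blockers implement a much richer grammar, plus element-hiding rules like `##.ad-banner`; the rules below are illustrative, not taken from EasyList:

```python
import re

# A few rules in (simplified) EasyList network-filter syntax.
# "||" anchors to a domain boundary, "^" matches a separator or end of URL.
RULES = [
    "||doubleclick.net^",
    "||adservice.example.com^",
    "/banner_ads/",
]

def rule_to_regex(rule: str) -> re.Pattern:
    pattern = re.escape(rule)
    pattern = pattern.replace(re.escape("||"), r"^https?://([a-z0-9-]+\.)*")
    pattern = pattern.replace(re.escape("^"), r"([/:?#]|$)")
    return re.compile(pattern)

COMPILED = [rule_to_regex(r) for r in RULES]

def is_blocked(url: str) -> bool:
    return any(r.search(url) for r in COMPILED)

print(is_blocked("https://ads.doubleclick.net/pixel"))    # True
print(is_blocked("https://example.com/banner_ads/1.js"))  # True
print(is_blocked("https://example.com/article"))          # False
```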


This does seem like something AI could automate.


Digital immune system


> Llama 3.1 doesn't really help Meta's bottom line

Not directly. But most generative AI needs text models. And Meta definitely doesn't want OpenAI or Google or someone else controlling the state of the art. Zuck is essentially preventing anyone from getting too big by making sure the only "quite good" options aren't locked behind someone's metered API.




OP here: Thanks for changing the title as well!


Thanks. The original link will expire after some time; this will be really helpful when that happens.


Photographic images generated by these systems tend to look like the graffiti portraits you see on fairground attractions.

I've done a lot of photorealistic drawings, and the trick to making something look real is to get the tones exactly right. Misjudge a tone a bit and the result looks like a mediocre drawing or a painting. In other words, the gradient of skin tones is off, which is ironic, I guess.

I assume that there is a systemic error in (linearly?) interpolating colors (in the wrong color space?) somewhere, which potentially could be easy to fix and lead to improved photorealism. On the other hand, it might be a horrible problem to fix, because it would require accurate radiosity and raytracing to get right.
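
As a small illustration of that suspicion (not a claim about what these models actually do internally), here is the difference between averaging two sRGB colors directly and averaging them in linear light; the naive sRGB midpoint comes out noticeably darker than the physically sensible blend. The two colors are arbitrary placeholders:

```python
# Averaging gamma-encoded sRGB values directly vs. in linear light.
def srgb_to_linear(c: float) -> float:
    c /= 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def linear_to_srgb(c: float) -> int:
    s = 12.92 * c if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055
    return round(s * 255)

a, b = (230, 170, 140), (60, 35, 25)  # two arbitrary skin-ish tones

naive = tuple(round((x + y) / 2) for x, y in zip(a, b))
linear = tuple(
    linear_to_srgb((srgb_to_linear(x) + srgb_to_linear(y)) / 2)
    for x, y in zip(a, b)
)

print("midpoint in sRGB space:  ", naive)   # darker than you might expect
print("midpoint in linear light:", linear)  # brighter, more plausible blend
```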


I know what you mean, my theory is that it's an emergent property from RLHF tuning penalizing examples of bad/incoherent lighting, which pushes the model towards that kind of vague "lit from everywhere" style which is relatively easy to sell as correct without a proper understanding of light transport. It looks amateurish because that's the same trick an amateur human might use to try to sell photorealism without good lighting fundamentals.


The fact that these models rely heavily on classifier-free guidance has a strong impact on the tones of the image.
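
For reference, classifier-free guidance combines the conditional and unconditional noise predictions at each sampling step roughly like this (schematic tensors only; the guidance scale of 7.5 is just a common default, not anything specific to this paper):

```python
import torch

def cfg_step(eps_uncond: torch.Tensor, eps_cond: torch.Tensor,
             guidance_scale: float = 7.5) -> torch.Tensor:
    """Standard classifier-free guidance combination.

    High guidance scales push samples toward the prompt but are also known
    to exaggerate contrast and saturation, which is one theory for the
    characteristic "AI look" in tones.
    """
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

# Toy example with random "noise predictions" of the shape a latent UNet
# might produce (batch, channels, height, width).
eps_u = torch.randn(1, 4, 64, 64)
eps_c = torch.randn(1, 4, 64, 64)
guided = cfg_step(eps_u, eps_c, guidance_scale=7.5)
print(guided.shape)  # torch.Size([1, 4, 64, 64])
```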


Exactly. No humans are involved in most of the process.


It doesn’t help that RGB is very badly tuned for many skin tones.


Do these models really operate in RGB space? I would have thought that using a perceptual color space to generate images meant to be perceived by humans would be low hanging fruit.


As a total cluebie on generative art, I would assume that the neural networks involved use linear weights and ReLU only. If the training data and the output are in RGB pixels, then it would be reasonable to suppose that this introduces some bias.

It may not be enough to use a perceptual color space only. The gradients in skin tones, or any other complex texture, are non-linear due to lighting and curvature.

Is there someone in the room who does know how things work, and whether this hypothesis is wrong or not?


At the very least they ultimately output to RGB. The fleshtone part of the spectrum is quite small.


I have a feeling this is partly due to training data being selected by an "aesthetic score". If weirdly airbrushed skin consistently has a high aesthetic score, that's what the model gets trained on.


Up until recently, to insert yourself into an image generation algorithm, you had to use a technique like Dreambooth, which involves finetuning the model itself with a new mapping of the subject to a rare token.

Meta just released and productionized a new technique that doesn't require finetuning at all.

This enables a whole host of new possibilities... People can now be inserted into scenes or outfits at will, without any sort of time-consuming model training.
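
For context, Meta's paper has its own architecture, but the general flavor of tuning-free personalization (IP-Adapter-style approaches, for example) is to encode the subject photo once with a frozen image encoder and inject the resulting tokens into the diffusion model's cross-attention alongside the text prompt, with no weight updates per subject. A minimal, hypothetical sketch of that decoupled cross-attention idea:

```python
import torch
import torch.nn as nn

class DecoupledCrossAttention(nn.Module):
    """Toy illustration: latent queries attend to text tokens and, separately,
    to image-derived subject tokens; the two results are summed. No weights
    need retraining per subject -- only a new subject embedding is supplied
    at inference time."""

    def __init__(self, dim: int = 64, heads: int = 4, image_scale: float = 0.6):
        super().__init__()
        self.text_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.image_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.image_scale = image_scale  # how strongly the subject is injected

    def forward(self, latent, text_tokens, subject_tokens):
        text_out, _ = self.text_attn(latent, text_tokens, text_tokens)
        subj_out, _ = self.image_attn(latent, subject_tokens, subject_tokens)
        return text_out + self.image_scale * subj_out

# Dummy tensors standing in for UNet latents, text-encoder output, and the
# output of a frozen image encoder run on the user's photo.
latent = torch.randn(1, 256, 64)        # (batch, latent tokens, dim)
text_tokens = torch.randn(1, 77, 64)    # e.g. CLIP text sequence length
subject_tokens = torch.randn(1, 4, 64)  # a handful of subject tokens

block = DecoupledCrossAttention()
print(block(latent, text_tokens, subject_tokens).shape)  # torch.Size([1, 256, 64])
```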


This will be great for people on Instagram.


Given how absurd Instagram/social media already is (entire cottage industries of "private jet" stages in warehouses, etc.), it will arguably be a benefit for society when it completely jumps the shark and anyone can generate over-the-top ridiculousness in seconds.


>Up until recently, to insert yourself into an image generation algorithm, you had to use a technique like Dreambooth

I mean, not really, you could just train a LoRA for example (it doesn't require training with Dreambooth).


Well, the point is, both LoRA and Dreambooth require fine-tuning the model (i.e., training).
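
To make the distinction concrete: a LoRA only learns a small low-rank delta on top of frozen base weights, but that delta still has to be trained on photos of the subject. A generic sketch of the idea (hypothetical, not tied to any particular diffusion codebase):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen base linear layer plus a trainable low-rank update.

    Personalizing with a LoRA means optimizing only A and B on images of the
    subject -- far fewer parameters than full Dreambooth fine-tuning, but it
    is still gradient-based training."""

    def __init__(self, base: nn.Linear, rank: int = 4, alpha: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # the base model stays frozen
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(64, 64))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # only the low-rank A and B matrices: 2 * 4 * 64 = 512
```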


To be clear for folks, this is "fine-tuning" ;) DreamBooth from 2022: https://dreambooth.github.io.

Might want to update the HN title to reflect the paper title. It's really just applying multiple techniques that already exist. The paper's title is "Imagine yourself: Tuning-Free Personalized Image Generation." Nice paper though!


They didn't "release" anything, it's a paper.


The future of Netflix isn't going to feature DiCaprio or Zendaya. It will be you, your wife, and your friends on the screen as hobbits adventuring to Mordor.


This hypothetical future gets brought up a lot, but would the novelty of something like that really hold up for more than one or two viewings? There's nothing stopping you from replacing the names in an eBook with the names of people you know personally, but beyond young children I can't see anyone actually being enamored by that.


While this may have some appeal, I think it'll be similar to the fake Time magazine covers that made it look like someone you knew was named Time's person of the year. Good for a chuckle, but not much more.

I think applying the same idea to video games makes more sense, especially given the autonomy you have in a video game, but even then, the appeal wears off pretty quickly.

Games have had features that let you put your likeness in the game before, and that feature probably isn't what people were buying the game for. Tony Hawk's Pro Skater 2 for Dreamcast allowed you to map a photograph of your face onto the in-game player. Odd example, but I actually just dusted off my old Dreamcast and remembered this feature the other day, as the 20-year-old game save had my 20-years-younger face on the main character. What I recall about that experience was that it felt special for about 2 minutes, and then I never thought about it again until feeling confused about why I was in the game before remembering.


Is this a common desire? I have absolutely no interest in watching myself inserted into a film or TV show.


It sounds like it would be a common desire, like when you see the futuristic computer interface in Minority Report. It seems cool on the surface but falls apart the minute you imagine the reality of using it in practice. Your arms would get tired very quickly trying to control an interface in 3D space.

The idea that we'd someday have no more shared experience around media is harrowing and thankfully the public isn't actually calling for it.


No, it's why we invented the phrase "main character syndrome" for people who exhibit this behavior.


This is actually the premise of an episode of Black Mirror in Season 6 called "Joan Is Awful". It shows an interestingly dark take on the negatives that could potentially arise from this: https://en.m.wikipedia.org/wiki/Joan_Is_Awful


I do not agree, and think that most people don't want this.


Why on earth would I want to go to the movies and watch myself?


nobody wants that


The last example in the paper, with the boy and girl, definitely has faking-a-girlfriend vibes.



