I know exactly why Facebook / Meta are researching this.
Just imagine the possibilities for advertisers: Instead of telling someone how happy they would be if only they bought your expensive car, let's just spam them with AI pictures of themselves sitting in said expensive car, ideally next to some very attractive other people that match their dating preferences.
Facebook has all the data they need to create very pleasant dream scenarios for you. And they have the connections to monetize those dreams. Didn't the Expanse have a scene with someone addicted to living in a fantasy world? I thought it was meant as a warning, but this wouldn't be the first time that an elaborate warning would be misunderstood as an instruction manual.
This is quite thought-provoking. I can totally see ads for, say, Disney World, where they put you in the picture instead of an actor. I mean, the whole goal of these ads is already to have you imagine yourself there. Putting you in the picture makes it that much easier.
I think the future will be a web browser running inside a VM, where the final DOM, including all referenced resources, goes through a filter before being rendered. That way, it's impossible for the website to detect whether you display the ads or just load all the necessary resources for rendering but mask them out.
They're entirely manual. A whole bunch of volunteers write filter rules to block known ads. There's a big GitHub repo where people can post issues about ads they've found, and volunteers will write filter rules to block them.
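For what it's worth, here's a rough sketch of what the masking pass in that VM idea could look like, assuming the final DOM has already been serialized to HTML by a headless browser and using BeautifulSoup. The selectors are made-up stand-ins for community filter rules, not real EasyList entries:

    # Rough sketch: mask (rather than remove) elements matching blocklist selectors
    # in an already-rendered, serialized DOM, so the page still sees its resources load.
    from bs4 import BeautifulSoup

    BLOCKLIST_SELECTORS = [
        "iframe[src*='ads']",   # hypothetical example rules, not real filter-list entries
        "div.sponsored",
        "[data-ad-slot]",
    ]

    def mask_ads(rendered_html: str) -> str:
        soup = BeautifulSoup(rendered_html, "html.parser")
        for selector in BLOCKLIST_SELECTORS:
            for node in soup.select(selector):
                # Swap the ad for an invisible placeholder with the same tag name
                # instead of deleting it outright, to make detection harder.
                placeholder = soup.new_tag(node.name)
                placeholder["style"] = "visibility:hidden"
                node.replace_with(placeholder)
        return str(soup)

Of course the hard part is everything around this (running the browser in the VM and re-rendering the filtered DOM without observable side effects), not the filtering pass itself.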
> Llama 3.1 doesn't really help Meta's bottom line
Not directly. But most genai needs text models. And Meta definitely doesn't want OpenAI or Google or someone else controlling the state of the art. Zuck is essentially preventing anyone from getting too big by making sure the only "quite good" options aren't locked behind someone's metered API.
Photographic images generated by these systems tend to look like the graffiti portraits you see on fairground attractions.
I've done a lot of photorealistic drawings, and the trick to making something look real is to get the tones exactly right. Misjudge a tone a bit, and the result looks like a mediocre drawing or a painting. In other words, the gradient of skin tones is off, which is ironic, I guess.
I assume there is a systematic error in (linearly?) interpolating colors (in the wrong color space?) somewhere, which could potentially be easy to fix and lead to improved photorealism. On the other hand, it might be a horrible problem to fix, because it would require accurate radiosity and raytracing to get right.
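A quick numpy check of what that "wrong space" error looks like: averaging two sRGB-encoded values directly versus decoding to linear light first gives noticeably different midpoints (roughly 0.55 vs 0.67 for the pair below). This is just the standard sRGB transfer function, nothing model-specific, and whether it explains anything about these generators is pure speculation:

    import numpy as np

    def srgb_to_linear(c):
        c = np.asarray(c, dtype=float)
        return np.where(c <= 0.04045, c / 12.92, ((c + 0.055) / 1.055) ** 2.4)

    def linear_to_srgb(c):
        c = np.asarray(c, dtype=float)
        return np.where(c <= 0.0031308, c * 12.92, 1.055 * c ** (1 / 2.4) - 0.055)

    dark, light = 0.2, 0.9                    # two sRGB-encoded gray levels
    naive_mid = (dark + light) / 2            # blending the encoded values directly
    linear_mid = linear_to_srgb((srgb_to_linear(dark) + srgb_to_linear(light)) / 2)
    print(naive_mid, float(linear_mid))       # ~0.55 vs ~0.67: the naive blend is darker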
I know what you mean, my theory is that it's an emergent property from RLHF tuning penalizing examples of bad/incoherent lighting, which pushes the model towards that kind of vague "lit from everywhere" style which is relatively easy to sell as correct without a proper understanding of light transport. It looks amateurish because that's the same trick an amateur human might use to try to sell photorealism without good lighting fundamentals.
Do these models really operate in RGB space? I would have thought that using a perceptual color space to generate images meant to be perceived by humans would be low hanging fruit.
As a total cluebie on generative art, I would assume that the neural networks involved use linear weights and ReLU only. If the training data and the output are in RGB pixels, then it would be reasonable to suppose that this introduces some bias.
It may not be enough to use a perceptual color space only. The gradients in skin tones, or any other complex texture, are non-linear due to lighting and curvature.
Is there someone in the room who does know how things work, and whether this hypothesis is wrong or not?
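Not an answer to whether production models actually do this, but to make the "perceptual color space" idea concrete: mechanically, switching a pixel-space reconstruction loss from RGB to CIELAB is a small change. The sketch below uses scikit-image's rgb2lab; whether it would actually help photorealism is exactly the open question here.

    import numpy as np
    from skimage.color import rgb2lab

    def rgb_mse(pred_rgb, target_rgb):
        # Plain per-pixel loss in sRGB space, float values in [0, 1], shape (H, W, 3).
        return np.mean((pred_rgb - target_rgb) ** 2)

    def lab_mse(pred_rgb, target_rgb):
        # The same loss, but computed after converting both images to CIELAB.
        return np.mean((rgb2lab(pred_rgb) - rgb2lab(target_rgb)) ** 2)

    pred = np.random.rand(16, 16, 3)
    target = np.random.rand(16, 16, 3)
    print(rgb_mse(pred, target), lab_mse(pred, target))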
I have a feeling this is partly due to training data being selected by an "aesthetic score". If weirdly airbrushed skin consistently has a high aesthetic score, that's what the model gets trained on.
Up until recently, to insert yourself into the output of an image generation model, you had to use a technique like DreamBooth, which involves finetuning the model itself to map the subject to a rare token.
Meta just released and productionized a new technique that doesn't require finetuning at all.
This enables a whole host of new possibilities...
People can now be inserted into scenes or outfits at will without any sort of time consuming model training.
Given how absurd Instagram/social media already is (entire cottage industries of "private jet" stages in warehouses, etc) it will arguably be a benefit for society when it completely jumps the shark and anyone can generate over the top ridiculousness in seconds.
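To make the contrast concrete, here's a toy PyTorch sketch of the two approaches. Nothing here is Meta's actual architecture; the modules are hypothetical stand-ins. DreamBooth-style personalization runs an optimization loop over the model's weights for every new subject, while a tuning-free approach just encodes a reference photo into a conditioning vector at inference time:

    import torch
    import torch.nn as nn

    class ToyGenerator(nn.Module):
        """Stand-in for a text-to-image model: conditioning vector in, 'image' out."""
        def __init__(self, cond_dim=64, img_dim=256):
            super().__init__()
            self.net = nn.Sequential(nn.Linear(cond_dim, 128), nn.ReLU(), nn.Linear(128, img_dim))
        def forward(self, cond):
            return self.net(cond)

    gen = ToyGenerator()
    subject_photos = torch.randn(4, 256)            # a few reference "images" of one person

    # DreamBooth-style: finetune the generator itself so a rare token maps to the subject.
    # This per-subject training loop is the slow, expensive part.
    rare_token = torch.randn(1, 64)                 # embedding of something like "sks person"
    opt = torch.optim.Adam(gen.parameters(), lr=1e-3)
    for step in range(100):
        loss = ((gen(rare_token.expand(4, -1)) - subject_photos) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()

    # Tuning-free: a frozen identity encoder turns one reference photo into a conditioning
    # vector at inference time, so no per-subject optimization happens at all.
    identity_encoder = nn.Linear(256, 64)           # hypothetical pre-trained, frozen encoder
    with torch.no_grad():
        cond = identity_encoder(subject_photos[:1])
        personalized = gen(cond)

Real systems obviously condition a diffusion model through something like cross-attention rather than a two-layer MLP, but the structural difference, a per-subject training loop versus a single forward pass through an identity encoder, is the point.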
Might want to update the HN title to reflect the paper title. It's really just applying multiple techniques that already exist. The paper's title is "Imagine yourself: Tuning-Free Personalized Image Generation." Nice paper though!
The future of Netflix isn't going to feature DiCaprio or Zendaya. It will be you, your wife, and your friends on the screen as hobbits adventuring to Mordor.
This hypothetical future gets brought up a lot, but would the novelty of something like that really hold up for more than one or two viewings? There's nothing stopping you from replacing the names in an eBook with the names of people you know personally, but beyond young children I can't see anyone actually being enamored by that.
While this may have some appeal, I think it'll be similar to the fake Time magazine covers that made it look like someone you knew was named Time's person of the year. Good for a chuckle, but not much more.
I think applying the same idea to video games makes more sense, especially given the autonomy you have in a video game, but even then, the appeal wears off pretty quickly.
Games have had features that let you put your likeness in the game before, and that feature probably isn't what people were buying the game for. Tony Hawk's Pro Skater 2 for Dreamcast allowed you to map a photograph of your face onto the in-game player. Odd example, but I actually just dusted off my old Dreamcast and remembered this feature the other day, because the 20-year-old game save had my 20-years-younger face on the main character. What I recall about that experience is that it felt special for about 2 minutes, and then I never thought about it again until just now, when I was briefly confused about why I was in the game before remembering.
It sounds like something people would commonly want, like the futuristic computer interface in Minority Report. It seems cool on the surface but falls apart the minute you imagine the reality of using it in practice. Your arms would get tired very quickly trying to control an interface in 3D space.
The idea that we'd someday have no more shared experience around media is harrowing and thankfully the public isn't actually calling for it.
This is actually the premise of an episode of Black Mirror in Season 6 called “Joan is Awful”. It's an interesting, dark take on the negatives that could arise from this - https://en.m.wikipedia.org/wiki/Joan_Is_Awful