Neutral means staying out of it. People will try to debate that and to impart their own views about correcting inherent bias or whatever, which is a version of what I was warning against in my original post.
Re being explicit about one's own biases, I agree there is lots of room for layers on top of any raw data that allow for some sane corrections - if I remember right, LAION, e.g., has options to filter violence and porn from their image datasets, which is probably reasonable for many uses. It's when the choice is removed altogether by some tech company's attitude about what should be censored or corrected that it becomes a problem.
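The kind of opt-in layer described here could be as simple as filtering a dataset's metadata table by a safety tag before downloading anything. A minimal sketch in Python - note the column names ("url", "safety_tag") and tag values are illustrative assumptions, not LAION's actual schema:

```python
# Sketch of an opt-in filtering layer over raw dataset metadata.
# Field names and tag values below are hypothetical, not LAION's real schema.
rows = [
    {"url": "http://example.com/a.jpg", "safety_tag": "UNLIKELY"},
    {"url": "http://example.com/b.jpg", "safety_tag": "NSFW"},
    {"url": "http://example.com/c.jpg", "safety_tag": "UNSURE"},
]

def filter_rows(rows, allowed=("UNLIKELY",)):
    """Keep only rows whose safety tag is in the caller-chosen allowed set."""
    return [r for r in rows if r["safety_tag"] in allowed]

# The user decides the policy: strict by default, or opt back in to everything.
strict = filter_rows(rows)
everything = filter_rows(rows, allowed=("UNLIKELY", "UNSURE", "NSFW"))
```

The point of this design is that the raw rows are all still there; the filter is a choice the downstream user makes, not one baked in upstream.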
Bottom line, the world's data has plenty of biases. Neutrality means presenting it as it is and letting people make their own decisions, not some faux-for-our-own-good attempt to "correct" it.
What do you mean by staying out of it? As far as I can tell, you can't stay out of choosing which data you use.
It seems to me that by "staying neutral" you're really arguing for putting blinders on.
In terms of tech companies making choices, you seem to be arguing that they shouldn't intentionally curate their datasets. I would argue that intentional curation is their job, and should be done thoughtfully.
Larger problems could happen if only one (or two) companies end up effectively controlling the technology, as has happened with internet search. However, that is a completely different problem: one of a lack of diversity among the people making choices, as opposed to a problem caused by people actually making those choices.
In other words, I think we should hope for many different large models and datasets, so that no particular one stifles the rest. I think this is the larger point you were trying to make, though I also think the focus on ideology is a tangent from this.
Personally, I'm of the opinion that people should intentionally, carefully, and openly act with their biases (sometimes called ideology), instead of attempting to hide them, ignore them, or somehow "remove" them. Whether or not they do, however, is a different point than whether or not things end up stifled inside walled gardens.