Project Alice – an open source virtual assistant that can run offline (github.com/project-alice-assistant)
169 points by jka on May 1, 2022 | 78 comments


Offline is the key here. I wrote a short story not so long ago [1] about the educational potential of a near-future world where storage gets so dense and cheap that the tide turns back toward local computing, away from the trend of the past 10 years toward the "online cloud". What happens when the corpus of all human knowledge really can sit on your table top as the early sci-fi writers predicted?

[1] https://www.timeshighereducation.com/opinion/2048-informatio...


> What happens when the corpus of all human knowledge really can sit on your table top as the early sci-fi writers predicted?

If reality is any indication, then if it's copyrightable it won't be sitting on your table top:

* Set of important facts about the world: not copyrightable. Yet so freely accessible I don't even bother to copy Wikipedia to my table top.

* Set of important modern books: copyrightable and mostly still under copyright. Also, probably digitized by Google. Not available for table top copying.

* Set of important music scores: copyrightable, but most are by dead composers, and most are out of copyright. Hooray! Except the IMSLP had to go offline for almost a year, seek legal help and create a staff-review process for uploads to make absolutely sure no publishers could sue it out of existence.

* That full video of Will Forte playing white supremacist "Hamilton Whiteman" at Seth Meyers's actual rehearsal dinner: copyrightable. I actually watched the whole thing at one point, but it apparently got a copyright claim so I can't share it with anyone. I just added this one on the off chance someone on here has a copy and feels daring, because dammit, it was funny, and I want to watch it again.

I feel like the tech world needs a decent political mover-and-shaker who could make a Glengarry Glen Ross style speech at this point: something about how with the tools you got available he could go out and make legal table top human knowledge boxes a reality tonight...


> If reality is any indication, then if it's copyrightable it won't be sitting on your table top:

Absolutely spot on. Indeed, the story was an oblique advocacy for SciHub and an argument as to how the digital entertainment industry and publishers may eventually be the greatest obstacle to human progress.


Consider Cyc [1].

[1] https://en.wikipedia.org/wiki/Cyc

It deduces relations from the set of facts in its database.

If you have human knowledge from Wikipedia on your desktop, you can deduce your own facts using Cyc's engine or something else.

It will be fun to tinker with. You can even start your own conspiracy theory, because you can.
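
To make "deducing relations from a set of facts" concrete, here is a minimal forward-chaining sketch. It is not Cyc's engine or its API; the triples, the two rules, and the name `forward_chain` are all made up for illustration, and a real knowledge base would need a far richer rule language.

```python
from itertools import product

# Toy facts as (subject, relation, object) triples; all names are hypothetical.
facts = {
    ("Socrates", "is_a", "human"),
    ("human", "subclass_of", "mortal"),
    ("mortal", "subclass_of", "thing"),
}


def forward_chain(facts):
    """Apply two hand-written rules repeatedly until no new triples appear."""
    known = set(facts)
    while True:
        new = set()
        for (a, r1, b), (c, r2, d) in product(known, repeat=2):
            if b != c:
                continue  # the two triples do not chain together
            # Rule 1: subclass_of is transitive.
            if r1 == "subclass_of" and r2 == "subclass_of":
                new.add((a, "subclass_of", d))
            # Rule 2: an instance of a class is also an instance of its superclasses.
            if r1 == "is_a" and r2 == "subclass_of":
                new.add((a, "is_a", d))
        if new <= known:
            return known  # fixed point reached
        known |= new


# Triples that were derived rather than stated up front.
derived = forward_chain(facts) - facts
for triple in sorted(derived):
    print(triple)
```

The loop stops at a fixed point, so the result depends only on the facts and rules, not on iteration order. Feed it bad facts and it will happily derive bad conclusions, which is the conspiracy-theory part.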


> * That full video of Will Forte playing white supremacist "Hamilton Whiteman" at Seth Meyers's actual rehearsal dinner: copyrightable. I actually watched the whole thing at one point, but it apparently got a copyright claim so I can't share it with anyone. I just added this one on the off chance someone on here has a copy and feels daring, because dammit, it was funny, and I want to watch it again.

Was the thing you're looking for longer than 2 minutes? I ask because the DDG results page is filled with 2-minute videos, and the top result is one that Seth himself throws to: https://duckduckgo.com/?q=will%20forte%20Hamilton%20seth


The set of all human knowledge doesn't seem to be accessible anywhere right now; it's not on the open internet at least.

Try "how do you make an entire digital camera starting with enough of the right kinds of sand?". Some of this is a trade secret, some of it is in other languages, some of it can't be explained by any person in the production process because you can't describe things you only know through muscle memory.


Really interesting. It makes me think about an AI that understands all the rules of physics and that you can ask questions the way you do with GPT-3: “design me an f1.4 lens that is low profile and projects light onto a curved 1-inch sensor from X feet” or “how do I design a magnetic field to contain plasma at a maximum density with power input below X watts/h”. We really do stand on the edge of a very strange set of possibilities happening…


>the educational potential of near-future world where storage gets so dense and cheap that the tide turns back toward local computing, away from the trend of the past 10 years toward "online cloud"

I'd argue that storage is already dense and cheap enough to "turn the tide back toward local computing."

The reason why the cloud (more precisely: someone else's servers) and all manner of non-local computing and storage became so entrenched was (at least in the US) that ISPs mostly offered (and still offer) asymmetric internet links for "consumer" plans, refuse to provide static IP addresses (IPv4 and IPv6) and actively discourage (via port blocking/throttling) running servers on such links.

And when download/upload bandwidth ratios (for my links they're 20 and 17) are so skewed towards downloads, hosting content becomes less viable.

Add poorly managed physical plant and resistance to spending money on upgrades, and you have unreliable links with inadequate bandwidth.

After decades of this, is it any wonder that centralized storage/compute/streaming services found a ready niche, and have exploited the anti-consumer/anti-competitive actions and policies of the big ISPs to the point that many folks don't realize it could be any other way?

Want to move back toward local computing? Make decently fast, symmetric internet links the norm. Getting that done (at least in the US) will be a tall order.

And even if/when we get there, such a shift won't happen all at once, especially since centralized services will fight tooth and nail to keep their customers (for paid services) and product (for "free" services) at our expense.

However, once decently fast symmetric bandwidth is the norm for "consumer" internet links, decentralization will follow.

It would likely take a while, especially since many (most?) of the developers of decentralized solutions (I'm looking at you, diaspora, mastodon and matrix, among others) make self-hosting their software inaccessible to non-technical folks[0].

There are many great reasons to move away from centralized compute/storage/streaming services, but there are significant barriers to doing so, at multiple levels of the network infrastructure. And more's the pity.

[0] https://hackernews.hn/item?id=30783477


I don't believe storage is the main limiting factor here - availability is. And whether you want to make your data available outside.


So imagine an IPv6 based world where every household has rock solid high speed networking capability, and a relatively smart firewall deciding what connections to allow and disallow without heavy handed management work.


The corpus of knowledge is forever growing and it grows faster than many people's internet connection.


That sounds like a truism, but actually I am not sure it follows. You may be thinking of data, but knowledge is different.

Consider the ordering of:

Signals (semantic/non-entropy (changes that have meaning))

Data (interpretable signals)

Information (data that informs)

Useful Information

Knowledge (justifiable, true and useful information in context)

Wisdom (knowledge applicable to human survival and evolution)

Given Sturgeon's Law that 99% of everything is rubbish at each level, "knowledge" is not growing as fast as you might think (and much human knowledge is lost at a terrible rate). Humans presently generate about 2-5 exabytes of raw data per year. Only a small fraction of that is information, considerably less is useful information, and some fraction of that, in the form of good quality books, journals, papers etc., encoded without redundancy, is probably on the order of 100s of gigabytes per year. For over 70 years a shelf of encyclopedias was considered an approximate summary of essential knowledge. It's not implausible that storage growth could overtake knowledge production at some point in the not too distant future.
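
As a rough check on that estimate, the filtering can be worked through as arithmetic. This is a back-of-envelope sketch only: the 2-5 exabyte figure and the 1% kept per level come straight from the comment above, while the function name `surviving_bytes` and the choice of three or four filtering levels are assumptions made for illustration.

```python
# Back-of-envelope only: inputs are the comment's own rough figures.
EXABYTE = 10**18
GIGABYTE = 10**9


def surviving_bytes(raw_bytes, keep_fraction, levels):
    """Bytes left after keeping `keep_fraction` of the data at each of `levels` steps."""
    return raw_bytes * keep_fraction**levels


for raw_eb in (2, 5):
    for levels in (3, 4):
        gb = surviving_bytes(raw_eb * EXABYTE, 0.01, levels) / GIGABYTE
        print(f"{raw_eb} EB raw, {levels} levels at 1% kept: about {gb:,.0f} GB")
```

With 1% kept per level, three levels leave roughly 2,000 to 5,000 GB per year and four levels leave roughly 20 to 50 GB, so "100s of gigabytes" sits between the two. The result is very sensitive to `keep_fraction`: each factor of ten in the kept share multiplies the answer by ten per level.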


>Then, given Sturgeon's Law that 99% of everything is rubbish at each level then "knowledge" is not growing as fast as you might think

Sturgeon's law was 90%. Also, that's a mighty big given to entrust to a snarky science fiction writer's quip explaining why so much science fiction in his time sucked.


It already can if it’s mostly text. Have you checked hard drive prices lately?


Not ALL of human knowledge, and many (most?) might cringe to consider it so, but the text-only English Wikipedia was 8 GB or so in 2013 when I put it onto an e-reader. It was handy to be able to read about places as I travelled to them without internet connectivity.

Might be more now but storage has likely outpaced it in growth.


I realize it's used as a tongue in cheek reference to Resident Evil, but the "Umbrella Logo" is, among other things, trademarked[1][2][3].

I'm not sure about corporate ownership status here, but it would appear that someone's idea of movie merchandise was to sell firearms by registering a company called "Umbrella Corporation Weapons Research Group" in the real world. I could find several "licensed" arms in online stores. Possibly they're doing that to protect their IP, not out of any profit motive.

The name of the project won't be a problem, but you might want to steer clear of that particular logo. Entertainment companies can be very touchy about their IP and are known to be litigious.

[1] https://trademarks.justia.com/owners/umbrella-corporation-we...

[2] https://trademarks.justia.com/owners/umbrella-corp-4226588/

[3] Possibly more, but there are thousands of logos labeled "Umbrella" and I don't want to spend all day looking.


True words, thank you. The project's logo and name predate my involvement, but I had my own thoughts about them as well.

I came to the conclusion that, with so many car stickers, unlicensed merch, and even companies using a similar logo, there shouldn't be an issue for a small open source project. Maybe a dangerous conclusion that should be revisited when Alice grows :)


I'm not sure I'd express it so harshly as TingPing. I like the project and hope it succeeds. Still, it seems somehow unwholesome and invites unnecessary controversy. A good logo represents the project or company, so what does it say that you all are associated with an, erm... borrowed logo?


That attitude doesn't reflect well on the project IMO. Very short-sighted and unprofessional.


Why would they need to be professional? It's an open source hobby project. People are allowed to code for fun.


Hey if you want it, I threw together an umbrella inspired logo that might be a good alternative. It's close, but should be different enough to be distinguishable.

https://i.imgur.com/aPMBndB.png

CC0 and all that of course.


Thank you, my first thought was a mixture of Umbrella Corp and the scanner drones from Half-Life 2 :)

I put it up for discussion. We don't want to cling to our current design, but we aren't sure yet about a good way to approach a redesign. We need a bit of time to decide which aspects of Alice as a whole should be redesigned.


Now that you mention it, it does remind me of the city scanner too haha

Well if you do go for it or want to make any tweaks to it I've now uploaded the svg source too: https://drive.google.com/file/d/1FIICC0ySnwzBB5NdWcDX3uXDdwO...

I hope you find something that fits :)


are you… also using the battle.net logo on the skill store? why

you’re like actively looking for trouble lol


Appears to be included in font-awesome, which warns that it is trademarked: https://fontawesome.com/icons/battle-net?s=brands

Someone probably just scrolled through available icons and ended up picking that one.


This, plus skill icons can be chosen freely from Font Awesome during skill creation. Maybe we should exclude the fa-brands space from the allowed icons. Another good hint, thank you! And yep, the team dearly misses some creative heads for design, as everybody involved is from the coding area :)
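
Excluding brand icons could be as simple as rejecting the brand style prefix when a skill's icon is validated. The sketch below is not Project Alice's actual code: `is_allowed_icon` and `BRAND_STYLE_CLASSES` are hypothetical names, and it assumes icons are stored as Font Awesome CSS class strings, where brand icons carry the `fab` prefix (version 5 syntax) or `fa-brands` (version 6 syntax).

```python
# Style prefixes that mark Font Awesome brand icons:
# "fab" (version 5 syntax) and "fa-brands" (version 6 syntax).
BRAND_STYLE_CLASSES = {"fab", "fa-brands"}


def is_allowed_icon(css_classes: str) -> bool:
    """Return False when the class string selects a brand-style icon.

    Hypothetical helper: assumes the icon is given as a space-separated
    CSS class string such as "fas fa-umbrella".
    """
    tokens = set(css_classes.lower().split())
    return not (tokens & BRAND_STYLE_CLASSES)


# Example: keep only non-brand icons from a list of candidates.
candidates = ["fas fa-umbrella", "fab fa-battle-net", "fa-brands fa-battle-net"]
allowed = [icon for icon in candidates if is_allowed_icon(icon)]
print(allowed)
```

Checking the style prefix rather than keeping a list of individual brand names means newly added brand icons are excluded without further maintenance.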


Because you can use any Font Awesome icon (I'm an early backer of it) when you are creating a skill for Alice using our backend skill creator in her browser interface. We could exclude the use of fa-brands.

Edit: https://www.blizzard.com/fr-fr/legal/8bcb0794-6641-4ce3-a573...