Hacker News | malshe's comments


> Now I have a local folder where I drop my 1 student list, with names and emails, 2 my loose notes, and 3 a qualification & feedback sheet model; then claude creates a sheet per student, formats and copies the feedback to the right sheet cell, waits for my corrections, then sends everything to their school emails

Yikes! Is this legal in your country?


I've built a small system to do this anonymously. There is a students.csv with real data, a notes.txt that contains my unstructured comments and grades associated with ids (not names or student data), and a model.ods that contains the grading sheet model.

Claude takes the notes.txt and produces a json with corrected comments in the structure I asked for (highlights/needs work/grade), associated with student ids (not real data). This works for a single id, or for multiple ids in the case of group assignments.

Then a script takes the json, creates a model sheet per student or group of students, fills the right cells, checks the ids against students.csv to fill the real names, and produces the pdf in a pdf/ folder.
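A minimal sketch of the id-to-name step described above, for illustration only. The column names (id, name, email), the json keys (ids, highlights, needs_work, grade) and the function names are assumptions, not the actual script; the point is that real names are only joined locally, after Claude has produced the id-keyed json.

```python
import csv
import io
import json


def load_names(csv_text):
    """Map student id -> real name from students.csv content.

    Assumes a header row with the columns id, name, email (hypothetical layout).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["id"]: row["name"] for row in reader}


def attach_names(feedback_json, names):
    """Resolve real names locally for each feedback entry; fail on unknown ids."""
    entries = json.loads(feedback_json)
    resolved = []
    for entry in entries:
        ids = entry["ids"]  # one id for individual work, several for a group
        missing = [i for i in ids if i not in names]
        if missing:
            raise KeyError(f"ids not found in students.csv: {missing}")
        resolved.append({**entry, "names": [names[i] for i in ids]})
    return resolved


# Made-up sample data standing in for students.csv and Claude's json output
students_csv = (
    "id,name,email\n"
    "s01,Ada Example,ada@example.edu\n"
    "s02,Bo Example,bo@example.edu\n"
)
feedback = json.dumps([
    {"ids": ["s01"], "highlights": "Clear structure", "needs_work": "Citations", "grade": 8},
    {"ids": ["s01", "s02"], "highlights": "Good teamwork", "needs_work": "Testing", "grade": 7},
])

resolved = attach_names(feedback, load_names(students_csv))
for r in resolved:
    print(r["names"], r["grade"])
```

Raising on an unknown id, rather than skipping it, keeps a typo in the notes from silently dropping a student's feedback.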

Another script sends the pdfs.

I gitignore the sensitive files, including an .env with the SMTP password, and denied Claude permission to read those files using a rule in .claude/settings.json.

There is also a config file to change language, email text and other things.

I believe this is safe and compliant with GDPR, unless Claude ignores the deny rules! Any comments appreciated. Thanks.


Possibly not, or at best a grey area under GDPR if I use identifiable information, since it is sent to Anthropic for processing whether or not it is used for training. I am unsure about this; I should probably anonymize and research it more. Thanks for pointing it out.

You could just send Anthropic scrambled names / emails and then unscramble locally?

Yes, something like that. Additionally, most steps no longer require data to go through Claude, as it already wrote the script that takes the student list and the qualifications model and produces a model per student, and the script that takes those and sends each to the right email. The problematic part is when Claude reads my notes and formats them into each of those student qualification sheets. There I would need some form of scrambling, as you suggest. Not to hijack the thread, but ideas for a minimal setup are appreciated. I believe Claude respects .gitignore.

Maybe you could run a local script or smaller local model that takes a first pass through the notes and replaces every instance of a given name with their assigned number?
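A rough sketch of what such a local first pass could look like, assuming a hand-maintained mapping from names (and nicknames) to assigned ids. The function name, mapping and sample notes are hypothetical, and plain string matching will miss misspellings or indirect references, so the output still needs a manual check before anything is sent out.

```python
import re


def pseudonymize(text, name_to_id):
    """Replace every whole-word occurrence of a known name with its assigned id."""
    if not name_to_id:
        return text
    # Longest names first, so "Ada Example" is matched before the shorter "Ada"
    names = sorted(name_to_id, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(n) for n in names) + r")\b",
        re.IGNORECASE,
    )
    lookup = {name.lower(): student_id for name, student_id in name_to_id.items()}
    return pattern.sub(lambda m: lookup[m.group(0).lower()], text)


# Made-up mapping and notes for illustration
mapping = {"Ada Example": "s01", "Ada": "s01", "Bo": "s02"}
notes = "Ada Example: strong intro. bo needs more tests. Ada helped Bo."
print(pseudonymize(notes, mapping))
```

The word-boundary match keeps a short name like "Bo" from being replaced inside unrelated words such as "about".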

I shared a workflow above, thank you.

There is another institution I teach at that gives us Gemini, but not via API, which limits its use for this kind of work to an extent; I could do it via Drive, I assume. Having a contract makes the institution and Google responsible for the data. The first institution I was talking about has MS Teams, without AI afaik, but if they contract it I guess I can do the same with SharePoint, etc.

Sorry to tell you, but it’s not a grey area, it’s full-on black. You do not have permission to share such data with a third-party provider unless it offers strict privacy guarantees and you have a data processing agreement with it. TOS are not sufficient.

Yes, thank you, I developed and shared, above, a workflow for anonymization.

Property taxes are relatively high in Texas but houses are so much cheaper. Also, the cost of living in Texas is far lower than in CA. From an economic perspective, living in Texas is a no brainer.

Houses are cheaper because there is no appreciation.

CA appreciation is high, and the appreciation more than covers the tax, though.


Someone asked him this down the thread. He replied that it was said to him. So it’s a first party account.

It's interesting he provides quotes for the other stories, but not this one. And still doesn't in the thread. Again, I'm not debating the authenticity, but is it possible he's inferring that based on something not-so-explicit being said?

I'd love to see/hear the words that people actually say when I hear stories like "they said they wouldn't [invest/buy/etc] from me because I'm a woman".


When I studied comp sci in the early 00s, a prof just flat out told us in a male-only class that women had no place in comp sci. I'm not at all surprised that shitty men are open about their thoughts on women when they assume that they are talking to people who agree with them.

In 2015, working at a software consultancy. Led a small project, delivered it, client was happy, moved onto the next project, thought nothing more of it.

At the company Christmas party that year (which had clients invited, for reasons I do not know) - that client merrily said to my face "boy, I thought the project was going to fail with a woman in charge, but you sure proved me wrong!"

Apparently I had murder written all over my face, and coworkers who overheard were impressed I didn't deck him.


Maybe he wrote his first tweet then thought of putting direct quotes in the next two. Not everything needs to be a conspiracy.



That link is throwing me:

  "Scan this QR code with your mobile device to verify you are human. reCAPTCHA protects your privacy and does not share your details with this website or app."
Is that a new recaptcha thing? I've never seen that before.

Yeah, it's Google's second attempt at device attestation (i.e. the widely panned web integrity tech they're trying to push through).

It's why I switched PortableApps.com to hCaptcha

Yeah, there's been some debate about it in the news, because only a certified device can approve it, meaning this takes away the ability of anyone on an open-source platform to prove they're human.

Footnote 1 mentions:

Actually, the title of this paper is unproven. We have not ruled out the possibility that a single neuron could ride a bicycle.


Deepseek has been great for my academic research. I used it recently for a large-scale RAG task with 10-Ks going back to 1995. I tried different models from various providers to get a sense of the accuracy-cost tradeoff. GPT 5 mini was the best, but it would still have cost close to an estimated $10k, which is way beyond my research budget or what my department would pay. Then Deepseek V4 Pro was released and I tried it when they announced the pricing. The results were way better than GPT 5 mini and the cost was unbelievably low. I went for it, rushing to finish the job before their promo period ended (which they made permanent anyway). It cost me only $450! Those are insane savings compared to GPT 5 mini.

I wasn’t concerned about data privacy as 10-Ks are public and quite likely part of the model training.


Makes sense in terms of public information. But the thing is, if you use foreign-hosted tech, you don’t know what it is doing with ALL the data you signed up with. You can use Chinese models hosted on US servers (like together.ai; there are others). You’ll pay more than on the native Chinese app sites, but you will know that you aren’t feeding personal data into an ominous black hole. One simple example is to think about what “dossier” they could build with your credit card data and search data. Nothing good can come from that.

And those US hosted versions of DeepSeek etc. are still far cheaper than the big (frontier) US companies... just not as cheap as the Chinese hosted sites.

I will check them out. I would always prefer keeping my data in the US. So far I haven’t used Deepseek for anything else but for future applications it will certainly create issues.


