Hacker News | dalke's comments

The README says "PyMOL-RS is a clean-room rewrite" but when I look at ./pymol-mol/src/dss.rs I see things like:

  //! - PyMOL's layer3/Selector.cpp - SelectorAssignSS function
  //! - PyMOL's layer2/ObjectMolecule2.cpp - ObjectMoleculeGetCheckHBond function
  //! - PyMOL's layer1/SettingInfo.h - Default angle thresholds
and "matching PyMOL's cSS* flags from Selector.cpp"

While the Rust code is cleaned up and easier to read, I can see that it preserves similar data flow, uses similar variable names, and of course identical constants.

For example, this is PyMol layer3/Selector.cpp:

          /* look for antiparallel beta sheet ladders (single or double) 
          ...
          */

          if((r + 1)->real && (r + 2)->real) {

            for(b = 0; b < r->n_acc; b++) {     /* iterate through acceptors */
              r2 = (res + r->acc[b]) - 2;       /* go back 2 */
              if(r2->real) {

                for(c = 0; c < r2->n_acc; c++) {

                  if(r2->acc[c] == a + 2) {     /* found a ladder */

                    (r)->flags |= cSSAntiStrandSingleHB;
                    (r + 1)->flags |= cSSAntiStrandSkip;
                    (r + 2)->flags |= cSSAntiStrandSingleHB;

                    (r2)->flags |= cSSAntiStrandSingleHB;
                    (r2 + 1)->flags |= cSSAntiStrandSkip;
                    (r2 + 2)->flags |= cSSAntiStrandSingleHB;

                    /*                  printf("anti ladder %s %s to %s %s\n",
                       r->obj->AtomInfo[I->Table[r->ca].atom].resi,
                       r->obj->AtomInfo[I->Table[(r+2)->ca].atom].resi,
                       r2->obj->AtomInfo[I->Table[r2->ca].atom].resi,
                       r2->obj->AtomInfo[I->Table[(r2+2)->ca].atom].resi); */
                  }
                }
              }
            }
and this is pymol-rs's pymol-mol/src/dss.rs

        // Antiparallel ladder: i accepts j, (j-2) accepts (i+2)
        if a + 2 < n_res && res[a + 1].real && res[a + 2].real {
            for &acc_j in &acc_list {
                if acc_j < 2 || !res[acc_j].real {
                    continue;
                }
                let j_minus_2 = acc_j - 2;
                if !res[j_minus_2].real {
                    continue;
                }
                let acc_jm2_list: Vec<usize> = res[j_minus_2].acc.clone();
                for &acc_k in &acc_jm2_list {
                    if acc_k == a + 2 {
                        res[a].flags |= SsFlags::ANTI_STRAND_SINGLE_HB;
                        res[a + 1].flags |= SsFlags::ANTI_STRAND_SKIP;
                        res[a + 2].flags |= SsFlags::ANTI_STRAND_SINGLE_HB;
                        res[j_minus_2].flags |= SsFlags::ANTI_STRAND_SINGLE_HB;
                        if acc_j >= j_minus_2 + 2 {
                            res[j_minus_2 + 1].flags |= SsFlags::ANTI_STRAND_SKIP;
                        }
                        res[acc_j].flags |= SsFlags::ANTI_STRAND_SINGLE_HB;
                    }
                }
            }
        }
That's close enough that I really think you should include the PyMol license info, before Schrödinger's lawyers notice.


Over the last few days I have learned that code generation tools are increasingly used to create a "clean room" version of a product, using a definition which is far from its standard use.

See https://tuananh.net/2026/03/05/relicensing-with-ai-assisted-... with discussion at https://hackernews.hn/item?id=47257803

I believe your use of "clean room" is another example of misusing the term.

Could you clarify how it was developed? Who had access to the original source code? Were code generation tools used, and if so, how? Was the PyMol source part of the training set for those tools? How did you ensure no copyright violations?

Warren was a friend of mine, and a passionate believer in open source software. He wanted people to be able to modify PyMol for their own purposes, and asked only for a license acknowledgment. Schrödinger, to their great credit, continues to honor Warren by maintaining the Open-Source PyMOL product.

If this project was not developed under true clean room practices, I ask that you continue to honor his work by including the PyMol license in your Rust rewrite.

If it was true clean room development, why does ./crates/pymol-algos/src/align/ce.rs say "This is a faithful port of PyMOL's `ccealignmodule.cpp`.", with comments like "Equivalent to PyMOL's `calcS`" and references to the original code in comments, like: "PyMOL: for (row = 0; row < wSize - 2; row++)"?


Dear Andrew, first of all — thank you so much for your feedback, both the technical and the legal parts. Your earlier comments about SDF and PDB parsing corner cases are incredibly valuable.

PyMOL has been one of my primary tools for 15 years, and I've always held it in deep respect. This project was born entirely out of a desire to contribute something to molecular visualization in the modern world — something fast, modular, and with qualities I've been missing in existing tools. And as a source of inspiration, I took the best one: PyMOL.

Of course I spent a lot of time reading and studying its code, and I openly took concepts and algorithms from it. I don't hide that — it's why the project carries the name it does, and it's why the README has had an Acknowledgments section since the very first commit: "Inspired by PyMOL, created by Warren Lyford DeLano. This is an independent reimplementation, not affiliated with Schrödinger, Inc."

You are absolutely right about the "clean-room" wording — I used it loosely, meaning "rewritten from scratch in a different language with a different architecture," not in the legal sense. That was misleading, and I've already removed it from the README.

You're also right that DSS and CE were fairly direct ports of PyMOL's algorithms, and they should carry proper attribution. At the same time, many other parts (surface generation, cartoon rendering, the shading pipeline) are done quite differently, and the gap keeps growing. But that doesn't excuse insufficient attribution where code was closely followed.

Going forward, I'm focusing on genuinely new functionality — Rust plugin system, web interface, novel shading models (try set shading_mode, skripkin!) — things the original PyMOL never had. But this is not an attempt to distance the project from DeLano's creation. It's a respectful continuation of his ideas in a completely new product.

Thank you again — your comments are making this project better.


You might mention it in other forums, like the RDKit mailing list (though that's almost moribund).

I looked at the SDF reader, since that's what I know best. I see a few things which look like they need revisiting.

Line 75 has 'if name == "$$$$" {return self.parse_molecule();}' This isn't correct. This means the record name is "$$$$" (if you are RDKit), or it means the record is in the wrong format (if you are the CTFile specification, which explicitly prohibits that).

Also, does Rust have tail recursion? If not, the recursive nature of the code makes me think parsing a file containing 1 million lines of the form "$$$$\n" would likely blow the stack.

In principle the version number test for V2000 or V3000 should look at the specific column numbers, and not presence somewhere in the line. Someone like me might place a "V3000" in the obsolete fields, with a "V2000" in the correct vvvvvv field. ;)
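A minimal sketch of the column-based check, in Python for brevity. It assumes the V2000 counts-line layout of eleven three-character fields followed by the six-character vvvvvv field; the function name is made up for illustration and is not from pymol-rs.

```python
def counts_line_version(line):
    """Return the version tag found in the fixed vvvvvv field of a counts line."""
    # Eleven 3-character fields come first, so vvvvvv occupies
    # columns 34-39 (1-based), i.e. the slice [33:39].
    return line[33:39].strip()


# A counts line with "V3000" hidden in the obsolete fields and
# "V2000" in the real version field.
tricky = "  3  2  0  0  0" + "V3000".ljust(15) + "999" + " V2000"
version = counts_line_version(tricky)
```

A substring test such as `"V3000" in tricky` would pick the wrong version for this line, while the column slice does not.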

The "Skip to end of molecule" code will break on real-world datasets. One classic problem is a company which used "$", "$$", "$$$" and "$$$$" to indicate cost, stored as tag data like:

  > <price>
  $$$$

  $$$$
where the first "$$$$" is part of the data item, and the second "$$$$" is the end of the SD record. This ended up causing a problem when an SDF reader somewhere in their system didn't parse data items correctly. (Another common failure in data item parsing is to ignore the requirement for a newline after the data item.)
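One way to avoid both the recursion and the "$$$$"-inside-a-data-item problem is a small line-by-line state machine. This is only a sketch in Python under simplifying assumptions: every record has a V2000-style molfile ending in "M  END", each data item ends with a blank line, and the terminator line is exactly "$$$$". The function name is hypothetical.

```python
def split_sd_records(text):
    """Split SD file text into a list of records (each a list of lines)."""
    records = []
    current = []
    in_molblock = True     # every record starts with a molfile
    in_data_item = False   # True between a "> <tag>" header and its blank line
    for line in text.splitlines():
        if in_molblock:
            # Anything here, even a "$$$$" title line, belongs to the molfile.
            current.append(line)
            if line.startswith("M  END"):
                in_molblock = False
        elif in_data_item:
            # Data lines are kept verbatim; a blank line ends the item.
            current.append(line)
            if line == "":
                in_data_item = False
        elif line.startswith(">"):
            current.append(line)
            in_data_item = True
        elif line == "$$$$":
            # Only a "$$$$" outside a data item terminates the record.
            records.append(current)
            current = []
            in_molblock = True
        else:
            current.append(line)
    return records
```

Because it is a plain loop, a file with a million "$$$$" lines only costs iterations, not stack frames. A record that lacks "M  END" would swallow everything after it, so a real reader needs better error handling than this.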

I talk about "$$$$" more at http://www.dalkescientific.com/writings/diary/archive/2020/0... .

Then there's the "S SKP" field, which you'll almost certainly never see in real life! I've only seen it used in a published example of a JICST extended MOLfile. See http://www.dalkescientific.com/writings/diary/archive/2020/0...

Please don't let these comments get you down! These details are hard to get, and not obvious. It took me years to learn the rare corner cases.

I also haven't done molviz since the 1990s, or used PyMol (I was a VMD person), so can't say anything about the overall project. We started with GL, and had to port to OpenGL. :)

PS. A bit of history for you. PyMol's and VMD's selection syntax look similar because both drew on the syntax in Axel Brunger's X-PLOR. Warren DeLano came out of Brunger's lab, and VMD was from Schulten's group, which were X-PLOR users. (Schulten was Brunger's PhD advisor.)


I looked at the PDB parser.

Will you be adding support for using duplicate CONECT records to store bond type information? That's a RasMol extension that PyMol supports. You'll also need to support the pdb_conect_nodup option in the writer.
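As a rough illustration of the duplicate-CONECT convention, here is a Python sketch that counts how often a bonded serial is repeated within CONECT records. It assumes the standard fixed columns (serial in 7-11, bonded atoms in 12-16, 17-21, 22-26, 27-31) and plain decimal serials; the function name is invented for this example.

```python
from collections import Counter


def conect_bond_orders(lines):
    """Map (low_serial, high_serial) to a bond order inferred from duplicates."""
    counts = Counter()
    for line in lines:
        if not line.startswith("CONECT"):
            continue
        source = int(line[6:11])
        # Up to four bonded-atom fields, five columns each.
        for start in range(11, 31, 5):
            field = line[start:start + 5].strip()
            if field:
                counts[(source, int(field))] += 1
    orders = {}
    for (a, b), n in counts.items():
        # A bond is usually listed from both atoms; keep the larger count.
        key = (min(a, b), max(a, b))
        orders[key] = max(orders.get(key, 0), n)
    return orders
```

A writer honoring pdb_conect_nodup would do the reverse and emit each bonded serial only once.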

I see you interpret atom/hetatm serial numbers as an integer. Will you be using the base-36 or hybrid-36 variants (see https://cci.lbl.gov/hybrid_36/) which is a common way to handle more than 100,000 atoms?
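For reference, a small Python sketch of hybrid-36 decoding as I understand the scheme described on that page: plain decimal first, then upper-case base-36 starting at "A0000", then lower-case base-36 starting at "a0000". Treat it as an illustration to check against the reference implementation, not a replacement for it.

```python
DIGITS_UPPER = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
DIGITS_LOWER = DIGITS_UPPER.lower()


def _decode_pure(digits, s):
    """Decode s as base-36 using exactly the given digit alphabet."""
    value = 0
    for ch in s:
        value = value * 36 + digits.index(ch)  # ValueError on a bad character
    return value


def hy36decode(width, s):
    """Decode a fixed-width hybrid-36 field (width 5 for PDB atom serials)."""
    if len(s) != width:
        raise ValueError("field has the wrong width")
    first = s[0]
    if first in " -0123456789":
        return int(s)  # ordinary decimal range
    if first in DIGITS_UPPER:
        # Upper-case range continues right after the largest decimal value.
        return _decode_pure(DIGITS_UPPER, s) - 10 * 36 ** (width - 1) + 10 ** width
    if first in DIGITS_LOWER:
        # Lower-case range continues right after "ZZZZZ".
        return _decode_pure(DIGITS_LOWER, s) + 16 * 36 ** (width - 1) + 10 ** width
    raise ValueError("invalid hybrid-36 field")
```

With width 5, "99999" is followed by "A0000" (100000), and mixed-case fields are rejected.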

Again, these are corner cases which come with experience. I've no expectation that a new program would handle them. I want you to know about them since they will be issues if you expect long-term uptake.


I see you include the dot disconnect "." as part of the Bond definition.

You also define Chain as:

  Chain <<= pp.Group(pp.Optional(Bond) + pp.Or([Atom, RingClosure]))
I believe this means your grammar allows the invalid SMILES C=.N
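To show the distinction, here is a toy Python checker in which the dot is an alternative to a bond rather than one kind of bond. It covers only a small subset (unbracketed atoms, bond symbols, single-digit ring closures, dots; no branches, and ring closures are not matched up), and every name in it is made up for this example.

```python
import re

TOKEN = re.compile(
    r"(?P<atom>Cl|Br|[BCNOPSFI]|[bcnops])"
    r"|(?P<bond>[-=#:/\\])"
    r"|(?P<ring>[0-9])"
    r"|(?P<dot>\.)"
)

# Which token kinds may follow each kind. A dot must be followed by an
# atom, and a bond may not be followed by a dot or another bond.
ALLOWED_AFTER = {
    "atom": {"atom", "bond", "ring", "dot"},
    "ring": {"atom", "bond", "ring", "dot"},
    "bond": {"atom", "ring"},
    "dot": {"atom"},
}


def is_valid_chain(smiles):
    """Check token order for a branch-free SMILES subset."""
    kinds = []
    pos = 0
    while pos < len(smiles):
        match = TOKEN.match(smiles, pos)
        if match is None:
            return False
        kinds.append(match.lastgroup)
        pos = match.end()
    if not kinds or kinds[0] != "atom":
        return False
    prev = "atom"
    for kind in kinds[1:]:
        if kind not in ALLOWED_AFTER[prev]:
            return False
        prev = kind
    # A chain cannot end on a dangling bond or dot.
    return prev in ("atom", "ring")
```

Under these rules "C=N" and "C.N" are accepted, while "C=.N" is rejected because a dot cannot follow a bond symbol.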


That's "SMILES".

Yes. Here is the yacc grammar for the SMILES parser in the RDKit. https://github.com/rdkit/rdkit/blob/master/Code/GraphMol/Smi...

There's also one from OpenSMILES at http://opensmiles.org/opensmiles.html#_grammar . It has a shift/reduce error (as I recall) that I was not competent enough to fix.

I prefer to parse almost completely in the lexer, with a small amount of lexer state to handle balanced parens, bracket atoms, and matching ring closures. See https://hg.sr.ht/~dalke/opensmiles-ragel and more specifically https://hg.sr.ht/~dalke/opensmiles-ragel/browse/opensmiles.r... .



Link to the actual "etude" https://archive.org/details/wetherell-etudes-for-programmers... (Wetherell has a number of small, interesting programming "etudes", one of which is to write a TRAC interpreter.)

Some background. Calvin Mooers developed TRAC - the programming language - in the 1960s for "duffers", that is, people who were not computer scientists.

(A phrase he used in the writing of the time was "duffers", though I don't know if he specifically applied it to TRAC users.)

It was the first homoiconic language. There was a group of teens interested in programming that hung out with Mooers. One of these was L Peter Deutsch, who at 18 (and a year after writing LISP 1.5 for the PDP-1) helped develop the TRAC language and wrote the first TRAC implementation. Deutsch later implemented Ghostscript.

About 10 years later, Ted Nelson's "Computer Lib" suggested TRAC as one of the first three programming languages to start with. This made people more widely aware of TRAC, and of course people did their own implementations, as seen in this link.

Mooers, though, was very protective about what was "his." He pushed for software copyright protection back in the 1960s. The best he could do was trademark the term "TRAC", and send cease-and-desist letters when someone used it. See this article from the first issue of Dr. Dobb's: https://archive.org/details/dr_dobbs_journal_vol_01/page/n12...

I talked with someone who had met one of Mooers' daughters around Cambridge. He knew who Mooers was, and had (as I recall) a copy of Computer Lib with him. He got invited to dinner with the Mooers family. All went well, until he revealed he had written a version of TRAC for himself. This was a sore point. Mooers got up and left. His wife commented that Mooers didn't like others playing with his toys.


Could someone explain what "democratising" means here? Is it any different than "user-friendly", "enabling" or "simplifying"?


On the return from my first trip to South Africa I carried 12 bottles of wine in my luggage.

That was back when flights included two free checked items.

On my second trip to Europe one of my suitcases was full of T-shirt swag to give out at a conference. Lugging both up the stairs, across the train tracks, and back down was a hassle.

Both of these were over 20 years ago.

And then there's the story at https://notalwaysright.com/a-steam-powered-cruise/392530/ of a couple trying to bring a full-size espresso machine on their cruise, so they can have their special coffee.


> On the return from my first trip to South Africa I carried 12 bottles of wine in my luggage.

Totally understandable! Amazing wine - I didn't want to leave Constantia. But I picked myself up and dragged myself to Stellenbosch!

And at the end of the trip, I didn't want to return home. Such an incredible country that still holds a very special place in my heart.


You didn't have to pay duty on them?


I ... probably did?

I didn't.

Things were a lot looser then. I brought a 6-pack of Negra Modelo as carry-on for a trip to Europe. The airport x-ray staff in Albuquerque recognized it on the screen, which impressed me. They had no problem with it.


I was a nomad for about a year. Towards the end I was tired of the constant leaving.

I asked for advice from someone at an NGO who moves countries often. She said what happens is the NGO members become part of the extended connection, which helps with that situation.

Even when I was a nomad, I wouldn't have been without a suitcase. My big hobby then was dancing - mostly salsa and tango - and I needed several changes of clothes and dance shoes. And, umm, not all black clothes.

To make it worse, indoor smoking was legal, so I would come home with stinky clothes that I wouldn't want to wear again until washing.

I also did some upper undergrad/grad level visiting teaching, and would stay at a staff member's home, or in one case the home of the parents of one of the grad students. I brought a dozen or so greeting-style cards with nice pictures of the city I used to live in, so I could leave them as a thank you, with an image of what for them would be an exotic place.


I went backpacking last year for only a little over a month. Absolute pain in my chest when someone who I'd gotten to know over the past few days said it was their last day lol.


I work in a chemistry research field. Most people I know run Python programs for their research. No one I know uses Go. I only know a handful who use Java. Rust and Julia are oddities that appear occasionally.

Sure, we have very different experiences. But that also means that unless you can present strong survey data, it's hard to know if your "Most people" is limited to the people you associate with, or is something more broadly true.

The PSF overlap with my field is essentially zero. I mean, I was that overlap, but I stopped being involved with the PSF about 8 years ago when my youngest was born and I had no free time or money. In the meanwhile, meaningful PSF involvement became less of a hobbyist project and more corporatized .. and corporate friendly.

> scientists (who buy expensive training courses)

ROFL!! What scientists are paying for expensive training courses?!

I tried selling training courses to computational chemists. It wasn't worth the time needed to develop and maintain the materials. The people who attended the courses liked them, but the general attitude is "I spent 5 years studying <OBSCURE TOPIC> for my PhD, I can figure out Python on my own."

> who are force fed Python at university

shrug I was force fed Pascal, and have no idea if Wirth was a nice person.

> main reason for its popularity in machine learning

I started using Python in the 1990s both because of the language and because of the ease of writing C extensions, including reference counting gc, since the C libraries I used had hidden data dependencies that simple FFI couldn't handle.

I still use the C API -- direct, through Cython, and through FFI -- and I don't do machine learning.

> If it is so popular, why the booster articles?

It's an article about a company which sells a Python IDE. They do it to boost their own product.

With so many people clearly using Python, why would they spend time boosting something else?


The Swedish non-government system (BankID) doesn't work well for me. My Swedish identity must not be dependent on the permission of a US company nor the US government, while BankID requires both.

So far my BankID boycott is over a year old, and my resolve grows as I read more of the news.


I once had my bank close my account because of a mistake they made (I can provide the background but it’s just a facepalming story). That meant my Bank ID was closed down, too.

I asked for an appointment with the bank to resolve it but was told I can only get an appointment with Bank ID.

It was outrageous. Obviously none of the other services worked either. Luckily I still had a British and a German credit card that I used for payments (since I lived in both those countries before). In the end I opened an account with another bank and moved on. Although I did try, furiously, for two weeks to get my old bank to admit their mistake and rectify it. No chance. If they had admitted it it would’ve meant they would have broken financial regulation, and obviously you don’t admit to that if you don’t have to.

Bank ID is great when it works and brutal when it doesn’t.

I actually don’t have a better proposal for a system since it works quite well in most cases, but just wanted to share my bad experience on it too.


Ask your bank for a pin machine, you can get a chip and pin machine to solve BankID challenges.

The machine itself is likely manufactured in China, but it’s of no consequence. You wouldn’t be able to communicate with me if you didn’t use Chinese products at all.


You mean Bank-id på kort? https://sv.wikipedia.org/wiki/Bank-id#Bank-id_på_kort says it only supports MS Windows and MacOS, not Linux.

Fundamentally though, that doesn't change the fact that the US can order a Swedish bank to either freeze access to a customer or the bank can no longer do business in the US.

