As someone who typeset his thesis recently enough to remember all the tiny tweaks it took to get a perfect printout from LaTeX, I would say that this project has a lot of issues. Some of them are more or less easy to fix, but some will be hard. Consider wrapping a specific paragraph in a \fussy / \sloppy pair, or hammering a specific float onto this page, or ... (the list is really long). Not to mention bibliography tools. I don't claim that these issues are unsolvable, but the deeper you go into solving them, the more you will discover that you are basically re-implementing LaTeX.
While the idea of a multi-format thesis, or at least a dual-format one (PDF and HTML), is very compelling, I doubt there can be a good enough solution for any thesis that contains more than just text and a negligible number of figures, tables, and formulae.
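For anyone who has not run into these, here is a minimal sketch of the two tweaks named above. It assumes a plain article class and the float package for the [H] specifier; the long token and the placeholder rule are made up for illustration.

```latex
\documentclass{article}
\usepackage{float} % assumption: provides the [H] "exactly here" specifier

\begin{document}

% Relax line breaking for one stubborn paragraph only.
% \sloppy takes effect when the paragraph ends, hence the explicit \par
% before switching back to the default strict setting with \fussy.
\sloppy
A paragraph containing a long unbreakable token such as
\texttt{averyveryverylongidentifierthatwillnotbreak} that would
otherwise give an overfull hbox.\par
\fussy

% Pin one float to the spot where it appears in the source.
\begin{figure}[H]
  \centering
  \rule{4cm}{3cm} % hypothetical stand-in for a real \includegraphics
  \caption{A placeholder figure.}
  \label{fig:placeholder}
\end{figure}

\end{document}
```

Each tweak is local to one paragraph or one float, which is why they pile up over a whole thesis and why a converter would need some way to carry them through.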
"Just write and pay attention to content, not formatting," has led to staring at the clock, wondering how 5am came around so quickly, on more nights than I would care to admit.
God forbid you use Sweave.
I'm looking back 22 years and getting the twitch again. All that time I saved not playing with fonts, kerning, margins, and line height has been burned aligning figures or tables or yes, trying to get that frickin' float on a page at least vaguely proximal to its reference.
Agreed, but in my case, I didn't really care too much about the PDF output. The printed version of my thesis is going to sit in my university library untouched forever. I'm much more concerned with having a semantically correct, searchable, beautiful HTML output.
Why not go the other way round in this case? Typeset your thesis in semantically correct, searchable, beautiful HTML5, and then just add some print-specific CSS to get the hard copy?
Unfortunately, browsers are not up to the job so far. For example, page-break-inside:avoid is only supported by Opera[0]. Maybe Prince [1] would work, though it is not free.
I don't argue that TeX is good for typesetting a thesis. I doubt that Sphinx, even with some plug-ins, is good for that. Also, don't forget that reStructuredText has its own problems (not found in HTML and LaTeX). Can I have an example of a bold italic or italic monospaced phrase in reStructuredText, please? No, really?