There's a good reason to require proper page numbers: citations. In general you can't refer to a particular passage in a huge HTML document. Anchor links are a technical solution, but that's not how normal citations work.
I think that more generally when a document becomes sufficiently complex it is better to use a page-oriented, typesetting approach. You'll want to fiddle with where certain page breaks and figures end up, and if you do that you might as well stick with one canonical format.
You can view it as the paper paradigm. Alternatively, you can also see it as sticking to a uniform, canonical format. Getting fluid layouts right is a very difficult problem, and manually getting the layout right for a fixed format is simply a much better solution for this purpose.
The HTML format does generate multiple pages for different sections and chapters, completely configurable. The format above is the singlehtml format where everything is on one page. I'd much rather have a link to a specific anchor tag on a page that goes directly to where I want than a citation to a page number in a 200-page document.
"As shown on pages 105-107" looks much better than "As shown in sections 4.5.4, 5.1 and 5.1.2". Really. You don't have the first option with HTML and/or sphinx. Oh, and can I have some bold italic text please?
Actually, I despise page number references because popular resources often get republished (because no one has the book from the 70s anymore), changing the page numbers, whereas a section title / number is much likelier to remain correct, and a URL to a paragraph can always be updated.
You can if you must but you know what I mean. Typically you cite a peer reviewed work and reference it in the bibliography with author, year, publisher, etc.
Not really. These are useful bits of bibliographic information, which can give you an idea of a source without having to consult it. And the kind of publisher gives an indication of reliability ranging from peer reviewed to self-published. Including a URL as well, when applicable, is of course very useful (provided that it's a stable one).
There's a good reason to require proper page numbers: citations.
Right, and I have even seen page numbers in citations of theses. It wasn't common in my field at least, but it happens. For a journal citation, page numbers generally tell you where in the volume that article is, not which content is being referred to. I think there is flexibility in citations: one can refer to a figure, table, section or even paragraph number at least as clearly as to a page number. There is convention here, I agree, and my point in my original post is that the convention is followed dogmatically rather than pragmatically in academia.
Getting fluid layouts right is a very difficult problem, and manually getting the layout right for a fixed format is simply a much better solution for this purpose.
I have given this topic a lot of thought. Partly from working on an HTML document for a thesis, partly because I've worked in the area of cognitive ergonomics. At the moment our tools aren't great for this, but that's really our fault as software designers. Fluid layouts should be better than fixed ones.
Concerns about fluid layouts are new, a byproduct of current technology. Printing technology didn't afford us these problems. You can remember when it was commonplace to see websites that contained text as bitmaps, to force readers to consume content as laid out by the designer. There were sites that rigidly spec'd font sizes and browsers that allowed that. Today, the best web designers accept the fact that content will be consumed on different screens (sizes and pixel densities) by users with different preferences. They design layouts that can gracefully handle a large font size stipulated by the reader, rather than design the way that a designer would for print.
So what is the underlying issue with fluid layouts?
Well, we have text that relates to graphics, and as Tufte points out in some great books on the topic, it's particularly useful (and historically very common) to have the relevant text presented in conjunction with graphics. That's an obvious point, but with fluid layouts this can unpleasantly 'break.'[0] This is a solvable issue (even using current web technologies), in a way that can evolve beyond what paper can do for us.
I'd like to see a markup that allows me to designate which graphics are relevant to a block of text, and have those figures shown together. For example, if the display allows it, once you start scrolling past a relevant figure it could slide to the margin of the text and stay pinned alongside as you read relevant text for reference. Perhaps the reader could select other figures to pin, or unpin figures as well.
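To make that a bit more concrete, here is a rough sketch of just the pinning rule, written in Python as plain logic rather than as browser code. Everything in it is my own assumption for illustration: the `Block` type, the `pinned_figures` function, the figure ids and the pixel offsets are hypothetical and don't correspond to any existing markup or tool. The rule it models is only the first part of the idea: a figure gets pinned when some text that references it is on screen but the figure itself has scrolled off the top.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Block:
    """A block of text plus the figures the author marked as relevant to it."""
    top: int                  # offset of the block's first line, in pixels
    bottom: int               # offset just past the block's last line
    figures: Tuple[str, ...]  # hypothetical figure ids


def pinned_figures(blocks: List[Block],
                   figure_bottoms: Dict[str, int],
                   view_top: int,
                   view_bottom: int) -> List[str]:
    """Return the ids of figures that should be pinned in the margin.

    A figure is pinned when a block that references it overlaps the
    viewport and the figure's own bottom edge is at or above the top
    of the viewport (i.e. the reader has scrolled past it).
    """
    pinned: List[str] = []
    for block in blocks:
        visible = block.top < view_bottom and block.bottom > view_top
        if not visible:
            continue
        for fig in block.figures:
            scrolled_past = figure_bottoms[fig] <= view_top
            if scrolled_past and fig not in pinned:
                pinned.append(fig)
    return pinned


# Made-up layout: two figures, each followed by the text that discusses it.
figure_bottoms = {"fig1": 300, "fig2": 900}
blocks = [
    Block(top=300, bottom=700, figures=("fig1",)),
    Block(top=900, bottom=1400, figures=("fig2", "fig1")),
]

print(pinned_figures(blocks, figure_bottoms, view_top=400, view_bottom=800))
```

With this made-up layout, a viewport covering 400 to 800 only overlaps the first block, and fig1 has already scrolled off the top, so it is the one figure that gets pinned. Letting the reader pin or unpin figures by hand would be an extra set of overrides on top of this and isn't modelled here.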
Think of the number of times you've flipped between pages of a PDF to read text and look at the figure described. Sometimes I end up opening the same document twice and putting them on my screen side-by-side. That's the kind of hack us human factors types love putting in slide shows about how we should be designing for users.
With high-resolution displays getting to mass-market price points, we are only missing the right tools to take technical documents to the next level. As I said, this could be done in HTML, CSS and JavaScript today, but it's not author-friendly. Even if the tools did exist to make this easy, it would be frowned upon in the academic world for being different. But some fields are more progressive than others, and eventually we'll move past the page as a paradigm. I've got to applaud the OP for taking steps in that direction.
My futurist speculation: the move away from pages in the scientific community will happen at the same time that it will move away from traditional journals as an idea distribution channel. I don't think that's in the immediate future, though.
[0] In fact, this often breaks with fixed layouts too, except it's only really bad during document creation. Think figure placements in LaTeX. Fortunately the document creator has to do this work once, and then all of the readers see the exact same thing. Better tools for fluid documents would make things friendlier for the author as well as the readers.