Hacker News | gigantum's comments

You just always start with the base case. It could be recursion is just so simple that your mistake is thinking there must be something complicated.

Also, if the algorithm is not tail recursive, you may be missing that there is an implicit call stack where intermediary results are pushed to.
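To make the implicit call stack concrete, here is a minimal sketch in Python. The function names are made up for illustration. The first version is ordinary non-tail recursion; the second does the same work with the stack written out as a list, so you can see where the intermediate results wait.

```python
def sum_recursive(values):
    # Base case first: an empty list sums to 0.
    if not values:
        return 0
    # Not tail recursive: the addition runs only after the recursive call
    # returns, so each pending "values[0] + ..." sits on the call stack.
    return values[0] + sum_recursive(values[1:])


def sum_explicit_stack(values):
    # Same algorithm with the call stack made visible as a list.
    pending = []
    rest = list(values)
    while rest:                 # "descend" until the base case is reached
        pending.append(rest[0])
        rest = rest[1:]
    total = 0                   # result of the base case
    while pending:              # "unwind": apply the deferred additions
        total = pending.pop() + total
    return total
```

Both functions return the same value for the same input; the second one just shows the bookkeeping that the language runtime otherwise does for you.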


Like some of the other ML/AI posts that made it to the top page today, this research too does not give any clear way to reproduce the results. I looked through the pre-print page as well as the full manuscript itself.

Without reproducibility and transparency in the code and data, the impact of this research is ultimately limited. No one else can recreate, iterate, and refine the results, nor can anyone rigorously evaluate the methodology used (besides giving a guess after reading a manuscript).

The year is 2019, and many are finally realizing it's time to back up your results with code, data, and some kind of specification of the computing environment you're using. Science is about sharing your work for others in the research community to build upon. Leave the manuscript for the pretty formality.


>any clear way to reproduce the results.

Given that it's evolved, I'd imagine this is a given? Or, more accurately, you could probably duplicate some kind of emergent behaviour, but it would be different given different randomized parameters.


I think the point is more that they don't go into any meta-analysis of the big changes that were seen in many of the trials. They don't try to isolate specific mechanisms that formed in a majority of the trials that almost made it to this stage, for example. They just don't really go into any analysis of the failure trees in the trial dataset at all.

IMHO this is probably just a case of them trying to stretch this out across a bunch of different papers, and this is just the announce paper. Which is a shitty practice, but the current academic environment encourages taking good findings and puffing them up into multiple incomplete papers rather than one well-done paper.


Usually you use an RNG for which you can publish the seed. So, although it’s random, you can reproduce the results.
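As a rough illustration using only Python's standard library (the function name and seed values are arbitrary, and this says nothing about how the paper's code works): a generator built from a published seed replays the same "random" sequence on every run.

```python
import random


def noisy_run(seed, n=5):
    # A private generator seeded explicitly; the global random state is untouched.
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


first = noisy_run(42)
second = noisy_run(42)   # same seed, same sequence
other = noisy_run(43)    # different seed, different sequence
```

Here `first == second` holds, while `other` differs, which is the whole idea behind publishing the seed alongside the results.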


Glancing through the paper it seems like they use the recent Transformer model. Does whatever underlying stack they use expose something to share RNG seeds and the exact hardware optimizations your environment applies during training? Otherwise "publishing the seed" sounds nice but might not be as trivial as the phrase suggests.


reproducibility should be something that's baked into an experiment's design.

so, if their experiment was designed such that reproduction is inherently difficult, they should have designed it in a better way, and they should've used a toolset that wouldn't run into that problem.

a non-reproducible experiment isn't necessarily completely without value, but it's a thing that everyone should look askance at till it proves its worth.

(apologies if my comments don't apply to this experiment and if it is reproducible -- i didn't have time to read through the OP, but i thought this reply was still a worthwhile response to its specific parent comment)


No that's absolutely a fair and true point, my comment was more pointed at the RNG aspect. I have not looked into this specific one either but normally people would hopefully not publish their best randomly achieved run if the system cannot reproduce it or similar results.

That being said the paper in question doesn't seem to reference open source code anyway so I guess my point was kind of moot, apologies.


For the most part, yes.

There are specific CUDA operations which are not guaranteed to be reproducible though, as well as some cuDNN operations which are non-deterministic unless you sacrifice performance, and this does cause real problems.

See https://pytorch.org/docs/stable/notes/randomness.html for some reasonable docs on this.


There are many CS conferences where you can/should submit a VM image to reproduce the results. See, e.g.: http://cavconference.org/2018/artifact-submission-and-evalua...


You want to be able to set the seed, if only so that you can debug your program. Pseudo-random is sufficient for these models and is independent of any hardware settings. You should not share your random source between concurrent threads, though, but that’s good practice anyway.
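One way to avoid a shared random source, sketched with the standard library only: give every worker its own generator derived from a base seed. `BASE_SEED`, `worker`, and `run` are hypothetical names, and adding the worker id to the seed is just one simple scheme among several.

```python
import random
from concurrent.futures import ThreadPoolExecutor

BASE_SEED = 1234  # hypothetical value, chosen only for illustration


def worker(worker_id, n=3):
    # Each worker owns a private generator seeded from the base seed and its
    # id, so its stream does not depend on how the threads are scheduled.
    rng = random.Random(BASE_SEED + worker_id)
    return [rng.random() for _ in range(n)]


def run(num_workers=4):
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # map() yields results in worker_id order, whatever the finish order.
        return list(pool.map(worker, range(num_workers)))
```

Calling `run()` twice gives identical output. With a single generator shared by all threads, the numbers each worker draws would depend on thread interleaving.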


Most machine learning accelerators have a few non-deterministic operations. The chances that you could run trillions of floating point operations through a GPU and get a bit-for-bit identical result are low.


Really? I'm not an ML guy so in simple terms, what are these non-deterministic ops? Or are you saying GPUs can be expected to be, basically, faulty?


Both.

Some operations split and join data in non-deterministic ways (especially the order of operations, leading to different floating point rounding). If you shard across multiple machines, weight accumulation order will depend on network latency for example.
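The rounding effect does not need a GPU to show up. A small CPU-only sketch in Python (the helper name is made up, and the values are picked to exaggerate the effect): the same four numbers, added in two different orders, give two different totals.

```python
def naive_sum(values):
    # Plain left-to-right accumulation, like a simple reduction loop.
    total = 0.0
    for v in values:
        total += v
    return total


order_a = [1e16, 1.0, -1e16, 1.0]   # the first 1.0 is absorbed into 1e16 and lost
order_b = [1e16, -1e16, 1.0, 1.0]   # the large values cancel first, nothing is lost

result_a = naive_sum(order_a)  # 1.0
result_b = naive_sum(order_b)  # 2.0
```

The loop is written by hand on purpose: the built-in `sum()` uses a more accurate summation for floats in recent Python versions, which would hide the effect. A parallel reduction whose accumulation order depends on timing is effectively choosing between orders like these on every run.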

Also, GPUs aren't anywhere near as reliable as CPUs when it comes to being able to run for hours without any random bit flips/errors.


> ...split and join data in non-deterministic ways ... to different floating point rounding

Ah, of course! A very timely reminder, thanks!

> GPUs aren't anywhere near as reliable as CPUs when it comes to being able to run for hours without any random bit flips/errors.

Now that's worrying. A bit flip can't be expected to be skewed towards any particular bit within a float, so it could easily happen in the exponent, skewing a single value by orders of magnitude one way or the other. Combine that with the rest of your 'good' results and yuck. That's very concerning. Thanks for the warning.
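To put numbers on that, here is a small sketch that flips a single bit of a 32-bit IEEE 754 float using Python's `struct` module. The helper name `flip_bit` is made up for illustration; bit 0 is the lowest mantissa bit, bits 23 to 30 are the exponent, and bit 31 is the sign.

```python
import struct


def flip_bit(value, bit):
    """Flip one bit (0 = least significant) of value stored as a 32-bit float."""
    (as_int,) = struct.unpack("<I", struct.pack("<f", value))
    (flipped,) = struct.unpack("<f", struct.pack("<I", as_int ^ (1 << bit)))
    return flipped


mantissa_flip = flip_bit(1.0, 0)    # barely changes: about 1.0000001
low_exp_flip = flip_bit(1.0, 23)    # 0.5, lowest exponent bit
high_exp_flip = flip_bit(1.0, 29)   # 2**-64, many orders of magnitude off
top_exp_flip = flip_bit(1.0, 30)    # inf
sign_flip = flip_bit(1.0, 31)       # -1.0
```

A flip in the mantissa is close to harmless, while a flip in the upper exponent bits turns 1.0 into something near zero or into infinity.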


> A bit flip can't be expected to be skewed towards any particular bit within a float,

Actually, I think they are - for example, the exponent path through an adder/multiplier is typically shorter, so when operated close to clock speed limits, the exponent is more likely to be correct.

(I've not actually verified the above on real hardware)


Is it not possible to use the same seed and random number generator to reproduce the results accurately?


You've got to be careful. An RNG is used to initialise the layers but also for mini-batch selection. They are usually different RNGs.
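A toy sketch of why both seeds matter, using only the standard library. All names and sizes here are hypothetical, and real frameworks have their own generators with their own seeding calls.

```python
import random


def make_run(init_seed, batch_seed, n_weights=4, n_samples=10, batch_size=3):
    # Two independent generators: one for weight initialisation,
    # one for choosing which samples go into a mini-batch.
    init_rng = random.Random(init_seed)
    batch_rng = random.Random(batch_seed)
    weights = [init_rng.gauss(0.0, 1.0) for _ in range(n_weights)]
    first_batch = batch_rng.sample(range(n_samples), batch_size)
    return weights, first_batch


weights_a, batch_a = make_run(init_seed=1, batch_seed=2)
weights_b, batch_b = make_run(init_seed=1, batch_seed=2)  # identical run
weights_c, batch_c = make_run(init_seed=1, batch_seed=3)  # same weights, batches may differ
```

Fixing only `init_seed` reproduces the starting weights but not the order in which data is seen, so a full reproduction needs every seed recorded.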


At Gigantum (https://gigantum.com), we've been working with brain imaging researchers on enabling exploratory analyses that are easy to share or reproduce. For now, our collaborators have focused on large public datasets like the Healthy Brain Network (http://fcon_1000.projects.nitrc.org/indi/cmi_healthy_brain_n...) and OpenNeuro (https://openneuro.org/).

I'm curious to know how you're managing the AI engineering side of things - I know there's nothing close to "the right answer" yet in terms of pipelines for brain images. And of course I'd be interested how folks could collaborate on developing better algorithms for understanding these images (with Gigantum and otherwise).

Certainly, if you have a collaborative project and would like to try Gigantum for coordinating code, data, and computational environments, we'd be happy to support that! We provide a one-click solution to publish a project so that someone else can pick up exactly where you left off.


Hi Gigantum. We actually don't do the computer vision ourselves -- we label the data for companies that do. But your product looks very interesting. Do you work exclusively with brain imaging?


Makes sense. The founding team came out of large scale brain imaging research, but the goal is that anything you'd do in a notebook / web UI environment (like Jupyter, RStudio, etc.) you should be able to do with the Gigantum Client.

The difference from the standard approach for those tools is that we automate some command line operations (Git, Docker, etc.), and provide UI for the rest. We provide a stable foundation for how to organize data using Git LFS, along with an optimized S3 storage back-end if you need to cherry pick large datasets.

Our main goal is to improve the quality of "academic" science, but we're open to anything that fits!


Re: Type system -- at Gigantum we very aggressively enforce that all classes and methods in the core libraries are fully typed and checked with mypy. The depth and expressiveness of mypy rivals that of any other strongly typed language.
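For readers who haven't seen fully typed Python, a small sketch of what that looks like. The class below is hypothetical and is not taken from the Gigantum code base.

```python
from typing import Dict, List, Optional


class DatasetIndex:
    """Hypothetical example class with every method fully annotated."""

    def __init__(self) -> None:
        self._files: Dict[str, int] = {}

    def add(self, path: str, size_bytes: int) -> None:
        self._files[path] = size_bytes

    def size_of(self, path: str) -> Optional[int]:
        # Optional[int] makes callers handle the "unknown file" case explicitly.
        return self._files.get(path)

    def largest(self, count: int) -> List[str]:
        # Paths ordered by size, biggest first.
        return sorted(self._files, key=lambda p: self._files[p], reverse=True)[:count]
```

Running mypy with `--disallow-untyped-defs` on a module like this would reject any function that is missing annotations, which is one way to enforce the "everything is typed" rule.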


Interesting, however there is no indication from the publisher or researchers how this result can be reproduced. It's nice that they put in some of the training data, but imagine how much more impactful to the community this could be if those interested could reproduce - and iterate - on this...

At Gigantum (https://github.com/gigantum/gigantum-client) this is literally our raison d'être: making this process as simple as possible.

