Ask HN: Code review as part of the peer review process?

26 points by extasia 10 months ago

Code review helps discover programming errors in software engineering. Would requiring paper authors to submit reproducible, or at least reviewable, code for generating their results have the same effect on academic research?

How much of the 'crisis of reproducibility' is caused by code that wouldn't pass the smell test, let alone a real code review?

thumbuddy 10 months ago

The code has often already been peer reviewed. My take is that this would lead to more gatekeeping: more a-hole reviewers rejecting papers because they don't like the way a comment is written, or because someone used a programming style they don't prefer.

If the results can't be reproduced, 7 times out of 10 it's because the author doesn't want you to be able to reproduce them so they can continue researching the topic themselves; 1 time out of 10 it's an honest error; and the remainder is fraud.

It's also important NOT to require that code be shared in some fields. I know of several companies that review papers for prominent journals in their field. They take results from papers they review, put them in the software they sell, and then reject the papers so competitors have no idea some new useful thing exists. It's not illegal to do this, by the way; in fact, with blind reviewers there's no reason not to, as far as typical businesses are concerned.

The crisis of reproducibility is worst in the biological sciences and gets better as you move toward the purer sciences. That's its own problem, and I won't go too far chasing it down here... Scientific reporting is mostly flawed not because of "oh silly scientist, you made an error" but, 9 times out of 10, because of the social factors driving it. People just won't admit it, because they'd lose their positions if they did...

  • krageon 10 months ago

    I would say it is fraud significantly more than 20% of the time if it is in a fashionable field (e.g. machine learning). This is a depressing realisation I came to after spending a significant amount of time (months? Perhaps more than a year) attempting to reproduce ML results from papers, to test for commercial viability. The work itself was incredibly cool and motivating (IMO it is the missing link between cutting-edge science and the business world, and if I had gotten better results perhaps we could have picked researchers to sponsor), but the depressing reality was that the academic world (which I left because I was frustrated with how far from practical it was) is an island. People do things there; it does not have to be true or work, and very frequently it's concerned with publishing lies for money. It reflects corporate life very closely, except it pretends to be something it is not.

not_your_vase 10 months ago

In theory, yes.

In practice, I'm not so sure. I have yet to meet a developer who reviewed another developer's project, and said

> Definitely, whoever wrote this code is 100% not braindead.

Academic code is written for a single purpose - I'm afraid that if code review became mandatory, fewer papers would be submitted, for fear of negative criticism over a memory leak that eats up all memory if the program runs for 3 weeks (even though it actually finishes after 4 minutes).
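
A toy sketch of the kind of defect I mean (entirely made up, not from any real project): a cache that grows without bound, which is irrelevant for a 4-minute run but would get flagged in any serious review.

    # hypothetical research script: intermediate results are cached and never evicted
    _cache = {}

    def simulate(step):
        # stores the full intermediate state of every step forever
        _cache[step] = [step * i for i in range(10_000)]
        return sum(_cache[step])

    if __name__ == "__main__":
        # fine for a short analysis run; on a 3-week run, memory use only ever grows
        print(sum(simulate(s) for s in range(1_000)))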

On the other hand, the reviewer would most probably come from the same scientific area as the submitter, with a similar level of coding knowledge. I'm not sure they would be any significant help regarding code quality - they would write the same kind of code themselves.

With that said, I would require making the code part of the paper, regardless of how garbage it is.

  • extasia 10 months ago

    I understand that the goals of research code are very different from those of production code, and I suppose the reviewers would have to tailor their review towards verifying the integrity of the results presented, rather than code quality in and of itself.

    Criticism would be moderated through the journal editor, as with typical review comments.

    • OJFord 10 months ago

      I think that's key (and I agree it would be good) - the review would have different goals/qualities in mind: more 'does the code do what you claim it does' and less 'am I going to be able to maintain this' (because probably nobody is going to maintain it and that's fine).

    • cpgxiii 10 months ago

      Considering how much of a pain journal editors are to deal with already in terms of text formatting, the absolute last thing needed is trying to turn them into code reviewers/meta reviewers. The pool of editors who are both competent in their scientific/technical field and competent software engineers is nonexistent - almost everyone with that combination of skills is working in the private sector and has no desire to wade into journal editing.

huijzer 10 months ago

I think the main problem is that the incentives are wrong. Academics are mostly evaluated on the number of papers they publish in peer-reviewed journals and the number of citations they obtain. Therefore, the main incentive for academics is to sell papers, and it doesn't matter whether the papers are true or not as long as other researchers and peer reviewers like to read and cite them.

scientism 10 months ago

The Journal of Open Source Software addresses this problem. Reviewers test the code, submit reviews as GitHub issues, and propose corrections as PRs. Code review and coding in the open are what the academic review process should look like. - https://joss.theoj.org/

xmcqdpt2 10 months ago

Personally, when I was reviewing papers I would always look at the code, and often try to run it. I've spotted issues before - nothing like academic fraud, more like "forgot to say in the paper that this value is truncated before being passed to this function". Basically, the issue was often that the paper would present a "clean" version of the implementation, omitting important corner cases, because mentioning them makes the paper sound less certain and therefore increases the odds of being rejected at the first gate (by the editors, not the reviewers).
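
A hypothetical sketch of that kind of mismatch (the function and the clipping range are invented, not from any paper I reviewed):

    import numpy as np

    def f_as_in_paper(x):
        # the "clean" formula as written in the manuscript
        return np.exp(-x)

    def f_as_released(x):
        # what the accompanying code actually does: truncate the input first
        x = np.clip(x, 0.0, 50.0)  # the corner-case handling the text never mentions
        return np.exp(-x)

    x = np.array([-3.0, 1.0, 100.0])
    print(f_as_in_paper(x), f_as_released(x))  # the two disagree at the extremes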

I did not reject papers on that basis; I just asked the authors to add the hidden details back in, and they always did.

a20230607 10 months ago

The other comments here emphasize the limitations of code review for academic papers. I generally agree that it's not a solution to all problems.

But it is being tried. This journal: https://mpc.zib.de/ does code reviews. I am not affiliated with them, but from what I gather the code-review experiment is considered a net positive. I have submitted papers there, and my experience was positive as well. The code reviews were not super deep, but they were constructive; the reviewers seemed knowledgeable and impartial.

They do occasionally accept closed-source submissions, in which case they ask for binaries and ask the "code" reviewer to reproduce/check the numerical experiments. (It helps that they are in a field in which checking results validity is often vastly easier than generating them, in terms of computational complexity.)
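
As a toy illustration of that last point (my own example, nothing to do with the journal's actual review workflow): checking a claimed solution of a linear system is one matrix-vector product, while producing it requires a full solve.

    import numpy as np

    rng = np.random.default_rng(1)
    n = 2_000
    A = rng.standard_normal((n, n))
    b = rng.standard_normal(n)

    x = np.linalg.solve(A, b)             # producing the result: ~O(n^3)
    residual = np.linalg.norm(A @ x - b)  # checking the result: ~O(n^2)
    print(f"residual = {residual:.2e}")   # a tiny residual validates the reported x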

Besides, the journal is fully open access (free to submit, free to read) and very well regarded in its (admittedly niche) subfield.

nextos 10 months ago

High profile journals like Nature already require code and have relatively detailed guidelines for submission [1]. They also have detailed guidelines for statistics, e.g. you must report control parameters when using MCMC.

My experience with Nature, Science, and Cell, both as a submitter and as a reviewer, is that problems tend to be in the way inference is done. I don't think people lie with their code. They lie with the way they do inference, but the inference they choose tends to be faithfully implemented in code.

For example, the null hypothesis tends to be false almost all the time, so it is just a matter of accumulating a sufficient amount of data to get a small enough p-value to call the result "significant". The solution: encourage designs with a continuous explanatory variable, emphasize effect sizes, use partial pooling / shrinkage, etc. And perhaps hire technical editors who get the right to veto flawed methods.
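
A minimal numerical sketch of that first point (the numbers are made up): with a fixed, practically negligible effect, the p-value crosses any significance threshold once the sample is large enough, which is exactly why the effect size should be reported alongside it.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    effect = 0.02  # a tiny true difference in means, ~2% of a standard deviation

    for n in (100, 10_000, 1_000_000):
        a = rng.normal(0.0, 1.0, n)
        b = rng.normal(effect, 1.0, n)
        t, p = stats.ttest_ind(a, b)
        print(f"n={n:>9}  p={p:.3g}  observed effect={b.mean() - a.mean():+.3f}")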

[1] https://www.nature.com/documents/nr-software-policy.pdf

aragilar 10 months ago

You assume there's code, and not an Excel sheet.

More generally, who's doing the reviewing, and do they have the skillset needed (and how are they being compensated)? Additionally, are they limited to just the code provided, or are they going to review any libraries (who covers the cost of any purchases?) or external code?