
[REVIEW]: RSMTool: a Python Package for facilitating research on automated scoring models #33

Closed
11 of 16 tasks
whedon opened this issue Jun 28, 2016 · 20 comments
Labels
accepted, pending-minor-enhancements, published (Papers published in JOSS), recommend-accept (Papers recommended for acceptance in JOSS), review

Comments

@whedon
whedon commented Jun 28, 2016

Submitting author: @desilinguist (Nitin Madnani)
Repository: https://github.com/EducationalTestingService/rsmtool
Version: v5.1.0
Editor: @arfon
Reviewer: @jkahn
Archive: 10.5281/zenodo.58851

Status

Status badge code:

HTML: <a href="http://joss.theoj.org/papers/fbc649c17d45074d92ac21084aaa6209"><img src="http://joss.theoj.org/papers/fbc649c17d45074d92ac21084aaa6209/status.svg"></a>
Markdown: [![status](http://joss.theoj.org/papers/fbc649c17d45074d92ac21084aaa6209/status.svg)](http://joss.theoj.org/papers/fbc649c17d45074d92ac21084aaa6209)

Reviewer questions

Conflict of interest

  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (such as being a major contributor to the software).

General checks

  • Repository: Is the source code for this software available at the repository url?
  • License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
  • Version: Does the release version given match the GitHub release (v5.0.2)?

Functionality

  • Installation: Does installation proceed as outlined in the documentation?
  • Functionality: Have the functional claims of the software been confirmed?
  • Performance: Have any performance claims of the software been confirmed?

Documentation

  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems)?
  • Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g. API method documentation)?
  • Automated tests: Are there automated tests or manual steps described so that the function of the software can be verified?
  • Community guidelines: Are there clear guidelines for third parties wishing to 1) contribute to the software, 2) report issues or problems with the software, and 3) seek support?

Software paper

Paper PDF: 10.21105.joss.00033.pdf

  • Authors: Does the paper.md file include a list of authors with their affiliations?
  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • References: Do all archival references that should have a DOI list one (e.g. papers, datasets, software)?
@whedon whedon added the review label Jun 28, 2016
@arfon
arfon commented Jun 28, 2016

/ cc @openjournals/joss-reviewers - would anyone be willing to review this submission?

If you would like to review this submission then please comment on this thread so that others know you're doing a review (so as not to duplicate effort). Something as simple as :hand: I am reviewing this will suffice.

Reviewer instructions

  • Please work through the checklist at the start of this issue.
  • If you need any further guidance/clarification take a look at the reviewer guidelines here http://joss.theoj.org/about#reviewer_guidelines
  • Please make a publication recommendation at the end of your review

Any questions, please ask for help by commenting on this issue! 🚀

@jkahn
jkahn commented Jun 29, 2016

Maybe a tentative ✋ I am reviewing this? If nobody else speaks up this week, I will take it on and do a review next week (July 5-8).

Full disclosure: Nitin (the submitting author) and I are friendly (and see each other, mostly at conferences, about every 18 months).

I don't know anything about this specific work of his.

@desilinguist

Thanks, Jeremy!

@arfon
arfon commented Jun 30, 2016

Full disclosure: Nitin (the submitting author) and I are friendly (and see each other, mostly at conferences, about every 18 months).

OK that sounds great @jkahn 👍

@jkahn jkahn self-assigned this Jul 6, 2016
jkahn added a commit to jkahn/rsmtool that referenced this issue Jul 6, 2016
Discovered missing when comparing to output of the `lice` tool.

part of JOSS review:
openjournals/joss-reviews#33
jkahn added a commit to jkahn/rsmtool that referenced this issue Jul 7, 2016
@jkahn
jkahn commented Jul 7, 2016

General checks and references

All the outer formatting looks good.

It might be nice to find a DOI for Loukina 2015, but I can't figure out how to get DOIs from ACLweb.

@jkahn
jkahn commented Jul 7, 2016

Statement of need

I'm not 100% clear what the statement of actual need is here. I get that this is a useful tool within ETS, but I'm not certain what the contribution of this package is, or who the researchers (outside of ETS itself) would be who would find this a useful tool to have in hand.

I'm not saying that those researchers don't exist -- I might even be one of them -- but I don't think the statement of need clearly reflects what an imaginary researcher, Professor X, would use these tools for. Perhaps a tutorial walking through Professor X's thought process would help clarify what problem this solves -- and if the tutorial were part of the documentation, so much the better.

I suspect the package is meeting more than one need, and some researchers may have only a subset of those needs; that is okay, but those needs are not clearly articulated. (See upcoming comment about multiply-situated work.)

@jkahn
jkahn commented Jul 7, 2016

Undocumented entry points confuse new users

There are at least three different command-line tools (entry points, in the Python jargon) that all seem to take the same argument structure (a config file) but presumably expect different formats in those config files. There's exactly one example use, scraped from a Kaggle competition, but it only uses one of the CLI entry points (rsmtool); the others (e.g. rsmeval and rsmcompare) don't have sample usages.

This doesn't help your statement of need much, either.
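
For concreteness, here's the kind of config-driven invocation I'm imagining (a minimal sketch; the field names and CLI arguments are guesses on my part, not taken from the documentation):

```python
import json
import subprocess

# Minimal sketch -- field names and CLI arguments are illustrative guesses,
# not taken from the RSMTool documentation.
config = {
    "experiment_id": "kaggle_demo",
    "train_file": "train_features.csv",
    "test_file": "test_features.csv",
    "model": "LinearRegression",
}

with open("rsmtool_config.json", "w") as config_file:
    json.dump(config, config_file, indent=2)

# Presumably rsmeval and rsmcompare are driven by config files of the same
# general shape, just with different expected fields.
subprocess.run(["rsmtool", "rsmtool_config.json", "output_dir"], check=True)
```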

@desilinguist

Please see the "available documentation" section in the main README. The config file formats for all four tools are fully documented.

There's only one example, that's true. That's because we expect the "rsmtool" entry point to be the most commonly used. We can certainly make that clearer.

@jkahn
jkahn commented Jul 7, 2016

Separating the pandas.DataFrame manipulator APIs from the input and output formats

As far as I can tell, the format of the input features is also undocumented. I don't have a clear picture of how I might (as an external developer) go about creating the datasets of "features" (a hugely overloaded term, further overloaded in the limited documentation provided here) applied to these competitions. Furthermore, I don't understand from the documentation what aspects of those files are being displayed in the resulting HTML generation.

I wonder if a clear distinction between the DataFrame format and the on-disk storage format would clarify things here. I can imagine:

  • a "stored features" form in JSON, documented
  • an API specification about the shapes and labels of DataFrame objects for input
  • some utilities to lift the stored results to appropriate API inputs
  • an API specification about the shapes and labels generated for output
  • some utilities (probably notebooks) to combine and display bundled API outputs

I'd thus like to see a separation among the following concerns, within the documentation, all of which seem to be at play here:

  • loading to dataframe: tools and required formats for loading input data
  • dataframe manipulators: comparing human evaluation and machine-extracted "features" to each other, including manipulations to extract useful summary statistics and dataframes
  • display utilities: convenience collections (notebooks?) to gather and display the statistics against each other
  • commandline tools: for coordinating the execution of the pieces above
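
A rough sketch of the kind of separation I have in mind (all names here are hypothetical, not RSMTool's actual API):

```python
import pandas as pd

# Hypothetical names throughout -- this sketches the suggested separation of
# concerns, not RSMTool's actual API.

def load_features(path):
    """Loader: read an on-disk feature file into a DataFrame with a documented
    shape (one row per response, numeric feature columns, a human score column)."""
    return pd.read_csv(path, index_col="response_id")

def feature_correlations(df, human_col="human_score"):
    """Manipulator: correlation of each machine-extracted feature with the
    human score, returned as a Series for downstream display."""
    return df.drop(columns=[human_col]).corrwith(df[human_col])

def display_summary(correlations):
    """Display utility: in practice this would live in a notebook."""
    print(correlations.sort_values(ascending=False).to_string())
```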

@jkahn
jkahn commented Jul 7, 2016

Okay, some more digging around in doc/ has resolved some of these (for example, the feature formats in doc/feature_file.md), but I still think there are too many control surfaces buried in the config file to get a clear picture of what the general uses are.

Is this a tool for new feature development? For comparing human raters? For comparing human raters to existing features? For comparing existing features to each other? For designing new notebooks that have lots of the existing work already done?

All of the above and more?

I think this is a configuration-based approach to desktop evaluation of how different schemes for combining numeric features improve (or hurt) the correlation with human scorers, but as such it's practically an IDE, which is why I am suggesting a clearer breakdown of the sub-responsibilities.
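
To make that concrete, the core question seems to be something like the following (synthetic data and hypothetical feature names, purely for illustration):

```python
import numpy as np
import pandas as pd

# Synthetic data, hypothetical feature names -- just to illustrate the kind of
# comparison I mean: how different schemes for combining numeric features
# correlate with the human score.
rng = np.random.default_rng(0)
features = pd.DataFrame({
    "grammar": rng.normal(size=200),
    "vocabulary": rng.normal(size=200),
})
human = 0.6 * features["grammar"] + 0.4 * features["vocabulary"] + rng.normal(scale=0.5, size=200)

equal_weights = features.mean(axis=1)
grammar_heavy = 0.8 * features["grammar"] + 0.2 * features["vocabulary"]

print("equal weights : r =", round(equal_weights.corr(human), 3))
print("grammar-heavy : r =", round(grammar_heavy.corr(human), 3))
```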

@jkahn
jkahn commented Jul 7, 2016

Miscellaneous

The authors make no particular performance claims, and the software runs in reasonable time (a few seconds) on the sample data and produces plausible-looking HTML documentation of correlations among users and features. I'm happy to check off the corresponding boxes there.

@jkahn
jkahn commented Jul 8, 2016

Recommendation: accept - Minor revisions.
I think it's clear that there's a docs problem here:

  • the statement of need is too high-level; making it more concrete will probably articulate more than one user role
  • the documentation is unclear about external storage formats vs. internal APIs
  • the API documentation is not easily browsable (e.g. through a Sphinx api.rst file that uses the autodoc extension)
  • the documentation overall is not easily browsable (GitHub itself renders the .md files, but the API documentation isn't so easy; this could be remedied quickly with http://readthedocs.io or similar)

Separately, I have a few further quibbles that should not block publication but should probably be addressed:

  • I'd like to see an introductory tutorial notebook, but I don't think that should block publication of this version
  • A package with this many .py files should probably have a coverage report and a linter-style test run, at least between version revisions.
  • Installing through pip is the standard for Python -- and everything here works (if a custom requirements.txt file is included). It'd be nice to have sdist tarballs available on PyPI -- this would even be compatible with conda installations, after all, and the code defined in this package does not include any non-Python code AFAICT. (Many of the declared dependencies require careful work with compilation, but that itself should be outside the scope of this project; see the packaging sketch after this list.)
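
A rough sketch of the packaging metadata that would make sdist/wheel publishing straightforward (module paths, version, and the dependency list here are guesses, not the project's actual configuration):

```python
# setup.py sketch only -- module paths, version, and dependencies are guesses,
# not the project's actual packaging configuration.
from setuptools import setup, find_packages

setup(
    name="rsmtool",
    version="5.0.2",
    packages=find_packages(),
    install_requires=["numpy", "scipy", "pandas", "scikit-learn"],
    entry_points={
        "console_scripts": [
            "rsmtool = rsmtool.rsmtool:main",
            "rsmeval = rsmtool.rsmeval:main",
            "rsmcompare = rsmtool.rsmcompare:main",
        ],
    },
)
```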

@aloukina
aloukina commented Jul 8, 2016

Thank you for such a detailed review, Jeremy! We'll go over your suggestions with Nitin.

@desilinguist

Indeed. Thanks for the careful review, Jeremy!

@arfon
arfon commented Jul 9, 2016

Thank you for such a detailed review, Jeremy! We'll go over your suggestions with Nitin.

💯 - yes thanks for this excellent review @jkahn. @aloukina & @desilinguist - please let me know when you've had a chance to update your submission based upon @jkahn's feedback.

@desilinguist
desilinguist commented Jul 27, 2016

Hi @arfon and @jkahn,

Thanks again for the very useful review! It helped us come up with something a lot better, we think.

We have just released v5.1.0 of RSMTool that addresses the suggestions made in the review. Specifically:

  1. The documentation has been completely overhauled and is now hosted on readthedocs. It now includes a very clear statement of need, several tutorials, and browsable API documentation.
  2. (Almost all) expected warnings are now suppressed when running nosetests. One particular warning remains, which actually does indicate the use of about-to-be-deprecated code in a related package; I have filed an issue for that, and it should be fixed in the next release. (A sketch of the suppression mechanism follows this list.)
  3. Several stylistic issues were fixed using pep8 and pyflakes, except for a couple that are documented in the PRs.
  4. Code coverage is now automatically computed via coveralls.io.
  5. I have started working on pip compatibility, and I think I can get it working in the next release, once we update one of our dependencies, which has since become more wheel-friendly.
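
For reference, this is the kind of suppression mechanism I mean (illustrative only; the test name is made up, not our actual test code):

```python
import warnings

# Illustrative sketch, not the actual RSMTool test suite: silence an expected
# DeprecationWarning from a dependency while the test body runs.
def test_run_experiment_without_deprecation_noise():
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=DeprecationWarning)
        # ... run the experiment under test here ...
        pass
```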

@jkahn
jkahn commented Jul 27, 2016

Well, I am delighted to see this. @desilinguist and @aloukina, the new documentation is actually enjoyable to read, with well-thought-out hyperlinking and walkthroughs describing real user scenarios.

It's much less of a stretch for me to imagine a non-ETS researcher using this tool now, which was the unarticulated heart of my documentation objections before.

I'm glad to hear you're exploring pip/wheel installations, and I hope you'll publish wheels or sdist tarballs on PyPI periodically as part of your release cycle. I give this a 👍 and defer to the editors as to when/if/how you should mint a new DOI.

@arfon
arfon commented Jul 29, 2016

I give this a 👍 and defer to the editors as to when/if/how you should mint a new DOI.

Excellent, thanks @jkahn.

@desilinguist - is there an associated DOI for the v5.1.0 release? If so, please add the DOI as a comment here. We can then move forward to accept and publish.

@arfon arfon added the accepted label Jul 29, 2016
@desilinguist

Hi @arfon, here is the DOI for the v5.1.0 release:

10.5281/zenodo.58851

Please let me know if that's okay or if I need to do something else.

@arfon
arfon commented Jul 29, 2016

Perfect. Thanks!

@desilinguist - your paper is now accepted into JOSS and your DOI is http://dx.doi.org/10.21105/joss.00033 🎉 🚀 💥

Thanks for the great review @jkahn

@arfon arfon closed this as completed Jul 29, 2016
@arfon arfon unassigned jkahn Aug 19, 2018
@arfon arfon changed the title Submission: RSMTool: a Python Package for facilitating research on automated scoring models [REVIEW]: RSMTool: a Python Package for facilitating research on automated scoring models Aug 19, 2018
@whedon whedon added published Papers published in JOSS recommend-accept Papers recommended for acceptance in JOSS. labels Mar 2, 2020