Open Source Software Publishing for University Researchers
Are you a student / software developer / researcher and just want the step-by-step guidelines? Jump to the "In a nutshell" sections!
52°North is built on a simple idea: the brightest and most creative minds often work as researchers at universities, so the newest software that solves important problems is often started by PhD students or by undergraduates in study projects. These people operate without the pressure of "product development" and "user focus groups", but often with a vision of a piece of code that would really improve how things work, and we've seen great things come out of those alleged "ivory towers".
The goal of this best practice is to provide recommendations that minimize the "overhead" work required to start a real open source project out of research-grade software development, even though research grants might not provide funding for it, or might not value publishing code as open source software (though they should). The main goal of research is often reduced to publishing papers, but times are changing, and publishing data and software along with research (reproducible research) will become the standard soon enough. So better get ahead of it.
We have heard people say "no, I don't want to publish my software" for different reasons ("it is not good enough", "I just need to finish the documentation", "I don't have the time to maintain this") for long enough. Here are our suggestions to make your life as a software developer / PhD student easier. A very important side effect: many of the recommendations make you a better programmer right away.
We look forward to feedback about this page, so feel free to contact DanielNuest.
Reasons not to publish research tools as open source software, and arguments against them
There are a few challenges that goodwill, discipline, and personal interest might not be able to fix. Here is a place to collect these to see how we can deal with them.
- "I don't have the time to take my students by the hand and tell them how to do it."
  - Just share this tutorial!
  - Start using GitHub in seminars that include programming and make the students publish their code under an open source license. Learning git will be a great side effect. Also, pull requests make it easier to see which student contributed which part of the software in collaborative projects.
- "I don't want others to steal my idea before I publish a paper about it."
  - Although it would make the software development process a lot better, we of course cannot force you to make software public before you've published the research. We don't say you should risk your research, but you should consider future publishing in the light of open access, and that being open about your research (open innovation, ...) will actually support your case.
  - Software projects are a lot more attractive for others if they solve an abstract problem. So if you can divide your tools into a generic part and one that is particular to your application, then you can (a) hold the latter back until after you have published your work, and (b) have a piece of software that is a lot more interesting for you, your colleagues, and others to use and extend, and that does not contain the final few lines of code that set your work apart... at least not until you have published that journal paper, of course.
  - GitHub offers free private repositories for universities, so you could share your developments within the research group, for example.
  - Publish a paper about your software: http://www.software.ac.uk/resources/guides/which-journals-should-i-publish-my-software
- "The PhD student should not maintain software but should think about new problems and publish."
  - The quality of the research will increase if it is tested in software prototypes.
  - A piece of software that is well documented allows new generations of students to pick it up, continue the work, and think about more complex problems. It reduces repetition and makes room for new findings.
  - "Maintaining" software can increase the reputation of an institute or working group.
- "My research grant is finished and I don't have the time to work on documentation and fix bugs anymore."
  - Publish the very first line of code together with short documentation. Then you don't have many hours of documenting to do at the end (which you NEVER EVER will be able to do), but only a few minutes at a time during the development process.
  - Making code publicly available increases the quality of the code.
- "We don't have the means to integrate the many small programmes we have into one 'system' or 'platform' that we could properly publish."
  - Many small programmes are great. It is what Linux is based on. Try to abstract the core of the software beyond your particular use case, and publish each small entity. If possible, support public or standardized formats and interfaces (KML, GeoJSON, RDF, WMS, CKAN, ...) for input and output, and you can bet on somebody else trying out your software!
  - Making software cope with abstract problems instead of only one use case allows you to think beyond your own problem, makes the software more interesting to others, and therefore increases your chances of getting contributions to your code!
- "My software is not good enough."
  - Bull****. "Professional" software is so often not good enough that it gets updated all the time. If the software is tested and gets the job done, let's say it is "good enough" for now.
  - OK, so even if it is not good enough: if it solves an actual problem, then somebody else will pick it up, use it, and start fixing bugs. You don't believe that will ever happen? Have you tried?
- "I don't know how to start or manage an open source community." or "I don't have the skills/the time to manage a community."
  - No worries: just ask us, or anybody with an open source background. People like to share their experiences.
  - Soft skills are becoming more and more important, especially for software developers (if you're working in Scrum teams, communication is everything), so why not start now?
- "We don't have the time to maintain a code repository, or to build and publish releases for software."
  - Then don't do it! Instead...
  - Get a GitHub organisation for your research group / institute: https://help.github.com/articles/creating-a-new-organization-account/
  - Use common build tools, such as Maven for Java, so that people can download the source code and build the software themselves. You can even automate that job (e.g. using Jenkins, in the case of Java) and include it in your testing. And yes, there is a free build server: https://travis-ci.org/, so there is no need to host one on your own.
- "We don't have time to give support for the software for free."
  - Then don't do it, or not only for free. Don't underestimate the positive effect of people using your software. For example, they will find bugs for you!
  - Use Bountysource to fund small development tasks: https://www.bountysource.com/
  - If your project is promising or getting a lot of questions, think about a spin-off or offering paid professional services.
- Do you have another challenge or argument? Tell me, and let's see if we can't find reasons to do it anyway.
How to improve software quality in research
Students are not excellent programmers. They can't be, because you have to gain experience and make many mistakes to become really good at something, and that takes time. However, being more aware of what a "good programmer" is certainly is a good start. Here are a few ideas on how to improve programming skills without taking too much time away from research. Note that all of them are essentially about social interaction.
- Have bi-weekly/monthly hackathons where the whole group / lab meets and everybody works on software. Friday after lunch is a perfect time.
  - People can do pair programming and learn from each other.
  - Explaining a problem to somebody else often leads to the solution.
- Have a "huddle", a short monthly meeting where one or two (not more) topics are presented with a strict time limit (15 minutes maximum).
  - One student prepares a topic that deals with software development skills, or presents a problem that they had and how they solved it, or presents a problem that still needs solving.
  - People can share experiences and profit from each other's skills.
  - A strict time limit means that presentation skills are practiced and no time is wasted.
- Encourage people to use parts of the Scrum methodology.
  - The most important aspect of Scrum is reflecting.
  - "Real Scrum" doesn't work for PhD projects. Do a "one-man-research-Scrum" instead:
    - Create a backlog with prioritized items.
    - Compile a sprint and work only on these items until they are done. Then start the next sprint. Don't set deadlines for items; it's not like you are going to keep them anyway...
    - Meet with somebody else once a week and have a "scrum meeting", i.e. report what you did, what the problems were, and what you plan to do next week.
    - Reflect after every sprint on what took longer than expected and what turned out to be more important, and adjust your backlog accordingly.
Best practice for starting a new project
The most important thing: Write good software. If you follow common software development practices from the very start, then the code will be "good to publish" at any point in time.
How can I increase the chances that somebody else will use/extend/contribute to my software?
- Solve a problem that matters. Make that problem small and solve it for good.
- Instead of starting a completely independent project, contribute a new feature to an existing open source project.
In a nutshell
- Take a few hours once and a few minutes every week to reflect on what good programming means.
- Create a project on GitHub and create a short Readme.md file before you write the first line of code.
- Document minimally, but from the very beginning.
  - Good variable/function names and clean code are better than long comments or long specifications.
  - Document the moment that you implement something, then you're done with it.
- Write tests. Especially in research projects, where you do not really know where you're heading, writing tests can help you to find out what you actually want to do.
- Add a clear open source software license, see http://choosealicense.com/. There is no really good reason not to stick to one of these three! If you want to learn about all the licenses out there the easy way, go to http://www.tldrlegal.com/.
- Register your software to protect it (software patents are evil), see list below.
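To make the "good names" and "write tests" advice above a bit more concrete, here is a minimal sketch in Python. The function `mean_temperature`, its unit, and the test names are made up for illustration and are not taken from any real project. The point is only that a descriptive name plus two short tests can say more than a long comment, and that writing the tests forces you to decide what the code should do with awkward input.

```python
def mean_temperature(readings_celsius):
    """Return the arithmetic mean of a non-empty list of readings in degrees Celsius."""
    if not readings_celsius:
        # Deciding what happens here is exactly the kind of question a test forces you to answer.
        raise ValueError("at least one reading is required")
    return sum(readings_celsius) / len(readings_celsius)


def test_mean_of_known_values():
    # A hand-checked example: (10 + 20 + 30) / 3 = 20
    assert mean_temperature([10.0, 20.0, 30.0]) == 20.0


def test_empty_input_is_rejected():
    try:
        mean_temperature([])
    except ValueError:
        return  # expected behaviour
    raise AssertionError("expected ValueError for empty input")


# Call the tests directly so the file can be checked with plain Python.
test_mean_of_known_values()
test_empty_input_is_rejected()
```

If you save this in a file whose name starts with `test_`, a test runner such as pytest should also pick up the two `test_` functions on its own; calling them at the bottom just keeps the sketch free of extra dependencies.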
Extended list of useful practices
Best practice for publishing an existing project
- Find an existing open source project that would benefit from the new/improved functionality.
  - How to do that right? Get to know the community, follow mailing lists, be quiet first, get to know the "rules" of the community, ...
Simple Code Quality Rules
There are many books about good software development. But since you might not have time to read them all, we try to collect a short list of checks that you can keep in mind while developing.
Depending on your country and its laws, you might be able to register your software as an intellectual work/work of art. Because you register it with a third party, you have proof of prior art if somebody else gets a software patent.