
Tracking and Publishing Your Data—Git and GitHub (ILGISA Conference)

Sep 14, 2015

1:30 pm–4:30 pm

Instructor: Neal Davis

General Information

In this hands-on tutorial, we will discuss principles for tracking the development of, and changes to, your data, documents, and code. We will introduce Git for tracking revisions and GitHub as an open platform for distributing and collaborating on data and code. Other repositories such as Dryad will also be introduced. Please bring a laptop to the workshop and follow the setup instructions below.

Some of the material taught is based on Software Carpentry. Software Carpentry's mission is to help scientists and engineers get more research done in less time and with less pain by teaching them basic lab skills for scientific computing. Participants will be encouraged to help one another and to apply what they have learned to their own research problems.

For more information on what Software Carpentry teaches and why, please see the paper "Best Practices for Scientific Computing".

Who: The course is aimed at researchers with an interest in maintaining and publishing data and documentation.

Where: Crowne Plaza Springfield, 3000 South Dirksen Parkway, Springfield, Illinois, 62703. Get directions with OpenStreetMap or Google Maps.

Requirements: Participants must bring a laptop with a few specific software packages installed (listed below).

Contact: Please email training@cse.illinois.edu for more information.


Etherpad: https://etherpad.mozilla.org/CpJi7gwpeb.
We will use this Etherpad for chatting, taking notes, and sharing URLs and bits of code.


Syllabus

Setup and Background

  • Installation, troubleshooting, etc.

Data Management I

  • Practical Data Set Management

Version Control with Git

GitHub

  • Collaborating with others
  • Markdown notation

Data Management II

  • Principles of Data Management
  • Open licenses
  • Documenting codes, classes, and projects
  • Other data repositories
  • Handout for Data Management

Setup

To participate in the workshop, you will need access to the software described below. In addition, you will need an up-to-date web browser.

Git

Git is a version control system that lets you track who changed what and when, and it has options for easily updating a shared or public version of your code on github.com. You will need a supported web browser (a current version of Chrome, Firefox, or Safari, or Internet Explorer 9 or above).

Windows

Install Git for Windows by downloading and running the installer. This will provide you with both Git and Bash in the Git Bash program.

Mac OS X

For OS X 10.9 and higher, install Git for Mac by downloading and running the most recent "mavericks" installer from this list. After installing Git, there will not be anything in your /Applications folder, as Git is a command line program.

For older versions of OS X (10.5-10.8), use the most recent installer labelled "snow-leopard" available here.

Linux

If Git is not already available on your machine, you can try to install it via your distro's package manager. For Debian/Ubuntu, run sudo apt-get install git; for Fedora, run sudo yum install git.
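Once the installer or package manager has finished, it can help to confirm that Git is reachable from the command line (Git Bash on Windows, a terminal on OS X or Linux). The snippet below is a minimal sketch of such a check; check_tool is a hypothetical helper name introduced here for illustration and is not part of Git.

```shell
# Report whether a command-line tool can be found on the PATH.
# check_tool is a hypothetical helper name, not a Git command.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: not found"
    fi
}

check_tool git

# If Git is available, also print the installed version.
if command -v git >/dev/null 2>&1; then
    git --version
fi
```

If the first line of output reports that git was not found, revisit the installation steps for your platform before the workshop.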

Text Editor

When you're writing code, it's nice to have a text editor that is optimized for writing code, with features like automatic color-coding of keywords. The default text editor on Mac OS X and Linux is usually set to Vim, which is not famous for being intuitive. If you accidentally find yourself stuck in it, try pressing the Escape key, followed by :q! (colon, lower-case 'q', exclamation mark), then hitting Return to return to the shell.

Windows

nano is a basic editor and the default that instructors use in the workshop. To install it, download the Software Carpentry Windows installer and double-click the file to run it. This installer requires an active internet connection.

Other editors that you can use are Notepad++ or Sublime Text. Be aware that you must add the editor's installation directory to your system path. Please ask your instructor to help you do this.
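For orientation, adding a directory to the path inside a Git Bash session might look like the sketch below. The location in EDITOR_DIR is an assumption (a typical default for Notepad++, written in Git Bash's /c/ style) and should be adjusted to wherever the editor is actually installed.

```shell
# EDITOR_DIR is an assumed install location; change it to match your machine.
EDITOR_DIR="/c/Program Files/Notepad++"

# Append the directory to PATH for the current shell session only,
# skipping the append if it is already listed.
case ":$PATH:" in
    *":$EDITOR_DIR:"*) ;;               # already present, nothing to do
    *) PATH="$PATH:$EDITOR_DIR" ;;
esac
export PATH
```

This change lasts only until the shell is closed. Making it permanent is done through the Windows system settings, which is the step your instructor can help with.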

Mac OS X

nano is a basic editor and the default that instructors use in the workshop. It should be pre-installed.

Other editors that you can use are TextWrangler or Sublime Text.

Linux

nano is a basic editor and the default that instructors use in the workshop. It should be pre-installed.

Other editors that you can use are Gedit, Kate, or Sublime Text.