The heartbeat of open source projects can be heard with GitHub data

Steven Max Patterson | June 28, 2016
GitHub data gives data-driven enterprises insights into their most important open source projects

After some clarifying revisions, user davkeen added tmat’s feature request to the priority backlog in February. The project’s owner had previously given davkeen permission to accept issues into the backlog. A developer then coded the feature in an up-to-date local copy of the project. About a week later, jasonmolinowski made a Pull Request from the developer’s version of the Roslyn project to move the changes into the main body of the project. He included comments to explain what changes he proposed and why.

Emails were sent automatically to the project members responsible for reviewing and testing the change, much as the book changes were submitted to the author and editors in the earlier example. After about a week of review and testing, jasonmolinowski Committed the change, merging it into the main body of source code, like a book ready to be published.

To maintain software quality, the owner gave jasonmolinowski the authority to make merge decisions. Had someone else made the Pull Request, jasonmolinowski would have been the one to approve the merge.

Now, let’s look at some of the color-coded data.

[Figure: GitHub data chart with legend. Credit: GitHub]

Pull Requests: These are code submissions proposing a feature or bug fix, made by developers who, unlike jasonmolinowski, don’t have permission to merge them into the main code body. The proposed changes will be reviewed, and if they are accepted, they will be tested and merged. Using the book example, the changes have been submitted to the editors.

Pushes: Pushes are code changes that are merged into an earlier Pull Request. In the book example, someone submitted a change to the editors. After the editors and the public have commented, the writer revises his or her original submission and Pushes the changes, merging them into that submission.

Pull Request Review Comments: These represent all of the comments made to lines or sections of code, usually made by developers reviewing the proposed change. In the book editing example, these would be notations in the margin.

Pull Request Comments: These are comments made on Pull Requests proposing code changes to the main body of code. In the book example, these would be an explanatory cover letter.

Issues Comments: These measure the comments about feature requests and bugs.

Issues: These represent the outstanding feature requests and bug reports.
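As a rough illustration of how the six categories above could be tallied, here is a minimal Python sketch. It assumes event records shaped like those in GitHub's public event feed, with a `type` field, a `created_at` timestamp, and a `payload`. The article does not say how the chart was actually computed, so the mapping, the function names, and the sample events below are all hypothetical. One assumption deserves a note: GitHub reports general comments on a Pull Request as issue comments whose issue carries a `pull_request` key, and the sketch uses that key to separate Pull Request Comments from Issues Comments.

```python
from collections import Counter

# Assumed mapping from GitHub event "type" values to the chart's categories.
CATEGORY_BY_EVENT_TYPE = {
    "PullRequestEvent": "Pull Requests",
    "PushEvent": "Pushes",
    "PullRequestReviewCommentEvent": "Pull Request Review Comments",
    "IssuesEvent": "Issues",
}


def categorize(event):
    """Return the chart category for one event, or None if it is not charted."""
    event_type = event.get("type")
    if event_type == "IssueCommentEvent":
        # Assumption: a comment on a Pull Request arrives as an issue comment
        # whose issue object contains a "pull_request" key.
        issue = event.get("payload", {}).get("issue", {})
        if "pull_request" in issue:
            return "Pull Request Comments"
        return "Issues Comments"
    return CATEGORY_BY_EVENT_TYPE.get(event_type)


def monthly_counts(events):
    """Count events per (YYYY-MM, category) pair."""
    counts = Counter()
    for event in events:
        category = categorize(event)
        if category is None:
            continue  # skip event types the chart does not show
        month = event["created_at"][:7]  # "2016-02-10T12:00:00Z" -> "2016-02"
        counts[(month, category)] += 1
    return counts


# Made-up sample events, for illustration only.
SAMPLE_EVENTS = [
    {"type": "IssuesEvent", "created_at": "2016-01-20T09:00:00Z", "payload": {}},
    {"type": "IssueCommentEvent", "created_at": "2016-01-21T09:00:00Z",
     "payload": {"issue": {}}},
    {"type": "WatchEvent", "created_at": "2016-01-22T09:00:00Z", "payload": {}},
    {"type": "PullRequestEvent", "created_at": "2016-02-10T12:00:00Z", "payload": {}},
    {"type": "PushEvent", "created_at": "2016-02-11T12:00:00Z", "payload": {}},
    {"type": "PullRequestReviewCommentEvent", "created_at": "2016-02-12T12:00:00Z",
     "payload": {}},
    {"type": "IssueCommentEvent", "created_at": "2016-02-13T12:00:00Z",
     "payload": {"issue": {"pull_request": {}}}},
]

if __name__ == "__main__":
    for (month, category), n in sorted(monthly_counts(SAMPLE_EVENTS).items()):
        print(month, category, n)
```

Plotting these monthly counts as stacked, color-coded bars would give a chart of the same general kind as the one described here.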

The Roslyn chart above makes a couple of interesting points. Foremost is that when Microsoft made the project public, many more developers contributed code, and each change was accompanied by a broader, more insightful discussion. It also shows a drop in productivity during the summer of 2015. The volume of comments represents the degree of collaboration. The chart below shows that as project size increases, comments grow to coordinate decisions as issues move from draft to a final merge and a release. The transparency of the development process improves the quality of review, as well as the decisions of those managing features and fixes, which would not be possible if the project were private.

 
