
Archive for July, 2009

Beginning Open Source

If you are a beginner at open source, you may run into so many tools associated with its technical infrastructure that you feel overwhelmed by the sheer number of tools and the jargon you have to become familiar with. At least this happened to me when I started on my open source journey not so long ago. The fact that it was also the first time I was exposed to industrial-level software development didn’t help either. Anyway, after riding through the initial learning curve I now realize how vital these systems are for an open source project. So I thought of writing a post about what I have learned so far about the tools that typically constitute an open source software development infrastructure. The items mentioned here are not exclusive to open source, though: any form of large-scale software development needs these tools to enable effective collaboration among developers and to monitor the project.

What an open source project typically needs

– A Web site

– Version control software

– A Bug Tracker

– A Wiki

– A Mailing List

– An IRC channel

Web Site

This is a must for any serious software project. It is the central point from which information about the project is disseminated to the world. The project software is also hosted on the site so that users can download and use it. In addition, user guides and other information such as project mailing lists, source repository URLs etc. are given on the site. So an informative and easily navigable web site is a must for a good open source project.

Version Control Software

Version control is a system which enables tracking and controlling changes to a project’s files, in particular source code, documentation, and web pages. One long-standing version control system is CVS, but recently Subversion has become popular, with quite a few new and existing projects adopting it. Version control systems carry their own set of jargon which anyone using them has to be familiar with.

Repository – Database where the source files and their changes are stored. Some version control systems have centralized databases while others have decentralized databases.

Commit – To make a change to the sources in the version control database so that it can be incorporated into future releases of the project. People with the authority to commit to the database are called committers, and committership usually comes with political powers in the project, like voting rights during project votes. Usually you are made a committer when the community agrees via a vote that you are satisfactorily familiar with the project, after inspecting the amount of work you have done in the project up until the vote. Until you gain committership you can only provide patches, so that other committers may review them and then incorporate them into the code base using their committership powers.

Patch – A patch is a text file showing the differences you made to the project sources. It is sent to the project mailing list or submitted to the project’s issue tracker, according to the project’s policy for submitting patches. Providing patches is usually the entry point for a newcomer to start making contributions to the project. A patch is usually produced by making a diff on the working copy on your machine. See below for the definitions of diff and working copy.

Diff – This is a textual representation of a change. A diff shows which lines were changed and how, plus a few lines of surrounding context on either side. Usually this is synonymous with a patch. Version control software usually has a command with a similar name to create a diff from a changed source (e.g. in Subversion it is svn diff).
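
To make the idea of a diff concrete, here is a minimal sketch in Python using the standard difflib module. It is not how any particular version control system produces its diffs; the file name hello.txt, the function name make_patch and the sample text are made up for illustration.

```python
import difflib


def make_patch(old_text: str, new_text: str, filename: str = "hello.txt") -> str:
    """Return a unified diff describing how old_text becomes new_text.

    `filename` is a hypothetical name used only in the diff header.
    """
    diff_lines = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=f"a/{filename}",
        tofile=f"b/{filename}",
        n=3,  # lines of unchanged context kept around each change
    )
    return "".join(diff_lines)


old = "Hello\nWorld\nGoodbye\n"
new = "Hello\nOpen Source World\nGoodbye\n"
patch = make_patch(old, new)
print(patch)
```

In the printed text, lines starting with "-" were removed, lines starting with "+" were added, and lines starting with a space are the surrounding context. Saved to a file, this text is the kind of thing you would send as a patch.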

Checkout – This is the process of obtaining a copy of the project from the project repository. This produces a directory tree, called a working copy, on the local machine.

Working copy – The developer’s private directory tree containing project sources. A working copy also contains metadata managed by the version control system, telling the working copy what repository it comes from, what revisions (see below) of the files are present, etc. Generally, each developer has their own working copy, in which they make and test changes, and from which they commit.

Revision – A revision is one specific state that a file or directory is in or has been in. For example, if a file starts out at revision 1, then after someone commits a change to the file this produces revision 2 of the same file.
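
The relationship between commits and revisions can be sketched with a toy, in-memory model. The class ToyFileHistory below is purely hypothetical and tracks a single file; real systems store changes far more cleverly, and some number revisions per repository rather than per file.

```python
from typing import List, Optional


class ToyFileHistory:
    """Hypothetical in-memory history of a single file (not a real VCS API)."""

    def __init__(self) -> None:
        self._contents: List[str] = []  # position 0 holds revision 1

    def commit(self, content: str) -> int:
        """Store a new state of the file and return its revision number."""
        self._contents.append(content)
        return len(self._contents)

    def checkout(self, revision: Optional[int] = None) -> str:
        """Return the file as of `revision` (the latest one if omitted)."""
        if not self._contents:
            raise ValueError("nothing has been committed yet")
        if revision is None:
            revision = len(self._contents)
        if not 1 <= revision <= len(self._contents):
            raise ValueError(f"no such revision: {revision}")
        return self._contents[revision - 1]


history = ToyFileHistory()
first = history.commit("print('hello')\n")          # becomes revision 1
second = history.commit("print('hello, world')\n")  # becomes revision 2
print(first, second)
print(history.checkout(1), end="")
```

Every commit adds a new revision instead of overwriting the old one, which is why any earlier state of the file can still be retrieved later.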

Branch – A copy of the project under version control. Commits to a branch don’t affect other copies present in the repository or the main project directory tree, which is usually called the trunk. This isolates a line of development from the main development of the project.

There are several reasons for branches to exist in a project. Some of them may be

* The development work carried under the branch may be experimental and not in the regular lines of the project.

* A release of the project is near, so changes to the trunk are forbidden in order to keep the trunk code stable prior to the release, and ongoing development moves to a branch.

* Conversely, a branch can be used as a place to stabilize a new release. In this case typically only changes approved for the release are allowed on the release branch, while regular development work goes on in the main branch.

Merge – The final aim of creating a new branch is usually to merge it back into the main branch, so that the changes made in the branch are transferred to the main branch. When a branch reaches a stage where it seems stable, developers can merge its changes into the trunk so that the trunk gains the enhancements.

Conflict – This happens when two people try to make different changes to the same part of the same source. The version control system detects the conflict and notifies the users, and it is then up to the users to sort out the conflict between them and resolve it accordingly in the version control system.
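
Here is a rough sketch, in Python, of how a conflict can be detected when two sets of changes are merged. It is a deliberately simplified line-by-line three-way comparison that assumes nobody added or removed lines; real version control systems use much more capable merge algorithms. The function name three_way_merge and the sample data are made up for illustration.

```python
from typing import List, Tuple


def three_way_merge(
    base: List[str], mine: List[str], theirs: List[str]
) -> Tuple[List[str], List[int]]:
    """Merge two edited copies of `base`, line by line.

    Returns the merged lines and the 1-based numbers of conflicting lines.
    Simplifying assumption: all three versions have the same number of lines.
    """
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("this sketch only handles edits that keep the line count")
    merged: List[str] = []
    conflicts: List[int] = []
    for number, (b, m, t) in enumerate(zip(base, mine, theirs), start=1):
        if m == t:        # both sides agree (or neither changed the line)
            merged.append(m)
        elif m == b:      # only the other developer changed this line
            merged.append(t)
        elif t == b:      # only I changed this line
            merged.append(m)
        else:             # both changed it differently: a conflict
            merged.append(b)  # keep the base line until a human decides
            conflicts.append(number)
    return merged, conflicts


base = ["a", "b", "c"]
clean = three_way_merge(base, ["a", "B", "c"], ["a", "b", "C"])
clash = three_way_merge(base, ["a", "X", "c"], ["a", "Y", "c"])
print(clean)
print(clash)
```

Changes to different lines merge cleanly, while two different changes to the same line are reported as a conflict for the developers to resolve by hand.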

Bug Tracker

Bug trackers, or more correctly issue trackers, are used for reporting bugs and tracking their status, handling feature requests and submitting patches. The project developer mailing list is usually linked with the issue tracker, so that once a user or a developer creates or resolves an issue or submits a patch, a mail is sent to the developer list announcing it. Once an issue is created it is in the open state. Then a developer may assign the bug to himself, or it may have been assigned to him when the bug was created. Others can comment on the issue, on how it can be solved or on its effect on the project. Then the bug is reproduced and diagnosed using the information provided in the bug report. A developer then creates a fix and submits it as a patch for community review, or as a direct commit if he has the power to do so. Alternatively, the bug gets scheduled for a future release, depending on its nature (for example, when the fix requires extensive rework of some code that is not possible within the time constraints of the current release, and the bug is not critical enough to be worth the trouble).

A Wiki

This is not a must and some projects may not have a wiki, though having one is worthwhile if the project has a lot of documentation requirements. A wiki is a web site that allows any visitor to edit or extend its content. It can be a place for documents that grow over time and need user input, like FAQs. The refined documents can then be extracted from the wiki and transferred to the project web site.

IRC

Most projects offer real-time chat rooms using Internet Relay Chat (IRC). This can be used as a place where users and developers can ask each other questions and get instant responses. But this is not a must and some projects don’t have their own IRC channel.

Mailing List

The beauty of open source software development is that it brings different individuals from across the globe into a single development team, so an effective communication medium is necessary. Mailing lists are the nuts and bolts of communication in the open source world. Anyone interested in the project can subscribe to the project mailing list and receive the mails sent to the list. Most projects have multiple mailing lists, and it is usual to have at least two, called the developer and user mailing lists. The developer mailing list is usually where bug reports and messages generated by the version control system are sent, though these can alternatively go to a separate list. Developers also discuss project development topics and architectural issues on the developer mailing list. The user mailing list is for users, who can ask questions there about general usage of the software or about issues they face when using it.

Credits

Though I had to learn most of the above stuff in an ad hoc manner as I got along with coding, one fine day I got hold of the book “Producing Open Source Software” by Karl Fogel, which pretty much consolidated my understanding of how things work in open source. It is really a great book and talks about many other things regarding open source, though I found that the above is what is really needed to get a good grasp of the basic technical infrastructure.

Conclusion

So, as can be seen, each component described above plays an important role in the development of open source software. I hope this rather long post proves useful for someone coming into the open source software development world. Any comments about things I may have missed or stated wrongly are welcome as usual.


Read Full Post »

When I first got introduced to the free and open source software concept not so long ago, my first impression was that the software produced ought to be freely distributed with its sources publicly available, so that it comes free of charge. The immediate question that followed in my mind, as is usual for others being introduced to the concept as well, was: “Well, how do the guys in open source earn money?”

Then I was told, “Well, people can provide support services for the users of the software by means of customer service, training, consulting etc. and earn money from it.” That settled the question of the open source business model for me, though I was skeptical about the returns companies would gain without the bulky initial earnings from the software itself, as is the case for proprietary software. Anyway, I didn’t give much thought to it afterwards, though I got involved in some open source software projects. That was until I stumbled upon this lecture by Richard Stallman, the founder of the Free Software movement. Only then did I realize that this concept is based on a set of profound philosophical ideas, and also that the term itself is not immune to controversy.

Anyway, here is what I learned from him in that lecture.

Moral Dilemma

In his talk he describes the freedoms that he thinks should be present in software usage for a “community of sharing and cooperation” to be built around it, without the evils of the “moral dilemma” that users of proprietary software face. The “moral dilemma” he describes goes like this: “Suppose you installed on your machine a copyrighted piece of software which you bought recently. You show it off to your friend, and after using it himself on your machine he says that it is way too cool for you to have it alone, and asks for a copy for his own use. Right then you are pushed into a dilemma. You are torn between two moral evils: on one side, not being able to help out a friend and being selfish if you reject his request on the basis of the software’s copyright; on the other side, violating that copyright if you give it to him, making you what they call a ‘pirate’.”

Four Freedoms of software

“Saving software users from this dilemma”, he says, is what made him start the GNU project. He goes on to say that he believes there are four freedoms that any software should offer its users in order to achieve this goal.

Freedom zero: The freedom to run the program as you wish.

The user has the choice of running the software to fulfill his own purposes, not any purpose pushed on to him by the developer or anyone else for that matter. So he should be able to set up and run the program the way he wants.

Freedom one: The freedom to study the source code and change it so that it does what you wish

This empowers the user to make the software work in the way best suited to his requirements. Even if the user is not a programmer, he can still hire a knowledgeable professional to do the job. Nobody can object to that, since this freedom is granted with the software itself. Another alternative would be to make his requirements known to the software developer community; if the community finds them useful to a broader user base and to the project itself, it will add or modify features of the software according to the request. This is achievable partly due to Freedom three, mentioned below, which allows modified versions to be shared with the community.

Freedom two: The freedom to help your neighbor.

You should be able to sell, or give free of charge, an exact copy of the software to anyone requiring it. This saves the software user from the above “moral dilemma”. Another point worth noting is that in this regard he doesn’t seem to object to making money out of selling the software itself. But common sense suggests that this would hardly be practical, since nothing stops buyers of your software from giving away free copies of what you are selling, bringing the price down to zero. This cleared up a misconception I had that free software is called free because it is zero cost. In fact, I now realise there is zero-cost software that is not free according to this definition, the best examples being Internet Explorer and the early Netscape Navigator browsers. Both of them were free of charge, but the source code was not available to the public, so users had to depend on the respective companies for bug fixes and enhancements. The ambiguity of the word “free” led people to introduce a phrase to describe the concept of free software: “free as in free speech, not as in free beer”. So it is a matter of liberty, not price. Zero-cost software is just a coincidental side effect.

Freedom three: The freedom to contribute to your community.

You should be able to make modifications and contribute them back to the community. This keeps the spirit of sharing alive and allows both programmer and non-programmer users to reap the benefits of others’ work.

Open Source

The officially coined term “Free Software” didn’t gain much popularity with the corporate world because “free” was associated with lower-grade, cheap or even unethical products. So the term “Open Source” was introduced, describing another facet of free software. The marketing gimmick paid off big time, and corporate acceptance of open source software is now much higher than it used to be, to which the Apache web server bears fine testimony.

A matter of perspective

No doubt this is just one person’s perspective of what software should be, and it is not immune to criticism from other ideologies. It seems there are disagreements among different factions of open source software developers about what open source software should be, and it is partly a matter of personal judgment. Anyhow, the lecture left me with a much broader perspective of what Free Software is.


Read Full Post »

To start off my WordPress blogging I decided that a desktop blog authoring tool, which would let me write posts offline, would be a great thing to try out. To my surprise the range of choices for WordPress blogging on Linux was not that large. After a pretty long search I decided to try out Bleezer, which seems to support most of the features I required. I had some trouble setting up my WordPress blog with Bleezer initially, so I decided to share the information required to set it up.

Setting up WordPress with Bleezer

In the accounts section of the preferences it asks for host, path and port values. For example, if your blog is http://www.myblog.wordpress.com then the respective values should be:

– Host : myblog.wordpress.com (note that www is not necessary)

– Path : xmlrpc.php

– Port : 80

Explanation

The host and port parts hardly deserve an explanation. The key is the path value, which is xmlrpc.php. WordPress supports desktop blog tools via an XML-RPC API called the MetaWeblog API. This API enables weblog entries to be posted, edited and deleted through an XML-RPC web service, which provides methods for these actions. In the earlier example the XML-RPC web service endpoint would be http://www.myblog.wordpress.com/xmlrpc.php.
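
For the curious, here is a small sketch in Python of what a desktop client sends to that endpoint. It only builds the XML-RPC request for the MetaWeblog method metaWeblog.newPost with the standard xmlrpc.client module and does not contact any server. The blog id "1", the user name "alice", the password and the post text are made-up placeholders, and I am not claiming this is how Bleezer itself is implemented.

```python
import xmlrpc.client


def endpoint_url(host: str, path: str, port: int) -> str:
    """Combine the three account settings into the XML-RPC endpoint URL."""
    return f"http://{host}:{port}/{path}"


def build_new_post_request(
    blog_id: str, username: str, password: str,
    title: str, body: str, publish: bool = False,
) -> str:
    """Return the XML body of a metaWeblog.newPost call (nothing is sent)."""
    post = {"title": title, "description": body}  # MetaWeblog post struct
    params = (blog_id, username, password, post, publish)
    return xmlrpc.client.dumps(params, methodname="metaWeblog.newPost")


url = endpoint_url("myblog.wordpress.com", "xmlrpc.php", 80)
request_xml = build_new_post_request(
    "1", "alice", "secret",  # placeholder blog id and credentials
    "Hello", "<p>First post</p>",
)
print(url)
print(request_xml)
```

To actually publish, a client would make the same call over HTTP, for example with xmlrpc.client.ServerProxy(url).metaWeblog.newPost(...), using real credentials.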

About Bleezer experience

Now a little bit about Bleezer itself.

Pros

– Works on Linux (lack of Linux support was the deal breaker for most of the other apps)

– Ability to upload images

– Category and Tag support

– WYSIWYG editing.

– Ping Technorati, Weblog.com etc.

Cons

– When publishing posts online it sometimes seems to get stuck for a while without any report on the progress of the operation, which is not the most user-friendly behaviour for a desktop app.

– When a post with an embedded image is previewed, only a placeholder is shown instead of the image.

So far this is what I found out working with Bleezer. Think I have missed some important points? Just let me know.

Read Full Post »

First Post

Here in this blog I will be thinking aloud about my open source software experience. Feel free to check it out and comment on what you think of my thoughts. Have a nice time.

Read Full Post »