
Hobby Public Radio

Your ideas, projects, opinions - podcasted.

New episodes Monday through Friday.


Correspondent

bjb

Host ID: 357

email: bjb.nospam@nospam.sourcerer.ca
episodes: 2

hpr2619 :: A Gentle Introduction to Quilt

Released on 2018-08-16 under a CC-BY-SA license.

A gentle introduction to quilt

Or, patch management for software.

Speaker Intro

Hi, I'm bjb. I'm a programmer.

Motivation and topic intro

I needed to learn how to use the software tool "quilt", so you get to listen to my podcast about an introduction to quilt.

People collaborating on a project must edit the same set of source files. After one person commits some changes, then the other people must rebase their own changes on the new version of the shared files before they can push their changes.

A minor fix for some old typo should not be in the same patch as a new feature; a comment correction should also be in its own patch. Essentially, two new features and some bug fixes should not all be smushed together in one patch. Each feature should be in its own patch (or patch series), and each bug fix should be in its own patch. This allows others to be able to review the proposed changes easily, and even lets them pick and choose which patches they want to apply. It becomes a chore to manage all these patches. That's where quilt comes in.

Sadly, I hadn't learned quilt till this weekend ... well, one way to ensure I learn it fairly well is to write an HPR episode about it! Here goes.

I have written this episode to be understandable by anyone - you don't have to be a coder. You could use this tool to keep track of any plain-text files - recipes, todo lists, html, hpr show notes, poetry, what-have-you.

Introduction

First let's describe what a patch is. No, first let's describe what source code looks like. Source code is a plain text file full of computer instructions. It is a plain text file, as opposed to a word processing file. Plain text files do not have any formatting codes or styles in them (such as which font should be used, or what colour, and so on). They just contain the characters that make up words of the content.

A key feature of these source code files is that a new section of the file starts on a new line. The source code is almost never "reflowed" like prose might be. It is sort of like poetry - the more formal poetry, not prose poetry. There are a lot of really small sections in source code files (called "statements" and "expressions"). Most of these sections fit on one line. This is useful for the tools we're going to discuss because when one line changes it does not affect the following lines, as it might when text is reflowed after a change.

People have been coding with plain text files in various languages for decades. Thus a large set of tooling has grown around this format. One of those tools is called "diff" and another one is called "patch".

Diff is a way to compare two text files. Typically it would be used to compare the "before" and "after" of a source code file undergoing changes. So you could find out what was done to the source code file by running diff on the before and after versions of that file.

A diff file is a series of excerpts from the original and changed files. There are various kinds of diffs. Some of them show only the changed lines. Some of them show a few lines before and a few lines after in addition to the changed lines themselves. That second kind is called a "context diff" and helps the automated machinery (and humans too) find the correct part of the file to which the change must be applied.

By default there are 3 lines of context before and after the changed lines.

The changed part is represented by including the old AND new line. In order to distinguish which lines are old and which are the replacements, all the lines (context lines, removed lines and added lines) are shifted over to the right by one character. The context lines start with a space in the extra left-most character, the original removed lines have a minus sign in the left-most character and the new added lines have a plus sign.

Thus if any character on a line in the source file has changed, been added or removed, then the whole line will be replaced with a new line in the new file. The diff will show both the removed line and the new one.
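To make the format concrete, here is a small sketch. The file names before.txt, after.txt and recipe.diff and their contents are made up for illustration. It builds two versions of a three-line file and asks diff for the unified format (the -u option), which is the space/minus/plus layout described above.

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# "before" and "after" versions of a tiny file (made-up content).
printf 'flour\nsugar\neggs\n' > before.txt
printf 'flour\nhoney\neggs\n' > after.txt

# diff exits with status 1 when the files differ; that is not an error,
# so the "|| true" keeps a strict shell from stopping here.
diff -u before.txt after.txt > recipe.diff || true
cat recipe.diff
```

In the output, flour and eggs should appear as context lines starting with a space, sugar as a removed line starting with a minus sign, and honey as an added line starting with a plus sign.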

The patch utility takes the "diff" output and applies it to the original file to produce the later version of that file. You can apply it in reverse mode to the later version to get the original version. So patch is also a really useful program, and these two tools, diff and patch, are the basis of most of the version control systems out there. It is the existence of these text-based diff and patch tools that makes revision control systems work really well on plain-text files that are naturally structured in a line-by-line format.

A note about terminology: the diff program produces a diff. This diff is also called a patch. The patch program takes the diff (aka patch) and applies it to the original file to produce the changed file.

So if you have a timeline of adding a few features and making a few fixes on a code-base, it can be fully described by the original file plus a set of patches that had been produced with diff. You can get the final source code by taking the original file, applying the patches one by one, and voila, the final version of the file has been recreated.
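The idea of "original file plus a set of patches" can be tried out by hand before bringing quilt into it. This is only a sketch: the file names v1.txt, v2.txt, v3.txt, work.txt and the two step files are made up, and it assumes the diff and patch programs are installed.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Three successive versions of the same file (made-up content).
printf 'one\ntwo\nthree\n' > v1.txt
printf 'one\nTWO\nthree\n' > v2.txt
printf 'one\nTWO\nthree\nfour\n' > v3.txt

# One patch per step of the timeline (diff exits 1 when files differ).
diff -u v1.txt v2.txt > step1.diff || true
diff -u v2.txt v3.txt > step2.diff || true

# Start from the original and apply the patches one by one.
cp v1.txt work.txt
patch work.txt < step1.diff
patch work.txt < step2.diff
cmp work.txt v3.txt && echo "work.txt now matches the final version"

# Apply the patches in reverse, last one first, to get back to the original.
patch -R work.txt < step2.diff
patch -R work.txt < step1.diff
cmp work.txt v1.txt && echo "work.txt is back to the original"
```

Keeping track of which patches are applied, and in which order, is the bookkeeping that quilt takes over.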

Now we know enough to give a concise description of quilt:

Quilt lets you work with patches, creating them, applying them, un-applying them, and moving some things from one patch to another with a minimum of effort.

How to use quilt

Now a tutorial on how to get started using quilt.

This tutorial will start with a buggy program, create a few bad patches, and fix them up into good patches. I make no claims as to the quality of the final code though. The reason for starting with bad code and patches is to illustrate how to use quilt.

Starting to use quilt on a project

To start using quilt, create a directory called "patches" at the top of your code or just above.

$ mkdir patches

If you don't do this, quilt will create it for you. However, first it will look for a directory called "patches" in the current working directory, its parent, and all the way up ... if it finds one, it will use it. If not, it will create one in the current directory.

So, to keep it from finding some unrelated directory with the name "patches", just create a patches directory yourself in the right place.
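To see why a stray directory can be picked up, here is a rough sketch of the upward search described above. It is not quilt's own code: the function name find_patches_dir and the scratch directories are made up for illustration.

```shell
# Hypothetical helper, not part of quilt: print the nearest directory,
# starting at $1 (default: the current directory) and walking upwards,
# that contains a subdirectory called "patches".
find_patches_dir() {
    local dir
    dir=$(cd "${1:-.}" && pwd) || return 1
    while :; do
        if [ -d "$dir/patches" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
        # Stop once we have reached the root of the file system.
        [ "$dir" = "/" ] && return 1
        dir=$(dirname "$dir")
    done
}

# Example: a "patches" directory two levels up is the one that gets found.
top=$(mktemp -d)
mkdir -p "$top/patches" "$top/project/src"
find_patches_dir "$top/project/src"    # prints the path stored in $top
```

Creating your own patches directory in the project means the search stops there instead of wandering further up.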

Quilt first patch, including a new file!

You must tell quilt before you make any changes to your source code. Then it can store the original versions of the files that will change, so it can produce the diffs that will become that patch once you change the files.

Create a directory called example, and create a file in it like this, called hello.c (don't fix the errors):

#include "stdio.h"

int main (int argc, char *argv[], char *env[])
{
    print ("Hello, world!\n")
    return 0;
}

Now create a new patch - that is, give it a name - before you change any code. This will create (or find) a couple of directories, "patches" and ".pc", and populate them with some files to start.

$ quilt new fix-typo

And now you can fix the typo and generate the patch. First start by telling quilt that you want hello.c to be in the patch. Quilt saves a copy of it aside for comparing with the later versions:

$ quilt add hello.c

You can get quilt to tell you what files it knows about:

$ quilt files

Edit the file - add a semicolon at the end of the print line, and change the double-quotes on the #include line to angle brackets:

#include <stdio.h>
print ("Hello, world!\n");

Save the file and exit the editor. Next generate the patch:

$ quilt refresh

The oddly named "refresh" command creates the patch itself. It is called "refresh" because it can also be used to update the patch.

Now you can see the current set of patches by giving the command:

$ quilt series

The single patch is called fix-typo, and its name in the list is coloured brownish. That is because it is the "current" patch, and it is the one that will be updated if you "quilt refresh" again with more changes.

One thing I did not find in the quilt documentation is how to add a new file. When adding a new file, there is no existing file that you can name in the quilt add command. Of course, the very first patch I wanted to manage with quilt, I had introduced a new file. It turns out that the quilt edit command can be used to add a file to the patch, even if the file does not yet exist:

$ quilt edit header.h

Add content to header.h (see below) using the plain-text editor that quilt has started up for you. Save the file.

#ifndef HEADER_HH__
#define HEADER_HH__

#define NAME "bjb"

#endif

Regenerate the patch with the new changes:

$ quilt refresh

Now you can list the patch series again with quilt series. So far there is one patch. You can see what the patch consists of with the

$ quilt diff

command.

$ quilt diff
Index: hello/hello.c
===================================================================
--- hello.orig/hello.c
+++ hello/hello.c
@@ -1,8 +1,8 @@
-#include "stdio.h"
+#include <stdio.h>

 int main (int argc, char *argv[], char *env[])
 {
-    print ("Hello, world!\n")
+    print ("Hello, world!\n");
     return 0;
 }

Index: hello/header.h
===================================================================
--- /dev/null
+++ hello/header.h
@@ -0,0 +1,7 @@
+#ifndef HEADER_HH__
+#define HEADER_HH__
+
+#define NAME "bjb"
+
+#endif
+
$

Quilt second patch

Now it is time to make a second patch. First we tell quilt we are moving to a new patch:

$ quilt new prototype
$ quilt edit header.h

Edit this file again - add a function prototype.

int do_output(const char *name);

Create the patch and look at the list of patches:

$ quilt refresh
$ quilt series

Now when we give the quilt series command, we see two patches. The first one is green, meaning it has been applied, and the second one is brown, meaning this is the one that quilt refresh will change if you call it.

Again you can see what the latest diff looks like by giving the quilt diff command.

$ quilt diff

Now let's unapply the latest diff:

$ quilt pop
$ quilt series

We see that the list of patches has the same patches in it, but now the second patch is white (meaning unapplied) and the first patch is brown (meaning it is the one that would change if we edited a file and typed quilt refresh).

$ quilt files

That first patch has two files in it, hello.c and header.h.

Now unapply the first diff:

$ quilt pop
$ quilt series

Both patches are listed, and both are shown as white.

We can see what files quilt knows about before any patches are applied:

$ quilt files

No files.

Apply all the patches at once:

$ quilt push -a
$ quilt series

And look at what files quilt knows about:

$ quilt files

Now quilt reports on only one file, while in the first patch it knew about two files. You must be careful to "add" each file to each patch, or quilt will not put the changes to those files into the patch. Luckily, quilt edit will put the files in the patch for you, so if you always start your editor with quilt edit fname, your changed files will be added to your patches without any other action on your part. If you want to add an existing file to the patch without opening your editor, use the quilt add command:

$ quilt add fname

In order to avoid forgetting to add a file in a patch as I was editing, I just added all the files in the directory each time I created a new patch, whether I edited them or not.
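That habit could be wrapped up in a small shell function. This is only a sketch of one way to do it: the function name quilt_new_with_all_files is made up, and it assumes you run it from the directory that holds your source files.

```shell
# Hypothetical convenience wrapper (the name is made up): start a new patch
# and immediately add every regular file in the current directory to it.
quilt_new_with_all_files() {
    local patch_name=$1 f
    quilt new "$patch_name" || return 1
    for f in *; do
        # Skip directories such as "patches"; only plain files are added.
        if [ -f "$f" ]; then
            quilt add "$f"
        fi
    done
}
```

You would call it as quilt_new_with_all_files fix-typo in place of quilt new fix-typo. Hidden files and subdirectories are left alone, so files deeper in the tree still need their own quilt add or quilt edit.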

Split a patch in two parts

We are going to split the first patch in two parts. We had fixed a typo and added a new file in one patch. They should be two separate patches.

First make the first patch current:

$ quilt pop

Then make a copy of that patch:

$ quilt fork

This makes a copy of the first patch called fix-typo-2. But, it removes the first patch fix-typo and puts fix-typo-2 in the series. We need to put the first patch back, and then edit each of the two fix-typo patches so each one contains one part of the original patch.

# edit patches/series file and put the first patch back
# The file should contain:

fix-typo
fix-typo-2
prototype

Now edit the first patch using a plain-text editor. It is in patches/fix-typo. Remove the part about the new file, header.h. It should now look like:

Index: hello/hello.c
===================================================================
--- hello.orig/hello.c
+++ hello/hello.c
@@ -1,4 +1,4 @@
-#include "stdio.h"
+#include <stdio.h>

 int main (int argc, char *argv[], char *env[])
 {

Save this file. Now edit the second patch patches/fix-typo-2 using a plain-text editor. Remove the part about the file hello.c. It should now look like:

Index: hello/header.h
===================================================================
--- /dev/null
+++ hello/header.h
@@ -0,0 +1,7 @@
+#ifndef HEADER_HH__
+#define HEADER_HH__
+
+#define NAME "bjb"
+
+#endif
+

If you give a quilt series command now, you will see that fix-typo-2 is the current patch and quilt thinks fix-typo has been applied.

We have to fix up quilt's idea of reality.

Pop the current patch. Things have changed under quilt's feet so we have to force this with the -f option:

$ quilt pop -f

Now, because quilt thought the original state of fix-typo-2 is the unchanged file, quilt shows the series as being completely un-applied.

$ quilt series
patches/fix-typo
patches/fix-typo-2
patches/prototype

Now we can push the patches:

$ quilt push -a

Rename a patch

Here we rename a patch from fix-typo-2 to add-header. The quilt rename command acts on the current patch, so make fix-typo-2 current first:

$ quilt pop fix-typo-2
$ quilt rename add-header
$ quilt series
$ quilt push -a

Reorder the patch series

We will make a new patch, then move it earlier in the series:

First make the new patch:

$ quilt new printf
$ quilt edit hello.c

And change the print statement to:

printf("Hello, world!\n");

Save the patch:

$ quilt refresh

Now to demonstrate the reordering.

Unapply all the patches, edit the patches series file patches/series so the patches are in the order you like, and then re-apply the patches. If you are lucky, they will re-apply with no conflicts.

$ quilt pop -a
$ vi patches/series
# move "printf" between fix-typo and add-header.
# now all the bug-fixes are at the beginning of the series
$ quilt push -a

Merge two patches into one

Make another new patch:

$ quilt new output-function
$ quilt edit hello.c

Change the c file to this:

#include <stdio.h>
#include "header.h"   /* for NAME and the do_output prototype */

int do_output(const char *name)
{
    return printf("Hello, %s!\n", name);
}

int main (int argc, char *argv[], char *env[])
{
    /* ignoring the return code for do_output */
    do_output(NAME);
    return 0;
}

$ quilt refresh

Now, to merge two patches into one:

$ quilt pop prototype
$ quilt fold < patches/output-function

We have merged the prototype and output-function patches, because they describe a related change.

Save the patch.

$ quilt refresh

Throw away a patch

Now we no longer need the last patch, output-function, as it has been included into the prototype patch. But we might want to rename the prototype patch.

$ quilt delete output-function
# we have to clean up a bit for quilt or the rename won't work
$ rm patches/output-function
$ quilt rename output-function

Deleting will not work on a patch that has been applied before the current patch.

You are ready to contribute your patches ... go forth and code.

Summary

We have seen that quilt can help you manage your contributions to any project that is written in plain-text files. It can generate patch files (usually needed for contributions to open source projects) and can help you manage and update them as the tip of the development branch moves forward with other peoples' contributions.

To use quilt successfully, you need to remember to add files to each patch with quilt add or quilt edit before editing, and to generate the patch with quilt refresh once all the editing of each patch is done. The rest is easy.

Commands that edit the patches:

$ quilt new patch-name
$ quilt add fname
$ quilt edit fname
$ quilt refresh
$ quilt pop [-a]
$ quilt push [-a]
$ quilt rename [-P oldname] newname
$ quilt delete [-P patchname]
$ quilt fold < patch_to_merge

Commands that view the state of the patches:

$ quilt series
$ quilt files
$ quilt diff [-P patchname]
$ quilt graph [--all]
$ quilt patches fname
$ quilt annotate fname
$ quilt applied
$ quilt unapplied

HPR exhortation

You've been listening to Hacker Public Radio. Anyone can make a show - if I can do it, so can you.


hpr2322 :: A bit of background on virtualenvwrapper

Released on 2017-06-27 under a CC-BY-SA license.

A bit of background on virtualenvwrapper

Or, Linux processes, the process environment and the shell.

speaker intro

Hi, I'm bjb. I've been using Linux for wow, 20 years now.

motivation

knox gave a nice podcast on virtualenvwrapper - it was timely for me, I was just trying to use it the other day and not finding all the bits and pieces. So thank you for collecting that info in one place.

knox asked why virtualenvwrapper behaves as it does ...

introduction

virtualenvwrapper is a combination of bash functions and programs.

To understand how it works you need to know a little bit about bash and Linux.

I know there have been some very good earlier (and current!) HPR shows on bash. But bash is a huge topic. The man page for it was 3500 lines about 10 years ago ... now it is 4300-plus lines. It has a LOT of functionality, and when you're just trying to get something done, it's overwhelming to look at. So in this HPR episode, I will just answer one or two of knox's questions. It gives me an excuse to make an episode.

Also I'm not going to go too deep into the description. In order to keep the podcast short and to-the-point, I'm just going to cover what is needed. There is lots more depth - there are several shells you could use and I'm only going to talk about bash; at startup bash can read more than just the files I mention in this podcast ... I'm just not going to cover all the possibilities. That's what the over 4300 line man page is for :-). If you have questions, ask them in the comments, or make your own podcast and ask them! Maybe you'll get some answers - either from me or from another HPR community member.

environment for processes

A program that has no inputs is not flexible or powerful. As a simple example, a program that displays the results of a hard-coded search is certainly useful if you want to know about that hard-coded search term. But a program that can search for a term that you specify at run time is so much more useful. You do not have to recompile the program to change the search term.

Programs can receive inputs in several ways.

On Linux and other unix-like OSs, a program can be run with arguments, can read and write file descriptors (and that includes standard in, standard out and standard error), and can receive signals - and it has another input: the "environment". That is a bunch of key-value pairs that are made available to the program when it starts. Some examples of environment variables are PATH, HOME, EDITOR and PAGER. The name of the environment variable, 'PAGER', is the key, and the thing on the other side of the equals sign, like 'less', is the value - the pair make up a key-value entry in the environment.

People who program in C or C++ and maybe other languages know that the program starts with a main function, and that function has some parameters. The first one is a count of arguments and the second one is an array of strings, each string being one of the arguments passed to the program when it is launched. There is a little-known optional third parameter: an array of strings that represents the "environment".

The way the program gets these strings is that it inherits them from its parent process. The parent process of programs that are run from the command line is ... the command line itself, bash. Or csh, or whatever your shell is. When the program starts, it gets a copy of the exported parts of the environment of its parent.

environment in bash

Bash gives you the ability to set these environment variables and mark them as "available for handing to subprocesses", and that is what is happening when you give that "export" command.

You can view all the currently defined variables that have been marked for export by using the "env" command with no arguments. E N V - echo november victor. Or, env, short for environment.
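Here is a short sketch of the difference between a variable that is merely set and one that has been exported. The variable names exported and unexported are made up for illustration.

```shell
unexported=hidden            # plain shell variable, not exported
export exported=visible      # marked for handing to subprocesses

# The child process only receives the exported variable.
bash -c 'echo "child sees: exported=${exported:-<unset>} unexported=${unexported:-<unset>}"'

# env lists the exported variables of the current shell, so only one of
# the two names shows up here.
env | grep -E '^(exported|unexported)='
```

The child reports the unexported variable as unset, and env only lists the exported one.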

Since these variables are passed down the generations from parent to child, it is usually sufficient to define them once at the top level.

The command line itself is a program called bash. It reads some files at startup.

As an example of the "generations", you can call bash from within bash. And you can call bash again from within that bash. Then the first bash is the parent of the second one, and the second one is the parent of the third. The third bash is the child of the second.

You can see the environment changing: Set a variable fred=one in the first shell and export it:

export fred=one

then run bash. In that bash you can echo $fred, and see that fred is one. Now you can change fred to two:

export fred=two

and run the third bash. In the third bash, you can see that fred is two:

echo $fred

now exit bash with the exit command.

If you echo $fred, you will see fred is still two, since we set it to two just before we ran the third bash. But if you exit again, you will be back to the first bash, and you will see that fred is now one. This is the environment that bash had, just before you launched the second bash. The second and third environments are gone - those processes terminated when the exit command was given on their prompts; and when they did, their environments were cleaned up and removed.
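If you would rather not type into three nested shells, the same experiment can be sketched as a script. Here bash -c stands in for starting each new shell by hand; the nesting and the messages are my own arrangement of the steps above.

```shell
# First shell: set fred and mark it for export.
export fred=one

# Second shell: it starts with a copy of fred, then changes its own copy
# and starts a third shell, which inherits the changed value.
bash -c 'echo "second shell sees fred=$fred"
         export fred=two
         bash -c "echo third shell sees fred=\$fred"'

# Back in the first shell: the children's changes are gone.
echo "first shell still has fred=$fred"
```

The second shell reports one, the third reports two, and the first shell still has one at the end.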

In the show notes, I have another exercise to help with understanding this environment thing.

Here's another exercise to illustrate this principle. Type bash and
enter, and you will be in a subshell. If you show a process listing
in a hierarchical format, with children indented from their parents,
you will see that the bash you are currently in is a child of
another bash. The command to see the list of running processes in
hierarchical format is:

    ps -efH | less

There are several bash processes. In order to pick out the bash
instance that I'm running, I look for the ps process, because it has
a unique string in the arguments: -efH. In the less session, search
for 'efH' by typing "/efH". The screen will jump to where the
ps -efH process is, and highlight the "efH" string that you searched
for. The line you searched for will be at the top of the display
... to see the few lines above, type "kkkk" (one k for each line to
move up). To exit from less, type q.

Go ahead and export another made-up variable - perhaps your street name:

   export CHESTNUT=rizwan

Make sure it is there with the env command:

   env | grep CHESTNUT

and then run another subshell, and search for it again:

   bash
   env | grep CHESTNUT

Exit the various shells with the "exit" command or by typing ^D. If
you exit the subshell, and the shell in which you created the
CHESTNUT environment variable, you can run the env command and
search for that environment variable - it will not be there. The
program in which the environment variable was created has terminated,
and its environment has been discarded.

bash startup files

When bash is a login shell, it reads ~/.bash_profile. When it is not a login shell, but some subshell of the login shell, it reads ~/.bashrc.

So for things that you only need to set once, you can put them in ~/.bash_profile. For things that you have to run for each new subshell, you put them in .bashrc.

(Note that most distributions will set up the user accounts so they will run ~/.bashrc from .bash_profile for interactive shells)

the PATH

This is important, because of two things. The first is the PATH. The PATH is one of the environment variables that is used by the system to look for executables. So if you want to run a program, it should be in one of the directories on the PATH, or you will have to specify the full path to the program when running it.

When you first get your account on a system, there is a default version of the .bashrc and .bash_profile files. In .bash_profile there should be a definition of the PATH. It contains the system directories like /usr/bin and /bin - you don't want to remove those from your path or your shell will become next to useless - you will have to use full paths for all commands. So the way that people add directories to the PATH is to assign the existing value of PATH to itself, plus the desired new directories. For example:

PATH=$PATH:/home/bjb/bin

But if you put this in .bashrc, then every subshell will have another copy of the directory /home/bjb/bin tacked onto the end of the PATH. So the right place to put this definition is in ~/.bash_profile, where it will be executed once and then inherited by all the subshells.
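If you do end up setting the PATH somewhere that runs more than once, a guard can keep the duplicates out. This is a sketch of a common shell idiom, not something virtualenvwrapper or the default startup files provide; the function name path_append is made up.

```shell
# Hypothetical helper (the name is made up): append a directory to PATH
# only if it is not already listed there.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                # already present: do nothing
        *) PATH="$PATH:$1" ;;
    esac
}

path_append /home/bjb/bin
path_append /home/bjb/bin    # second call leaves PATH unchanged
echo "$PATH"
```

Wrapping the PATH in colons on both sides lets the pattern match a directory at the start, middle or end of the list.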

shell functions and aliases

However not everything you need in the shell is inherited from the parent program. It turns out that another facility that bash supplies and that virtualenv uses is the ability to define and execute bash functions. Bash also has aliases.

A bash function is a series of bash commands that have been given a name, and that you can run by typing that name. It can also receive arguments that can influence how the function will behave. HPR episode 1757 by Dave Morriss called "Useful Bash Functions" talks about bash functions.

You can see the list of currently defined bash functions by using the bash command: declare -F

An alias is a simpler version of a function - it is (usually) just a shorter string to represent a longer or more complicated command, to make command line use easier (assuming you can remember all the aliases in the first place).

You can see the list of currently defined aliases by using the bash command: alias
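As a small illustration, here is one function and one alias, followed by the commands that show they are defined. The names greet and ll are made up for this example.

```shell
# A function: a named series of commands that can take arguments.
greet() {
    echo "Hello, $1!"
}

# An alias: a short name standing in for a longer command.
alias ll='ls -l'

greet world        # runs the function with one argument
declare -F greet   # prints the function name if it is defined
alias ll           # prints the alias definition
```

Giving declare -F or alias a name restricts the listing to that one entry; with no name they list everything that is currently defined.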

virtualenvwrapper makes use of bash functions. This has consequences.

the bash builtin command 'source'

One is that you need to define those functions in every subshell. That's why you need to put "source /usr/local/bin/virtualenvwrapper.sh" in your bashrc.

Well, it seems that on a Debian system virtualenvwrapper puts the workon shell function into your shell via a more convoluted route. I will describe it in the show notes. But in the end, the virtualenvwrapper package adds the function workon to your shell by sourcing the file /etc/bash_completion.d/virtualenvwrapper whenever .bashrc is sourced. (Note that "." is shorthand for the bash "source" built-in command.) The "workon" function is defined in /etc/bash_completion.d/virtualenvwrapper (the definition is about in the middle of the file).

- ~/.bashrc sources /etc/bash_completion or /usr/share/bash-completion/bash_completion
  (whichever one it finds first);
- which sources /usr/share/bash-completion/bash_completion;
- which sources all the files in /etc/bash_completion.d
- one of which is virtualenvwrapper.sh
- which defines the bash function workon.

Look at that: on a Debian system, "apt-cache show virtualenvwrapper" does indeed list bash-completion as a dependency. The virtualenvwrapper upstream does not assume you will be using command completion, and the comments at the top of the /etc/bash_completion.d/virtualenvwrapper file tell you to put "source .../virtualenvwrapper.sh" into your ~/.bashrc file.

A description of bash-completion could be a topic of another podcast (I'm not actually volunteering to do this one, heh, just suggesting it as a topic).

life cycle of environment

Another consequence is this: When you run a program, it will inherit a copy of the environment of its parent. When it is done, it will exit and that environment will disappear. So, you cannot run a program or subshell to try to affect your environment. It will affect the subshell or program environment, and as soon as the command is done, that updated environment will disappear.

The "source" built-in bash command is meant to allow you to run a bunch of commands in a file as if they had been typed on the command line. So you can put commands that affect the environment, and the environment will still have the changes when the sourcing is done.
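The difference between running a file and sourcing it is easy to try out. In this sketch the variable name PROJECT_DIR and its value are made up, and the file is a scratch file created just for the experiment.

```shell
# Start from a known state.
unset PROJECT_DIR

# A scratch file holding one command that changes the environment.
settings=$(mktemp)
echo 'export PROJECT_DIR=/tmp/demo-project' > "$settings"

# Run as a separate program: the change happens in a child process and is lost.
bash "$settings"
echo "after running:  PROJECT_DIR=${PROJECT_DIR:-<unset>}"

# Sourced: the command runs in the current shell, so the change persists.
source "$settings"
echo "after sourcing: PROJECT_DIR=$PROJECT_DIR"
```

After running the file as a program the variable is still unset; after sourcing it the variable holds the new value.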

back to virtualenvwrapper: conclusion

So, virtualenvwrapper is mainly changes to the environment. It consists of a few files that are stored in ~/.virtualenvs, with names like postactivate and premkvirtualenv. They are basically hooks to add functionality before and after the commands you would issue for virtualenv, so you can customize virtualenv.

To understand virtualenvwrapper, let's have a quick look at virtualenv first. The things you do with virtualenv are to create a virtualenv, destroy one, and activate one.

So the things you can do with virtualenvwrapper are to run some script or scriptlet before or after you create a virtualenv, destroy a virtualenv, or activate a virtualenv.

The main things to customize are where to find the activate file and what to do after activating (the "postactivate" hook).

It does this by setting environment variables (like PATH and PYTHONHOME) appropriately and by defining bash functions to do things like change directory to where the project is.

You just have to edit .virtualenvs/postactivate to contain the location of your project files. You also define WORKON_HOME to be the directory that contains all your virtualenvs (for me that is /usr/local/pythonenv, but for most people it will be some directory in their home directory).

Summary

virtualenv manipulates the environment in order to allow you to have different python setups for your different projects - handy if you have one project that depends on different versions of python packages than another project and you want to run both.

But virtualenv leaves a few rough edges, like leaving it up to you to find the virtualenv in order to source the activate script. That is where virtualenvwrapper comes in.

We have talked about the environment, and how virtualenvwrapper manipulates the environment to make it easier to work with the virtualenvs that you have created.

The environment refers to the set of environment variables that are defined and passed to child processes. We also discussed the process hierarchy and that a new environment is created for a new process, and it is destroyed when that process exits. We covered sourcing a file of shell commands, so that if those commands affect the environment, then when the sourcing is done, the environment left is the one that was changed and the changes persist past the source command. We talked about the .bash_profile and the .bashrc files.

HPR exhortation

You've been listening to Hacker Public Radio. Anyone can make a show - if I can do it, so can you.

