Thorbjørns stuff

tirsdag den 19. august 2014

Emacs: <dead-tilde> is undefined

It appears that Emacs is not handling the "~" - which is a dead key on Danish keyboards - correctly under Ubuntu 14.04, making it a bit hard to navigate to your home directory.

The following line in ~/.emacs.el fixes it:

(require 'iso-transl)

(Solution found on http://ubuntuforums.org/showthread.php?t=1472204&p=9234073#post9234073)

mandag den 21. juli 2014

HDHomeRun with Yousee: a8qam64-6900

I bought a second hand HDHomeRun, which apparently does not cater for that Yousee (our local cable tv provider) does things a little differently so you must explicitly mention:

~~a8qam64-6875~~
a8qam64-6900

For OS X, the command line after installing the SiliconDust software is

hdhomerun_config xxxx set /sys/dvbc_modulation "a8qam64-6900"

(where xxxxx is the deviceid found with "hdhomerun_config discover")

(EDIT 2014-12-27: Apparently the 6875 is close enough to almost work, but after experimenting with the 4 tuner DVB-C model I found that I got a lot better results with 6900, as prompted by the Silicon Dust Windows configuration software). Anyway EyeTV only works with the old model, not the new one.

Oneliner:

for i in $(hdhomerun_config discover| awk '{if ($2 = "device"){print $3}}'); do echo $i; hdhomerun_config $i set /sys/dvbc_modulation "a8qam64-6900"; done

torsdag den 17. april 2014

A new job - with a lot of the same and a lot of new things - resulting in Canonical ZIP.

After 9 years at Kewill/Four Soft/Transaxiom working with Java programs running on or against the AS/400 platform, I started working at 2014-04-01 at Digital Bevaring at Statsbiblioteket in Århus (Digital Preservation at the State and University Library in Aarhus, Denmark).

Initially I am full time on the Netarkivet project , which harvests the Danish part of the internet several times a year and stores it for future research as part of preserving the Danish internet heritage. Think Internet Archive/Wayback Machine for Danes.

The Netarchive Suite (which controls the Heretrix harvester and archives the harvested files) is Open Source and available at https://sbforge.org/display/NAS/NetarchiveSuite.

One of the pending maintenance tasks is to migrate from a rather evolved ant build to Maven (after migrating from subversion to git). There is a large set of tests in place but as the ant build generates several artifacts and Maven is designed to only generate one artifact for each Maven project it is important to reproduce the ant artifacts in the maven build.

As the build contains a lot of classes and a lot of tests the easiest way to programatically confirm the builds are the same is to simply see if the resulting jar/war files are binary identical. I have earlier found in my work making Jenkins Fingerprinting working with multi-step maven builds that Maven builds are simply not designed to be reproducible, and has the following problems:

Maven generated artifacts depend on where and when they were done and not only on the source file contents. Default property files enclosed in the artifacts have timestamps, but this can be turned off directly in pom.xml
All entries in a jar/war/ear file have a time stamp corresponding to when the file was written to disk. This needs to be explicitly set to the same value across all builds. The Maven jar plugin does not support this.
The order of the entries in the jar/war/ear files are non-deterministic by default making them uncomparable. A sorting order needs to be defined - it appears that much of this work has already been done in the pack200 repackaging utility. To my knowledge this is not supported in any of the default jar/war/ear creation mechanisms in ant or Maven.
The operating system platform indicator stored with each entry (which apparently is used for defining how platform dependent information is stored) needs to be normalized. A reasonable default would be the unix platform allowing for rwx-bits for scripts, or the fat platform to be as simple as possible.

After thinking it over the simplest approach is probably to rework the current ant build to avoid having any timestamps inside generated files, and create a new project capable of normalizing a zip-file. Inspired by the work done by James Clark to make XML files comparable - http://jclark.com/xml/canonxml.html - I think a good name for such a normalized zip-file is Canonical ZIP.

søndag den 29. april 2012

Ubuntu 12.04 on iMac G3

An elderly iMac G3 450 MHz came back in my possession after having been found too generally slow.

An ideal test subject for the latest Ubuntu 12.04 LTS on a non-mainstream computer (which has 1 GB RAM so it should be able to run Linux well).

The live cd does not configure X correctly, but the netboot image has a textmode installer, which installs a working X.

I found the following:

The Ubuntu 12.04 live cd does not configure X correctly so it was unusable on this particular iMac, but the netboot image has a textmode installer, which installs a working X.
The default X configuration shows a black screen when rebooted, but this model has a VGA connector in the back, so an external monitor could be attached showing the 1024x768 screen. This mean that the modelines deduced by the graphics driver do not hit the limited range supported by the built-in monitor.
The Rage128 graphics card is not supported by Mesa (the 3D component in X) anymore so all the graphic niceties are rendered in software, which is simply too slow. Switching to Unity-2D in the login screen (click the tiny Ubuntu logo) did not help noticeably.

Conclusion: Ubuntu 12.04 LTS expects much from the graphics card. Elderly machines without supported hardware accelleration does not give a good user experience.

søndag den 26. september 2010

How I Learned to Stop Worrying and Love the Eclipse "Format Source" Save Action...

One of the most interesting things in Eclipse in recent years (besides Team Project Sets and the standalone version of the compiler) was when "Editor Save Actions" were introduced, and that it allowed to run "Format Source" on every Java file when saved from within Eclipse.

This has proven to be quite disruptive to the way I write Java programs for several reasons. First let me explain why I think this is a good idea in the first place:

When you have code which may have lived a few years in a source repository with a bugfix once in a while it is inevitable that the code will need to be reindented inside a {}-block, or an array assignment has become to wide, and you groan and manually do the reformatting. The reason for the groaning is because you cannot just reformat the whole source file with Ctrl-Alt-F as there is some code elsewhere in the source file which has been manually reformatted ages ago, and the automatic formatter breaks this manual formatting.

Breaking is one thing, what is much worse is that if you change the source by reformatting, it means that the source control system (CVS; Subversion etc.) dutifully records this and everytime you in the future wants to know when that line was changed you get the time of the reformatting, not the actual last change to the source file. I have actually worked on a project where we were prohibited to use the formatter because of exactly this side-effect.

Naturally it wouldn't be a problem if everybody reformatted their source before comitting as that would mean that a later reformat wouldn't change anything and not count as a revision with regards to the source repository, and now Eclipse can help by doing it automatically. You never have to think about it again! This is so big a thing that it is hard to understand before you have actually used it for a while.

When working with code you can always just reformat the whole file at any time if your code gets messy. Don't worry about breaking formatting elsewhere - just do it!
After a while you move from Ctrl-Alt-F to reformat to just saving the file with Ctrl-S. If anything moves unexpectantly, you give it a closer look.
Code gets a common look. The individual styles everybody picks up, like spaces or no spaces around equal signs, or should the + be at the end of the line or at the beginning of the next line when having a multi line string concatenation, vanish automatically when the source is reformatted. This common look is also beneficial because everybody is used to reading that style and does not get annoyed with small things like missing spaces.
For this to work for a whole team everybody on the team needs to have exactly the same settings. The easiest way in Eclipse to do this, is just to accept the standard Eclipse settings, but you can set up the formatter as desired to conform with your local standard, and save the preferences to a file, which others then can load when starting a new workspace. Very useful, but unnecessary if you just stick with the Eclipse default.

This makes all very good sense on paper, but after a while it shows that the Eclipse formatter on default settings occasionally does some very unexpected indentations when formatting a wide complex expressions and needing to break it over multiple lines. All in all I found it was time to introduce an additional rule to avoid the formatter having to do this.

All statements must fit on a single source line.

This proved to have a much wider impact than I originally thought.

The immediate consequences were that a lot of no-need-to-think refactoring was needed if a line was too wide.

long method arguments could be extracted to a local variable
assignments and definitions could be split ("String s = ...." were to be turned into two lines "String s;" and "s = ...."). Eclipse offers this with Ctrl-1 with the cursor on the equals sign. This is not necessarily a good thing but if it is enough it is acceptable.
if blocks were nested more than about 3-4 levels the lines were still too short anyway and you had to extract a method to get the lines long enough.

In other words you have a very simple rule for when to refactor and how. If the formatter breaks the line, it is time to do it, and you are constantly reminded until you do so.

But here it is very common for people to say "All this just gives extra work, which is unnecessary. Why do it then?". I thought so too myself, but then I concluded that:

Extraction of method arguments to local variables allow you to give meaningful names to each argument! This is important for future programmers to understanding it when reading it. There is no names at the calling spot in Java (Math.substring(10,19) does not indicate whether 19 is the number of characters wanted or something else completely) . By doing this manual listing of parameters you get naming there too.

NullPointerExceptions are also easier to handle in this way. By listing each argument on its own line instead of something like "new String[] {f.getX(), g.getX(), h.getX()}" where you have no idea of whether f, g or h was null if you get a NullPointerException. The short lines help too here. (This goes for ArrayIndexOutOfBoundsExceptions too but they are rarer).

And I think that the resulting code is generally easier to read which is also an advantage in itself...

(This blog entry was written to elaborate on my StackOverflow answer to a question about non-standard coding styles)

mandag den 24. maj 2010

A gentle introduction to logging in Java with slf4j

This blog entry is intended for starting Java programmers who have learned to get information from inside their program with

System.out.println("a = " + a);

wants to learn how experienced Java programmers do the same thing in production code by using

log.debug("a = {}", a);

The way of programming described in the following is exactly the same as is used in large Java projects, and is the best practice known to me. The only difference is that the setup described here, is kept small and simple in order to be useful to small, standalone programs as those typically written by starters. If you need more advanced features, you can change the setup without having to change a single line of your own code (but how to do that is beyond the scope of this blog entry).

It is my hope that by using the techniques described in this document you will have a greater chance to avoid some very embarrassing and exhausting experiences in your future programming jobs, and hopefully not having to unlearn a habit or two.

"And you try and tell the young people of today that ..... they won't believe you."

Java was designed by Sun Microsystems, which happened to make a living at the time selling computers running the Sun dialect of the Unix operating system. It is not a surprise then that Java programs tend to look like other Unix programs, and they have been designed to work on the computer you see in the picture below.

## PIC ###
Picture shows Ken Thompson and Dennis Ritchie working at the first real Unix machine. Please note that there is real paper in the "typewriters", the closet behind the men is the computer, and that the circular containers on top of the computer contain program and data on paper tape. Things were different in 1973.

Under Unix every program has three input/output connections available - namely:

standard input - this provides the things the user type to the program. (Java: System.in)

standard output - this is normal information (Java: System.out)

standard error - this is information about error situations (Java: System.err)

A central concept in Unix is that you can let the output of one program be the input of another program - called a pipe - encouraging small specialized programs being combined to complete a larger task, like creating a list of words in a file by first extracting all words, sorting them, and then eliminate duplicates. Unfortunately this has not caught on in Java, mostly because the Java environment takes too long to start for small programs to be really usable. The concepts are still the same, however.

So the simplest possible Java program uses System.out to print out the line "Hello World":


1: public class HelloWorld {
2:   public static void main(String[] args) {
3:        System.out.println("Hello World");
4:   }
5: }

This should be named "HelloWorld.java", and is compilable from the command line with "javac HelloWorld.java", and the compiled class can then be invoked with "java HelloWorld".

Line 1 declares this to be the class named HelloWorld - there is no "package" line so it is just HelloWorld instead of e.g. com.bigcoorporation.HelloWorld.

Line 2 is the magic words needed to be able to invoke the class file from the command line as "java HelloWorld".

Line 3 prints the line "Hello World" to standard output.

Now, why did that happen?

One of the unfortunate side effects of programming for a living is that people are using your programs, and they tend to find that your programs do not work as they expect - this is traditionally called a "bug"##FIXME FOOTNOTE 2## - and it is then your responsibility to identify and correct the problem.

In most cases, all you have to work with is the user describing what happened and what was expected. The program has ended and is for all practical purposes unavailable to you. Hence a way of saving information from within the program for later is needed, much in the same way that logbooks are used on ships to aid in navigation. The term "logging" is used in general for the process of systematically saving information for later.

Several logging solutions are available to Java programmers. All of these generally share the ability to log:

The area of the code containing the log statement.

The time when the log statement was invoked.

An importance level of this log statement.

An exception (with accompanying stack trace).

A log message.

This has proven to be very helpful for logs intended for the human reader. Facilities available in some logging frameworks include

store events in a database

alert users by email, instant messaging or any other electronic means

(but that is too advanced to be covered in this introduction - it is just to give you an idea of what is possible)

The HelloWorld program rewritten to use the SLF4J logging solution (full instructions are in appendix A) and put in the "demo" package, looks like:


package demo;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
public class HelloWorld {
    public static void main(String[] args) {
        Logger log = LoggerFactory.getLogger(HelloWorld.class);
        log.info("Hello World");
    }
}

and generates the following output:


2009-06-07 10:38:21 demo.HelloWorld main
INFO: Hello World

The first line contains a date and time stamp, the name of the logger - here "demo.HelloWorld" - and the name of the method containing the log.info statement - here "main".

The second line contains the importance level - here "INFO" - and the message passed - here "Hello World". Note how much of this information was derived automatically!

If we add a method call with a delay, we can see that the timestamp changes, and that the method names in the log correctly indicates if the log statement is done from "main" or "waitTwoSeconds".


package demo;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
public class HelloWorld2 {
    public static void main(String[] args) throws Exception {
        Logger log = LoggerFactory.getLogger(HelloWorld2.class);
       log.info("Hello World");
        waitTwoSeconds(log);
        log.info("Done");
    }
    private static void waitTwoSeconds(Logger log) throws Exception {
        Thread.sleep(2000);
       log.info("Slept 2 seconds");
    }
}

with the following output:


2009-06-07 10:53:52 demo.HelloWorld2 main
INFO: Hello World
2009-06-07 10:53:54 demo.HelloWorld2 waitTwoSeconds
INFO: Slept 2 seconds
2009-06-07 10:53:54 demo.HelloWorld2 main
INFO: Done

If you were shown this log, you would be able to determine that the program had run as expected to the statement logging "Done" in the main method of demo.HelloWorld2 and you should not look for problems happening before that statement.

Logging information stored in variables

If you need to add information stored in a variable, slf4j allow you to specify this with {}. (Note: Example works with Java 5 and later)

package demo;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class ZeroToFour {
    public static void main(String[] args) throws Exception {
        Logger log = LoggerFactory.getLogger(ZeroToFour.class);
        for (int i = 0; i < 5; i++) {
             log.info("i = {}, i*i = {}", i, i * i);
        }
    }
}

which outputs

2009-06-07 13:23:57 demo.ZeroToFour main
INFO: i = 0, i*i = 0
2009-06-07 13:23:57 demo.ZeroToFour main
INFO: i = 1, i*i = 1
2009-06-07 13:23:57 demo.ZeroToFour main
INFO: i = 2, i*i = 4
2009-06-07 13:23:57 demo.ZeroToFour main
INFO: i = 3, i*i = 9
2009-06-07 13:23:57 demo.ZeroToFour main
INFO: i = 4, i*i = 16

All objects can be logged in this way, and the toString() method is called to get the string to log. Note that int, long, double, floats, chars are not objects, but that in Java 5 and later they are invisibly converted correctly.

Logging problems

All programs regardless of programming language must have a way to handle problems. In Java this is done by "throwing exceptions" which causes the Java Virtual Machine to stop the normal execution of your program, and start looking for code which can "catch the exception thrown" - if none is found in the current method, the method calling it is investigated and if none is found there the method which called it is investigated, etc. This continues until either a suitable "try-catch-clause" has been found or the program ends with an error message and a stack trace.

A stack trace looks like:


java.lang.ArithmeticException: / by zero
    at demo.CatchException.doStuff(CatchException.java:20)
    at demo.CatchException.main(CatchException.java:12)

The first line lists the full name of the exception, a colon, and the error message. The next line tells where the exception was thrown, here in the method doStuff() in the class demo.CatchException (the exact location is on line 20 in the file CatchException.java which contains the source code for the class). The rest of the lines is on the same form, but tells where the call to the previous line happened.

In the above example, main() calls doStuff() on line 12, and in doStuff() on line 20 the exception is thrown.

The full program looks like:

1: package demo;
2:
3: import org.slf4j.Logger;
4: import org.slf4j.LoggerFactory;
5: 
6: public class CatchException {
7:
8:   public static void main(String[] args) {
9:        Logger log = LoggerFactory.getLogger(CatchException.class);
10:
11:       try {
12:            doStuff(log);
13:        } catch (Exception e) {
14:            log.error("Something happened", e);
15:        }
16:    }
17: 
18:     public static void doStuff(Logger log) {
19:        log.info("1/1 = {}", 1 / 1);
20:        log.info("1/0 = {}" ,1 / 0);
21:    }
22:}

Here is a sample run with line numbers.


1: 2009-06-07 12:59:27 demo.CatchException doStuff
2: INFO: 1/1 = 1
3: 2009-06-07 12:59:27 demo.CatchException main
4: SEVERE: Something happened
5: java.lang.ArithmeticException: / by zero
6:     at demo.CatchException.doStuff(CatchException.java:20)
7:     at demo.CatchException.main(CatchException.java:12)

Line 19 in the program logs "1/1 = {}" with the argument 1. This results in line 1-2 in the output. Line 1 contains a timestamp and the class and method. Line 2 contains the severity of the log statement - here INFO - and the log message with {} replaced with arguments.

Line 20 tries to log "1/0 = {}" with the argument 1/0, but dividing by zero causes the JVM to throw an ArithmeticException which cancels the execution of line 20 and begins the search for a suitable try-catch-clause. There is not one in doStuff() so the JVM goes back to the place where doStuff was called - this is line 12 in main() - which is inside a suitable try-catch-clause!

The exception thrown is therefore assigned to the variable e in line 13, and an error is logged on line 14. This results in line 3-7 in the output. Line 3 contains the timestamp and the class and method of the logging statement. Line 4 contains the severity of the log statement - here SEVERE - and the message logged. Line 5-7 contains the stack trace contained in the exception.

What SHOULD I log?

When you think about what you should log, you might consider thinking of the worst case scenario, where you need to locate and fix a Heisenbug - a problem which you cannot reliably reproduce - only using the log from when it happened to the customer.

This means doing a "log.debug" call for every piece of work or decision taken with arguments describing what has been done, and a "log.info" call for every important piece of work with arguments.

Fortunately this is usually not necessary! But such production scenarios exist - all you have is your log. You might as well think about it when you write your code.

Usually the log.debug is used by the developers to debug methods, and they are switched off in production scenarios so they are just skipped (hence they must be fast to skip).
Use log.info about things you would consider important enough to warrant logging but not enough for any more than that.
Use log.warn to note unusual things - preferably in detail - that might warrant a second glance from a system operator.
Use log.error for those things that require immediate attention. Like waking up people at 03:00...
Log the output of "new java.util.Date()" when starting the program. Perhaps also with the name of the machine and its IP-number. This helps you identifying restarts, get the full timestamp of the restart, plus keeping track of multiple machines with identical configuration.
For files, consider logging the absolute file name when opening a file. This might help you identifying problem with the wrong file being read or written.

Appendix A - Setting up SLF4J

SLF4J is a frontend which allow the usage of any of several logging frameworks without changing code and recompiling. In order to run the examples, you need to download and unpack the latest release of SLF4J from http://slf4j.org/download.html. To get the same output as this document you will need to tell slf4j to use the logging framework built into Java from 1.4 onwards.

Amongst the unpacked files you will find slf4j-api-1.5.8.jar and slf4j-jdk14-1.5.8.jar (numbers vary slightly, please adjust accordingly) which need to be added to your classpath.

To compile a program with slf4j from the command line, use


javac -g -cp .../slf4j-api-1.5.8.jar demo/CatchException.java

and to run it with slf4j and the jdk14 backend on a Unix system (Linux, Mac etc.), use:

java -cp .:.../slf4j-api-1.5.8.jar:.../slf4j-jdk14-1.5.8.jar demo.CatchException

(Note that the argument after "-cp" is period-colon-pathtoslf4japi-colon-pathtoslf4jjdk. I.e. three locations separated with a colon)

To run it under Windows with slf4j and jdk14 backend, use:

java -cp .;.../slf4j-api-1.5.8.jar;.../slf4j-jdk14-1.5.8.jar demo.CatchException

(Note that the argument after "-cp" is period-semicolon-pathtoslf4japi-semicolon-pathtoslf4jjdk. I.e. three locations separated with a semicolon)

For Eclipse, right-click on a Java project in the Package Explorer, choose Properties, choose Java Build Path, choose Library pane, click "Add External Jars" and add each of the two jar files.

This allows you to use the logging framework built into Java with slf4j. Other frameworks exist and most can be used through slf4j by including the correct jar file instead of slf4j-jdk14-1.5.8.jar when the program is started. See http://www.slf4j.org/ for the latest list of native and wrapped implementations.

Regardless of which logging framework is being used, the usual rules of that framework apply - slf4j is just a simple, common frontend - so if you want to configure the jdk14 logging you must read the official Sun documentation. It is a bit hard to read, so I suggest you see if your specific problem is covered at http://www.exampledepot.com/egs/java.util.logging/pkg.html

Appendix B: Questions you might think

Q: Isn't it slow to have lots of "log.debug" calls if they have been switched off?
A: No. Logging frameworks are optimized to handle this very quickly, and it is not a problem.

Q: I have read that it is necessary to use "isDebugEnabled()" to make logger calls fast in log4j. Why?
A: Log4j predates slf4j, and this does not apply to slf4j if you use the {}-placeholders, as they delay the actual creation of the full string to be logged as much as possible, notably after the event has been determined to be logged.

Q: What is the recommended way to declare the logger variable inside an object (and not a static method)?
A: You should declare it like this

private final org.slf4j.Logger log = org.slf4j.LoggerFactory.getLogger(this.getClass());

Note: This will not work for static methods. Here you should start each method with requesting a Logger object similar to:

public static doStuff() {
  final org.slf4j.Logger log = org.slf4j.LoggerFactory.getLogger(ThisClass.class);
  log.info("in doStuff()");
}

Notes:

2) The origin of the word is that in 1945 a moth was found to cause a malfunction in the Mark I computer. It can be seen at http://www.history.navy.mil/photos/images/h96000/h96566kc.htm

lørdag den 24. oktober 2009

The simplest possible "wget"/"curl" downloader in Java to bootstrap a new Java environment

Occasionally you have to work or deploy code on a server where you are severely limited in what you can and may do. If so, you may not have access to a browser so you can download files, or a gzip implementation so you can unpack files or a compiler so you can add programs that can do those things you need. I am frequently in this position as my job is to run Java program on AS/400 machines.

Plenty of Java programs exist which can help with all this, but you still need to download them in the first place. You may have access to an ftp program but these days that is often not enough as things tend to be accessible for browsers only. I have not found an out-of-the-box way to get Java to run code from the net in a telnet connection which doesn't have a GUI so the hen-and-egg-problem is getting to download a program which allow you to download programs.

Fortunately most telnet programs allow for pasting in information, so a Java program that will fit on a single editor screen can easily be saved to a file, and compiled with javac to running code. So save the following snippet as "Get.java":

=======================================

      import java.io.*;
      import java.net.URL;
      
      public class Get {
      public static void main(String[] args) throws Exception {
       System.out.println(args[0] + " -> " + args[1]);
       InputStream in = new URL(args[0]).openStream();
       OutputStream out = new FileOutputStream(args[1]);
       byte[] buf = new byte[4096];
       int len;
       while ((len = in.read(buf)) > 0) {
         out.write(buf, 0, len);
       }
       in.close();
       out.close();
      }
      }

=======================================

Compile it with:

=======================================

javac Get.java

=======================================

If you have a CLASSPATH variable then put Get.class in the CLASSPATH otherwise you can just run it with:

=======================================

java Get http://www.google.dk google.txt

=======================================

This saves the Google home page in the file "google.txt". You can then download zip-files directly and unzip them with jar. For instance to download and unpack ant so you can use build.xml-files you can use

=======================================

java Get http://mirrors.dotsrc.org/apache/ant/ant-current-bin.zip ant.zip
jar xvf ant.zip

=======================================

If you need to go through a proxy, then do the same configuration as you would do for any other Java program.