{ |one, step, back| }

Line Noise   22 Aug 03
There is a thread on the ruby-talk list about punctuation as noise. Hal Fulton wrote a short program to analyze the symbol-to-punctuation ratio of a program, then compared the results for several programs in different languages.

I did something similar a while back. A Java programmer had taken a look at Ruby and declared that he didn't like all that "line noise" in the language. He was referring to the "@" and "$" characters used to mark instance variables and globals. I pointed out that Ruby actually uses quite a bit less punctuation than Java, and wrote the following linenoise program to demonstrate.

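    #!/usr/bin/env ruby
    # Print the punctuation ("line noise") left in each file named on the command line.
    ARGV.each { |fn|
      # Strip letters, digits, underscores, and whitespace; whatever remains is noise.
      noise = File.read(fn).gsub(/[A-Za-z0-9_ \t\n]/, "")
      puts "#{fn} (#{noise.size}): #{noise}"
    }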

Linenoise will strip out all alphanumeric characters and white space, leaving only the "line noise" behind. Running linenoise on a series of small programs written in different languages produces this (edited slightly for line breaks) …

  animal.cc   (83):  #<>{:()=;};:{:();};::(){::<<"\";}:{:();};::(){::
                     <<"\";}(){*[]={,};(=;<;++)[]->();;}
  Animal.java (67):  {{();}{(){..("");}}{(){..("");}}([]){[]=[]{(),()}
                     ;(=;<.;++)[].();}}
  animal.pl   (41):  ;{{};}{"\";};{{};}{"\";};$(->,->){$->();}
  animal.py   (23):  :():"":():""[(),()]:.()
  animal.rb   (10):  """"[.,.].

The number in parentheses is the number of line noise characters in the file.
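To try the idea without creating files, the same stripping can be wrapped in a method and applied to a string. This is only a minimal sketch: the method name `line_noise` is made up for illustration and is not part of the original script.

```ruby
# Hypothetical helper (not in the original script): returns only the
# punctuation left after removing letters, digits, underscores, and whitespace.
def line_noise(source)
  source.gsub(/[A-Za-z0-9_ \t\n]/, "")
end

# The loop from the end of the Ruby Animal program.
ruby_snippet = "for a in [Dog.new, Cat.new]\n  a.talk\nend\n"
noise = line_noise(ruby_snippet)
puts "#{noise.size}: #{noise}"   # prints "6: [.,.]."
```

The four double quotes in the full animal.rb result come from the two string literals in the class definitions, which this snippet leaves out.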

What I find interesting is the amount of semantic information that still comes through the "line noise". For example, the "#<>" sequence in the C++ code is obviously an include statement for something in the standard library, and the "<<" sequences are output statements using "cout".

It would be interesting to see if you could determine the language given only the line noise. You could tell Java from C++ by the ";}" vs ";};" punctuation. Python is pretty clear from the ’:():"":():’ style patterns.
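As a rough illustration of that idea, the three hints above can be turned into a tiny guesser. This is a sketch under heavy assumptions: the method name `guess_language` and the order of the checks are mine, and it uses only the patterns mentioned in the text. It is far from reliable. The Perl sample above also contains ";};", for instance, so it would be mistaken for C++ until more rules are added.

```ruby
# Hypothetical guesser based only on the hints in the text above.
# The order matters: ";};" must be tested before the shorter ";}".
def guess_language(noise)
  return :python if noise.include?(':():')   # "class X:" followed by "def f(self):"
  return :cpp    if noise.include?(';};')    # class bodies end with "};"
  return :java   if noise.include?(';}')     # statement, then a closing brace
  :unknown
end

puts guess_language(':():"":():""[(),()]:.()')   # prints "python"
puts guess_language('""""[.,.].')                # prints "unknown"
```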

Before I go, here is the source code to the Animal programs I used in my examples…

Language: C++

  #include <iostream>

  class Animal {
  public:
      virtual void talk() = 0;
  };

  class Dog : public Animal {
  public:
      virtual void talk();
  };

  void Dog::talk() {
      std::cout << "WOOF\n";
  }

  class Cat : public Animal {
  public:
      virtual void talk();
  };

  void Cat::talk() {
      std::cout << "MEOW\n";
  }

  int main() {
      Animal * (a[]) = { new Dog, new Cat };
      for (int i=0; i<2; i++)
          a[i]->talk();
      return 0;
  }

Language: Java

  public class Animal {
      interface IAnimal {
          void talk();
      }

      static class Dog implements IAnimal {
          public void talk () {
              System.out.println("WOOF");
          }
      }

      static class Cat implements IAnimal {
          public void talk() {
              System.out.println ("MEOW");
          }
      }

      public static void main (String args[]) {
          IAnimal[] zoo = new IAnimal[] { new Dog(), new Cat() };
          for (int i=0; i<zoo.length; i++)
             zoo[i].talk();
      }
  }

Language: Perl

  package Dog;

  sub new {
      bless {};
  }

  sub talk {
      print "WOOF\n";
  }

  package Cat;

  sub new {
      bless {};
  }

  sub talk {
      print "MEOW\n";
  }

  package main;

  for $a (Dog->new, Cat->new) {
      $a->talk();
  }

Language: Python

  class Dog:
      def talk(self):
          print "WOOF"

  class Cat:
      def talk(self):
          print "MEOW"

  for a in [Dog(), Cat()]:
      a.talk()

Language: Ruby

  class Dog
    def talk
      puts "WOOF"
    end
  end

  class Cat
    def talk
      puts "MEOW"
    end
  end

  for a in [Dog.new, Cat.new]
    a.talk
  end


Formatted: 19-May-13 15:29
Feedback: jim@weirichhouse.org