Unexplored corners of Ruby's syntax

Update: Florian Groß finds an especially nasty mixture of string interpolation and heredocs.

Xue Yong Zhi recently released rubyfront, which can already parse all the Ruby code included in the stable snapshot as well as the more than 70 KLOC of Ruby on Rails (by the way, things have changed a fair deal since that "riding a 1Kloc framework" talk). I've been providing some unusual syntax samples for it to chew on.

This is part of the activity in the Grammarians mailing list, where a bunch of courageous Rubyists*1 are struggling to build a stand-alone Ruby parser, while a few others (like me) throw ugly constructions at them.

Here are some of the things I found in the wild, loosely sorted so the most common come first (I could feed them to my lexical complexity estimator for Ruby, but that's too much work).

Naughty regexps

Seen in Getopt:

while self.sub!(/(^|\n)([^\t\n]*)(\t+)/sex) { |f|

%s literals

I hadn't seen that many before:

foo = %s'some symbol!($)!!'                        # => :"some symbol!($)!!"

Found in Borges.
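
For comparison, here is a small sketch of %s next to the usual symbol syntax. The variable names are made up for illustration; like the other percent literals, %s accepts different delimiters, and the result is an ordinary Symbol.

```ruby
# %s builds a Symbol, much as %w builds an array of words.
plain  = %s(hello)                   # => :hello
spaced = %s{two words}               # => :"two words"
quoted = %s'some symbol!($)!!'       # same literal as in the Borges example

# The quoted-symbol syntax yields the very same objects.
same_plain  = plain.equal?(:hello)
same_quoted = quoted.equal?(:"some symbol!($)!!")
```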

_ separators after the decimal point

Seen in Extmath:

C = 0.577_215_664_901_532_861
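
A quick sketch of what the underscores do (nothing, other than helping the reader). The variable names below are only for illustration.

```ruby
# Underscores between digits are ignored by the parser,
# on either side of the decimal point.
with_separators    = 0.577_215_664_901_532_861
without_separators = 0.577215664901532861
one_million        = 1_000_000

same_float = (with_separators == without_separators)   # => true
same_int   = (one_million == 1000000)                   # => true
```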

rescue on non-constant expressions

def foo(exception)
  yield
rescue exception
  1
end

foo(NameError){ blergh }                           # => 1

First seen in ruby-dipus. It can get much nastier:

def foo
  yield
rescue [NameError, LoadError][rand(2)]
  1
end

begin
  foo{ blergh }                           # =>
rescue NameError
  puts "No luck"
end
# Output when rand picks LoadError, so the NameError escapes foo:
# >> No luck
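
A deterministic variant of the same idea, as a sketch: try_with is a hypothetical helper (not taken from any of the libraries mentioned here) that receives the class to rescue as a plain argument, so the rescue clause is evaluated like any other expression.

```ruby
# Hypothetical helper: the class to rescue arrives as an ordinary argument.
def try_with(exception_class)
  yield
rescue exception_class => e
  "rescued #{e.class}"
end

# The raised error matches the class we passed in, so it is handled.
handled = try_with(ZeroDivisionError) { 1 / 0 }

# A class that does not match lets the error propagate to the caller.
propagated = begin
  try_with(ArgumentError) { 1 / 0 }
rescue ZeroDivisionError
  "propagated"
end
```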


Most people wouldn't expect anything after the HEREDOC marker; one of the first snippets I posted to the grammarians ML looked like

a, b, c = <<E1.chomp, 3, <<E2.split(/\n/).join(" - ")
hello world
E1
a
b
c
E2
[a,b,c]                         # => ["hello world", 3, "a - b - c"]

but I think this is the first time I've seen something like that (in a much more benign form, though) in the wild. Ruwiki uses HEREDOCs inside a hash to store localized messages:

 message = {
   :charset_encoding             => "iso-8859-15",
 # ...
   :no_empty_search_string       => <<EOM ,
Das Suchfeld darf nicht leer sein. Bitte etwas eingeben bevor der Suchknopf
gedrückt wird.
EOM
   :page_is_locked               => "Die Seite ist bereits zur Bearbeitung gesperrt. (...)",
 # ...
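
A self-contained sketch of the same pattern, with made-up keys and English text. It assumes Ruby 2.3 or later, because it uses the squiggly <<~ form so that the body and terminator can be indented; the Ruwiki code above uses the plain <<EOM form instead. Either way, the hash literal simply continues after the comma that follows the marker, and the heredoc body starts on the next line.

```ruby
# Hypothetical message table: a heredoc used as a hash value,
# with more entries following it.
messages = {
  :charset      => "utf-8",
  :empty_search => <<~EOM,
    The search field must not be empty.
    Please type something first.
  EOM
  :locked       => "The page is locked."
}

line_count = messages[:empty_search].lines.size   # => 2
```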

HEREDOCs plus string interpolation

Florian Groß reports

Just stumbled upon this one:

p "#{(<<'"').reverse}
hello world
"
hello world
"
# >> "\ndlrow olleh\nhello world\n"

It confuses IRB pretty badly. :)

Not only IRB, Florian :-)

*1 plus Terence Parr, of ANTLR fame, who is in the process of becoming one