PLABO: perl


Thursday, 27 February 2014

Perl gzip libraries (probably a zlib issue) do not play well with bgzip files

In bioinformatics, bgzip files are important for random access to big files. Bgzip is a program derived from gzip that uses block compression and is fully backwards compatible with gzip. But I have issues when using bgzip-compressed VCF files with Perl scripts that use IO::Uncompress::Gunzip (which I believe uses zlib under the hood). A similar problem happened to me recently with the snpEff program (Java). In both cases the data is decompressed but truncated after approximately a few hundred lines. I could be totally wrong, but I was wondering if zlib (or whatever gzip-compatible library they are using) is getting confused by the bgzip blocks and only processing one or a few of them, leaving the output incomplete.

Perl code that does not work:

#!/usr/bin/env perl
use strict;
use IO::Uncompress::Gunzip qw(gunzip $GunzipError) ;

my $infile = shift;
my $infh = IO::Uncompress::Gunzip->new( $infile ) 
         or die "IO::Uncompress::Gunzip failed: $GunzipError\n";
my $line_count = 0;
while (my $line=<$infh>){

    $line_count++
}
print "total lines read = $line_count\n";

This gives 419 lines

    $ perl /home/pmg/tmp/test_zlib-bgzip.pl varsit.vcf.gz
    total lines read = 419

But using open with a gzip pipe works:

    #!/usr/bin/env perl
    use strict;
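    # bgzip can be used instead of gzip here
    my $infile = shift;

    # list-form pipe open avoids the shell and reports failure
    open(my $infh, '-|', 'gzip', '-dc', $infile)
        or die "Cannot run gzip on $infile: $!\n";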
    my $line_count = 0;
    while (my $line=<$infh>){

        $line_count++
    }
    print "total lines read = $line_count\n";

Gives the expected number of lines

    $ perl /home/pmg/tmp/test_gzip-bgzip.pl varsit.vcf.gz
    total lines read = 652829

I googled around and could not quickly find any relevant entry, but I am sure other people have already faced this. Does anyone have a clue why this is happening? I am using Ubuntu 12.04.4 with perl 5.16.

[UPDATE 2014-02-28]: finally a clue came from Biostars, where Heng Li reminded me of a footnote in the SAM specs about a Java gzip library that only sees the first block of a bgzip file when decompressing. It seems that the Perl gzip implementations have the same problem.
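One thing that may be worth trying is the MultiStream option of IO::Uncompress::Gunzip. A bgzip file is a series of concatenated gzip members, and without that option the module stops at the end of the first member. The following is only a minimal sketch, not something run against the original varsit.vcf.gz; count_lines is a name made up for this illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use IO::Uncompress::Gunzip qw($GunzipError);

# Count the lines of a gzip or bgzip file.
# MultiStream => 1 asks the module to continue with the next gzip member
# instead of stopping at the end of the first one.
sub count_lines {
    my ($infile) = @_;
    my $infh = IO::Uncompress::Gunzip->new( $infile, MultiStream => 1 )
        or die "IO::Uncompress::Gunzip failed: $GunzipError\n";
    my $line_count = 0;
    while ( my $line = <$infh> ) {
        $line_count++;
    }
    close $infh;
    return $line_count;
}

# Only run when a file name is given on the command line.
if (@ARGV) {
    print "total lines read = ", count_lines( $ARGV[0] ), "\n";
}
```

If this behaves as expected, the count should agree with the one from the gzip pipe version.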

Saturday, 9 February 2013

BioPerl is thinking about becoming more practical and adaptive

There have been a good number of threads on the BioPerl mailing list [0] last week about how to make BioPerl better suited to current times.

[0] http://thread.gmane.org/gmane.comp.lang.perl.bio.general

I like George Hartzell's remark about not being able to move forward because we need to support Perl 5.8:

But why should the all-volunteer BioPerl community be stuck supporting
code from 12 years ago because it's cost effective for someone else to
avoid spending *their* $/time/people to stay up to date.

And the links to the discussion:
Next BioPerl release : http://thread.gmane.org/gmane.comp.lang.perl.bio.general/26348
dependencies on perl version : http://thread.gmane.org/gmane.comp.lang.perl.bio.general/26344
BioPerl future : http://thread.gmane.org/gmane.comp.lang.perl.bio.general/26394
removing packages from bioperl-live: http://thread.gmane.org/gmane.comp.lang.perl.bio.general/26341

Saturday, 17 November 2012

Looking for a Perl tool like the IPython notebook

I have just recently discovered IPython notebooks [1]. They are fantastic and, like R's Sweave [2], they are essential for reproducible research.

I use Perl and PDL for bioinformatics research. Since PDL is a scientific tool, this kind of reproducible-research utility seems a perfect match.

Some time ago the bio(perl|python) communities started to look at an approach similar to Sweave, but I have not seen much progress on that. A cheap alternative (if your OS is called emacs ;-)) is to use org-mode babel [3]. I like babel because I almost always work over an ssh connection to my servers, inside a screen session with emacs -nw. Babel could be a substitute for Sweave, but the IPython notebook is a more advanced and interactive beast. You can see this fantastic presentation from Fernando Pérez about scientific Python [4], where he demonstrates its powers.

I do not yet know of any Perl tool similar to IPython notebooks, but if one exists I would like to find it and use it.

Any comments on this topic and links would be more than welcome.

[1] http://blog.fperez.org/2012/09/blogging-with-ipython-notebook.html
[2] http://www.stat.uni-muenchen.de/~leisch/Sweave/
[3] http://orgmode.org/worg/org-contrib/babel/

[4] http://www.youtube.com/watch?v=F4rFuIb1Ie4

Thursday, 15 November 2012

bioperl popularity (measured by searches) going down day by day.

According to the following figure, BioPerl is losing its appeal day by day. One criticism of this plot is that big bio projects that are still written in Perl, like Ensembl, biopieces etc., are not shown here. The pity is that R and Biopython have very good tools/pipelines for next-generation sequencing, and BioPerl and other Perl bio projects don't.

http://www.google.com/trends/explore#q=biopython,%20bioperl,%20bioconductor

Tuesday, 9 October 2012

DBD::mysql 4.022 not passing test 80procs.t and how to FIX it

I failed to install DBD::mysql. This time I had the dev files and everything.

    FAIL Installing DBD::mysql failed. See /home/pmg/.cpanm/build.log for details.
    $ cpanm DBD::mysql
    [...]
    DBD::mysql::db do failed: alter routine command denied to user ''@'localhost' for routine 'test.testproc' at t/80procs.t line 41.

Looking at this line of code:

my $drop_proc= "DROP PROCEDURE IF EXISTS testproc";

ok $dbh->do($drop_proc);

Looking at the db table of my mysql database, I can see that there are no privileges to alter or execute routines:

mysql> select * from db\G
                 Host: localhost
                    Db: test
                  User:
           Select_priv: Y
           Insert_priv: Y
           Update_priv: Y
           Delete_priv: Y
           Create_priv: Y
             Drop_priv: Y
            Grant_priv: N
       References_priv: Y
            Index_priv: Y
            Alter_priv: Y
 Create_tmp_table_priv: Y
      Lock_tables_priv: Y
      Create_view_priv: Y
        Show_view_priv: Y
   Create_routine_priv: Y
    Alter_routine_priv: N  # <<====
          Execute_priv: N  # <<====
            Event_priv: Y
          Trigger_priv: Y


        mysql> show grants for ''@localhost;
        +-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
        | Grants for @localhost                                                                                                                                                                                       |
        +-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
        | GRANT USAGE ON *.* TO ''@'localhost'                                                                                                                                                                        |
        | GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, DROP, REFERENCES, INDEX, ALTER, CREATE TEMPORARY TABLES, LOCK TABLES, CREATE VIEW, SHOW VIEW, CREATE ROUTINE, EVENT, TRIGGER ON `test`.* TO ''@'localhost'    |
        | GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, DROP, REFERENCES, INDEX, ALTER, CREATE TEMPORARY TABLES, LOCK TABLES, CREATE VIEW, SHOW VIEW, CREATE ROUTINE, EVENT, TRIGGER ON `test\_%`.* TO ''@'localhost' |
        +-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
        4 rows in set (0.00 sec)

FIXING the problem: grant all privileges on the test database to anyone connecting from localhost


 mysql> grant ALL on test.* to ''@'localhost';
  mysql> show grants for ''@localhost;
  +-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  | Grants for @localhost                                                                                                                                                                                       |
  +-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  | GRANT USAGE ON *.* TO ''@'localhost'                                                                                                                                                                        |
  | GRANT ALL PRIVILEGES ON `test`.* TO ''@'localhost'                                                                                                                                                          |
  | GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, DROP, REFERENCES, INDEX, ALTER, CREATE TEMPORARY TABLES, LOCK TABLES, CREATE VIEW, SHOW VIEW, CREATE ROUTINE, EVENT, TRIGGER ON `test\_%`.* TO ''@'localhost' |
  +-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  4 rows in set (0.00 sec)
  mysql> select * from db\G
  *************************** 1. row ***************************
                   Host: localhost
                     Db: test
                   User:
            Select_priv: Y
            Insert_priv: Y
            Update_priv: Y
            Delete_priv: Y
            Create_priv: Y
              Drop_priv: Y
             Grant_priv: N
        References_priv: Y
             Index_priv: Y
             Alter_priv: Y
  Create_tmp_table_priv: Y
       Lock_tables_priv: Y
       Create_view_priv: Y
         Show_view_priv: Y
    Create_routine_priv: Y
     Alter_routine_priv: Y  ## <<<<==== OK
           Execute_priv: Y  ## <<<<==== OK
             Event_priv: Y
           Trigger_priv: Y

After this change I installed DBD::mysql and all tests passed.

Sunday, 16 September 2012

VCFtools 0.1.9 perl Vcf.pm does not work with perl 5.14 or newer. Use the VCFtools svn or this patch.

Error: a bare qw() in a for loop, for $x qw() {}, is deprecated. It needs to be surrounded by parentheses. See my previous blog entry about this feature.
patch:

$  diff Vcf.pm Vcf.pm-ori 
1622c1622
<         for my $key (qw(fmtA fmtG infoA infoG)) { if ( !exists($$out{$key}) ) { $$out{$key}=[] } }
---
>         for my $key qw(fmtA fmtG infoA infoG) { if ( !exists($$out{$key}) ) { $$out{$key}=[] } }

Now all tests work OK. [update] the svn version 779 of Vcf.pm is fixed:

for my $key (qw(fmtA fmtG infoA infoG)) { if ( !exists($$out{$key}) ) { $$out{$key}=[] } }

Perl 5.14 and up will break some well-known programs! Think twice (test thrice) before using it!

It seems that at some point someone decided that a qw() in a for loop should be surrounded by parentheses: (qw()).
Lines like

$  perl -e 'foreach $x qw(1 2 3){print $x}'
Use of qw(...) as parentheses is deprecated at -e line 1.

are deprecated and now produce a warning. People have rebelled against this and patches were proposed, but perl 5.16 still warns about it. Some people say that this was a 'feature' of the parser, not intended behaviour (see the comments in Reini's post).
In my case I switched to perl 5.14 and VCFtools stopped working. At least I discovered it by running the test suite. 20 tests failed, all with the same error:

[..]
not ok 30 - Testing vcf-consensus .. perl -I../perl/ -MVcf ../perl/vcf-consensus  consensus.vcf.gz < consensus.fa
#   Failed test 'Testing vcf-consensus .. perl -I../perl/ -MVcf ../perl/vcf-consensus  consensus.vcf.gz < consensus.fa'
#   at test.t line 418.
#     Structures begin differing at:
#          $got->[0] = 'Use of qw(...) as parentheses is deprecated at ../perl//Vcf.pm line 1622.
#     '
#     $expected->[0] = '>1:1-500
[...]

I am lucky to be using perlbrew, so I can switch quickly from one version of perl to another (not so lucky in that I need to reinstall a lot of CPAN modules).
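As a small illustration of the portable form, here is a sketch in which the qw() list is wrapped in parentheses, which should compile cleanly on both older and newer perls. The helper init_keys is hypothetical and is not part of Vcf.pm; it only mirrors what the patched line does.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Make sure every listed key exists in the hash, defaulting to an empty array.
# init_keys is a made-up helper for this example, not VCFtools code.
sub init_keys {
    my ($out, @keys) = @_;
    for my $key (@keys) {
        $out->{$key} = [] unless exists $out->{$key};
    }
    return $out;
}

my $out = init_keys( { fmtA => ['x'] }, qw(fmtA fmtG infoA infoG) );

# The qw() list is parenthesised, so this loop does not trigger the warning.
for my $key (qw(fmtA fmtG infoA infoG)) {
    printf "%s has %d element(s)\n", $key, scalar @{ $out->{$key} };
}
```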

Test::Most not automatically installed with cpanm. Some inner dependencies failed

After installing VCFtools, I went to test the perl scripts with

perl vcftools/perl/test.t

But it failed because I needed Test::Most. I tried to install it but that failed too:

$  cpanm Test::Most.pm
--> Working on Test::Most
Fetching http://search.cpan.org/CPAN/authors/id/O/OV/OVID/Test-Most-0.31.tar.gz ... OK
Configuring Test-Most-0.31 ... OK
==> Found dependencies: Exception::Class
--> Working on Exception::Class
Fetching http://search.cpan.org/CPAN/authors/id/D/DR/DROLSKY/Exception-Class-1.33.tar.gz ... OK
Configuring Exception-Class-1.33 ... OK
Building and testing Exception-Class-1.33 ... FAIL
! Installing Exception::Class failed. See /home/pmg/.cpanm/build.log for details.
! Bailing out the installation for Test-Most-0.31. Retry with --prompt or --force.

Looking at .cpanm/build.log I saw that Class/Data/Inheritable.pm was missing. The dependency was not automatically picked up and installed, so I installed it by hand. Then I tried Test::Most again, and again Exception::Class failed. This time it needed Devel::StackTrace. I installed that one by hand too, and then everything went OK.

cpanm Class::Data::Inheritable
cpanm Devel::StackTrace
cpanm Test::Most

These missing dependencies are not uncommon on CPAN but they are a bit annoying, mainly when a cpanm install becomes long, the failed module is near the beginning, and the last line says "11 packages installed" (all dependencies of the package you wanted, but not the package itself). In situations like this it is easy to miss that your module was not installed.
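One way to avoid being fooled by a long cpanm run is to check afterwards that the modules you wanted really load. This is just a sketch using core Perl; missing_modules is a name invented for this example, and the default module list is only an illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Return the names of the modules that cannot be loaded.
# missing_modules is a made-up helper name for this sketch.
sub missing_modules {
    my @modules = @_;
    my @missing;
    for my $module (@modules) {
        # Only accept plain module names before passing them to string eval.
        die "Bad module name '$module'\n"
            unless $module =~ /\A\w+(?:::\w+)*\z/;
        push @missing, $module unless eval "require $module; 1";
    }
    return @missing;
}

# Check the modules given on the command line, or an example list.
my @wanted  = @ARGV ? @ARGV : qw(Test::Most Exception::Class);
my @missing = missing_modules(@wanted);
print @missing ? "NOT installed: @missing\n" : "all modules load OK\n";
```

Running it right after the install step shows at a glance whether the module you actually asked for made it.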

Tuesday, 3 April 2012

Calculate years between two dates with Perl

Problem:
Given two dates, calculate the number of years between them.

Preamble:
This seems a trivial problem BUT it is not: you could take the year of each date, subtract them and, if the result is > 0, subtract 1 when the month of the second date is smaller than the month of the first (and, if the months are equal, check the day). What could possibly go wrong?

First and most important, there are zillions of date formats, some of them very difficult to parse with regular expressions. Then comes the problem that you would like to see the difference in years forwards... and backwards! You would probably also like to show the years with a decimal point, so you need to count days, and so on. At some point your script will need to report days and months too. That is life!

Solution:
This is a task for CPAN's industry-grade DateTime family of modules.

DateTime::Format::xxx is a family of modules specialised in parsing and formatting any date format you can imagine. Then you can do the calculations with Date::Calc or DateTime::Duration.

#!/usr/bin/env perl


=head1 [program_name]

 description: Calculate years between dates (two dates or one date and current time)

=head2 First version

  a file with three columns (id, start date, end date)

  - parse the date

      | strptime($strptime_pattern, $string)
      | Given a pattern and a string this function will return a new DateTime object.
      | %F
      | Equivalent to %Y-%m-%d. (This is the ISO style date)


  - check that first date < last date

  - output years as a new column

=cut


use feature ':5.10';
use strict;
use warnings;
use Getopt::Long;
use DateTime::Format::Strptime;
use File::Slurp;


my $prog = $0;
my $usage = <<EOQ;
Usage for $0:

  $prog [-test -help -verbose] file_with_dates_in_column_2_and_3

EOQ

my $has_header =1;

my $file_status;

my $file = shift;


my $fmt = DateTime::Format::Strptime->new(
    pattern => '%F',
    locale  => 'en_US',
);

# take care of windows end of line in a linux machine (need both chomp and s/\r$//)
my $dates = [map{chomp;s/\r$//;[split /\t/]} read_file($file)];

my $header = $has_header ? shift @$dates : undef;

# line number of the first data row, used in error messages
my $line = $has_header ? 2 : 1;
foreach my $date_aoa (@$dates) {

    # get the DateTime objects
    my $start = $fmt->parse_datetime($date_aoa->[1]);
    my $end   = $fmt->parse_datetime($date_aoa->[2]);

    # both dates must parse; parse_datetime returns undef on failure
    unless ($start && $end){
        die "Error parsing id '$date_aoa->[0]' at line $line\n"
    }
    # get a DateTime::Duration object (that is automatic when doing math with DateTime objects)
    my $dur = $end - $start;

    $date_aoa->[3] = $dur->years;
    $line ++;
}

print_result($dates);


sub print_result {

    my ($dates) = @_;

    say join("\n", map{join("\t",@$_ )}@$dates);

}
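For comparison, the manual approach described in the preamble can be sketched with core Perl only, as long as the dates are already in ISO format (YYYY-MM-DD). This is an illustration of the year/month/day comparison, not a replacement for the DateTime script above; whole_years_between is a name invented here, and the use of Time::Piece is my own choice for the sketch.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Time::Piece;

# Whole years between two ISO dates (YYYY-MM-DD).
# The result is negative when the second date is earlier than the first.
sub whole_years_between {
    my ($from, $to) = @_;
    my $start = Time::Piece->strptime($from, '%Y-%m-%d');
    my $end   = Time::Piece->strptime($to,   '%Y-%m-%d');

    # Work forwards in time and remember the direction.
    my $sign = 1;
    if ($end < $start) {
        ($start, $end) = ($end, $start);
        $sign = -1;
    }

    my $years = $end->year - $start->year;
    # The anniversary has not been reached yet in the final year.
    $years--
        if $end->mon < $start->mon
        || ($end->mon == $start->mon && $end->mday < $start->mday);

    return $sign * $years;
}

print whole_years_between('2000-01-15', '2012-01-14'), "\n";   # prints 11
```

It handles only one date format and only whole years, which is exactly why the DateTime modules are the better tool once the input gets messier.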

Some Links:

The question http://stackoverflow.com/questions/6549522/how-to-make-datetimeduration-output-only-in-days and its answer http://stackoverflow.com/a/6550372/427129 are also interesting.

http://stackoverflow.com/questions/821423/how-can-i-calculate-the-number-of-days-between-two-dates-in-perl

http://stackoverflow.com/questions/3055422/calculating-a-delta-of-years-from-a-date

http://stackoverflow.com/questions/8308655/how-to-find-the-difference-between-two-dates-in-perl

http://stackoverflow.com/questions/3910858/using-perl-how-do-i-compare-dates-in-the-form-of-yyyy-mm-dd

http://datetime.perl.org/wiki/datetime/page/FAQ%3A_Date_Time_Formats#How_Do_I_Convert_between_Date::Manip_and_DateTime_Objects_-6

The Many Dates and Times of Perl

Tuesday, 28 February 2012

New edition of Chromatic's "Modern Perl" book

new version of Chromatic's book "Modern Perl" covering perl 5.14
http://onyxneon.com/books/modern_perl/index.html

Free PDF version.

Thursday, 23 February 2012

Programming Perl, 4th edition, published

http://shop.oreilly.com/product/9780596004927.do

$20 for the PDF at O'Reilly
25 GBP for the paperback at Amazon (but not available until March)

Adopted as the undisputed Perl bible soon after the first edition appeared in 1991, Programming Perl is still the go-to guide for this highly practical language. Perl began life as a super-fueled text processing utility, but quickly evolved into a general purpose programming language that’s helped hundreds of thousands of programmers, system administrators, and enthusiasts, like you, get your job done.

In this much-anticipated update to "the Camel," three renowned Perl authors cover the language up to its current version, Perl 5.14, with a preview of features in the upcoming 5.16. In a world where Unicode is increasingly essential for text processing, Perl offers the best and least painful support of any major language, smoothly integrating Unicode everywhere—including in Perl’s most popular feature: regular expressions.

Important features covered by this update include:

New keywords and syntax
I/O layers and encodings
New backslash escapes
Unicode 6.0
Unicode grapheme clusters and properties
Named captures in regexes
Recursive and grammatical patterns
Expanded coverage of CPAN
Current best practices

Monday, 6 February 2012

Converting your perl module documentation into confluence

Short story

perldoc -u myPerlModule.pm | pod2wiki -s confluence > myConfluenceDoc

The key point is to use -u for printing out the original POD code (without any formatting).

Probably it is possible to use perldoc -owiki but I do not know yet how to pass the -s confluence. I will look at that later.

Long story

Today I was documenting some of my Perl modules in our Confluence wiki when I decided to paste the POD. First I tried the HTML output, copying it between {html}{html} directives, but as the .css was not created, the blocks were not boxed, and I wanted them boxed.
HTML output

perldoc -ohtml  myPerlModule.pm

Then I produced the output as text and used a one-liner to add some Confluence markup, but this was not complete and needed some hand curation.
So I searched CPAN and found that Confluence is supported by Pod::Simple::Wiki, which also provides a script for this: pod2wiki.
pod2wiki accepts POD as input from STDIN (nice), so I printed the POD and passed it to pod2wiki: perldoc myPerlModule.pm | pod2wiki -s confluence
NO OUTPUT!
I was a bit slow today, and it took me some time to realise that perldoc does not print raw POD but man-formatted POD. I ran perldoc perldoc to find out how to print the raw POD of the document: option -u
Confluence output

perldoc -u myPerlModule.pm | pod2wiki -s confluence > myConfluenceDoc

Thursday, 30 June 2011

DBD::SQLite and dbish

Today I needed a SQLite db with foreign keys and realised that my server's sqlite3 is very old (3.3.6). My Perl DBD::SQLite is using version 3.7.6. I read in the DBD::SQLite documentation that you can access the SQLite db using dbish (a shell wrapper around DBI::Shell).

I wanted to use the same SQLite version in shell and scripts. Before installing a new version of sqlite3 I tried dbish.

dbish is part of DBI::Shell. It will be installed in your local-lib/bin when you install DBI::Shell.

First encounter was disappointing: following the POD it was not working at all

$   dbish dbi:SQLite:test.db

     DBI::Shell 11.95 using DBI 1.611

     WARNING: The DBI::Shell interface and functionality are
     =======  very likely to change in subsequent versions!


     Connecting to 'dbi:SQLite:test.db' as ''...
     @dbi:SQLite:test.db> table_info
     @dbi:SQLite:test.db> quit
     @dbi:SQLite:test.db> exit
     @dbi:SQLite:test.db> help
     @dbi:SQLite:test.db> type_info
     @dbi:SQLite:test.db> help
     @dbi:SQLite:test.db>

None of the commands worked :-(.

Googling around I found that the book 'Programming the Perl DBI' by Alligator Descartes and Tim Bunce has a chapter about it, and discovered that the commands must be preceded by a '/'.

@dbi:SQLite:srf2cram.db> /help
Defined commands, in alphabetical order: 
  [/;]chistory   display command history 
  [/;]clear      erase the current statement 
  [/;]col_info   display columns that exist in current database 
  [/;]commit     commit changes to the database 
  [/;]connect    connect to another data source/DSN 
  [/;]count      execute 'select count(*) from table' (on each table listed). 
  [/;]current    display current statement 
  [/;]describe   display information about a table (columns, data types). 
  [/;]do         execute the current (non-select) statement 
  [/;]drivers    display available DBI drivers 
  [/;]edit       edit current statement in an external editor 
  [/;]exit       exit 
  [/;]format     set display format for selected data (Neat|Box) 
  [/;]get        make a previous statement current again 
  [/;]go         execute the current statement 
  [/;]help       display this list of commands 
  [/;]history    display combined command and result history 
  [/;]load       load a file from disk to the current buffer. 
  [/;]option     display or set an option value 
  [/;]perl       evaluate the current statement as perl code 
  [/;]ping       ping the current connection 
  [/;]primary_key_info display primary keys that exist in current database 
  [/;]prompt     change the displayed prompt 
  [/;]quit       exit 
  [/;]redo       re-execute the previously executed statement 
  [/;]rhistory   display result history 
  [/;]rollback   rollback changes to the database 
  [/;]run        load a file from disk to current buffer, then executes. 
  [/;]save       save the current buffer to a disk file. 
  [/;]spool      send all output to a disk file. usage: spool file name or spool off. 
  [/;]table_info display tables that exist in current database 
  [/;]trace      set DBI trace level for current database 
  [/;]type_info  display data types supported by current server 
Commands can be abbreviated.

dbish is interesting because I can interact with the databases and test DBI commands interactively, but I think that sqlite3 gives me more control and is better documented. So I ended up installing the latest version of SQLite3.

[update]

I downloaded the latest precompiled sqlite3 and it did not work properly on my CentOS release 5.4: the cursor got detached from the line and could move all over the screen (so the up arrow did not recall previous history). It was also not quitting.

$  sqlite3 test.db
SQLite version 3.7.7.1 2011-06-28 17:39:05
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .q
   ...> ;
   ...>
   ...>
   ...>
   ...>

I killed it manually and compiled from the 'amalgamation' download, and now it works fine.

[edit2]

After reading this question in StackOverflow about how to set by default the foreign keys for a database, I am having second thoughts about using foreign keys with SQLite:

No, not even with compile-time options.

The only way, so far, is to use pragma foreign_keys=on at run time. The particular danger is that every application that touches the database has to do that.

If a specific application doesn't run that pragma statement, it can insert data that will violate foreign key constraints, and other applications won't know it. That is, turning on foreign keys doesn't warn you of existing data that violates the constraint.

Monday, 27 June 2011

qrcode generator with perl

There are some good pages for generating QR codes and a lot of not so good ones. There are probably too many to decide which one to use, so I decided to create yet another one:
pmg-qrcode

I have used
HTML::QRCode
Imager::QRCode



[...]
use Imager::QRCode;
use HTML::QRCode;

[...]

sub get_html_qr {
  my $text = shift;
  my $qrcode = HTML::QRCode->new->plot($text);
  return $qrcode;
}

sub print_qrimage {
  my $text = shift;
  my $img_type = shift || 'png';
  my $qr = Imager::QRCode->new(
      size  =>  5,
      level => 'M',
    );

  $qr->plot($text)->write( fh => \*STDOUT, type => $img_type);
}

This is very simple, but...
the difficult part is getting them to work, because the prerequisites are not very well explained.

If you read the README for HTML::QRCode it says that it works out of the box with a standard cpan install. Well, that is true if you already have Text::QRCode, which usually is not the case. Don't panic, let's install T::QRC and all will be fine. Hmm... Marvin is still depressed. Despite T::QRC also telling you that a standard install will go OK, again, that is only true if you have all the prerequisites already installed in the default place. T::QRC needs the libqrencode headers and libs and looks for them in the system install paths. The README does not say anything about this, but the description of T::QRC gives a hint:

DESCRIPTION ^
This module allows you to generate QR Code using ' ' and '*'. 
This module use libqrencode '2.0.0' and above.

To make the long story short: I tried to install HTML::QRCode in my local-lib environment and it failed because Text::QRCode was missing. I tried to install T::QRC and it failed because libqrencode was missing. I installed libqrencode locally but could not install T::QRC, because I did not know how to pass a 'prefix' option to Makefile.PL for the library and include paths. I solved it by editing Makefile.PL manually, setting the LD_RUN_PATH environment variable, installing T::QRC manually, and then doing a cpan install of HTML::QRCode.

Long story:

HTML::QRCode needs Text::QRCode and Text::QRCode needs lib-qrencode:

HTML::QRCode:

t/00-load.t .. 1/1
        #   Failed test 'use Text::QRCode;'
        #   at t/00-load.t line 6.
        #     Tried to use 'Text::QRCode'.
        #     Error:  Can't load '/homes/pmg/.cpan/build/Text-QRCode-0.01-n4g2CG/blib/arch/auto/Text/QRCode/QRCode.so' for module Text::QRCode: libqrencode.so.3: cannot open shared object \
file: No such file or directory at /homes/pmg/pmg-soft/local-perl/lib/5.12.1/x86_64-linux/DynaLoader.pm line 200

I searched for libqrencode and downloaded it:

# download
wget http://fukuchi.org/works/qrencode/qrencode-3.1.1.tar.gz
# extract
# install
[~/pmg-soft/src/qrencode-3.1.1]
$  ./configure --prefix=$HOME/pmg-soft
$ make
$ make install

And I obtained a helpful message that I saved for later:

----------------------------------------------------------------------
Libraries have been installed in:
/homes/pmg/pmg-soft/lib

If you ever happen to want to link against installed libraries
in a given directory, LIBDIR, you must either use libtool, and
specify the full pathname of the library, or use the `-LLIBDIR'
flag during linking and do at least one of the following:
- add LIBDIR to the `LD_LIBRARY_PATH' environment variable
during execution
- add LIBDIR to the `LD_RUN_PATH' environment variable
during linking
- use the `-Wl,-rpath -Wl,LIBDIR' linker flag
- have your system administrator add LIBDIR to `/etc/ld.so.conf'

See any operating system documentation about shared libraries for
more information, such as the ld(1) and ld.so(8) manual pages.
----------------------------------------------------------------------

Now I have a T::QRC build extracted somewhere (I can see the path in the error message) [ Error: Can't load '/homes/pmg/.cpan/build/Text-QRCode-0.01-n4g2CG/blib/arch/auto/Text/QRCode/QRCode.so'].

I went there and read the Makefile.PL. It has two places (includes and ld-lib) where I need to add the path for my local lib-qrencode files:

a) the directives for the Makefile

sub MY::post_constants {
  [...]
  return <<"POST_CONST";
  CCFLAGS += $define -I/homes/pmg/pmg-soft/include
  LDDLFLAGS += -L/homes/pmg/pmg-soft/lib -lqrencode
  LDFLAGS += -L/homes/pmg/pmg-soft/lib -lqrencode
  POST_CONST
}

b) the command for testing that libqrencode exists (testing existence by compilation success)

sub test_libqrencode {
  my $compile_cmd
        = 'cc -I/homes/pmg/pmg-soft/include -I/usr/local/include -I/usr/include -L/homes/pmg/pmg-soft/lib -L/usr/lib -L/usr/local/lib -lqrencode';
  [..]
}

Then exported the environmental variable 'LD_RUN_PATH':

export LD_RUN_PATH=/homes/pmg/pmg-soft/lib

$ make clean
$ perl Makefile.PL 
| Cannot determine perl version info from lib/Text/QRCode.pm
| Checking if your kit is complete...
| Looks good 
| Writing Makefile for Text::QRCode
$ make          
$ make test
| PERL_DL_NONLAZY=1 /homes/pmg/pmg-soft//local-perl/bin/perl "-MExtUtils::Command::MM" "-e" "test_harness(0, 'inc', 'blib/lib', 'blib/arch')" t/00-load.t t/01-plot.t
| t/00-load.t .. 1/1 # Testing Text::QRCode 0.01, Perl 5.012001, /homes/pmg/pmg-soft//local-perl/bin/perl
| t/00-load.t .. ok
| t/01-plot.t .. ok
| All tests successful.
| Files=2, Tests=3,  0 wallclock secs ( 0.02 usr  0.03 sys +  0.06 cusr  0.04 csys =  0.15 CPU)
| Result: PASS
$ make install
|Files found in blib/arch: installing files in blib/lib into architecture dependent library tree
|Installing /homes/pmg/pmg-soft/local-perl/local-lib/lib/perl5/x86_64-linux/auto/Text/QRCode/QRCode.so
|Installing /homes/pmg/pmg-soft/local-perl/local-lib/lib/perl5/x86_64-linux/auto/Text/QRCode/QRCode.bs  
|Installing /homes/pmg/pmg-soft/local-perl/local-lib/lib/perl5/x86_64-linux/Text/QRCode.pm
|Installing /homes/pmg/pmg-soft/local-perl/local-lib/man/man3/Text::QRCode.3
|Appending installation info to /homes/pmg/pmg-soft/local-perl/local-lib/lib/perl5/x86_64-linux/perllocal.pod

Then go to cpan and install HTML::QRCode

cpan[2]> install HTML::QRCode

That's all, happy hacking.

Pablo

Sunday, 29 May 2011

What can go wrong when working with UTF8?

Everything, it seems, if you don't know what UTF-8 is about. See this Stack Overflow question and the response from @tchrist.

Deobfuscating large or complex regular expressions

From time to time you find a large or complex regular expression that has not been written with //x, so you have a one-liner without comments and, after some time trying to decode it, a big headache.

Recently I found an RE of this kind, and a module (YAPE::Regex::Explain) that helps you to decompose the RE into its elements.

regexp:

m{^((\w*)://(?:(\w+)(?:\:([^/\@]*))?\@)?(?:([\w\-\.]+)(?:\:(\d+))?)?/(\w*))(?:/(\w+)(?:\?(\w+)=(\w+))?)?((?:;(\w+)=(\w+))*)$}

This regexp parses URIs like:

mysql://anonymous@my.self.com:1234/dbname/tablename

(NOTE: these URIs are better parsed with URI::Split but this is another story)

And how to decompose it:

#!/usr/bin/env perl

use feature ':5.10';
use strict;
use URI::Split qw(uri_join uri_split);
use YAPE::Regex::Explain;
use Data::Dumper;

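# The regexp shown above. /x does not change its meaning here
# because the pattern contains no literal whitespace or '#'.
my $REx = qr{^((\w*)://(?:(\w+)(?:\:([^/\@]*))?\@)?(?:([\w\-\.]+)(?:\:(\d+))?)?/(\w*))(?:/(\w+)(?:\?(\w+)=(\w+))?)?((?:;(\w+)=(\w+))*)$}x;

explain_RE($REx);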

sub explain_RE {
    my $REx = shift;
    my $exp = YAPE::Regex::Explain->new($REx)->explain;
    print $exp;
}

result:

The regular expression:

(?x-ims:
^((\w*)://(?:(\w+)(?:\:([^/\@]*))?\@)?(?:([\w\-\.]+)(?:\:(\d+))?)?/(\w*))(?:/(\w+)(?:\?(\w+)=(\w+))?)?((?:;(\w+)=(\w+))*)$)
matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?x-ims:                 group, but do not capture (disregarding
                         whitespace and comments) (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n):
----------------------------------------------------------------------
  ^                        the beginning of the string
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    (                        group and capture to \2:
----------------------------------------------------------------------
      \w*                      word characters (a-z, A-Z, 0-9, _) (0
                               or more times (matching the most
                               amount possible))
----------------------------------------------------------------------
    )                        end of \2
----------------------------------------------------------------------
    ://                      '://'
----------------------------------------------------------------------
    (?:                      group, but do not capture (optional
                             (matching the most amount possible)):
----------------------------------------------------------------------
      (                        group and capture to \3:
----------------------------------------------------------------------
        \w+                      word characters (a-z, A-Z, 0-9, _)
                                 (1 or more times (matching the most
                                 amount possible))
----------------------------------------------------------------------
      )                        end of \3
----------------------------------------------------------------------
      (?:                      group, but do not capture (optional
                               (matching the most amount possible)):
----------------------------------------------------------------------
        \:                       ':'
----------------------------------------------------------------------
        (                        group and capture to \4:
----------------------------------------------------------------------
          [^/\@]*                  any character except: '/', '\@' (0
                                   or more times (matching the most
                                   amount possible))
----------------------------------------------------------------------
        )                        end of \4
----------------------------------------------------------------------
      )?                       end of grouping
----------------------------------------------------------------------
      \@                       '@'
----------------------------------------------------------------------
    )?                       end of grouping
----------------------------------------------------------------------
    (?:                      group, but do not capture (optional
                             (matching the most amount possible)):
----------------------------------------------------------------------
      (                        group and capture to \5:
----------------------------------------------------------------------
        [\w\-\.]+                any character of: word characters
                                 (a-z, A-Z, 0-9, _), '\-', '\.' (1 or
                                 more times (matching the most amount
                                 possible))
----------------------------------------------------------------------
      )                        end of \5
----------------------------------------------------------------------
      (?:                      group, but do not capture (optional
                               (matching the most amount possible)):
----------------------------------------------------------------------
        \:                       ':'
----------------------------------------------------------------------
        (                        group and capture to \6:
----------------------------------------------------------------------
          \d+                      digits (0-9) (1 or more times
                                   (matching the most amount
                                   possible))
----------------------------------------------------------------------
        )                        end of \6
----------------------------------------------------------------------
      )?                       end of grouping
----------------------------------------------------------------------
    )?                       end of grouping
----------------------------------------------------------------------
    /                        '/'
----------------------------------------------------------------------
    (                        group and capture to \7:
----------------------------------------------------------------------
      \w*                      word characters (a-z, A-Z, 0-9, _) (0
                               or more times (matching the most
                               amount possible))
----------------------------------------------------------------------
    )                        end of \7
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  (?:                      group, but do not capture (optional
                           (matching the most amount possible)):
----------------------------------------------------------------------
    /                        '/'
----------------------------------------------------------------------
    (                        group and capture to \8:
----------------------------------------------------------------------
      \w+                      word characters (a-z, A-Z, 0-9, _) (1
                               or more times (matching the most
                               amount possible))
----------------------------------------------------------------------
    )                        end of \8
----------------------------------------------------------------------
    (?:                      group, but do not capture (optional
                             (matching the most amount possible)):
----------------------------------------------------------------------
      \?                       '?'
----------------------------------------------------------------------
      (                        group and capture to \9:
----------------------------------------------------------------------
        \w+                      word characters (a-z, A-Z, 0-9, _)
                                 (1 or more times (matching the most
                                 amount possible))
----------------------------------------------------------------------
      )                        end of \9
----------------------------------------------------------------------
      =                        '='
----------------------------------------------------------------------
      (                        group and capture to \10:
----------------------------------------------------------------------
        \w+                      word characters (a-z, A-Z, 0-9, _)
                                 (1 or more times (matching the most
                                 amount possible))
----------------------------------------------------------------------
      )                        end of \10
----------------------------------------------------------------------
    )?                       end of grouping
----------------------------------------------------------------------
  )?                       end of grouping
----------------------------------------------------------------------
  (                        group and capture to \11:
----------------------------------------------------------------------
    (?:                      group, but do not capture (0 or more
                             times (matching the most amount
                             possible)):
----------------------------------------------------------------------
      ;                        ';'
----------------------------------------------------------------------
      (                        group and capture to \12:
----------------------------------------------------------------------
        \w+                      word characters (a-z, A-Z, 0-9, _)
                                 (1 or more times (matching the most
                                 amount possible))
----------------------------------------------------------------------
      )                        end of \12
----------------------------------------------------------------------
      =                        '='
----------------------------------------------------------------------
      (                        group and capture to \13:
----------------------------------------------------------------------
        \w+                      word characters (a-z, A-Z, 0-9, _)
                                 (1 or more times (matching the most
                                 amount possible))
----------------------------------------------------------------------
      )                        end of \13
----------------------------------------------------------------------
    )*                       end of grouping
----------------------------------------------------------------------
  )                        end of \11
----------------------------------------------------------------------
  $                        before an optional \n, and the end of the
                           string
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------

and my manual explanation for the URI parsing:

               ^(
                 (\w*)
                 ://
                 (?:
                   (\w+)  # user
                   (?:
                     \:
                     ([^/\@]*)  # password
                   )?
                   \@
                 )?  # user and password are optional
                 (?:
                   ([\w\-\.]+)  # host
                   (?:
                     \:  
                     (\d+)  # port
                   )?  # port optional
                 )?  # host and port optional
                 /  # becomes a third '/' if there is no user, password, host or port
                 (\w*)  # get the db (only up to the first '/', if any). Will not work with full paths for sqlite.
               )
               (?:
                 /   # if tables 
                 (\w+)  # get table
                 (?:
                   \?  # parameters
                   (\w+)
                   =
                   (\w+)
                 )?  # the parameter is optional but, if present, always follows a table name
               )?  # optional table and parameter
               (
                 (?:
                   ;
                   (\w+)
                   =
                   (\w+)
                 )*  # rest of parameters if any
               )
               $
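To check the annotated groups against the sample URI, here is a minimal sketch that applies the same regexp and labels captures 2 to 8. The field names and the parse_db_uri helper are my own labels, not part of the original code.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# The regexp from the post, unchanged.
my $re = qr{^((\w*)://(?:(\w+)(?:\:([^/\@]*))?\@)?(?:([\w\-\.]+)(?:\:(\d+))?)?/(\w*))(?:/(\w+)(?:\?(\w+)=(\w+))?)?((?:;(\w+)=(\w+))*)$};

# Hypothetical labels for capture groups 2..8, following the manual explanation.
my @names = qw(scheme user password host port db table);

# Hypothetical helper: returns a hash ref of labelled captures,
# or nothing when the URI does not match.
sub parse_db_uri {
    my ($uri) = @_;
    my @groups = $uri =~ $re or return;
    my %field;
    @field{@names} = @groups[1 .. 7];    # $groups[0] is capture \1
    return \%field;
}

my $parsed = parse_db_uri('mysql://anonymous@my.self.com:1234/dbname/tablename');
for my $name (@names) {
    printf "%-8s %s\n", $name, $parsed->{$name} // '(none)';
}
```

For the sample URI this reports mysql, anonymous, my.self.com, 1234, dbname and tablename, with no password, which agrees with the group-by-group reading above.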

This regular expression was probably an easy one, but while searching for more examples of YAPE::Regex::Explain I found two interesting links about Perl obfuscation with REs, in Stack Overflow and PerlMonks threads.

Wednesday, 27 April 2011

nice example of a helping community in CPAN

Reading the Catalyst mailing list today gave me great satisfaction. I saw how having the right attitude, asking the right questions, listening to people, doing your homework and reporting to CPAN authors sometimes gives you satisfactory results.

It all happened in this thread:

[Catalyst] building 'local' lib with dependencies for shipping
where Fernan Aguero asked how to pack a Catalyst application with all the needed dependencies. The problem is that if you want to do it in a platform-independent way, you need to keep a list of the dependencies (with the versions of the modules you currently use) and use a method that installs all of them automatically.

One of the solutions was to use Shipwright, created by Best Practical. It didn't work on the first run, but it seemed the right tool, so after a few more messages on the list Fernan contacted one of the developers:

[Fernan]

I contacted sunnavy (one of the Shripwright developers) and this is
what he said:

"I found the archive( PerlMagick ) has an interesting files format, which
shipwright didn't handle before.

I just released 2.4.25 to fix this, which will show up in cpan soon.
in case you are hurry, here it is: http://goo.gl/6Jtzz"

Just tested 2.4.25 and shripwright now imported all of the
dependencies for my cat app correctly.

I'm now building my ship :) and will get into testing the catalyst app
running from the vessel soon.

It is nice to see that collaboration works in the Perl world.

Thursday, 21 April 2011

converting ANSI to HTML. How to convert to html the colored shell output

The main aim of this was to be able to put the output of git log and git diff in HTML.

Googling around I found that CPAN has the HTML::FromANSI module. This module also installs ansi2html, which accepts input from stdin.

ls --color | ansi2html -p > my_web_page.html

ls --color | ansi2html > my_snpipet_code-no_header-footer.html

But I prefer the default output from ansi2html.sh from pixelbeat

Unfortunately the ls --color output gets properly converted to HTML but the git output does not, no matter which script I use. Could it be because the color is defined in the config as color.ui=auto?

git diff HEAD master -- ensembl/sql/CVS/Tag | ansi2html -p > ~/public_html/htdocs_dev/diff1.html

git diff HEAD master -- ensembl/sql/CVS/Tag | ansi2html.sh --bg=dark > ~/public_html/htdocs_dev/diff2.html

[UPDATE]
Yes! The problem was that I have color.ui=auto in the configuration, because explicitly passing --color in the command makes it work:

$ git diff --color HEAD master -- ensembl/sql/CVS/Tag | ansi2html.sh --bg=dark > ~/public_html/htdocs_dev/diff3.html
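The reason is that with color.ui=auto git only colors output that goes to a terminal, and a pipe is not a terminal. Here is a rough Perl sketch of that kind of decision. The want_color helper is hypothetical and only illustrates the idea; it is not git's actual code.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper mimicking an always/never/auto color setting.
sub want_color {
    my ($mode, $fh) = @_;
    return 1 if $mode eq 'always';
    return 0 if $mode eq 'never';
    my $is_tty = -t $fh;          # auto: true only when the handle is a terminal
    return $is_tty ? 1 : 0;
}

# Prints in green on a terminal, plain text when piped or redirected.
if (want_color('auto', \*STDOUT)) {
    print "\e[32mcolor on\e[0m\n";
}
else {
    print "color off\n";
}
```

Running the script directly in a shell and then piping it through cat shows both branches.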

Sunday, 16 January 2011

Managing script options: Getopt-Long-Descriptive

I use Getopt::Long for all my scripts and, like many other people, I have my own methods to handle the logic for many of the standard options that my scripts have (-verbose -debug -test -run -in_file -out_file -bsub, etc.). Things get complicated when you check whether your script has been given the right options: sometimes there are incompatible options, or alternative ones, or, if you pass -test, many mandatory options are not mandatory anymore. You also need to provide the right message for each scenario and write the usage message.
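A minimal sketch of this kind of hand-written validation, using only core Getopt::Long. The option names come from the list above, but the concrete rules (-test excludes -run, the file options are required unless -test is given) and the parse_options helper are assumptions made for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: returns (\%options, undef) on success
# or (undef, $error_message) on failure.
sub parse_options {
    my @args = @_;
    my %opt;
    GetOptionsFromArray(\@args, \%opt,
        'verbose', 'test', 'run', 'in_file=s', 'out_file=s')
        or return (undef, 'unknown or malformed option');

    # Assumed rule: -test and -run are incompatible.
    return (undef, '-test and -run are mutually exclusive')
        if $opt{test} && $opt{run};

    # Assumed rule: the file options are mandatory unless -test is given.
    unless ($opt{test}) {
        for my $name (qw(in_file out_file)) {
            return (undef, "-$name is required") unless defined $opt{$name};
        }
    }
    return (\%opt, undef);
}

# Try a few fixed argument lists instead of @ARGV.
for my $argv ([qw(-test)], [qw(-run -in_file a.vcf)], [qw(-test -run)]) {
    my ($opt, $error) = parse_options(@$argv);
    print "@$argv => ", ($error // 'ok'), "\n";
}
```

Every new rule adds another branch and another message to keep in sync with the usage text, which is the part that gets tedious.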

Today, reading the code of the WWW::PivotalTracker::Command Perl module, I saw that it uses Getopt::Long::Descriptive. This is exactly the module I was looking for. It implements 'one of these', 'required', etc., and also writes the usage message automatically for you. I have not tried it yet but I will give it a try this week. Any comments about other users' experiences would be welcome.

PivotalTracker perl module

I like XP and the Agile programming style, and I am very fond of daily and weekly to-do lists to mark the achievement of project goals. I have recently started to use PivotalTracker to manage my project tasks and record completion velocity, and I am very happy with it. I use it as a companion to the workplace Jira (an old version with no extensions, so a very limited Jira). Therefore I am using PivotalTracker as a pseudo Jira extension (when I finish a week I export the stories to CSV and paste them into Jira).

I love the approach of PivotalTracker, and it is also good that it is easy to import and export data. It also has Perl bindings to its API: WWW-PivotalTracker. I will try it soon.