Unfortunately, the person behind the Kyoto Encyclopedia of Genes and Genomes (KEGG) is reaching retirement, and the main funding agency that was supporting KEGG has changed and no longer supports individual databases.
Therefore starting on July 1, 2011 the KEGG FTP site for academic users will be transferred from GenomeNet at Kyoto University to NPO Bioinformatics Japan, and it will be available only to paid subscribers. The publicly funded portion, the medicus directory, will continue to be freely accessible at GenomeNet.
You can read the rationale behind this decision on this page.
We will see how this affects the Reactome guys.
Tuesday, 31 May 2011
Sunday, 29 May 2011
What can go wrong when working with UTF8?
It seems that everything can, if you don't know what UTF8 is about. See this Stack Overflow question and the response of @tchris.
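As one small illustration of the kind of thing that goes wrong, here is a minimal sketch (my own example, not taken from the Stack Overflow answer) of the difference between a string of UTF-8 bytes and a string of characters. It uses only the core Encode module; the sample word and the variable names are assumptions made for the example.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Encode qw(decode encode);

# The five UTF-8 bytes that encode the four characters "c", "a", "f"
# and "e with an acute accent".
my $bytes = "caf\xC3\xA9";

# Decoding turns the byte string into a string of characters.
my $chars = decode('UTF-8', $bytes);

printf "as bytes:      length %d\n", length $bytes;   # counts bytes
printf "as characters: length %d\n", length $chars;   # counts characters

# Encoding the characters again gives back the original bytes.
my $round_trip = encode('UTF-8', $chars);
print "round trip ok\n" if $round_trip eq $bytes;
```

The undecoded string reports a length of 5 and the decoded one a length of 4, so anything that counts, truncates or matches on text that was never decoded is working on bytes and not on characters.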
Deobfuscating large or complex regular expressions
From time to time you find a large or complex regular expression that has not been written with //x, so you have a one-liner without comments and, after some time trying to decode it, a big headache.
Recently I found an RE of this kind and a module (YAPE::Regex::Explain) that helps you decompose the RE into its elements.
regexp:
m{^((\w*)://(?:(\w+)(?:\:([^/\@]*))?\@)?(?:([\w\-\.]+)(?:\:(\d+))?)?/(\w*))(?:/(\w+)(?:\?(\w+)=(\w+))?)?((?:;(\w+)=(\w+))*)$}
This regexp parses URIs like:
mysql://anonymous@my.self.com:1234/dbname/tablename
(NOTE: these URIs are better parsed with URI::Split but this is another story)
And how to decompose it:
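#!/usr/bin/env perl
use strict;
use warnings;
use YAPE::Regex::Explain;

# The pattern is passed as a plain string; q{} keeps every backslash as written.
my $REx = q{^((\w*)://(?:(\w+)(?:\:([^/\@]*))?\@)?(?:([\w\-\.]+)(?:\:(\d+))?)?/(\w*))(?:/(\w+)(?:\?(\w+)=(\w+))?)?((?:;(\w+)=(\w+))*)$};

explain_RE($REx);

# Print the node-by-node explanation of a regular expression.
sub explain_RE {
    my $REx = shift;
    my $exp = YAPE::Regex::Explain->new($REx)->explain;
    print $exp;
}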
result:
The regular expression:
(?x-ims:
^((\w*)://(?:(\w+)(?:\:([^/\@]*))?\@)?(?:([\w\-\.]+)(?:\:(\d+))?)?/(\w*))(?:/(\w+)(?:\?(\w+)=(\w+))?)?((?:;(\w+)=(\w+))*)$)
matches as follows:
NODE                    EXPLANATION
----------------------------------------------------------------------
(?x-ims:                group, but do not capture (disregarding whitespace and comments) (case-sensitive) (with ^ and $ matching normally) (with . not matching \n):
  ^                     the beginning of the string
  (                     group and capture to \1:
    (                   group and capture to \2:
      \w*               word characters (a-z, A-Z, 0-9, _) (0 or more times (matching the most amount possible))
    )                   end of \2
    ://                 '://'
    (?:                 group, but do not capture (optional (matching the most amount possible)):
      (                 group and capture to \3:
        \w+             word characters (a-z, A-Z, 0-9, _) (1 or more times (matching the most amount possible))
      )                 end of \3
      (?:               group, but do not capture (optional (matching the most amount possible)):
        \:              ':'
        (               group and capture to \4:
          [^/\@]*       any character except: '/', '\@' (0 or more times (matching the most amount possible))
        )               end of \4
      )?                end of grouping
      \@                '@'
    )?                  end of grouping
    (?:                 group, but do not capture (optional (matching the most amount possible)):
      (                 group and capture to \5:
        [\w\-\.]+       any character of: word characters (a-z, A-Z, 0-9, _), '\-', '\.' (1 or more times (matching the most amount possible))
      )                 end of \5
      (?:               group, but do not capture (optional (matching the most amount possible)):
        \:              ':'
        (               group and capture to \6:
          \d+           digits (0-9) (1 or more times (matching the most amount possible))
        )               end of \6
      )?                end of grouping
    )?                  end of grouping
    /                   '/'
    (                   group and capture to \7:
      \w*               word characters (a-z, A-Z, 0-9, _) (0 or more times (matching the most amount possible))
    )                   end of \7
  )                     end of \1
  (?:                   group, but do not capture (optional (matching the most amount possible)):
    /                   '/'
    (                   group and capture to \8:
      \w+               word characters (a-z, A-Z, 0-9, _) (1 or more times (matching the most amount possible))
    )                   end of \8
    (?:                 group, but do not capture (optional (matching the most amount possible)):
      \?                '?'
      (                 group and capture to \9:
        \w+             word characters (a-z, A-Z, 0-9, _) (1 or more times (matching the most amount possible))
      )                 end of \9
      =                 '='
      (                 group and capture to \10:
        \w+             word characters (a-z, A-Z, 0-9, _) (1 or more times (matching the most amount possible))
      )                 end of \10
    )?                  end of grouping
  )?                    end of grouping
  (                     group and capture to \11:
    (?:                 group, but do not capture (0 or more times (matching the most amount possible)):
      ;                 ';'
      (                 group and capture to \12:
        \w+             word characters (a-z, A-Z, 0-9, _) (1 or more times (matching the most amount possible))
      )                 end of \12
      =                 '='
      (                 group and capture to \13:
        \w+             word characters (a-z, A-Z, 0-9, _) (1 or more times (matching the most amount possible))
      )                 end of \13
    )*                  end of grouping
  )                     end of \11
  $                     before an optional \n, and the end of the string
)                       end of grouping
----------------------------------------------------------------------
and my manual explanation for the URI parsing:
^(
  (\w*)
  ://
  (?:
    (\w+)              # user
    (?:
      \:
      ([^/\@]*)        # password
    )?
    \@
  )?                   # user and password may be absent
  (?:
    ([\w\-\.]+)        # host
    (?:
      \:
      (\d+)            # port
    )?                 # port optional
  )?                   # host and port optional
  /                    # becomes a third '/' if there is no user, password, host and port
  (\w*)                # get the db (only up to the first '/', if any); will not work with full paths for sqlite
)
(?:
  /                    # if tables
  (\w+)                # get table
  (?:
    \?                 # parameters
    (\w+)
    =
    (\w+)
  )?                   # the parameter is optional, but it always needs a table name
)?                     # optional table and parameter
(
  (?:
    ;
    (\w+)
    =
    (\w+)
  )*                   # rest of parameters, if any
)
$
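To check the explanation against the sample URI, here is a minimal sketch that applies the same pattern, written with /x, and prints what each of the 13 capture groups holds. The labels in @names and the helper parse_db_uri are my own names for illustration; they are not part of the original code.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# The same pattern, spread over several lines with /x.
my $re = qr{
    ^(
        (\w*) ://                              # scheme
        (?: (\w+) (?: \: ([^/\@]*) )? \@ )?    # user and optional password
        (?: ([\w\-\.]+) (?: \: (\d+) )? )?     # host and optional port
        / (\w*)                                # database
    )
    (?: / (\w+) (?: \? (\w+) = (\w+) )? )?     # table and one optional parameter
    ( (?: ; (\w+) = (\w+) )* )                 # remaining ;key=value pairs
    $
}x;

# Labels for the 13 capture groups, in order; the names are only illustrative.
my @names = qw(
    base scheme user password host port database
    table param_key param_value rest last_key last_value
);

# Return a hash reference of label => captured text, or nothing if no match.
sub parse_db_uri {
    my ($uri) = @_;
    my @captures = $uri =~ $re or return;
    my %field;
    @field{@names} = @captures;
    return \%field;
}

my $field = parse_db_uri('mysql://anonymous@my.self.com:1234/dbname/tablename');
for my $name (@names) {
    my $value = defined $field->{$name} ? $field->{$name} : '(undef)';
    printf "%-12s %s\n", $name, $value;
}
```

For the sample URI the optional groups that do not take part in the match (password and the parameters) come back undefined, which is a quick way to see which branches of the expression were actually used.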
This regular expression was probably an easy one, but while searching for more examples of YAPE::Regex::Explain I found two interesting links about Perl obfuscation with REs, in Stack Overflow and PerlMonks threads.
Monday, 9 May 2011
malware web page scam today
A friend posted a link on Twitter: http://bit.ly/mcHjaZ and when I followed it the first time I ended up on a malware scam page that was very convincing. (From the second time onwards I was directed to the right page.)
I do not know whether it had something to do with bit.ly, Twitter, or the target page (probably the latter), but the page title in the browser tab was the same as that of the originally linked page. I assume that this is a hijacking JavaScript injection on the web server of the target page:
[not put as a link to prevent clicking ;-)]
http://dornob.com/mind-warping-wood-folding-chair-looks-curved-packs-flat/
Very frightening.
[update]
Thanks to Karen from my Systems Team, here is a more detailed explanation:
http://www.snipe.net/2011/05/rogue-mac-antivirus/
[update 2011-05-30]
Apple has created an entry about this on support.apple.com:
How to avoid or remove Mac Defender malware