Perl


These links are collected from the Perlbuzz Twitter feed. If you have suggestions for news bits, please mail me at andy@perlbuzz.com. A Test::Class anti-pattern (blogs.perl.org) What's new in Perl 5.18? (perltraining.com.au) Videos from the Polish Perl Workshop (youtube.com) vroom lets you do presentations in vi (blogs.perl.org) A roundup of YAPC::NA 2013 links (randomgeekery.org) ctags extensions for modern Perl (github.com) Experimental Perl features now warn (effectiveperlprogramming.com) My virtual YAPC::NA 2013 (blogs.perl.org) I love pre-modern Perl and so should you (blogs.perl.org) Chicago.PM has new website and Meetup page (blogs.perl.org) Stop talking about Perl (blogs.perl.org) Perl hash basics (perltricks.com) ExtUtils::MakeMaker is STILL doomed after a decade (blogs.perl.org)
about 1 hour ago
A few days ago, one of Perl's most distinguished free-floating agents of chaos posted a patch for discussion. dots.pm changes the Perl 5 dereference arrow to a dot. This lexically scoped pragma also changes the concatenation operator from dot to tilde, which is the current syntax in Perl 6. p5p reception tended to be positive. Discussion elsewhere tended to be negative. After reviewing the proposal and reception, Perl 5 pumpking rjbs rejected the patch for 5.20. His reasoning is straightforward. While fans of dots.pm may feel a legitimate disappointment at the current rejection of the patch, it's worth praising rjbs for an evenhanded evaluation of the intent and implementation of the feature. (If that evaluation had gone into the smartmatch operator or automatic dereferencing of references to aggregates with each, for example, Perl 5 would be in better shape.) My initial reaction to the patch was mild interest. It didn't immediately grab me as an idea that Perl needs. The more I thought about it, the less I liked it, for four reasons. It's too invasive technically for my taste It's a small patch to the parser, but it repeats a pattern which I've never liked in feature.pm, which is to say that it adds branches to the Perl parser/tokenizer/lexer which apply to the current compilation unit based on hints provided to the lexical scoping. To some degree this is a limitation of how a Perl program gets parsed, but the more optional features and branches in the parser, the more difficult it is to maintain the parser. I know this sounds like a slippery slope argument, and a pernicious one. To some extent it is, but the parser and tokenizer and lexer are already a big ball of mud. Making that more so worries me. With that said, the patch itself is as clean as you can reasonably expect. This is no criticism of Chip's skills. The patch does too much I like the idea of changing method calls from $invocant->method to $invocant.method. 
I'd like to experiment with that in my code for a while. (Back when I wrote P6 code, that syntax was easy to use and easy to read.) I'm ambivalent about changing the concatenation operator from dot to tilde. If there's a way to keep concatenation as it is, so much the better—but that probably means requiring significant whitespace around the concatenation operator and forbidding whitespace around the method invocation operator. The latter is troublesome; it would be a shame to borrow the problematic "unspace" concept from P6. I'm not thrilled at all about using dot as a generic dereference operator, turning $href->{key} into $href.{key}. That seems to borrow trouble; think about bugs waiting to happen with that code. It might encourage fragmentation While some parser changes merely add new keywords (say, defined-or) or make code that was previously a syntax error work (package BLOCK), this changes the meaning of two (maybe three) operators which have worked this way for almost 20 years. Yes, the effect of the dots pragma is local, as it should be, but the effect also creates divergent dialects of Perl. Rather than having to learn the meaning of new terms when the say feature is in effect, dots means having to learn the new meaning of a term when it is in effect. That cognitive burden seems higher. It's like updating the style guide on a well established project. Previously it recommended always quoting hash keys. Now it recommends the opposite. You'll spend time working with both styles until you scrub the old version out of your code and tests and support files and everywhere it's reached. Yes, the lexical scoping of dots.pm helps, but do you really want to sprinkle use dots; and no dots; throughout existing code while you're making that transition? If the use of dots.pm spread to the CPAN, could it ever be contained? Modern::Perl was never supposed to be a dependency of other CPAN modules, but it is. So is common::sense and, most unfortunately, strictures.pm. 
(The latter is worse becau
2 days ago
Nicholas Clark writes: As per my grant conditions, here is a report for the April period. I started the month looking at the Unicode Names code. At Karl's suggestion I changed it to parse the UnicodeData.txt file properly. Previously it had hardcoded various constants, particularly related to the CJK ideographs and Hangul syllables. The CJK ranges in Unicode have increased in the past, and so it's possible that they will increase again. Not only is it (more) future proof, it also made it simpler to detect significant gaps in the allocated character ranges, which is useful for size optimisations. By handling the gaps better I reduced the data size by 13K, and by using two sizes of arrays for the trie structure, saved a further 25K. The intent of all this is to provide the data needed for the \N{} syntax directly as C code and static data, to avoid the tokeniser needing to load charnames if it sees \N{}. Given that the C code in question is generated by Perl, but to compile the Perl you need the C code, there's a potential bootstrapping problem here. Not wishing to ship >1M of generated code if avoidable, I experimented to see whether the \N{} escape syntax is needed by miniperl. It turns out that if you replace the \N{} handler by abort() in miniperl, you can still build perl perfectly. Excellent! Also, telling the Makefile to build distinct toke.o and tokemini.o is a one line change - it's nice when easy things are easy. Frustratingly the work is not yet ready to merge into blead, as it's not yet finished enough, and other things keep taking priority. We had a bit of a digression involving perl on s390. Merijn was given a CD for a linux distribution for s390, and soon had it running on the emulator "Hercules". What does one do next? Obviously - try to build Perl. So he built blead (which worked) and ran the tests (which mostly worked). My initial reaction was: Is anyone actually using Perl on it? 
In that, we've not had any bug reports about this before, and for a lot of these somewhat esoteric platforms I'm sort of wondering at what point do they stop being "fun", and start being "work". Right now, I think it's more at the "fun" level, and it might be telling us something interesting about portability, as it might be that all these tests failing are down to the same problem. The problems all seemed to involve conversion of large floating point values to integers. Which we do quite a lot of, and should work, but historically we've been skirting the boundary of what's conformant ANSI C, sometimes on the wrong side. So Merijn and I exchanged code and results as I tried to remote debug what the cause was. We tried to establish whether it was even handling large unsigned integers correctly (it was). We tried to rule out unions and ? : ternaries (which have confused compilers in the past). Nope. In the end, we ascertained that it was a bug in the supplied gcc 4.3.4 - it generated bad code for casting from unsigned integers to doubles. At which point Niko Tyni replied that the particular problem was already diagnosed as a compiler bug, and had been fixed. Debian was building on s390 with gcc 4.6.3, and he believed that gcc 4.4.7 was fixed. So that all ended up being rather a waste of time, thanks to the continued installation and use of an obsolete and buggy compiler. Particularly frustrating given that a fix exists in newer versions of that compiler. A significant development this month was having serious second thoughts about $* and friends. As mentioned in last month's report, everything smoked fine, so the change was merged to blead. Only then did the problems emerge. Specifically Father Chrysostomos demonstrated rather succinctly that the core's tests weren't comprehensive enough. The tests correctly verified that using any of @*, &*, ** and %* generated the desired deprecation warning. 
But the warning was also generated by *{*}, *{"*"} and C, none of which "need" to be deprecated. Nothing tested these, so noth
5 days ago
These links are collected from the Perlbuzz Twitter feed. If you have suggestions for news bits, please mail me at andy@perlbuzz.com. A Test::Class anti-pattern (blogs.perl.org) What's new in Perl 5.18? (perltraining.com.au) Videos from the Polish Perl Workshop (youtube.com) vroom lets you do presentations in vi (blogs.perl.org) A roundup of YAPC::NA 2013 links (randomgeekery.org) Lessons from upgrading to Perl 5.16 (blogs.perl.org) How to do multi-dimensional arrays in Perl (perlmaven.com)
7 days ago
This month I worked on three 5.18 blocker tickets; all three being regressions related to my jumbo re_eval fix back in 5.17.1. The first, which I continued working on from last month, was the "Regexp::Grammars" bug. Basically, my reworking of the /(?{})/ implementation assumed that a constant string segment like "foo" in /foo..../ would indeed be constant; but in the presence of use overload::constant qr => sub { bless [], ... } the "constant" can be anything but, including an overloaded or REGEXP object. So the concatenation of the pattern's string segments didn't handle all the extra stuff like doing overloading properly or extracting out pre-compiled code blocks from qr// objects. This is now fixed. The second issue concerned handling arrays embedded within literal regexes, e.g. /...@a.../. This was partially to fix a regression from 5.16.x where, if @a contained a qr/...(?{...}).../, then suddenly you'd need a 'use re eval' where you didn't need one before: RT #115004. But it also enhances the behaviour of array interpolation relative to 5.16.x too, especially relating to closures and overloading. Basically, the traditional behaviour of run-time patterns such as /a${b}c/ was to concatenate the pattern components together, then pass it to the regex engine. My 5.17.1 jumbo re_eval fix changed that so that the list of args was preserved and passed as-is to the regex engine. This meant that the engine could do things like extract out existing optrees from code blocks in something like $b = qr/...(?{...}).../, rather than having to recompile them. So closures work properly. The thing I missed back then was applying the same new handling to arrays as well as scalars. Until my fix, /a@{b}c/ would be parsed as regcomp('a', join($", @b), 'c'). This meant that the array was flattened and its contents stringified before hitting the regex engine. 
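To make the flattening concrete, here is a small sketch (not taken from the report) of how an interpolated array of plain strings behaves in a pattern: the elements are joined with the list separator `$"` to form the pattern text. The variable names and sample strings are made up for illustration, and the example deliberately avoids qr// objects and overloading, which is where the behaviour described in the report changed.

```perl
use strict;
use warnings;

# Hypothetical sample data: plain strings only.
my @b = ('x', 'y');

# With the default list separator ($" is a single space) the pattern
# behaves like /ax yc/.
my $default_sep = ("ax yc" =~ /a@{b}c/) ? 1 : 0;

# Changing $" changes what the interpolated array contributes.
my $alternation;
{
    local $" = '|';
    # Now the pattern behaves like /a(?:x|y)c/.
    $alternation = ("ayc" =~ /a(?:@b)c/) ? 1 : 0;
}

print "default separator matched: $default_sep\n";
print "alternation matched: $alternation\n";
```

Both matches succeed for plain strings. The report's point is about what happens when the array holds something richer than a string, such as a compiled qr// with code blocks or an object with 'qr' overloading, where stringifying first loses information.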
I've now changed it so that it is parsed as regcomp('a', @b, 'c') (but where the array isn't flattened, but rather just the AV itself is pushed onto the stack, cf. push @b, ....). As well as handling closures properly, it also means that 'qr' overloading is now handled with interpolated arrays as well as with scalars:

    use overload 'qr' => sub { return qr/a/ };
    my $o = bless [];
    my @a = ($o);
    "a" =~ /^$o$/; # always worked
    "a" =~ /^@a$/; # now works too

As well as the new handling of arrays, the pattern concatenation code within Perl_re_op_compile was heavily reworked, resulting in fixing a utf8 edge case, and generally simplifying the code, including enabling the removal of a clunky if (0) { label: ... } bit of code. This issue is now fully fixed.

The third issue concerned how caller() and __SUB__ work within regex code blocks. It turns out that since my re_eval jumbo fix, code blocks in literal matches were displaying an extraneous extra stack frame. This code:

    #!/usr/bin/perl
    use Carp;

    sub f3 { croak() }
    sub f2 { "a" =~ /a(?{f3(3)})/ }
    sub f1 { f2(2) }
    f1(1);

gives the following results:

    5.16.3:
        main::f3(3) called at (re_eval 1) line 1
        main::f2(2) called at /home/davem/tmp/p line 6
        main::f1(1) called at /home/davem/tmp/p line 7

    5.17.10:
        main::f3(3) called at /home/davem/tmp/p line 5
        main::f2 called at /home/davem/tmp/p line 5
        main::f2(2) called at /home/davem/tmp/p line 6
        main::f1(1) called at /home/davem/tmp/p line 7

    blead:
        main::f3(3) called at /home/davem/tmp/p line 5
        main::f2(2) called at /home/davem/tmp/p line 6
        main::f1(1) called at /home/davem/tmp/p line 7

In addition, the __SUB__ token, which returns a reference to the current subroutine, was returning a ref to the hidden anonymous sub which is now used to implement closure behaviour correctly for code blocks within qr//'s; that is, $r = qr/foo(?{...})bar/; is supposed to behave like $r = sub { /foo/ && do {...} && /bar/ } as far as closures are concerned. The trouble
8 days ago
This month was mostly spent on removing global state from the regex engine, making re-entrancy less error-prone. The extract from the merge commit description below gives you all the details you could ever want. Apart from that I spent a few hours re-enabling Copy-on-Write by default after the 5.18.0 release, plus a few other bits and pieces. It turns out that I have finally used up all the hours on my grant plus extensions. I really must get round to applying for a new grant sometime soon!

commit 7d75537ea64f99b6b8b8049465b6254f5d16c693
Merge: 3a74e0e 28d03b2
Author: David Mitchell
AuthorDate: Sun Jun 2 20:59:58 2013 +0100

[MERGE] get rid of (most) regex engine global state

Historically, perl's regex engine was based on Henry Spencer's regex code, which was all contained within a single file and used a bunch of static variables to maintain the state of the current regex compile or execution. This was perfectly adequate when only a single thread could execute a regex, and where the regex engine couldn't be called re-entrantly. In 5.0, these vars were promoted to be full global vars as perl became embeddable; then in 5.5 they became part of the perl interpreter struct when MULTIPLICITY was introduced. In 5.6, the Perl_save_re_context() function was introduced that did a whole bunch of SAVEPPTR type stuff, and was called in various places where it was possible that the engine may be re-entered, to avoid overwriting the global state of the currently executing regex. This was particularly important now that Unicode had been introduced, and certain character classes could trigger a call to the perl-level SWASH code, which could itself execute a regex; and where /(?{ ... })/ code blocks could be called which could do likewise. In 5.10, the various PL_foo variables became fields within the new re_save_state struct, and a new interpreter var, PL_reg_state, was introduced which was of type struct re_save_state. 
Thus, all the individual vars were still global state, but it became easier to save them en masse in Perl_save_re_context() by just copying the re_save_state struct onto the save stack and marking it with the new SAVEt_RE_STATE type. Perl_save_re_context() was also expanded to be responsible for saving all the current $1 values. Up until now, that is roughly how things have remained, except for bug fixes such as discovering more places where Perl_save_re_context() needs to be called. Note that, philosophically speaking at least, this is broken in two ways. First, there's no good reason for the internal current state of the executing regex engine to be stored in a bunch of global vars; and secondly we're relying on potential callers of the regex engine (like the magic tie code for example), to be responsible for being aware that they might trigger re-entrancy in the regex engine, and to thus do Perl_save_re_context() as a precaution. This is error-prone and hard to prove correct. (As an example, Perl_save_re_context() is only called in the tie code if the tie code in question is doing a tied PRINT on STDERR; clearly an unusual use case that someone spotted was buggy at some point). The obvious fix, and the one performed by the series of commits in this merge, is to make all the global state local to the regex engine instead. Indeed, there is already a struct, regmatch_info, that is allocated as a local var in regexec(), then passed as an argument to the various lower-level functions called from regexec(). However, it only had limited use previously, so here we expand the number of functions where it is passed as an argument. In particular, it is now also created by re_intuit_start(), the other main run-time entry point to the regex engine. However, there is a problem with this, in that various regex vars need cleaning up on croak (e.g. they point to a malloced buffer). 
Since the regmatch_info struct is just a local var on the C stack, it will be lost by the longjmp done by a croak() before leave_scope() can clear
8 days ago
The other day, I found myself with the perfect example to explain continuations to some web programmers I know. (If you didn't already know how continuations can make you sandwiches without you leaving your office, they're powerful things.) Imagine you have a controller for a web application. Imagine one of the methods is edit_profile. You have an access control mechanism that requires authentication before someone can edit a profile. Your code might look something like:

    sub edit_profile {
        my ($self, $request) = @_;

        return $self->redirect_to_login_and_return( $request )
            unless $request->is_authenticated;

        # actually edit the profile here
        ...
    }

Now HTTP is stateless. Barring some persistent connection through websockets or the like, the server must redirect the client to an authorization page, receive and verify another request from the client, and, if that succeeds, redirect to the action shown here to complete the user's request. If there's state built up to make this request (a user ID, form parameters, the contents of an active shopping cart, whatever), something needs to manage that data through all of these redirects. There are plenty of ways to handle this; you've probably implemented at least one. It's a common pattern. If you're like me, you'd like to be able to set aside the current request temporarily, manage authentication, and then resume the request just after the point of checking for authentication. In other words, once your web framework has handled the HTTP response to the point of figuring out what the client wants to do (edit a profile), restore the state of the web application and start over as if the client request had been authenticated from the start. 
Again, you can do this all manually, but if your language or framework supported continuations, you could have a mechanism to capture all (or some; that's a delimited continuation) of the control flow and request state through your application at the point of checking for authentication and restore the state of that control flow. There are nuances here. You probably also need to serialize the state of the continuation and HTTP request and store that on your server (never trust the client). You need to collect stale continuations (clients can abandon requests) but not too frequently (you probably have tabs open in a browser window that's been open for days, too). You have to go to some lengths to avoid circular references, and it's not always easy to serialize information like open sockets, open files, and active database connections. ... but these problems are solvable, and if your language and/or web framework supports this (if someone has solved them once and for all so that the average programmer doesn't have to), then you have a powerful tool for managing state atop a stateless protocol. Note that a web framework could provide this feature even when its host language doesn't... but that's an idea for another time.
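As a rough illustration of the "set aside, authenticate, resume" idea, here is a minimal sketch that approximates a continuation with a plain closure. It is not a true continuation and not any framework's API: the `%pending` store, `suspend_request`, `resume_request`, and the hash-based request are all hypothetical, and everything lives in memory, so it sidesteps the serialization and expiry problems mentioned above.

```perl
use strict;
use warnings;

# Hypothetical in-memory store of suspended requests, keyed by token.
# A real application would persist and expire these server-side.
my %pending;
my $next_token = 0;

# Capture "the rest of the request" as a closure and return a token
# that the login page can hand back once authentication succeeds.
sub suspend_request {
    my ($resume) = @_;
    my $token = ++$next_token;
    $pending{$token} = $resume;
    return $token;
}

# Resume a suspended request exactly once; unknown or already-used
# tokens yield undef.
sub resume_request {
    my ($token, $user) = @_;
    my $resume = delete $pending{$token} or return;
    return $resume->($user);
}

# Hypothetical controller action: requests are plain hashes here.
sub edit_profile {
    my ($request) = @_;

    # The closure keeps $request (form parameters and so on) alive
    # across the detour through the login page.
    my $finish = sub {
        my ($user) = @_;
        return "profile of $user updated: name=$request->{params}{name}";
    };

    return $finish->($request->{user}) if $request->{user};
    return { redirect => '/login', token => suspend_request($finish) };
}

# An unauthenticated request is suspended, then resumed after login.
my $response = edit_profile({ params => { name => 'Alice' } });
my $result   = resume_request($response->{token}, 'alice');
print "$result\n";
```

The closure only captures the variables it mentions, which is closer in spirit to the delimited case than to capturing the whole control flow. Deleting the token on resume is one simple way to keep a suspended request from being replayed.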
8 days ago
These links are collected from the Perlbuzz Twitter feed. If you have suggestions for news bits, please mail me at andy@perlbuzz.com. GSOC project: Move wget's test suite from Perl to Python. (google-melange.com) Announcing the Perl Companies project (anonymoushash.vmbrasseur.com) Visualizing module dependencies (blogs.perl.org) Paying respect to Module::Build (dagolden.com) Four projects that are models of design (jeffreykegler.github.io) Tips for YAPC first-comers (babyl.dyndns.org) given & smartmatch in Perl 5.18 (domm.plix.at) YAPC::NA streaming live now (yapcna.org)
14 days ago
If programmers could learn one thing from successful businesspeople, they should learn about the idea of opportunity costs. Sure, it's fun to throw away a lot of code and rewrite it from nothing, but in the years you're waiting for that mystical magical super sixy project to get usable, you could have been making money with working code, even if it's a little shabby around the edges. Sometimes opportunity cost works the other way, too. If you can get a 1% return by putting your money in a CD for 12 months or you have the chance to get a 10% return if you can buy the right stock sometime in the next three months, hold out for the 10% return. You might get it. You might not. Yet the reward is greater than the risk. So it goes with programming. When I wear my programmer hat, I want to write the best code imaginable. I want to find the right abstractions. I want to discover the most elegant design. I want to put in the least effort. The risk of getting it wrong and having to do more work is greater than the risk of missing a deadline. When I wear my business hat, I want the most valuable features as soon as possible. The risk of missing out on business value (greater revenue, lesser costs, greater productivity) is greater than the risk of increased future maintenance costs. After all, it should be possible to measure those increased costs and deal with them when it makes the most sense from the business point of view. Project management includes the art of navigating between the business desire to have working software sooner and the programmer desire to have elegant software. That's not easy, but there are ways to give both groups some of what they need. Sadly, community-driven development of the free and open source software worlds often lacks this management. We don't lack the tension though. Consider, for example, the debate over whether it's acceptable to release software without documentation. The business argument is "It can provide value to people." 
The developer argument is "It's not finished without documentation." This tension matters less in the F/OSS world than in the business world, where your paycheck depends on your ability to deliver working software. What's the worst that can happen? People will move on to a competing project which better meets their needs. (And you thought F/OSS people didn't understand capitalism.) So you annoy your users and drive them away; you're a volunteer, and there are always plenty of volunteers. ... until there aren't. Sure, that's an extreme position. Though it's easy to trawl through GitHub (and before that, SourceForge) to find the abandoned carcasses of projects which never delivered anything of value to anyone and consequently never attracted sustainable development beyond the whims of the originators, I suspect that it's more interesting to consider the quiet desperation of active projects stuck in not-invented-here rewrite limbo which struggle to achieve usefulness. What if they spent more time focusing on the value their code should provide to potential users and less time constructing elegant, airy edifices? It's important to write clean and maintainable code. It's important to focus on quality and craft. Yes, please let us do that. Yet if you want to have real users, shouldn't you also consider how your choices affect them and what that costs them?
16 days ago
These links are collected from the Perlbuzz Twitter feed. If you have suggestions for news bits, please mail me at andy@perlbuzz.com. Do your modules pass work under Perl 5.18? (blogs.perl.org) Speeding up Test::WWW::Mechanize tests (blogs.perl.org) Perl 5.18's hash key ordering changes in a nutshell (blog.twoshortplanks.com) How I manage new Perls with perlbrew (dagolden.com) A "simple matter of programming" task for those wanting to contribute to #perl (rt.perl.org) ack 2.05_01 just released to CPAN (search.cpan.org)
20 days ago