Perlwikipedia/Bugs
This page is a makeshift list of bugs in perlwikipedia for those who don't have a Google Code account or don't want to create one.
New
- Description: get_text does not work when on a non-English wiki (example bug)
- Summary: When using get_text on a non-English wiki, the function will error out with a 404 (I hope to God this isn't true).
- List any relevant steps to reproduce the bug to help the developers, or a nice *nix-style patch if you've got one. Shadow1 (talk) 19:14, 30 May 2007 (UTC)
Open
Closed
- Description: When running on ActivePerl on a Windows machine, the get_text method hangs in an infinite loop.
- Summary: The loop seems to occur because $res->content contains garbled text, so the condition on line 295 is never met. (It looks like an encoding problem.)
- This occurs on my computer running ActivePerl on Windows, with the latest versions of all modules. – Quadell (talk) (random) 16:36, 6 June 2007 (UTC)
- A work-around has been found! Shadow1 suggested I go through Perlwikipedia.pm and change all instances of ->content to ->decoded_content. This fixes it. I'm not sure whether a more seamless solution should be developed before closing this bug, though. – Quadell (talk) (random) 19:55, 6 June 2007 (UTC)
- Fixed in SVN. Shadow1 (talk) 15:56, 7 June 2007 (UTC)
- The code at http://perlwikipedia.googlecode.com/svn/trunk/Perlwikipedia.pm has a bug: the _put subroutine declares the variable $res twice. That is easily fixed, and I can do it since I have access to the repository, but I am not sure whether the googlecode version of the code is the most recent one. Oleg Alexandrov (talk) 15:53, 20 June 2007 (UTC)
- I fixed it myself. Oleg Alexandrov (talk) 02:07, 23 June 2007 (UTC)
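For reference, the ->content versus ->decoded_content work-around from the ActivePerl report above can be illustrated outside Perlwikipedia. This is a minimal sketch, assuming HTTP::Response (from libwww-perl / HTTP-Message) is installed. The response is built by hand rather than fetched, and the sample page text is made up. It only shows the charset half of what decoded_content does (it also undoes any Content-Encoding), so it may not match the exact garbling seen on Windows.

```perl
use strict;
use warnings;
use Encode qw(encode);
use HTTP::Response;

# Hypothetical page text containing one non-ASCII character (U+0160).
my $text  = "Caron: \x{0160}";
my $bytes = encode('UTF-8', $text);

# Build a response by hand, as if a server had sent UTF-8 HTML.
my $res = HTTP::Response->new(
    200, 'OK',
    [ 'Content-Type' => 'text/html; charset=UTF-8' ],
    $bytes,
);

# content() returns the raw bytes exactly as they arrived.
my $raw = $res->content;

# decoded_content() applies the declared charset and returns characters.
my $decoded = $res->decoded_content;

printf "raw length:     %d\n", length $raw;
printf "decoded length: %d\n", length $decoded;
```

The two lengths differ because the raw form counts UTF-8 bytes while the decoded form counts characters, which is why code that pattern-matches on page text is safer working from decoded_content.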
3. Description: get_text fails on certain UTF-8 characters
- Summary: If you attempt to retrieve the text of a page such as Š, the following error is produced:
Can't escape \x{0160}, try uri_escape_utf8() instead at {path}/perlwikipedia/Perlwikipedia.pm line 64
- Test Case: The following code segment demonstrates the problem.
my @results = $bot->what_links_here("Caron");
for my $result (@results) {
    my $page = $result->{title};
    print "Getting $page\n";
    my $text = $bot->get_text($page);
}
- Resolution: I patched my copy of Perlwikipedia.pm by doing exactly what the error message states. I don't know if this is the best approach, but it works.
$ svn diff
Index: Perlwikipedia.pm
===================================================================
--- Perlwikipedia.pm (revision 88)
+++ Perlwikipedia.pm (working copy)
@@ -7,6 +7,7 @@
 use XML::Simple;
 use Carp;
 use Encode;
+use URI::Escape qw(uri_escape_utf8);
 our $VERSION = '0.90';
@@ -61,7 +62,7 @@
 my $extra = shift;
 my $no_escape = shift || 0;
- $page = uri_escape($page) unless $no_escape;
+ $page = uri_escape_utf8($page) unless $no_escape;
 $page =~ s/\&/%26/g; # escape the ampersand
 my $url =
- Thanks. -- JLaTondre 12:00, 18 July 2007 (UTC)
- I applied the patch. I tested it too. Thanks! The new revision is available at the Google code repository for Perlwikipedia. Oleg Alexandrov (talk) 03:28, 19 July 2007 (UTC)
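The difference between the two escaping functions in the patch above can be reproduced in isolation. This is a minimal sketch, assuming the URI::Escape module is installed. The variable names are illustrative and not taken from Perlwikipedia.pm.

```perl
use strict;
use warnings;
use URI::Escape qw(uri_escape uri_escape_utf8);

# The page title from the report: a single character, U+0160.
my $page = "\x{0160}";

# uri_escape() only handles code points up to 255 and dies on anything wider,
# so the call is wrapped in eval to capture the error instead of aborting.
my $plain = eval { uri_escape($page) };
my $error = $@;
print "uri_escape failed: $error" unless defined $plain;

# uri_escape_utf8() encodes the string as UTF-8 first, then escapes each byte.
my $escaped = uri_escape_utf8($page);
print "uri_escape_utf8 gives: $escaped\n";
```

U+0160 is two bytes in UTF-8 (C5 A0), so the escaped form contains two percent-escapes.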
4. Description: get_pages_in_category() does not return images in the category
- Summary: Can this be changed to include images as well? – Quadell (talk) (random) 13:38, 7 June 2007 (UTC)
- Patch written, tested, and committed. Shadow1 (talk) 13:10, 25 August 2007 (UTC)
5. Description: get_history failing on articles with UTF-8 characters in the name
- Summary: For articles with UTF-8 characters in the name, such as Kashō, get_history fails. The query does not retrieve the results because the UTF-8 characters need to be escaped. I added $pagename = uri_escape_utf8($pagename); to the start of get_history and it fixed the problem. The same problem will occur with any other function that uses _get_api. It cannot be fixed by simply escaping $query within _get_api, as that would also escape characters that shouldn't be escaped (e.g. the & in &action).
- The following is the diff for the change I made. Thanks. -- JLaTondre 00:14, 27 August 2007 (UTC)
236a237,238
> $pagename = uri_escape_utf8($pagename);
>
- Committed to SVN, along with some other functions with the same bug. Should be rolled out in version 1.01 soon. Shadow1 (talk) 01:05, 27 August 2007 (UTC)
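The point made in the summary above, that values have to be escaped individually because escaping the whole query would also escape the & separators, can be sketched as follows. This assumes URI::Escape is installed. build_query is a hypothetical helper written for illustration, not a function in Perlwikipedia.pm, and the parameter names are only examples.

```perl
use strict;
use warnings;
use URI::Escape qw(uri_escape_utf8);

# Hypothetical helper: escape each value on its own, then join with '&'
# so the separators themselves are left untouched.
sub build_query {
    my @pairs = @_;    # flat list of name => value pairs, order preserved
    my @parts;
    while ( my ( $name, $value ) = splice @pairs, 0, 2 ) {
        push @parts, $name . '=' . uri_escape_utf8($value);
    }
    return join '&', @parts;
}

my $pagename = "Kash\x{014D}";    # the article title from the report

# Escaping per value keeps '=' and '&' intact.
my $good = build_query(
    action => 'query',
    prop   => 'revisions',
    titles => $pagename,
);

# Escaping the assembled query also escapes the separators, breaking it.
my $bad = uri_escape_utf8("action=query&prop=revisions&titles=$pagename");

print "per-value escaping:   $good\n";
print "whole-query escaping: $bad\n";
```

In the second result every & has become %26, so a server would see one long parameter name instead of three parameters.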