Perl and LWP

3.85 (20 ratings by Goodreads)
By (author) Sean M. Burke


Description

Perl soared to popularity as a language for creating and managing web content, but with LWP (Library for WWW in Perl), Perl is equally adept at consuming information on the Web. LWP is a suite of modules for fetching and processing web pages. The Web is a vast data source that contains everything from stock prices to movie credits, and with LWP all that data is just a few lines of code away. Anything you do on the Web, whether it's buying or selling, reading or writing, uploading or downloading, from news to e-commerce, can be controlled with Perl and LWP. You can automate Web-based purchase orders as easily as you can set up a program to download MP3 files from a web site.

Perl & LWP covers:

  • Understanding LWP and its design
  • Fetching and analyzing URLs
  • Extracting information from HTML using regular expressions and tokens
  • Working with the structure of HTML documents using trees
  • Setting and inspecting HTTP headers and response codes
  • Managing cookies
  • Accessing information that requires authentication
  • Extracting links
  • Cooperating with proxy caches
  • Writing web spiders (also known as robots) in a safe fashion

Perl & LWP includes many step-by-step examples that show how to apply the various techniques. Programs to extract information from the web sites of BBC News, AltaVista, ABEBooks.com, and the Weather Underground, to name just a few, are explained in detail, so that you understand how and why they work. Perl programmers who want to automate and mine the web can pick up this book and be immediately productive. Written by a contributor to LWP, and with a foreword by one of LWP's creators, Perl & LWP is the authoritative guide to this powerful and popular toolkit.

Product details

  • Paperback | 262 pages
  • 177.8 x 233.68 x 20.32mm | 498.95g
  • O'Reilly Media, Inc, USA
  • Sebastopol, United States
  • English
  • 1, black & white illustrations
  • ISBN-10: 0596001789
  • ISBN-13: 9780596001780
  • 1,178,750

About Sean M. Burke

Sean Burke is an active member of the Perl community and one of CPAN's most prolific module authors. He has been a columnist for The Perl Journal since 1998, and is an authority on markup languages. Trained as a linguist, he also develops tools for software internationalization and Native language preservation.

Table of contents

Foreword
Preface
1. Introduction to Web Automation: The Web as Data Source; History of LWP; Installing LWP; Words of Caution; LWP in Action
2. Web Basics: URLs; An HTTP Transaction; LWP::Simple; Fetching Documents Without LWP::Simple; Example: AltaVista; HTTP POST; Example: Babelfish
3. The LWP Class Model: The Basic Classes; Programming with LWP Classes; Inside the do_GET and do_POST Functions; User Agents; HTTP::Response Objects; LWP Classes: Behind the Scenes
4. URLs: Parsing URLs; Relative URLs; Converting Absolute URLs to Relative; Converting Relative URLs to Absolute
5. Forms: Elements of an HTML Form; LWP and GET Requests; Automating Form Analysis; Idiosyncrasies of HTML Forms; POST Example: License Plates; POST Example: ABEBooks.com; File Uploads; Limits on Forms
6. Simple HTML Processing with Regular Expressions: Automating Data Extraction; Regular Expression Techniques; Troubleshooting; When Regular Expressions Aren't Enough; Example: Extracting Links from a Bookmark File; Example: Extracting Links from Arbitrary HTML; Example: Extracting Temperatures from Weather Underground
7. HTML Processing with Tokens: HTML as Tokens; Basic HTML::TokeParser Use; Individual Tokens; Token Sequences; More HTML::TokeParser Methods; Using Extracted Text
8. Tokenizing Walkthrough: The Problem; Getting the Data; Inspecting the HTML; First Code; Narrowing In; Rewrite for Features; Alternatives
9. HTML Processing with Trees: Introduction to Trees; HTML::TreeBuilder; Processing; Example: BBC News; Example: Fresh Air
10. Modifying HTML with Trees: Changing Attributes; Deleting Images; Detaching and Reattaching; Attaching in Another Tree; Creating New Elements
11. Cookies, Authentication, and Advanced Requests: Cookies; Adding Extra Request Header Lines; Authentication; An HTTP Authentication Example: The Unicode Mailing Archive
12. Spiders: Types of Web-Querying Programs; A User Agent for Robots; Example: A Link-Checking Spider; Ideas for Further Expansion
A. LWP Modules
B. HTTP Status Codes
C. Common MIME Types
D. Language Tags
E. Common Content Encodings
F. ASCII Table
G. User's View of Object-Oriented Modules
Index

Rating details

20 ratings
3.85 out of 5 stars
5 stars: 10% (2)
4 stars: 70% (14)
3 stars: 15% (3)
2 stars: 5% (1)
1 star: 0% (0)
Book ratings by Goodreads