TEXT LIST (version 5)

Some web publishers do not bother to format their data with HTML elements and simply put information on their website as plain text. Worse, they sometimes add supplementary notes in the same manner as the main information, which makes the two harder to separate. A good web scraper should overcome all these obstacles.

In this test, the web scraper needs to scrape a list of US cities and their populations presented as plain text. Specifically, it has to:

  1. Extract all the cities and their populations, skipping all the notes
  2. Scrape cities with their notes (if any)
  3. Scrape only the bold cities (with their populations)
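
The third task depends on markup that plain text does not preserve, so the following is only a minimal sketch under stated assumptions: bold entries are wrapped in <b> or <strong> tags, the whole "city population" line is inside the bold element, and lines are separated by <br>. The names BoldTextCollector and bold_cities are hypothetical, and the real page may be structured differently.

```python
import re
from html.parser import HTMLParser

# Assumption: a city line is a digit-free name, whitespace, then a
# comma-grouped number at the end of the line.
CITY_LINE = re.compile(r"^(?P<city>\D+?)\s+(?P<population>\d{1,3}(?:,\d{3})*)$")


class BoldTextCollector(HTMLParser):
    """Collect the text found inside <b> or <strong> elements."""

    def __init__(self):
        super().__init__()
        self._depth = 0      # nesting depth of bold tags
        self._buffer = []    # text pieces of the current bold run
        self.bold_runs = []  # completed bold runs

    def handle_starttag(self, tag, attrs):
        if tag in ("b", "strong"):
            self._depth += 1
        elif tag == "br" and self._depth > 0:
            # Keep line breaks so several bold lines stay separable.
            self._buffer.append("\n")

    def handle_endtag(self, tag):
        if tag in ("b", "strong") and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                self.bold_runs.append("".join(self._buffer))
                self._buffer = []

    def handle_data(self, data):
        if self._depth > 0:
            self._buffer.append(data)


def bold_cities(html):
    """Return (city, population) tuples for bold city lines only."""
    collector = BoldTextCollector()
    collector.feed(html)
    collector.close()
    results = []
    for run in collector.bold_runs:
        for line in run.splitlines():
            # Collapse runs of whitespace, including non-breaking spaces.
            text = " ".join(line.split())
            match = CITY_LINE.match(text)
            if match:
                population = int(match.group("population").replace(",", ""))
                results.append((match.group("city"), population))
    return results
```

Bold text that does not look like a city line (a bold note, for example) is skipped, because it fails the CITY_LINE pattern.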

A ver parameter (ranging from 1 to 5) selects different versions of the list, each with a different number of cities, different bold cities, and different notes.
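
To run the same project against every version, the five addresses can be generated in a loop. This is a small sketch that assumes ver is passed as a query-string parameter. BASE_URL is a hypothetical placeholder, not the real address of the test page, and version_urls is an illustrative helper name.

```python
from urllib.parse import urlencode

# Hypothetical address; replace it with the real URL of the test page.
BASE_URL = "https://example.com/textlist"


def version_urls(base_url=BASE_URL, versions=range(1, 6)):
    """Build one URL per list version, ver=1 through ver=5."""
    return [f"{base_url}?{urlencode({'ver': v})}" for v in versions]
```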

For testing, you may use the following sample links. The scraper should be able to scrape all the data from any of these links using the same project:

New York      8,244,910
Los Angeles   3,819,702
Chicago       2,707,120
Houston       2,145,146
Philadelphia  1,536,471
Phoenix       1,469,471
San Antonio   1,359,758
San Diego     1,326,179
Dallas        1,223,229
change: +2.12%
San Jose      967,487
945,942 in 2010
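
The first two tasks can be sketched on a list like the one above. This is a minimal sketch, not a definitive implementation. It assumes that a city line is a digit-free name followed by a comma-grouped number at the end of the line, that every other non-empty line is a note, and that a note belongs to the city directly before it. The name parse_text_list is hypothetical.

```python
import re

# Assumption: city name contains no digits; the population uses
# comma-separated thousands and ends the line.
CITY_LINE = re.compile(r"^(?P<city>\D+?)\s+(?P<population>\d{1,3}(?:,\d{3})*)$")


def parse_text_list(text):
    """Return a list of dicts with keys city, population and notes."""
    records = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        match = CITY_LINE.match(line)
        if match:
            records.append({
                "city": match.group("city"),
                "population": int(match.group("population").replace(",", "")),
                "notes": [],
            })
        elif records:
            # Not a city line: attach it as a note to the preceding city.
            records[-1]["notes"].append(line)
    return records


def cities_without_notes(text):
    """Task 1: (city, population) pairs only, with notes dropped."""
    return [(r["city"], r["population"]) for r in parse_text_list(text)]
```

For the sample list above, parse_text_list yields ten records, with "change: +2.12%" attached to Dallas and "945,942 in 2010" attached to San Jose. A note that happens to end in a comma-grouped number after digit-free text would be misread as a city, so the pattern may need tightening for other list versions.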