Web Scraping of File Downloads Using Python and R

SUMMARY: The purpose of this project is to practice web scraping by extracting specific pieces of information from a website. The Python web scraping code leverages the BeautifulSoup module.

INTRODUCTION: Occasionally, we need to download a batch of documents from a single web page without clicking each download link one at a time. This web scraping script automatically traverses the entire web page, collects all links to the PDF documents, and downloads those documents as part of the scraping process.

Starting URL: https://www.knime.com/about/events/knime-spring-summit-2019-berlin
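
The collect-then-download flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: it swaps in Python's standard-library html.parser and urllib in place of BeautifulSoup so that it runs without third-party packages (with BeautifulSoup, the link-collection step would typically be a loop over soup.find_all("a", href=True)). The names PdfLinkParser, extract_pdf_links, download_pdfs, and main, as well as the "pdfs" output folder, are hypothetical, and the sketch assumes the PDF links appear as ordinary <a href="..."> anchors whose path ends in .pdf.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen

# Starting URL taken from the write-up above.
START_URL = "https://www.knime.com/about/events/knime-spring-summit-2019-berlin"


class PdfLinkParser(HTMLParser):
    """Collect absolute URLs of <a> tags whose path ends in .pdf (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page URL.
        absolute = urljoin(self.base_url, href)
        is_pdf = urlparse(absolute).path.lower().endswith(".pdf")
        if is_pdf and absolute not in self.links:
            self.links.append(absolute)


def extract_pdf_links(html, base_url):
    """Return the unique PDF links found in html, in document order."""
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    return parser.links


def download_pdfs(links, out_dir="pdfs"):
    """Download each link into out_dir (assumed folder name) and return the saved paths."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    saved = []
    for link in links:
        destination = target / Path(urlparse(link).path).name
        with urlopen(link) as response:
            destination.write_bytes(response.read())
        saved.append(destination)
    return saved


def main():
    """Fetch the starting page, collect its PDF links, and download them."""
    with urlopen(START_URL) as response:
        page = response.read().decode("utf-8", errors="replace")
    for path in download_pdfs(extract_pdf_links(page, START_URL)):
        print(path)
```

Calling main() would fetch the starting page and save every PDF it links to. Keeping link extraction separate from downloading lets the parsing step be checked against a saved copy of the page without touching the network.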

The source code and output can be found here for Python and here for R on GitHub.