Web Scraping - web scraping, screen scraping, data parsing and other related things

Web Scraping with Node.js

Web scraping has been steadily growing in popularity for years now. Freelance sites are overcrowded with orders for this controversial data extraction process. Today we will combine two new and revolutionary directions in web development. So, let’s consider an elegant and modern way to scrape data from websites with Node.js!

JavaScript rendering library for scraping javascript sites

Can you imagine how many scraping instruments are at our service? Scraping has a long history, and it has at last become a simple approach available in many programming languages. Unfortunately, there is still a list of non-trivial tasks that can’t be resolved in a snap.

One of these tasks is scraping JavaScript sites, that is, sites that output their data using JavaScript. Faced with such a site, classic scrapers (though not all of them) ignore the JS-generated data and simply carry on with their own life cycle. When this little defect turned into big trouble, developers all over the world took measures, and they succeeded. Today we consider one of the most awesome tools for scraping JS-generated data: Splash.
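As a rough illustration of what rendering a page through Splash can look like, the sketch below builds a request to Splash's `render.html` HTTP endpoint, which returns the page's HTML after its JavaScript has run. It assumes a Splash instance is already listening on `http://localhost:8050`; the function names `splash_render_url` and `fetch_rendered_html` are hypothetical helpers written for this example, not part of Splash.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def splash_render_url(page_url, wait=0.5, base="http://localhost:8050"):
    """Build a render.html request URL for a Splash instance.

    `base` is an assumption: a Splash server running locally on port 8050.
    `wait` is the number of seconds Splash waits after the page loads,
    which gives the page's scripts time to produce their output.
    """
    query = urlencode({"url": page_url, "wait": wait})
    return f"{base}/render.html?{query}"


def fetch_rendered_html(page_url, wait=0.5, base="http://localhost:8050"):
    """Return the JavaScript-rendered HTML of page_url as text.

    Requires a reachable Splash server at `base`.
    """
    with urlopen(splash_render_url(page_url, wait, base)) as response:
        return response.read().decode("utf-8", errors="replace")
```

With a Splash server running, `fetch_rendered_html("https://example.com")` would return the rendered markup, which can then be handed to any ordinary HTML parser.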

Octoparse 7.0 – a free web scraping tool for non-developers

Octoparse has recently launched a brand new version, 7.0, which has turned out to be its most revolutionary upgrade in the past two years. It brings not only a more user-friendly UI but also advanced features that make web scraping even easier. In this post, I will walk through some of the new features and changes in this version, with an eye to how a beginner, even one without any coding background, can approach this web scraping tool.

New European e-communication regulations and web scraping

General Data Protection Regulation (GDPR): enforcement date 25 May 2018. The GDPR sets out online user data privacy rules for electronic communication and data protection. The regulation covers modern messengers and communication services, e.g. Skype, Viber and Gmail, that were not mentioned in the earlier EU e-communication directives.

Design patterns for hierarchical data storage and effective processing

Storing hierarchical data is a non-trivial task in a relational database. For example, suppose your online shop has goods in categories and subcategories that form a tree five levels deep. How should they be stored in a database?

Luckily, there are several approaches (design patterns) that help the developer design the database structure without superfluous tables or code. As a result, the site will work faster, and changes, even at the database layer, won’t cause trouble. We will study these approaches below.
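To make the problem concrete, here is a minimal sketch of one widely used pattern, the adjacency list, in which every category row stores the id of its parent. The teaser does not say which patterns the article covers, so this choice is an assumption, as are the `category` table, its sample rows, and the `subtree` helper. The example uses Python's built-in `sqlite3` module and a recursive query to read a whole subtree.

```python
import sqlite3

# In-memory database with a hypothetical category table.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE category (
        id INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES category(id),  -- NULL for a root category
        name TEXT NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO category (id, parent_id, name) VALUES (?, ?, ?)",
    [
        (1, None, "Electronics"),
        (2, 1, "Computers"),
        (3, 2, "Laptops"),
        (4, 1, "Phones"),
    ],
)


def subtree(conn, root_id):
    """Return (id, name, depth) for root_id and all of its descendants."""
    rows = conn.execute(
        """
        WITH RECURSIVE tree(id, name, depth) AS (
            SELECT id, name, 0 FROM category WHERE id = ?
            UNION ALL
            SELECT c.id, c.name, t.depth + 1
            FROM category AS c
            JOIN tree AS t ON c.parent_id = t.id
        )
        SELECT id, name, depth FROM tree ORDER BY depth, id
        """,
        (root_id,),
    )
    return rows.fetchall()


for row in subtree(conn, 1):
    print(row)
```

The adjacency list keeps inserts and moves cheap, since only one `parent_id` changes, at the cost of needing a recursive query to read a whole branch. Other patterns trade those costs differently.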

Big Data Basics

Power of Big Data: capabilities and perspectives

As everyone knows, technology is developing at lightning speed. Software requirements, approaches and algorithms are growing just as fast. In particular, developers have relatively recently faced the problem of processing huge volumes of data, which made it necessary to create a new, effective approach, a new paradigm of data storage. The solution was not long in coming: in 2011 huge companies all over the world started using the Big Data concept. In this article we will talk about this engaging approach.