
Embracing Big Data for Business Intelligence

It is hardly disputed these days that business intelligence is beneficial for any organization, regardless of its sector of activity. Data optimization and governance have been shown to produce better long-term decision making.

This does not mean that data implementations have been perfect. More companies than one might expect have failed in their efforts to become data-driven. Others, however, have pushed ahead and begun to make heavy use of external data sources.

Big Data has been incredibly useful for those who have successfully managed their internal sources. Strategic use of big data enables organizations to better understand their customers, create more engaging marketing campaigns, and forecast demand with greater accuracy.

Big data and external sources

In the five-V model, the two dimensions that matter most in our case are volume and velocity. Big data from external sources differs from internal data because it has no built-in ceiling on either.

Internal sources will always be limited by company size. In a poetic sense, the company itself is at the mercy of its customers for obtaining such data. If the organization is small in both operations and revenue, little data will be produced. Trying to get big insights from small datasets is often a recipe for failure.

External sources, by contrast, are limited only by the pace of data production on the Internet. In practice, that makes the speed and volume of data effectively unlimited, bounded only by technical capabilities. So much information is produced daily that even after careful selection and filtering of sources, there is still something to find and analyze.

As such, the volume and velocity of big data, which come mostly from external sources, are of a greater magnitude than internal sources alone would allow. Moreover, there is an important qualitative difference in the data.

External data comes from a wide variety of sources. Most of them have no direct connection to the company that will use the information, which makes them far less biased than anything an internal source could produce.

Ultimately, a combination of both sources produces Big Data. External sources, however, contribute far more of the volume and velocity. It is important to note that these two sources are complementary. While some of the information they provide may overlap (such as customer habits), they may also offer unique signals that can help improve overall business strategy.

Hidden BI gems in big data

External sources don't always produce unique signals that make us change our strategy, but they can reinforce our existing methods. Additionally, they may provide information that would otherwise not be available.

Take the use of CRMs, for example. Almost all digital businesses use these systems in their day-to-day operations. Customer profiles, however, have broadened in many directions. There is now potentially useful data on businesses and individuals scattered all over the web.

Social media is a great example. Many businesses may choose to pull publicly available data from social sources because most of their customers will have some form of presence. These enrichments would be particularly useful for those who work in B2B.

In addition, a combination of internal and external sources can create better planning and budgeting options for any business. External data allows organizations to predict and forecast demand, while internal sources can more accurately represent the resources available to meet those needs.
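As a rough illustration of that split, the sketch below scales an internal sales baseline by an externally sourced market growth rate and compares the result with internal stock. The function names, the growth figure, and the naive scaling method are all assumptions made for the example, not a recommended forecasting model.

```python
def demand_forecast(last_period_units: int, market_growth: float) -> float:
    """Naive forecast: scale last period's sales (internal data)
    by a market growth rate taken from an external source (hypothetical)."""
    return last_period_units * (1 + market_growth)


def stock_gap(forecast_units: float, units_in_stock: int) -> float:
    """Units still needed to meet the forecast; 0.0 if stock already covers it."""
    return max(0.0, forecast_units - units_in_stock)


# Illustrative numbers only.
forecast = demand_forecast(last_period_units=1000, market_growth=0.25)
gap = stock_gap(forecast, units_in_stock=1100)
print(f"forecast demand: {forecast:.0f} units, shortfall: {gap:.0f} units")
```

The external figure drives the demand estimate, while the stock count, which only the company itself knows, determines whether that demand can be met.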

This is particularly useful for industries such as e-commerce. External data gives organizations better insight into the overall market, its trends and opportunities. Companies have successfully used various methods to collect and access large amounts of external data.

Big data acquisition

Since most digital businesses successfully collect a lot of data from internal sources, acquiring it is often not a problem. Its counterpart, external data, is more complicated.

It can be separated into two distinct categories: traditional and advanced. Traditional external data (e.g. government reports and statistical databases) has primarily been used by financial firms and large e-commerce companies. These are usually huge datasets that provide broad overviews of markets and economies.

Advanced external data is somewhat newer, but has already produced great results. It consists of publicly available online data such as reviews and price information.

When internal information sources are combined with advanced external data, that is when Big Data emerges. Integrating the two is not as difficult these days as it once was. Many third-party web scraping vendors and even DaaS companies can provide data upon request.

There is no longer a need to build scraping solutions or similar infrastructure in-house. Most of this work can be outsourced at fairly reasonable prices, which simplifies data governance. All it takes is a data warehouse in which to place the pre-packaged information retrieved from a third party.

The analysis can be done in two ways. The simplest approach is to treat the external data as its own complete data set and retrieve information directly from it without it interacting with the internal data. Treating it as something completely independent is often easier and there is less room for error.

Sources can be combined, however, if proper labeling is undertaken and data is carefully selected. CRMs, as I mentioned earlier, are a prime candidate for such a mix. Data tends to be more insightful when the sets are more complete.
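One way to picture labeling and careful selection is the minimal sketch below, which enriches CRM records with external attributes while recording where each field came from. The record layout, the field names, and the name-based matching rule are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    """A customer profile whose fields each remember their origin."""
    company: str
    fields: dict = field(default_factory=dict)   # field name -> value
    sources: dict = field(default_factory=dict)  # field name -> "internal" or "external"


def normalize(name: str) -> str:
    """Join key: company name, ignoring case and extra whitespace (an assumption)."""
    return " ".join(name.lower().split())


def enrich(crm_rows, external_rows):
    """Add external attributes to CRM profiles without overwriting internal data."""
    profiles = {}
    for row in crm_rows:
        profile = Profile(company=row["company"])
        for key, value in row.items():
            if key != "company":
                profile.fields[key] = value
                profile.sources[key] = "internal"
        profiles[normalize(row["company"])] = profile

    for row in external_rows:
        profile = profiles.get(normalize(row["company"]))
        if profile is None:
            continue  # selection step: skip external records with no CRM match
        for key, value in row.items():
            if key != "company" and key not in profile.fields:
                profile.fields[key] = value
                profile.sources[key] = "external"
    return list(profiles.values())


# Illustrative records only.
crm = [{"company": "Acme Ltd", "plan": "pro", "employees": 40}]
external = [
    {"company": "acme  ltd", "employees": 55, "social_followers": 1200},
    {"company": "Unknown Co", "social_followers": 10},
]
enriched = enrich(crm, external)
```

Here the internally recorded employee count is kept, the follower count is added and labeled as external, and the unmatched company is dropped. Keeping the origin of every field makes it possible to audit or remove the external part later.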


Embracing Big Data, for most companies, means engaging with external sources of information. These have enormous potential hidden within them, even when treated as completely independent data sets. When combined with internal sources, however, they can greatly improve day-to-day decision-making and business operations.

Photo credit: PlusONE/Shutterstock

Andrius Palionis is vice president of enterprise sales at web scraping solution provider