BI application and data warehouse project planning should be thorough in every respect: technical, resource-related, and business. It is key to a project’s successful execution and completion.
So here is what you should do.
1. For Data Sourcing
Ensure you have all the data that is needed to compile measurements and to answer the driving business questions. For that matter, focus on these 4 things:
i. Identify data sources in the early planning stages so that all the data you might need is readily available.
ii. Develop an understanding of how you will integrate the different data sources, such as multiple internal systems and external repositories.
iii. Ensure timeliness and consider all the cost-related constraints. Consider how frequently you can collect data from these sources, and what it costs to refresh data from external sources. If you are using social media as a data source, more frequent updates may be required.
iv. If cost becomes a limiting factor, as can be the case with social media data, consider sampling smaller data sets more frequently rather than collecting large amounts of data. Larger data sets may give you more accurate social media measures; however, well-designed sampling techniques can still give useful information.
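The sampling idea in point iv can be sketched as follows. This is only an illustration under stated assumptions: the sentiment scores are randomly generated stand-ins for real social media data, and the function name `sample_mean` and the sample size of 500 are hypothetical choices, not recommendations.

```python
import random
from statistics import mean


def sample_mean(values, sample_size, seed=0):
    """Estimate the mean of `values` from a simple random sample
    drawn without replacement."""
    rng = random.Random(seed)
    sample = rng.sample(values, sample_size)
    return mean(sample)


# Hypothetical data: 10,000 sentiment scores in the range [-1, 1].
# In practice these would come from the social media source.
generator = random.Random(42)
population = [generator.uniform(-1, 1) for _ in range(10_000)]

full = mean(population)                  # uses every record
estimate = sample_mean(population, 500)  # uses 5% of the records

print(f"full mean={full:.3f}, sampled mean={estimate:.3f}")
```

The sampled estimate will differ somewhat from the full-data value. Whether that gap is acceptable depends on the measure and on how much the full collection costs.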
2. For BI/Data Warehouse Resources
You need a team that can handle cross-organizational interactions and that understands the business issues deeply enough to strengthen relationships with business users. Team members should also know how to work well with others and communicate effectively, both in writing and in business meetings and presentations. These skills matter even more as the DW/BI system evolves into a standard component of the IT environment, which tends to weaken its connection to the business.
3. When Dealing With Big Data
Historically, Business Intelligence (BI) relied on established practices and tools, including dimensional modeling, extract, transform, and load (ETL), ad hoc reporting, and dashboards. These techniques required a data warehouse, or at least a data mart, to support management reporting that was not generally available from transaction processing systems.
Today, we have to tap into a broader array of information captured in big data. Here are some essential characteristics to keep in mind as you develop a strategy for big data analysis.
Big data does not come exclusively from sales, inventory, or human resources systems. It is generally a varied mixture of data sourced from application log files, machine sensors, and social media. The following techniques are useful for analyzing it:
1. Descriptive statistics. Most BI practitioners are familiar with this technique, in which calculations such as mean, median, and standard deviation are used to describe a population. It is especially useful with big data sets when you partition the data and compare different groups. For example, with application log data you can find the average time spent in your Web application prior to a sale, rather than only calculating the average dollar value of a sales transaction from sales data.
2. Grouping your customers. Normally you would do that by sales region, but that is not always the most informative way to organize and analyze your data. Instead, consider clustering, a machine learning technique used to identify subsets of data with similar characteristics. For example, clustering might help you identify different groups of customers based on the time spent in your Web application and the amount of their sales transactions.
3. Exploring data and testing hypotheses. Finally, you can explore big data sets and test hypotheses to complement the types of analysis you do with your traditional BI systems.
Once you have identified a group of customers, you can also analyze their navigation patterns on the site. For example, if a group spends a significant amount of time on your site, you can hypothesize that its members are interested in making a purchase, and then test that hypothesis against the data.
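The first two techniques can be combined in a short sketch: cluster customers on time spent and sale amount, then describe each cluster with summary statistics. Everything here is an assumption made for illustration. The session data is made up, k-means is just one possible clustering algorithm (the text above does not prescribe one), and the starting centroids are fixed by hand so the result is repeatable.

```python
from statistics import mean, median, stdev

# Hypothetical sessions: (minutes spent in the Web application, sale amount in dollars).
sessions = [
    (2.0, 15.0), (3.0, 20.0), (2.5, 18.0),
    (14.0, 120.0), (15.0, 135.0), (16.0, 110.0),
]


def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid,
    then move each centroid to the mean of its assigned points."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(mean(dim) for dim in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return clusters, centroids


def describe(values):
    """Descriptive statistics for one group of values."""
    return {"mean": mean(values), "median": median(values), "stdev": stdev(values)}


# Assumption: start from the first and last session as initial centroids.
# Features are left unscaled here; real data would usually be normalized first.
clusters, centroids = kmeans(sessions, [sessions[0], sessions[-1]])

# Compare time spent across the clusters that were found.
stats = [describe([minutes for minutes, _ in cluster]) for cluster in clusters]
for i, s in enumerate(stats):
    print(f"cluster {i}: {len(clusters[i])} customers, minutes {s}")
```

Comparing the per-cluster statistics is one way to form a hypothesis, for instance that the long-session group is more likely to buy, which can then be checked against navigation data.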
4. Dealing with Big Data Types
• For large volumes, deploy enough commodity servers and storage along with a distributed file system, like the Hadoop Distributed File System (HDFS), and you can collect petabytes of data.
• If you do not need to store the data for too long, you can take advantage of public clouds like Amazon Web Services (AWS), Microsoft Azure, and Rackspace, but watch out for long-term storage costs.
• If you have rapidly changing data, consider using a real-time big data analytics tool like Apache Storm.
Traditional BI is not dead yet and will not be anytime soon. It is here to stay as long as the business fundamentals are the same, so management reporting based on data from transaction processing systems will remain useful. However, big data will introduce new ways of understanding business operations that complement existing management reporting systems.