Data Lakes vs. Data Warehouses in Library Analytics: An Innovation Management Perspective
Keywords:
Data governance, library innovation, data lakes, data warehouses, quality management, information strategy, business intelligence.
Abstract
Academic libraries are evolving from service-focused units into strategic data centres in which innovation, informed decision-making, and operational performance rest on robust information architectures. In this context, the choice between data lakes and data warehouses shapes how libraries acquire, manage, and use data to build advanced analytics. This paper presents a comparative study of data lakes and data warehouses from an innovation management perspective specific to academic libraries. It evaluates critical quality parameters, namely data integrity, accessibility, governance, scalability, and analytical adaptability, to determine how each architecture supports different forms of library analytics. The study finds that data lakes, which apply schema-on-read and can accommodate heterogeneous, large-scale, multi-format data, offer greater flexibility for exploratory analytics, machine learning, and rapid prototyping of innovative services such as personalised recommendations and predictive user engagement models. Data warehouses, by contrast, with their schema-on-write design and tightly controlled ETL (extract, transform, load) processes, provide higher data consistency, reliability, and auditability, making them better suited to standardised reporting, accreditation metrics, and compliance-driven analytics. Recognising that libraries must sustain innovation and high data quality standards at the same time, the paper proposes an Innovation Governance Framework that combines the two architectures in a complementary way. In this hybrid model, the data lake serves as the environment for experimentation and discovery, and the data warehouse as the system of record for validated indicators and institutional dashboards.
By keeping technical architectures aligned with data governance, ethical considerations, and continuous improvement activities, the framework supports better-informed decision-making and strategic planning in academic libraries and strengthens their contribution to organisational data excellence.
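
The contrast between schema-on-write and schema-on-read described above can be illustrated with a minimal Python sketch. It is not drawn from the paper: the event data, the field names, the `LoanRecord` type, and the functions `load_warehouse`, `load_lake`, and `read_lake` are all hypothetical, chosen only to show where validation happens in each architecture.

```python
import json
from dataclasses import dataclass

# Hypothetical raw events (illustrative data only, not from the paper).
RAW_EVENTS = [
    '{"patron_id": "p1", "item_id": "b10", "action": "loan"}',
    '{"patron_id": "p2", "item_id": "b11", "action": "loan", "device": "kiosk"}',
    '{"patron_id": "p3", "query": "open access policy"}',  # search event, no item_id
]

# Assumed warehouse schema: every row must carry these three string fields.
REQUIRED = ("patron_id", "item_id", "action")


@dataclass(frozen=True)
class LoanRecord:
    patron_id: str
    item_id: str
    action: str


def load_warehouse(raw_events):
    """Schema-on-write: validate at load time and keep only conforming rows."""
    accepted, rejected = [], []
    for line in raw_events:
        event = json.loads(line)
        if all(isinstance(event.get(key), str) for key in REQUIRED):
            # Extra fields such as "device" are dropped by the fixed schema.
            accepted.append(LoanRecord(*(event[key] for key in REQUIRED)))
        else:
            rejected.append(event)
    return accepted, rejected


def load_lake(raw_events):
    """Schema-on-read: store every event unchanged, whatever its shape."""
    return list(raw_events)


def read_lake(lake, fields):
    """Apply a schema only at query time; missing fields become None."""
    for line in lake:
        event = json.loads(line)
        yield {field: event.get(field) for field in fields}


warehouse, rejected = load_warehouse(RAW_EVENTS)
lake = load_lake(RAW_EVENTS)

# An exploratory question the warehouse schema never anticipated.
search_view = [row for row in read_lake(lake, ("patron_id", "query")) if row["query"]]
```

In this sketch the warehouse rejects the search event because it does not fit the loan schema, which is what makes its contents consistent and auditable. The lake keeps all three events, so a later exploratory query over search behaviour remains possible, at the cost of deferring validation to read time.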