At last week’s HP Big Data Conference, HP announced new features for the upcoming Excavator release. These include text search on machine log data, ROLAP enhancements, backup/restore enhancements and the open sourcing of the Flex Table library.
Loading bulk data was challenging enough, but now there is demand to stream data in real time. Vertica will integrate with Apache Kafka, which lets it take in data from a variety of source systems and in complex data formats. With this integration, Vertica will be able to consume continuously from Kafka and also produce to it by exporting query results back to Kafka, closing the processing loop.
…organizations will now be able to simplify and automate data load-and-query functionality, enabling them to empower any application with real-time analytics.
Compared with traditional ETL and parallel copy, the Kafka integration is meant to address scalability, latency, throughput and coupling, while adopting what has become a backbone standard for enterprise pub-sub messaging. Kafka decouples data sources such as logs and apps from analytics by adding a message bus layer that is fast, scalable and fault tolerant.
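To make the decoupling idea concrete, here is a minimal in-memory sketch of a pub-sub log. It is not Kafka and not Vertica's integration; `TopicLog`, `publish` and `poll` are hypothetical names invented for illustration. It only shows the general pattern: producers append to a log, and each consumer tracks its own read position, so neither side needs to know about the other.

```python
from collections import defaultdict


class TopicLog:
    """Toy stand-in for a message bus topic (hypothetical, not the Kafka API)."""

    def __init__(self):
        self._messages = []               # append-only log of messages
        self._offsets = defaultdict(int)  # consumer name -> next offset to read

    def publish(self, message):
        # Producers only append; they never talk to consumers directly.
        self._messages.append(message)

    def poll(self, consumer, max_messages=10):
        # Each consumer reads from its own offset, independent of the others.
        start = self._offsets[consumer]
        batch = self._messages[start:start + max_messages]
        self._offsets[consumer] = start + len(batch)
        return batch


topic = TopicLog()
for line in ["app started", "user login", "disk warning"]:
    topic.publish(line)

# Two independent consumers read the same log at their own pace.
analytics_batch = topic.poll("analytics")
archive_batch = topic.poll("archive", max_messages=2)
```

Because offsets are kept per consumer, a slow consumer (here, `archive`) does not hold back a fast one, which is the kind of loose coupling the message bus layer is meant to provide.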
Machine data has long been a valuable asset for IT management and analytics. Excavator will offer a single-system solution for text search on machine log data that is high-performance and customizable, without the need to move data around. This will be accomplished with text indexes that are maintained transactionally with their base tables. The index will be searchable with regular SQL, and result sets can be joined with other tables to retrieve additional data.
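As a rough illustration of the idea (not Vertica syntax), the sketch below uses Python's built-in `sqlite3` module as a stand-in database. The table names `logs` and `logs_text_index`, and the whitespace tokenizer, are assumptions made up for this example. It shows an index table written in the same transaction as its base table, then searched with plain SQL and joined back to the base rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE logs (id INTEGER PRIMARY KEY, host TEXT, message TEXT);
    CREATE TABLE logs_text_index (token TEXT, doc_id INTEGER);
""")

rows = [
    (1, "web01", "disk usage warning on /var"),
    (2, "web02", "user login succeeded"),
    (3, "db01", "disk failure detected"),
]

# Write base rows and index entries in one transaction so they stay in sync.
with conn:
    for log_id, host, message in rows:
        conn.execute("INSERT INTO logs VALUES (?, ?, ?)", (log_id, host, message))
        # Naive tokenizer (an assumption): lowercase and split on whitespace.
        for token in set(message.lower().split()):
            conn.execute(
                "INSERT INTO logs_text_index VALUES (?, ?)", (token, log_id)
            )

# Search the index with regular SQL and join back to the base table.
matches = conn.execute("""
    SELECT l.id, l.host, l.message
    FROM logs_text_index AS idx
    JOIN logs AS l ON l.id = idx.doc_id
    WHERE idx.token = ?
    ORDER BY l.id
""", ("disk",)).fetchall()
```

Since the index is just another table, the search result can be joined with any other table in the same query, with no separate search system to keep in sync.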
Support for Relational OLAP (ROLAP) will be improved with the ability to build cubes up front and perform smaller roll-ups for less detailed data, combining flexibility with fast query times.
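The general pattern of building an aggregate up front and rolling it up further can be sketched as follows. This again uses `sqlite3` as a stand-in rather than Vertica, and the `sales` and `sales_by_region_day` tables and their data are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, day TEXT, amount INTEGER);
    INSERT INTO sales VALUES
        ('east', '2015-08-10', 100),
        ('east', '2015-08-10', 50),
        ('east', '2015-08-11', 70),
        ('west', '2015-08-10', 30);

    -- Built up front: one pre-aggregated row per (region, day).
    CREATE TABLE sales_by_region_day AS
        SELECT region, day, SUM(amount) AS total
        FROM sales
        GROUP BY region, day;
""")

# Smaller roll-up for less detailed data: region totals are computed
# from the pre-aggregated table instead of rescanning the raw rows.
by_region = conn.execute("""
    SELECT region, SUM(total)
    FROM sales_by_region_day
    GROUP BY region
    ORDER BY region
""").fetchall()
```

The coarser query touches three pre-aggregated rows instead of four raw ones here; on real data volumes that gap is where the query-time savings would come from.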
The backup/restore functionality will also receive some much-welcomed enhancements, including object-level (schema or table) restores from a full backup and locking improvements during recovery. It will also be possible to replicate objects between databases (e.g. between primary and DR) by syncing consistent snapshots.
For more information, read the official press release.